Excel VBA: Get a blog post without loading the images

I am trying to get a blog post (text only) using the following code:

    Function extractPostBody(myURL As String) As String
        ' Requires references to Microsoft Internet Controls
        ' and Microsoft HTML Object Library
        Dim IE As New InternetExplorer
        Dim Doc As HTMLDocument
        Dim paragraphs As Object
        Dim PostBody As String
        Dim i As Long

        IE.Visible = True
        IE.navigate myURL

        ' Wait until the whole page, images included, has finished loading
        Do
            DoEvents
        Loop Until IE.readyState = READYSTATE_COMPLETE

        Set Doc = IE.Document
        Set paragraphs = Doc.getElementsByTagName("p")

        For i = 0 To paragraphs.Length - 1
            ' Stop at the paragraph that lists the post's tags
            If InStr(1, paragraphs(i).innerText, "Tags: ") > 0 Then Exit For
            PostBody = PostBody & vbNewLine & paragraphs(i).innerText
        Next i

        IE.Quit
        extractPostBody = PostBody
    End Function

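Once the text is retrieved, I assign it to a cell and then use the Split function to count the number of words in the extracted text. The code works, but for sites with a lot of images it waits until those images have loaded, which slows execution down considerably.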
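For reference, the word count can also be taken directly from the returned string instead of going through a cell, which avoids the 32,767-character limit on cell contents. The following is only a minimal sketch of the Split approach described above; `CountWords` and its parameter name are hypothetical and not part of the original code.

```vba
' Hypothetical helper: counts whitespace-separated words in a string.
Function CountWords(ByVal sourceText As String) As Long
    Dim cleaned As String

    ' Treat line breaks and tabs as ordinary spaces
    cleaned = Replace(sourceText, vbCr, " ")
    cleaned = Replace(cleaned, vbLf, " ")
    cleaned = Replace(cleaned, vbTab, " ")

    ' Remove leading/trailing spaces, then collapse runs of spaces
    cleaned = Trim$(cleaned)
    Do While InStr(cleaned, "  ") > 0
        cleaned = Replace(cleaned, "  ", " ")
    Loop

    If Len(cleaned) = 0 Then
        CountWords = 0
    Else
        ' Split returns a zero-based array, so add 1 to the upper bound
        CountWords = UBound(Split(cleaned, " ")) + 1
    End If
End Function
```

It could be called as `CountWords(extractPostBody(myURL))`.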

Is there another way to get the text from the blog without waiting for the images to load?

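Edit:

Following Jeeped's suggestion, I am now using the code below. I took it from another Stack Overflow post, but I can't seem to find that post again to give its author credit: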

    Function ScrapeWebPage(ByVal URL As String) As String
        ' Requires a reference to Microsoft HTML Object Library
        Dim HTMLDoc As New HTMLDocument
        Dim XMLHttpRequest As Object
        Dim ListItems As Object
        Dim li As Object
        Dim PostBody As String

        Set XMLHttpRequest = CreateObject("MSXML2.XMLHTTP")
        ' Synchronous request: send does not return until the response has arrived
        XMLHttpRequest.Open "GET", URL, False
        XMLHttpRequest.send

        With HTMLDoc.body
            ' Load the response into an in-memory HTML document
            .innerHTML = XMLHttpRequest.responseText
            Set ListItems = .getElementsByTagName("p")
            ' Append the text of each paragraph
            For Each li In ListItems
                PostBody = PostBody & vbNewLine & li.innerText
            Next
        End With

        ScrapeWebPage = PostBody
    End Function

This works, but the code now returns a captcha message, which I obviously can't fill in because no IE browser window is displayed. Or can I?
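One thing that might be worth trying, purely as an assumption, is sending a browser-like User-Agent header with the request, since some sites serve a captcha to clients that do not identify themselves as a browser. The sketch below swaps in `MSXML2.ServerXMLHTTP.6.0` for `MSXML2.XMLHTTP`. The function name `FetchHtml` and the User-Agent string are illustrative, and there is no guarantee that any particular site will stop returning a captcha.

```vba
' Hypothetical helper: fetches a page's HTML with a browser-like User-Agent.
' Whether this avoids a captcha depends entirely on the site (assumption).
Function FetchHtml(ByVal URL As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    ' Synchronous GET request
    http.Open "GET", URL, False
    ' Illustrative User-Agent value, not taken from the original post
    http.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    http.send

    ' Return the body only on success; otherwise return an empty string
    If http.Status = 200 Then
        FetchHtml = http.responseText
    Else
        FetchHtml = vbNullString
    End If
End Function
```

The returned string could then be assigned to `HTMLDoc.body.innerHTML` in place of `XMLHttpRequest.responseText` in `ScrapeWebPage`.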