VBA HTML Scraping – ".innertext" from a complex table

All,

I built the following module to extract a single value (the 1% change in London house prices) from the address below:

https://www.hometrack.com/uk/insight/uk-cities-house-price-index/

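The specific value is nested in the following code:

Please see the highlighted line.

The VBA code below is my attempt, pieced together from various sources. I may be wrong, but I feel I am very close to capturing the value, yet the code does not work.

Does anyone know where I am going wrong? It shows no error message, but it does not output any value either.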

    Sub HousePriceData()
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim TxtRng As Range
        Dim ie As Object
        Dim V As Variant
        Dim myValue As Variant

        Set ie = CreateObject("INTERNETEXPLORER.APPLICATION")
        ie.NAVIGATE "https://www.hometrack.com/uk/insight/uk-cities-house-price-index/"
        ie.Visible = False

        While ie.ReadyState <> 4
            DoEvents
        Wend

        Set wb = ActiveWorkbook
        Set ws = wb.Sheets("Input")
        Set TxtRng = ws.Range("C15")

        Set myValue = ie.document.getElementById("cities-index-table").getElementsByTagName("tr")(7).getElementsByTagName("td")(5)
        TxtRng = myValue.innerText
    End Sub

Try using XHR and basic string parsing instead of the awkward IE automation:

    Sub Test()
        Dim strUrl As String
        Dim strTmp As String
        Dim arrTmp As Variant

        strUrl = "https://www.hometrack.com/uk/insight/uk-cities-house-price-index/"

        ' Fetch the raw page source with a synchronous GET request
        With CreateObject("MSXML2.XMLHttp")
            .Open "GET", strUrl, False
            .Send ""
            strTmp = .ResponseText
        End With

        ' Keep everything after the cell that holds the London link
        arrTmp = Split(strTmp, ">London</a></td>", 2)
        strTmp = arrTmp(1)
        ' Skip ahead to the content that follows the sixth <td> in that row
        arrTmp = Split(strTmp, "<td>", 7)
        strTmp = arrTmp(6)
        ' Cut the cell content off at its closing tag
        arrTmp = Split(strTmp, "</td>", 2)
        strTmp = arrTmp(0)

        ThisWorkbook.Sheets("Input").Range("C15").Value = strTmp
    End Sub
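The Split chain above raises a "Subscript out of range" error if the page layout changes and a marker is no longer found. A minimal sketch of the same marker-based idea with bounds checks follows. The helper name CellAfterMarker is hypothetical (it is not part of the answer above), and like the original it only matches bare "<td>" tags without attributes.

```vba
' Hypothetical helper: returns the text of the n-th <td> cell (1-based) that
' follows the first occurrence of marker in html, or "" if the marker or the
' cell cannot be found.
Function CellAfterMarker(ByVal html As String, ByVal marker As String, ByVal n As Long) As String
    Dim pos As Long, i As Long, cellStart As Long, cellEnd As Long

    pos = InStr(1, html, marker, vbBinaryCompare)
    If pos = 0 Or n < 1 Then Exit Function
    pos = pos + Len(marker)

    ' Walk forward over n opening <td> tags
    For i = 1 To n
        cellStart = InStr(pos, html, "<td>", vbBinaryCompare)
        If cellStart = 0 Then Exit Function
        pos = cellStart + Len("<td>")
    Next i

    ' The cell content runs up to the next closing tag
    cellEnd = InStr(pos, html, "</td>", vbBinaryCompare)
    If cellEnd = 0 Then Exit Function
    CellAfterMarker = Trim$(Mid$(html, pos, cellEnd - pos))
End Function

Sub TestWithChecks()
    Dim strTmp As String

    With CreateObject("MSXML2.XMLHttp")
        .Open "GET", "https://www.hometrack.com/uk/insight/uk-cities-house-price-index/", False
        .Send ""
        strTmp = .ResponseText
    End With

    ' 6 corresponds to arrTmp(6) in the Split version above
    strTmp = CellAfterMarker(strTmp, ">London</a></td>", 6)
    If Len(strTmp) = 0 Then strTmp = "ERROR"
    ThisWorkbook.Sheets("Input").Range("C15").Value = strTmp
End Sub
```

An empty return value means the marker or the cell was missing, so the caller can write a placeholder instead of stopping with a runtime error.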

Try using this:

    Dim Engmt As String

    ' Sentinel value: stays unchanged if the lookup below fails
    Engmt = "ERRORHERE"

    On Error Resume Next
    Engmt = Trim(ie.document.getElementById("cities-index-table").getElementsByTagName("tr")(12).getElementsByTagName("td")(4).innerText)
    On Error GoTo 0

    If Engmt = "ERRORHERE" Then
        TxtRng.Value = "ERROR"
    Else
        TxtRng.Value = Engmt
    End If

Try this. It will also give you exactly what you need.

    ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library"
    Sub HousePriceData()
        Const URL = "https://www.hometrack.com/uk/insight/uk-cities-house-price-index/"
        Dim http As New XMLHTTP60, html As New HTMLDocument
        Dim htmla As Object, tRow As Object, tCel As Object
        Dim c As Long, x As Long

        With http
            .Open "GET", URL, False
            .send
            html.body.innerHTML = .responseText
        End With

        ' Copy the first table on the page into the active sheet, cell by cell
        Set htmla = html.getElementsByTagName("table")(0)
        For Each tRow In htmla.Rows
            For Each tCel In tRow.Cells
                c = c + 1: Cells(x + 1, c) = tCel.innerText
            Next tCel
            c = 0
            x = x + 1
        Next tRow
    End Sub
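If only one cell is needed rather than the whole table, the same parsed document can be searched by row label instead of a fixed row number. The following is a rough sketch under a few assumptions: the function name RowCellText is hypothetical, the label is assumed to sit in the first cell of its row, the column index is zero-based, and the same two library references as above are set.

```vba
' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library"
' Hypothetical helper: parses htmlText, finds the first table row whose first
' cell reads rowLabel, and returns the text of the cell at colIndex (0-based).
' Returns "" if there is no table or no matching row.
Function RowCellText(ByVal htmlText As String, ByVal rowLabel As String, ByVal colIndex As Long) As String
    Dim html As New HTMLDocument
    Dim tbl As Object, tRow As Object

    html.body.innerHTML = htmlText
    Set tbl = html.getElementsByTagName("table")(0)
    If tbl Is Nothing Then Exit Function

    For Each tRow In tbl.Rows
        ' Nested Ifs because VBA does not short-circuit And
        If tRow.Cells.Length > colIndex Then
            If Trim$(tRow.Cells(0).innerText) = rowLabel Then
                RowCellText = Trim$(tRow.Cells(colIndex).innerText)
                Exit Function
            End If
        End If
    Next tRow
End Function

Sub LondonCellOnly()
    Dim http As New XMLHTTP60

    http.Open "GET", "https://www.hometrack.com/uk/insight/uk-cities-house-price-index/", False
    http.send

    ' Column index 5 is taken from the question's td(5); adjust it to the column you need
    ThisWorkbook.Sheets("Input").Range("C15").Value = RowCellText(http.responseText, "London", 5)
End Sub
```

Looking the row up by its label should keep working if the site reorders the cities, whereas a hard-coded row index such as tr(7) would silently pick up a different city.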