XML query into Excel, similar to importxml in Google Spreadsheet

I'm trying to write an Excel function that works like Google Spreadsheet's importxml.

Here is the code:

    Function GetData(sURL As String, sItem As String) As Variant
        ' Requires a reference to "Microsoft XML, v6.0"
        Dim oHttp As New MSXML2.XMLHTTP60
        Dim xmlResp As MSXML2.DOMDocument60

        On Error GoTo EH

        ' Open the request and send it
        oHttp.Open "GET", sURL, False
        oHttp.Send

        ' Get the response as XML
        Set xmlResp = oHttp.responseXML

        ' Get the text of the first node matching the XPath in sItem
        GetData = xmlResp.SelectNodes(sItem).Item(0).Text

        ' Examine output of these in the Immediate window
        ' (the original printed sName, which is not declared anywhere)
        Debug.Print sItem
        Debug.Print xmlResp.XML

    CleanUp:
        On Error Resume Next
        Set xmlResp = Nothing
        Set oHttp = Nothing
        Exit Function
    EH:
        GetData = CVErr(xlErrValue)
        GoTo CleanUp
    End Function

The following formula returns 192799976.00:

 =GetData("http://api.eve-central.com/api/marketstat?typeid=24692&usesystem=30000142","//sell/min") 

This formula returns 34:

 =GetData("http://util.eveuniversity.org/xml/itemLookup.php?name=Tritanium","//itemLookup/typeID") 

I'm getting #VALUE! when I try to pull data from this site, but it should return $179:

  =GetData("http://www.hotels.com/hotel/details.html?current-location=Chicago%2C+Illinois%2C+United+States+of+America&arrivalDate=10%2F30%2F14&departureDate=10%2F31%2F14&searchParams.rooms.compact_occupancy_dropdown=compact_occupancy_1_2&rooms_=1&rooms%5B0%5D.numberOfAdults=2&children%5B0%5D=0&searchParams.landmark=&hotelId=113158&roomno=1&srsReport=HomePage%7CAutoR%7CHOTEL%7Cthe++drake+Chicago%2C+Illinois%2C+United+States+of+America%7C0%7C0%7C0%7C1%7C1%7C1%7C113158&resolvedLocation=HOTEL%3A113158%3ASRS%3AUNKNOWN&pageName=HomePage&destinationId=&rooms.compact_occupancy_dropdown=compact_occupancy_1_2&landmark= ","//span/strong") 

Edit 1: I tried to convert @portlandrunner's Sub into a function, but Excel says the function is not valid.

    Function extract(URL As String) As Variant
        Dim IE As InternetExplorer
        Dim html As HTMLDocument

        Set IE = New InternetExplorerMedium
        IE.Visible = False
        IE.Navigate2 URL

        ' Wait while IE loading
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop

        Set html = IE.Document
        Set spanElement = html.getElementsByTagName("span")

        For Each spn In spanElement
            If Left(spn.innertext, 1) = "$" Then
                extract = spn.innertext
                Exit For
            End If
        Next spn

        'Cleanup
        IE.Quit
        Set IE = Nothing
    End Function

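The URL in the last example returns HTML, not XML, so `responseXML` has no document for `SelectNodes` to query and `GetData` falls through to its error handler.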
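To check what a given URL actually returns, a small diagnostic along these lines may help. This is only a sketch: `DescribeResponse` is a hypothetical helper name that is not part of the original code, and it assumes the same "Microsoft XML, v6.0" reference that `GetData` uses.

```vba
' Hypothetical helper (not from the original post): reports whether the
' response could be used as XML. Requires "Microsoft XML, v6.0".
Function DescribeResponse(sURL As String) As String
    Dim oHttp As New MSXML2.XMLHTTP60
    Dim xmlResp As Object

    On Error GoTo EH

    ' Same synchronous GET request that GetData performs
    oHttp.Open "GET", sURL, False
    oHttp.send

    Set xmlResp = oHttp.responseXML

    If xmlResp Is Nothing Then
        DescribeResponse = "HTTP " & oHttp.Status & ": no XML document"
    ElseIf xmlResp.documentElement Is Nothing Then
        ' No root element, so any XPath query will find nothing
        DescribeResponse = "HTTP " & oHttp.Status & ": not XML, Content-Type is " & _
                           oHttp.getResponseHeader("Content-Type")
    Else
        DescribeResponse = "HTTP " & oHttp.Status & ": XML with root <" & _
                           xmlResp.documentElement.nodeName & ">"
    End If
    Exit Function
EH:
    DescribeResponse = "Request failed: " & Err.Description
End Function
```

Calling it from the Immediate window, for example `? DescribeResponse("http://util.eveuniversity.org/xml/itemLookup.php?name=Tritanium")`, should show whether a root element was parsed. For an HTML page there is normally no root element.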

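You can use the IE document to get HTML elements by tag or class name. The following code displays the first <span> tag whose text starts with $, which is $179.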

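Make sure you:

  1. Add a reference to "Microsoft Internet Controls"
  2. Add a reference to "Microsoft HTML Object Library"
  3. Depending on your IE version, you may need to disable Protected Mode under the Security settings in the IE Internet Options menu.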

    Sub extract()
        Dim IE As InternetExplorer
        Dim html As HTMLDocument
        Dim spanElement As Object   ' collection of <span> elements
        Dim spn As Object

        Set IE = New InternetExplorerMedium
        IE.Visible = False
        IE.Navigate2 "http://www.hotels.com/hotel/details.html?current-location=Chicago%2C+Illinois%2C+United+States+of+America&arrivalDate=10%2F30%2F14&departureDate=10%2F31%2F14&searchParams.rooms.compact_occupancy_dropdown=compact_occupancy_1_2&rooms_=1&rooms%5B0%5D.numberOfAdults=2&children%5B0%5D=0&searchParams.landmark=&hotelId=113158&roomno=1&srsReport=HomePage%7CAutoR%7CHOTEL%7Cthe++drake+Chicago%2C+Illinois%2C+United+States+of+America%7C0%7C0%7C0%7C1%7C1%7C1%7C113158&resolvedLocation=HOTEL%3A113158%3ASRS%3AUNKNOWN&pageName=HomePage&destinationId=&rooms.compact_occupancy_dropdown=compact_occupancy_1_2&landmark="

        ' Wait while IE loading
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop

        Set html = IE.Document
        Set spanElement = html.getElementsByTagName("span")

        ' Show the first <span> whose text starts with "$"
        For Each spn In spanElement
            If Left(spn.innertext, 1) = "$" Then
                MsgBox spn.innertext
                Exit For
            End If
        Next spn

        'Cleanup
        IE.Quit
        Set IE = Nothing
    End Sub

Tested.


Update 2

Here is how I set it up as a function:

    Public Function extractURL(url As String, tag As String) As String
        Dim IE As InternetExplorer
        Dim html As HTMLDocument
        Dim spanElement As Object   ' collection of elements matching tag
        Dim spn As Object

        extractURL = ""

        Set IE = New InternetExplorerMedium
        IE.Visible = False
        IE.Navigate2 url

        ' Wait while IE loading
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop

        Set html = IE.Document
        Set spanElement = html.getElementsByTagName(tag)

        ' Return the first matching element whose text starts with "$"
        For Each spn In spanElement
            If Left(spn.innertext, 1) = "$" Then
                extractURL = spn.innertext
                Exit For
            End If
        Next spn

        'Cleanup
        IE.Quit
        Set IE = Nothing
    End Function

In the worksheet (screenshot not preserved), the tag name is in B2 and the URL is in C2.

The formula in cell A2 is: =extractURL(C2,B2)

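Note: this page takes a while to load (on my slow connection), and sometimes the script returns nothing. If I step through the code and force it to wait until the page has finished loading, I always get the correct result. Some page scripts may still be loading data after IE signals that it is done. The only fix I have found is to increase the wait time.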
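One way to lengthen the wait without hard-coding a single long pause is sketched below. `WaitForPage`, `timeoutSeconds`, and `settleSeconds` are hypothetical names introduced here, not part of the original answer, and the default values are guesses to tune for your connection. The sketch assumes the "Microsoft Internet Controls" reference is set, which also provides the `READYSTATE_COMPLETE` constant.

```vba
' Hypothetical helper (not from the original answer): waits until IE reports
' the page as complete, or gives up after timeoutSeconds.
' Returns True if the page finished loading, False on timeout.
Private Function WaitForPage(IE As InternetExplorer, _
                             Optional timeoutSeconds As Long = 30, _
                             Optional settleSeconds As Long = 2) As Boolean
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)

    ' Busy may turn False before the document is complete, so check both
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        If Now > deadline Then Exit Function   ' WaitForPage stays False
        Application.Wait DateAdd("s", 1, Now)
    Loop

    ' Extra pause for page scripts that may fill in data after loading
    Application.Wait DateAdd("s", settleSeconds, Now)
    WaitForPage = True
End Function
```

In `extractURL`, the `Do While IE.Busy ... Loop` block could then be replaced with a call such as `If WaitForPage(IE) Then Set html = IE.Document`, skipping the element search when the wait times out. The fixed settle pause is still a guess at how long the page scripts need, so it does not remove the timing issue, it only makes the wait adjustable.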