Optimizing VBA to pull data from listings on a website

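I am trying to find a way to pull data from yelp.com.

I have a spreadsheet containing several keywords and locations. I want to extract data from Yelp listings based on the keywords and locations already in that spreadsheet.

I wrote the code below, but it seems to return nonsensical data rather than the exact information I am looking for.

I want to get the business name, address, and phone number, but I am getting none of them. I would appreciate it if anyone could help me solve this.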

    Sub find()
        Dim ie As Object
        Set ie = CreateObject("InternetExplorer.Application")
        With ie
            ie.Visible = False
            ie.Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
            ' Don't show window
            ie.Visible = False
            'Wait until IE is done loading page
            Do While ie.Busy
                Application.StatusBar = "Downloading information, lease wait..."
                DoEvents
            Loop
            ' Make a string from IE content
            Set mDoc = ie.Document
            peopleData = mDoc.body.innerText
            ActiveSheet.Cells(1, 1).Value = peopleData
        End With
        peopleData = "" 'Nothing
        Set mDoc = Nothing
    End Sub

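If you right-click in IE and choose View Source, it is apparent that the data served on the site is not part of the document's .Body.innerText property. I have noticed this is often the case with dynamically served data, and the innerText approach is too simple for most web scraping.

I opened the page in Google Chrome and inspected the elements to get an idea of what I was really looking for and how to find it with a DOM/HTML parser. You will need to add a reference to the Microsoft HTML Object Library if you want typed variables and IntelliSense; with late binding (variables declared As Object) the reference is optional.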

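I think you can have it return a collection of the <DIV> tags, and then check the class names of those elements with an If statement inside a loop.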
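The DIV-based idea could be sketched roughly as follows. This is only an illustration, not tested against the live site: `ListDivsByClass` is a hypothetical helper name, `html` is assumed to be an already-loaded document object, and the class name you pass in has to be taken from whatever the page's markup uses when you inspect it.

```vba
' Hypothetical helper (not from the original post): writes the text of every
' <DIV> whose class attribute contains targetClass into column A of the
' active sheet, one element per row.
' html is assumed to be an already-loaded HTMLDocument (late bound).
Sub ListDivsByClass(ByVal html As Object, ByVal targetClass As String)
    Dim divs As Object   'IHTMLElementCollection
    Dim d As Object      'IHTMLElement
    Dim r As Long        'row offset from A1, starts at 0

    Set divs = html.getElementsByTagName("DIV")
    For Each d In divs
        ' className holds the element's full class attribute as one string
        If InStr(1, d.className, targetClass, vbTextCompare) > 0 Then
            ActiveSheet.Range("A1").Offset(r, 0).Value = d.innerText
            r = r + 1
        End If
    Next d
End Sub
```

It would be called as, for example, `ListDivsByClass html, "main-attributes"`, once the page has finished loading. Matching on a substring of the class attribute keeps the check tolerant of extra classes on the same element, at the cost of possibly matching more elements than intended.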

I made some revisions to my original answer; this should print each record in a new cell:

    Option Explicit

    ' Sleep comes from the Windows API; PtrSafe is required on 64-bit Office (VBA7)
    #If VBA7 Then
        Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #Else
        Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #End If

    Sub find()
        'Uses late binding, or add reference to Microsoft HTML Object Library
        ' and change variable Types to use intellisense
        Dim ie As Object        'InternetExplorer.Application
        Dim html As Object      'HTMLDocument
        Dim Listings As Object  'IHTMLElementCollection
        Dim l As Object         'IHTMLElement
        Dim r As Long           'row offset from A1, starts at 0

        Set ie = CreateObject("InternetExplorer.Application")
        With ie
            .Visible = False    ' Don't show window
            .Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
            'Wait until IE is done loading page (readyState 4 = complete)
            Do While .readyState <> 4
                Application.StatusBar = "Downloading information, Please wait..."
                DoEvents
                Sleep 200
            Loop
            Set html = .Document
        End With

        Set Listings = html.getElementsByTagName("LI") ' ## returns the list
        For Each l In Listings
            '## make sure this list item looks like the listings Div Class:
            '   then, build the string to put in your cell
            If InStr(1, l.innerHTML, "media-block clearfix media-block-large main-attributes") > 0 Then
                Range("A1").Offset(r, 0).Value = l.innerText
                r = r + 1
            End If
        Next

        'Clean up: close the hidden IE instance and restore the status bar
        ie.Quit
        Application.StatusBar = False
        Set html = Nothing
        Set ie = Nothing
    End Sub
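
To run the search for each keyword and location in the spreadsheet, as the question describes, the search URL could be built from the cell values instead of being hard-coded. The sketch below is an assumption-heavy illustration: `BuildSearchUrl` and `ListSearchUrls` are hypothetical names, the sheet layout (keywords in column A, locations in column B, a header in row 1, results in column C) is invented for the example, and it assumes only the `find_desc` and `find_loc` query parameters from the URL above are needed. `WorksheetFunction.EncodeURL` requires Excel 2013 or later.

```vba
' Hypothetical helper: builds a search URL from a keyword and a location.
' Assumes find_desc and find_loc are the only query parameters required.
Function BuildSearchUrl(ByVal keyword As String, ByVal location As String) As String
    ' EncodeURL (Excel 2013+) percent-encodes spaces, commas, and so on
    BuildSearchUrl = "http://www.yelp.com/search" & _
                     "?find_desc=" & Application.WorksheetFunction.EncodeURL(keyword) & _
                     "&find_loc=" & Application.WorksheetFunction.EncodeURL(location)
End Function

' Hypothetical driver: assumes keywords in column A and locations in
' column B, starting at row 2, and writes each URL to column C.
Sub ListSearchUrls()
    Dim r As Long
    Dim lastRow As Long

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        For r = 2 To lastRow
            .Cells(r, "C").Value = BuildSearchUrl(CStr(.Cells(r, "A").Value), _
                                                  CStr(.Cells(r, "B").Value))
        Next r
    End With
End Sub
```

Each generated URL could then be passed to `.Navigate` in place of the fixed address, with the listing loop run once per row.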