VBA Web Scrape(getelementsbyclassname)

我正在试图抓取下列链接的右侧窗格给出的VBA课程列表“www.tutorialspoint.com/vba/index.htm”

但是由于一些错误,我无法清除这个列表:

Sub tutorailpointsscrap() Dim ie As InternetExplorer Set ie = New InternetExplorer With ie .navigate "https://www.tutorialspoint.com//vba/index.htm" .Visible = True Do While ie.readyState <> READYSTATE_COMPLETE DoEvents Loop End With Dim html As HTMLDocument Set html = ie.document Dim ele As IHTMLElement Dim lists As IHTMLElementCollection Dim row As Long Set ele = html.getElementsByClassName("nav nav-list primary left-menu") Set lists = ele.getElementsByTagName("a") row = 1 For Each li In lists Cells(row, 1) = li.innerText row = row + 1 Next ie.Quit End Sub 

包含数据的HTML是:

 <ul class="nav nav-list primary left-menu"> <li class="heading">VBA Tutorial</li> <li><a href="/vba/index.htm" style="background-color: rgb(214, 214, 214);">VBA - Home</a></li> <li><a href="/vba/vba_overview.htm">VBA - Overview</a></li> <li><a href="/vba/vba_excel_macros.htm">VBA - Excel Macros</a></li> <li><a href="/vba/vba_excel_terms.htm">VBA - Excel Terms</a></li> <li><a href="/vba/vba_macro_comments.htm">VBA - Macro Comments</a></li> <li><a href="/vba/vba_message_box.htm">VBA - Message Box</a></li> <li><a href="/vba/vba_input_box.htm">VBA - Input Box</a></li> <li><a href="/vba/vba_variables.htm">VBA - Variables</a></li> <li><a href="/vba/vba_constants.htm">VBA - Constants</a></li> <li><a href="/vba/vba_operators.htm">VBA - Operators</a></li> <li><a href="/vba/vba_decisions.htm">VBA - Decisions</a></li> <li><a href="/vba/vba_loops.htm">VBA - Loops</a></li> <li><a href="/vba/vba_strings.htm">VBA - Strings</a></li> <li><a href="/vba/vba_date_time.htm">VBA - Date and Time</a></li> <li><a href="/vba/vba_arrays.htm">VBA - Arrays</a></li> <li><a href="/vba/vba_functions.htm">VBA - Functions</a></li> <li><a href="/vba/vba_sub_procedure.htm">VBA - SubProcedure</a></li> <li><a href="/vba/vba_events.htm">VBA - Events</a></li> <li><a href="/vba/vba_error_handling.htm">VBA - Error Handling</a></li> <li><a href="/vba/vba_excel_objects.htm">VBA - Excel Objects</a></li> <li><a href="/vba/vba_text_files.htm">VBA - Text Files</a></li> <li><a href="/vba/vba_programming_charts.htm">VBA - Programming Charts</a></li> <li><a href="/vba/vba_userforms.htm">VBA - Userforms</a></li> </ul> 

如果我正确地解决你的问题,你需要以下。 HTH

 Dim lists As IHTMLElementCollection Dim anchorElements As IHTMLElementCollection Dim ulElement As HTMLUListElement Dim liElement As HTMLLIElement Dim row As Long Set lists = html.getElementsByClassName("nav nav-list primary left-menu") row = 1 For Each ulElement In lists For Each liElement In ulElement.getElementsByTagName("li") Set anchorElements = liElement.getElementsByTagName("a") If anchorElements.Length > 0 Then Cells(row, 1) = anchorElements.Item(0).innerText row = row + 1 End If Next liElement Next ulElement 

结果在这(所有名单)

 VBA - Home VBA - Overview VBA - Excel Macros VBA - Excel Terms VBA - Macro Comments VBA - Message Box VBA - Input Box VBA - Variables VBA - Constants VBA - Operators VBA - Decisions VBA - Loops VBA - Strings VBA - Date and Time VBA - Arrays VBA - Functions VBA - SubProcedure VBA - Events VBA - Error Handling VBA - Excel Objects VBA - Text Files VBA - Programming Charts VBA - Userforms VBA - Quick Guide VBA - Useful Resources VBA - Discussion Developer's Best Practices Questions and Answers Effective Resume Writing HR Interview Questions Computer Glossary Who is Who 

如果你想只是第一个列表的锚点的内容,就像这样。

 For Each liElement In lists.Item(0).getElementsByTagName("li") Set anchorElements = liElement.getElementsByTagName("a") If anchorElements.Length > 0 Then Cells(row, 1) = anchorElements.Item(0).innerText row = row + 1 End If Next liElement 

这个怎么样:

 Sub TutorailsPoint() Const URL = "https://www.tutorialspoint.com//vba/index.htm" Dim http As New MSXML2.XMLHTTP60, html As New HTMLDocument Dim topics As Object, posts As Object, topic As Object Dim x As Long x = 2 http.Open "GET", URL, False http.send html.body.innerHTML = http.responseText Set topics = html.getElementsByClassName("nav nav-list primary left-menu") For Each posts In topics For Each topic In posts.getElementsByTagName("a") Cells(x, 1) = topic.innerText x = x + 1 Next topic Next posts End Sub 

这是另一种你可能喜欢的方式。 它会给你只是教程,没有别的:

 Sub TpData() Const URL = "https://www.tutorialspoint.com//vba/index.htm" Dim http As New XMLHTTP60, html As New HTMLDocument, post As Object With http .Open "GET", URL, False .send html.body.innerHTML = .responseText End With For Each post In html.getElementsByClassName("left-menu")(1).getElementsByTagName("li") With post.getElementsByTagName("a") If .Length Then i = i + 1: Cells(i, 1) = .item(0).innerText End With Next post End Sub