VBA从网站刮取数据 – 空数据错误

我正在创build一个macros来从网站上抓取数据。 我遇到的问题是当最后一个完整的页面被刮,列A没有数据,但其他列做,我收到一个运行时间1004错误。 例如,如果要刮取的总页数是6,并且列A在第5页上的最后一个条目中没有数据,则macros将擦除第5页上的所有数据,但是在尝试获取时会抛出运行时错误到第6页。第6页也有数据,但是我认为,因为A列中没有数据,所以它只是决定给出运行时间错误。 任何想法呢? 另外,用我所包含的代码,直到下一个箭头消失,macros观循环会更容易吗? 如果是这样,我怎么会这样做呢?

'Macro to query Daily Activity Search for DFB Counties 'Run Monday to pull data from Friday Sub queryActivityDailyMforFWorking() Dim nextrow As Integer, i As Integer Dim dates dates = Date - 3 Application.ScreenUpdating = False Application.DisplayStatusBar = True Do While i <= 50 Application.StatusBar = "Processing Page " & i nextrow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).row + 1 With ActiveSheet.QueryTables.Add(Connection:= _ "URL;https://www.myfloridalicense.com/delinquency_results.asp?SID=&page=" & i & "&county_1=16&county_1=21&county_1=23&county_1=32&county_1=36&county_1=41&county_1=46&county_1=53&county_1=54&county_1=57&county_1=60&county_1=66&status=R&send_date=" & dates & "&search_1.x=1", _ Destination:=Range("A" & nextrow)) '.Name = _ "2015&search_1.x=40&search_1.y=11&date=on&county_1=AL&lic_num_del=&lic_num_rep=&status=NS&biz_name=&owner_name=" .FieldNames = False .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "10" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False 'autofit columns Columns("A:G").Select Selection.EntireColumn.AutoFit 'check for filter, if not then turn on filter ActiveSheet.AutoFilterMode = False If Not ActiveSheet.AutoFilterMode Then ActiveSheet.Range("A:G").AutoFilter End If i = i + 1 End With Application.StatusBar = False 'Align text left Cells.Select With Selection .HorizontalAlignment = xlLeft .VerticalAlignment = xlBottom .WrapText = False .Orientation = 0 .AddIndent = False .IndentLevel = 0 .ShrinkToFit = False .ReadingOrder = xlContext .MergeCells = False End With Loop End Sub 

我无法复制你的错误,但是我猜测它和你的nextrowvariables有关。 如果页面上的数据以空单元格nextrow ,则下一页数据的nextrow值将在上一页的数据内设置。 我想这会导致一些问题,当你添加另一个查询表,然后尝试刷新数据,因为表将重叠。 如果知道每行总是有数据,则可以通过获取其他列中的最后一行来解决此问题。 我做了一些更新,似乎对我来说很好:

  • 增加了error handling
  • 检查列A和B的底部数据行
  • 添加了一些逻辑来检查是否返回一个完整的页面,如果不是退出循环,所以你不必保持parsing空白页面
  • 格式化连接string中的date,因为我发现过去会导致问题
  • 如果你不需要,添加了选项来摆脱标题
  • 将单元格格式移出循环,因此只能执行一次

希望这可以帮助。

 Sub queryActivityDailyMforFWorking() On Error GoTo Err_queryActivityDailyMforFWorking Const RowsPerPage As Byte = 20 Const DeleteHeader As Boolean = True Dim nextrow As Integer, maxrow As Integer, i As Integer Dim dates As Date dates = Date - 3 Application.ScreenUpdating = False Application.DisplayStatusBar = True nextrow = 1 For i = 1 To 50 Application.StatusBar = "Processing Page " & i With ActiveSheet.QueryTables.Add(Connection:= _ "URL;https://www.myfloridalicense.com/delinquency_results.asp?SID=&page=" & i & "&county_1=16&county_1=21&county_1=23&county_1=32&county_1=36&county_1=41&county_1=46&county_1=53&county_1=54&county_1=57&county_1=60&county_1=66&status=R&send_date=" & Format(dates, "m/d/yyyy") & "&search_1.x=1", _ Destination:=Range("A" & nextrow)) '.Name = _ "2015&search_1.x=40&search_1.y=11&date=on&county_1=AL&lic_num_del=&lic_num_rep=&status=NS&biz_name=&owner_name=" .FieldNames = False .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "10" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With ' Delete the header as required If DeleteHeader And i > 1 And ActiveSheet.Cells(nextrow, 1).Value = "License" Then ActiveSheet.Cells(nextrow, 1).EntireRow.Delete ' Find the bottom row maxrow = Application.WorksheetFunction.Max(ActiveSheet.Cells(Rows.Count, 1).End(xlUp).Row, ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row) ' Stop scraping if a full page wasn't returned If (maxrow - nextrow) < (RowsPerPage - IIf(DeleteHeader, 1, 0)) Then Exit For ' Otherwise set the row for the next page of data Else nextrow = maxrow + 1 End If Next i Application.StatusBar = "Formatting data" 'autofit columns ActiveSheet.Columns.EntireColumn.AutoFit 'check for filter, if not then turn on filter ActiveSheet.AutoFilterMode = False If Not ActiveSheet.AutoFilterMode Then ActiveSheet.Range("A:G").AutoFilter 'Align text left With ActiveSheet.Cells .HorizontalAlignment = xlLeft .VerticalAlignment = xlBottom .WrapText = False .Orientation = 0 .AddIndent = False .IndentLevel = 0 .ShrinkToFit = False .ReadingOrder = xlContext .MergeCells = False End With Exit_queryActivityDailyMforFWorking: Application.StatusBar = False Application.ScreenUpdating = True Exit Sub Err_queryActivityDailyMforFWorking: MsgBox Err.Description, vbCritical + vbOKOnly, Err.Number & " - Web Scraping Error" Resume Exit_queryActivityDailyMforFWorking End Sub