在同一工作簿中,通过循环运行两个工作表中超过100,000行的数据

我目前有代码允许我查看与表1和表2匹配的ID行。当两个ID匹配时,表2信息将粘贴到具有相同ID的Sheet 1行。 我的代码工作在less于1000行,当我testing它在一分钟内给出结果。

问题是,当我试图运行它1,000,000行时,它一直运行,超过20分钟,从此以后永远不会停止运行。 我希望任何人都可以帮助我修改代码,让我做一个循环,并复制粘贴信息从Sheet 2到Sheet 1的200,000行。

Sub Sample() Dim tracker As Worksheet Dim master As Worksheet Dim cell As Range Dim cellFound As Range Dim OutPut As Long Set tracker = Workbooks("test.xlsm").Sheets("Sheet1") Set master = Workbooks("test.xlsm").Sheets("Sheet2") Application.ScreenUpdating = False For Each cell In master.Range("A2:A200000") Set cellFound = tracker.Range("A5:A43000").Find(What:=cell.Value, LookIn:=xlValues, LookAt:=xlWhole) If Not cellFound Is Nothing Then matching value cellFound.Offset(ColumnOffset:=1).Value2 = cell.Offset(ColumnOffset:=2).Value2 Else End If Set cellFound = Nothing Debug.Print cell.Address Next Application.ScreenUpdating = True OutPut = MsgBox("Update over!", vbOKOnly, "Update Status") End Sub 

以上是我现在的代码。

包含@ paulbica的build议,这对我来说在几秒钟内运行。

 Sub Sample() Dim rngTracker As Range Dim rngMaster As Range Dim arrT, arrM Dim dict As Object, r As Long, tmp With Workbooks("test.xlsm") Set rngTracker = .Sheets("Tracker").Range("A2:B43000") Set rngMaster = .Sheets("Master").Range("A2:C200000") End With 'get values in arrays arrT = rngTracker.Value arrM = rngMaster.Value 'load the dictionary Set dict = CreateObject("scripting.dictionary") For r = 1 To UBound(arrT, 1) dict(arrT(r, 1)) = r Next r 'map between the two arrays using the dictionary For r = 1 To UBound(arrM, 1) tmp = arrM(r, 1) If dict.exists(tmp) Then arrT(dict(tmp), 2) = arrM(r, 3) End If Next r rngTracker.Value = arrT End Sub 

您可以使用Dictionary对象的索引并使用其本地索引属性来执行lokups。 我不确定在一个200Klogging的数据集中会有怎样的performance,在这个数据集中会出现一个高的失败报告,而且你显示至less有78%的失败率(200Klogging匹配和更新43Klogging)。

 Sub Sample3() Dim tracker As Worksheet, master As Worksheet Dim OutPut As Long Dim v As Long, p As Long, vMASTER As Variant, vTRACKER As Variant, dMASTER As Object Set tracker = Workbooks("test.xlsm").Sheets("Sheet1") Set master = Workbooks("test.xlsm").Sheets("Sheet2") Set dMASTER = CreateObject("Scripting.Dictionary") Debug.Print Timer 'Application.ScreenUpdating = False '<~~no real need to do this if working in memory With tracker vTRACKER = .Range(.Cells(5, 2), .Cells(Rows.Count, 1).End(xlUp)).Value2 End With With master vMASTER = .Range(.Cells(2, 1), .Cells(Rows.Count, 3).End(xlUp)).Value2 For v = LBound(vMASTER, 1) To UBound(vMASTER, 1) If Not dMASTER.exists(vMASTER(v, 1)) Then _ dMASTER.Add Key:=vMASTER(v, 1), Item:=vMASTER(v, 3) Next v End With For v = LBound(vTRACKER, 1) To UBound(vTRACKER, 1) If dMASTER.exists(vTRACKER(v, 1)) Then _ vTRACKER(v, 2) = dMASTER.Item(vTRACKER(v, 1)) Next v With ThisWorkbook.Sheets("Sheet1") 'tracker .Cells(5, 1).Resize(UBound(vTRACKER, 1), 2) = vTRACKER End With 'Application.ScreenUpdating = True '<~~no real need to do this if working in memory Debug.Print Timer OutPut = MsgBox("Update over!", vbOKOnly, "Update Status") dMASTER.RemoveAll: Set dMASTER = Nothing Set tracker = Nothing Set master = Nothing End Sub 

一旦将两个范围镜像到变体arrays中,就会创build一个字典,以充分利用其索引属性进行识别。

以上显示了跟踪器中的logging与43Klogging中的200Klogging的效率显着提高。

顺便说一句,我确实使用了一个.XLSB。 不是.XLSM。

使用ADODB也可能会更快。

 Dim filepath As String Dim conn As New ADODB.Connection Dim sql As String filepath = "c:\path\to\excel\file\book.xlsx" With conn .Provider = "Microsoft.ACE.OLEDB.12.0" .ConnectionString = "Data Source=""" & filepath & """;" & _ "Extended Properties=""Excel 12.0;HDR=No""" sql = _ "UPDATE [Sheet1$A2:B200000] AS master " & _ "INNER JOIN [Sheet2$] AS tracker ON master.F1 = tracker.F1 " & _ "SET master.F2 = tracker.F2" .Execute sql End With 

这适用于Office 2007.Office 2010(我在2013年还没有testing过)有一个安全措施,可以防止用SQL语句更新Excel电子表格 。 在这种情况下,您可以使用旧的Jet提供程序,该提供程序没有此安全措施。 此提供程序不支持.xlsx.xlsm.xlsb文件; 只有.xls

 With conn .Provider = "Microsoft.Jet.OLEDB.4.0" .ConnectionString = "Data Source=""" & filepath & """;" & _ "Extended Properties=""Excel 8.0;HDR=No""" 

或者,您可以将结果数据读取到断开连接的logging集并将logging集粘贴到原始工作表中:

 Dim filepath As String Dim conn As New ADODB.Connection Dim sql As String Dim rs As New ADODB.Recordset filepath = "c:\path\to\excel\file\book.xlsx" With conn .Provider = "Microsoft.ACE.OLEDB.12.0" .ConnectionString = "Data Source=""" & filepath & """;" & _ "Extended Properties=""Excel 12.0;HDR=No""" sql = _ "SELECT master.F1, IIF(tracker.F1 Is Not Null, tracker.F2, master.F2) " & _ "FROM [Sheet1$A2:B200000] AS master " & _ "LEFT JOIN [Sheet2$] AS tracker ON master.F1 = tracker.F1 " rs.CursorLocation = adUseClient rs.Open sql, conn, adOpenForwardOnly, adLockReadOnly conn.Close End With Workbooks.Open(filepath).Sheets("Sheet1").Cells(2, 1).CopyFromRecordset rs 

如果使用CopyFromRecordset,请记住,不能保证logging返回的顺序,如果master工作表中除列A和B以外还有其他数据,则可能会出现问题。要解决此问题,可以将这些数据包括在内logging集中的其他列也是如此。 或者,可以使用ORDER BY子句强制执行logging的顺序,然后在开始之前对工作表中的数据进行sorting。