Faster data cleanup in VBA

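I am writing a Sub to help clean up a large data set that I have to process every week. The data is a list of products, with one total row per product plus one row for each size of that product, so it looks like this: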

Product 1 All

Product 1 Small

Product 1 Medium

Product 2 All

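I only want to keep a product if it meets one of three criteria, but if it does, I want to keep all of the rows for that product. If a product meets none of the three criteria, I want to delete every row for that product.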

I wrote the code below. It seems to work, but it takes a long time.

    For i = Data.Cells(Rows.Count, "B").End(xlUp).Row To 3 Step -1
        If Data.Range("F" & i) = "All" Then
            TY_Sales = Data.Range("K" & i)
            LY_Sales = Data.Range("L" & i)
            TY_Stock = Data.Range("O" & i)
            Sales_Stock = TY_Sales + LY_Sales + TY_Stock
            If Sales_Stock <= 0 Then
                vendor_ref = Data.Range("E" & i)
                For j = i + 10 To i Step -1
                    If Data.Range("E" & j) = vendor_ref Then
                        Data.Range("E" & j).EntireRow.Delete
                    End If
                Next
            End If
        End If
    Next

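My original data set is 17k rows, so this takes a long time. I know I am looping over the data again and again, but I don't know a better way to make it faster. Any help is greatly appreciated.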

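My general suggestion is to create a dictionary, which is a collection of values that can be accessed either by an ordered index number or by a named key. With this dictionary, first loop through all of the data rows. Look at column E: does the name in "E" already exist in the dictionary? If not, add it. Then take the dictionary ID (either newly created or created on an earlier row) and add that row's values from columns K, L and O to the dictionary entry.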

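Then, once the dictionary holds every name along with the summed values of columns K, L and O, go back through all of the rows (from the bottom up). For that row's index ID, is the value in the dictionary entry > 0? If so, delete the row.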
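The two passes described above can be sketched with a late-bound Scripting.Dictionary. This is only a minimal sketch, not tested against the asker's workbook. It assumes Excel on Windows (where `CreateObject("Scripting.Dictionary")` is available), data on `Sheets(1)` starting at row 3, and the column layout used in the question's code. The name `Delete_With_Dictionary` is made up for illustration. Late binding avoids setting a reference in the VBA editor, but it still depends on the scripting library being present.

```vba
Sub Delete_With_Dictionary()
    ' Sketch only. Assumptions: Windows Excel, data on Sheets(1), rows 3..LastRow,
    ' vendor name in column E, "All" marker in F, amounts in K, L and O.
    Dim data As Worksheet
    Dim totals As Object            ' late-bound Scripting.Dictionary
    Dim i As Long, LastRow As Long
    Dim vendor As String

    Set data = Sheets(1)
    Set totals = CreateObject("Scripting.Dictionary")
    LastRow = data.Cells(data.Rows.Count, "B").End(xlUp).Row

    ' Pass 1: accumulate K + L + O per vendor name.
    For i = 3 To LastRow
        If data.Range("F" & i).Value = "All" Then
            vendor = CStr(data.Range("E" & i).Value)
            If Not totals.Exists(vendor) Then totals.Add vendor, 0
            totals(vendor) = totals(vendor) + data.Range("K" & i).Value _
                + data.Range("L" & i).Value + data.Range("O" & i).Value
        End If
    Next i

    ' Pass 2: bottom-up delete, using the same "> 0" test described above.
    For i = LastRow To 3 Step -1
        If data.Range("F" & i).Value = "All" Then
            vendor = CStr(data.Range("E" & i).Value)
            If totals(vendor) > 0 Then data.Rows(i).Delete
        End If
    Next i
End Sub
```

Looking a name up by key replaces the inner search through previously seen names, so each row is handled once per pass.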

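To complicate matters, however, using a dictionary requires a separate (free, Microsoft-supported) scripting library, the Microsoft Scripting Runtime. Instead, we will build our own. The only consequence is that each time we check a new row for a unique name, we have to loop through the list of unique names and compare each one individually, rather than using the name as an index. See my modified version of your code below, with comments on the changes. Note that I declare all of the variables at the start, including setting data = Sheets(1), which may differ from your sub.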

    Sub Delete_Unnecessary_Rows()
        Dim i As Long, j As Long    'Long rather than Integer, so row numbers above 32,767 do not overflow
        Dim TY_Sales As Long, LY_Sales As Long, TY_Stock As Long, Sales_Stock As Long, LastRow As Long
        Dim data As Worksheet
        Dim vendor_ref As String
        Dim VendorStringArray() As String   'Holds all unique vendor names
        Dim VendorNumArray() As Long        'Holds the Sales_Stock total for each unique vendor name
        Dim VendorRowIdentifier() As Long   'For each row, holds the index of that row's unique vendor name
        Dim UniqueNameCounter As Long       'Number of confirmed unique names
        Dim UniqueCheck As Boolean

        Set data = Sheets(1)
        LastRow = data.Cells(data.Rows.Count, "B").End(xlUp).Row

        'Size the arrays for the largest possible number of unique names
        ReDim VendorStringArray(3 To LastRow)
        ReDim VendorNumArray(3 To LastRow)
        ReDim VendorRowIdentifier(3 To LastRow)

        For i = 3 To LastRow    'First pass: build the list of unique names and their totals
            If data.Range("F" & i).Value = "All" Then    'This is a data row to be searched for a unique vendor name
                UniqueCheck = True    'Stays True until a duplicate is found in an earlier row
                vendor_ref = data.Range("E" & i).Value    'Vendor name for this row
                TY_Sales = data.Range("K" & i).Value
                LY_Sales = data.Range("L" & i).Value
                TY_Stock = data.Range("O" & i).Value
                Sales_Stock = TY_Sales + LY_Sales + TY_Stock

                If UniqueNameCounter > 0 Then    'At least one unique name exists, so look for a match
                    For j = UniqueNameCounter To 1 Step -1    'Work backwards through the names found so far
                        If vendor_ref = VendorStringArray(j + 2) Then
                            UniqueCheck = False    'A match has been found
                            VendorRowIdentifier(i) = j + 2    'Link this row to the index of the matched vendor name
                            VendorNumArray(j + 2) = VendorNumArray(j + 2) + Sales_Stock    'Add this row's value to that vendor's total
                            Exit For    'Stop searching once a match is found
                        End If
                    Next j
                End If

                If UniqueCheck Then    'No match was found for this name in an earlier row
                    UniqueNameCounter = UniqueNameCounter + 1
                    VendorStringArray(UniqueNameCounter + 2) = vendor_ref    'Store the name for later comparisons; indexes start at 3, not 1
                    VendorRowIdentifier(i) = UniqueNameCounter + 2    'Link this row to the index of the new vendor name
                    VendorNumArray(UniqueNameCounter + 2) = Sales_Stock
                End If
            End If
        Next i

        For i = LastRow To 3 Step -1    'Second pass, bottom-up: delete rows based on the vendor totals
            If data.Range("F" & i).Value = "All" Then
                'Look up the total for this row's vendor. Note: the code in the question
                'deletes when the total is <= 0; adjust this test if that is the rule you need.
                If VendorNumArray(VendorRowIdentifier(i)) > 0 Then
                    data.Rows(i).Delete
                End If
            End If
        Next i
    End Sub

As Dr. Trey suggested, you can also turn off automatic updating and similar features while the code runs, which saves further processing time.
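A minimal sketch of that idea follows. It assumes the `Delete_Unnecessary_Rows` sub above is in the same project. The wrapper name `Run_Cleanup_Fast` is made up for illustration, and which settings actually matter will depend on the workbook.

```vba
Sub Run_Cleanup_Fast()
    ' Hypothetical wrapper: switches off screen redraw, events and automatic
    ' recalculation while the cleanup runs, then restores them.
    Dim prevCalc As XlCalculation
    prevCalc = Application.Calculation     ' remember the user's calculation mode

    On Error GoTo CleanUp
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual

    Delete_Unnecessary_Rows                ' the sub defined above

CleanUp:
    ' Runs on both the normal and the error path, so settings are always restored.
    Application.Calculation = prevCalc
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    If Err.Number <> 0 Then MsgBox "Cleanup stopped: " & Err.Description
End Sub
```

Restoring the settings in a shared exit path matters because a run-time error would otherwise leave Excel with screen updating and events switched off.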

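Here is another approach. Instead of manually looping through the matching vendor names and checking values, this method uses the native Excel SUMIFS function on each row to see whether any matching rows have values. Each row is then marked TRUE or FALSE in a Boolean array. A second loop then runs and deletes the rows marked TRUE. This method only loops through all of the rows twice, although calling SUMIFS may be more computationally intensive than the manual loop above. Even so, I believe this method is easier to understand.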

Disclosure: I have tested both methods and confirmed that they work, but I am not sure how their processing times compare.

    Sub CheckDelete_WithSumifs()
        Dim i As Long    'Long rather than Integer, so row numbers above 32,767 do not overflow
        Dim TY_Sales As Long, LY_Sales As Long, TY_Stock As Long, Sales_Stock As Long, LastRow As Long
        Dim data As Worksheet
        Dim Vendor_Ref As String
        Dim DeleteRowCheck() As Boolean

        Set data = Sheets(1)
        LastRow = data.Cells(data.Rows.Count, "B").End(xlUp).Row
        ReDim DeleteRowCheck(3 To LastRow)    'One TRUE/FALSE flag per data row

        For i = LastRow To 3 Step -1    'First pass: decide for each row whether to delete it
            If data.Range("F" & i).Value = "All" Then    'Only check rows with the word All in column F
                Vendor_Ref = data.Range("E" & i).Value    'Vendor name for this row
                TY_Sales = GrabSumifs(data.Range("K:K"), Vendor_Ref, data)    'See function below
                LY_Sales = GrabSumifs(data.Range("L:L"), Vendor_Ref, data)
                TY_Stock = GrabSumifs(data.Range("O:O"), Vendor_Ref, data)
                Sales_Stock = TY_Sales + LY_Sales + TY_Stock    'Total of columns K, L and O for that vendor name

                'Flag used in the loop below. Note: the code in the question deletes
                'when the total is <= 0; adjust this test if that is the rule you need.
                DeleteRowCheck(i) = (Sales_Stock > 0)
            End If
        Next i

        For i = LastRow To 3 Step -1    'Second pass, bottom-up: delete the rows marked TRUE
            If DeleteRowCheck(i) Then
                data.Rows(i).Delete
            End If
        Next i
    End Sub

    Function GrabSumifs(ByVal SumRange As Range, ByVal Vendor_Ref As String, ByVal data As Worksheet) As Long
        'Uses Excel's native SUMIFS to total the given column over rows that have
        'an identical vendor name in column E and "All" in column F.
        GrabSumifs = Application.WorksheetFunction.SumIfs(SumRange, data.Range("F:F"), "All", data.Range("E:E"), Vendor_Ref)
    End Function
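To get a rough idea of the processing time of either sub, one option is VBA's built-in `Timer` function. This is a minimal sketch; the name `Time_Macro` is made up for illustration. Because both subs delete rows, each timing run should start from a fresh copy of the data.

```vba
Sub Time_Macro()
    ' Hypothetical helper: times one run of a macro and prints the result
    ' to the Immediate window (Ctrl+G in the VBA editor).
    Dim t0 As Single
    t0 = Timer                                  ' seconds since midnight
    Application.Run "CheckDelete_WithSumifs"    ' or "Delete_Unnecessary_Rows"
    Debug.Print "Elapsed seconds: "; Format(Timer - t0, "0.00")
End Sub
```

`Timer` resets at midnight, so a run that crosses midnight would report a misleading value.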