根据它们的出现删除重复项

我想检查某个列(W)的重复项(出现次数存储在另一列(AZ)),并删除所有行这种方式:

  • 值在列中find两次 – 只删除包含该值的一行。
  • 在列中find更多的值 – 删除所有包含值的行。

我的代码工作得很好,但有时它不会删除所有重复项,因为它应该这样做。 任何想法改善?

编辑:更新的代码工作真的很好,除了它总是错过一个重复,并保持不被删除。

fin = ws.UsedRange.Rows.count For i = 2 To fin ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i)) Next i For j = fin To 2 Step -1 If ws.Range("AZ" & j).value > 2 Then ws.Range("AZ" & j).EntireRow.Delete fin = ws.UsedRange.Rows.count ElseIf ws.Range("AZ" & j).value = 2 Then Set rng = Range("W:W").Find(Range("W" & j).value, , xlValues, xlWhole, , xlNext) rngRow = rng.Row If rngRow <> j Then ws.Range("AZ" & rngRow) = "1" ws.Range("AZ" & j).EntireRow.Delete fin = ws.UsedRange.Rows.count Else MsgBox "Error at row " & rngRow End If End If Next j 

不需要在第二部分使用低效的第二个循环,只需要使用像这样的实时计数

 fin = ws.UsedRange.Rows.count For i = 2 To fin ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i)) Next i For j = fin To 2 Step -1 If ws.Range("AZ" & j).value > 2 OR Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & j)) = 2 Then ws.Range("AZ" & j).EntireRow.Delete End If Next j 

如果速度是一个问题,这里应该是一个更快的方法,因为它创build了一个要删除的行集合,然后删除它们。 由于除了实际的行删除之外,所有的事情都是在VBA中完成的,所以对工作表来callback用的次数要less得多。

例行程序可以加快,如内嵌评论所述。 如果仍然太慢,取决于工作表的大小,将整个工作表读入VBA数组可能是可行的; testing重复; 将结果写回到一个新的数组中,并将其写入工作表。 (如果你的工作表太大,这个方法可能会用尽内存)。

无论如何,我们既需要一个你必须重命名cPhrases的类模块 ,也需要一个常规模块

类模块

 Option Explicit Private pPhrase As String Private pCount As Long Private pRowNums As Collection Public Property Get Phrase() As String Phrase = pPhrase End Property Public Property Let Phrase(Value As String) pPhrase = Value End Property Public Property Get Count() As Long Count = pCount End Property Public Property Let Count(Value As Long) pCount = Value End Property Public Property Get RowNums() As Collection Set RowNums = pRowNums End Property Public Function ADDRowNum(Value As Long) pRowNums.Add Value End Function Private Sub Class_Initialize() Set pRowNums = New Collection End Sub 

常规模块

 Option Explicit Sub RemoveDuplicateRows() Dim wsSrc As Worksheet Dim vSrc As Variant Dim CP As cPhrases, colP As Collection, colRowNums As Collection Dim I As Long, K As Long Dim R As Range 'Data worksheet Set wsSrc = Worksheets("sheet1") 'Read original data into VBA array With wsSrc vSrc = .Range(.Cells(1, "W"), .Cells(.Rows.Count, "W").End(xlUp)) End With 'Collect list of items, counts and row numbers to delete 'Collection object will --> error when trying to add ' duplicate key. Use that error to increment the count Set colP = New Collection On Error Resume Next For I = 2 To UBound(vSrc, 1) Set CP = New cPhrases With CP .Phrase = vSrc(I, 1) .Count = 1 .ADDRowNum I colP.Add CP, CStr(.Phrase) Select Case Err.Number Case 457 'duplicate With colP(CStr(.Phrase)) .Count = .Count + 1 .ADDRowNum I End With Err.Clear Case Is <> 0 'some other error. Stop to debug Debug.Print "Error: " & Err.Number, Err.Description Stop End Select End With Next I On Error GoTo 0 'Rows to be deleted Set colRowNums = New Collection For I = 1 To colP.Count With colP(I) Select Case .Count Case 2 colRowNums.Add .RowNums(2) Case Is > 2 For K = 1 To .RowNums.Count colRowNums.Add .RowNums(K) Next K End Select End With Next I 'Revers Sort the collection of Row Numbers 'For speed, if necessary, could use ' faster sort routine RevCollBubbleSort colRowNums 'Delete Rows 'For speed, could create Unions of up to 30 rows at a time Application.ScreenUpdating = False With wsSrc For I = 1 To colRowNums.Count .Rows(colRowNums(I)).Delete Next I End With Application.ScreenUpdating = True End Sub 'Could use faster sort routine if necessary Sub RevCollBubbleSort(TempCol As Collection) Dim I As Long Dim NoExchanges As Boolean ' Loop until no more "exchanges" are made. Do NoExchanges = True ' Loop through each element in the array. For I = 1 To TempCol.Count - 1 ' If the element is less than the element ' following it, exchange the two elements. If TempCol(I) < TempCol(I + 1) Then NoExchanges = False TempCol.Add TempCol(I), after:=I + 1 TempCol.Remove I End If Next I Loop While Not (NoExchanges) End Sub 

虽然你的逻辑基本上是健全的,但这个方法并不是最有效的。 AutoFilter方法可以快速删除大于2的所有计数,然后Range.RemoveDuplicates¹方法可以快速删除列W中仍然包含重复值的行之一。

 Dim r As Long, c As Long With ws If .AutoFilterMode Then .AutoFilterMode = False r = .Cells.SpecialCells(xlLastCell).Row c = Application.Max(52, .Cells.SpecialCells(xlLastCell).Column) With .Range("A1", .Cells(r, c)) '.UsedRange With .Columns(52) If IsEmpty(.Cells(1, 1)) Then .Cells(1, 1) = "count" With .Resize(.Rows.Count - 1, 1).Offset(1, 0) .Cells.FormulaR1C1 = "=COUNTIF(C[-29], RC[-29])" .Cells = .Cells.Value End With .AutoFilter field:=1, Criteria1:=">2" With .Resize(.Rows.Count - 1, 1).Offset(1, 0) If CBool(Application.Subtotal(103, .Cells)) Then .SpecialCells(xlCellTypeVisible).EntireRow.Delete End If End With .AutoFilter End With .RemoveDuplicates Columns:=23, Header:=xlYes End With End With 

当你在列AZ中重写计数值时,你可能会重写3个计数到2等。


¹Range.RemoveDuplicates方法从下往上删除重复的行。

Interesting Posts