Excel VBA删除重复重复VSfilter

我有一个工作表,其中包含约8000行已被过滤。 我试图从工作表列中删除重复项的值的集合。 阅读这里的post有两种方法来做到这一点。 如果值不在新集合中,则循环收集并复制到新集合。
或者将列中的数据复制到临时电子表格中,过滤并将数据复制到另一列,然后将其添加到集合中。

在处理大量数据时,复制filter具有最佳的性能,但是由于必须创build新的工作表,所以复制过程非常笨重。

我没有看到它,但是有没有办法做内存中的复制filter,而不是创build一个工作表来做到这一点?

重述:

Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection) Dim RowIndex As Long For RowIndex = 1 To GetLastRow(Ws) If CollValues.Count = 0 Then CollValues.Add (Ws.Cells(RowIndex, Column).Value) Else If IsInCollection(CollValues, Ws.Cells(RowIndex, Column).Value) = False Then CollValues.Add (Ws.Cells(RowIndex, Column).Value) End If End If Next RowIndex End Sub 

过滤和复制:

 Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection) Dim rowLast As Long Dim c As Range Dim tmpWS As Worksheet Dim tmpWsName As String tmpWsName = "TempWS" Call DeleteWs(TsWb, tmpWsName) Set tmpWS = TsWb.Sheets.Add() tmpWS.Name = tmpWsName rowLast = GetLastRow(Ws) Ws.Range(Ws.Cells(1, Column), Ws.Cells(rowLast, Column)).Copy tmpWS.Range("A1").PasteSpecial rowLast = GetLastRow(tmpWS) tmpWS.Range(tmpWS.Cells(1, 1), tmpWS.Cells(rowLast, 1)).AdvancedFilter _ Action:=xlFilterCopy, _ CopyToRange:=tmpWS.Range("B1"), _ Unique:=True rowLast = GetLastRow(tmpWS) For Each c In tmpWS.Range(tmpWS.Cells(1, 2), tmpWS.Cells(rowLast, 2)) If Len(c.value) > 0 Then CollValues.Add (c.value) End If Next c Call DeleteWs(TsWb, tmpWsName) End Sub 

我不知道为什么它必须是一个集合,但要快速获得一个数组没有双打(过滤列表)的数值,你可以这样做:(非常接近你的第一个例子)

 Function GetColVal(Ws As Worksheet, Column As Long) As Variant Dim runner As Variant, outputVal() As Variant, i As Long ReDim outputVal(Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible).Count) For Each runner In Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible) If i = 0 Then outputVal(0) = runner.Value: i = 1 Else If IsError(Application.Match(runner.Value, outputVal, 0)) Then outputVal(i) = runner.Value: i = i + 1 End If Next ReDim Preserve outputVal(i - 1) GetColVal= outputVal End Function 

Application.Match是VBA中最快的函数之一,而IsInCollection可以是非常缓慢的…更好地运行一个For Each ...循环来添加一个集合中的所有东西,而不是检查集合…

 Dim a As Variant For Each a in GetColVal(Worksheets("SheetX"),7) MyCollection.Add a Next 

应该比你的例子快得多…我仍然build议不要使用集合,特别是如果你只是使用值…如果可能,更好地使用GetColVal
variantVariable = GetColVal(Worksheets("SheetX"),7)然后使用variablesvariables为你想做的任何事情(你也可以直接粘贴到表单中的某个地方)

一个简单的输出到工作表将是这样的:

 Dim a As Variant a = GetColVal(Worksheets("Sheet1"),13) 'values from sheet1 column M 'pasting in one row (starting at A1 in Sheet2) ThisWorkbook.Sheets("Sheet2").Range(Cells(1, 1), Cells(1, ubound(a) + 1)).value = a 'pasting in one column (starting at C5 in Sheet4) ThisWorkbook.Sheets("Sheet4").Range(Cells(5, 3), Cells(ubound(a) + 5, 3)).value = Application.Transpose(a) 

编辑

展示不同的东西:

 Function GetColumnValues(ws As Worksheet, Column As Long) As Range With ws Dim srcRng As Range, outRng As Range, runRng1 As Range, runRng2 As Range, dBool As Boolean Set srcRng = .Range(.Cells(1, Column), .Cells(GetLastRow(ws), Column)).SpecialCells(xlCellTypeVisible) For Each runRng1 In a If outRng Is Nothing Then Set outRng = runRng1 For Each runRng2 In outRng If Intersect(runRng1, runRng2) Is Nothing Then If runRng2.Value = runRng1.Value Then dBool = True: Exit For End If Next If dBool Then dBool = False Else Set outRng = Union(outRng, runRng1) Next End With Set GetColumnValues = outRng End Function 

有了这个function,您将得到一系列可以select或复制到另一个位置(格式化和其他所有内容)的单元格。 仍然可以使用For Each ...将所有元素添加到集合中。 我也没有使用Match来避免“Len> 255” – 错误

是的,只需创build数组,然后检查数组,然后将结果传回工作表。 我个人喜欢在内存中而不是通过应用程序IDE做事情。

它速度要快得多(特别是数万行),您不必担心屏幕刷新,或让用户不知道如何快速移动所有内容。 我通常处理内存中的所有内容,将其交回,然后激活我希望用户看到的工作表。

 dim set1Array() as String dim set2Array() as String dim set1Rows as Long dim set2Rows as Long dim lngX as Long dim lngY as Long dim blnDebug as Boolean; blnDebug = true ' flag for debugging ' get count of rows so we know how big to make the arrays set1Rows = GetLastRow(Ws1) set2Rows = GetLastRow(Ws2) ' set arrays to the proper size redim set1Rows(set1Rows - 1, 1)' 1 represents 2 columns since it's 0 based. the second column is a flag for duplicated. redim set2Rows(set2Rows - 1, 0)' 0 represents 1 column since it's 0 based ' load the arrays with the sheet data for lngX = 1 to set1Rows set1Rows(lngX - 1, 0) = Worksheets("Sheet1").range("A" & lngX).Text next lngX for lngX = 1 to set2Rows set2Rows(lngX - 1, 0) = Worksheets("Sheet2").range("A" & lngX).Text next lngX ' I like to do a debug callout here to see what I got to make sure that I am good to go with the dataset if blnDebug then for lngX = 0 to Ubound(set1Rows) debug.print "set1Rows(" & lngX & ") - col1: " & set1Rows(lngX, 0) next lngX for lngX = 0 to Ubound(set2Rows) debug.print "set2Rows(" & lngX & ") - col1: " & set2Rows(lngX, 0) next lngX end if ' now do your comparison for lngX = 0 to Ubound(set1Rows) for lngY = 0 to Ubound(set2Rows) if set1Rows(lngX, 0) = set2Rows(lngY, 0) then set1Rows(lngX, 1) = "1" end if next lngY next lngX ' now your duplicates are flagged in the set1Rows array for lngX = 0 to Ubound(set1Rows) if set1Rows(lngX, 1) = "1" then ' code for duplicated else ' code for unique end if next lngX