VBA,高级筛选和删除重复

列A中列出了不同的path。我在B和C中列出了一些细节。

我怎么能在一个新的表:1)拉每个独特的path,2)为每个path编译B * C的值,并删除重复。 3)在最后一行完成后重复下一个path。

我确实有一个错误的macros观,但为了简洁和准确,我不会发布。 除非有人想阅读,否则请申请

在这里输入图像说明

任何帮助将不胜感激。

这是我所拥有的(我理解它的长期,我会试着去清理它):

Sub FileDetail() 'Does not fill down, go to bottom to unleased fill down 'Skips unreadable files 'This Macro retrieves data from files picked. The data is based on header. Data is also filtered for unique values. 'You must make sure headers are in the first row and delimted. Dim wb As Workbook, fileNames As Object, errCheck As Boolean Dim ws As Worksheet, wks As Worksheet, wksSummary As Worksheet Dim y As Range, intRow As Long, i As Integer Dim r As Range, lr As Long, myrg As Range, z As Range Dim boolWritten As Boolean, lngNextRow As Long Dim intColNode As Integer, intColScenario As Integer Dim intColNext As Integer, lngStartRow As Long Dim lngLastNode As Long, lngLastScen As Long Dim intColinstrument As Integer, lngLastinstrument As Long 'Skipped worksheet for file names Dim wksSkipped As Worksheet Set wksSkipped = ThisWorkbook.Worksheets("Skipped") ' Turn off screen updating and automatic calculation With Application .ScreenUpdating = False .Calculation = xlCalculationManual End With ' Create a new worksheet, if required On Error Resume Next Set wksSummary = ActiveWorkbook.Worksheets("Unique data") On Error GoTo 0 If wksSummary Is Nothing Then Set wksSummary = ActiveWorkbook.Worksheets.Add(After:=ActiveWorkbook.Worksheets(ActiveWorkbook.Worksheets.Count)) wksSummary.Name = "Unique data" End If ' Set the initial output range, and assign column headers With wksSummary Set y = .Cells(.Rows.Count, 3).End(xlUp).Offset(1, 0) Set r = y.Offset(0, 1) Set z = y.Offset(0, -2) lngStartRow = y.Row .Range("A1:E1").Value = Array("File Name", "Sheet Name", "Node", "Book", "Instrument") End With 'get user input for files to search Set fileNames = CreateObject("Scripting.Dictionary") errCheck = UserInput.FileDialogDictionary(fileNames) If errCheck Then Exit Sub End If ''' For Each Key In fileNames 'loop through the dictionary On Error Resume Next Set wb = Workbooks.Open(fileNames(Key)) If Err.Number <> 0 Then Set wb = Nothing ' or set a boolean error flag End If On Error GoTo 0 ' or your custom error handler If wb Is Nothing Then wksSkipped.Cells(wksSkipped.Cells(wksSkipped.Rows.Count, "A").End(xlUp).Row + 1, 1) = fileNames(Key) Else Debug.Print "Successfully loaded " & fileNames(Key) wb.Application.Visible = False 'make it not visible ' more working with wb ' Check each sheet in turn For Each ws In ActiveWorkbook.Worksheets With ws ' Only action the sheet if it's not the 'Unique data' sheet If .Name <> wksSummary.Name Then boolWritten = False ''''''''''''''''''testing additional column..trouble here ' Find the Anchor Date intColScenario = 0 On Error Resume Next intColScenario = WorksheetFunction.Match("instrument.instrumentType", .Rows(1), 0) On Error GoTo 0 If intColScenario > 0 Then ' Only action if there is data in column E If Application.WorksheetFunction.CountA(.Columns(intColScenario)) > 1 Then lr = .Cells(.Rows.Count, intColScenario).End(xlUp).Row ' Copy unique values from the formula column to the 'Unique data' sheet, and write sheet & file details .Range(.Cells(1, intColScenario), .Cells(lr, intColScenario)).AdvancedFilter xlFilterCopy, , r, True r.Offset(0, -2).Value = ws.Name r.Offset(0, -3).Value = ws.Parent.Name ' Delete the column header copied to the list r.Delete Shift:=xlUp boolWritten = True End If End If ''''''''''''''''''''''''''''''''''''below is working''''''''''''''''''''''' ' Find the Desk column intColNode = 0 On Error Resume Next intColNode = WorksheetFunction.Match("book.reportingLine.pathName", .Rows(1), 0) On Error GoTo 0 If intColNode > 0 Then ' Only action if there is data in column A If Application.WorksheetFunction.CountA(.Columns(intColNode)) > 1 Then lr = .Cells(.Rows.Count, intColNode).End(xlUp).Row ' Copy unique values from column A to the 'Unique data' sheet, and write sheet & file details (if not already written) .Range(.Cells(1, intColNode), .Cells(lr, intColNode)).AdvancedFilter xlFilterCopy, , y, True If Not boolWritten Then y.Offset(0, -1).Value = ws.Name y.Offset(0, -2).Value = ws.Parent.Name End If ' Delete the column header copied to the list y.Delete Shift:=xlUp End If End If ' Find the Intrument intColinstrument = 0 On Error Resume Next intColinstrument = WorksheetFunction.Match("instrument.instrumentType", .Rows(1), 0) On Error GoTo 0 If intColinstrument > 0 Then ' Only action if there is data in column A If Application.WorksheetFunction.CountA(.Columns(intColinstrument)) > 1 Then lr = .Cells(.Rows.Count, intColinstrument).End(xlUp).Row ' Copy unique values from column A to the 'Unique data' sheet, and write sheet & file details (if not already written) .Range(.Cells(1, intColinstrument), .Cells(lr, intColinstrument)).AdvancedFilter xlFilterCopy, , z, True If Not boolWritten Then z.Offset(0, -3).Value = ws.Name z.Offset(0, -4).Value = ws.Parent.Name End If ' Delete the column header copied to the list z.Delete Shift:=xlUp End If End If ' Identify the next row, based on the most rows used in columns C & D lngLastNode = wksSummary.Cells(wksSummary.Rows.Count, 3).End(xlUp).Row lngLastScen = wksSummary.Cells(wksSummary.Rows.Count, 4).End(xlUp).Row lngLastinstrument = wksSummary.Cells(wksSummary.Rows.Count, 5).End(xlUp).Row lngNextRow = WorksheetFunction.Max(lngLastNode, lngLastScen) + 1 If (lngNextRow - lngStartRow) > 1 Then ' Fill down the workbook and sheet names z.Resize(lngNextRow - lngStartRow, 2).FillDown ''''''''Optional if you want headers to be filled down. 'If (lngNextRow - lngLastNode) > 1 Then ' Fill down the last Node value 'wksSummary.Range(wksSummary.Cells(lngLastNode, 3), wksSummary.Cells(lngNextRow - 1, 3)).FillDown 'End If 'If (lngNextRow - lngLastScen) > 1 Then ' Fill down the last Scenario value 'wksSummary.Range(wksSummary.Cells(lngLastScen, 4), wksSummary.Cells(lngNextRow - 1, 4)).FillDown 'End If End If Set y = wksSummary.Cells(lngNextRow, 3) Set r = y.Offset(0, 1) Set z = y.Offset(0, -2) lngStartRow = y.Row End If End With Next ws wb.Close savechanges:=False 'close the workbook do not save Set wb = Nothing 'release the object End If Next 'End of the fileNames loop Set fileNames = Nothing ' Autofit column widths of the report wksSummary.Range("A1:E1").EntireColumn.AutoFit ' Reset system settings With Application .Calculation = xlCalculationAutomatic .ScreenUpdating = True .Visible = True End With End Sub 

所以这个代码获取文件名,表单名称和列指定的数据。

1)但是,我无法添加额外的列。 (我目前得到2提取列),也

2)我很难把它放在一个列的基础上的格式。 例如,它会给我每个path的独特价值,但不是每项运动的独特价值。

编辑包括数据(我也想包括第四和第五列,但为了简单起见保持为3):

 +-------------------------------+------------+--------------+ | path | sport | Teams | +-------------------------------+------------+--------------+ | stack/over/flow/larrybird | basketball | celtics | +-------------------------------+------------+--------------+ | stack/over/flow/michaeljordan | basketball | bulls | +-------------------------------+------------+--------------+ | stack/over/flow/tigerwoods | golf | pga | +-------------------------------+------------+--------------+ | stack/over/flow/josebautista | baseball | bluejays | +-------------------------------+------------+--------------+ | stack/over/flow/jordanspeith | golf | pga | +-------------------------------+------------+--------------+ | stack/over/flow/kevinlove | basketball | timberwolves | +-------------------------------+------------+--------------+ | stack/over/flow/lebronjames | basketball | cavs | +-------------------------------+------------+--------------+ | stack/over/flow/stephencurry | basketball | warriors | +-------------------------------+------------+--------------+ | stack/over/flow/larrybird | baseball | redsox | +-------------------------------+------------+--------------+ | stack/over/flow/michaeljordan | baseball | whitesox | +-------------------------------+------------+--------------+ | stack/over/flow/michaeljordan | chess | knight | +-------------------------------+------------+--------------+ | stack/over/flow/michaeljordan | basketball | hornets | +-------------------------------+------------+--------------+ | stack/over/flow/kevinlove | basketball | cavs | +-------------------------------+------------+--------------+ | stack/over/flow/tigerwoods | golf | pga | +-------------------------------+------------+--------------+ 

和预期的结果(我包括在这填写)

 +-------------------------------+------------+--------------+ | path | sport | teams | +-------------------------------+------------+--------------+ | stack/over/flow/larrybird | basketball | celtics | +-------------------------------+------------+--------------+ | | baseball | red sox | +-------------------------------+------------+--------------+ | stack/over/flow/tigerwoods | golf | pga | +-------------------------------+------------+--------------+ | stack/over/flow/michaeljordan | basketball | bulls | +-------------------------------+------------+--------------+ | | | hornets | +-------------------------------+------------+--------------+ | | baseball | whitesox | +-------------------------------+------------+--------------+ | | chess | knight | +-------------------------------+------------+--------------+ | stack/over/flow/kevinlove | basketball | timberwolves | +-------------------------------+------------+--------------+ | | | cavs | +-------------------------------+------------+--------------+ | stack/over/flow/josebautista | baseball | bluejays | +-------------------------------+------------+--------------+ 

对于第三(也是第四和第五)列获得唯一值似乎是一个问题。

如果您不介意将结果sorting,而不是以原始顺序sorting,则以下代码将执行此操作。 它应该“自动适应”任何数量的列。

(如果您需要原始顺序的结果,我会使用集合或字典和用户定义对象的方法)

您的数据应该以A1开头(第一行是列标签),您可以在代码中看到您为源和结果数据定义工作表的位置。

由于大部分“工作”是在VBA数组中完成的,而不是在工作表上,所以它应该运行得非常快。

在这里输入图像说明

 Option Explicit Sub SortFormat() Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range Dim vRes As Variant Dim R As Range, C As Range Dim V As Variant Dim I As Long, J As Long 'Set source and results worksheets, ranges Set wsSrc = Worksheets("Sheet1") Set wsRes = Worksheets("Sheet2") wsRes.Cells.Clear Set rRes = wsRes.Cells(1, 1) Application.ScreenUpdating = False 'Copy source data to results worksheet Dim LastRow As Long, LastCol As Long With wsSrc LastRow = .Cells(.Rows.Count, 1).End(xlUp).Row LastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column Set R = .Range(.Cells(1, 1), .Cells(LastRow, LastCol)) R.Copy rRes Application.CutCopyMode = False End With 'Go to Results sheet With wsRes .Select .UsedRange.EntireColumn.AutoFit End With rRes.Select 'Sort the data With wsRes.Sort.SortFields .Clear Set R = wsRes.UsedRange.Columns For Each C In R .Add Key:=C, SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal Next C End With With wsRes.Sort .SetRange R .Header = xlYes .MatchCase = False .Orientation = xlTopToBottom .Apply End With 'Remove any completely duplicated rows 'Create array of columns ReDim V(0 To R.Columns.Count - 1) For I = 0 To UBound(V) V(I) = I + 1 Next I R.RemoveDuplicates Columns:=(V), Header:=xlYes 'Remove Duplicated items in each row 'Work in VBA array for more speed vRes = R For I = UBound(vRes, 1) To 3 Step -1 If vRes(I, 1) = vRes(I - 1, 1) Then vRes(I, 1) = "" For J = 2 To UBound(vRes, 2) If vRes(I, J) = vRes(I - 1, J) And _ vRes(I, J - 1) = "" Then vRes(I, J) = "" Next J Next I R = vRes Application.ScreenUpdating = True End Sub 

简单的方法是复制整个范围,对其进行sorting,然后进行一些计算:

 Sub Macro1() Application.ScreenUpdating = False Dim str As String With Sheet1 str = .Range(.Cells(.Rows.Count, 1).End(xlUp), .Cells(1, 3)).Address .Range(str).Copy Sheet2.Cells(1, 1) End With Application.CutCopyMode = False With Sheet2 .Activate Dim str2 As String str2 = .Range(str).Offset(1).Resize(.Range(str).Rows.Count - 1).Address .Range(str2).Value = Evaluate("if(" & str2 & "="""",-1E+99," & str2 & ")") .Sort.SortFields.Clear .Sort.SortFields.Add .Range(str).Offset(1).Resize(, 1), 0, 1, , 0 .Sort.SortFields.Add .Range(str).Offset(1, 1).Resize(, 1), 0, 1, , 0 .Sort.SortFields.Add .Range(str).Offset(1, 2).Resize(, 1), 0, 1, , 0 .Sort.SetRange .Range(str).Offset(1) .Sort.Header = 2 .Sort.Apply .Range(str2).Value = Evaluate("if(" & str2 & "=-1E+99,""""," & str2 & ")") Dim val As Variant, i As Long, rng2 As Range val = .Range(str).Value Set rng2 = .Range(str).Offset(.Range(str).Rows.Count).Resize(1) For i = 3 To UBound(val) If val(i - 1, 1) = val(i, 1) And val(i - 1, 2) = val(i, 2) And val(i - 1, 3) = val(i, 3) Then Set rng2 = Union(rng2, .Range(str).Rows(i)) Next i = .Range(str).Rows.Count - rng2.Rows.Count rng2.EntireRow.Delete xlShiftUp With .Range(str).Offset(1).Resize(i - 1, 1) .Value = Evaluate("if(" & .Address & "=" & .Offset(-1).Address & ",""""," & .Address & ")") With .Offset(, 1) .Value = Evaluate("if((" & .Address & "=" & .Offset(-1).Address & ")*(" & .Offset(, -1).Address & "=""""),""""," & .Address & ")") End With End With End With End Sub 

通过电话完成,可能包含错误!
现在改变了很多,请复制整个代码并再次testing。

编辑

好的,完全不同的解决scheme。 应该是快速的,但可能不是很清楚它的工作方式:P

 Sub Macro2() Dim inVal As Variant, outVal() As Variant, orderArr() As Variant Dim startRng As Range Dim i As Long, j As Long, k As Long, iCount As Long Set startRng = Sheet1.Range("A2:C2") 'upmost row-range of the range to be copied (exclude headers!) With startRng.Parent inVal = .Range(startRng, .Cells(.Rows.Count, startRng.Column).End(xlUp)).Value End With ReDim orderArr(1 To UBound(inVal)) For i = 1 To UBound(inVal) iCount = 1 For j = 1 To UBound(inVal) For k = 1 To UBound(inVal, 2) If StrComp(inVal(i, k), inVal(j, k), 1) = 1 Then iCount = iCount + 1 If StrComp(inVal(i, k), inVal(j, k), 1) <> 0 Then Exit For Next Next orderArr(i) = iCount Next k = 1 ReDim outVal(1 To UBound(inVal, 2), 1 To UBound(inVal)) For i = 0 To Application.Max(orderArr) If IsNumeric(Application.Match(i, orderArr, 0)) Then iCount = Application.Match(i, orderArr, 0) For j = 1 To UBound(inVal, 2) outVal(j, k) = inVal(iCount, j) Next k = k + 1 End If Next ReDim Preserve outVal(1 To UBound(inVal, 2), 1 To k - 1) For i = 1 To UBound(outVal) For j = UBound(outVal, 2) To 2 Step -1 If outVal(i, j - 1) = outVal(i, j) Then If i = 1 Then outVal(i, j) = "" ElseIf outVal(i - 1, j) = "" Then outVal(i, j) = "" End If End If Next Next 'upper left cell of the output-range Sheet2.Range("A2").Resize(UBound(outVal, 2), UBound(outVal)).Value = Application.Transpose(outVal) End Sub 

随意设置起始范围( Sheet1.Range("A2:C2") )为Selection ,然后只需select范围并启动macros。 可以使用任何大小(虽然很大的范围可能冻结excel一段时间)。

一如既往:如果您有任何疑问,只需要问:)

一个有效的解决scheme是:

  • 用Range.Copy复制值
  • 然后使用Range.Sort对行进行Range.Sort
  • 然后用Range.RemoveDuplicates删除重复的行
  • 最后用循环删除重复的分支

此过程将删除重复的行和格式为树视图:

 Sub RemoveDuplicates() Dim rgSource As Range, rgTarget As Range, data(), r&, c& ' define the source, the target and the number of columns Const columnCount = 3 Set rgSource = Range("Sheet1!A3") Set rgTarget = Range("Sheet1!F3") ' copy the values to the targeted range Set rgSource = rgSource.Resize(rgSource.End(xlDown).Row - rgSource.Row + 1, columnCount) Set rgTarget = rgTarget.Resize(rgSource.Rows.Count, columnCount) rgSource.Copy rgTarget ' sort the rows on each column For c = columnCount To 1 Step -1 rgTarget.Sort rgTarget.Columns(c) Next ' build the array of columns for RemoveDuplicates Dim rdColumns(0 To columnCount - 1) For c = 1 To columnCount: rdColumns(c - 1) = c: Next ' remove the duplicated rows rgTarget.RemoveDuplicates rdColumns Set rgTarget = rgTarget.Resize(rgTarget.End(xlDown).Row - rgTarget.Row + 1, columnCount) ' format as a tree view by removing the duplicated branches data = rgTarget.Value For r = UBound(data) To 2 Step -1 For c = 1 To columnCount - 1 If data(r, c) <> data(r - 1, c) Then Exit For data(r, c) = Empty Next Next rgTarget.Value = data End Sub 

如果你想做一个唯一的列表,使用Dictionary对象 。

确保添加对脚本运行时控件的引用! 根据您的示例数据,只需要一些快速和脏的代码(如完全未经testing):

 Sub GetUniques() Dim oDic as New Dictionary Dim r as Integer Dim strKey as String Dim varValue(2) as Variant 'Get a unique list of Column A values r = 3 'Your data starts on row 3 Do While Cells(r,1).value <> "" 'Run until the first blank line strKey = Cells(r,1).value varValue(0) = Cells(r,2).Value varValue(1) = Cells(r,3).Value If Not oDic.Exists(strKey) Then oDic.Add strKey, varValue End If r = r +1 Loop 'Now display your list of unique values Dim K as Variant Dim myArray as Variant r = 3 'We'll start on row 3 again but move over to column I (9) For Each K in oDic.Keys Cells(r,9).Value = K myArray = oDic.Item(K) Cells(r,10).Value = myArray(0) Cells(r,11).Value = myArray(1) r = r + 1 Next K End Sub