链接来自两列的条目

在这里输入图像说明 我有两列中的ID,比方说A列和B列。在这些列中是ID,它们会多次出现在a或b中。 我想要做的是提供一个批号,如下面的例子,任何相关的ID被放在一个批次下。

有没有什么好的想法,如何在Excel / VBA中做到这一点? 我有15000行。 到目前为止,我已经尝试循环遍历每一行,并试图用2,然后2到4等标记1,但for循环突然变得几乎无限。 我不在乎提供代码,更多的是逻辑的一面!

这是我在18Oct上发布的代码的一个修改,它有一个严重的bug和一些其他的错误。 希望这个版本更有效。 在这篇文章的最后,我已经提到了这个bug。

该解决scheme使用表示组的标识符的类cGroup,其中组被定义为列A和列B中出现在相同行上的所有string。 因此,组的成员资格意味着在工作表的某个地方有一行,同一组中有两个成员,并且至less有一个成员也出现在列表的另一行(除非这两个值只出现一次,同一排,他们有一组自己的)。 每个组都有一个创build时分配的原始类ID,但可能会稍后链接到另一个父组(见下文),在这种情况下,它将采用其父组的类ID。

代码沿着列表运行,并将列A和列B中的键值分配给预先存在的组,如果其中任何一个键已经出现在列表中。 如果以前都没有出现,则为他们创build一个新的类别ID。 如果他们以前出现在不同的行上并被分配到不同的组,那么这些组必须被链接。 这是通过select一个组作为另一个的父亲,然后可以形成子>父母关系的层次结构。 子组采用其父级的classID – ClassID属性包含此的逻辑。 这种方法的一大优点是它避免了大规模迭代,尽pipe仍然迭代child> Parent ClassID层次结构来发现hiearchy中较低级的子类的ClassID。

我使用Scripting.Dictionary来提供从一个键到它的类的查找。 要在代码中使用此工具,请在“工具”>“引用”中设置对Microsoft脚本运行时库的引用。

我已经实现了处理关键数据的代码作为一个单独的类cGrouper与一个单一的方法,AllocateClassIDs,这是一个工作表的3列区域进行处理 – 三列是KeyA,KeyBinput每行的前两列和第三列中相应的输出类号。 要使用这个类的代码将是这样的:

Public Sub run() Dim oGrouper As New cGrouper '// rTestData1 is a named range in the a worksheet that is 3 columns xn rows, containing n pairs of keys '// in col1 and col2. the allocated class number is written into column 3 of the range oGrouper.AllocateClassIDs [rTestData1] End Sub 

这是cGrouper类的代码:

 Option Explicit '// This class will identify groups of related key values in two Key columns of a worksheet and then assign group numbers. '// A group is defined as the set of Keys that appear on the same rows in the two key columns. So if A and B are on '// row 3 and B and C on row 4, then A, B and C are in the same group, along with any other key values that share '// the same relationship with each other. '// Corollary: Keys are in different goups only if each key in the group never appears on the same row as any of the keys in any other group '// Dictionaries '// Lookup from a key value to the related class. Key value is a string that appears in colA or colB Dim GroupMembers As New Scripting.Dictionary '// Lookup to the groups that have already been created. The key is the GroupGroupID (integer assigned on creation) Dim Groups As New Scripting.Dictionary '// This subroutine does all the work Public Sub AllocateClassIDs(Keys As Range) '// First clear out the dictionaries GroupMembers.RemoveAll Groups.RemoveAll g.Reset '// Check the given ranges If Keys.Columns.Count <> 3 Then MsgBox "Range must have three columns - cannot process" Exit Sub End If '// Set up references to the sub-ranges within the sheet Dim KeysA As Range, KeysB As Range, ClassIDs As Range Set KeysA = Keys.Columns(1) Set KeysB = Keys.Columns(2) Set ClassIDs = Keys.Columns(3) Dim iRow As Integer, sAKey As String, sBKey As String Dim iAGroup As cGroup, iBGroup As cGroup '// Run down every row of the given range For iRow = 1 To KeysA.Rows.Count '// Get the key values from Col A and Col B sAKey = KeysA.Cells(iRow) sBKey = KeysB.Cells(iRow) '// Check if these keys have already been found earlier If GroupMembers.Exists(sAKey) Then Set iAGroup = GroupMembers.Item(sAKey) Else Set iAGroup = Nothing If GroupMembers.Exists(sBKey) Then Set iBGroup = GroupMembers.Item(sBKey) Else Set iBGroup = Nothing '// Now check the combination of possibilities: Select Case True Case iAGroup Is Nothing And iBGroup Is Nothing '// Neither key was found so we need to create a new group to hold the class number If Len(sAKey) > 0 Or Len(sBKey) > 0 Then With New cGroup '// Add the group to the dictionary of groups Groups.Add .GroupID, .Self '// Add the keys to the dictionary of group members. This links the key to the group If Len(sAKey) > 0 Then GroupMembers.Add sAKey, .Self If sAKey <> sBKey And Len(sBKey) > 0 Then GroupMembers.Add sBKey, .Self End With End If Case iBGroup Is Nothing '// Key in col A is already in a group from an earlier line, but key in Col B is not '// we just add ColB key to the same group as the col A key If Len(sBkey)>0 Then Set iAGroup = GroupMembers.Item(sAKey) GroupMembers.Add sBKey, iAGroup End If Case iAGroup Is Nothing '// Key in Col B is already in a group, but Key in col A is not '// We just add ColA key to the same group as the col B key IF Len(sAkey)>0 Then Set iBGroup = GroupMembers.Item(sBKey) GroupMembers.Add sAKey, iBGroup End IF Case Else '// They are both already in a group. That's fine if they are members of the same class but... If iAGroup.ClassID <> iBGroup.ClassID Then '// They are in DIFFERENT Classes so we must merge them together by settung '// the class ID of one group to be the same as the other '// Always use the lower-numbered class ID If iAGroup.ClassID < iBGroup.ClassID Then iBGroup.JoinGroupMembership iAGroup Else iAGroup.JoinGroupMembership iBGroup End If End If End Select Next iRow '// Remember the last row Dim iLastRow As Integer: iLastRow = iRow - 1 '// Assign the class numbers. This just makes sure each unique class has a number, starting at 1. Dim ClassNumbers As New Scripting.Dictionary Dim ix As Integer Dim iGroup As cGroup Dim iClassNumber As Integer For ix = 0 To Groups.Count - 1 '// Get the next group object Set iGroup = Groups.Item(Groups.Keys(ix)) '// Check if this is a "ROOT" group, ie the group ID is the same as the class ID If iGroup.bIsRootGroup Then iClassNumber = iClassNumber + 1 'If iClassNumber = 30 Then MsgBox "Classnumber 30" '// Add it to the dictionary of class numbers ClassNumbers.Add iGroup.ClassID, iClassNumber End If Next ix '// Finally, we can assign the class numbers to the rows in the spreadsheet Application.Calculation = xlCalculationManual For ix = 1 To iLastRow '// Put the relevant class number into column 3 ClassIDs.Cells(ix) = ClassNumbers.Item(GroupMembers.Item(KeysA.Cells(ix).Value).ClassID) Next ix Application.Calculation = xlCalculationAutomatic MsgBox "done" End Sub 

这是cGroup类的代码

  Option Explicit '// Properties of the class Public GroupID As Integer '// The group master of this class (ie another group to which it has been joined) '// Can be Nothing if not joined to any other group or if this is the master group '// of a set of joined groups Private memberOfGroup As cGroup Private Sub class_initialize() '// Assign an ID to myself GroupID = g.NextGroupID '// I am not a member of any other group Set memberOfGroup = Nothing End Sub Public Sub JoinGroupMembership(NewLinkedGroup As cGroup) '// Links this group to membership of another group. '// Note that this group may already be a member of another group, in which case '// group membership is changed on the parent group as well as this group '// To avoid circular references, the group with the lower classid is always chosen to be the parent If NewLinkedGroup.ClassID > Me.ClassID Then NewLinkedGroup.JoinGroupMembership Me Exit Sub End If '// If I am already member of a group, make sure my parent group '// joins the new group If Not memberOfGroup Is Nothing Then memberOfGroup.JoinGroupMembership NewLinkedGroup End If '// Now set the new linked group to be my parent Set memberOfGroup = NewLinkedGroup End Sub Public Function ClassID() As Integer '// Returns the classID of this group's master group '// Note that this is recursive, it will work up through the hierarchy of '// parent groups until it hits the group with no parent. '// Check if I am the master group If memberOfGroup Is Nothing Then '// Return my GroupID as the classID ClassID = GroupID Else '// Return the classID of my parent ClassID = memberOfGroup.ClassID End If End Function Public Function bIsRootGroup() As Boolean '// Returns true if this group is not a member of another group bIsRootGroup = memberOfGroup Is Nothing End Function Public Function Self() As cGroup Set Self = Me End Function 

这是我命名为g的模块的代码

  Option Explicit '// Global register of Group ID Private gMaxGroupNumber As Integer '// Method to get the next available GroupID Public Function NextGroupID() As Integer gMaxGroupNumber = gMaxGroupNumber + 1 NextGroupID = gMaxGroupNumber End Function '// Method to reset the groupID Public Sub Reset() gMaxGroupNumber = 0 End Sub 

关于错误:在我的代码的早期版本中,组层次结构不起作用,因为它是将父类ID简单地分配给组。 只要这些组以一个受控的顺序连接,这就没有问题,但是如果两个独立的组已经形成,那么后面的两组合并可以隔离先前链接的成员 – 他们的classID没有被新的父代更新所以他们是有效的孤儿。

假设:

  • 第一列始终包含启动链接
  • 第二列始终包含链接
  • 要继续链接链,第二列中列出的内容必须在第一列中find才能继续链(如您的示例所示)。 这可以防止链接可以分裂成不同链的“链接分裂”。

如果这些假设是真的,那么这个代码将为你工作:

 Sub tgr() Const Link1Col As String = "A" Const Link2Col As String = "B" Const LinkIDCol As String = "C" Dim ws As Worksheet Dim linkColumns(1 To 2) As Range Dim FoundLink As Range Dim LinkID As Long Dim i As Long Set ws = ActiveWorkbook.ActiveSheet Set linkColumns(1) = ws.Range(Link1Col & "1", ws.Cells(ws.Rows.Count, Link1Col).End(xlUp)) Set linkColumns(2) = Intersect(linkColumns(1).EntireRow, ws.Columns(Link2Col)) Intersect(linkColumns(1).EntireRow, ws.Columns(LinkIDCol)).ClearContents LinkID = 0 For i = linkColumns(1).Row To linkColumns(1).Row + linkColumns(1).Rows.Count - 1 If Len(ws.Cells(i, LinkIDCol).Value) = 0 Then LinkID = LinkID + 1 ws.Cells(i, LinkIDCol).Value = LinkID Set FoundLink = linkColumns(1).Find(ws.Cells(i, Link2Col).Value, , xlValues, xlWhole) If Not FoundLink Is Nothing Then Do ws.Cells(FoundLink.Row, LinkIDCol).Value = LinkID Set FoundLink = linkColumns(1).Find(ws.Cells(FoundLink.Row, Link2Col).Value, , xlValues, xlWhole) Loop While Not FoundLink Is Nothing End If End If Next i End Sub