使用VBA识别Excelstring中的产品代码

快速背景 :我正在创build一个在Visual Basic中的search工具,这将允许我在数据库中search不一致命名的材料,这些材料是以自由文本formsinput的。 虽然我已经在Stack Overflow用户的帮助下开发了一个可以同时search数百个或者多个项目的工具,但我还需要进一步改进。

我的问题:我需要能够从这些材料描述中提取项目代码。 这些项目是一般数字,例如: 20405-002或者A445甚至B463-563 。 这些是我将要search的代码的主要types,这些将是唯一的标识符。

一些例子:

在意大利的一家工厂,我有一个材料命名为:

西门子;电机; A4002

在德国的一家工厂,它被称为:

电机; FP4742;西门子; TurnFast; A4002

我会search西门子,电机的条款

我目前的search将返回西门子,马达旁边的第一个, 马达,西门子旁边的第二个。 然后,我想要视觉基本在本质上说'这些可能是相同的部分',然后在内部寻找匹配的代码。 当它find匹配的代码,我想它会返回某种指标在Excel单元格。

总体目标:有一个工具,我可以用来find两个材料是否实际上是相同的,用最less的人力投入。 两家工厂可能有多达50,000种材料。 我也有这些部分的价格和供应商。 虽然75%的供应商时间相同,但在不同国家,相同材料的价格通常在20%以内。 如果您有任何其他的想法来看看两个免费的文字资料是否相同,我会很高兴听到。

我的search代码:

Function MultiSplitX(ByVal SourceText As String, RemoveBlankItems As Boolean, ParamArray Delimiters()) As String() Dim a As Integer, b As Integer, n As Integer Dim i As Integer: i = 33 Dim u As Variant, v As Variant Dim tempArr() As String, finalArr() As String, fDelimiters() As String If InStr(TypeName(Delimiters(0)), "()") <> 0 And LBound(Delimiters) = UBound(Delimiters) Then ReDim fDelimiters(LBound(Delimiters(0)) To UBound(Delimiters(0))) 'If passing array vs array items then For a = LBound(Delimiters(0)) To UBound(Delimiters(0)) 'build that array fDelimiters(a) = Delimiters(0)(a) Next a Else fDelimiters = Delimiters(0) End If Do While InStr(SourceText, Chr(i)) <> 0 'Find an unused character i = i + 1 Loop For a = LBound(fDelimiters) To UBound(fDelimiters) 'Sort Delimiters by length For b = a + 1 To UBound(fDelimiters) If Len(fDelimiters(a)) < Len(fDelimiters(b)) Then u = fDelimiters(b) fDelimiters(b) = fDelimiters(a) fDelimiters(a) = u End If Next b Next a For Each v In fDelimiters 'Replace Delimiters with a common character SourceText = Replace(SourceText, v, Chr(i)) Next tempArr() = Split(SourceText, Chr(i)) 'Remove empty array items If RemoveBlankItems = True Then ReDim finalArr(LBound(tempArr) To UBound(tempArr)) n = LBound(tempArr) For i = LBound(tempArr) To UBound(tempArr) If tempArr(i) <> "" Then finalArr(n) = tempArr(i) n = n + 1 End If Next i n = n - 1 ReDim Preserve finalArr(LBound(tempArr) To n) MultiSplitX = finalArr Else: MultiSplitX = tempArr End If Erase finalArr Erase tempArr End Function 

谢谢大家的帮助 :)

这是使用VBA for Excel编写的响应,但是使用数组来获取/放置数据,因此您应该可以轻松修改数据库。 VB 非常相似。 如果我要完成这项工作,我会在MS Access中完成,在这种情况下,您可以更容易地调整这些代码。 当然,直的VB总是一个选项。 VB不是一个很好的工具。

如果您使用的数据太多,我强烈build议您学习免费和开源的Python语言。 你可以从Sentdex的Youtube上find一个很棒的Pythonvideo系列。 他的video很好,很慢。 你会很快超过你可以用VB完成的。

由于缺乏细节和less量的样本数据,很难全面回答这个问题。

有很多方法来解决这个问题,取决于你想要的输出。 我正在做以下假设。

  1. 你是编码新手,想要易于阅读的输出。 因此,我的解决scheme默认为单个2×2结果数组。 您可以通过设置DeepArr = True将其更改为3+维度。
  2. 您希望将结果粘贴在同一工作表中。
  3. 您可以在零件代码中find单独的供应商/供应商代码列表。 GuessSupplier函数依赖于这个假设。 如果需要,根据实际要求更新function。
  4. 我打电话给你的原始input(如西门子;电机; A4002)的部分代码。
  5. 我假定最后一个分号后面的文字将始终是部件号。 如果没有,你可以很容易地replaceGuessPartNum函数中的假设。

以下描述了我用于简单testing的电子表格。 工作表“零件代码”在单元格B3:B6(B2中的表头)中包含单个列中的零件代码。 GH列保留为结果。 工作表“供应商”在一个列中包含唯一的供应商列表(B3:B6)。 您可以在RunMain()子文件中指定input和输出的图纸名称和范围。
为了方便起见,我在一些地方硬编码表名。 你应该把这些作为参数。 代码有些冗长,以便于理解。
我没有testing性能,因为我没有数据集,并期望你不会经常运行。 我只添加了一个微不足道的error handling。

我的全套代码如下。 你会发现底部附近的RunMain()子。 这将启动控制工作stream程的Main()子文件。

 Option Base 0 Option Explicit ' 1) Manually eliminate duplicates in your parts list using Excel built-in feature. ' a) highlight the range ' b) Data ribbon > Remove Duplicates ' 2) Create a supplier list on a separate sheet in teh same workbook ' 3) Edit the RunMain() procedure per your data. I assume: your part code list ' - part code list is in cells B3:B10 of the PartCodes sheet. ' - supplier list in cells b4:b6 of Suppliers sheet. ' - output goes to D2 in PartCodes sheet. ' 4) Run the RunMain() procedure simply kicks off Main. ' Main() sub does the following: ' a)Run ProcessPartCodes: ' i. load the parts codes from the worksheet into an array ' ii. run GuessPartNum and GuessSupplier and place results in the parts code array. ' b) Run FindMatches to add more to the array. Finds other part codes that may be for the same part. ' Logic is described in the function. ' c) Run ArrayToRange to paste part of the result set to the workseet. Note that ' the ourput array is more than 2 dimensions, so not all data is pasted neatly. ' I leave it to you to determine how you want to format the data for output. ' Function RangeToArray(inputRange As Range) 'Copies values from a rectangular range to a 2D Array. 'Array is always 2D, even if data is a single column or row. 'inputRange: a rectangular range Dim Col1 As Integer, row1 As Integer Dim i As Integer, j As Integer Dim rowCnt As Integer Dim colCnt As Integer Dim retArr() As Variant ' Size output array rowCnt = inputRange.Rows.Count colCnt = inputRange.Columns.Count ReDim retArr(1 To rowCnt, 1 To colCnt) As Variant ' Load range values into array For i = 1 To rowCnt For j = 1 To (colCnt) retArr(i, j) = Trim(inputRange.Cells(i, j)) Next j Next i ' Return array RangeToArray = retArr End Function Sub ArrayToRange(myArr As Variant, Target As Range) ' Copies the content of a 2D array to a Range. ' myArr must be exactly 2 dimensions ' Target is a range. If more than 1 cell, the top left cell is used. ' Copies the array to the range starting with the top left cell. ' Target Range size can be a single cell and need not match the array dimensions. Dim r As Long, tgtRow As Long Dim c As Long, tgtCol As Long Dim firstRow As Long Dim firstCol As Long Dim lastRow As Long Dim lastCol As Long ' Find the top left cell of the Target Range tgtRow = Target.Row tgtCol = Target.Column ' Set target range dimesions based on array size. firstRow = tgtRow + LBound(myArr, 1) firstCol = tgtCol + LBound(myArr, 2) lastRow = tgtRow + UBound(myArr, 1) lastCol = tgtCol + UBound(myArr, 2) ' The next row would usually work. If you get funky data, it will fail, ' so, we will use a loop instead. ' Range(Cells(firstRow, firstCol), Cells(lastRow, lastCol)) = myArr ' Loop through rows and columns, setting cell values one at a time. For r = LBound(myArr, 1) To UBound(myArr, 1) For c = LBound(myArr, 2) To UBound(myArr, 2) On Error Resume Next ' Prevent one bad value from killing the entire operation. Cells(tgtRow + r - 1, tgtCol + c) = myArr(r, c) On Error GoTo 0 Next c Next r End Sub ' Not used, this is just an example 'Public Function RangeCorners(Optional MyRange As Range = Range("c2:c10")) ' TopLeft = MyRange.Cells(1) ' BottomLeft = MyRange.Cells(.Rows.Count, 1) ' TopRight = MyRange.Cells(1, .Columns.Count) ' BottomRigt = MyRange.Cells(.Cells.Count) ' RangeCorners = Array(TopLeft, TopRight, BottomLeft, BottomRight) 'End Function Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean 'Returns True if stringToBeFound is in the array (arr); else False 'This one-liner need not be in a fucntion, but makes reading code easier. IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1) End Function Function GuessPartNum(splitPartCode As Variant, Optional delim As String = ";") ' Find a way to determine what part of the partCode is the part number. ' Perhaps it is always last. Perhaps it always has at least 3 digits. ' Simply takes the last item from the part code. Update this logic to whatever ' makes sense for your dataset (which I could nto see when writing this). GuessPartNum = splitPartCode(UBound(splitPartCode)) End Function Function GuessSupplier(splitPartCode As Variant, supplierList As Variant, Optional delim As String = ";") ' Determine the supplier of this part from the partCode. ' For each supplier in the supplierList, see if the supplier name is in the partCode. Dim i As Integer For i = LBound(supplierList) To UBound(supplierList) 'Simply verifies if a supplier from supplierList is in the part code. Uses first match. If (UBound(Filter(splitPartCode, supplierList(i, 1))) > -1) Then 'if arr(i) is in supplier_array GuessSupplier = supplierList(i, 1) Exit Function End If Next i End Function Function ProcessPartCodes(partCodeRange As Range, supplierListRange As Range, Optional delim As String = ";") ' Main ProcessPartCodes ' ' PartCodeRange: a range representing the part code list; ' must be in single column form. ' SupplierList: array of supplier names as strings ' ' Load part code array into array Dim resultArr As Variant 'result set as array Dim supplierList As Variant Dim splitCode As Variant Dim i As Integer resultArr = RangeToArray(partCodeRange) ReDim Preserve resultArr(LBound(resultArr) To UBound(resultArr), 0 To 4) As Variant supplierList = RangeToArray(supplierListRange) ' Get supplier and part num from each part code For i = LBound(resultArr) To UBound(resultArr) If Len(resultArr(i, 0)) > 0 Then splitCode = Split(resultArr(i, 0), delim) ' Split the partCode by delimiters, semi-colon (;) resultArr(i, 0) = resultArr(i, 0) ' Part Code (not parsed) resultArr(i, 1) = GuessPartNum(splitCode) ' Part Number resultArr(i, 2) = GuessSupplier(splitCode, supplierList) ' Supplier resultArr(i, 3) = splitCode ' Part Code (parsed) 'resultArr(i, 4) ' reserved for match information Else ' Empty array element. splitCode = "" resultArr(i, 3) = Array() End If Next i ProcessPartCodes = resultArr End Function Function CompareParts(splitPartCode1 As Variant, splitPartCode2 As Variant) ' ' 'splitPartCode1 is an array of a parsed partCode string 'splitPartCode2 is an array of a parsed partCode string Dim matches() As String Dim i As Integer Dim matchCnt As String ReDim matches(0 To 0) As String ' Check each item in arr1 (each substring of partCode1) for a match in arr2 For i = LBound(splitPartCode1) To UBound(splitPartCode1) If (UBound(Filter(splitPartCode2, splitPartCode1(i))) > -1) Then 'if arr1(i) is in arr2 ' Found an item in splitPartCode1 (a substring in partCode1) that is also in splitPartCode2. ' Add this item to the list of matches. If LBound(matches) = -1 Then ReDim matches(0 To 0) As String Else ReDim Preserve matches(LBound(matches) To UBound(matches) + 1) As String ' grow the matches array by one End If matches(UBound(matches)) = splitPartCode1(i) ' set value of last item in matches() = this item (this substring of partCode1) End If Next i matchCnt = UBound(matches) - LBound(matches) + 1 ' Total number of matching substrings from each part. CompareParts = Array(matchCnt, matches) End Function Function FindMatches(partCodeArr As Variant, Optional DeepArr As Boolean = False) ' Fucntion compares 2 part numbers to determine likelihood of a match. ' Parses partCode1 and partCode2 using the delimiter into arrays of strings. ' Then counts the number of matching strings in each array. ' Then determines if the part numbers (assumed to be the last string of each array) match. ' After running this, you can use the match count (matchCnt integer) and part number match ' (partNumMatch boolean) as a basis for determining how likely it is that partCode1=partCode2. ' ' ' DeepArr: If True, returns 3+ dimensional array. If False, flattens results to 2D array. ' ' Returns: Array(partCode1, partCode2, partNum1, partNum2, matchCnt, pricePct, supplierMatch, partNumMatch) ' partCode1 = partCode1 input argument ' partCode2 = partCode2 input argument ' partNum1 = the portion (substring) of partCode1 after the last ocurrence of the delimiter, delim. ' partNum2 = the portion (substring) of partCode2 after the last ocurrence of the delimiter, delim. ' match (boolean) = True if parts are likely the same. ' matchCnt = number of matching sub-strings between partCode1 and part 2 ' (essentially, a match score, where higher is more likely a positive match) ' Returns -1 if partCode1=partCode2, meaning exact match. ' pricePct = percentage price match calculated as (decimal portion of price1/price2) * 100 ' partNumMatch = True is partNum1=partNum2; else False Dim i As Integer, j As Integer, k As Integer Dim partCodei, partCodej Dim partNumi As String, partNumj As String, numMatch As Boolean Dim Duplicate As Boolean, newMatch As Boolean Dim partSupplieri As String, partSupplierj As String, supplierMatch As Boolean Dim splitCodei() As String, splitCodej() As String, matchCnt As Integer Dim splitCompare Dim matches() As String 'empty array has LBound=0 and UBound=-1, so UBound-LBound=-1 indicates an empty array Dim matchstr As String Dim s As String matchCnt = 0 ' matchCnt = UBound(matches) - LBound(matches) + 1 ' starting with 0 matches. For i = LBound(partCodeArr) To UBound(partCodeArr) If i = 1 Or i = UBound(partCodeArr) Or i Mod 100 = 0 _ Then Debug.Print "Starting record " & i & ": " & Now() If partCodeArr(i, 0) <> "" Then matchstr = "" For j = i + 1 To UBound(partCodeArr) If Len(partCodeArr(j, 0)) > 0 Then partCodei = partCodeArr(i, 0) partCodej = partCodeArr(j, 0) Duplicate = partCodei = partCodej 'found duplicate entry in table. partNumi = partCodeArr(i, 1) partNumj = partCodeArr(j, 1) numMatch = partNumi = partNumj partSupplieri = partCodeArr(i, 2) partSupplierj = partCodeArr(j, 2) supplierMatch = partSupplieri = partSupplierj splitCodei = partCodeArr(i, 3) splitCodej = partCodeArr(j, 3) splitCompare = CompareParts(splitCodei, splitCodej) matchCnt = splitCompare(0) newMatch = False If Duplicate Then ' You should have removed duplicates before starting. On Error GoTo redimErr ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String On Error GoTo 0 newMatch = True matches(UBound(matches), 0) = partCodej 'The duplicate partCode matches(UBound(matches), 1) = "0" ' Matching score, where -1 indicates an exact duplicate. matches(UBound(matches), 2) = "Duplicate Entry. Part codes are identical." ' Matching score, where -1 indicates an exact duplicate. ElseIf supplierMatch And numMatch Then ' Possible duplicate part since supplier and part number both match. On Error GoTo redimErr ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String On Error GoTo 0 newMatch = True matches(UBound(matches), 0) = partCodej 'The duplicate partCode matches(UBound(matches), 1) = "1" ' Matching score, where -1 indicates an exact duplicate. matches(UBound(matches), 2) = "Probably same part with differnt part code. Same supplier and part number." ' Matching score, where -1 indicates an exact duplicate. ElseIf supplierMatch And matchCnt > 2 Then ' Possible duplicate part since supplier and part number both match. On Error GoTo redimErr ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String On Error GoTo 0 newMatch = True matches(UBound(matches), 0) = partCodej 'The duplicate partCode matches(UBound(matches), 1) = "2" ' Matching score, where -1 indicates an exact duplicate. matches(UBound(matches), 2) = "Possible duplicate. More likely a similar part from same supplier" ' Matching score, where -1 indicates an exact duplicate. ElseIf supplierMatch = False And matchCnt > 2 Then ' Possible duplicate part since supplier and part number both match. On Error GoTo redimErr ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String On Error GoTo 0 newMatch = True matches(UBound(matches), 0) = partCodej 'The duplicate partCode matches(UBound(matches), 1) = "3" ' Matching score, where -1 indicates an exact duplicate. matches(UBound(matches), 2) = "Possible part match from different supplier" ' Matching score, where -1 indicates an exact duplicate. ElseIf supplierMatch = False And matchCnt > 1 Then ' Possible duplicate part since supplier and part number both match. On Error GoTo redimErr ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String On Error GoTo 0 newMatch = True matches(UBound(matches), 0) = partCodej 'The duplicate partCode matches(UBound(matches), 1) = "4" ' Matching score, where -1 indicates an exact duplicate. matches(UBound(matches), 2) = "Low probability part match from different supplier" ' Matching score, where -1 indicates an exact duplicate. End If If newMatch And Not DeepArr Then For k = LBound(matches) To UBound(matches) matchstr = matchstr & "[" & partCodej & "," & matches(UBound(matches), 1) & "," & matches(UBound(matches), 2) & "], " Next k End If End If Next j If DeepArr Then ' return 3+ dimensional array partCodeArr(i, 4) = matches Else ' return 2D array for easier pasting to worksheet ' Flatten partCodeArr(i, 4), the parsed potential part matches to an ordinary string ' with format [[part code, match value, match description],[part code, match value, match description],...] If Len(matchstr) > 0 Then matchstr = "[ " & Left(matchstr, Len(matchstr) - 2) & "] " End If partCodeArr(i, 4) = matchstr ' Flatten the parsed part code back to original string format. partCodeArr(i, 3) = partCodeArr(i, 0) End If ReDim matches(0) As String End If Next i FindMatches = partCodeArr Exit Function redimErr: ReDim matches(0 To 0, 0 To 2) As String Resume Next End Function Sub RunMain() ' Kicks off Main(partCodeRange As Range, supplierListRange As Range, destination As Range) ' ' Arguments: ' partCodeRange = Excel Range (not string name of range) ' that contains the raw part code list ' supplierListRange = Excel Range (not string name of range) ' that contains a unique list of supplier ' codes found in the part codes. ' Call Main(Sheets("PartCodes").Range("B3:B10"), Sheets("Suppliers").Range("B4:B6"), Range("PartCodes!D2")) End Sub Sub Main(partCodeRange As Range, supplierListRange As Range, destination As Range) ' This is the main sub that runs the full process of finding equivalent part ' codes and writing the findings to an excel worksheet. ' See RunMain() sub for example use. ' ' Arguments: ' partCodeRange = Excel Range (not string name of range) ' that contains the raw part code list ' supplierListRange = Excel Range (not string name of range) ' that contains a unique list of supplier ' codes found in the part codes. ' Dim partCodesArr, matchArr Dim startdate As Date, stopdate As Date startdate = Now() Debug.Print Debug.Print String(70, "*") Debug.Print Debug.Print "Starting: " & startdate Debug.Print partCodesArr = ProcessPartCodes(partCodeRange, supplierListRange) matchArr = FindMatches(partCodesArr) ' FindMatches(partCodesArr, True) for 3+ dimensional results Sheets("PartCodes").Activate 'Write column headers. destination.Offset(0, 0) = "Part Code" destination.Offset(0, 1) = "Part Num" destination.Offset(0, 2) = "Part Supplier" destination.Offset(0, 3) = "Part Code" destination.Offset(0, 4) = "Potential equivalent part numbers" Call ArrayToRange(matchArr, destination.Offset(1, 0)) stopdate = Now() Debug.Print Debug.Print "Finished: " & stopdate Debug.Print Debug.Print "Run time: " & (stopdate - startdate) Debug.Print Debug.Print String(70, "*") Debug.Print End Sub