Comparing 2 lists in Excel with VBA regular expressions

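I want to use VBA regular expressions to compare two lists (columns) in Excel and find matches. Since this is a fairly complex operation, I previously did it with several different Excel functions (non-VBA), but that proved awkward at best, so I'd like to try an all-in-one VBA solution, if possible.

The first column has names with irregularities (e.g. quoted nicknames, suffixes such as "jr" or "sr", 'preferred' versions in parentheses). Also, when middle names are present, they may be either full names or initials.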

The order in the first column is:

<first name or initial> <space> <any parenthetical 'preferred' names - if they exist> <space> <middle name or initial - if it exists> <space> <quoted nickname or initial - if it exists> <space> <last name> <comma - if necessary><space - if necessary><suffix - if it exists>

The order in the second column is:

  `<lastname><space><suffix>,<firstname><space><middle name, if it exists>`

with none of the 'irregularities' found in the first column.

My main goal is to 'clean up' the first column into the following order:

  `lastname-space-suffix,firstname-space-preferred name-space- middle name-space-nickname`

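Although I'm keeping the 'irregularities' here, I may use some kind of 'flag' in the comparison code to alert me case by case.

I've been trying several patterns, and this is my most recent: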

 ["]?([A-Za-z]?)[.]?["]?[.]?[\s]?[,]?[\s]?

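However, I want to allow for the last name and the suffix (if it exists). I've tested it with "Global" switched on, but I can't figure out how to separate the last name and the suffix through backreferences.

Then I want to compare last name, first name, and middle initial between the two lists (since most middle names in the first list are just initials).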

  An example would be:
  (1st list) John (Johnny) B. "Abe" Smith, Jr. turned into: Smith Jr,John (Johnny) B "Abe" or Smith Jr,John B
  and
  (2nd list) Smith Jr,John Bertrand turned into: Smith Jr,John B
  Then run a comparison between the two columns.

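Would this list comparison be a good starting point or continuation point?

Addendum, April 10, 2012:

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names. Can I break the grouping references down further into subgroups (as in the example below)?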

  (?: ([ ] \( [^)]* \)))?         # (2) parenthetical 'preferred' name (optional)
  (?: ([ ] (["'] ) .*?) \6 )?     # (5,6) quoted nickname or initial (optional)

Could I group them like this:

  (?:(([ ])(\()([^)]*)(\))))?     # (2) parenthetical 'preferred' name (optional)
  not sure how to do this one -   # (5,6) quoted nickname or initial (optional)

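I tried these in "Regex Coach" and "RegExr" and they work fine, but in VBA, when I ask for the backreferences, all I get back is the first name, the number 1, and a comma (e.g. "Carl1"). I'll go back and check for typos. Thanks for your help.

Addendum, April 17, 2012:

I overlooked one name "situation": last names made up of two or more words, e.g. "St Cyr" or "Von Wilhelm".
Would adding the following

  `((St|Von)[ ])?`

to the regex you provided, like this, work?

  `((St|Von)[ ])?([^\,()"']+)`

My testing in Regex Coach and RegExr has not succeeded so far, because the replacement returns "St" with a space in front of it.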

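Redo -

Here is a different approach. It may work in your VBA; treat it as an example only. I tested it in Perl and it works well. However, I won't show the Perl code,
just some explanation of the regexes.

This is a two-step process.

1. Normalize the column text
2. Do the main parsing

Normalization process

Get the column value
Strip out all dots: global search for \. , replace with nothing
Collapse each run of whitespace to one space: global search for \s+ , replace with a single space [ ]

(Note: without normalization, I don't see much chance of success, no matter what is tried)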
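
As a rough illustration of the normalization step, here is a minimal sketch in Python (chosen only because it is easy to run; the two patterns `\.` and `\s+` are the ones described above). The function name `normalize` is hypothetical, and the sketch follows the order given in the list: dots first, then whitespace.

```python
import re

def normalize(value: str) -> str:
    """Strip every period, then collapse each whitespace run to one space."""
    no_dots = re.sub(r"\.", "", value)      # global search for \. , replace with nothing
    return re.sub(r"\s+", " ", no_dots)     # global search for \s+ , replace with one space

# A leading or trailing space survives as a single space; the main regexes allow for it.
print(repr(normalize('John  (Johnny)\tB. "Abe" Smith,   Jr. ')))
```

Note that the result is not trimmed, which is why the parsing patterns start with `[ ]?` and tolerate a trailing space.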

Main parsing process

After normalizing a column value (do this for both columns), run it through these regexes.

Column 1 regex

    ^
    [ ]?
    ([^\ ,()"']+)                  # (1) first name or initial (required)
    (?: ([ ] \( [^)]* \)) )?       # (2) parenthetical 'preferred' name (optional)
    (?:
        ([ ] [^\ ,()"'] )          # (3,4) middle initial OR name (optional)
        ([^\ ,()"']*)              #       name and initial are both captured
    )?
    (?: ([ ] (["'] ) .*?) \6 )?    # (5,6) quoted nickname or initial (optional)
    [ ]
    ([^\ ,()"']+)                  # (7) last name (required)
    (?:
        [, ]* ([ ].+?) [ ]?        # (8) suffix (optional)
      | .*?
    )?
    $

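The replacement depends on what you want.
Three types are defined (replace $ with \ as necessary):

Type 1a, full middle - $7$8,$1$2$3$4$5$6
Type 1b, middle initial - $7$8,$1$2$3$5$6
Type 2, middle initial - $7$8,$1$3

Conversion example: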

    Input (raw)                = 'John (Johnny) Bertrand "Abe" Smith, Jr. '
    Out type 1 full middle     = 'Smith Jr,John (Johnny) Bertrand "Abe"'
    Out type 1 middle initial  = 'Smith Jr,John (Johnny) B "Abe"'
    Out type 2 middle initial  = 'Smith Jr,John B'
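
To check the three replacement types against the sample input, the commented template can be run more or less as written in a language with a free-spacing regex mode. The following is a minimal Python sketch, not the answer's VBA: `COLUMN1`, `normalize` and `rewrite` are hypothetical names, and the small `$n` to `\g<n>` conversion is only there so the answer's replacement strings can be reused unchanged.

```python
import re

# Column 1 pattern from the answer, in free-spacing (verbose) mode.
COLUMN1 = re.compile(r'''
    ^ [ ]?
    ([^\ ,()"']+)                     # (1) first name or initial
    (?: ([ ] \( [^)]* \)) )?          # (2) parenthetical preferred name
    (?: ([ ] [^\ ,()"'])              # (3) space + middle initial
        ([^\ ,()"']*) )?              # (4) rest of the middle name
    (?: ([ ] (["']) .*?) \6 )?        # (5,6) quoted nickname
    [ ] ([^\ ,()"']+)                 # (7) last name
    (?: [, ]* ([ ].+?) [ ]? | .*? )?  # (8) suffix
    $
''', re.VERBOSE)

def normalize(value):
    """Strip periods, then collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", re.sub(r"\.", "", value))

def rewrite(value, template):
    """Apply a $n-style replacement string to one normalized column 1 value."""
    py_template = re.sub(r"\$(\d+)", r"\\g<\1>", template)  # "$7" -> "\g<7>"
    return COLUMN1.sub(py_template, normalize(value))

raw = 'John (Johnny) Bertrand "Abe" Smith, Jr. '
print(rewrite(raw, "$7$8,$1$2$3$4$5$6"))   # type 1a, full middle
print(rewrite(raw, "$7$8,$1$2$3$5$6"))     # type 1b, middle initial
print(rewrite(raw, "$7$8,$1$3"))           # type 2, middle initial
```

Groups that did not take part in the match (for example a missing nickname) are replaced by an empty string in Python 3.5 and later, which mirrors how the `$n` references are used in the answer.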

Column 2 regex

    ^
    [ ]?
    ([^\ ,()"']+)                  # (1) last name (required)
    (?: ([ ] [^\ ,()"']+) )?       # (2) suffix (optional)
    ,
    ([^\ ,()"']+)                  # (3) first name or initial (required)
    (?:
        ([ ] [^\ ,()"'])           # (4,5) middle initial OR name (optional)
        ([^\ ,()"']*)
    )?
    .*
    $

The replacement depends on what you want.
Two types are defined (replace $ with \ as necessary):

Type 1a, full middle - $1$2,$3$4$5
Type 1b, middle initial - $1$2,$3$4

Conversion example:

    Input                      = 'Smith Jr.,John Bertrand '
    Out type 1 full middle     = 'Smith Jr,John Bertrand'
    Out type 1 middle initial  = 'Smith Jr,John B'
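
The column 2 pattern can be exercised the same way. This is a minimal Python sketch under the same assumptions as before (hypothetical names `COLUMN2`, `normalize`, `rewrite`; Python's verbose mode stands in for the commented template, which VBA would need collapsed to a single line).

```python
import re

# Column 2 pattern from the answer, in free-spacing (verbose) mode.
COLUMN2 = re.compile(r'''
    ^ [ ]?
    ([^\ ,()"']+)                # (1) last name
    (?: ([ ] [^\ ,()"']+) )?     # (2) suffix
    ,
    ([^\ ,()"']+)                # (3) first name or initial
    (?: ([ ] [^\ ,()"'])         # (4) space + middle initial
        ([^\ ,()"']*) )?         # (5) rest of the middle name
    .* $
''', re.VERBOSE)

def normalize(value):
    """Strip periods, then collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", re.sub(r"\.", "", value))

def rewrite(value, template):
    """Apply a $n-style replacement string to one normalized column 2 value."""
    py_template = re.sub(r"\$(\d+)", r"\\g<\1>", template)  # "$3" -> "\g<3>"
    return COLUMN2.sub(py_template, normalize(value))

raw = 'Smith Jr.,John Bertrand '
print(rewrite(raw, "$1$2,$3$4$5"))   # type 1a, full middle
print(rewrite(raw, "$1$2,$3$4"))     # type 1b, middle initial
```

The type 1b result here is the value that lines up with the column 1 type 2 result, so those two strings are the ones to compare.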

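VBA replacement help

This worked in a very old copy of Excel, in a newly created VBA project.
The two modules below were created to show an example.
They both do the same thing.

The first is a verbose example of all the possible replacement types.
The second is a trimmed-down version that uses only the type 2 comparison.

I haven't done VB before, but it should be simple enough
for you to glean how the replacement works and how to tie it into the Excel
columns.

If you are doing just a flat comparison, you might want to take a column 1 value
once, check it against each value in column 2, then go to the next value in
column 1, and repeat.

For the fastest way to do this, create 2 extra columns, convert the respective
column values into type 2 (variables strC1_2 and strC2_2, see the example), then copy them
to the new columns.
After that, you don't need regex; just compare the columns, find the matching rows,
then delete the type 2 columns.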
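
The helper-column idea can be sketched outside Excel as well. The Python below is only an illustration under stated assumptions: the two lists stand in for the worksheet columns, the names `W`, `RX_C1`, `RX_C2`, `type2_key` and `match_rows` are hypothetical, and a dictionary lookup replaces the nested loop described above (a design choice of this sketch, not something the answer prescribes).

```python
import re

W = r"""[^ ,()"']"""  # one name character: not a space, comma, paren, or quote

# Single-line patterns, transcribed from the VBA example in the answer.
RX_C1 = re.compile(
    rf"^[ ]?({W}+)(?:([ ]\([^)]*\)))?(?:([ ]{W})({W}*))?"
    rf"""(?:([ ](["']).*?)\6)?[ ]({W}+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"""
)
RX_C2 = re.compile(rf"^[ ]?({W}+)(?:([ ]{W}+))?,({W}+)(?:([ ]{W})({W}*))?.*$")

def normalize(value):
    """Strip periods, then collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", re.sub(r"\.", "", value))

def type2_key(value, rx, template):
    """Return the type 2 string for one cell, or None if the value does not parse."""
    m = rx.match(normalize(value))
    return m.expand(template) if m else None

def match_rows(col1, col2):
    """Return (row in col1, row in col2) pairs whose type 2 strings are equal."""
    index = {}
    for j, value in enumerate(col2):
        key = type2_key(value, RX_C2, r"\1\2,\3\4")
        if key is not None:
            index.setdefault(key, []).append(j)
    pairs = []
    for i, value in enumerate(col1):
        key = type2_key(value, RX_C1, r"\7\8,\1\3")
        pairs.extend((i, j) for j in index.get(key, []))
    return pairs

col1 = ['John (Johnny) B. "Abe" Smith, Jr.', 'Mary Q Jones', 'Carl Sagan']
col2 = ['Smith Jr,John Bertrand', 'Jones,Mary Quinn', 'Doe,Jane']
print(match_rows(col1, col2))
```

The comparison is case-sensitive and exact, just like the `=` comparison in the VBA example, so any further tolerance (case, spelling variants) would have to be added separately.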

Verbose -

    Sub RegexColumnValueComparison()

        ' Column 1 and 2, sample values
        ' These should probably be passed in as values
        ' ============================================
        strC1 = "John (Johnny) Bertrand ""Abe"" Smith, Jr.  "
        strC2 = "Smith Jr.,John Bertrand  "

        ' Normalization regexes for whitespace and periods
        ' (use for both column values)
        ' =============================================
        Set rxDot = CreateObject("vbscript.regexp")
        rxDot.Global = True
        rxDot.Pattern = "\."
        Set rxWSp = CreateObject("vbscript.regexp")
        rxWSp.Global = True
        rxWSp.Pattern = "\s+"

        ' Column 1 regex
        ' ==================
        Set rxC1 = CreateObject("vbscript.regexp")
        rxC1.Global = False
        rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

        ' Column 2 regex
        ' ==================
        Set rxC2 = CreateObject("vbscript.regexp")
        rxC2.Global = False
        rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

        ' Normalize column 1 and 2, copy to new variables
        ' ============================================
        strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
        strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")

        ' ------------------------------------------------------
        ' This section is informational
        ' Shows some sample replacements before comparison
        ' Just pick 1 replacement from each column, discard the rest
        ' ------------------------------------------------------

        ' Create some replacement types for column 1
        ' =====================================================
        strC1_1a = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$4$5$6")
        strC1_1b = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$5$6")
        strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")

        ' Create some replacement types for column 2
        ' (column 2's middle-initial form is what gets compared with column 1's type 2)
        ' =====================================================
        strC2_1a = rxC2.Replace(strC2_Normal, "$1$2,$3$4$5")
        strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

        ' Show types in message box
        ' =====================================================
        c1_t1a = "Column1 Types:" & Chr(13) & "type 1a full middle - " & strC1_1a
        c1_t1b = "type 1b middle initial - " & strC1_1b
        c1_t2 = "type 2 middle initial - " & strC1_2
        c2_t1a = "Column2 Types:" & Chr(13) & "type 1a full middle - " & strC2_1a
        c2_t2 = "type 2 middle initial - " & strC2_2

        MsgBox (c1_t1a & Chr(13) & c1_t1b & Chr(13) & c1_t2 & Chr(13) & Chr(13) & c2_t1a & Chr(13) & c2_t2)

        ' ------------------------------------------------------
        ' Compare a value from column 1 vs column 2
        ' For this we will compare type 2 values
        ' ------------------------------------------------------
        If strC1_2 = strC2_2 Then
            MsgBox ("Type 2 values are EQUAL: " & Chr(13) & strC1_2)
        Else
            MsgBox ("Type 2 values are NOT Equal:" & Chr(13) & strC1_2 & " != " & strC2_2)
        End If

        ' ------------------------------------------------------
        ' Same comparison (type 2) of normalized column 1,2 values
        ' In essence, this is all you need
        ' ------------------------------------------------------
        If rxC1.Replace(strC1_Normal, "$7$8,$1$3") = rxC2.Replace(strC2_Normal, "$1$2,$3$4") Then
            MsgBox ("Type 2 values are EQUAL")
        Else
            MsgBox ("Type 2 values are NOT Equal")
        End If

    End Sub

Type 2 only -

    Sub RegexColumnValueComparison()

        ' Column 1 and 2, sample values
        ' These should probably be passed in as values
        ' ============================================
        strC1 = "John (Johnny) Bertrand ""Abe"" Smith, Jr.  "
        strC2 = "Smith Jr.,John Bertrand  "

        ' Normalization regexes for whitespace and periods
        ' (use for both column values)
        ' =============================================
        Set rxDot = CreateObject("vbscript.regexp")
        rxDot.Global = True
        rxDot.Pattern = "\."
        Set rxWSp = CreateObject("vbscript.regexp")
        rxWSp.Global = True
        rxWSp.Pattern = "\s+"

        ' Column 1 regex
        ' ==================
        Set rxC1 = CreateObject("vbscript.regexp")
        rxC1.Global = False
        rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

        ' Column 2 regex
        ' ==================
        Set rxC2 = CreateObject("vbscript.regexp")
        rxC2.Global = False
        rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

        ' Normalize column 1 and 2, copy to new variables
        ' ============================================
        strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
        strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")

        ' Comparison (type 2) of normalized column 1,2 values
        ' ============================================
        strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")
        strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

        If strC1_2 = strC2_2 Then
            MsgBox ("Type 2 values are EQUAL")
        Else
            MsgBox ("Type 2 values are NOT Equal")
        End If

    End Sub

Parentheses/quotes response

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names.

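If I understand this correctly:

Yes, you can capture the content inside the quotes and the parentheses separately.
It just requires some modification. The regex below makes it possible
to formulate replacements with or without the quotes and/or parentheses,
or in other forms.

The samples below show ways to formulate the replacements.

Very important note here

If you are talking about removing the quotes "" and parentheses () from the
matching regex itself, that can also be done. It requires a new regex.

The only problem is that all distinction between preferred/middle/nick
gets thrown out the window, because these fields are positional as well as
delimited (i.e.: (preferred) middle "nick").

Dropping that consideration would require a regex sub-expression like this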

    (?:[ ]([^ ,]+))?     # optional preferred
    (?:[ ]([^ ,]+))?     # optional middle
    (?:[ ]([^ ,]+))?     # optional nick

And, being optional, they lose all positional reference, which renders the middle-initial
expression invalid.

End note

Regex template (for formulating replacement strings)

    ^
    [ ]?

    # (required)
    # First
    #   $1  name
    # -----------------------------------------
    ([^\ ,()"']+)               # (1) name

    # (optional)
    # Parenthetical 'preferred'
    #   $2    all
    #   $3$4  name
    # -----------------------------------------
    (?:
        (                       # (2) all
          ([ ]) \( ([^)]*) \)   # (3,4) space and name
        )
    )?

    # (optional)
    # Middle
    #   $5    initial
    #   $5$6  name
    # -----------------------------------------
    (?:
        ([ ] [^\ ,()"'] )       # (5) first character
        ([^\ ,()"']*)           # (6) remaining characters
    )?

    # (optional)
    # Quoted nick
    #   $7$8$9$8  all
    #   $7$9      name
    # -----------------------------------------
    (?:
        ([ ])                   # (7) space
        (["'])                  # (8) quote
        (.*?)                   # (9) name
        \8
    )?

    # (required)
    # Last
    #   $10  name
    # -----------------------------------------
    [ ]
    ([^\ ,()"']+)               # (10) name

    # (optional)
    # Suffix
    #   $11  suffix
    # -----------------------------------------
    (?:
        [, ]* ([ ].+?) [ ]?     # (11) suffix
      | .*?
    )?
    $

VBA regex (second version, tested in the VBA project above)

    rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:(([ ])\(([^)]*)\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ])([""'])(.*?)\8)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

    strC1_1a  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$6$7$8$9$8" )
    strC1_1aa = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$6$7$9" )
    strC1_1b  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$7$8$9$8" )
    strC1_1bb = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$7$9" )
    strC1_2   = rxC1.Replace( strC1_Normal, "$10$11,$1$5" )

Sample input/output possibilities

    Input (raw)                 = 'John (Johnny) Bertrand "Abe" Smith, Jr. '
    Out type 1a  full middle    = 'Smith Jr,John (Johnny) Bertrand "Abe"'
    Out type 1aa full middle    = 'Smith Jr,John Johnny Bertrand Abe'
    Out type 1b  middle initial = 'Smith Jr,John (Johnny) B "Abe"'
    Out type 1bb middle initial = 'Smith Jr,John Johnny B Abe'
    Out type 2   middle initial = 'Smith Jr,John B'

    Input (raw)                 = 'John (Johnny) Smith, Jr.'
    Out type 1a  full middle    = 'Smith Jr,John (Johnny)'
    Out type 1aa full middle    = 'Smith Jr,John Johnny'
    Out type 1b  middle initial = 'Smith Jr,John (Johnny)'
    Out type 1bb middle initial = 'Smith Jr,John Johnny'
    Out type 2   middle initial = 'Smith Jr,John'

    Input (raw)                 = 'John (Johnny) "Abe" Smith, Jr.'
    Out type 1a  full middle    = 'Smith Jr,John (Johnny) "Abe"'
    Out type 1aa full middle    = 'Smith Jr,John Johnny Abe'
    Out type 1b  middle initial = 'Smith Jr,John (Johnny) "Abe"'
    Out type 1bb middle initial = 'Smith Jr,John Johnny Abe'
    Out type 2   middle initial = 'Smith Jr,John'

    Input (raw)                 = 'John "Abe" Smith, Jr.'
    Out type 1a  full middle    = 'Smith Jr,John "Abe"'
    Out type 1aa full middle    = 'Smith Jr,John Abe'
    Out type 1b  middle initial = 'Smith Jr,John "Abe"'
    Out type 1bb middle initial = 'Smith Jr,John Abe'
    Out type 2   middle initial = 'Smith Jr,John'
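
For anyone who wants to reproduce some of these outputs without Excel, the second-version template (11 groups, with the parentheses and quotes captured separately from their contents) can be run in a free-spacing regex mode. This is a minimal Python sketch, not the tested VBA: `TEMPLATE_V2`, `normalize` and `rewrite` are hypothetical names, and the `$n` to `\g<n>` conversion exists only so the replacement strings above can be pasted in unchanged.

```python
import re

# Second-version column 1 template: delimiters and contents captured separately.
TEMPLATE_V2 = re.compile(r'''
    ^ [ ]?
    ([^\ ,()"']+)                          # (1) first name
    (?: ( ([ ]) \( ([^)]*) \) ) )?         # (2) all, (3) space, (4) preferred name
    (?: ([ ] [^\ ,()"']) ([^\ ,()"']*) )?  # (5) space + initial, (6) rest of middle
    (?: ([ ]) (["']) (.*?) \8 )?           # (7) space, (8) quote, (9) nickname
    [ ] ([^\ ,()"']+)                      # (10) last name
    (?: [, ]* ([ ].+?) [ ]? | .*? )?       # (11) suffix
    $
''', re.VERBOSE)

def normalize(value):
    """Strip periods, then collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", re.sub(r"\.", "", value))

def rewrite(value, template):
    """Apply a $n-style replacement string to one normalized value."""
    py_template = re.sub(r"\$(\d+)", r"\\g<\1>", template)  # "$10" -> "\g<10>"
    return TEMPLATE_V2.sub(py_template, normalize(value))

raw = 'John (Johnny) Bertrand "Abe" Smith, Jr. '
print(rewrite(raw, "$10$11,$1$3$4$5$6$7$9"))   # type 1aa: no parentheses or quotes
print(rewrite(raw, "$10$11,$1$3$4$5$7$9"))     # type 1bb: same, middle initial only
print(rewrite('John "Abe" Smith, Jr.', "$10$11,$1$3$4$5$6$7$9"))
```

The `\g<10>` form avoids any ambiguity between group 10 and group 1 followed by a literal 0 in Python replacement strings.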

Re: the 4/17 concern

last names that have 2 or more words. Would the allowance for certain literal names, rather than generic word patterns, be the solution?

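Actually, no, it would not. In this case, for your form, allowing multi-word last names
injects the space field delimiter into the last-name field.

However, for your particular form it can be done, because the only obstacle is the case where
the "nick" field is missing. When it is missing, and given only one word in the
middle name, there are 2 permutations.

Hopefully you can glean a solution from the 3 regexes and the test-case output below. The regexes have had the space delimiters removed from the captures. So you can compose
the replacement for the Replace method, or just store the capture buffers and compare them against
the results of the other column's capture scheme.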

    Nick_rx.Pattern (template)
    * This pattern is multi-word last name, NICK is required

    ^
    [ ]?
    # First (req'd)
    ([^\ ,()"']+)               # (1) first name
    # Preferred first
    (?: [ ]
        (                       # (2) (preferred), -or-
          \( ([^)]*?) \)        # (3) preferred
        )
    )?
    # Middle
    (?: [ ]
        (                       # (4) full middle, -or-
          ([^\ ,()"'])          # (5) initial
          [^\ ,()"']*
        )
    )?
    # Quoted nick (req'd)
    [ ]
    (                           # (6) "nick",
      (["'])                    # (7) n/a       -or-
      (.*?)                     # (8) nick
      \7
    )
    # Single/Multi Last (req'd)
    [ ]
    (                           # (9) multi/single word last name
      [^\ ,()"']+
      (?:[ ][^\ ,()"']+)*
    )
    # Suffix
    (?: [ ]? , [ ]? (.*?) )?    # (10) suffix
    [ ]?
    $

    -----------------------------------

    FLs_rx.Pattern (template)
    * This pattern has no MIDDLE/NICK, is single-word last name,
    * and has no permutations.

    ^
    [ ]?
    # First (req'd)
    ([^\ ,()"']+)               # (1) first name
    # Preferred first
    (?: [ ]
        (                       # (2) (preferred), -or-
          \( ([^)]*?) \)        # (3) preferred
        )
    )?
    # Single Last (req'd)
    [ ]
    ([^\ ,()"']+)               # (4) single word last name
    # Suffix
    (?: [ ]? , [ ]? (.*?) )?    # (5) suffix
    [ ]?
    $

    -----------------------------------

    FLm_rx.Pattern (template)
    * This pattern has no NICK, is multi-word last name,
    * and has 2 permutations.
    * 1. Middle as part of Last name.
    * 2. Middle is separate from Last name.

    ^
    [ ]?
    # First (req'd)
    ([^\ ,()"']+)               # (1) first name
    # Preferred first
    (?: [ ]
        (                       # (2) (preferred), -or-
          \( ([^)]*?) \)        # (3) preferred
        )
    )?
    # Multi Last (req'd)
    [ ]
    (                           # (4) Multi, as Middle + Last,
                                #     -or-
      (?:                       # Middle
          (                     # (5) full middle, -or-
            ([^\ ,()"'])        # (6) initial
            [^\ ,()"']*
          )
          [ ]
      )
                                # Last (req'd)
      (                         # (7) multi/single word last name
        [^\ ,()"']+
        (?:[ ][^\ ,()"']+)*
      )
    )
    # Suffix
    (?: [ ]? , [ ]? (.*?) )?    # (8) suffix
    [ ]?
    $

    -----------------------------------

    Each of these regexes is mutually exclusive, and they should be checked
    in an if-then-else like this (pseudo code):

    str_Normal = rxDot.Replace(rxWSp.Replace(str, " "), "")

    If Nick_rx.Test(str_Normal) Then
        N_1a  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $4 $6 "), " ")
        N_1aa = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $4 $8 "), " ")
        N_1b  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $5 $6 "), " ")
        N_1bb = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $5 $8 "), " ")
        N_2   = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $5 "), " ")
        ' see test case results in output below

    ElseIf FLs_rx.Test(str_Normal) Then
        FLs_1a  = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $2 "), " ")
        FLs_1aa = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $3 "), " ")
        FLs_2   = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 "), " ")

    ElseIf FLm_rx.Test(str_Normal) Then
        ' Permutation 1:
        FLm1_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $2 "), " ")
        FLm1_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $3 "), " ")
        FLm1_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 "), " ")
        ' Permutation 2:
        FLm2_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $5 "), " ")
        FLm2_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $5 "), " ")
        FLm2_1b  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $6 "), " ")
        FLm2_1bb = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $6 "), " ")
        FLm2_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $6 "), " ")
        ' At this point, the odds are that only one of these permutations will match
        ' a different column.

    Else
        ' The data could not be matched against a valid form
    End If

    -----------------------------

    Test Cases

    Found form 'Nick'
    Input (raw)                 = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '
    Normal                      = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '
    Out type 1a  full middle    = 'St Van Helsing Jr , John1 (JJ) Bert "nick" '
    Out type 1aa full middle    = 'St Van Helsing Jr , John1 JJ Bert nick '
    Out type 1b  middle initial = 'St Van Helsing Jr , John1 (JJ) B "nick" '
    Out type 1bb middle initial = 'St Van Helsing Jr , John1 JJ B nick '
    Out type 2   middle initial = 'St Van Helsing Jr , John1 B '
    =======================================================
    Found form 'Nick'
    Input (raw)                 = 'John2 Bert "nick" Helsing ,Jr '
    Normal                      = 'John2 Bert "nick" Helsing ,Jr '
    Out type 1a  full middle    = 'Helsing Jr , John2 Bert "nick" '
    Out type 1aa full middle    = 'Helsing Jr , John2 Bert nick '
    Out type 1b  middle initial = 'Helsing Jr , John2 B "nick" '
    Out type 1bb middle initial = 'Helsing Jr , John2 B nick '
    Out type 2   middle initial = 'Helsing Jr , John2 B '
    =======================================================
    Found form 'Nick'
    Input (raw)                 = 'John3 Bert "nick" St Van Helsing ,Jr '
    Normal                      = 'John3 Bert "nick" St Van Helsing ,Jr '
    Out type 1a  full middle    = 'St Van Helsing Jr , John3 Bert "nick" '
    Out type 1aa full middle    = 'St Van Helsing Jr , John3 Bert nick '
    Out type 1b  middle initial = 'St Van Helsing Jr , John3 B "nick" '
    Out type 1bb middle initial = 'St Van Helsing Jr , John3 B nick '
    Out type 2   middle initial = 'St Van Helsing Jr , John3 B '
    =======================================================
    Found form 'First-Last (single)'
    Input (raw)                 = 'John4 Helsing '
    Normal                      = 'John4 Helsing '
    Out type 1a  no middle      = 'Helsing , John4 '
    Out type 1aa no middle      = 'Helsing , John4 '
    Out type 2                  = 'Helsing , John4 '
    =======================================================
    Found form 'First-Last (single)'
    Input (raw)                 = 'John5 (JJ) Helsing '
    Normal                      = 'John5 (JJ) Helsing '
    Out type 1a  no middle      = 'Helsing , John5 (JJ) '
    Out type 1aa no middle      = 'Helsing , John5 JJ '
    Out type 2                  = 'Helsing , John5 '
    =======================================================
    Found form 'First-Last (multi)'
    Input (raw)                 = 'John6 (JJ) Bert St Van Helsing ,Jr '
    Normal                      = 'John6 (JJ) Bert St Van Helsing ,Jr '
    Permutation 1:
    Out type 1a  no middle      = 'Bert St Van Helsing Jr , John6 (JJ) '
    Out type 1aa no middle      = 'Bert St Van Helsing Jr , John6 JJ '
    Out type 2                  = 'Bert St Van Helsing Jr , John6 '
    Permutation 2:
    Out type 1a  full middle    = 'St Van Helsing Jr , John6 (JJ) Bert '
    Out type 1aa full middle    = 'St Van Helsing Jr , John6 JJ Bert '
    Out type 1b  middle initial = 'St Van Helsing Jr , John6 (JJ) B '
    Out type 1bb middle initial = 'St Van Helsing Jr , John6 JJ B '
    Out type 2   middle initial = 'St Van Helsing Jr , John6 B '
    =======================================================
    Found form 'First-Last (multi)'
    Input (raw)                 = 'John7 Bert St Van Helsing ,Jr '
    Normal                      = 'John7 Bert St Van Helsing ,Jr '
    Permutation 1:
    Out type 1a  no middle      = 'Bert St Van Helsing Jr , John7 '
    Out type 1aa no middle      = 'Bert St Van Helsing Jr , John7 '
    Out type 2                  = 'Bert St Van Helsing Jr , John7 '
    Permutation 2:
    Out type 1a  full middle    = 'St Van Helsing Jr , John7 Bert '
    Out type 1aa full middle    = 'St Van Helsing Jr , John7 Bert '
    Out type 1b  middle initial = 'St Van Helsing Jr , John7 B '
    Out type 1bb middle initial = 'St Van Helsing Jr , John7 B '
    Out type 2   middle initial = 'St Van Helsing Jr , John7 B '
    =======================================================
    Form *** (unknown)
    Input (raw)                 = ' do(e)s not. match ,'
    Normal                      = ' do(e)s not match ,'
    =======================================================
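
The if-then-else dispatch over the three forms can be sketched in runnable form as follows. This is a Python illustration under stated assumptions, not the answer's VBA: the names `W`, `PREF`, `LAST`, `TAIL`, `key` and `type2_keys` are hypothetical, only the type 2 result is built, and it is formatted tightly as `Last Suffix,First M` (a choice made here so it lines up with the column 2 format) instead of the space-padded strings in the test output above.

```python
import re

W = r"""[^\ ,()"']"""                       # one name character (verbose-mode safe)
PREF = r"(?: [ ] ( \( ([^)]*?) \) ) )?"     # optional (preferred): 2 groups
LAST = rf"( {W}+ (?:[ ]{W}+)* )"            # single or multi-word last name
TAIL = r"(?: [ ]? , [ ]? (.*?) )? [ ]? $"   # optional suffix, end of string

# Nick form: groups 1 first, 5 initial, 9 last, 10 suffix.
NICK = re.compile(rf"""^ [ ]? ({W}+) {PREF}
    (?: [ ] ( ({W}) {W}* ) )?               # (4) full middle, (5) initial
    [ ] ( (["']) (.*?) \7 )                 # (6) quoted nick, (7) quote, (8) nick
    [ ] {LAST} {TAIL}""", re.VERBOSE)

# First-Last, single-word last name: groups 1 first, 4 last, 5 suffix.
FLS = re.compile(rf"^ [ ]? ({W}+) {PREF} [ ] ({W}+) {TAIL}", re.VERBOSE)

# First-Last, multi-word: 4 middle+last, 6 initial, 7 last, 8 suffix.
FLM = re.compile(rf"""^ [ ]? ({W}+) {PREF}
    [ ] ( (?: ( ({W}) {W}* ) [ ] ) {LAST} ) {TAIL}""", re.VERBOSE)

def normalize(value):
    """Strip periods, then collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", re.sub(r"\.", "", value))

def key(last, suffix, first, initial):
    """Build 'Last Suffix,First M', skipping any missing or empty part."""
    left = " ".join(p for p in (last, suffix) if p)
    right = " ".join(p for p in (first, initial) if p)
    return f"{left},{right}"

def type2_keys(raw):
    """Return (form name, candidate type 2 keys) for one column 1 value."""
    s = normalize(raw)
    m = NICK.match(s)
    if m:
        return "Nick", [key(m.group(9), m.group(10), m.group(1), m.group(5))]
    m = FLS.match(s)
    if m:
        return "First-Last (single)", [key(m.group(4), m.group(5), m.group(1), None)]
    m = FLM.match(s)
    if m:
        return "First-Last (multi)", [
            key(m.group(4), m.group(8), m.group(1), None),        # permutation 1
            key(m.group(7), m.group(8), m.group(1), m.group(6)),  # permutation 2
        ]
    return "unknown", []

print(type2_keys('John7 Bert St Van Helsing ,Jr '))
```

For the multi-word form, both candidate keys would be looked up in the other column; as noted above, the odds are that only one of them finds a match.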

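Here is a regex that might be useful. It gives you 6 capture groups, in the following order: first name, preferred name, middle name, nickname, last name, suffix.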

    ([a-z]+)\.?\s(?:(\([a-z]+\))\s)?(?:([a-z]+)\.?\s)?(?:("[a-z]+")\s)?([a-z]+)(?:,\s([a-z]+))?

Here is an explanation:

    ([a-z]+)\.?\s            # First name, followed by optional '.' (required)
    (?:(\([a-z]+\))\s)?      # Preferred name, optional
    (?:([a-z]+)\.?\s)?       # Middle name, optional
    (?:("[a-z]+")\s)?        # Nickname, optional
    ([a-z]+)                 # Last name, required
    (?:,\s([a-z]+))?         # Suffix, optional

For example, you can turn John (Johnny) B. "Abe" Smith, Jr. into Smith Jr,John (Johnny) B "Abe" with the replacement \5 \6,\1 \2 \3 \4, or you can turn it into Smith Jr,John B with \5 \6,\1 \3
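
A quick way to try this six-group pattern is the Python sketch below. It assumes the character classes are `[a-z]` matched case-insensitively (the pattern would not match capitalized names otherwise), and it uses `match` plus `expand` so that the trailing period after "Jr" is dropped, since the pattern is not anchored at the end. The names `NAME` and `reorder` are hypothetical.

```python
import re

# Six groups: first, (preferred), middle, "nickname", last, suffix.
NAME = re.compile(
    r'([a-z]+)\.?\s(?:(\([a-z]+\))\s)?(?:([a-z]+)\.?\s)?(?:("[a-z]+")\s)?([a-z]+)(?:,\s([a-z]+))?',
    re.IGNORECASE,
)

def reorder(value, template):
    """Expand a backreference template against one name, or return None if no match."""
    m = NAME.match(value)
    return m.expand(template) if m else None

raw = 'John (Johnny) B. "Abe" Smith, Jr.'
print(reorder(raw, r'\5 \6,\1 \2 \3 \4'))
print(reorder(raw, r'\5 \6,\1 \3'))
```

One caveat: the spaces in these templates are literal, so a name with missing optional parts leaves stray spaces behind (for example "John Smith" comes out as "Smith ,John " with the second template), and the result would need a whitespace cleanup before comparison.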
