Regular expressions in Excel VBA

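I am using the Microsoft regular expression engine in Excel VBA. I am very new to regex, but I have a pattern that works right now. I need to expand it and I am having trouble. Here is my code so far: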

    Sub ImportFromDTD()
        Dim sDTDFile As Variant
        Dim ffile As Long
        Dim sLines() As String
        Dim i As Long
        Dim J As Long
        Dim sExtract As String
        Dim Reg1 As RegExp
        Dim M1 As MatchCollection
        Dim M As Match
        Dim myRange As Range

        Set Reg1 = New RegExp
        ffile = FreeFile

        sDTDFile = Application.GetOpenFilename("DTD Files,*.XML", , _
            "Browse for file to be imported")
        If sDTDFile = False Then Exit Sub '(user cancelled import file browser)

        ' Read the whole file and split it into lines
        Open sDTDFile For Input Access Read As #ffile
        sLines = Split(Input$(LOF(ffile), #ffile), vbNewLine)
        Close #ffile

        Cells(1, 2) = "From DTD"
        J = 2

        For i = 0 To UBound(sLines)
            'Debug.Print "Line"; i; "="; sLines(i)
            With Reg1
                '.Pattern = "(\<\!ELEMENT\s)(\w*)(\s*\(\#\w*\)\s*\>)"
                .Pattern = "(\<\!ELEMENT\s)(\w*)(\s*\(\#\w*\)\s*\>)"
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
            End With

            If Reg1.Test(sLines(i)) Then
                Set M1 = Reg1.Execute(sLines(i))
                For Each M In M1
                    sExtract = M.SubMatches(1)
                    sExtract = Replace(sExtract, Chr(13), "")
                    Cells(J, 2) = sExtract
                    J = J + 1
                    'Debug.Print sExtract
                Next M
            End If
        Next i

        Set Reg1 = Nothing
    End Sub

Currently, I am matching on a set of data like this:

  <!ELEMENT DealNumber (#PCDATA) > 

and extracting DealNumber, but now I need to add another match on data like this:

 <!ELEMENT DealParties (DealParty+) > 

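and extract just DealParty, without the parentheses and the +.

I have been using this as a reference, which is awesome, but I am still a bit confused: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops

Edit

I have come across a few new scenarios that have to be matched.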

    Extract Deal
    <!ELEMENT Deal (DealNumber,DealType,DealParties) >

    Extract DealParty the ?,CR are throwing me off
    <!ELEMENT DealParty (PartyType,CustomerID,CustomerName,CentralCustomerID?,
     LiabilityPercent,AgentInd,FacilityNo?,PartyReferenceNo?,
     PartyAddlReferenceNo?,PartyEffectiveDate?,FeeRate?,ChargeType?) >

    Extract Deals
    <!ELEMENT Deals (Deal*) >

You can use this regex pattern:

  .Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>" 
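  1. This part

(#\w+|(\w+)\+)

says to match either

#A-Z0-9
A-Z0-9+

inside the parentheses,

i.e. to match either

(#PCDATA)
(DealParty+)

to validate the entire string.

  2. Then use the submatches to extract DealNumber for the first valid match, and DealParty for the other valid match.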
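To make the submatch numbering concrete, here is a minimal sketch that prints all three submatches for the two sample lines from the question. It assumes late binding via CreateObject("VBScript.RegExp"), so no library reference is required; the procedure name ShowSubMatches is made up for illustration.

```vba
Sub ShowSubMatches()
    ' Late binding: no reference to the VBScript regex library is needed.
    Dim re As Object
    Dim matches As Object
    Dim m As Object
    Dim s As Variant

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>"

    For Each s In Array("<!ELEMENT DealNumber (#PCDATA) >", _
                        "<!ELEMENT DealParties (DealParty+) >")
        If re.Test(s) Then
            Set matches = re.Execute(s)
            Set m = matches.Item(0)
            ' SubMatches(0): the element name after <!ELEMENT
            ' SubMatches(1): everything between the parentheses
            ' SubMatches(2): the name in front of "+"; empty for the #PCDATA form
            Debug.Print m.SubMatches(0), m.SubMatches(1), m.SubMatches(2)
        End If
    Next s
End Sub
```

Because SubMatches(2) is only filled by the "name followed by +" alternative, checking whether it is empty is one way to tell the two forms apart.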

Edited code below. Note that the submatch is now M.SubMatches(0).

    Sub ImportFromDTD()
        Dim sDTDFile As Variant
        Dim ffile As Long
        Dim sLines() As String
        Dim i As Long
        Dim J As Long
        Dim strIn As String
        Dim sExtract As String
        Dim Reg1 As RegExp
        Dim M1 As MatchCollection
        Dim M As Match
        Dim myRange As Range

        Set Reg1 = New RegExp
        J = 1

        ' Test string holding both kinds of declaration
        strIn = "<!ELEMENT Deal12Number (#PCDATA) > <!ELEMENT DealParties (DealParty+) >"

        With Reg1
            .Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>"
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
        End With

        If Reg1.Test(strIn) Then
            Set M1 = Reg1.Execute(strIn)
            For Each M In M1
                ' Prefer the name in front of "+"; fall back to the element name
                sExtract = M.SubMatches(2)
                If Len(sExtract) = 0 Then sExtract = M.SubMatches(0)
                sExtract = Replace(sExtract, Chr(13), "")
                Cells(J, 2) = sExtract
                J = J + 1
            Next M
        End If

        Set Reg1 = Nothing
    End Sub

Looking at your pattern, you have too many capture groups. You only want to capture PCDATA and DealParty. Try changing your pattern to:

    With Reg1
        .Pattern = "\<!ELEMENT\s+\w+\s+\(\W*(\w+)\W*\)"
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
    End With
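As a quick check of what this single capture group returns, the following sketch runs the pattern against the two sample lines from the question. It assumes late binding via CreateObject("VBScript.RegExp"); the procedure name TestContentPattern is hypothetical.

```vba
Sub TestContentPattern()
    Dim re As Object
    Dim s As Variant

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\<!ELEMENT\s+\w+\s+\(\W*(\w+)\W*\)"

    For Each s In Array("<!ELEMENT DealNumber (#PCDATA) >", _
                        "<!ELEMENT DealParties (DealParty+) >")
        If re.Test(s) Then
            ' The only capture group is the first word inside the parentheses
            Debug.Print re.Execute(s).Item(0).SubMatches(0)
        End If
    Next s
End Sub
```

With these two inputs the captured values should be PCDATA and DealParty. The element name itself (DealNumber) is not captured by this pattern, so if that is the value needed for the #PCDATA case, it would have to be captured separately.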

Here is a stub of this pattern on Regex101.
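For the additional scenarios in the question's edit (comma-separated children, the ?, * and + suffixes, and a declaration that wraps over several lines), one possible approach is to run the regex over the whole file text instead of line by line and to capture everything up to the closing parenthesis. The sketch below is an assumption-laden illustration, not a tested solution for real DTD files: the pattern, the names ListDtdElements and CleanContent, and the abbreviated sample text are all made up for this example, and the pattern would not cope with nested parentheses in a content model.

```vba
Sub ListDtdElements()
    Dim re As Object
    Dim matches As Object
    Dim m As Object
    Dim dtdText As String

    ' Abbreviated sample; in the real macro this would be the whole file text.
    dtdText = "<!ELEMENT Deal (DealNumber,DealType,DealParties) >" & vbCrLf & _
              "<!ELEMENT DealParty (PartyType,CentralCustomerID?," & vbCrLf & _
              " LiabilityPercent,ChargeType?) >" & vbCrLf & _
              "<!ELEMENT Deals (Deal*) >"

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' [^)]* also matches line breaks, so a wrapped declaration is one match
    re.Pattern = "<!ELEMENT\s+(\w+)\s+\(([^)]*)\)\s*>"

    Set matches = re.Execute(dtdText)
    For Each m In matches
        ' SubMatches(0) = element name, SubMatches(1) = raw child list
        Debug.Print m.SubMatches(0); " -> "; CleanContent(m.SubMatches(1))
    Next m
End Sub

Function CleanContent(ByVal content As String) As String
    ' Remove occurrence markers, the # prefix, line breaks and spaces
    Dim ch As Variant
    For Each ch In Array("?", "*", "+", "#", vbCr, vbLf, " ")
        content = Replace(content, ch, "")
    Next ch
    CleanContent = content
End Function
```

The cleaned child list can then be split on "," with Split if each child name needs its own cell.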