Regular expressions in Excel VBA – Part 2

This is a follow-up to "Regular expressions in Excel VBA".

I have come up with some matches that I think go beyond the scope of my original question. Here is my existing code:

    Sub ImportFromDTD()

        Dim sDTDFile As Variant
        Dim ffile As Long
        Dim sLines() As String
        Dim i As Long
        Dim J As Long
        Dim sExtract As String
        Dim Reg1 As RegExp
        Dim M1 As MatchCollection
        Dim M As Match
        Dim myRange As Range

        Set Reg1 = New RegExp
        ffile = FreeFile

        sDTDFile = Application.GetOpenFilename("DTD Files,*.XML", , _
            "Browse for file to be imported")
        If sDTDFile = False Then Exit Sub '(user cancelled import file browser)

        ' Read the whole file, then split it into individual lines
        Open sDTDFile For Input Access Read As #ffile
        sLines = Split(Input$(LOF(ffile), #ffile), vbNewLine)
        Close #ffile

        Cells(1, 2) = "From DTD"
        J = 2

        For i = 0 To UBound(sLines)
            'Debug.Print "Line"; i; "="; sLines(i)
            With Reg1
                .Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>"
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
            End With

            ' Each line is tested on its own
            If Reg1.Test(sLines(i)) Then
                Set M1 = Reg1.Execute(sLines(i))
                For Each M In M1
                    ' Prefer the child name; fall back to the element name
                    sExtract = M.SubMatches(2)
                    If Len(sExtract) = 0 Then sExtract = M.SubMatches(0)
                    sExtract = Replace(sExtract, Chr(13), "")
                    Cells(J, 2) = sExtract
                    J = J + 1
                    'Debug.Print sExtract
                Next M
            End If
        Next i

        Set Reg1 = Nothing

    End Sub

Here is an excerpt from my file:

    <!ELEMENT ProductType (#PCDATA) >
    <!ELEMENT Invoices (InvoiceDetails+) >
    <!ELEMENT Deal (DealNumber,DealType,DealParties) >
    <!ELEMENT DealParty (PartyType,CustomerID,CustomerName,CentralCustomerID?,
                     LiabilityPercent,AgentInd,FacilityNo?,PartyReferenceNo?,
                     PartyAddlReferenceNo?,PartyEffectiveDate?,FeeRate?,ChargeType?) >
    <!ELEMENT Deals (Deal*) >

Currently, I am matching:

    extract ProductType
    <!ELEMENT ProductType (#PCDATA) >

    extract InvoiceDetails
    <!ELEMENT Invoices (InvoiceDetails+) >

I also need to extract the following:

    Extract Deal
    <!ELEMENT Deal (DealNumber,DealType,DealParties) >

    Extract DealParty the ?,CR are throwing me off
    <!ELEMENT DealParty (PartyType,CustomerID,CustomerName,CentralCustomerID?,
                     LiabilityPercent,AgentInd,FacilityNo?,PartyReferenceNo?,
                     PartyAddlReferenceNo?,PartyEffectiveDate?,FeeRate?,ChargeType?) >

    Extract Deal
    <!ELEMENT Deals (Deal*) >

Maybe I am missing something, but (sorry, I don't have VBA at hand right now, so this is VBScript; you will have to adapt a few things):

    Option Explicit

    Dim fileContents
    fileContents = WScript.CreateObject("Scripting.FileSystemObject").OpenTextFile("input.xml").ReadAll

    Dim matches
    With New RegExp
        .Multiline = True
        .IgnoreCase = False
        .Global = True
        .Pattern = "<!ELEMENT\s+([^\s>]+)\s+([^>]*)\s*>"
        Set matches = .Execute( fileContents )
    End With

    Dim match
    For Each match in matches
        WScript.Echo match.Submatches(0)
        WScript.Echo match.Submatches(1)
        WScript.Echo "---------------------------------------"
    Next

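As I see it, your main problem is that you are trying to match a multi-line regular expression against each line separately, one line at a time, instead of matching it against the full text. A declaration such as DealParty spans several lines, so no single line ever contains both the opening `<!ELEMENT` and the closing `>`.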
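For what it's worth, a VBA adaptation of the same idea might look roughly like the sketch below. I have not run it against your real file, so treat it as a starting point. The names `ReadAllText`, `ParseDtdElements` and `ImportFromDTDWholeText` are made up for illustration, it uses late binding (`CreateObject`) so no library reference is needed, and writing the content model into column 3 is my own assumption about what you might want on the sheet.

```vba
Option Explicit

' Hypothetical helper: read the whole file into one string
' (same Input$/LOF approach as in the question).
Function ReadAllText(ByVal path As String) As String
    Dim f As Long
    f = FreeFile
    Open path For Input Access Read As #f
    ReadAllText = Input$(LOF(f), #f)
    Close #f
End Function

' Hypothetical helper: run the pattern against the FULL text and return
' a Scripting.Dictionary of element name -> content model.
' All whitespace (including CR/LF) is removed from the content model,
' so declarations that span several lines come back on one line.
Function ParseDtdElements(ByVal dtdText As String) As Object
    Dim result As Object, re As Object, ws As Object, m As Object

    Set result = CreateObject("Scripting.Dictionary")

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = False
    re.Pattern = "<!ELEMENT\s+([^\s>]+)\s+([^>]*)\s*>"

    Set ws = CreateObject("VBScript.RegExp")
    ws.Global = True
    ws.Pattern = "\s+"

    For Each m In re.Execute(dtdText)
        ' Assignment (not .Add) so a repeated element name overwrites
        result(m.SubMatches(0)) = ws.Replace(m.SubMatches(1), "")
    Next m

    Set ParseDtdElements = result
End Function

' Illustrative driver: column 2 gets the element name,
' column 3 the cleaned content model (an assumption, adjust as needed).
Sub ImportFromDTDWholeText()
    Dim sDTDFile As Variant
    Dim elements As Object, key As Variant, r As Long

    sDTDFile = Application.GetOpenFilename("DTD Files,*.XML", , _
        "Browse for file to be imported")
    If VarType(sDTDFile) = vbBoolean Then Exit Sub ' user cancelled

    Set elements = ParseDtdElements(ReadAllText(CStr(sDTDFile)))

    Cells(1, 2) = "From DTD"
    r = 2
    For Each key In elements.Keys
        Cells(r, 2) = key
        Cells(r, 3) = elements(key)
        r = r + 1
    Next key
End Sub
```

From there you can decide per element what to extract. The `?`, `*` and `+` suffixes stay inside the content model string, so stripping them (or splitting on the commas) is a separate step applied to text that no longer contains line breaks.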