Excel VBA XMLparsing性能

我正在采取一些input数据在Excel中,parsing到XML和使用它来运行SQL存储过程,但我遇到了性能问题的XMLparsing。 input表看起来像这样:

Dates_|_Name1_Name2_Name3_..._NameX Date1 | Date2 | . . . | Date1Y| 

我有一些代码来循环通过每个单元格,并parsing出数据到一个XMLstring,但即使是一个约300×300的网格执行需要五分钟的顺序,我正在寻找使用数据集数千列。 我已经尝试了一些事情来帮助加快速度,比如将数据读入Variant,然后重复这个或排除DoEvents,但是我一直无法加快速度。 这是一个问题的代码位:

 Dim lastRow As Long lRows = (oWorkSheet.Cells(Rows.Count, 1).End(xlUp).Row) Dim lastColumn As Long lCols = (oWorkSheet.Cells(1, Columns.Count).End(xlToLeft).Column) Dim sheet As Variant With Sheets(sName) sheet = .Range(.Cells(1, 1), .Cells(lRows, lCols)) End With ReDim nameCols(lCols) As String 

 resultxml = "<DataSet>" For i = 2 To rows resultxml = resultxml & "<DateRow>" For j = 1 To cols If Trim(sheet(i, j)) <> "" Then lResult = "<" & nameCols(j) & ">" rResult = "</" & nameCols(j) & ">" tmpValue = Trim(sheet(i, j)) If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then If Len(tmpValue) >= 8 Then tmpValue = Format(tmpValue, "yyyy-mm-dd") End If End If resultxml = resultxml & lResult & tmpValue & rResult DoEvents End If Next j resultxml = resultxml & "</DateRow>" Next i resultxml = resultxml & "</DataSet>" 

任何意见,以获得运行时间将不胜感激。

考虑使用MSXML ,这是一个全面的符合W3C标准的XML API库,您可以使用它来使用DOM方法( createElementappendChildsetAttribute )构buildXML,而不是连接文本string。 XML不是一个文本文件,而是一个带有编码和树结构的标记文件。 Excel通过引用或后期绑定配备了MSXML COM对象,并且可以从Excel数据中迭代地构build树,如下所示。

随着300行12列的随机date,下面甚至没有花一分钟(点击macros后几秒钟),它甚至漂亮的打印原始输出与换行符和缩进使用embedded式XSLT样式表(如果你不漂亮, MSXML将文档输出为一个长而连续的行)。

input

名称日期电子表格

VBA (当然与实际数据alignment)

 Sub xmlExport() On Error GoTo ErrHandle ' VBA REFERENCE MSXML, v6.0 ' Dim doc As New MSXML2.DOMDocument60, xslDoc As New MSXML2.DOMDocument60, newDoc As New MSXML2.DOMDocument60 Dim root As IXMLDOMElement, dataNode As IXMLDOMElement, datesNode As IXMLDOMElement, namesNode As IXMLDOMElement Dim i As Long, j As Long Dim tmpValue As Variant ' DECLARE XML DOC OBJECT ' Set root = doc.createElement("DataSet") doc.appendChild root ' ITERATE THROUGH ROWS ' For i = 2 To Sheets(1).UsedRange.Rows.Count ' DATA ROW NODE ' Set dataNode = doc.createElement("DataRow") root.appendChild dataNode ' DATES NODE ' Set datesNode = doc.createElement("Dates") datesNode.Text = Sheets(1).Range("A" & i) dataNode.appendChild datesNode ' NAMES NODE ' For j = 1 To 12 tmpValue = Sheets(1).Cells(i, j + 1) If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then Set namesNode = doc.createElement("Name" & j) namesNode.Text = Format(tmpValue, "yyyy-mm-dd") dataNode.appendChild namesNode End If Next j Next i ' PRETTY PRINT RAW OUTPUT ' xslDoc.LoadXML "<?xml version=" & Chr(34) & "1.0" & Chr(34) & "?>" _ & "<xsl:stylesheet version=" & Chr(34) & "1.0" & Chr(34) _ & " xmlns:xsl=" & Chr(34) & "http://www.w3.org/1999/XSL/Transform" & Chr(34) & ">" _ & "<xsl:strip-space elements=" & Chr(34) & "*" & Chr(34) & " />" _ & "<xsl:output method=" & Chr(34) & "xml" & Chr(34) & " indent=" & Chr(34) & "yes" & Chr(34) & "" _ & " encoding=" & Chr(34) & "UTF-8" & Chr(34) & "/>" _ & " <xsl:template match=" & Chr(34) & "node() | @*" & Chr(34) & ">" _ & " <xsl:copy>" _ & " <xsl:apply-templates select=" & Chr(34) & "node() | @*" & Chr(34) & " />" _ & " </xsl:copy>" _ & " </xsl:template>" _ & "</xsl:stylesheet>" xslDoc.async = False doc.transformNodeToObject xslDoc, newDoc newDoc.Save ActiveWorkbook.Path & "\Output.xml" MsgBox "Successfully exported Excel data to XML!", vbInformation Exit Sub ErrHandle: MsgBox Err.Number & " - " & Err.Description, vbCritical Exit Sub End Sub 

产量

 <?xml version="1.0" encoding="UTF-8"?> <DataSet> <DataRow> <Dates>Date1</Dates> <Name1>2016-04-23</Name1> <Name2>2016-09-22</Name2> <Name3>2016-09-23</Name3> <Name4>2016-09-24</Name4> <Name5>2016-10-31</Name5> <Name6>2016-09-26</Name6> <Name7>2016-09-27</Name7> <Name8>2016-09-28</Name8> <Name9>2016-09-29</Name9> <Name10>2016-09-30</Name10> <Name11>2016-10-01</Name11> <Name12>2016-10-02</Name12> </DataRow> <DataRow> <Dates>Date2</Dates> <Name1>2016-06-27</Name1> <Name2>2016-08-14</Name2> <Name3>2016-07-08</Name3> <Name4>2016-08-22</Name4> <Name5>2016-11-03</Name5> <Name6>2016-07-28</Name6> <Name7>2016-08-23</Name7> <Name8>2016-11-01</Name8> <Name9>2016-11-01</Name9> <Name10>2016-08-11</Name10> <Name11>2016-08-18</Name11> <Name12>2016-09-23</Name12> </DataRow> ... 

我想比较用于将Excel范围转换为VBAstring的Psuedo-String Builder与Parfait的MSXML实现,将范围输出到xml。 我修改了Parfait的代码添加一个计时器,并允许非date值。

该数据有一个标题行和300行乘300列(90,000个单元格)。 虽然string生成器大约快了400%,但仍然使用Parfait的MSXML方法。 作为一个行业标准,它已经有了很好的文件。

在这里输入图像说明

 Sub XMLFromRange() Dim Start: Start = Timer Const AVGCELLLENGTH As Long = 100 Dim LG As Long, index As Long, x As Long, y As Long Dim data As Variant, Headers As Variant Dim result As String, s As String data = getDataArray Headers = getHeaderArray(data) result = Space(UBound(data, 1) * UBound(data, 2) * AVGCELLLENGTH) index = 1 Mid(result, index, 11) = "<DataSet>" & vbCrLf index = index + 11 For x = 2 To UBound(data, 1) Mid(result, index, 11) = "<DataRow>" & vbCrLf index = index + 11 For y = 1 To UBound(data, 2) LG = Len(Headers(1, y)) Mid(result, index, LG) = Headers(1, y) index = index + LG s = RTrim(data(x, y)) LG = Len(s) Mid(result, index, LG) = s index = index + LG LG = Len(Headers(2, y)) Mid(result, index, LG) = Headers(2, y) index = index + LG Next Mid(result, index, 12) = "</DataRow>" & vbCrLf index = index + 12 Next Mid(result, index, 12) = "</DataSet>" & vbCrLf index = index + 12 result = Left(result, index) MsgBox (Timer - Start) & " Second(s)" & vbCrLf & _ (UBound(data, 1) - 1) * UBound(data, 2) & " Data Cells", vbInformation, "Execution Time" Dim myFile As String myFile = ThisWorkbook.Path & "\demo.txt" Open myFile For Output As #1 Print #1, result Close #1 Shell "Notepad.exe " & myFile, vbNormalFocus End Sub Function getDataArray() With Worksheets("Sheet1") getDataArray = .Range(.Range("A" & .Rows.Count).End(xlUp), .Cells(1, .Columns.Count).End(xlToLeft)) End With End Function Function getHeaderArray(DataArray As Variant) Dim y As Long Dim Headers() As String ReDim Headers(1 To 2, 1 To UBound(DataArray, 2)) For y = 1 To UBound(DataArray, 2) Headers(1, y) = "<" & DataArray(1, y) & ">" Headers(2, y) = "</" & DataArray(1, y) & ">" & vbCrLf Next getHeaderArray = Headers End Function