使用ADO将Excel电子表格导入数组更快捷的方法

我正在尝试使用Excel 2007 VBA将大型Excel报表中的数据导入到新文件中。 到目前为止,我已经提出了两种方法来做到这一点:

  1. 让Excel实际打开文件(下面的代码),将所有数据收集到数组中,并将数组输出到同一个文件中的新工作表中,然后保存/closures它。

    Public Sub GetData() Dim FilePath As String FilePath = "D:\File_Test.xlsx" Workbooks.OpenText Filename:=FilePath, FieldInfo:=Array(Array(2, 2)) ActiveWorkbook.Sheets(1).Select End Sub 
  2. 使用ADO从已closures的工作簿中获取所有数据,将整个数据表导入一个数组(下面的代码),并从那里对数据进行sorting,然后将数据输出到一个新的工作簿并保存/closures。

      Private Sub PopArray() 'Uses ADO to populate an array that will be used to sort data Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset Dim Getvalue, SourceRange, SourceFile, dbConnectionString As String SourceFile = "D:\File_Test.xlsx" SourceRange = "B1:Z180000" dbConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _ "Data Source=" & SourceFile & ";" & _ "Extended Properties=""Excel 12.0 Xml;HDR=No"";" Set dbConnection = New ADODB.Connection dbConnection.Open dbConnectionString 'open the database connection Set rs = dbConnection.Execute("SELECT * FROM [" & SourceRange & "]") Arr = rs.GetRows UpBound = UBound(Arr, 2) rs.Close End Sub 

使用的testing文件有大约65000条logging进行sorting(约三分之一我将最终使用它)。 当ADO版本比开放式工作表稍微好一点时(〜44秒vs 40秒运行时间),我感到很失望。 我想知道是否有一些方法来改进ADO导入方法(或者一个完全不同的方法 – ExecuteExcel4Macro也许? – 如果有的话),这将提高我的速度。 我能想到的唯一的事情就是我使用"B1:Z180000"作为我的SourceRange作为最大范围,然后通过设置Arr = rs.GetRows来截断,以准确反映logging的总数。 如果这是造成减速的原因,我不知道如何去查找工作表中有多less行。

编辑 – 我使用范围(“A1:A”和我)=(数组)插入数据到新的工作表。

这个答案可能不是你正在寻找的,但我仍然觉得不得不根据你的附注或完全不同的方法发布] …]。

在这里,我正在处理200MB(以及更多)的文件,这些文件只是包含分隔符的文本文件。 我不把它们加载到Excel中了。 我也有Excel的速度太慢,需要加载整个文件的问题。 但是,使用Open方法打开这些文件时,Excel非常快速:

 Open strFileNameAndPath For Input Access Read Lock Read As #intPointer 

在这种情况下,Excel不会加载整个文件,而只是逐行阅读。 所以,Excel可以处理数据(转发),然后抓取下一行数据。 像这样的Excel不neet内存加载200MB。

使用这种方法,我然后将数据加载到本地安装的SQL中,将数据直接传输到我们的DWH(也是SQL)。 为了加快使用上述方法的传输,并将数据快速获取到SQL服务器,我将以1000行的数据块为单位传输数据。 Excel中的stringvariables最多可容纳20亿个字符。 所以,那里没有问题。

有人可能会问,为什么我不是简单地使用SSIS,如果我已经使用SQL的本地安装。 然而,问题是,我不是加载所有这些文件了。 使用Excel生成这个“导入工具”允许我将这些工具转发给其他人,他们现在正在为我上传所有这些文件。 给他们所有人访问SSIS不是一种select,也不可能使用一个可以放置这些文件的目标networking驱动器,并且SSIS会自动加载它们(大约10分钟左右)。

最后我的代码看起来像这样。

 Set conRCServer = New ADODB.Connection conRCServer.ConnectionString = "PROVIDER=SQLOLEDB; " _ & "DATA SOURCE=" & Ref.Range("C2").Value2 & ";" _ & "INITIAL CATALOG=" & Ref.Range("C4").Value & ";" _ & "Integrated Security=SSPI " On Error GoTo SQL_ConnectionError conRCServer.Open On Error GoTo 0 'Save the name of the current file strCurrentFile = ActiveWorkbook.Name 'Prepare a dialog box for the user to pick a file and show it ' ...if no file has been selected then exit ' ...otherwise parse the selection into it's path and the name of the file Call Application.FileDialog(msoFileDialogOpen).Filters.Clear Call Application.FileDialog(msoFileDialogOpen).Filters.Add("Extracts", "*.csv") Application.FileDialog(msoFileDialogOpen).Title = "Select ONE Extract to import..." Application.FileDialog(msoFileDialogOpen).AllowMultiSelect = False intChoice = Application.FileDialog(msoFileDialogOpen).Show If intChoice <> 0 Then strFileToPatch = Application.FileDialog(msoFileDialogOpen).SelectedItems(1) Else Exit Sub End If 'Open the Extract for import and close it afterwards intPointer = FreeFile() Open strFileNameAndPath For Input Access Read Lock Read As #intPointer intCounter = 0 strSQL = vbNullString Do Until EOF(intPointer) Line Input #intPointer, strLine If Left(strLine, 4) = """@@@" Then Exit Sub '********************************************************************* '** Starting a new SQL command '********************************************************************* If intCounter = 0 Then Set rstResult = New ADODB.Recordset strSQL = "set nocount on; " strSQL = strSQL & "insert into dbo.tblTMP " strSQL = strSQL & "values " End If '********************************************************************* '** Transcribe the current line into SQL '********************************************************************* varArray = Split(strLine, ",") strSQL = strSQL & " (" & varArray(0) & ", " & varArray(1) & ", N'" & varArray(2) & "', " strSQL = strSQL & " N'" & varArray(3) & "', N'" & varArray(4) & "', N'" & varArray(5) & "', " strSQL = strSQL & " N'" & varArray(6) & "', " & varArray(8) & ", N'" & varArray(9) & "', " strSQL = strSQL & " N'" & varArray(10) & "', N'" & varArray(11) & "', N'" & varArray(12) & "', " strSQL = strSQL & " N'" & varArray(13) & "', N'" & varArray(14) & "', N'" & varArray(15) & "' ), " '********************************************************************* '** Execute the SQL command in bulks of 1.000 '********************************************************************* If intCounter >= 1000 Then strSQL = Mid(strSQL, 1, Len(strSQL) - 2) rstResult.ActiveConnection = conRCServer On Error GoTo SQL_StatementError rstResult.Open strSQL On Error GoTo 0 If Not rstResult.EOF And Not rstResult.BOF Then strErrorMessage = "The server returned the following error message(s):" & Chr(10) While Not rstResult.EOF And Not rstResult.BOF strErrorMessage = Chr(10) & strErrorMessage & rstResult.Fields(0).Value rstResult.MoveNext Wend MsgBox strErrorMessage & Chr(10) & Chr(10) & "Aborting..." Exit Sub End If End If intCounter = intCounter + 1 Loop Close intPointer Set rstResult = Nothing Exit Sub SQL_ConnectionError: Y = MsgBox("Couldn't connect to the server. Please make sure that you have a working internet connection. " & _ "Do you want me to prepare an error-email?", 52, "Problems connecting to Server...") If Y = 6 Then Set OutApp = CreateObject("Outlook.Application") Set OutMail = OutApp.CreateItem(0) With OutMail .To = Ref.Range("C7").Value2 .CC = Ref.Range("C8").Value2 .Subject = "Problems connecting to database '" & Ref.Range("C4").Value & "' on server '" & Ref.Range("C2").Value & "'" .HTMLBody = "<span style=""font-size:10px"">---Automatically generated Error-Email---" & _ "</span><br><br>Error report from the file '" & _ "<span style=""color:blue"">" & ActiveWorkbook.Name & _ "</span>' located and saved on '<span style=""color:blue"">" & _ ActiveWorkbook.Path & "</span>'.<br>" & _ "Excel is not able to establish a connection to the server. Technical data to follow." & "<br><br>" & _ "Computer Name: <span style=""color:green;"">" & Environ("COMPUTERNAME") & "</span><br>" & _ "Logged in as: <span style=""color:green;"">" & Environ("USERDOMAIN") & "/" & Environ("USERNAME") & "</span><br>" & _ "Domain Server: <span style=""color:green;"">" & Environ("LOGONSERVER") & "</span><br>" & _ "User DNS Domain: <span style=""color:green;"">" & Environ("USERDNSDOMAIN") & "</span><br>" & _ "Operating System: <span style=""color:green;"">" & Environ("OS") & "</span><br>" & _ "Excel Version: <span style=""color:green;"">" & Application.Version & "</span><br>" & _ "<br><span style=""font-size:10px""><br>" & _ "<br><br>---Automatically generated Error-Email---" .Display End With Set OutMail = Nothing Set OutApp = Nothing End If Exit Sub SQL_StatementError: Y = MsgBox("There seems to be a problem with the SQL Syntax in the programming. " & _ "May I send an error-email to development team?", 52, "Problems with the coding...") If Y = 6 Then Set OutApp = CreateObject("Outlook.Application") Set OutMail = OutApp.CreateItem(0) With OutMail .To = Ref.Range("C8").Value2 '.CC = "" .Subject = "Problems with the SQL Syntax in file '" & ActiveWorkbook.Name & "'." .HTMLBody = "<span style=""font-size:10px"">" & _ "---Automatically generated Error-Email---" & _ "</span><br><br>" & _ "Error report from the file '" & _ "<span style=""color:blue"">" & _ ActiveWorkbook.Name & _ "</span>" & _ "' located and saved on '" & _ "<span style=""color:blue"">" & _ ActiveWorkbook.Path & _ "</span>" & _ "'.<br>" & _ "It seems that there is a problem with the SQL-Code within trying to upload an extract to the server." & _ "SQL-Code causing the problems:" & _ "<br><br><span style=""color:green;"">" & _ strSQL & _ "</span><br><br><span style=""font-size:10px"">" & _ "---Automatically generated Error-Email---" .Display End With Set OutMail = Nothing Set OutApp = Nothing End If Exit Sub End Sub 

我认为@Mr。 Mascaro是正确的将数据从Recordset传递到电子表格的最简单的方法是:

 Private Sub PopArray() ..... Set rs = dbConnection.Execute("SELECT * FROM [" & SourceRange & "]") '' This is faster Range("A1").CopyFromRecordset rs ''Arr = rs.GetRows End Sub 

但如果你仍然想使用Arrays你可以试试这个:

 Sub ArrayTest '' Array for Test Dim aSingleArray As Variant Dim aMultiArray as Variant '' Set values aSingleArray = Array("A","B","C","D","E") aMultiArray = Array(aSingleArray, aSingleArray) '' You can drop data from the Array using 'Resize' '' Btw, your Array must be transpose to use this :P Range("A1").Resize( _ UBound(aMultiArray(0), 1) + 1, _ UBound(aMultiArray, 1) + 1) = Application.Transpose(aMultiArray) End Sub