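Load a CSV file into a VBA array rather than an Excel worksheet

I can currently import CSV file data into Excel VBA with the code below and then work with the resulting table. It is surely not the best way, because I am only interested in some of the data and I delete the sheet once I have used it: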

    Sub CSV_Import()
        Dim ws As Worksheet, strFile As String

        Set ws = ActiveSheet 'set to current worksheet name

        strFile = Application.GetOpenFilename("Text Files (*.csv),*.csv", , "Please select text file...")

        With ws.QueryTables.Add(Connection:="TEXT;" & strFile, Destination:=ws.Range("A1"))
            .TextFileParseType = xlDelimited
            .TextFileCommaDelimiter = True
            .Refresh
        End With
    End Sub

Is it possible to simply load the CSV into a two-dimensional Variant array in VBA instead of going through an Excel worksheet?

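OK, it looks like you need two things: stream the data out of the file, and populate a 2-D array.

I have a 'Join2d' and a 'Split2d' function lying around (I recall posting them in another reply on StackOverflow a while ago). Have a look at the comments in the code: if you are handling large files, there are things you may need to know about efficient string handling.

However, they are not complicated functions to use: just paste the code if you're in a hurry.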

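Streaming the file is simple, BUT we are making assumptions about the file format: are the lines in the file delimited by carriage-return characters or by the carriage-return-and-linefeed pair? I am assuming CR rather than CRLF, but you need to check that.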
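One way to check the delimiter rather than assume it is to inspect the text once it has been read. The following is a minimal sketch, not part of the original answer: `DetectRowDelimiter` is a hypothetical helper name, and it assumes the whole file is already in a string and uses one line-ending style throughout. A quoted field containing an embedded line break could mislead it.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer):
' guess the row delimiter used in a block of text read from a file.
Public Function DetectRowDelimiter(ByRef strText As String) As String
    If InStr(1, strText, vbCrLf, vbBinaryCompare) > 0 Then
        DetectRowDelimiter = vbCrLf     ' Windows-style line endings
    ElseIf InStr(1, strText, vbLf, vbBinaryCompare) > 0 Then
        DetectRowDelimiter = vbLf       ' Unix-style line endings
    Else
        DetectRowDelimiter = vbCr       ' bare CR, or a single-line file
    End If
End Function
```

The result could then be passed as the RowDelimiter argument of a splitting routine instead of relying on a default.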

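Another assumption about the format is that numeric data will appear as-is, and string or character data will be encapsulated in quote marks. This *should* be true, but often isn't... And stripping out the quote marks adds a lot of processing (lots of allocating and deallocating strings) which you really don't want to be doing in a large array. I have short-cut the obvious cell-by-cell find-and-replace, but it is still an issue on large files.

Anyway, here's the source code. Watch out for line breaks inserted by StackOverflow's textbox control.

Running the code:

Note that you will need a reference to the Microsoft Scripting Runtime (system32\scrrun.dll)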
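Bulk quote stripping cannot cope with a quoted field that itself contains the field delimiter. Where that matters, a per-line parser is an option. The sketch below is an illustration only: `SplitCsvLine` is a hypothetical name, it assumes a single-character delimiter and the common convention that a doubled quote inside a quoted field stands for one literal quote, and it builds fields character by character, which is exactly the slow string handling described above. It trades speed for correctness.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer):
' split one CSV line into fields, honouring double-quoted fields.
' Inside quotes, a doubled quote is read as one literal quote mark
' and the delimiter is treated as ordinary text.
Public Function SplitCsvLine(ByVal strLine As String, _
                             Optional ByVal strDelim As String = ",") As String()
    Const DQ As String = """"           ' a single double-quote character
    Dim arrOut() As String
    Dim lngCount As Long
    Dim lngPos As Long
    Dim blnInQuotes As Boolean
    Dim strChar As String
    Dim strField As String

    ReDim arrOut(0 To 0)
    lngPos = 1
    Do While lngPos <= Len(strLine)
        strChar = Mid$(strLine, lngPos, 1)
        If blnInQuotes Then
            If strChar = DQ Then
                If Mid$(strLine, lngPos + 1, 1) = DQ Then
                    strField = strField & DQ    ' escaped quote: keep one
                    lngPos = lngPos + 1
                Else
                    blnInQuotes = False         ' closing quote
                End If
            Else
                strField = strField & strChar
            End If
        ElseIf strChar = DQ Then
            blnInQuotes = True                  ' opening quote
        ElseIf strChar = strDelim Then
            arrOut(lngCount) = strField         ' field complete
            lngCount = lngCount + 1
            ReDim Preserve arrOut(0 To lngCount)
            strField = vbNullString
        Else
            strField = strField & strChar
        End If
        lngPos = lngPos + 1
    Loop
    arrOut(lngCount) = strField                 ' last field on the line
    SplitCsvLine = arrOut
End Function
```

The returned array is zero-based, one element per field, with the enclosing quote marks already removed.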

    Private Sub test()
        Dim arrX As Variant
        arrX = ArrayFromCSVfile("MyFile.csv")
    End Sub

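Streaming a CSV file.

Note that I am assuming your file is in the temp folder: C:\Documents and Settings\[$USERNAME]\Local Settings\Temp. You will need to use filesystem commands to copy the file into a local folder: that is always quicker than working across the network.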

    Public Function ArrayFromCSVfile( _
        strName As String, _
        Optional RowDelimiter As String = vbCr, _
        Optional FieldDelimiter = ",", _
        Optional RemoveQuotes As Boolean = True _
    ) As Variant

        ' Load a file created by FileToArray into a 2-dimensional array
        ' The file name is specified by strName, and it is expected to exist
        ' in the user's temporary folder. This is a deliberate restriction:
        ' it's always faster to copy remote files to a local drive than to
        ' edit them across the network

        ' RemoveQuotes=TRUE strips out the double-quote marks (Char 34) that
        ' encapsulate strings in most csv files.

        On Error Resume Next

        Dim objFSO As Scripting.FileSystemObject
        Dim arrData As Variant
        Dim strFile As String
        Dim strTemp As String

        Set objFSO = New Scripting.FileSystemObject
        strTemp = objFSO.GetSpecialFolder(Scripting.TemporaryFolder).ShortPath
        strFile = objFSO.BuildPath(strTemp, strName)
        If Not objFSO.FileExists(strFile) Then  ' raise an error?
            Exit Function
        End If

        Application.StatusBar = "Reading the file... (" & strName & ")"

        If Not RemoveQuotes Then
            ' Split2d (not Join2d) turns the text into a 2-D array
            arrData = Split2d(objFSO.OpenTextFile(strFile, ForReading).ReadAll, RowDelimiter, FieldDelimiter)
            Application.StatusBar = "Reading the file... Done"
        Else
            ' we have to do some allocation here...

            strTemp = objFSO.OpenTextFile(strFile, ForReading).ReadAll
            Application.StatusBar = "Reading the file... Done"

            Application.StatusBar = "Parsing the file..."

            ' Strip quote marks that sit next to a row or field delimiter
            strTemp = Replace$(strTemp, Chr(34) & RowDelimiter, RowDelimiter)
            strTemp = Replace$(strTemp, RowDelimiter & Chr(34), RowDelimiter)
            strTemp = Replace$(strTemp, Chr(34) & FieldDelimiter, FieldDelimiter)
            strTemp = Replace$(strTemp, FieldDelimiter & Chr(34), FieldDelimiter)

            ' ...and any quote mark at the very end or very start of the text
            If Right$(strTemp, 1) = Chr(34) Then
                strTemp = Left$(strTemp, Len(strTemp) - 1)
            End If

            If Left$(strTemp, 1) = Chr(34) Then
                strTemp = Right$(strTemp, Len(strTemp) - 1)
            End If

            Application.StatusBar = "Parsing the file... Done"
            arrData = Split2d(strTemp, RowDelimiter, FieldDelimiter)
            strTemp = ""
        End If

        Application.StatusBar = False
        Set objFSO = Nothing
        ArrayFromCSVfile = arrData
        Erase arrData
    End Function
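Since the function above expects the file to be in the temp folder already, the copy step might be sketched as follows. This is an assumption-laden illustration, not part of the original answer: `CopyToTempFolder` is a hypothetical helper, and it uses late binding so it works without the Scripting Runtime reference.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer):
' copy a file into the user's temporary folder and return the local
' path, or an empty string if the source file does not exist.
Public Function CopyToTempFolder(ByVal strSourcePath As String) As String
    Dim objFSO As Object
    Dim strTarget As String

    ' Late binding: no reference to Microsoft Scripting Runtime needed here
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    If Not objFSO.FileExists(strSourcePath) Then Exit Function

    ' GetSpecialFolder(2) returns the temporary folder (TemporaryFolder = 2)
    strTarget = objFSO.BuildPath(objFSO.GetSpecialFolder(2).Path, _
                                 objFSO.GetFileName(strSourcePath))

    objFSO.CopyFile strSourcePath, strTarget, True   ' True = overwrite
    CopyToTempFolder = strTarget
End Function
```

A call such as CopyToTempFolder("\\server\share\MyFile.csv") (the share name here is purely illustrative) would then be followed by ArrayFromCSVfile("MyFile.csv").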

Here is the Split2d function, which creates a 2-dimensional VBA array from a string, and Join2d, which does the reverse:

    Public Function Split2d(ByRef strInput As String, _
                            Optional RowDelimiter As String = vbCr, _
                            Optional FieldDelimiter = vbTab, _
                            Optional CoerceLowerBound As Long = 0 _
                            ) As Variant

        ' Split up a string into a 2-dimensional array.

        ' Works like VBA.Strings.Split, for a 2-dimensional array.
        ' Check your lower bounds on return: never assume that any array in
        ' VBA is zero-based, even if you've set Option Base 0
        ' If in doubt, coerce the lower bounds to 0 or 1 by setting
        ' CoerceLowerBound
        ' Note that the default delimiters are those inserted into the
        ' string returned by ADODB.Recordset.GetString

        On Error Resume Next

        ' Coding note: we're not doing any string-handling in VBA.Strings -
        ' allocating, deallocating and (especially!) concatenating are SLOW.
        ' We're using the VBA Join & Split functions ONLY. The VBA Join,
        ' Split, & Replace functions are linked directly to fast (by VBA
        ' standards) functions in the native Windows code. Feel free to
        ' optimise further by declaring and using the Kernel string functions
        ' if you want to.

        ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
        '    Nigel Heffernan   Excellerando.Blogspot.com

        Dim i As Long
        Dim j As Long

        Dim i_n As Long
        Dim j_n As Long

        Dim i_lBound As Long
        Dim i_uBound As Long
        Dim j_lBound As Long
        Dim j_uBound As Long

        Dim arrTemp1 As Variant
        Dim arrTemp2 As Variant
        Dim arrData() As Variant

        arrTemp1 = Split(strInput, RowDelimiter)

        i_lBound = LBound(arrTemp1)
        i_uBound = UBound(arrTemp1)

        If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then
            ' clip out empty last row: a common artifact in data
            ' loaded from files with a terminating row delimiter
            i_uBound = i_uBound - 1
        End If

        i = i_lBound
        arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

        j_lBound = LBound(arrTemp2)
        j_uBound = UBound(arrTemp2)

        If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then
            ' ! potential error: first row with an empty last field...
            j_uBound = j_uBound - 1
        End If

        i_n = CoerceLowerBound - i_lBound
        j_n = CoerceLowerBound - j_lBound

        ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)

        ' As we've got the first row already... populate it
        ' here, and start the main loop from lbound+1

        For j = j_lBound To j_uBound
            arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
        Next j

        For i = i_lBound + 1 To i_uBound Step 1

            arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

            For j = j_lBound To j_uBound Step 1
                arrData(i + i_n, j + j_n) = arrTemp2(j)
            Next j

            Erase arrTemp2

        Next i

        Erase arrTemp1

        Application.StatusBar = False

        Split2d = arrData

    End Function


    Public Function Join2d(ByRef InputArray As Variant, _
                           Optional RowDelimiter As String = vbCr, _
                           Optional FieldDelimiter = vbTab, _
                           Optional SkipBlankRows As Boolean = False _
                           ) As String

        ' Join up a 2-dimensional array into a string. Works like the standard
        ' VBA.Strings.Join, for a 2-dimensional array.
        ' Note that the default delimiters are those inserted into the string
        ' returned by ADODB.Recordset.GetString

        On Error Resume Next

        ' Coding note: see the note in Split2d about avoiding string
        ' concatenation; rows are built with the VBA Join function.

        ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
        '    Nigel Heffernan   Excellerando.Blogspot.com

        Dim i As Long
        Dim j As Long
        Dim n As Long

        Dim i_lBound As Long
        Dim i_uBound As Long
        Dim j_lBound As Long
        Dim j_uBound As Long

        Dim arrTemp1() As String
        Dim arrTemp2() As String

        Dim strBlankRow As String
        Dim strRow As String

        i_lBound = LBound(InputArray, 1)
        i_uBound = UBound(InputArray, 1)

        j_lBound = LBound(InputArray, 2)
        j_uBound = UBound(InputArray, 2)

        ReDim arrTemp1(i_lBound To i_uBound)
        ReDim arrTemp2(j_lBound To j_uBound)

        ' A blank row joins to nothing but field delimiters:
        ' one fewer than the number of columns
        For j = j_lBound + 1 To j_uBound
            strBlankRow = strBlankRow & FieldDelimiter
        Next j

        n = i_lBound - 1

        For i = i_lBound To i_uBound

            For j = j_lBound To j_uBound
                arrTemp2(j) = InputArray(i, j)
            Next j

            strRow = Join(arrTemp2, FieldDelimiter)

            ' Blank rows are dropped here, while joining, instead of being
            ' stripped from the finished string afterwards
            If Not (SkipBlankRows And strRow = strBlankRow) Then
                n = n + 1
                arrTemp1(n) = strRow
            End If

        Next i

        If n >= i_lBound Then
            ReDim Preserve arrTemp1(i_lBound To n)
            Join2d = Join(arrTemp1, RowDelimiter)
        End If

        Erase arrTemp1

    End Function

Share and enjoy.

Yes, read it as a text file.

See this example:

    Option Explicit

    Sub Sample()
        Dim MyData As String, strData() As String

        Open "C:\MyFile.CSV" For Binary As #1
        MyData = Space$(LOF(1))
        Get #1, , MyData
        Close #1
        strData() = Split(MyData, vbCrLf)
    End Sub

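Follow-up

As I mentioned in the comments below, AFAIK there is no direct way of filling a 2-D array from a CSV. You will have to use the code given above, split it line by line, and finally fill a 2-D array, which can be cumbersome. Filling a column is easy, but if you specifically want the data from, say, row 5 to column 7, it becomes cumbersome because you have to check whether there are enough columns/rows in the data. Here is a basic example that gets column B into a 2-D array.

NOTE: I have not done any error handling. I am sure you can take care of that.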

Let's say our CSV file looks like this.

[Image: the sample CSV file]

When you run this code

    Option Explicit

    Const Delim As String = ","

    Sub Sample()
        Dim MyData As String, strData() As String, TmpAr() As String
        Dim TwoDArray() As String
        Dim i As Long, n As Long

        Open "C:\Users\Siddharth Rout\Desktop\Sample.CSV" For Binary As #1
        MyData = Space$(LOF(1))
        Get #1, , MyData
        Close #1
        strData() = Split(MyData, vbCrLf)

        n = 0

        For i = LBound(strData) To UBound(strData)
            If Len(Trim(strData(i))) <> 0 Then
                TmpAr = Split(strData(i), Delim)
                n = n + 1
                ReDim Preserve TwoDArray(1, 1 To n)
                '~~> TmpAr(1) : 1 for Col B, 0 would be A
                TwoDArray(1, n) = TmpAr(1)
            End If
        Next i

        For i = 1 To n
            Debug.Print TwoDArray(1, i)
        Next i
    End Sub

you will get the output shown below.

[Image: the resulting output]
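The column/row checking mentioned earlier can be handled by sizing the array from the data before filling it. The following is only a sketch under stated assumptions: `TextTo2DArray` is a hypothetical name, rows are assumed to be separated by CRLF as in the example above, blank lines are skipped, short rows are padded with empty strings, and quoted fields are not handled.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer):
' turn delimited text into a 1-based 2-D String array (row, column),
' padding short rows with empty strings.
Public Function TextTo2DArray(ByVal strText As String, _
                              Optional ByVal strDelim As String = ",") As String()
    Dim arrLines() As String, arrFields() As String
    Dim arrOut() As String
    Dim i As Long, j As Long
    Dim lngRows As Long, lngCols As Long

    arrLines = Split(strText, vbCrLf)

    ' First pass: count the non-blank rows and find the widest row
    For i = LBound(arrLines) To UBound(arrLines)
        If Len(Trim$(arrLines(i))) <> 0 Then
            lngRows = lngRows + 1
            arrFields = Split(arrLines(i), strDelim)
            If UBound(arrFields) + 1 > lngCols Then lngCols = UBound(arrFields) + 1
        End If
    Next i
    If lngRows = 0 Then Exit Function       ' no data: return an unallocated array

    ' Second pass: size the array once, then copy the fields across
    ReDim arrOut(1 To lngRows, 1 To lngCols)
    lngRows = 0
    For i = LBound(arrLines) To UBound(arrLines)
        If Len(Trim$(arrLines(i))) <> 0 Then
            lngRows = lngRows + 1
            arrFields = Split(arrLines(i), strDelim)
            For j = 0 To UBound(arrFields)  ' Split always returns a zero-based array
                arrOut(lngRows, j + 1) = arrFields(j)
            Next j
        End If
    Next i

    TextTo2DArray = arrOut
End Function
```

Because the array is dimensioned from the widest row, reading any (row, column) pair inside its bounds is safe, and a missing field simply comes back as an empty string.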

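By the way, I am curious: since you are doing this in Excel, why not use the built-in Workbooks.Open or QueryTables method and then read the range into a 2-D array? That would be much simpler...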
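A rough sketch of the Workbooks.Open route follows. It is an illustration rather than a tested recipe: `CsvToArrayViaWorkbook` is a hypothetical name, and it assumes the data has more than one cell (a single-cell UsedRange returns a scalar, not an array). Excel also applies its own type conversion while opening the file, so dates and leading zeros may not survive as text.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer):
' let Excel parse the CSV, copy the cells into a 1-based 2-D Variant
' array, then close the temporary workbook without saving.
Public Function CsvToArrayViaWorkbook(ByVal strFile As String) As Variant
    Dim wb As Workbook
    Dim blnUpdating As Boolean

    blnUpdating = Application.ScreenUpdating
    Application.ScreenUpdating = False      ' avoid flicker while the file opens

    Set wb = Workbooks.Open(Filename:=strFile, ReadOnly:=True)

    ' .Value on a multi-cell range returns a 2-D Variant array
    CsvToArrayViaWorkbook = wb.Worksheets(1).UsedRange.Value

    wb.Close SaveChanges:=False
    Application.ScreenUpdating = blnUpdating
End Function
```

No worksheet in the calling workbook is touched, so there is nothing to delete afterwards.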

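OK, after looking into this, the solution I have arrived at is to use ADODB (it requires a reference to ActiveX Data Objects). This loads the CSV file into an array without having to loop over rows and columns. It does require the data to be in good condition.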

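    Sub LoadCSVtoArray()
        Dim strPath As String, strCon As String, strSQL As String
        Dim cn As Object
        Dim rs As ADODB.Recordset
        Dim rsARR() As Variant

        ' The folder is the data source; the file name acts as the table name
        strPath = ThisWorkbook.Path & "\"

        Set cn = CreateObject("ADODB.Connection")
        ' Note: the Jet 4.0 provider is only available to 32-bit Office
        strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
        cn.Open strCon

        strSQL = "SELECT * FROM SAMPLE.csv;"
        Set rs = cn.Execute(strSQL)

        ' GetRows returns (field, record); transpose to get (row, column)
        rsARR = WorksheetFunction.Transpose(rs.GetRows)

        rs.Close
        cn.Close
        Set cn = Nothing

        ' Write the array to the sheet to check the result
        [a1].Resize(UBound(rsARR, 1), UBound(rsARR, 2)) = rsARR
    End Sub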

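Alternatively, you can use code like the following, after reading each line of the CSV file into the string you pass as csvline. You need a string array (A() in the routine below) to receive the column values. That array should be ReDim-ed to (0) before each call to CSVtoArray.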

    Public Sub CSVtoArray(A() As String, csvline As String)
    '***************************************************************************
    '* WARNING: Array A() needs to be Redim-ed (0) each time BEFORE routine is *
    '* called!!                                                                *
    '***************************************************************************
        Dim k As Integer, j As Integer

        k = InStr(csvline, ",") ' Or whatever delimiter you use
        j = UBound(A)
        j = j + 1
        ReDim Preserve A(j)
        If k = 0 Then
            A(j) = Trim(csvline)
            Exit Sub
        End If
        A(j) = Trim(Mid(csvline, 1, k - 1))
        CSVtoArray A(), Mid(csvline, k + 1)
    End Sub

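In this case, you need to make sure you ReDim (0) the array that will hold the column values before each call to the routine, otherwise you will have a memory overflow on your hands. Note that this code only loads one line at a time into the receiving array.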