How do I read a single column from an Excel spreadsheet?

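I am trying to read a single column from an Excel document. I want to read the entire column, but obviously only store the cells that contain data. I also want to handle the case where a cell in the column is empty but cells further down still contain data, so that those later values are read as well. For example: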

| Column1  |
|----------|
| bob      |
| tom      |
| randy    |
| travis   |
| joe      |
|          |
| jennifer |
| sam      |
| debby    |

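Given this column, I don't mind getting a value of "" after joe, but I do want it to keep picking up values after the blank cell. However, assuming debby is the last value in the column, I don't want it to keep going for 35,000 rows past debby.

It is also safe to assume this will always be the first column.

This is what I have so far: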

    Excel.Application myApplication = new Excel.Application();
    myApplication.Visible = true;
    Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
    Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
    Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

    foreach (Excel.Range r in myRange)
    {
        MessageBox.Show(r.Text);
    }

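I have found plenty of similar examples for older versions of .NET, but nothing exactly like this, and I want to make sure I am doing it in a more modern way (assuming the approach has changed somewhat).

My current code reads the entire column, but it includes the blank cells after the last value.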


EDIT1

I like Isedlacek's answer, but I do have a problem with it, and I'm not sure whether it is specific to his code. If I use it like this:

    Excel.Application myApplication = new Excel.Application();
    myApplication.Visible = true;
    Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
    Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
    Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

    var nonEmptyRanges = myRange.Cast<Excel.Range>()
        .Where(r => !string.IsNullOrEmpty(r.Text));

    foreach (var r in nonEmptyRanges)
    {
        MessageBox.Show(r.Text);
    }

    MessageBox.Show("Finished!");

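The Finished! MessageBox never appears. I don't know why this happens, but the search never seems to actually finish. I tried adding a counter inside the loop to see whether it was just continuing to search down the column, but that doesn't appear to be the case... it seems to simply stop.

Where the Finished! MessageBox is, I tried to close the workbook and the spreadsheet, but that code never runs (as expected, since the MessageBox never runs).

If I close the Excel spreadsheet manually, I get a COMException:

COMException was unhandled by user code
Additional information: Exception from HRESULT: 0x803A09A2

Any ideas?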

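The answer depends on whether you want the bounding range of the used cells or the non-null values from the column.

Here is how you can efficiently get the non-null values from a column. Note that reading the whole of tempRange in one call (via its Value2 property, as the code does) is much faster than reading cell by cell, but the tradeoff is that the resulting array can use a lot of memory.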

    private static IEnumerable<object> GetNonNullValuesInColumn(_Application application, _Worksheet worksheet, string columnName)
    {
        // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
        var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

        // if there is no intersection, there are no values in the column
        if (tempRange == null)
            yield break;

        // get complete set of values from the temp range (potentially memory-intensive)
        var value = tempRange.Value2;

        // if value is NULL, it's a single cell with no value
        if (value == null)
            yield break;

        // if value is not an array, the temp range was a single cell with a value
        if (!(value is Array))
        {
            yield return value;
            yield break;
        }

        // otherwise, the value is a 2-D array (1-based, as returned by Excel)
        var value2 = (object[,]) value;
        var rowCount = value2.GetLength(0);
        for (var row = 1; row <= rowCount; ++row)
        {
            var v = value2[row, 1];
            if (v != null)
                yield return v;
        }
    }

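Here is an efficient way to get the smallest range in the column that contains all of the non-empty cells. Note that I still read the entire set of tempRange values at once, and then use the resulting array (if it is a multi-cell range) to determine which cells hold the first and last values. I then construct the bounding range once I know which rows have data.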

    private static Range GetNonEmptyRangeInColumn(_Application application, _Worksheet worksheet, string columnName)
    {
        // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
        var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

        // if there is no intersection, there are no values in the column
        if (tempRange == null)
            return null;

        // get complete set of values from the temp range (potentially memory-intensive)
        var value = tempRange.Value2;

        // if value is NULL, it's a single cell with no value
        if (value == null)
            return null;

        // if value is not an array, the temp range was a single cell with a value
        if (!(value is Array))
            return tempRange;

        // otherwise, the temp range is a 2D array which may have leading or trailing empty cells
        var value2 = (object[,]) value;

        // get the first and last rows that contain values
        var rowCount = value2.GetLength(0);
        int firstRowIndex;
        for (firstRowIndex = 1; firstRowIndex <= rowCount; ++firstRowIndex)
        {
            if (value2[firstRowIndex, 1] != null)
                break;
        }
        int lastRowIndex;
        for (lastRowIndex = rowCount; lastRowIndex >= firstRowIndex; --lastRowIndex)
        {
            if (value2[lastRowIndex, 1] != null)
                break;
        }

        // if there are no first and last used row, there is no used range in the column
        if (firstRowIndex > lastRowIndex)
            return null;

        // return the range
        return worksheet.Range[tempRange[firstRowIndex, 1], tempRange[lastRowIndex, 1]];
    }
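The first/last-row scan described above can also be tried without Excel. The following is only a minimal sketch under stated assumptions: `ColumnTrimSketch`, `TrimToUsedRows` and `MakeOneBasedColumn` are hypothetical names that do not appear in the answer, and the 1-based `object[,]` built here merely stands in for what `Value2` returns for a multi-cell range. Unlike the first method above, it keeps interior blanks (as empty strings) and only drops leading and trailing ones, which is the behaviour the question asks for.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnTrimSketch
{
    // Hypothetical helper (not part of the answer above): returns the values
    // between the first and last non-null rows of a single-column block,
    // keeping interior blanks as empty strings.
    public static List<object> TrimToUsedRows(object[,] values)
    {
        var result = new List<object>();
        int col = values.GetLowerBound(1);
        int first = values.GetLowerBound(0);
        int last = values.GetUpperBound(0);

        // skip leading empty cells
        while (first <= last && values[first, col] == null) first++;
        // skip trailing empty cells
        while (last >= first && values[last, col] == null) last--;

        for (int row = first; row <= last; row++)
            result.Add(values[row, col] ?? "");

        return result;
    }

    // Builds a 1-based, single-column object[,] as a stand-in for the array
    // Excel returns; used here only so the sketch runs without Excel.
    public static object[,] MakeOneBasedColumn(params object[] cells)
    {
        var arr = (object[,]) Array.CreateInstance(
            typeof(object), new[] { cells.Length, 1 }, new[] { 1, 1 });
        for (int i = 0; i < cells.Length; i++)
            arr[i + 1, 1] = cells[i];
        return arr;
    }

    public static void Main()
    {
        var column = MakeOneBasedColumn(
            "bob", "tom", "joe", null, "jennifer", "debby", null, null);

        foreach (var v in TrimToUsedRows(column))
            Console.WriteLine("[" + v + "]");
    }
}
```

With the sample data in `Main`, the blank after joe is kept as an empty string and the two trailing blanks after debby are dropped. In real use the array would come from `tempRange.Value2`, after the single-cell and null checks shown in the methods above.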

If you don't mind losing the empty rows entirely:

    var nonEmptyRanges = myRange.Cast<Excel.Range>()
        .Where(r => !string.IsNullOrEmpty(r.Text));

    foreach (var r in nonEmptyRanges)
    {
        // handle the r
        MessageBox.Show(r.Text);
    }

    /// <summary>
    /// Generic method which reads a column from the <paramref name="workSheetToReadFrom"/> sheet provided.<para />
    /// The <paramref name="dumpVariable"/> is the variable into which the column being read is dumped.<para />
    /// The <paramref name="workSheetToReadFrom"/> is the sheet from which the column is read.<para />
    /// The <paramref name="initialCellRowIndex"/>, <paramref name="finalCellRowIndex"/> and <paramref name="columnIndex"/> specify the length of the list to be read and the concrete column of the file from which to read.<para />
    /// Note that the type of data to be read must be specified as a generic type argument. The method constrains the generic type arguments that can be passed to it to types which implement the IConvertible interface provided by the framework (e.g. int, double, string, etc.).
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="dumpVariable"></param>
    /// <param name="workSheetToReadFrom"></param>
    /// <param name="initialCellRowIndex"></param>
    /// <param name="finalCellRowIndex"></param>
    /// <param name="columnIndex"></param>
    static void ReadExcelColumn<T>(ref List<T> dumpVariable, Excel._Worksheet workSheetToReadFrom, int initialCellRowIndex, int finalCellRowIndex, int columnIndex) where T : IConvertible
    {
        // Note: Value2 is only an object[,] when the range spans more than one cell.
        dumpVariable = ((object[,]) workSheetToReadFrom.Range[
                workSheetToReadFrom.Cells[initialCellRowIndex, columnIndex],
                workSheetToReadFrom.Cells[finalCellRowIndex, columnIndex]].Value2)
            .Cast<object>()
            .ToList()
            .ConvertAll(e => (T) Convert.ChangeType(e, typeof(T)));
    }