How to optimize reading from an Excel sheet in C#?

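I am working on an application that parses a range of roughly 5500×9 cells. I did manage to get the basics working, but I am fairly new to this, my solution is very basic, and it takes a very long time to fetch even 100 rows, let alone 5.5k. Now I am stuck, because none of the tutorials I have looked at so far really helped or offered better performance than my current code.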

    // Assumes the usual alias: using Excel = Microsoft.Office.Interop.Excel;
    Excel.Application xlApp;
    Excel.Workbook xlWorkBook;
    Excel.Worksheet xlWorkSheet;

    xlApp = new Excel.Application();
    xlWorkBook = xlApp.Workbooks.Open(@"C:\...\report.xlsx", 0, true, 5, "", "", true,
        Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
    xlWorkSheet = (Excel.Worksheet)xlWorkBook.Worksheets.get_Item(1);

    Excel.Range range = xlWorkSheet.UsedRange;
    int rows = range.Rows.Count;

    // Row 1 is the header; only the first few data rows are read here.
    // Every range.Cells[i, j] access below is a separate call into Excel.
    for (int i = 2; i <= 15; i++)
    {
        Devices.Add(new DeviceInfo(
            (string)(range.Cells[i, 1] as Excel.Range).Value2,
            (string)(range.Cells[i, 2] as Excel.Range).Value2,
            (string)(range.Cells[i, 3] as Excel.Range).Value2,
            (string)(range.Cells[i, 4] as Excel.Range).Value2,
            (string)(range.Cells[i, 5] as Excel.Range).Value2,
            (string)(range.Cells[i, 6] as Excel.Range).Value2,
            (string)(range.Cells[i, 7] as Excel.Range).Value2,
            (string)(range.Cells[i, 8] as Excel.Range).Value2,
            (string)(range.Cells[i, 9] as Excel.Range).Value2,
            (string)(range.Cells[i, 10] as Excel.Range).Value2,
            Convert.ToDateTime((range.Cells[i, 11] as Excel.Range).Value2),
            Convert.ToDateTime((range.Cells[i, 12] as Excel.Range).Value2)));
    }

    xlWorkBook.Close(false, System.Reflection.Missing.Value, System.Reflection.Missing.Value);
    xlApp.Quit();

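The only thing that comes to mind is to read an entire row as a single range and then process that data in the constructor. However, I don't think that would be practical either, since fetching all of the data would still take a long time.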

The DeviceInfo class is just a list of properties that match the columns in the Excel sheet.
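The single-range idea mentioned in the question can be taken one step further: with Interop, each `range.Cells[i, j]` access is a separate COM call, whereas reading `Value2` on a multi-cell range returns the whole block as one 2-D `object[,]` array in a single call. The sketch below shows only the mapping step over such an array. `DeviceRow`, `BulkRangeMapper`, and the three-column layout are hypothetical stand-ins for the real `DeviceInfo` class and its columns, and the array is assumed to come from something like `(object[,])xlWorkSheet.UsedRange.Value2`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down stand-in for the question's DeviceInfo class.
public sealed class DeviceRow
{
    public string Name { get; set; }
    public string Model { get; set; }
    public DateTime Installed { get; set; }
}

public static class BulkRangeMapper
{
    // 'values' is assumed to come from one bulk read, for example:
    //   object[,] values = (object[,])xlWorkSheet.UsedRange.Value2;
    // Interop hands back arrays with 1-based indices, so the bounds are
    // queried from the array instead of being hard-coded.
    public static List<DeviceRow> Map(object[,] values, int headerRows = 1)
    {
        int firstRow = values.GetLowerBound(0) + headerRows;
        int lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1);

        var result = new List<DeviceRow>(Math.Max(0, lastRow - firstRow + 1));
        for (int r = firstRow; r <= lastRow; r++)
        {
            result.Add(new DeviceRow
            {
                // Assumed layout: name, model, installation date.
                Name = Convert.ToString(values[r, firstCol]),
                Model = Convert.ToString(values[r, firstCol + 1]),
                Installed = ToDate(values[r, firstCol + 2])
            });
        }
        return result;
    }

    private static DateTime ToDate(object cell)
    {
        // Value2 reports date cells as OLE Automation doubles;
        // anything else is passed to the general converter.
        if (cell is double)
        {
            return DateTime.FromOADate((double)cell);
        }
        return Convert.ToDateTime(cell);
    }
}
```

Because the loop runs over an in-memory array, the number of calls into Excel no longer grows with the number of cells. One caveat: if the range covers a single cell, `Value2` returns a scalar rather than an array, so the cast to `object[,]` would need a guard.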

I had a similar requirement recently and ended up using EPPlus 4.1.0.

It is an excellent library that has been around for a while, is well documented, and is actively maintained.


You can install the NuGet package from the Package Manager Console (for example, Install-Package EPPlus -Version 4.1.0).


Their sample solution covers all the use cases with code; you can literally copy and paste it, modify a few things, and you will be good to go.

And yes, it is fast!**

Below is sample code for your use case, taken from the sample project on the CodePlex site.

    /*******************************************************************************
     * You may amend and distribute as you like, but don't remove this header!
     *
     * All rights reserved.
     *
     * EPPlus is an Open Source project provided under the
     * GNU General Public License (GPL) as published by the
     * Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
     *
     * EPPlus provides server-side generation of Excel 2007 spreadsheets.
     * See http://www.codeplex.com/EPPlus for details.
     *
     *
     *
     * The GNU General Public License can be viewed at http://www.opensource.org/licenses/gpl-license.php
     * If you unfamiliar with this license or have questions about it, here is an http://www.gnu.org/licenses/gpl-faq.html
     *
     * The code for this project may be used and redistributed by any means PROVIDING it is
     * not sold for profit without the author's written consent, and providing that this notice
     * and the author's name and all copyright notices remain intact.
     *
     * All code and executables are provided "as is" with no warranty either express or implied.
     * The author accepts no liability for any damage or loss of business that this product may cause.
     *
     *
     * Code change notes:
     *
     * Author          Change      Date
     *******************************************************************************
     * Jan Källman     Added       10-SEP-2009
     *******************************************************************************/
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.IO;
    using OfficeOpenXml;

    namespace EPPlusSamples
    {
        /// <summary>
        /// Simply opens an existing file and reads some values and properties
        /// </summary>
        class Sample2
        {
            public static void RunSample2(string FilePath)
            {
                Console.WriteLine("Reading column 2 of {0}", FilePath);
                Console.WriteLine();

                FileInfo existingFile = new FileInfo(FilePath);
                using (ExcelPackage package = new ExcelPackage(existingFile))
                {
                    // get the first worksheet in the workbook
                    ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
                    int col = 2; //The item description

                    // output the data in column 2
                    for (int row = 2; row < 5; row++)
                        Console.WriteLine("\tCell({0},{1}).Value={2}", row, col, worksheet.Cells[row, col].Value);

                    // output the formula in row 5
                    Console.WriteLine("\tCell({0},{1}).Formula={2}", 3, 5, worksheet.Cells[3, 5].Formula);
                    Console.WriteLine("\tCell({0},{1}).FormulaR1C1={2}", 3, 5, worksheet.Cells[3, 5].FormulaR1C1);

                    // output the formula in row 5
                    Console.WriteLine("\tCell({0},{1}).Formula={2}", 5, 3, worksheet.Cells[5, 3].Formula);
                    Console.WriteLine("\tCell({0},{1}).FormulaR1C1={2}", 5, 3, worksheet.Cells[5, 3].FormulaR1C1);
                } // the using statement automatically calls Dispose() which closes the package.

                Console.WriteLine();
                Console.WriteLine("Sample 2 complete");
                Console.WriteLine();
            }
        }
    }
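The sample above only prints a few cells, so here is a rough sketch of how the same calls might be adapted to the question's scenario of loading every data row into objects. It assumes EPPlus 4.x is referenced (where the `Worksheets` collection is 1-based, as in the sample) and that row 1 is a header. `DeviceRow`, `EpplusDeviceReader`, and the column positions are hypothetical and would need to match the real `DeviceInfo` class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using OfficeOpenXml;

// Hypothetical, trimmed-down stand-in for the question's DeviceInfo class.
public sealed class DeviceRow
{
    public string Name { get; set; }
    public string Model { get; set; }
    public DateTime Installed { get; set; }
}

public static class EpplusDeviceReader
{
    public static List<DeviceRow> Read(string path)
    {
        var result = new List<DeviceRow>();
        using (var package = new ExcelPackage(new FileInfo(path)))
        {
            // First worksheet, indexed as in the sample above.
            ExcelWorksheet sheet = package.Workbook.Worksheets[1];

            // Dimension is null when the sheet contains no cells.
            if (sheet.Dimension == null)
            {
                return result;
            }

            int lastRow = sheet.Dimension.End.Row;
            for (int row = 2; row <= lastRow; row++) // row 1 assumed to be the header
            {
                result.Add(new DeviceRow
                {
                    // Assumed column positions; adjust to the real sheet.
                    Name = sheet.Cells[row, 1].Text,
                    Model = sheet.Cells[row, 2].Text,
                    Installed = sheet.GetValue<DateTime>(row, 11)
                });
            }
        }
        return result;
    }
}
```

EPPlus reads the .xlsx file directly and does not start an Excel process, so this loop does not pay a COM round trip per cell.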

(** = If you find the performance unacceptable for your use case, you can take a look at this fork of EPPlus, which fixes some performance issues:

https://github.com/RadoslavGatev/EPPlus-Performance

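Note that these performance issues apply to extremely large Excel data sets, on the order of more than 50,000 cells across hundreds or thousands of sheets, which most use cases will never encounter.)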