Importing an Excel sheet and validating the imported data with loose coupling

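I am trying to develop a module that reads Excel sheets (possibly from other data sources as well, so it should be loosely coupled) and converts them to entities so they can be saved.

The logic will be as follows: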

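  1. The Excel sheet can come in different formats; for example, the column names in the sheet can differ, so my system needs to be able to map different fields to my entities.
  2. For now I will assume the format defined above is always the same and hard-code it, instead of loading it dynamically from the database after it has been set up in some kind of configuration mapping UI.
  3. The data needs to be validated before it even gets mapped, so I should be able to validate it beforehand. We are not using XSD or anything like it, so I should validate against the object structure I am using as the import template.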
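To make point 1 concrete, here is a minimal sketch of what such a mapping could look like. The `ColumnMap` class and its method names are hypothetical and not part of my code; the idea is only that each sheet layout gets its own map from entity field names to sheet column names, which could be hard-coded now and loaded from the database later.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper: maps entity field names to the column names
// used by one particular sheet layout.
public class ColumnMap
{
    private readonly Dictionary<string, string> _fieldToColumn =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Registers which sheet column feeds a given entity field.
    public ColumnMap Map(string entityField, string sheetColumn)
    {
        _fieldToColumn[entityField] = sheetColumn;
        return this;
    }

    // Reads the cell for an entity field. Returns null when the field is
    // unmapped, the column is missing from the table, or the cell is DBNull.
    public string Read(DataRow row, string entityField)
    {
        string column;
        if (!_fieldToColumn.TryGetValue(entityField, out column)) return null;
        if (!row.Table.Columns.Contains(column)) return null;

        object value = row[column];
        return value is DBNull ? null : value.ToString();
    }
}
```

An import action would then call something like `map.Read(row, "Email")` instead of indexing the row with a fixed column name.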

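The problem is that I put something together, but I can't say I like what I did. My question is how I can improve the code below to make things more modular and fix the validation issues.

The code below is a mock-up and is not expected to work; it is only meant to show some structure of the design.

This is the code I have come up with so far. I have realized one thing: I need to improve my design pattern skills. But for now I need your help, if you can give it: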

    // The controller, a placeholder
    class UploadController
    {
        // Somewhere here we call the appropriate class and methods
        // in order to convert the Excel sheet to a DataSet
    }

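After the file is uploaded through an MVC controller, there could be different controllers specialized in importing certain behaviours. In this example I will be uploading person-related tables: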

    interface IDataImporter
    {
        void Import(DataSet dataset);
    }

    // We can use many other importers besides PersonImporter
    class PersonImporter : IDataImporter
    {
        // We divide the dataset into the appropriate data tables and call all the
        // IImportActions related to importing Person data.
        // We call the DataContext's insert functions here, since this way
        // we can make fewer database round trips.

        public string PersonTableName { get; set; }
        public string DemographicsTableName { get; set; }

        public void Import(DataSet dataset)
        {
            CreatePerson(dataset);
            CreateDemographics(dataset);
        }

        // We put different things in different methods to keep things clear. High cohesion.
        private void CreatePerson(DataSet dataset)
        {
            var personDataTable = GetDataTable(dataset, PersonTableName);
            IImportAction addOrUpdatePerson = new AddOrUpdatePerson();
            addOrUpdatePerson.MapEntity(personDataTable);
        }

        private void CreateDemographics(DataSet dataset)
        {
            var demographicsDataTable = GetDataTable(dataset, DemographicsTableName);
            IImportAction demoAction = new AddOrUpdateDemographic();
            demoAction.MapEntity(demographicsDataTable);
        }

        private DataTable GetDataTable(DataSet dataset, string tableName)
        {
            return dataset.Tables[tableName];
        }
    }

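I have IDataImporter and a specialized concrete class, PersonImporter. However, I am not sure it looks good so far. Things should be SOLID so that the design is easy to extend later in the project cycle, because this will be the foundation for future improvements. Let's keep going:

The IImportAction implementations are where the magic mostly happens. Instead of designing things around tables, I am developing them around behaviour, so any of them can be called to import things in a more modular way. For example, a single table may have two different actions.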

    interface IImportAction
    {
        void MapEntity(DataTable table);
    }

    // A sample import action, AddOrUpdatePerson
    class AddOrUpdatePerson : IImportAction
    {
        // Column names in the source table. Consider using default values as well?
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmployeeId { get; set; }
        public string Email { get; set; }

        public void MapEntity(DataTable table)
        {
            // Each action creates its own data context since they perform
            // different actions.
            using (var dataContext = new DataContext())
            {
                var emailValidation = ValidationFactory.EmailValidation.Value;

                foreach (DataRow row in table.Rows)
                {
                    if (!emailValidation.Validate(row[Email]))
                    {
                        LoggingService.LogWarning(emailValidation.ValidationMessage);
                    }

                    var person = new Person()
                    {
                        FirstName = (string)row[FirstName],
                        LastName = (string)row[LastName],
                        EmployeeId = (string)row[EmployeeId],
                        Email = (string)row[Email]
                    };
                    dataContext.SaveObject(person);
                }
                dataContext.SaveChangesToDatabase();
            }
        }
    }

    class AddOrUpdateDemographic : IImportAction
    {
        public string Name { get; set; }
        public string EmployeeId { get; set; }

        // So here, for example, we will need to save the dataContext first before passing it in
        // to get the PersonId from Person (we're assuming that we need PersonId for Demographics)
        public void MapEntity(DataTable table)
        {
            using (var dataContext = new DataContext())
            {
                foreach (DataRow row in table.Rows)
                {
                    var employeeId = int.Parse(row[EmployeeId].ToString());
                    var demographic = new Demographic()
                    {
                        Name = (string)row[Name],
                        // Assumes Person exposes a PersonId key property
                        PersonId = dataContext.People.First(t => t.EmployeeId == employeeId).PersonId
                    };
                    dataContext.SaveObject(demographic);
                }
                dataContext.SaveChangesToDatabase();
            }
        }
    }

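And then there is the validation, which is mostly where I struggle. The validation needs to be easy to extend and loosely coupled, and I also need to be able to call it beforehand instead of adding everything first.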

    public static class ValidationFactory
    {
        public static Lazy<IFieldValidation> PhoneValidation =
            new Lazy<IFieldValidation>(() => new PhoneNumberValidation());
        public static Lazy<IFieldValidation> EmailValidation =
            new Lazy<IFieldValidation>(() => new EmailValidation());
        // etc.
    }

    interface IFieldValidation
    {
        string ValidationMessage { get; set; }
        bool Validate(object value);
    }

    class PhoneNumberValidation : IFieldValidation
    {
        public string ValidationMessage { get; set; }

        public bool Validate(object value)
        {
            var validated = true; // let's say...
            var innerValue = (string)value;
            // Validate innerValue using Regex or something.
            // If validation fails, set the ValidationMessage property for logging.
            return validated;
        }
    }

    class EmailValidation : IFieldValidation
    {
        public string ValidationMessage { get; set; }

        public bool Validate(object value)
        {
            var validated = true; // let's say...
            var innerValue = (string)value;
            // Validate innerValue using Regex or something.
            // If validation fails, set the ValidationMessage property for logging.
            return validated;
        }
    }
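To illustrate what "calling the validation beforehand" could mean, here is a rough sketch of a validation pass over a DataTable that only collects errors and never maps or saves anything. `TableValidator`, `ValidationError` and `AddRule` are hypothetical names made up for this sketch; the predicate passed to `AddRule` could just as well be the `Validate` method of one of the field validations above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class ValidationError
{
    public int RowIndex { get; set; }   // zero-based; -1 means the column itself is missing
    public string Column { get; set; }
    public string Message { get; set; }
}

// Hypothetical pre-import validator: rules are registered per column name.
public class TableValidator
{
    private class Rule
    {
        public string Column;
        public Func<object, bool> IsValid;
        public string Message;
    }

    private readonly List<Rule> _rules = new List<Rule>();

    public TableValidator AddRule(string column, Func<object, bool> isValid, string message)
    {
        _rules.Add(new Rule { Column = column, IsValid = isValid, Message = message });
        return this;
    }

    // Checks every rule against every row and returns the failures.
    // Nothing is mapped or saved here, so it can run before the import.
    public List<ValidationError> Validate(DataTable table)
    {
        var errors = new List<ValidationError>();
        foreach (var rule in _rules)
        {
            if (!table.Columns.Contains(rule.Column))
            {
                errors.Add(new ValidationError { RowIndex = -1, Column = rule.Column, Message = "Column not found" });
                continue;
            }
            for (int i = 0; i < table.Rows.Count; i++)
            {
                if (!rule.IsValid(table.Rows[i][rule.Column]))
                {
                    errors.Add(new ValidationError { RowIndex = i, Column = rule.Column, Message = rule.Message });
                }
            }
        }
        return errors;
    }
}
```

With something like this, the importer could refuse to run any import action while the returned list is non-empty, and the same list could be logged or shown to the user.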

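I did the same thing on a project. The difference is that I didn't have to import Excel sheets but CSV files. I created a CSVValueProvider, so the CSV data was bound to my IEnumerable model automatically.

As for validation, I figured that going through all the rows and cells and validating them one by one is not very efficient, especially when the CSV file has thousands of records. So what I did was create some validation methods that scan the CSV data column by column instead of row by row, run a LINQ query on each column, and return the row numbers of the cells with invalid data. The invalid row number/column name pairs are then added to ModelState.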

Update:

Here is what I have done...

CSVReader class:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    // A class that can read and parse the data in a CSV file.
    public class CSVReader
    {
        // Regex expression that's used to parse the data in a line of a CSV file
        private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";

        // String array to hold the headers (column names)
        private string[] _headers;

        // List of string arrays to hold the data in the CSV file.
        // Each string array in the list represents one line (row).
        private List<string[]> _rows;

        // The StreamReader class that's used to read the CSV file.
        private StreamReader _reader;

        public CSVReader(StreamReader reader)
        {
            _reader = reader;
            Parse();
        }

        // Reads and parses the data from the CSV file
        private void Parse()
        {
            _rows = new List<string[]>();
            string[] row;
            int rowNumber = 1;

            // A "RowNumber" column is prepended so each row remembers its line in the file
            var headerLine = "RowNumber," + _reader.ReadLine();
            _headers = GetEscapedSVs(headerLine);
            rowNumber++;

            while (!_reader.EndOfStream)
            {
                var line = rowNumber + "," + _reader.ReadLine();
                row = GetEscapedSVs(line);
                _rows.Add(row);
                rowNumber++;
            }

            _reader.Close();
        }

        private string[] GetEscapedSVs(string data)
        {
            if (!data.EndsWith(","))
                data = data + ",";

            return GetEscapedSVs(data, ",", "\"");
        }

        // Parses each row by using the given separator and escape characters
        private string[] GetEscapedSVs(string data, string separator, string escape)
        {
            string[] result = null;
            int priorMatchIndex = 0;
            MatchCollection matches = Regex.Matches(data, string.Format(ESCAPE_SPLIT_REGEX, separator, escape));

            // Skip empty rows...
            if (matches.Count > 0)
            {
                result = new string[matches.Count];

                for (int index = 0; index <= result.Length - 2; index++)
                {
                    result[index] = data.Substring(priorMatchIndex, matches[index].Groups["Separator"].Index - priorMatchIndex);
                    priorMatchIndex = matches[index].Groups["Separator"].Index + separator.Length;
                }
                result[result.Length - 1] = data.Substring(priorMatchIndex, data.Length - priorMatchIndex - 1);

                for (int index = 0; index <= result.Length - 1; index++)
                {
                    // Strip the surrounding escape characters and unescape doubled ones
                    if (Regex.IsMatch(result[index], string.Format("^{0}.*[^{0}]{0}$", escape)))
                        result[index] = result[index].Substring(1, result[index].Length - 2);

                    result[index] = result[index].Replace(escape + escape, escape);

                    if (result[index] == null || result[index] == escape)
                        result[index] = "";
                }
            }

            return result;
        }

        // Returns the number of rows
        public int RowCount
        {
            get
            {
                if (_rows == null)
                    return 0;
                return _rows.Count;
            }
        }

        // Returns the number of headers (columns)
        public int HeaderCount
        {
            get
            {
                if (_headers == null)
                    return 0;
                return _headers.Length;
            }
        }

        // Returns the value in a given column name and row index
        public object GetValue(string columnName, int rowIndex)
        {
            if (rowIndex >= _rows.Count)
            {
                return null;
            }

            var row = _rows[rowIndex];
            int colIndex = GetColumnIndex(columnName);

            if (colIndex == -1 || colIndex >= row.Length)
            {
                return null;
            }

            var value = row[colIndex];
            return value;
        }

        // Returns the column index of the provided column name
        public int GetColumnIndex(string columnName)
        {
            int index = -1;

            for (int i = 0; i < _headers.Length; i++)
            {
                if (_headers[i].Replace(" ", "").Equals(columnName, StringComparison.CurrentCultureIgnoreCase))
                {
                    index = i;
                    return index;
                }
            }

            return index;
        }
    }

CSVValueProviderFactory class:

    using System;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Web.Mvc;

    public class CSVValueProviderFactory : ValueProviderFactory
    {
        public override IValueProvider GetValueProvider(ControllerContext controllerContext)
        {
            var uploadedFiles = controllerContext.HttpContext.Request.Files;

            if (uploadedFiles.Count > 0)
            {
                var file = uploadedFiles[0];
                var extension = file.FileName.Split('.').Last();

                if (extension.Equals("csv", StringComparison.CurrentCultureIgnoreCase))
                {
                    if (file.ContentLength > 0)
                    {
                        var stream = file.InputStream;
                        var csvReader = new CSVReader(new StreamReader(stream, Encoding.Default, true));

                        return new CSVValueProvider(controllerContext, csvReader);
                    }
                }
            }

            // Not a CSV upload: let the other value providers handle the request
            return null;
        }
    }

CSVValueProvider class:

    using System;
    using System.Globalization;
    using System.Linq;
    using System.Web.Mvc;

    // Represents a value provider for the data in an uploaded CSV file.
    public class CSVValueProvider : IValueProvider
    {
        private CSVReader _csvReader;

        public CSVValueProvider(ControllerContext controllerContext, CSVReader csvReader)
        {
            if (controllerContext == null)
            {
                throw new ArgumentNullException("controllerContext");
            }

            if (csvReader == null)
            {
                throw new ArgumentNullException("csvReader");
            }

            _csvReader = csvReader;
        }

        // Prefixes look like "csvData[3]" or "csvData[3].ColumnName"
        public bool ContainsPrefix(string prefix)
        {
            if (prefix.Contains('[') && prefix.Contains(']'))
            {
                if (prefix.Contains('.'))
                {
                    var header = prefix.Split('.').Last();

                    if (_csvReader.GetColumnIndex(header) == -1)
                    {
                        return false;
                    }
                }

                int index = int.Parse(prefix.Split('[').Last().Split(']').First());

                if (index >= _csvReader.RowCount)
                {
                    return false;
                }
            }

            return true;
        }

        public ValueProviderResult GetValue(string key)
        {
            if (!key.Contains('[') || !key.Contains(']') || !key.Contains('.'))
            {
                return null;
            }

            object value = null;
            var header = key.Split('.').Last();
            int index = int.Parse(key.Split('[').Last().Split(']').First());

            value = _csvReader.GetValue(header, index);

            if (value == null)
            {
                return null;
            }

            return new ValueProviderResult(value, value.ToString(), CultureInfo.CurrentCulture);
        }
    }

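As I mentioned before, I decided that validating with DataAnnotation attributes would not be efficient in my case. Validating the data row by row takes a long time for CSV files with thousands of rows. So I decided to validate the data in the Controller after model binding is done. I should also mention that I needed to validate the data in the CSV file against some data in the database. If you only need to validate things like email addresses or phone numbers, you might as well just use DataAnnotation.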

Here is a sample method for validating the email address column:

    private void ValidateEmailAddress(IEnumerable<CSVViewModel> csvData)
    {
        var invalidRows = csvData.Where(d => ValidEmail(d.EmailAddress) == false).ToList();

        foreach (var invalidRow in invalidRows)
        {
            // RowNumber starts at 2 for the first data row (row 1 is the header),
            // so subtracting 2 gives the zero-based index used by the model binder.
            var key = string.Format("csvData[{0}].{1}", invalidRow.RowNumber - 2, "EmailAddress");
            ModelState.AddModelError(key, "Invalid Email Address");
        }
    }

    private static bool ValidEmail(string email)
    {
        // Treat null the same as an empty string instead of letting IsMatch throw
        if (string.IsNullOrEmpty(email))
            return false;

        return new System.Text.RegularExpressions.Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,6}$").IsMatch(email);
    }
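The check against database data mentioned earlier can follow the same column-wise idea. The sketch below is only an illustration under assumptions: `ReferenceCheck` and `FindUnknown` are hypothetical names, and the known keys are assumed to have been fetched from the database in a single query beforehand, so the CSV rows are compared against an in-memory set instead of hitting the database once per row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReferenceCheck
{
    // Returns the row numbers whose key does not appear in knownKeys.
    // knownKeys is assumed to have been loaded from the database in one query.
    public static List<int> FindUnknown<T>(
        IEnumerable<T> rows,
        Func<T, string> keySelector,
        Func<T, int> rowNumberSelector,
        IEnumerable<string> knownKeys)
    {
        // Case-insensitive set lookup; a null key is treated as an empty string
        var known = new HashSet<string>(knownKeys, StringComparer.OrdinalIgnoreCase);

        return rows
            .Where(r => !known.Contains(keySelector(r) ?? string.Empty))
            .Select(rowNumberSelector)
            .ToList();
    }
}
```

The returned row numbers can then be turned into ModelState keys in the same way as in the email example.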

Update 2:

To validate with DataAnnotation, you just use DataAnnotation attributes in your CSVViewModel as shown below (CSVViewModel is the class your CSV data will be bound to in your controller action):

    public class CSVViewModel
    {
        // Use proper names for your CSV columns; these are just examples...
        [Required]
        public int Column1 { get; set; }

        [Required]
        [StringLength(30)]
        public string Column2 { get; set; }
    }
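During MVC model binding these attributes are evaluated for you and the failures end up in ModelState. If you ever want to run the same attributes outside of model binding (for instance in a batch import or a unit test), a minimal sketch using `Validator.TryValidateObject` from `System.ComponentModel.DataAnnotations` could look like this. `SampleRow` and `AnnotationCheck` are made-up names for the sketch, not part of the code above.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical model used only for this sketch
public class SampleRow
{
    [Required]
    [StringLength(30)]
    public string Name { get; set; }
}

public static class AnnotationCheck
{
    // Runs all DataAnnotation attributes on the model's properties
    // and returns the failures (an empty list means the model is valid).
    public static List<ValidationResult> Validate(object model)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(model, null, null);

        // The last argument asks for every property attribute to be checked,
        // not only [Required]
        Validator.TryValidateObject(model, context, results, true);
        return results;
    }
}
```

Each `ValidationResult` carries an `ErrorMessage` and the `MemberNames` it applies to, which is enough to build the same kind of ModelState key as above.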