Import only the last column of an Excel file with SSIS

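I receive an Excel file every day. The number of columns in the file is not fixed. My requirement is to load only the last column into my table via SSIS. How can I dynamically identify the last used column?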

You can use a C# script (a Script Component configured as a source):

Make sure you add using System.Data.OleDb; to the Namespaces region, add an output column named LastCol, and select its data type.

    public override void CreateNewOutputRows()
    {
        string fileName = @"C:\test.xlsx";
        string sheetName = "Sheet1";
        string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName
            + ";Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";

        using (OleDbConnection xlConn = new OleDbConnection(cstr))
        {
            xlConn.Open();
            using (OleDbCommand xlCmd = xlConn.CreateCommand())
            {
                // Worksheets are addressed as [SheetName$] through OLE DB.
                xlCmd.CommandText = "SELECT * FROM [" + sheetName + "$]";
                xlCmd.CommandType = CommandType.Text;

                using (OleDbDataReader rdr = xlCmd.ExecuteReader())
                {
                    // FieldCount is the number of columns; column indexes are zero-based.
                    int lastCol = rdr.FieldCount - 1;
                    int rowCt = 0; // row counter

                    while (rdr.Read())
                    {
                        // Skip the header row (with HDR=No it is returned as data).
                        if (rowCt != 0)
                        {
                            Output0Buffer.AddRow();
                            // Assumes LastCol is an integer output column and every
                            // data cell in the last column is non-empty and numeric.
                            Output0Buffer.LastCol = Convert.ToInt32(rdr[lastCol]);
                        }
                        rowCt++;
                    }
                }
            }
        }
    }

Solution overview

Use a Script Task to:

  • Get the index of the last column
  • Convert the index to a column letter (e.g. 1 -> A) using the following function:

     Private Function GetExcelColumnName(columnNumber As Integer) As String
         Dim dividend As Integer = columnNumber
         Dim columnName As String = String.Empty
         Dim modulo As Integer

         While dividend > 0
             modulo = (dividend - 1) Mod 26
             columnName = Convert.ToChar(65 + modulo).ToString() & columnName
             dividend = CInt((dividend - modulo) / 26)
         End While

         Return columnName
     End Function

  • Build a SQL command that reads only the last column

  • Use this query as the Excel Source
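To make the overview concrete, here is a small standalone sketch (a console module, not the SSIS script itself) that converts a column index to its letter and builds the single-column range query. The module name, the BuildLastColumnQuery helper, and the sample values (a sheet named Sheet1 whose last used column is the 5th) are assumptions made for illustration only.

```vbnet
Imports System

Module LastColumnQueryDemo

    ' Same conversion as the function shown above: 1 -> A, 26 -> Z, 27 -> AA.
    Function GetExcelColumnName(columnNumber As Integer) As String
        Dim dividend As Integer = columnNumber
        Dim columnName As String = String.Empty

        While dividend > 0
            Dim modulo As Integer = (dividend - 1) Mod 26
            columnName = Convert.ToChar(65 + modulo).ToString() & columnName
            ' Integer division; (dividend - modulo) is always 26 * k + 1.
            dividend = (dividend - modulo) \ 26
        End While

        Return columnName
    End Function

    ' Hypothetical helper: builds a query over a one-column range such as [Sheet1$E:E].
    Function BuildLastColumnQuery(sheetName As String, lastColumn As Integer) As String
        Dim col As String = GetExcelColumnName(lastColumn)
        Return "SELECT * FROM [" & sheetName & col & ":" & col & "]"
    End Function

    Sub Main()
        ' Assumed sample values: sheet "Sheet1$", last used column = 5.
        Console.WriteLine(GetExcelColumnName(5))               ' E
        Console.WriteLine(BuildLastColumnQuery("Sheet1$", 5))  ' SELECT * FROM [Sheet1$E:E]
    End Sub

End Module
```

For a workbook whose last used column is the 28th, the same helper would produce a query over [Sheet1$AB:AB].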

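Detailed solution

This answer assumes that the sheet name is Sheet1 and that the programming language is VB.NET.

  1. First, create an SSIS variable of type String (i.e. @[User::strQuery])
  2. Add another variable that holds the Excel file path (i.e. @[User::ExcelFilePath])
  3. Add a Script Task, and select @[User::strQuery] as a ReadWrite variable and @[User::ExcelFilePath] as a ReadOnly variable (in the Script Task window)
  4. Set the script language to VB.NET and write the following script in the script editor window: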

Note: you have to import System.Data.OleDb

    ' Class-level fields used by the script
    Private m_strExcelPath As String
    Private m_strExcelConnectionString As String
    Private m_dtschemaTable As DataTable

    Public Sub Main()
        m_strExcelPath = Dts.Variables.Item("ExcelFilePath").Value.ToString()
        Dim strSheetname As String = String.Empty
        Dim intLastColumn As Integer = 0

        ' BuildConnectionString() is not shown in this answer; it must return
        ' an OLE DB connection string for the workbook at m_strExcelPath.
        m_strExcelConnectionString = Me.BuildConnectionString()

        Try
            Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
                If OleDBCon.State <> ConnectionState.Open Then
                    OleDBCon.Open()
                End If

                ' Get all worksheets
                m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, New Object() {Nothing, Nothing, Nothing, "TABLE"})

                ' Loop over the worksheets to get the first one
                ' (the workbook may contain temporary or deleted sheets)
                For Each schRow As DataRow In m_dtschemaTable.Rows
                    strSheetname = schRow("TABLE_NAME").ToString()

                    If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
                        Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "]", OleDBCon)
                            Dim dtTable As New DataTable("Table1")
                            cmd.CommandType = CommandType.Text

                            Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
                                daGetDataFromSheet.Fill(dtTable)
                            End Using

                            ' The column count is the 1-based index of the last column
                            intLastColumn = dtTable.Columns.Count
                        End Using

                        ' Once the first valid sheet is found there is no need to check the others
                        Exit For
                    End If
                Next

                OleDBCon.Close()
            End Using
        Catch ex As Exception
            Throw New Exception(ex.Message, ex)
        End Try

        ' Build a query over the last column only, e.g. [Sheet1$E:E]
        Dim strColumnname As String = GetExcelColumnName(intLastColumn)
        Dts.Variables.Item("strQuery").Value = "SELECT * FROM [" & strSheetname & strColumnname & ":" & strColumnname & "]"

        Dts.TaskResult = ScriptResults.Success
    End Sub

    Private Function GetExcelColumnName(columnNumber As Integer) As String
        Dim dividend As Integer = columnNumber
        Dim columnName As String = String.Empty
        Dim modulo As Integer

        While dividend > 0
            modulo = (dividend - 1) Mod 26
            columnName = Convert.ToChar(65 + modulo).ToString() & columnName
            dividend = CInt((dividend - modulo) / 26)
        End While

        Return columnName
    End Function
  5. Then add an Excel Connection Manager and select the Excel file you want to import (just pick a sample file to define the metadata the first time)
  6. Set the default value of the variable @[User::strQuery] to Select * from [Sheet1$]
  7. In the Data Flow Task, add an Excel Source, choose SQL command from variable as the data access mode, and select @[User::strQuery]
  8. Set the Data Flow Task's DelayValidation property to True
  9. Add the other components to the Data Flow Task
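The script in step 4 calls Me.BuildConnectionString(), which the answer does not show. The following is only a guess at what such a helper might look like, assuming the Microsoft ACE OLE DB 12.0 provider is installed. The module name, the path parameter, and the HDR/IMEX settings are assumptions, not part of the original answer. Inside the Script Task the helper would take no parameter and read m_strExcelPath instead.

```vbnet
Imports System

Module ConnectionStringSketch

    ' Hypothetical stand-in for the BuildConnectionString() helper used in step 4.
    ' HDR=YES treats the first row as headers; this does not change the column
    ' count, which is all the script needs.
    Function BuildConnectionString(excelPath As String) As String
        Return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & excelPath &
               ";Extended Properties=""Excel 12.0;HDR=YES;IMEX=1"""
    End Function

    Sub Main()
        ' Sample path only; in the package this comes from @[User::ExcelFilePath].
        Console.WriteLine(BuildConnectionString("C:\test.xlsx"))
    End Sub

End Module
```

The embedded double quotes around the Extended Properties value are required because that value itself contains semicolons.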

References

  • Importing Excel files with variable headers
  • Converting numbers to Excel letter column (VB.NET)

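No, you can't do this. The number of columns and their data types must be fixed in advance and cannot change; otherwise SSIS will fail. So there is no way to get the last column dynamically. A workaround is to use a macro inside Excel to extract the last column, and then use that output as the source for SSIS.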