Iterate over a dataset and merge specific row pairs where data is empty, in R or Excel

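I have a data set with a few hundred rows. Most rows have complete information, but in some cases two rows share the same key, with some attributes repeated and others not. Here is an example: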

    Key  Campaign  Message  Stat1  Stat2  Stat3  Stat4
    123  Fun       yay      1      2
    123  temp      yay                    3      4

Intended result:

    123  Fun       yay      1      2      3      4

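Issues:

  1. The whole dataframe of several hundred records needs to be searched, and most of the records are not duplicates. Non-duplicates should be ignored.
  2. When rows are combined, the Campaign value that is not "temp" must be the one kept.
  3. All other columns where the data matches are fine as they are.
  4. For columns where one of the two values is null, the non-null value should be used in the new record.
  5. I am open to solutions in R, SQL, or Excel (VBA).

Thanks for any help!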

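This turned out to be a bit more involved than I expected, but here it is. I'm using a Collection to merge the duplicate keys. Change the IGNORE_TEMP constant to include or exclude temp records.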


    Sub mergeNonNulls()

        ' change this constant to ignore or include temp results
        Const IGNORE_TEMP As Boolean = True

        ' temporary store of merged rows
        Dim cMerged As New Collection

        ' data part of the table (adjust the range to cover your data)
        Dim data As Range
        Set data = ActiveSheet.[a2:g3]

        Dim rw As Range ' current row
        Dim r As Range ' temporary row
        Dim c As Range ' temporary cell

        Dim key As String
        Dim arr() As Variant
        Dim v As Variant
        Dim vv As Variant
        Dim i As Long
        Dim isChanged As Boolean

        For Each rw In data.Rows
            key = rw.Cells(1) ' the first column is the key
            If IGNORE_TEMP And rw.Cells(2) = "temp" Then
                DoEvents ' skip temp rows if enabled
            Else
                If Not contains(cMerged, key) Then
                    ' if this is a new key, just add it
                    arr = rw.Value
                    cMerged.Add arr, key
                Else
                    ' if the key exists: extract, merge blanks and replace
                    arr = cMerged(key)
                    ' iterate through cells in the current and stored rows,
                    ' and fill a blank stored cell from the current row
                    i = 1
                    isChanged = False
                    For Each c In rw.Cells
                        If Len(Trim(arr(1, i))) = 0 And Len(Trim(c)) > 0 Then
                            arr(1, i) = c
                            isChanged = True
                        End If
                        i = i + 1
                    Next
                    ' a Collection item cannot be updated in place, so if the
                    ' stored row was changed, remove it and add it back
                    If isChanged Then
                        cMerged.Remove key
                        cMerged.Add arr, key
                    End If
                End If
            End If
        Next

        ' output the result
        Dim rn As Long: rn = 1 ' output row
        Dim numRows As Long
        Dim numCols As Long
        With ActiveSheet.[a6] ' output start range
            For Each v In cMerged
                numRows = UBound(v, 1) - LBound(v, 1) + 1
                numCols = UBound(v, 2) - LBound(v, 2) + 1
                .Cells(rn, 1).Resize(numRows, numCols).Value = v
                rn = rn + 1
            Next
        End With

    End Sub

    ' function that checks if the key exists in a collection
    Function contains(col As Collection, key As String) As Boolean
        On Error Resume Next
        col.Item key
        contains = (Err.Number = 0)
        On Error GoTo 0
    End Function
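
One thing to watch: as far as I can tell from the loop, with IGNORE_TEMP set to True the temp rows are skipped entirely, so their Stat3/Stat4 values never reach the merged row, and with it set to False the Campaign that survives depends on which row comes first. Below is a minimal sketch of a per-cell rule that does not depend on row order. The helper name pickValue, the isCampaign flag and the TEMP_LABEL constant are my own assumptions for illustration and are not part of the code above.

```vba
' Hypothetical helper: decides which value survives when two rows
' with the same key are merged.
Function pickValue(stored As Variant, incoming As Variant, _
                   Optional isCampaign As Boolean = False) As Variant
    Const TEMP_LABEL As String = "temp" ' assumed placeholder text

    If Len(Trim(CStr(stored))) = 0 Then
        ' stored cell is blank: take whatever the incoming row has
        pickValue = incoming
    ElseIf isCampaign And LCase(Trim(CStr(stored))) = TEMP_LABEL _
           And Len(Trim(CStr(incoming))) > 0 Then
        ' stored Campaign is the placeholder: prefer the incoming value
        pickValue = incoming
    Else
        ' otherwise keep what is already stored
        pickValue = stored
    End If
End Function

Sub demoPickValue()
    Debug.Print pickValue("", 3)               ' returns 3
    Debug.Print pickValue("temp", "Fun", True) ' returns Fun
    Debug.Print pickValue("Fun", "temp", True) ' returns Fun
End Sub
```

To try it, set IGNORE_TEMP to False so that temp rows take part in the merge, and inside the For Each c loop assign arr(1, i) = pickValue(arr(1, i), c.Value, i = 2), treating column 2 as Campaign. The isChanged check can then be dropped and the stored row replaced on every merge.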