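Transposing a data frame while preserving class/data type information

TL;DR

How can a data frame whose column vectors have different classes (one column is character, the next is numeric, yet another is logical, etc.) be transposed in a way that preserves the class/data type information?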

Example data:

    mydata <- data.frame(
      col0 = c("row1", "row2", "row3"),
      col1 = c(1, 2, 3),
      col2 = letters[1:3],
      col3 = c(TRUE, FALSE, TRUE)
    )

There is also an xlsx file that provides examples of both data orientations: https://github.com/rappster/stackoverflow/blob/master/excel/row-and-column-based-data.xlsx

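Problem

A simple t(), or somewhat more involved routines such as the ones mentioned in this post, are great, but they do not preserve the class information of the original data frame's columns. I am also aware that the class data.frame was never meant to store mixed class information within a single column.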
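To make the problem concrete, here is a minimal sketch based on the example data. `t()` converts the data frame to a matrix first, and since a matrix can hold only one type, all values end up as character strings. The `stringsAsFactors = FALSE` argument is an addition for this sketch so that it behaves the same on R versions before 4.0.

```r
# Example data from the question; stringsAsFactors = FALSE is an assumption
# added here so that the character columns stay character on R < 4.0
mydata <- data.frame(
  col0 = c("row1", "row2", "row3"),
  col1 = c(1, 2, 3),
  col2 = letters[1:3],
  col3 = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# t() goes through as.matrix(); a matrix has exactly one storage type
m <- t(mydata[ , -1])
typeof(m)       # "character"
class(m[1, 1])  # "character", although col1 was numeric

# Converting back to a data frame does not recover the original classes
back <- as.data.frame(m, stringsAsFactors = FALSE)
sapply(back, class)
```

Every column of `back` is character, so the numeric and logical information of `col1` and `col3` is lost after the round trip.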

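However, I would at least like to come as close as possible to simply "flipping" the data.frame. The intent is to shift the perspective from columns to rows: all elements within a row vector must belong to the same class, while classes may differ between row vectors.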
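One way to read this requirement, as a rough sketch rather than a complete solution, is to keep every former column as its own named vector inside a plain list. Each "row" is then homogeneous, while different rows may have different classes. The helper name `as_row_list` is hypothetical and not part of any package.

```r
# Example data from the question; stringsAsFactors = FALSE is an assumption
# added here so that the character columns stay character on R < 4.0
mydata <- data.frame(
  col0 = c("row1", "row2", "row3"),
  col1 = c(1, 2, 3),
  col2 = letters[1:3],
  col3 = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one named vector per former column.
# `col` names the column whose values become the element names.
as_row_list <- function(x, col) {
  keys <- as.character(x[[col]])
  rest <- x[setdiff(names(x), col)]
  lapply(rest, function(v) stats::setNames(v, keys))
}

rows <- as_row_list(mydata, "col0")
rows$col1            # named numeric vector with names row1, row2, row3
sapply(rows, class)  # one class per "row"
```

This keeps the class information intact, but the result is a list and not a rectangular object, so printing and two-dimensional indexing would still need extra methods.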

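Context

I often work on projects where people are used to representing time series data horizontally, rather than vertically (the "variables in columns" orientation) as we are all used to in R (where, IMHO, it also makes much more sense).

More than that, they use MS Excel extensively. I need to both read data in "wide format" and update existing Excel files by writing formulas directly from R via XLConnect and/or openxlsx (as opposed to doing the calculations in R and dumping the final results into an Excel file).

While I keep trying to tell them that using such an orientation means working against well-established standards across languages/tools (at least for R and MS Excel), it is unlikely that they will switch. So I have to deal with it somehow.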

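Current approach

So I came up with keeping an underlying list while making it "look and feel" as much like a data.frame as possible. It works, but it is quite involved. I suspect there is a smarter solution.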

Function def:

    transpose <- function (
      x,
      col = character(),
      rnames_or_col = c("col", "rnames")
    ) {
      rnames_or_col <- match.arg(rnames_or_col, c("col", "rnames"))

      ## Buffering column names //
      cnames <- if (length(col)) {
        x[[col]]
      } else {
        make.names(1:nrow(x))
      }

      ## Removing anchoring column //
      if (inherits(x, "data.table")) {
        x <- as.data.frame(x, stringsAsFactors = FALSE)
      }
      ## I don't like this part. Any suggestions on how a) build on top of existing
      ## data.table functionality b) the easiest way to make a data.table behave
      ## like a data.frame when indexing (remove operation below will yield
      ## "undesired" results from a data.frame perspective; it's fine in from
      ## data.table's perspective/paradigm of course)

      if (length(col)) {
        x <- x[ , -which(names(x) == col)]
      }

      ## Buffer classes //
      classes <- lapply(x, class)

      ## Buffer row names //
      rnames <- names(x)

      ## Listify //
      x <- lapply(as.list(x), function(row) {
        df <- do.call(data.frame, list(as.list(row), stringsAsFactors = FALSE))
        names(df) <- cnames
        df
      })
      names(x) <- rnames

      ## Actual row names or row names as first column //
      if (rnames_or_col == "col") {
        x <- lapply(x, function(ii) {
          data.frame(variable = row.names(ii), ii, stringsAsFactors = FALSE,
            row.names = NULL, check.names = FALSE)
        })
      }

      ## Class //
      class(x) <- c("df_transposed", class(x))
      x
    }

Print method:

    print.df_transposed <- function(object) {
      cat("df_transposed: \n")
      out <- do.call(rbind, object)
      rownames(out) <- NULL
      print(out)
    }

Getter and setter methods:

    "[<-.df_transposed" <- function(x, i, j, value) {
      x[[i]][ , j] <- value
      x
    }

    "[.df_transposed" <- function(x, i, j, drop = FALSE) {
    # foo <- function(x, i, j, drop = FALSE) {
      has_i <- !missing(i)
      has_j <- !missing(j)
      cls <- class(x)
      scope <- if (has_i) {
        i
      } else {
        1:length(x)
      }
      out <- lapply(unclass(x)[scope], function(ii) {
        nms <- names(ii)
        if (has_j) {
          tmp <- ii[ , j, drop = drop]
          names(tmp) <- nms[j]
          ## --> necessary due to `check.names` missing for `[.data.frame` :-/
          tmp
        } else {
          ii
        }
      })
      class(out) <- cls
      out
    }

Class function:

    class2 <- function(x) {
      sapply(x, function(ii) {
        value <- if ("variable" %in% names(ii)) {
          unlist(ii[, -1])
        } else {
          unlist(ii)
        }
        class(value)
      })
    }

Application

Example data:

    mydata <- data.frame(
      col0 = c("row1", "row2", "row3"),
      col1 = c(1, 2, 3),
      col2 = letters[1:3],
      col3 = c(TRUE, FALSE, TRUE)
    )

Actual transposing and print method:

    > (df_t <- transpose(mydata, col = "col0"))
    df_transposed: 
      variable row1  row2 row3
    1     col1    1     2    3
    2     col2    a     b    c
    3     col3 TRUE FALSE TRUE
    > (df_t2 <- transpose(mydata, col = "col0", rnames_or_col = "rnames"))
    df_transposed: 
         row1  row2 row3
    col1    1     2    3
    col2    a     b    c
    col3 TRUE FALSE TRUE

Printing the unclassed object:

    > unclass(df_t)
    $col1
      variable row1 row2 row3
    1     col1    1    2    3

    $col2
      variable row1 row2 row3
    1     col2    a    b    c

    $col3
      variable row1  row2 row3
    1     col3 TRUE FALSE TRUE

    > unclass(df_t2)
    $col1
         row1 row2 row3
    col1    1    2    3

    $col2
         row1 row2 row3
    col2    a    b    c

    $col3
         row1  row2 row3
    col3 TRUE FALSE TRUE

Class query:

    > class2(df_t)
           col1        col2        col3 
      "numeric" "character"   "logical" 

Indexing:

    > dat_t[1, ]
    df_transposed: 
      variable row1 row2 row3
    1        1    1    2    3
    > dat_t[, 1]
    df_transposed: 
      variable
    1        1
    2        1
    3        1
    >
    > dat_t[1, 2]
    df_transposed: 
         row1
    col1    1
    > dat_t[2, 3]
    df_transposed: 
         row2
    col2    b
    >
    > dat_t[1:2, ]
    df_transposed: 
      variable row1 row2 row3
    1        1    1    2    3
    2        1    a    b    c
    > dat_t[, 1:3]
    df_transposed: 
      variable row1  row2
    1        1    1     2
    2        1    a     b
    3        1 TRUE FALSE
    >
    > dat_t[c(1, 3), 2:4]
    df_transposed: 
         row1 row2 row3
    col1    1    2    3
    col3    1    0    1
