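Rolling up data with a transpose
I want to roll my data up to the unique customer ID level, with each customer's observations transposed into columns. Here is a snapshot of my data: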
basedata <- structure(list(customer = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("a", "b", "d"), class = "factor"), obs = c(12L, 11L, 12L, 10L, 3L, 5L, 7L, 8L, 1L)), .Names = c("customer", "obs" ), class = "data.frame", row.names = c(NA, -9L))
or, printed as a table:
customer obs
a        12
a        11
a        12
a        10
b        3
b        5
b        7
d        8
d        1
I want to convert it into the following form:
customer obs1 obs2 obs3 obs4
a        12   11   12   10
b        3    5    7    -
d        8    1    -    -
I used the code below:
basedata$shopping <- unlist(tapply(rawdata$customer, rawdata$customer,
                                   function (x) seq(1, len = length(x))))
reshape(basedata, idvar = "customer", direction = "wide")
It gives the following error:
Error in `[.data.frame`(data, , timevar) : undefined columns selected
How can I do this in R and in Excel? Thanks.
x <- structure(list(customer = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("a", "b", "d"), class = "factor"), obs = c(12L, 11L, 12L, 10L, 3L, 5L, 7L, 8L, 1L)), .Names = c("customer", "obs" ), class = "data.frame", row.names = c(NA, -9L))
I chose to use a couple of additional packages (plyr and reshape2) because I find them easier and more general than the base packages.
library(plyr)
library(reshape2)
## add observation number within each customer
x2 <- ddply(x, "customer", transform, num = seq_along(customer))
## reshape to one row per customer, one column per observation number
dcast(x2, customer ~ num, value.var = "obs")
A base R way, assuming dat is the data:
> s <- split(dat$obs, dat$customer)
> df <- data.frame(do.call(rbind, lapply(s, function(x){ length(x) <- 4; x })))
> names(df) <- paste0('obs', seq(df))
> df
#   obs1 obs2 obs3 obs4
# a   12   11   12   10
# b    3    5    7   NA
# d    8    1   NA   NA
If you want the unique customer ID to be a column:
> df2 <- cbind(customer = rownames(df), df)
> rownames(df2) <- seq(nrow(df2))
> df2
#   customer obs1 obs2 obs3 obs4
# 1        a   12   11   12   10
# 2        b    3    5    7   NA
# 3        d    8    1   NA   NA
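The `length(x) <- 4` step above hard-codes the size of the largest customer group. As a minimal sketch, assuming the same two-column layout as the question's data, the width can be derived from the data instead. The names `n_max` and `padded` are illustrative and do not appear in the original answer.

```r
# Same data as in the question, rebuilt here so the snippet is self-contained
dat <- data.frame(
  customer = factor(c("a", "a", "a", "a", "b", "b", "b", "d", "d")),
  obs = c(12L, 11L, 12L, 10L, 3L, 5L, 7L, 8L, 1L)
)

# One vector of observations per customer
s <- split(dat$obs, dat$customer)

# The largest group decides how many obs columns are needed (illustrative name)
n_max <- max(lengths(s))

# Pad every group with NA up to n_max, then stack the groups as rows
padded <- do.call(rbind, lapply(s, function(x) { length(x) <- n_max; x }))

# Move the customer IDs from the row names into a proper column
df <- data.frame(customer = rownames(padded), padded)
rownames(df) <- NULL
names(df)[-1] <- paste0("obs", seq_len(n_max))
df
```

With this variant, adding a customer with five or more observations should widen the result automatically rather than silently truncating that customer's data.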
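I assume "basedata" and "rawdata" are supposed to be the same (or at least copies of each other). If so, you are just missing reshape's timevar argument. Without it, reshape falls back to its default, timevar = "time", and because the data has no column called time you get the "undefined columns selected" error.
Continuing from where you left off: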
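rawdata$shopping <- unlist(tapply(rawdata$customer, rawdata$customer,
                                  function (x) seq(1, len = length(x))))
## Alternative; seq_along(customer) is passed as the first argument because
## customer is a factor and ave() cannot store the counts in a factor:
## rawdata$shopping <- with(rawdata, ave(seq_along(customer), customer, FUN = seq_along))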
This is the actual reshaping step:
reshape(rawdata, idvar = "customer", timevar = "shopping", direction = "wide")
#   customer obs.1 obs.2 obs.3 obs.4
# 1        a    12    11    12    10
# 5        b     3     5     7    NA
# 8        d     8     1    NA    NA
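For reference, the whole base R route can be put together in one self-contained sketch. It assumes the question's data is stored in `rawdata`; the object name `wide` and the final cleanup of column and row names are illustrative additions, not part of the original answer.

```r
# Same data as in the question, rebuilt here so the snippet is self-contained
rawdata <- data.frame(
  customer = factor(c("a", "a", "a", "a", "b", "b", "b", "d", "d")),
  obs = c(12L, 11L, 12L, 10L, 3L, 5L, 7L, 8L, 1L)
)

# Number the observations within each customer. Unlike unlist(tapply(...)),
# ave() returns values in the original row order, so the data need not be
# sorted by customer first.
rawdata$shopping <- ave(seq_along(rawdata$customer), rawdata$customer,
                        FUN = seq_along)

# timevar names the column that says which wide column each row belongs to
wide <- reshape(rawdata, idvar = "customer", timevar = "shopping",
                direction = "wide")

# Optional cleanup: obs.1 -> obs1, and reset the leftover row names (1, 5, 8)
names(wide) <- sub("^obs\\.", "obs", names(wide))
rownames(wide) <- NULL
wide
```

The cleanup step is only cosmetic; it makes the column names match the obs1 to obs4 layout requested in the question.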