Reading names with special characters using R

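I have an Excel (xlsx) sheet in which, in the "PLAYERS" column, the names of the European players are wrapped in asterisks, while the South Americans' names are not. Something like this: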

    PLAYERS
    Neymar
    *Bale*
    Messi
    *Ronaldo*
    *Benzema*
    *Iniesta*
    DiMaria

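Is there a way, using R (or Excel itself), to split this dataset into two: one with the Europeans (the names with asterisks) and another with the South Americans? Of course, the dataset also contains other columns, such as "SALARY", "SCORED GOALS", "OFFSITE", "AGE", and so on.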

Thanks, Diego

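You can check whether the player name contains a "*" and write "European" or "South American" into a new column. If you like, you can then split the data frame into two data frames, one with the Europeans and the other with the South Americans: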

    df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7)
    df
    #    PLAYERS SALARY
    #1    Neymar      5
    #2 *Ronaldo*      6
    #3     Messi      7

    # check if there's a * in the PLAYERS column
    df$Location <- ifelse(grepl("\\*", df$PLAYERS), "European", "South American")
    df
    #    PLAYERS SALARY       Location
    #1    Neymar      5 South American
    #2 *Ronaldo*      6       European
    #3     Messi      7 South American

    # split the data based on location:
    dflist <- split(df, df$Location)
    dflist
    #$European
    #    PLAYERS SALARY Location
    #2 *Ronaldo*      6 European
    #
    #$`South American`
    #  PLAYERS SALARY       Location
    #1  Neymar      5 South American
    #3   Messi      7 South American

Now you can access each list element (each one is a data.frame) by typing its name:

    dflist[["European"]] # or "South American" instead
    #    PLAYERS SALARY Location
    #2 *Ronaldo*      6 European
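As a variation on the approach above, here is a minimal sketch that matches the asterisk literally with fixed = TRUE instead of escaping it in a regular expression, and also keeps a cleaned copy of each name. The sample values, the NAME column, and the variable names are assumptions for illustration; the real data would come from the xlsx file.

```r
# Sample data (assumed for illustration)
players <- data.frame(
  PLAYERS = c("Neymar", "*Bale*", "Messi", "*Ronaldo*"),
  SALARY  = 5:8,
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "*" as a literal character, so no escaping is needed
is_european <- grepl("*", players$PLAYERS, fixed = TRUE)

players$Location <- ifelse(is_european, "European", "South American")

# NAME is a hypothetical extra column holding the name without asterisks
players$NAME <- gsub("*", "", players$PLAYERS, fixed = TRUE)

# Subset directly with the logical vector instead of going through split()
europeans       <- players[is_european, ]
south_americans <- players[!is_european, ]
```

Subsetting with the logical vector gives two separately named data frames, which may be more convenient than a list when there are only two groups.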

You can split on that particular column with split, and name the resulting list with setNames:

    dat <- structure(list(PLAYERS = structure(c(6L, 1L, 5L, 7L, 2L, 4L, 3L),
        .Label = c("*Bale*", "*Benzema*", "DiMaria", "*Iniesta*", "Messi",
        "Neymar", "*Ronaldo*"), class = "factor")), .Names = "PLAYERS",
        class = "data.frame", row.names = c(NA, -7L))

    # split() returns the FALSE group (no asterisk) first, then the TRUE group
    setNames(split(dat, grepl("[*]", dat$PLAYERS)), nm = c("SoAm", "Euro"))
    #$SoAm
    #  PLAYERS
    #1  Neymar
    #3   Messi
    #7 DiMaria
    #
    #$Euro
    #    PLAYERS
    #2    *Bale*
    #4 *Ronaldo*
    #5 *Benzema*
    #6 *Iniesta*
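Naming the list by position depends on the order in which split() returns the groups, which is easy to get backwards. A minimal sketch that avoids this by splitting on an explicit character label, so each group carries its own name (the names group and by_group are illustrative, and the data is rebuilt as plain strings rather than a factor):

```r
# Same player names as in the question, as plain character strings
dat <- data.frame(
  PLAYERS = c("Neymar", "*Bale*", "Messi", "*Ronaldo*",
              "*Benzema*", "*Iniesta*", "DiMaria"),
  stringsAsFactors = FALSE
)

# Label each row explicitly: asterisk means European, otherwise South American
group <- ifelse(grepl("[*]", dat$PLAYERS), "Euro", "SoAm")

# The list names come from the labels, not from their position
by_group <- split(dat, group)

by_group$Euro$PLAYERS
by_group$SoAm$PLAYERS
```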

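Create a pivot table from the source data with PLAYERS for ROWS. Filter with Label Filters, Contains…, ~* (the tilde escapes the asterisk, which Excel would otherwise treat as a wildcard), then click the Grand Total. Return to the PT, select Does Not Contain… and click the Grand Total again.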