Read an Excel file from a URL using the readxl package

Consider a file on the internet, like this one (note the s in https): https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls

How can I read sheet 2 of the file into R?

The following code is close to what I want, but it fails:

    url1 <- 'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
    p1f <- tempfile()
    download.file(url1, p1f, mode = "wb")
    p1 <- read_excel(path = p1f, sheet = 2)

This works for me on Windows:

    library(readxl)
    library(httr)
    packageVersion("readxl")
    # [1] '0.1.1'

    GET(url1, write_disk(tf <- tempfile(fileext = ".xls")))
    df <- read_excel(tf, 2L)
    str(df)
    # Classes 'tbl_df', 'tbl' and 'data.frame': 20131 obs. of 8 variables:
    #  $ Code                        : chr "C115388" "C115800" "C115801" "C115802" ...
    #  $ Codelist Code               : chr NA "C115388" "C115388" "C115388" ...
    #  $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
    #  $ Codelist Name               : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
    #  $ CDISC Submission Value      : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
    #  $ CDISC Synonym(s)            : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
    #  $ CDISC Definition            : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
    #  $ NCI Preferred Term          : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
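The same download-then-read idea can be wrapped in a small helper that uses only base R's download.file(). This is a minimal sketch, not part of readxl: the names temp_path_for_url() and read_excel_url() are hypothetical, and it assumes the URL path ends in the file's real extension.

```r
# Hypothetical helper: build a temp file path that keeps the URL's extension,
# so that read_excel() can recognise the format from the file name.
temp_path_for_url <- function(url) {
  # Drop any query string or fragment before looking for the extension.
  clean <- sub("[?#].*$", "", url)
  ext <- tools::file_ext(clean)
  if (!nzchar(ext)) {
    stop("Cannot determine a file extension from: ", url)
  }
  tempfile(fileext = paste0(".", ext))
}

# Hypothetical wrapper: download to a temp file, read one sheet, clean up.
read_excel_url <- function(url, sheet = 1) {
  path <- temp_path_for_url(url)
  on.exit(unlink(path), add = TRUE)
  # mode = "wb" keeps the binary file intact on Windows.
  utils::download.file(url, path, mode = "wb", quiet = TRUE)
  readxl::read_excel(path, sheet = sheet)
}
```

With url1 defined as in the question, read_excel_url(url1, sheet = 2) would be the equivalent call. It needs readxl installed and network access.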

From this issue on GitHub (#278):

Some functionality for supporting more general inputs will be pulled out of readr, at which point readxl should be able to take advantage of it.

So we should be able to pass URLs directly to read_excel() in the (hopefully near) future.

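When I run the first three lines, I get a file in a temporary folder with no file extension, named filed3a2827f129. If I add the extension ".xls" to that file, it can be opened with OpenOffice.org Calc, and this is the upper-right corner of what the viewer panel displays for sheet 2: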

[Screenshot: part of sheet 2 as displayed in OpenOffice.org Calc]

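So I wondered whether pasting that file path would get read_excel to open it. It will not open the file under its original name, but it does open the renamed file: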

    > p1 <- read_excel(path = "/private/var/folders/yq/m3j1jqtj6hq6s5mq_v0jn3s80000gn/T/RtmpxfaZRt/filed3a2827f129.xls", sheet = 2)
    DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
    DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
    DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
    DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
    > str(p1)
    Classes 'tbl_df', 'tbl' and 'data.frame': 20131 obs. of 8 variables:
     $ Code                        : chr "C115388" "C115800" "C115801" "C115802" ...
     $ Codelist Code               : chr NA "C115388" "C115388" "C115388" ...
     $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
     $ Codelist Name               : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
     $ CDISC Submission Value      : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
     $ CDISC Synonym(s)            : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
     $ CDISC Definition            : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
     $ NCI Preferred Term          : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
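The renaming step can also be done from R rather than by hand. The following is a minimal sketch, assuming a readxl version that picks the format from the file name, as the behaviour above suggests. add_extension() is a hypothetical helper, and the small dummy file only stands in for the real download.

```r
# Hypothetical helper: give an existing file an extension by renaming it.
add_extension <- function(path, ext = ".xls") {
  new_path <- paste0(path, ext)
  if (!file.rename(path, new_path)) {
    stop("Could not rename ", path)
  }
  new_path
}

# Stand-in for the downloaded file: tempfile() alone yields a name
# with no extension, which is what the code in the question produced.
p1f <- tempfile()
writeBin(as.raw(0:3), p1f)

p1f_xls <- add_extension(p1f)
basename(p1f_xls)  # same name as before, now ending in ".xls"
```

With the real download in place of the dummy file, read_excel(p1f_xls, sheet = 2) should then behave like the call above. Creating the file with tempfile(fileext = ".xls") in the first place avoids the rename altogether.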