Converting a row of years into a column

To read the data into SAS, it needs to be in the following format:

    Country  Year  Indicator_1
    Belgium  1900  x1
    Belgium  1901  x2
    ...
    Belarus  1901  x1

However, most of my data is in the following format:

    Country  1900  1901  1902  ... etc
    Belgium  x1____x2___x3 ...etc
    Belarus  x1____x2___x3 ...etc

Is there a simple macro or VBA script that could help?

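Parse the indicator string into year variables

If you have more than 3 years of data, you will need to adjust the Y1900-Y1902 range in both the format and the array.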

    data original;
        infile datalines;
        format Country $20. YearIndicator $50.;
        input Country YearIndicator;
        format Y1900-Y1902 $4.;
        array y(*) y1900-y1902;
        do i = 1 to dim(y);
            y[i] = scan(YearIndicator,i,'_');
        end;
        drop i;
    datalines;
    Belgium x1____x2___x3
    Belarus x1____x2___x3
    run;
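To avoid editing the year range in two places when more years are added, the range could be held in macro variables. This is only a minimal sketch of the same step: the macro variable names `first_year` and `last_year` are my own (hypothetical), and it assumes the `$50.` width of YearIndicator is still wide enough for the longer string.

```sas
/* Hypothetical macro variables: set these to the range in your data */
%let first_year = 1900;
%let last_year  = 1902;

data original;
    infile datalines;
    /* Widen $50. if the underscore-separated string gets longer */
    format Country $20. YearIndicator $50.;
    input Country YearIndicator;
    /* Resolves to Y1900-Y1902 with the values above */
    format Y&first_year-Y&last_year $4.;
    array y(*) Y&first_year-Y&last_year;
    do i = 1 to dim(y);
        /* Runs of '_' count as a single delimiter */
        y[i] = scan(YearIndicator, i, '_');
    end;
    drop i;
datalines;
Belgium x1____x2___x3
Belarus x1____x2___x3
;
run;
```

The later steps that mention `y1900-y1902` would need the same substitution.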

Make the wide table tall

    proc transpose data=original
                   out=talldata(rename=(_NAME_=CYear COL1=Indicator));
        by country notsorted;
        var y1900-y1902;
    run;

Make the year numeric instead of character

    data talldata;
        format Country $20. Year 4. Indicator $4.;
        set talldata;
        year=input(compress(cyear,,'kd'),4.);
        drop cyear;
    run;

View the results

    proc print data=talldata;
    run;

Output

    Obs    Country    Year    Indicator
      1    Belgium    1900    x1
      2    Belgium    1901    x2
      3    Belgium    1902    x3
      4    Belarus    1900    x1
      5    Belarus    1901    x2
      6    Belarus    1902    x3

You can use a union query:

    SELECT Country, 1900 As SYear, [1900] As Indicator FROM Table
    UNION ALL
    SELECT Country, 1901 As SYear, [1901] As Indicator FROM Table
    <..>
    UNION ALL
    SELECT Country, 2010 As SYear, [2010] As Indicator FROM Table

If you cannot export the query directly, you can use it to create a table instead.
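If the wide data is already in SAS, the same union idea could probably be written with PROC SQL. This is a rough sketch rather than a tested solution: it assumes a dataset named `original` with character columns Y1900-Y1902, as built in the first answer, and the output table name `tall_sql` is made up for illustration.

```sas
proc sql;
    /* One SELECT per year column, stacked with UNION ALL */
    create table tall_sql as
    select Country, 1900 as Year, Y1900 as Indicator from original
    union all
    select Country, 1901 as Year, Y1901 as Indicator from original
    union all
    select Country, 1902 as Year, Y1902 as Indicator from original;
quit;
```

With many years the list of SELECTs gets long, so the PROC TRANSPOSE or data step approaches are likely easier to maintain.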

If the raw input data is regular enough, this can be done easily with a simple data step.

    data one;
        infile cards dlm=" _" missover;
        input country :$20. @;
        do year = 1900 to 1902;
            input indicator $ @;
            output;
        end;
    cards;
    Belgium x1____x2___x3
    Belarus x4____x5___x6
    ;
    run;

    /* check */
    proc print data=one;
    run;

    /* on lst
    Obs    country    year    indicator
      1    Belgium    1900    x1
      2    Belgium    1901    x2
      3    Belgium    1902    x3
      4    Belarus    1900    x4
      5    Belarus    1901    x5
      6    Belarus    1902    x6
    */