Convert Character Variables to Numeric
From sasCommunity
The task is to create a successor to a SAS data set, replacing each character variable in the original with a numeric variable. Here is a method which preserves variable order and variable attributes. First create a data set for the demonstration:
proc sql;
create table mixedtype
( Month num format=monyy7.
, MeanTemp char(6) Label = 'Mean Temperature (Celsius)'
, Rainfall char(6) Label = 'Rainfall (Centimeters)'
);
insert into mixedtype (Month, MeanTemp, Rainfall)
values ('01Jan2008'd, '5.5', '6.2')
values ('01Feb2008'd, '6.4', '5.9')
values ('01Mar2008'd, '9.8', '4.9')
;
Result:
Mean
Temperature Rainfall
Month (Celsius) (Centimeters)
-----------------------------------
JAN2008 5.5 6.2
FEB2008 6.4 5.9
MAR2008 9.8 4.9
Next, this somewhat tricky query which uses DICTIONARY.COLUMNS as a source of metadata and creates the bulk of another query:
select case type
when 'num' then name
when 'char' then catx ( ' '
, 'input ('
, name
, ', f32. ) as'
, name
, 'label ='
, quote( strip(label) )
)
else catx ( ' '
, quote('Error on type')
, 'as'
, name
)
end
into : selections separated by ' , '
from dictionary.columns
where libname='WORK' and memname='MIXEDTYPE';
To adapt for other tables change only the literals in the WHERE clause.
Result:
Month input ( MeanTemp , f32. ) as MeanTemp label = "Mean Temperature (Celsius)" input ( Rainfall , f32. ) as Rainfall label = "Rainfall (Centimeters)"
The INTO clause takes these expressions and concatenates them into a comma-separated string which it stores in the macro variable SELECTIONS, which in turn is used in the statement which actually does the work:
create table allnum as select &selections from mixedtype; %symdel selections;
Result:
Mean
Temperature Rainfall
Month (Celsius) (Centimeters)
-----------------------------------
JAN2008 5.5 6.2
FEB2008 6.4 5.9
MAR2008 9.8 4.9
Another query against DICTIONARY.COLUMNS compares the attributes of the original and replacement variables:
select name format=$10.
, memname format=$10.
, varnum format=10.
, type format=$8.
, length format=6.
, format format=$8.
, label format=$30.
from dictionary.columns
where libname='WORK' and memname in ('ALLNUM','MIXEDTYPE')
order by varnum, memname desc;
quit;
Result:
Column Column Member Number in Column Column Column Name Name Table Type Length Format Column Label ------------------------------------------------------------------------------------------ Month MIXEDTYPE 1 num 8 MONYY7. Month ALLNUM 1 num 8 MONYY7. MeanTemp MIXEDTYPE 2 char 6 Mean Temperature (Celsius) MeanTemp ALLNUM 2 num 8 Mean Temperature (Celsius) Rainfall MIXEDTYPE 3 char 6 Rainfall (Centimeters) Rainfall ALLNUM 3 num 8 Rainfall (Centimeters)
This demonstrates that variable order and variable attributes are preserved.
Limitation: This technique will not produce correct results for variables requiring special informats (most notably, times and dates).
See also Convert Numeric Variables to Character and Convert All Variables to Opposite Type.
