SAS gotchas
From sasCommunity
Computers don't make mistakes; programmers make mistakes. The computer just does what we tell it to do. Sometimes, what we tell it to do is very close to, but not exactly the same as, what we want it to do. This page contains examples of SAS code that will produce "incorrect" results without generating an error message. If you have your own examples, please feel free to add them.
John Hendrickx, Clinquest Europe
Conflict with builtin format names
If you create a user-defined format with the same name as a SAS builtin format, no error or warning appears, either in proc format or in the data step where you assign the format. But the builtin format is used in the output, and the format you defined is ignored! Try this:
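proc format;
  /* user-defined format that shares its name with the builtin weekv. format */
  value weekv
    1='week 1'
    2='week 2'
    3='week 3'
    4='week 4'
  ;
run;
data test;
  do weeknr=1 to 4;
    output;
  end;
  format weeknr weekv.;   /* no error or warning is issued here */
run;
proc print data=test;
run;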
The result:
Obs weeknr
1 1959-W53-06
2 1959-W53-07
3 1960-W01-01
4 1960-W01-02
This problem only occurs in SAS version 9. In version 8.2, there was no builtin weekv. format.
In my opinion, this one is a bug in SAS. SAS should either produce an error message that a reserved word was being used for a user-defined format or just use the format.
Update: Some built-in formats are protected, e.g. if you use the date. format instead of weekv., you get a note in the SAS log: "NOTE: Format DATE could not be written because it has the same name as an Institute-supplied format.". Formats percent., words., wordf. produce the same note in the log but pvalue., hdate., and s370ff. do not generate a note. So watch out!
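One way to sidestep the clash is to check whether a name is already taken before defining the format, and to pick a different name if it is. The following is a minimal sketch, assuming a SAS 9 release in which the DICTIONARY.FORMATS view is available to proc sql. The replacement name weeklbl is hypothetical, chosen only for illustration.

```sas
/* List any existing format called WEEKV. The SOURCE column  */
/* indicates where SAS found it.                             */
proc sql;
  select fmtname, fmttype, source
    from dictionary.formats
    where fmtname = 'WEEKV';
quit;

/* Define the labels under a name that does not collide      */
/* (weeklbl is a made-up name for this sketch).              */
proc format;
  value weeklbl
    1='week 1'
    2='week 2'
    3='week 3'
    4='week 4'
  ;
run;

data test;
  do weeknr=1 to 4;
    output;
  end;
  format weeknr weeklbl.;
run;

proc print data=test;
run;
```

If the query returns a row for the name you had in mind, choosing another name before running proc format avoids the silent substitution described above.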
Truncated strings due to wrong format
Obviously you can get wrong results if you assign the wrong format to a variable. But here's a nice example where a numeric format is assigned to a character variable. Since a character version of the same format name exists, SAS uses that, and this does not generate a warning or a note in the log! The "formatted" character variable is unaltered but it will be printed with the default length of the format, which in this case is 3.
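proc format;
  /* numeric version of the format */
  value yesnov
    0='No'
    1='Yes'
  ;
  /* character version with the same name */
  value $yesnov
    '0'='No'
    '1'='Yes'
  ;
run;
data text;
  length longline $32;
  longline='This is a very long line';
  format longline yesnov.;   /* numeric format name on a character variable */
run;
proc print data=text;
run;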
And the result is:
Obs longline
1 Thi
Okay, if I assign the wrong format to a variable I deserve what's coming. But I hadn't expected a character format to be substituted for a numeric one without any kind of a note or warning in the log. And a truncated variable can happen for several other reasons, so the cause is easy to miss.
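To see which format SAS actually attached, you can inspect the data set's metadata. The following is a minimal sketch that rebuilds the example and then runs proc contents on it. What the Format column shows is left for you to check in your own session rather than asserted here.

```sas
proc format;
  value yesnov
    0='No'
    1='Yes'
  ;
  value $yesnov
    '0'='No'
    '1'='Yes'
  ;
run;

data text;
  length longline $32;
  longline='This is a very long line';
  format longline yesnov.;   /* the questionable assignment from above */
run;

/* The variable listing includes the type, length and format of longline. */
proc contents data=text;
run;
```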
Of of (in functions)
"x1-x3" can mean "x1 x2 x3". Or it can mean "x1 minus x3". SAS usually knows when to use which. But in functions, it's (kinda) understandable if you might think "x1-x3" meant "x1 x2 x3". Think again.
data numb;
  x1=3; x2=10; x3=4;
  wrong=sum(x1-x3);      /* a single argument: x1 minus x3 */
  right=sum(of x1-x3);   /* OF makes x1-x3 a variable list: x1, x2, x3 */
run;
proc print data=numb;
run;
Obs x1 x2 x3 wrong right
1 3 10 4 -1 17
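The same sum can be written in several ways that leave no doubt about the intent. This is a minimal sketch; the data set name of_demo, the array name vals and the result variable names are made up for illustration.

```sas
data of_demo;
  x1=3; x2=10; x3=4;
  array vals[3] x1-x3;              /* in an ARRAY statement, x1-x3 is a variable list */
  total_args  = sum(x1, x2, x3);    /* explicit arguments */
  total_range = sum(of x1-x3);      /* numbered range list */
  total_pref  = sum(of x:);         /* every variable whose name starts with x */
  total_array = sum(of vals[*]);    /* all elements of the array */
  minus       = sum(x1-x3);         /* no OF: one argument, x1 minus x3 */
run;

proc print data=of_demo;
run;
```

The four total_ variables all come out as 17, while minus is -1.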
Equals minus
On a standard (U.S.) keyboard layout, the "minus" sign and the "equals" sign are on adjacent keys. If you hit the "equals" instead of the "minus" sign, you've got a syntactically correct typo!
data numb;
  x1=99; x2=10; x3=4; x4=5; x5=9; x6=11;
  wrong=x1-x2-x3=x4-x5-x5;
  right=x1-x2-x3-x4-x5-x5;
run;
proc print data=numb;
run;
Obs x1 x2 x3 x4 x5 x6 wrong right
1 99 10 4 5 9 11 0 62
The computer has a point. The statement "99-10-4 equals 5-9-9" is false, so wrong is 0. But I meant ...
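The mechanism is easier to see in a smaller case. In an assignment statement only the first equals sign assigns; any later one is a comparison that yields 1 (true) or 0 (false). A minimal sketch, with made-up data set and variable names:

```sas
data eq_demo;
  a=5; b=5;
  typo     = a=b;      /* parsed as typo = (a = b): the comparison is true, so typo is 1 */
  intended = a-b;      /* the subtraction that was meant: 0 */
  explicit = (a = b);  /* parentheses make a deliberate comparison obvious: 1 */
run;

proc print data=eq_demo;
run;
```

Writing deliberate comparisons in parentheses, as in the last assignment, makes an accidental one stand out when you read the code back.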
Mis-specified IF statement does not always result in errors
This construction is syntactically tempting, but erroneous:
if (x = 1 or 2 or 3) then do ;
SAS treats each operand of OR as a logical expression in its own right, and a numeric value counts as true as long as it is not 0 or missing. So the above amounts to:
if (x = 1 or TRUE or TRUE) then do ;
So the DO block would run for all rows.
Note also that SAS will coerce character values to numbers if it can; this produces NOTEs in the log, but not errors. So this line is completely equivalent:
if (x = '1' or '2' or '3')
The correct construction for this is of course:
if x in (1, 2, 3)
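The difference shows up clearly when both conditions are run side by side. This is a minimal sketch; the data set name if_demo and the flag variables bad and good are made up for illustration.

```sas
data if_demo;
  do x=1 to 5;
    bad  = 0;
    good = 0;
    if (x = 1 or 2 or 3) then bad = 1;   /* 2 and 3 are nonzero, so this is always true */
    if x in (1, 2, 3) then good = 1;     /* true only when x is 1, 2 or 3 */
    output;
  end;
run;

proc print data=if_demo;
run;
```

Here bad is 1 on all five rows, while good is 1 only for x = 1, 2 and 3. Spelling the condition out as x = 1 or x = 2 or x = 3 gives the same result as the IN operator.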
