Optimizing PROC TABULATE

From sasCommunity

Jump to: navigation, search

This started on SAS-L.

The problem: Generating a lot of tables (all based on the same SAS data set) takes a long time because SAS rereads the data set for each table to be produced.

Suggestions

  • Store the input data set in a SASFILE.
  • Identify multiple tables which are partly or completely based on the same aggregation. Perform that aggregation once and store it in a data set which can then be used for each of the derivative tables.
  • Reduce the number of PROC TABULATE steps by coding multiple TABLE statements in the same step. For example, replace
proc tabulate data=sashelp.class;
class age;
table age;
run;
proc tabulate data=sashelp.class;
class sex;
table sex;
run;

with

proc tabulate data=sashelp.class;
class age sex;
table age;
table sex;
run;
  • To generate similar tables for different subsets, use CLASS variables in the page dimension rather than WHERE statements. For example, replace
proc tabulate data=sashelp.class;
where sex='F';
class age;
table age , n='F';
run;
proc tabulate data=sashelp.class;
where sex='M';
class age;
table age , n='M';
run;

with

proc tabulate data=sashelp.class;
class sex age;
table sex= , age , n;
run;
Personal tools