Determining the number of observations in a SAS data set efficiently
SAS data sets consist of a descriptor portion and a data portion that contains the data values. The descriptor portion of a SAS data set contains detailed information about the data set. This information includes:
- The name of the data set and its member type
- The date and time the data set was created
- The number of observations
- The number of variables
- The engine type
Using the descriptor portion is one of the most efficient and quickest ways of determining the number of observations in a SAS data set.
Here there are some examples:
/* Before SAS V9*/ data _NULL_; if 0 then set sashelp.class nobs=n; call symput('nrows',trim(left(put(n,8.)))); stop; run; %put nobs=&nrows; /* SAS V9.x and higher */ data _NULL_; if 0 then set sashelp.class nobs=n; call symputx('nrows',n); stop; run; %put nobs=&nrows;
- The "if 0" is a a conditional statement designed to never be executed at run-time and works since the "set sashelp.class" gives the header information of the data set class to the compiler which sets up the pdv accordingly, but is skipped in the execution time because the if condition is false.
- Nobs is a SAS automatic variable which contains the number of records in the dataset named in the set statement. The code, nobs = n, places the value of nobs (the number of records in the sashlep.class dataset) in the variable n.
- The STOP statement is used to prevent an endless loop because EOF is never reached.
Many times users don't need to know how many obs there are, just whether there are any. In such cases, a data step like the following might be just what is needed. This technique also has the advantage that it works for views and other data sets where the number of observations is not available in the data set header (Don Henderson's tip).
data _null_; call symput('AnyObs','0'); /* set to 0 for No */ set &data(obs=1); call symput('AnyObs','1'); /* if got here there are obs so set to 1 for Yes */ run;
See Also NOBS option limitations
The metadata can also be accessed by the macro language using DATA step functions. The ATTRN function is used with the NLOBS argument in the %OBSCNT macro shown below to retrieve the number of non-deleted observations from the meta data.
%macro obscnt(dsn); %local nobs dsnid; %let nobs=.; %* Open the data set of interest; %let dsnid = %sysfunc(open(&dsn)); %* If the open was successful get the; %* number of observations and CLOSE &dsn; %if &dsnid %then %do; %let nobs=%sysfunc(attrn(&dsnid,nlobs)); %let rc =%sysfunc(close(&dsnid)); %end; %else %do; %put Unable to open &dsn - %sysfunc(sysmsg()); %end; %* Return the number of observations; &nobs %mend obscnt;
Submitted by Alberto Negron. Contact me at my Discussion Page.
SAS dictionary tables can also be used to load the number of observations into a macro variable:
proc sql noprint; select nobs into :nobs separated by ' ' from dictionary.tables where libname='SASHELP' and memname='CLASS'; quit; %put TNote: nobs=&nobs;
Submitted by Tom Robinson. Contact me at my Discussion Page.