Subsetting Data into Groups for Complete Processing Within Each Group

From sasCommunity

This was the first paper that I presented at any user group conference. Although my coding technique has improved since 1988 (thank goodness), the processes described are still current.
Art Carpenter, California Occidental Consultants

Abstract

When a series of SAS® procedures with a common set of BY variables is processed, it is often difficult to follow the results for a particular BY group. The output from a series of SAS procedures and DATA steps is often used to "tell a story", and while the results of any given procedure may be easy to read, it becomes more difficult to follow a particular BY group from procedure to procedure when its results are separated by the results of the other BY groups. In other words, multiple analyses on each BY group result in output ordered by PROC rather than by BY group. Processing can be further complicated when large data bases need to be split for analysis into smaller, more manageable or more appropriate BY groups, or when the required procedure does not recognize the BY statement. There may be core or disk size considerations for large data bases, or the BY groups may not be distinct, i.e., they may share observations, as in the analysis of moving averages. One solution to these problems is to process each BY group separately. Two macros have been developed that allow the user to split data bases into subgroups and completely process each one. The first macro, %BREAKUP, creates one data set for each already distinct BY group; a macro %DO loop then processes each data set. The second macro, %SPLITUP, reads a user-created control file that assigns each observation to one or more data sets, and similarly processes each data set inside a macro %DO loop.
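
The general idea behind the first macro can be illustrated with a minimal sketch. This is not the code from the paper: the macro name %BYGROUPS, the data set WORK.ALL, the character BY variable GROUP, and the choice of PROC MEANS and PROC PRINT as the analysis steps are all assumptions made for illustration. It also assumes the BY values contain no double quotes or macro triggers.

```sas
/* Hypothetical sketch; NOT the paper's %BREAKUP macro.              */
/* Assumes WORK.ALL exists and GROUP is a character BY variable.     */
%macro bygroups(data=work.all, byvar=group);
  %local i ngroups;

  /* Store each distinct BY value in a macro variable: VAL1, VAL2, ... */
  proc sql noprint;
    select distinct &byvar
      into :val1 - :val9999
      from &data;
    %let ngroups = &sqlobs;   /* number of distinct BY values found */
  quit;

  %do i = 1 %to &ngroups;
    /* Subset one BY group into its own data set */
    data work.grp&i;
      set &data;
      where &byvar = "&&val&i";
    run;

    /* Run the complete series of steps on this group before moving on */
    title "Results for &byvar = &&val&i";
    proc means data=work.grp&i;
    run;

    proc print data=work.grp&i;
    run;
  %end;

  title;
%mend bygroups;

%bygroups(data=work.all, byvar=group)
```

Because every step for one group runs before the loop advances, the output is ordered by BY group rather than by procedure. Handling overlapping groups, as the second macro does with a control file, would need a different subsetting step than the simple WHERE clause used here.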


Online Materials

Read SUGI 13 Paper


Contact Info

Feel free to contact Art on his discussion page.