Using Arrays to Quickly Perform Fuzzy Merge Look-ups: Case Studies in Efficiency
Merging two data sets when a primary key is not available can be difficult. The MERGE statement cannot be used when BY values do not align, and expanding a data set to force BY value alignment can be resource-intensive. DATA step arrays, as well as other techniques such as hash tables, can greatly simplify the code, reduce or eliminate the need to sort the data, and significantly improve performance.
This paper walks through two types of examples in which these techniques were successfully employed. It discusses their advantages, along with the syntax and techniques that were applied, so that the reader can extrapolate to other applications.
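As a rough illustration of the array approach described above, the following minimal sketch looks up a rate for each visit by finding the date range that contains the visit date. It is not taken from the paper: the data sets (periods, visits), all variable names, and the fixed array size of 100 are hypothetical and chosen only for this example.

```sas
/* Hypothetical lookup table: one row per rate period */
data periods;
   input period_start :date9. period_end :date9. period_rate;
   format period_start period_end date9.;
   datalines;
01JAN2020 30JUN2020 1.5
01JUL2020 31DEC2020 2.0
;
run;

/* Hypothetical detail table: no key in common with PERIODS */
data visits;
   input id visit_dt :date9.;
   format visit_dt date9.;
   datalines;
1 15MAR2020
2 04JUL2020
3 10JAN2021
;
run;

data matched;
   /* Temporary arrays are retained across iterations and not written out. */
   /* The size of 100 is an assumption; it must be >= the rows in PERIODS. */
   array p_start[100] _temporary_;
   array p_end[100]   _temporary_;
   array p_rate[100]  _temporary_;

   /* Load the whole lookup table into the arrays once */
   if _n_ = 1 then do i = 1 to n_periods;
      set periods nobs=n_periods;
      p_start[i] = period_start;
      p_end[i]   = period_end;
      p_rate[i]  = period_rate;
   end;

   set visits;

   /* Fuzzy match: find the first period whose range contains the date */
   visit_rate = .;
   do i = 1 to n_periods;
      if p_start[i] <= visit_dt <= p_end[i] then do;
         visit_rate = p_rate[i];
         leave;
      end;
   end;

   drop i period_start period_end period_rate;
run;
```

Neither input needs to be sorted, and the lookup table is read only once. A visit that falls outside every period keeps a missing visit_rate, which is a design choice of this sketch rather than something the paper specifies.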
View the PDF for this paper.
The programs associated with this paper can be downloaded from File:104 FuzzyMerge SAS Code.zip.