Tips:Tweak PROC FASTCLUS for closest match lookup

From sasCommunity

In most table lookup tasks, we do EXACT matching. Sometimes, however, we want the closest match in the lookup table, where 'closest' means the smallest Euclidean distance, ||X-Y||_2. Typically this search has to be coded by hand in a DATA step, using either an ARRAY or a hash object.
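To make the criterion concrete, it can be written out for a query point x and lookup rows y_1, ..., y_n in d dimensions. These symbols are introduced here for illustration only and are not part of the original tip:

```latex
j^{*} = \arg\min_{1 \le j \le n} \lVert x - y_j \rVert_2 ,
\qquad
\lVert x - y_j \rVert_2 = \sqrt{\sum_{k=1}^{d} \left( x_k - y_{jk} \right)^2}
```

For instance, with x = (1.2, 6) and y = (3, 5), the distance is sqrt(1.8^2 + 1^2) = sqrt(4.24), roughly 2.06.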

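But if we only need the single closest point in the lookup table, we can also tweak PROC FASTCLUS for a simple yet fast implementation: pass the lookup table as the initial seeds (SEED=), allow one cluster per lookup row, and set MAXITER=0 so that the seeds are not recomputed and each observation is simply assigned to its nearest seed. Here is an example with two-dimensional data and Euclidean distance: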

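/* Lookup table: each row serves as one cluster seed */
data fix;
    input x y;
    CLUSTER = _n_;   /* row number, identifies the lookup row */
    datalines;
1 3
2 4
3 5
8 0.2
15 1
;
run;

/* Points to be matched against the lookup table */
data have;
    input x y;
    datalines;
1.2 6
0.3 4
10 1.2
7 1
2.9 4
;
run;

/* Count the rows in FIX so that every row can be used as a seed */
%let dsid = %sysfunc(open(fix));
%let ntotal = %sysfunc(attrn(&dsid,NOBS));
%let rc = %sysfunc(close(&dsid));

/* MAXITER=0 keeps the seeds fixed; HAVE2 is HAVE plus CLUSTER and DISTANCE */
proc fastclus data=have out=have2
     seed=fix maxclusters=&ntotal
     noprint maxiter=0;
     var x y;
run;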
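
For comparison, the manual DATA step search mentioned at the top could be sketched as below. This is only a sketch: it assumes the fix and have data sets from the example, and it uses direct access with POINT= rather than an array or hash object. The data set name check and the variables fx, fy, d, nearest, and dist are made up for illustration.

```sas
/* Sketch only: brute-force nearest-row search to cross-check HAVE2.     */
/* Assumes the FIX and HAVE data sets defined above. The names CHECK,    */
/* fx, fy, d, nearest, and dist are illustrative, not from the tip.      */
data check;
    set have;                       /* one query point per iteration */
    nearest = .;
    dist    = .;
    do i = 1 to nfix;
        /* direct access to row i of the lookup table */
        set fix(keep=x y rename=(x=fx y=fy)) point=i nobs=nfix;
        d = sqrt((x - fx)**2 + (y - fy)**2);
        /* keep the smallest distance seen so far; ties keep the earlier row */
        if missing(dist) or d < dist then do;
            dist    = d;
            nearest = i;
        end;
    end;
    drop d fx fy;
run;

proc print data=check;
run;
```

Working the distances out by hand for the sample data, the nearest lookup rows are 3, 1, 4, 4, and 2, so nearest in check should list those values in that order. The CLUSTER and DISTANCE columns in have2 can then be compared against nearest and dist.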

Submitted by Liang Xie. Contact me at my Discussion Page.