Tip of the Day: January 28

sasCommunity Tip of the Day

Most table lookup tasks require EXACT matching. Sometimes, however, we want the closest match in the lookup table instead. By 'closest', we mean the smallest Euclidean distance: ||X-Y||_2. Typically we have to code the search manually in a DATA step, using either an ARRAY or a hash object.
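For reference, the manual approach could look roughly like the sketch below. It is a brute-force search that loads the lookup points into temporary arrays and scans all of them for every query row. The data set names lookup_pts and query_pts, their contents, the array bound of 100, and the output variables match_id and match_dist are illustrative assumptions, not part of the original tip.

```sas
/* Hypothetical lookup table (illustrative data only) */
data lookup_pts;
   input x y;
   datalines;
0 0
5 5
10 0
;
run;

/* Hypothetical points to be matched (illustrative data only) */
data query_pts;
   input x y;
   datalines;
1 1
6 4
9 1
;
run;

data nearest (keep=x y match_id match_dist);
   /* Assumed upper bound on the number of lookup rows */
   array lx[100] _temporary_;
   array ly[100] _temporary_;

   /* On the first iteration, load every lookup point into memory */
   if _n_ = 1 then do i = 1 to nlookup;
      set lookup_pts (rename=(x=_lx y=_ly)) nobs=nlookup point=i;
      lx[i] = _lx;
      ly[i] = _ly;
   end;

   set query_pts;

   /* Scan all lookup points and keep the one with the smallest distance */
   match_dist = .;
   do i = 1 to nlookup;
      d = sqrt((x - lx[i])**2 + (y - ly[i])**2);
      if missing(match_dist) or d < match_dist then do;
         match_dist = d;
         match_id   = i;
      end;
   end;
run;
```

Here match_id is the row number of the closest point in lookup_pts. The cost grows with the product of the two row counts, which is part of why a shortcut is attractive for larger tables.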

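But if we only need the single closest point in the lookup table, we can also adapt PROC FASTCLUS for a simple yet fast implementation: the lookup points are supplied as cluster seeds, and each observation is assigned to its nearest seed. Here is an example with 2-dimensional data and Euclidean distance: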

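/* Lookup table: CLUSTER records each point's row number */
data fix;
   input x y;
   cluster = _n_;
   datalines;
1 3
2 4
3 5
8 0.2
15 1
;
run;

/* Points to be matched against the lookup table */
data have;
   input x y;
   datalines;
1.2 6
0.3 4
10 1.2
7 1
2.9 4
;
run;

/* Count the rows in FIX so that every lookup point can become a seed */
%let dsid=%sysfunc(open(fix));
%let ntotal=%sysfunc(attrn(&dsid,NOBS));
%let dsid=%sysfunc(close(&dsid));

/* SEED=FIX supplies the lookup points as initial seeds; MAXITER=0
   means the seeds are not recomputed, so each row of HAVE is
   assigned to its nearest seed */
proc fastclus data=have out=have2
     seed=fix maxclusters=&ntotal
     noprint maxiter=0;
   var x y;
run;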
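
To see which lookup point each row of have was matched to, the output can be joined back to the lookup table. The sketch below assumes that have2 carries the CLUSTER and DISTANCE variables that PROC FASTCLUS writes to its OUT= data set, and that the cluster numbers follow the row order of fix, which is what CLUSTER=_n_ in fix is meant to mirror. The names matched, fix_x and fix_y are illustrative.

```sas
/* Attach the coordinates of the matched lookup point to each row */
proc sql;
   create table matched as
   select h.x,
          h.y,
          h.cluster,
          h.distance,
          f.x as fix_x,
          f.y as fix_y
   from have2 as h
        left join fix as f
        on h.cluster = f.cluster
   ;
quit;

proc print data=matched noobs;
run;
```

It is worth checking the numbering assumption on your own data: duplicate points in fix, for example, may not each become a separate seed, which would shift the cluster numbers.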

Submitted by Liang Xie. Contact me at my Discussion Page.