Tips Talk:Tweak PROC FASTCLUS for closest match lookup

Needs category tags. I don't know enough about the subject matter to know which ones. --Don Henderson 21:18, 1 October 2009 (UTC)

1-Nearest Neighbor is the simplest form of the kNN algorithm in data mining.

split up into tip and example

This would be a better tip if the tip itself just indicated what can be done, with the code placed in a regular article. --Phil Miller (STATPROF) 22:23, 9 October 2009 (UTC)

An example can be found in the following SAS-L discussion: In most cases we match against a lookup table by exact match, but occasionally we need to match on range, say, to find the observation with the shortest distance to one observation in the lookup table (possibly nested within a cluster, say IDs). This approach is a fast and simple way to tackle such jobs. --Liang Xie (oloolo) 13:25, 23 October 2009 (UTC)
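To make the closest-match idea concrete, here is a minimal brute-force sketch in Python. It is not the SAS code from the tip and does not use PROC FASTCLUS; the function name `closest_match`, the variable names, and the sample data are all hypothetical. It only illustrates what "shortest distance to an observation in the lookup table, optionally nested within IDs" means.

```python
import math

def closest_match(targets, lookup, keys=("x", "y"), by=None):
    """For each target row, return (index of nearest lookup row, distance).

    Hypothetical helper for illustration only. If `by` is given, only
    lookup rows with the same value of that field are considered.
    """
    results = []
    for t in targets:
        best_idx, best_dist = None, math.inf
        for i, row in enumerate(lookup):
            # Restrict the search to the same group when `by` is set.
            if by is not None and row[by] != t[by]:
                continue
            # Euclidean distance over the matching variables.
            d = math.dist([t[k] for k in keys], [row[k] for k in keys])
            if d < best_dist:
                best_idx, best_dist = i, d
        results.append((best_idx, best_dist))
    return results

# Made-up sample data, not taken from the tip.
lookup = [
    {"id": "A", "x": 0.0, "y": 0.0},
    {"id": "A", "x": 10.0, "y": 0.0},
    {"id": "B", "x": 1.0, "y": 1.0},
]
targets = [
    {"id": "A", "x": 3.0, "y": 4.0},
    {"id": "B", "x": 9.0, "y": 1.0},
]

matches = closest_match(targets, lookup)                # ignore IDs
matches_by_id = closest_match(targets, lookup, by="id")  # nested within IDs
```

The loop compares every target with every lookup row, so it is only meant to show the logic, not to be fast on large tables. Note how restricting the search to the same ID can change which lookup row counts as the closest match.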

I edited it to use HTML tables to reduce the vertical space so it fits as a tip. I also moved the assignment of CLUSTER to eliminate an extra DATA step. Someone with some stats background will need to review it, however. --Don Henderson 20:52, 16 December 2009 (UTC)

The cluster assignment DATA step is necessary and can't be deleted. The cluster flag tells the user which observation this process found to be the closest match. --Liang Xie (oloolo) 18:55, 16 December 2009 (UTC)

The assignment of CLUSTER was not removed. The extra, unneeded DATA step was removed and the assignment of CLUSTER was moved to the first DATA step. If you run this code you will see that the data set Fix has the variable, just as in the original code. I also moved the Tips page, as the Talk page had been moved without its associated main page. --Don Henderson 02:37, 17 December 2009 (UTC)

I see. Thanks. --Liang Xie (oloolo) 23:55, 16 December 2009 (UTC)

Clear and precise. Good TIP. Promoting to READY. Keep up the good work. --Charlie Shipp 03:00, 21 January 2010 (UTC)