Wednesday, November 9, 2011

Clustering with PROC CLUSTER and PROC FASTCLUS


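The following SAS code uses PROC CLUSTER (Ward's method) to build a hierarchical solution, takes the centroids of those hierarchical clusters as seeds (initial points), and then uses PROC FASTCLUS to create the final clusters from those seeds. The seeds are therefore not random; they come from the hierarchical step.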

%let demo=var1 var2 ;

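/* Ward's method; K=10 sets the nearest-neighbor count for the density estimate
   that TRIM= uses to drop low-density observations before clustering */
title2 'Hierarchical Solution (WARD''S)';
proc cluster data=out1.training method=ward k=10 trim=0.1 outtree=tree noprint;
var &demo;
copy keys;
run;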

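/* cut the tree at 5 clusters; DOCK=5 sets CLUSTER to missing for
   observations in clusters with 5 or fewer members */
proc tree data=tree nclusters=5 dock=5 out=out1.results noprint;
copy &demo;
run;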

proc freq data=out1.results ;
table cluster;
run;

/* generate the centroids of the hierarchical clusters;
   _type_ = 1 keeps the per-cluster means and drops the overall mean */
title1 'Cluster Centroids';
proc means data=out1.results;
class cluster;
var &demo;
output mean= out=centroids(where=(_type_ = 1));
run;

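/* use the hierarchical centroids as initial seeds for PROC FASTCLUS;
   LEAST=2 requests the least-squares criterion */
title1 'Score Development Data against the Centroids';
proc fastclus data=out1.training seed=centroids maxclusters=5 least=2 out=results noprint;
var &demo;
run;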

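/* uncorrected sum of squares of DISTANCE, i.e. the sum of squared
   distances from each observation to its cluster seed */
title1 'USS (5 clusters)';
proc means data=results uss;
var distance;
run;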
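
/* Optional sketch, assuming the RESULTS data set written by the PROC FASTCLUS
   step above (it holds the CLUSTER and DISTANCE variables): break the USS down
   by cluster to see which clusters contribute most to the total. The title
   text is only an illustrative choice. */
title1 'USS by Cluster (5 clusters)';
proc means data=results n uss;
class cluster;
var distance;
run;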

proc freq data=results;
table cluster;
run; 
