Make a frequency function in SAS/IML

Aggregation is probably the most common operation in the data world. R comes with a handy table() function, and in SAS the FREQ procedure usually handles this job. It would be great if SAS/IML had an equivalent function, so I wrote a user-defined module for that purpose. Because it contains a DO loop, it is not very efficient: it was consistently about 10 times slower than PROC FREQ on a simulated data set of one million records.

/* 1 - Use IML for simulation and aggregation */
proc iml;
   start freq(invec);
      x = t(unique(invec));
      y = repeat(x, 1, ...
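
The idea behind such a frequency module can be sketched outside SAS. The snippet below is a minimal illustration in Python rather than SAS/IML, and it is not a transcription of the truncated module above: the names `freq` and `freq_counter` are hypothetical. The first function takes the distinct levels and counts matches for each one, which costs one pass over the data per level; the second does the same job in a single pass, which hints at why a per-level loop can lag behind a dedicated procedure.

```python
from collections import Counter


def freq(values):
    """Return (level, count) pairs sorted by level, like a one-way frequency table."""
    levels = sorted(set(values))  # sorted distinct values, similar in spirit to unique()
    # One scan of the data per level: simple, but O(levels * n) overall.
    return [(lev, sum(1 for v in values if v == lev)) for lev in levels]


def freq_counter(values):
    """Same table built in a single pass with a hash-based counter."""
    return sorted(Counter(values).items())


sample = [3, 1, 2, 3, 3, 1]
print(freq(sample))          # [(1, 2), (2, 1), (3, 3)]
print(freq_counter(sample))  # [(1, 2), (2, 1), (3, 3)]
```

Both functions return the same table; only the number of passes over the data differs.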
Mahalanobis distances on a heat map

I just learned about Mahalanobis distance from Rick’s blog post yesterday and realized how useful it is for detecting outliers. One of SAS’s online documents shows how to use principal component analysis (PCA) to find Mahalanobis distances, and in SAS 9.3 the popular heat map becomes available.

SAS’s classic help data set SASHELP.CLASS has weight, height, age and some other information for 19 teenagers. I calculated the pairwise Mahalanobis distances from their age, weight and height, and showed those distances on a heat map. The map seems helpful for telling how similar two teenagers are to each other.

/* 1 -- Find pairwise Mahalanobis distances */
proc princomp ...
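
For readers who want to see what a pairwise Mahalanobis distance actually computes, here is a minimal sketch in Python rather than SAS. It does not follow the PROC PRINCOMP route used in the post. Instead it applies the definition directly, d(u, v) = sqrt((u - v)' S^-1 (u - v)), where S is the sample covariance matrix. The function names and the six (age, height, weight) rows are made up for illustration; they are not the SASHELP.CLASS values.

```python
import math


def covariance(rows):
    """Sample covariance matrix (divisor n - 1) of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular (true for a full-rank covariance matrix).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis(u, v, cov):
    """Mahalanobis distance between points u and v under covariance cov."""
    diff = [a - b for a, b in zip(u, v)]
    w = solve(cov, diff)  # w = cov^-1 @ diff, without forming the inverse
    return math.sqrt(sum(d * wi for d, wi in zip(diff, w)))


def pairwise(rows):
    """Square matrix of Mahalanobis distances between every pair of rows."""
    cov = covariance(rows)
    return [[mahalanobis(u, v, cov) for v in rows] for u in rows]


# Made-up (age, height, weight) records, for illustration only.
people = [
    [14, 62.5, 102.5], [13, 56.5, 84.0], [12, 59.0, 99.5],
    [15, 66.5, 112.0], [11, 51.3, 50.5], [16, 72.0, 150.0],
]
distances = pairwise(people)
```

The resulting square matrix is symmetric with zeros on the diagonal, which is the kind of matrix a heat map displays: small cells mark similar individuals.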
Valentine's Day

Happy Valentine's Day!

data one;
   do t = 1 to 3*constant("pi") by 0.05;
      x = 16*sin(t)**3;
      y = 13*cos(t) - 5*cos(2*t) - 2*cos(3*t) - cos(4*t);
      output;
   end;
run;

data two;
   set one;
   if _n_ = 70 then label = "Valentine's Day";
run;

ods graphics on / width=6in height=5in;
proc sgplot data = two;
   series x = x y = y / lineattrs=(color=red thickness=5) datalabel = label
      datalabelattrs=(color=red family="garamond" style=italic size=45 weight=bold);
run;
Cholesky decomposition to "expand" data

Yesterday Rick showed how to use Cholesky decomposition to transform data with the ROOT function of SAS/IML. Cholesky decomposition is very important in simulation. For DATA step programmers who are not very familiar with SAS/IML, PROC FCMP may be another option, since it has an equivalent routine, CALL CHOL.

To replicate Rick’s example of the general Cholesky transformation for correlated variables, I randomly chose three variables from the SASHELP.CARS data set and created a simulated data set that shares the same variance-covariance structure. The simulated data set can be viewed as an “expanded” version of the original data set.

Conclusion:
In PROC FCMP, ...
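
The Cholesky transformation itself is short enough to sketch. The snippet below is a minimal Python illustration, not the PROC FCMP or SAS/IML code from the post. It factors a covariance matrix S as L L' with L lower triangular, then maps independent standard normal draws z to mean + L z, which gives draws whose covariance approaches S. The function names, the mean vector, and the 2 x 2 covariance matrix are hypothetical and are not taken from SASHELP.CARS. Note that the SAS/IML ROOT function returns the upper-triangular factor, so the transposes are arranged differently there.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L @ L^T == a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate(mean, cov, n, seed=0):
    """Draw n rows with the given mean and (approximately) the given covariance."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    low = cholesky(cov)
    p = len(mean)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # independent N(0, 1) draws
        rows.append([mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                     for i in range(p)])
    return rows


# Made-up target moments, for illustration only.
target_mean = [10.0, 20.0]
target_cov = [[4.0, 2.0], [2.0, 3.0]]
expanded = simulate(target_mean, target_cov, n=1000)
```

In the setting of the post, the target mean and covariance would be estimated from the original variables, and `n` can be larger than the original row count, which is what makes the result an "expanded" data set.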
A test for memory management of SAS/IML

Programming always involves considerations of efficiency and memory usage. For efficient programming in SAS/IML, my shortcut is to consult Rick Wicklin's tip sheet and look for ways to simplify the code. As for the memory management mechanism of SAS/IML, I found only one page of the SAS/IML 9.2 User’s Guide on the Internet.

To see how SAS/IML's memory management performs, I designed a simple test that relies on the SHOW SPACE statement, which reports the memory details of SAS/IML. A simulated matrix of 200 rows by 300 columns occupies about 400k of memory. I requested just 1MB of memory by specifying ...
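
A quick back-of-the-envelope check of those sizes, written in Python rather than SAS/IML. It assumes each numeric cell is stored as an 8-byte double and ignores any bookkeeping overhead, so it is only a rough estimate; the function names are hypothetical. Under that assumption the raw data of a 200 by 300 matrix comes to 480,000 bytes, the same order of magnitude as the figure quoted above, and a 1MB workspace has room for two such matrices but not three.

```python
BYTES_PER_CELL = 8  # assumption: one double-precision value per numeric cell


def matrix_bytes(rows, cols, bytes_per_cell=BYTES_PER_CELL):
    """Raw storage needed for a dense numeric matrix, ignoring overhead."""
    return rows * cols * bytes_per_cell


def matrices_that_fit(workspace_bytes, rows, cols):
    """How many whole matrices of this shape fit in the workspace."""
    return workspace_bytes // matrix_bytes(rows, cols)


print(matrix_bytes(200, 300))                    # 480000
print(matrices_that_fit(1024 * 1024, 200, 300))  # 2
```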