Music social network on DNA microarray
The upcoming 2011 KDD Cup data mining competition [1], hosted by Yahoo! Labs, poses an interesting challenge: predict users' ratings for individual songs in the company’s huge music database. Unlike previous KDD Cup projects, which were packed with so many variables that dimension reduction became a serious concern, Yahoo! Labs provides only a few: artist, genre, and album. No demographic or geographic information is disclosed. Forecasting the behavior of web users from such limited web records is an interesting problem, and digging out valuable clues for subsequent direct marketing is also rewarding. Especially while the competition datasets contain up to 1 million users, 600 ...
Proc Fcmp(1): from VBA to SAS
Why use SAS in finance: SAS is a distinguished statistical software package with more than 40 years of development history. Starting as a scripting tool for running ANOVA on agricultural experimental designs in North Carolina, SAS has been built heavily around the generalized linear model. For example, SAS Institute has consistently improved its linear model procedures, from Proc Anova, Proc Glm, and Proc Mixed to the latest Proc Glimmix. In summary, SAS is pretty good at processing and analyzing linear and non-linear models. However, the foundation of financial models, such as those for fixed income products and derivatives, is continuous-time equations, such as Black-Scholes ...
Vertical collapse by five methods
******************(1) INPUT STEP***********;
data have;
  input id: $ string: $;
  cards;
001 aaa
001 bbb
002 ccccc
002 dddd
002 eee
003 ffff
004 gggggg
;
run;

*******************(2) CONCATENATION STEP ***********;
***********(2.1) METHOD I: do-loop and substr()***********;
data want1(drop = string);
  length newstring $50;
  do _n_ = 1 by 1 until(last.id);
    set have;
    by id notsorted;
    /* lengthn() returns 0 for a blank string, so the first value starts at position 1 */
    substr(newstring, lengthn(newstring) + 1) = string;
  end;
run;

***********(2.2) METHOD II: Proc Transpose***********;
proc transpose data = have out = _tmp;
  by id;
  var string;
run;

data want2;
  set _tmp;
  /* cats() strips blanks and concatenates all transposed columns */
  newstring = cats(of col:);
  drop _: col:;
run;

***********(2.3) METHOD III: retain statement***********;
data want3(drop = string);
  set have;
  by id notsorted;
  length newstring $50;
  retain newstring;
  ...
Some tips about Proc Printto and SAS memory
Running Proc Options gives the answers.

/* Specifies the limit on the total amount of memory to be used by the SAS System */
proc options option=MEMSIZE;
run;

/* Upper limit for data-dependent memory usage during summarization */
proc options option=SUMSIZE;
run;

/* Upper limit for memory during sorting */
proc options option=SORTSIZE;
run;

/* Output the SAS log and listing to specified locations */
proc printto log="C:\user\myname\mylog.log" print="C:\user\myname\mylist.lst" new;
run;

/* Use .txt extensions so the files open directly in Notepad */
proc printto log="C:\user\myname\mylog.txt" print="C:\user\myname\mylist.txt" new;
run;

/* A method to suppress log generation during execution, especially for simulation */
filename supress dummy;
proc printto log=supress;
run;

/* Restore the default settings */
proc printto;
run;
5D visualization: from SAS to Google Motion Chart
Three dimensions are usually regarded as the maximum for data presentation. Since SAS 9.2 opened up ODS and its Graph Template Language, 3D graphing is no longer a perplexing problem for SAS programmers. However, today's vast amounts of multidimensional data need a more vivid and simpler way to be displayed. The emergence of Google Motion Chart now provides a sound solution for visualizing data in more than three dimensions. This web-based analytical technology originated from Dr. Hans Rosling’s innovation. Dr. Rosling and his Gapminder foundation invented a technology to demonstrate the relationship among multiple ...