Four ways to add row number in SQL Server It is easy to sort any data in SQL Server, while it is not a trivial job to add a new variable of row number to an existent table. Here I come with four ways to realize the purpose./********** 0. Input data source *********************/ use tempdbif object_id('class') is not null drop table class;go-- Use a table with 5 variables from 18 teenagerscreate table class (name char(8), sex char(1), age numeric, height numeric, weight numeric );insert into class values('ALFRED','M',14,69,112.5);insert into class values('ALICE','F',13,56.5,84);insert into class values('BARBARA','F',13,65.3,98);insert into class values('CAROL','F',14,62.8,102.5);insert into class values('HENRY','M',14,63.5,102.5);insert into class values('JAMES','M',12,57.3,83);insert into class values('JANE','F',12,59.8,84.5);insert into class values('JEFFREY','M',13,62.5,84);insert into class ...
Correlations of three variables QuestionThere is an interesting question in statistics -- “There are 3 random variables X, Y and Z. The correlation between X and Y is 0.8 and thecorrelation between X and Z is 0.8. What is the maximum and minimum correlation between Y and Z?”Solutions1. Geometric illustrationThe value of corr(Y, Z) is the COS function of the angle between Y and Z. We already know the corr(X, Y) and corr(X, Z). In this particular case, the angle can be zero, which suggests Y and Z are identical and the max value of corr(Y, Z) is 1. The min value of corr(Y, Z) ...
Multicollinearity and the solutions In his book, Rudolf Freund described a confounding phenomenon while fitting a linear regression. Given a small data set below, there are three variables - dependent variable(y) and independent variables(x1 and x2). Using x2 to fit y alone, the estimated parameter of x2 f is positive that is 0.78. Then using x1 and x2 together to fit y, the parameter of x2 becomes -1.29, which is hard to explain since clearly x2 and y has a positive correlation.data raw;input y x1 x2;cards;2 0 23 2 62 2 77 2 56 4 98 4 810 4 77 6 108 6 1112 ...
Use PROC SQL to reshape data SQL is widely used for data manipulation in relational database management systems. With the help of a macro loop, SAS's SQL procedure can perform a lot of duties, such as reshaping data. It can realize the same functionality as PROC TRANSPOSE and DATA step ARRAY. Hereby I use an example of SASHELP.CLASS to show how to transpose data either from wide to long or from long to wide. The two macros wrapping up PROC SQL are both based on the "split-then-merge" logic. data class; set sashelp.class;run;%let var_list = age weight height;proc transpose data = class out = long_proctranspose; by name ...
Make a frequency function in SAS/IML Aggregation is probably the most popular operation in the data world. R comes with a handy table() function. Usually in SAS, the FREQ procedure would deal with this job. It will be great if SAS/IML has an equivalent function. I just created a user-defined function or module for such a purpose. Since it contains a DO loop, the efficiency is not very ideal -- always 10 times slower than PROC FREQ for a simulated data set of one million records. /* 1 - Use IML for simulation and aggregation */proc iml; start freq(invec); x = t(unique(invec)); y = repeat(x, 1, ...