I recently discovered 10 questions most daunting in R development, and I am trying to find the answers below.

1. Data structure -- How many data structures R has? How do you build a binary search tree in R?

2. Sorting -- How many sorting algorithms are available? Show me an example in R.

3. Low level -- How do you build a R function powered by C?

4. String -- How do you implement string operation in R?

5. Vectorization -- If you want to do Monte Carlo simulation by R, how do you improve the efficiency?

6. Function -- How do you take function as argument of another function? What is the apply() function family?

7. Threading -- How do you do multi-threading in R?

8. Memory limit and database -- What is the memory limit of R? How do you avoid it? How do you use SQL in R?

9. Testing -- How do you do testing and debugging in R?

10. Software development -- How do you develop a package? How do you do version control?

** Q1. Data structure -- How many data structures R has? How do you build a binary search tree in R?**

My answer: R mainly has 5 data structures.

Homogeneous(contain the same type of objects): vector --> matrix --> array

Heterogeneous(allow different type of objects): list --> data frame

A binary search tree requires several actions: implement a tree, insert nodes and delete nodes. We should create individual routines in R.

In R, a matrix is the ideal data structure to contain the linked elements. Then a list should be used to wrap the matrix and pass the arguments.

For insert-node routine, there is the pseudocode for it. The key point here is to use recursion in R’s function. Norman Matloff gave a complete example at page 180 of his book.

** Q2. Sorting -- How many sorting algorithms are available? Show me an example in R. **

My answer: there are mainly 5 kinds of sorting algorithms(with their complexity):

Bubble Sort - O(n^2);

Selection Sort - O(n^2)

Merge Sort - O(n log n)

Quick Sort - from O(n log n) to O(n^2)

Bucket Sort - O(n + m)

R has a native sort function sort(), which is written by C and uses Quick Sort.

There is an example of Quick Sort in R by recursion

1. Data structure -- How many data structures R has? How do you build a binary search tree in R?

2. Sorting -- How many sorting algorithms are available? Show me an example in R.

3. Low level -- How do you build a R function powered by C?

4. String -- How do you implement string operation in R?

5. Vectorization -- If you want to do Monte Carlo simulation by R, how do you improve the efficiency?

6. Function -- How do you take function as argument of another function? What is the apply() function family?

7. Threading -- How do you do multi-threading in R?

8. Memory limit and database -- What is the memory limit of R? How do you avoid it? How do you use SQL in R?

9. Testing -- How do you do testing and debugging in R?

10. Software development -- How do you develop a package? How do you do version control?

My answer: R mainly has 5 data structures.

Homogeneous(contain the same type of objects): vector --> matrix --> array

Heterogeneous(allow different type of objects): list --> data frame

A binary search tree requires several actions: implement a tree, insert nodes and delete nodes. We should create individual routines in R.

In R, a matrix is the ideal data structure to contain the linked elements. Then a list should be used to wrap the matrix and pass the arguments.

For insert-node routine, there is the pseudocode for it. The key point here is to use recursion in R’s function. Norman Matloff gave a complete example at page 180 of his book.

insert(X, node){

if(node = NULL)

node = new binaryNode(X,NULL,NULL)

return

}

if(X = node:data)

return

else if(X < node:data)

insert(X, node:leftChild)

else // X > node:data

insert(X, node:rightChild)

}

My answer: there are mainly 5 kinds of sorting algorithms(with their complexity):

Bubble Sort - O(n^2);

Selection Sort - O(n^2)

Merge Sort - O(n log n)

Quick Sort - from O(n log n) to O(n^2)

Bucket Sort - O(n + m)

R has a native sort function sort(), which is written by C and uses Quick Sort.

There is an example of Quick Sort in R by recursion

qs <- function(x) {

if (length(x) <= 1) return(x)

seed <- x[1]

rest <- x[-1]

sv1 <- rest[rest < seed]

sv2 <- rest[rest >= seed]

sv1 <- qs(sv1)

sv2 <- qs(sv2)

return(c(sv1,seed,sv2))

}