Matrices and Arrays

Matrices are 2 dimensional structures used to store data, while arrays are high dimensional structures. In R matrices are just atomic vectors with an added dimension attribute which enables R to portray the vector in rows and columns. Note that the length of the vector must be equal to $n \times p$ where n is the number of rows and p is the number of columns.

vec10 <- 1:12
vec10

 [1]  1  2  3  4  5  6  7  8  9 10 11 12

n <- 4
p <- 3
attr(vec10, 'dim') <- c(n, p)
vec10

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

Since the dim attribute is common, there is a function for it. ie dim

a <- 1:4 # a is a vector with no dimensions
dim(a) # should return NULL

NULL

dim(a) <- c(2,2) # we set the dimension
a # a is no longer a vector but a matrix/array

     [,1] [,2]
[1,]    1    3
[2,]    2    4

dim(a)

[1] 2 2

Is this the convenient way of creating a matrix?

x <- matrix(1:4, 2)
x

     [,1] [,2]
[1,]    1    3
[2,]    2    4

y <- array(1:4, c(2,2))
y

     [,1] [,2]
[1,]    1    3
[2,]    2    4

From the above we can note that a matrix in R is column-major. ie the values are read in in a column wise manner. We can change the way the data is read using the byrow parameter in the matrix function. Look at the example below.

Matrix arithmetic

A <- matrix(c(1,0,2,3,4,5),nrow = 3, byrow = TRUE)
B <- matrix(1:6, ncol=3)

The arithmetic operators work on these matrices the same way they work on a vector. ie element-wise.

A + 1

     [,1] [,2]
[1,]    2    1
[2,]    3    4
[3,]    5    6

A + A

     [,1] [,2]
[1,]    2    0
[2,]    4    6
[3,]    8   10

A^-1

     [,1]      [,2]
[1,] 1.00       Inf
[2,] 0.50 0.3333333
[3,] 0.25 0.2000000

What if I tried adding A to B above?

B+A

Error in B + A: non-conformable arrays

What about matrix multiplication? Recall that the number of columns for the first matrix need to be equal to the number of Rows in the second matrix otherwise it wont work.

A %*% B

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    8   18   28
[3,]   14   32   50

B%*%A

     [,1] [,2]
[1,]   27   34
[2,]   34   42

A%*%A

Error in A %*% A: non-conformable arguments

A%*%t(A) # t is the transpose function

     [,1] [,2] [,3]
[1,]    1    2    4
[2,]    2   13   23
[3,]    4   23   41

crossprod(A) # t(A) %*% A

     [,1] [,2]
[1,]   21   26
[2,]   26   34

tcrossprod(A) # A %*% t(A)

     [,1] [,2] [,3]
[1,]    1    2    4
[2,]    2   13   23
[3,]    4   23   41

crossprod(A, B)

Error: non-conformable arguments

You can find inverse of a square matrix:

D <- B %*% A
solve(D)

          [,1]      [,2]
[1,] -1.909091  1.545455
[2,]  1.545455 -1.227273

E <- A %*% B
E

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    8   18   28
[3,]   14   32   50

G <- crossprod(A)
solve(G)

           [,1]       [,2]
[1,]  0.8947368 -0.6842105
[2,] -0.6842105  0.5526316

Other matrix functions

t(A)

     [,1] [,2] [,3]
[1,]    1    2    4
[2,]    0    3    5

diag(D)

[1] 27 42

diag(E)

[1]  1 18 50

upper.tri(E)

      [,1]  [,2]  [,3]
[1,] FALSE  TRUE  TRUE
[2,] FALSE FALSE  TRUE
[3,] FALSE FALSE FALSE

lower.tri(E)

      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,]  TRUE FALSE FALSE
[3,]  TRUE  TRUE FALSE

col(A)

     [,1] [,2]
[1,]    1    2
[2,]    1    2
[3,]    1    2

row(A)

     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3

ncol(A)

[1] 2

nrow(A)

[1] 3

diag(D) <- 1
diag(1:4)

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    2    0    0
[3,]    0    0    3    0
[4,]    0    0    0    4

colMeans(A)

[1] 2.333333 2.666667

rowMeans(A)

[1] 0.5 2.5 4.5

colSums(A)

[1] 7 8

rowSums(A)

[1] 1 5 9

rbind(A, A)

     [,1] [,2]
[1,]    1    0
[2,]    2    3
[3,]    4    5
[4,]    1    0
[5,]    2    3
[6,]    4    5

cbind(A,A,A)

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    0    1    0
[2,]    2    3    2    3    2    3
[3,]    4    5    4    5    4    5

max.col(A)

[1] 1 2 2

What if I want to subtract the column means from each column?

     [,1] [,2]
[1,]    1    0
[2,]    2    3
[3,]    4    5

colMeans(A)

[1] 2.333333 2.666667

A- colMeans(A) #But is this correct?

           [,1]       [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.6666667  0.6666667
[3,]  1.6666667  2.3333333

t(t(A) - colMeans(A))

           [,1]       [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333  0.3333333
[3,]  1.6666667  2.3333333

sweep(A, 2, colMeans(A))

           [,1]       [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333  0.3333333
[3,]  1.6666667  2.3333333

scale(A, center = TRUE, scale = FALSE)

           [,1]       [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333  0.3333333
[3,]  1.6666667  2.3333333
attr(,"scaled:center")
[1] 2.333333 2.666667

Matrix Attributes:

column names, row names

Mat1 <- matrix(1:4, 2)
Mat1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

colnames(Mat1) <- c("column1", "column2") # Set column names
Mat1

     column1 column2
[1,]       1       3
[2,]       2       4

rownames(Mat1) <- c("row1", "row2")
Mat1

     column1 column2
row1       1       3
row2       2       4

mat2 <- matrix(1:4, 2, dimnames = list(c("row1", "row2"), c("column1", "column2")))
mat2

     column1 column2
row1       1       3
row2       2       4

dimnames(mat2)

[[1]]
[1] "row1" "row2"

[[2]]
[1] "column1" "column2"

dimnames(mat2) <- NULL
dimnames(mat2) <-  list(c("row1", "row2"), c("column1", "column2"))
mat2

     column1 column2
row1       1       3
row2       2       4

dimnames(mat2) <-  list(rows = c("row1", "row2"),columns= c("column1", "column2"))
mat2

      columns
rows   column1 column2
  row1       1       3
  row2       2       4

Matrix Sub-setting

A[1,1] # row 1, column 1

[1] 1

A[2,] # row 1

[1] 2 3

A[,2] # column 1

[1] 0 3 5

A[,2,drop = FALSE] # maintain the structure of the original object ie matrix/array

     [,1]
[1,]    0
[2,]    3
[3,]    5

A[1,2] <- -3

Advanced Matrix functions:

We have seen that if we want to find the column sums we use the colSums function. What if we want the col sds?

use apply(your_matrix, dimension, your_function) where dimension is 1 for rows and 2 form columns

A <- array(1:12, c(2,6))
apply(A, 2, sd) # columnwise sd

[1] 0.7071068 0.7071068 0.7071068 0.7071068 0.7071068 0.7071068

apply(A, 2, mean) # similar to colMeans

[1]  1.5  3.5  5.5  7.5  9.5 11.5

apply(A, 1, max)

[1] 11 12

cbind(seq(nrow(A)), max.col(A)) # What does this do?

     [,1] [,2]
[1,]    1    6
[2,]    2    6

A[cbind(seq(nrow(A)), max.col(A))]

[1] 11 12

Arrays

A <- array(1:12, c(2,6))
A

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    3    5    7    9   11
[2,]    2    4    6    8   10   12

B <- array(1:12, c(1,6,2)) 
B

, , 1

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    5    6

, , 2

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    7    8    9   10   11   12

C <- array(1:24, c(3,4,2)) 
C

, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

Use apply

apply(C, 1, sum)

[1]  92 100 108

apply(C, 2, sum)

[1]  48  66  84 102

apply(C, 3, sum)

[1]  78 222

apply(C,c(1,2), sum)

     [,1] [,2] [,3] [,4]
[1,]   14   20   26   32
[2,]   16   22   28   34
[3,]   18   24   30   36

apply(C, c(1,3), sum)

     [,1] [,2]
[1,]   22   70
[2,]   26   74
[3,]   30   78

apply(C,c(2,3), sum)

     [,1] [,2]
[1,]    6   42
[2,]   15   51
[3,]   24   60
[4,]   33   69

To transpose a matrix, we use t function, what about transposing an array? we use aperm:

aperm(C, c(1,2,3))

, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

aperm(C,c(1,3,2))

, , 1

     [,1] [,2]
[1,]    1   13
[2,]    2   14
[3,]    3   15

, , 2

     [,1] [,2]
[1,]    4   16
[2,]    5   17
[3,]    6   18

, , 3

     [,1] [,2]
[1,]    7   19
[2,]    8   20
[3,]    9   21

, , 4

     [,1] [,2]
[1,]   10   22
[2,]   11   23
[3,]   12   24

aperm(C, c(2,1,3)) # etc

, , 1

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12

, , 2

     [,1] [,2] [,3]
[1,]   13   14   15
[2,]   16   17   18
[3,]   19   20   21
[4,]   22   23   24

Exercise

Power of a matrix: implement a function called mat_pow(A, n) that takes in two parameters, a square matrix, and an exponent, and returns the matrix multiplied by itself n number of times:

\[ \begin{aligned} A^2 &= A\times A\\ A^3 &= A\times A\times A\\ \vdots\\ A^n &= \underbrace{A\times A\times\cdots\times A}_{n \text{ factors of }A}=\prod_{i=1}^nA \end{aligned} \]
(Work by hand) Suppose we are interested in solving a problem with the unknowns(parameters) being the values of a symmetric matrix with dimension $p \times p$ . Since the lower triangle of the matrix is equal to the upper triangle, we only need to solve just the lower triangle and the main diagonal. In that case, if P = 2, we need to solve for 3 parameters. If p = 3 we need to solve for 6 parameters, ie 3 in the main diagonal and 3 in the lower triangle. What is the number of parameters to be solved for, in terms of p?

Parameter Optimization: (Continuation on the above) Given a vector x of length p, where p is the number of pa xrameters, write R code to turn the vector into a symmetric matrix.

Hint:

x <- 1:3 # the given vector
p <- length(x) # The number of parameters

# Should output
matrix(c(1,2,2,3), 2)

     [,1] [,2]
[1,]    1    2
[2,]    2    3

x <- 1:6 #The given vector
#Should output
matrix(c(1,2,3,2,4,5,3,5,6),3)

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    4    5
[3,]    3    5    6

Make use of the functions: t, lower.tri,upper.tri

Given the matrix z below, Use R to:
```
z <- matrix(1:16,4)
z
```
```
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16
```
1. Obtain the diagonal elements of matrix shown below. ie 1,6,11,16
2. Obtain the anti-diagonal elements of matrix shown below. ie 4,7,10,13
3. Obtain the elements adjacent to the main diagonal but off by 1. ie 2,7,12,5,10,15
4. Obtain the elements adjacent to the anti-diagonal but off by 1. ie 3,6,9,8,11,14
Hint: Use the functions col, row and the relational operator ==
Linear Regression: For simple prediction purposes, we desire to obtain parameters that would reduce the bias between the actual value and the predicted value. Given the model as $\mathbf{Y = X\beta +\epsilon}$ we desire to have a $\beta$ such that $\mathbf{Y\approx X\beta}$ . ie the error term tends to 0 often written as $\mathbf{\mathbb{E}(Y) = X\beta}$ . This is often referred to as linear regression.

The optimal $\beta$ that reduced the mean squared errors is computed as

\[ \hat\beta = \mathbf{(X^\top X)^{-1}X^\top Y} \]

Given the data below, solve for the parameters $\beta =\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix}$
```
set.seed(1) # For reproducibility
x1 <- c(3,6,3,3,5,8,9,10,4)
Y <- 2.3 + 3*x1 + rnorm(x1, 0, 0.001) # ie B= (2.3, 3)
```
Hint: X must contain a column of 1’s

\[ \begin{aligned}y_i = &\beta_0 + \beta_1x_i + \epsilon_i\\ = &\big[1~~x_i\big] \begin{bmatrix}\beta_0\\\beta_1\end{bmatrix} +\epsilon_i\end{aligned} \]

One Hot Encoding: Given a vector that represents a class unto which an object belongs to, write a program that would transform the data vector into dummy matrix.

Example

my_vec <- c(2,1,1,2,3,4,3,2) # there are 4 classes.

Then the one hot encoded data will look like:

     1    2    3    4  Classes
1    0    1    0    0  The 1st element is in class2
2    1    0    0    0  The 2nd element is in class1
3    1    0    0    0  The 3nd element is in class1
4    0    1    0    0   :
5    0    0    1    0   :
6    0    0    0    1
7    0    0    1    0
8    0    1    0    0

Hint: use diag

Classification: Also write a program that would revert back the dummy matrix into the original vector of classes. The matrix is given below

mat1 <- matrix(c(0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 
1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 
0L), 8)
mat1

     [,1] [,2] [,3] [,4]
[1,]    0    1    0    0
[2,]    1    0    0    0
[3,]    1    0    0    0
[4,]    0    1    0    0
[5,]    0    0    1    0
[6,]    0    0    0    1
[7,]    0    0    1    0
[8,]    0    1    0    0

Data encoding Suppose you are given the matrix below:

primary_color secondary_color tertiary_color

red blue green

yellow red NA

and we want this to be encoded by checking if the color exists across any of the three columns (1) or none of the 3 columns (0). So, it should yield:

Write an R code to produce the above result:

blue green red yellow

1 1 1 0

0 0 1 1
Matrix Inversion: Write a function inverse(A, B)that uses Gauss-Jordan elimination method to obtain obtain the solution of $x$ in $Ax = b$ where A is a square matrix and b is a known vector. Consider not inverting the matrix $A$.

primary_color	secondary_color	tertiary_color
red	blue	green
yellow	red	NA

Write an R code to produce the above result:
blue	green	red	yellow
1	1	1	0
0	0	1	1