<- 1:12
vec10 vec10
[1] 1 2 3 4 5 6 7 8 9 10 11 12
<- 4
n <- 3
p attr(vec10, 'dim') <- c(n, p)
vec10
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
Matrices are 2 dimensional structures used to store data, while arrays are high dimensional structures. In R matrices are just atomic vectors with an added dimension attribute which enables R to portray the vector in rows and columns. Note that the length of the vector must be equal to \(n \times p\) where n is the number of rows and p is the number of columns.
<- 1:12
vec10 vec10
[1] 1 2 3 4 5 6 7 8 9 10 11 12
<- 4
n <- 3
p attr(vec10, 'dim') <- c(n, p)
vec10
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
Since the dim attribute is common, there is a function for it. ie dim
<- 1:4 # a is a vector with no dimensions
a dim(a) # should return NULL
NULL
dim(a) <- c(2,2) # we set the dimension
# a is no longer a vector but a matrix/array a
[,1] [,2]
[1,] 1 3
[2,] 2 4
dim(a)
[1] 2 2
Is this the convenient way of creating a matrix?
<- matrix(1:4, 2)
x x
[,1] [,2]
[1,] 1 3
[2,] 2 4
<- array(1:4, c(2,2))
y y
[,1] [,2]
[1,] 1 3
[2,] 2 4
From the above we can note that a matrix in R is column-major. ie the values are read in in a column wise manner. We can change the way the data is read using the byrow
parameter in the matrix
function. Look at the example below.
<- matrix(c(1,0,2,3,4,5),nrow = 3, byrow = TRUE)
A <- matrix(1:6, ncol=3) B
The arithmetic operators work on these matrices the same way they work on a vector. ie element-wise.
+ 1 A
[,1] [,2]
[1,] 2 1
[2,] 3 4
[3,] 5 6
+ A A
[,1] [,2]
[1,] 2 0
[2,] 4 6
[3,] 8 10
^-1 A
[,1] [,2]
[1,] 1.00 Inf
[2,] 0.50 0.3333333
[3,] 0.25 0.2000000
What if I tried adding A to B above?
+A B
Error in B + A: non-conformable arrays
What about matrix multiplication? Recall that the number of columns for the first matrix need to be equal to the number of Rows in the second matrix otherwise it wont work.
%*% B A
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 8 18 28
[3,] 14 32 50
%*%A B
[,1] [,2]
[1,] 27 34
[2,] 34 42
%*%A A
Error in A %*% A: non-conformable arguments
%*%t(A) # t is the transpose function A
[,1] [,2] [,3]
[1,] 1 2 4
[2,] 2 13 23
[3,] 4 23 41
crossprod(A) # t(A) %*% A
[,1] [,2]
[1,] 21 26
[2,] 26 34
tcrossprod(A) # A %*% t(A)
[,1] [,2] [,3]
[1,] 1 2 4
[2,] 2 13 23
[3,] 4 23 41
crossprod(A, B)
Error: non-conformable arguments
You can find inverse of a square matrix:
<- B %*% A
D solve(D)
[,1] [,2]
[1,] -1.909091 1.545455
[2,] 1.545455 -1.227273
<- A %*% B
E E
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 8 18 28
[3,] 14 32 50
<- crossprod(A)
G solve(G)
[,1] [,2]
[1,] 0.8947368 -0.6842105
[2,] -0.6842105 0.5526316
t(A)
[,1] [,2] [,3]
[1,] 1 2 4
[2,] 0 3 5
diag(D)
[1] 27 42
diag(E)
[1] 1 18 50
upper.tri(E)
[,1] [,2] [,3]
[1,] FALSE TRUE TRUE
[2,] FALSE FALSE TRUE
[3,] FALSE FALSE FALSE
lower.tri(E)
[,1] [,2] [,3]
[1,] FALSE FALSE FALSE
[2,] TRUE FALSE FALSE
[3,] TRUE TRUE FALSE
col(A)
[,1] [,2]
[1,] 1 2
[2,] 1 2
[3,] 1 2
row(A)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3 3
ncol(A)
[1] 2
nrow(A)
[1] 3
diag(D) <- 1
diag(1:4)
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 2 0 0
[3,] 0 0 3 0
[4,] 0 0 0 4
colMeans(A)
[1] 2.333333 2.666667
rowMeans(A)
[1] 0.5 2.5 4.5
colSums(A)
[1] 7 8
rowSums(A)
[1] 1 5 9
rbind(A, A)
[,1] [,2]
[1,] 1 0
[2,] 2 3
[3,] 4 5
[4,] 1 0
[5,] 2 3
[6,] 4 5
cbind(A,A,A)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 0 1 0 1 0
[2,] 2 3 2 3 2 3
[3,] 4 5 4 5 4 5
max.col(A)
[1] 1 2 2
What if I want to subtract the column means from each column?
A
[,1] [,2]
[1,] 1 0
[2,] 2 3
[3,] 4 5
colMeans(A)
[1] 2.333333 2.666667
- colMeans(A) #But is this correct? A
[,1] [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.6666667 0.6666667
[3,] 1.6666667 2.3333333
t(t(A) - colMeans(A))
[,1] [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333 0.3333333
[3,] 1.6666667 2.3333333
sweep(A, 2, colMeans(A))
[,1] [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333 0.3333333
[3,] 1.6666667 2.3333333
scale(A, center = TRUE, scale = FALSE)
[,1] [,2]
[1,] -1.3333333 -2.6666667
[2,] -0.3333333 0.3333333
[3,] 1.6666667 2.3333333
attr(,"scaled:center")
[1] 2.333333 2.666667
column names, row names
<- matrix(1:4, 2)
Mat1 Mat1
[,1] [,2]
[1,] 1 3
[2,] 2 4
colnames(Mat1) <- c("column1", "column2") # Set column names
Mat1
column1 column2
[1,] 1 3
[2,] 2 4
rownames(Mat1) <- c("row1", "row2")
Mat1
column1 column2
row1 1 3
row2 2 4
<- matrix(1:4, 2, dimnames = list(c("row1", "row2"), c("column1", "column2")))
mat2 mat2
column1 column2
row1 1 3
row2 2 4
dimnames(mat2)
[[1]]
[1] "row1" "row2"
[[2]]
[1] "column1" "column2"
dimnames(mat2) <- NULL
dimnames(mat2) <- list(c("row1", "row2"), c("column1", "column2"))
mat2
column1 column2
row1 1 3
row2 2 4
dimnames(mat2) <- list(rows = c("row1", "row2"),columns= c("column1", "column2"))
mat2
columns
rows column1 column2
row1 1 3
row2 2 4
1,1] # row 1, column 1 A[
[1] 1
2,] # row 1 A[
[1] 2 3
2] # column 1 A[,
[1] 0 3 5
2,drop = FALSE] # maintain the structure of the original object ie matrix/array A[,
[,1]
[1,] 0
[2,] 3
[3,] 5
1,2] <- -3 A[
We have seen that if we want to find the column sums we use the colSums
function. What if we want the col sd
s?
use apply(your_matrix, dimension, your_function)
where dimension is 1
for rows and 2
form columns
<- array(1:12, c(2,6))
A apply(A, 2, sd) # columnwise sd
[1] 0.7071068 0.7071068 0.7071068 0.7071068 0.7071068 0.7071068
apply(A, 2, mean) # similar to colMeans
[1] 1.5 3.5 5.5 7.5 9.5 11.5
apply(A, 1, max)
[1] 11 12
cbind(seq(nrow(A)), max.col(A)) # What does this do?
[,1] [,2]
[1,] 1 6
[2,] 2 6
cbind(seq(nrow(A)), max.col(A))] A[
[1] 11 12
<- array(1:12, c(2,6))
A A
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 3 5 7 9 11
[2,] 2 4 6 8 10 12
<- array(1:12, c(1,6,2))
B B
, , 1
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
, , 2
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 7 8 9 10 11 12
<- array(1:24, c(3,4,2))
C C
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
, , 2
[,1] [,2] [,3] [,4]
[1,] 13 16 19 22
[2,] 14 17 20 23
[3,] 15 18 21 24
Use apply
apply(C, 1, sum)
[1] 92 100 108
apply(C, 2, sum)
[1] 48 66 84 102
apply(C, 3, sum)
[1] 78 222
apply(C,c(1,2), sum)
[,1] [,2] [,3] [,4]
[1,] 14 20 26 32
[2,] 16 22 28 34
[3,] 18 24 30 36
apply(C, c(1,3), sum)
[,1] [,2]
[1,] 22 70
[2,] 26 74
[3,] 30 78
apply(C,c(2,3), sum)
[,1] [,2]
[1,] 6 42
[2,] 15 51
[3,] 24 60
[4,] 33 69
To transpose a matrix, we use t
function, what about transposing an array? we use aperm
:
aperm(C, c(1,2,3))
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
, , 2
[,1] [,2] [,3] [,4]
[1,] 13 16 19 22
[2,] 14 17 20 23
[3,] 15 18 21 24
aperm(C,c(1,3,2))
, , 1
[,1] [,2]
[1,] 1 13
[2,] 2 14
[3,] 3 15
, , 2
[,1] [,2]
[1,] 4 16
[2,] 5 17
[3,] 6 18
, , 3
[,1] [,2]
[1,] 7 19
[2,] 8 20
[3,] 9 21
, , 4
[,1] [,2]
[1,] 10 22
[2,] 11 23
[3,] 12 24
aperm(C, c(2,1,3)) # etc
, , 1
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
[4,] 10 11 12
, , 2
[,1] [,2] [,3]
[1,] 13 14 15
[2,] 16 17 18
[3,] 19 20 21
[4,] 22 23 24
Power of a matrix: implement a function called mat_pow(A, n)
that takes in two parameters, a square matrix, and an exponent, and returns the matrix multiplied by itself n number of times:
\[ \begin{aligned} A^2 &= A\times A\\ A^3 &= A\times A\times A\\ \vdots\\ A^n &= \underbrace{A\times A\times\cdots\times A}_{n \text{ factors of }A}=\prod_{i=1}^nA \end{aligned} \]
(Work by hand) Suppose we are interested in solving a problem with the unknowns(parameters) being the values of a symmetric matrix with dimension \(p \times p\) . Since the lower triangle of the matrix is equal to the upper triangle, we only need to solve just the lower triangle and the main diagonal. In that case, if P = 2, we need to solve for 3 parameters. If p = 3 we need to solve for 6 parameters, ie 3 in the main diagonal and 3 in the lower triangle. What is the number of parameters to be solved for, in terms of p?
Parameter Optimization: (Continuation on the above) Given a vector x
of length p
, where p is the number of pa xrameters, write R code to turn the vector into a symmetric matrix.
Hint:
<- 1:3 # the given vector
x <- length(x) # The number of parameters
p
# Should output
matrix(c(1,2,2,3), 2)
[,1] [,2]
[1,] 1 2
[2,] 2 3
<- 1:6 #The given vector
x #Should output
matrix(c(1,2,3,2,4,5,3,5,6),3)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 5
[3,] 3 5 6
Make use of the functions: t, lower.tri,upper.tri
Given the matrix z
below, Use R to:
<- matrix(1:16,4)
z z
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
Obtain the diagonal elements of matrix shown below. ie 1,6,11,16
Obtain the anti-diagonal elements of matrix shown below. ie 4,7,10,13
Obtain the elements adjacent to the main diagonal but off by 1. ie 2,7,12,5,10,15
Obtain the elements adjacent to the anti-diagonal but off by 1. ie 3,6,9,8,11,14
Hint: Use the functions col, row
and the relational operator ==
Linear Regression: For simple prediction purposes, we desire to obtain parameters that would reduce the bias between the actual value and the predicted value. Given the model as \(\mathbf{Y = X\beta +\epsilon}\) we desire to have a \(\beta\) such that \(\mathbf{Y\approx X\beta}\) . ie the error term tends to 0 often written as \(\mathbf{\mathbb{E}(Y) = X\beta}\) . This is often referred to as linear regression.
The optimal $\beta$ that reduced the mean squared errors is computed as
\[ \hat\beta = \mathbf{(X^\top X)^{-1}X^\top Y} \]
Given the data below, solve for the parameters \(\beta =\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix}\)
set.seed(1) # For reproducibility
<- c(3,6,3,3,5,8,9,10,4)
x1 <- 2.3 + 3*x1 + rnorm(x1, 0, 0.001) # ie B= (2.3, 3) Y
Hint: X
must contain a column of 1’s
\[ \begin{aligned}y_i = &\beta_0 + \beta_1x_i + \epsilon_i\\ = &\big[1~~x_i\big] \begin{bmatrix}\beta_0\\\beta_1\end{bmatrix} +\epsilon_i\end{aligned} \]
One Hot Encoding: Given a vector that represents a class unto which an object belongs to, write a program that would transform the data vector into dummy matrix.
Example
<- c(2,1,1,2,3,4,3,2) # there are 4 classes. my_vec
Then the one hot encoded data will look like:
1 2 3 4 Classes
1 0 1 0 0 The 1st element is in class2
2 1 0 0 0 The 2nd element is in class1
3 1 0 0 0 The 3nd element is in class1
4 0 1 0 0 :
5 0 0 1 0 :
6 0 0 0 1
7 0 0 1 0
8 0 1 0 0
Hint: use diag
Classification: Also write a program that would revert back the dummy matrix into the original vector of classes. The matrix is given below
<- matrix(c(0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L,
mat1 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L,
0L), 8)
mat1
[,1] [,2] [,3] [,4]
[1,] 0 1 0 0
[2,] 1 0 0 0
[3,] 1 0 0 0
[4,] 0 1 0 0
[5,] 0 0 1 0
[6,] 0 0 0 1
[7,] 0 0 1 0
[8,] 0 1 0 0
Data encoding Suppose you are given the matrix below:
primary_color | secondary_color | tertiary_color |
---|---|---|
red | blue | green |
yellow | red | NA |
and we want this to be encoded by checking if the color exists across any of the three columns (1) or none of the 3 columns (0). So, it should yield:
blue | green | red | yellow |
---|---|---|---|
1 | 1 | 1 | 0 |
0 | 0 | 1 | 1 |
Matrix Inversion: Write a function inverse(A, B)
that uses Gauss-Jordan elimination method to obtain obtain the solution of \(x\) in \(Ax = b\) where A is a square matrix and b is a known vector. Consider not inverting the matrix \(A\).