01/06/2020

Seed used in these slides

set.seed(1024)

Libraries used in these slides

library(ggplot2)
library(nnet)
library(bit64)
library(h2o)

Artificial Neural Networks

Artificial Neural Networks

  • Non-linear models
  • Can solve both classification and regression tasks
  • Composed of neurons
    • Connected together
    • Each connection has a weight
    • Idea: find best weights that produce the correct output
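
In practice, those weights are learned by gradient descent on an error function \(E\): each weight is updated as \[w_{i,j} \leftarrow w_{i,j} - \eta \frac{\partial E}{\partial w_{i,j}}\] where \(\eta\) is the learning rate.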

Artificial Neural Networks

  1. a linear combination of the input values \[in_j = \sum_{i=1}^{k}{w_{i,j}a_i}\]

  2. a non-linear (activation) function applied to that sum

  3. the output is sent to the connected neurons
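
A minimal sketch of this computation in R (the function name and example values are illustrative):

# a single neuron: linear combination of the inputs, then a sigmoid activation
neuron_output <- function(w, a) {
  in_j <- sum(w * a)      # step 1: weighted sum of the inputs
  1 / (1 + exp(-in_j))    # step 2: non-linear (sigmoid) activation
}
neuron_output(w = c(0.2, -0.5, 0.1), a = c(1, 0, 1))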

Activation Function

  • The Step Function \[step(x) = \begin{cases} 1, & \text{if}\ x \geq t \\ 0, & \text{otherwise} \end{cases}\]
  • The Sign Function \[sign(x) = \begin{cases} 1, & \text{if}\ x \geq 0 \\ -1, & \text{otherwise} \end{cases}\]
  • The Sigmoid Function \[sigmoid(x) = \frac{1}{1+e^{-x}}\]
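
These are one-liners in R (a sketch; the names step_fun and sign_fun avoid masking base R's step() and sign()):

step_fun <- function(x, t = 0) ifelse(x >= t, 1, 0)  # step with threshold t
sign_fun <- function(x) ifelse(x >= 0, 1, -1)
sigmoid  <- function(x) 1 / (1 + exp(-x))
curve(sigmoid, from = -6, to = 6)                    # the characteristic S-shape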

Types

  • Perceptron
    • Single unit
    • Linearity assumption
    • Cannot learn functions that are not linearly separable, such as XOR (see the sketch after this list)
  • Multi-layer
    • Feed-forward (acyclic)
    • Recurrent (cyclic)
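
The XOR problem illustrates the perceptron's limitation. A sketch with nnet (size = 0 with skip = TRUE yields a network with no hidden layer; the hidden-layer fit may need a re-run depending on the random initial weights):

xor_df <- data.frame(x1 = c(0, 0, 1, 1),
                     x2 = c(0, 1, 0, 1),
                     y  = c(0, 1, 1, 0))
lin <- nnet(y ~ ., xor_df, size = 0, skip = TRUE, trace = FALSE)  # perceptron-like
mlp <- nnet(y ~ ., xor_df, size = 2, maxit = 1000, trace = FALSE) # one hidden layer
round(cbind(linear = as.vector(predict(lin)),
            hidden = as.vector(predict(mlp))), 2)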

Feed-forward

Feed-forward Artificial Neural Network

What to Know?

  • ANNs are universal function approximators
    • They can approximate any continuous function to arbitrary precision, given enough hidden units
  • Two major handicaps
    • You have to guess the architecture (see the sketch below)
      • How many layers?
      • How many nodes in each layer?
      • Initial weights
    • Training is computationally expensive
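
A common workaround is to try several architectures and keep the one with the best holdout error. A sketch with nnet on iris (the candidate sizes are arbitrary):

data(iris)
idx <- sample(1:nrow(iris), 100)
for (s in c(2, 4, 8)) {
  m   <- nnet(Species ~ ., iris[idx, ], size = s, maxit = 1000, trace = FALSE)
  acc <- mean(predict(m, iris[-idx, ], type = "class") == iris$Species[-idx])
  cat("size =", s, "accuracy =", round(acc, 3), "\n")
}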

Artificial Neural Networks - Implementations

  • nnet ships with the standard R distribution (a recommended package)
    • single hidden layer
  • RSNNS (Bergmeir and Benitez, 2012)
  • FCNN4R (Klima, 2016)
  • neuralnet (Fritsch et al., 2012)

Classification Example

data(iris)
rndSample <- sample(1:nrow(iris), 100)   # random 100-case training split
tr <- iris[rndSample, ]
ts <- iris[-rndSample, ]                 # remaining 50 cases for testing
n <- nnet(Species ~ ., tr, size = 6, trace = FALSE, maxit = 1000)  # 6 hidden units
ps <- predict(n, ts, type = "class")
(cm <- table(ps, ts$Species))            # confusion matrix
##             
## ps           setosa versicolor virginica
##   setosa         12          0         0
##   versicolor      1         20         0
##   virginica       0          1        16
  • parameter decay controls weight decay (a regularization penalty on the weights), not the learning rate
  • initial weights are randomized uniformly in \([-rang, rang]\) (parameter rang, 0.7 by default)
  • two consecutive runs can result in different outputs
    • unless seed is fixed
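
The overall error rate follows directly from the confusion matrix:

1 - sum(diag(cm)) / sum(cm)  # misclassified proportion on the test set
## [1] 0.04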

Regression Example

data(Boston, package = "MASS")
sp <- sample(1:nrow(Boston), 354)   # ~70% training split
tr <- Boston[sp, ]
ts <- Boston[-sp, ]
nr <- nnet(medv ~ ., tr,
           linout = TRUE,   # linear output unit, needed for regression
           trace = FALSE,
           size = 6,
           decay = 0.01,    # weight decay (regularization)
           maxit = 2000)
psnr <- predict(nr, ts)
mean(abs(psnr - ts$medv))   # mean absolute error
## [1] 3.198552
plot(ts$medv, psnr)
abline(0, 1)

Deep Learning

Deep Learning

  • Each consecutive layer defines a more complex set of features
  • Use an unsupervised learning method at each layer to learn the features
  • Apply ANN to the obtained structure
  • DLNNs are ANNs with many hidden layers
    • They are very popular now because of
      • Hardware improvements
      • Methodological improvements (Hinton, 2006)

Deep Learning

  • Many implementations exist
    • h2o (Aiello et al., 2016)
    • mxnet
    • darch (Drees, 2013)
    • deepnet (Rong, 2014)

Deep Learning

h2oInstance <- h2o.init(ip = "localhost") # start H2O instance locally
## 
## H2O is not running yet, starting it now...
## 
## Note:  In case of errors look at the following log files:
##     /tmp/RtmphH8qFr/filecd5c7a5a11e5/h2o_bgenc_started_from_r.out
##     /tmp/RtmphH8qFr/filecd5c2dc88072/h2o_bgenc_started_from_r.err
## 
## 
## Starting H2O JVM and connecting: . Connection successful!
## 
## R is connected to the H2O cluster: 
##     H2O cluster uptime:         1 seconds 106 milliseconds 
##     H2O cluster timezone:       Europe/Istanbul 
##     H2O data parsing timezone:  UTC 
##     H2O cluster version:        3.30.0.1 
##     H2O cluster version age:    1 month and 28 days  
##     H2O cluster name:           H2O_started_from_R_bgenc_qbt439 
##     H2O cluster total nodes:    1 
##     H2O cluster total memory:   3.88 GB 
##     H2O cluster total cores:    12 
##     H2O cluster allowed cores:  12 
##     H2O cluster healthy:        TRUE 
##     H2O Connection ip:          localhost 
##     H2O Connection port:        54321 
##     H2O Connection proxy:       NA 
##     H2O Internal Security:      FALSE 
##     H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
##     R Version:                  R version 4.0.0 (2020-04-24)

Deep Learning

rndSample <- sample(1:nrow(iris), 100)
trH <- as.h2o(iris[rndSample, ], "trH")   # upload the training split to H2O
tsH <- as.h2o(iris[-rndSample, ], "tsH")  # upload the test split to H2O
mdl <- h2o.deeplearning(x = 1:4, y = 5, training_frame = trH)  # predictors 1:4, target 5
preds <- h2o.predict(mdl, tsH)[, "predict"]
(cm <- table(as.vector(preds), as.vector(tsH$Species)))
##             
##              setosa versicolor virginica
##   setosa         16          0         0
##   versicolor      0         15         0
##   virginica       0          5        14
  • Analyze the model's outputs in the console
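
For example, h2o.performance() computes detailed metrics on a given frame:

h2o.performance(mdl, newdata = tsH)  # confusion matrix, hit ratios, logloss, ...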

Deep Learning

data(Boston, package = "MASS")
trH <- as.h2o(Boston[sp, ], "trH")
tsH <- as.h2o(Boston[-sp, ], "tsH")
mdl <- h2o.deeplearning(x = 1:13, y = 14, training_frame = trH,
                        hidden = c(100, 100, 100, 100),  # four hidden layers, 100 units each
                        epochs = 500)
preds <- as.vector(h2o.predict(mdl, tsH))
mean(abs(preds - as.vector(tsH$medv)))   # mean absolute error
## [1] 2.339476

Deep Learning

plot(as.vector(tsH$medv), preds)
abline(0, 1)

plot(as.vector(tsH$medv), preds)                # deep learning predictions
points(as.vector(tsH$medv), psnr, col = "red")  # earlier nnet predictions, in red
abline(0, 1)

Deep Learning

  • Don’t forget to shut down H2O :)
h2o.shutdown(prompt = FALSE)