01/06/2020

## Seed used in these slides

set.seed(1024)

## Libraries used in these slides

library(ggplot2)
library(nnet)
library(bit64)
library(h2o)

## Artificial Neural Networks

• Non-linear models
• Can solve both classification and regression tasks
• Composed of neurons
• Connected together
• Each connection has a weight
• Idea: find best weights that produce the correct output

## Artificial Neural Networks

1. linear computation using the input values $in_j = \sum_{i=1}^{k}{w_{i,j}a_i}$

2. a non-linear (activation) function

3. output sent to other neurons
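The three steps above can be sketched in a few lines of R. This is only an illustration: the function name `neuron_output`, the sigmoid choice, and the input and weight values are assumptions made for the example, not part of the slides.

```r
# Sketch of a single neuron j (all names and numbers are illustrative)
sigmoid <- function(x) 1 / (1 + exp(-x))

neuron_output <- function(a, w, activation = sigmoid) {
  stopifnot(length(a) == length(w))
  in_j <- sum(w * a)   # step 1: weighted sum of the incoming values
  activation(in_j)     # step 2: non-linear activation; this value is what
                       # step 3 would send on to the next neurons
}

a <- c(1, 0.5, -1)       # made-up incoming activations a_i
w <- c(0.2, -0.4, 0.1)   # made-up weights w_{i,j}
out <- neuron_output(a, w)
out
```

Here the weighted sum is 0.2 - 0.2 - 0.1 = -0.1, so the output is the sigmoid of -0.1, a value slightly below 0.5.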

## Activation Function

• The Step Function $step(x) = \begin{cases} 1, & \text{if}\ x \geq t \\ 0, & \text{otherwise} \end{cases}$
• The Sign Function $sign(x) = \begin{cases} 1, & \text{if}\ x \geq 0 \\ -1, & \text{otherwise} \end{cases}$
• The Sigmoid Function $sigmoid(x) = \frac{1}{1+e^{-x}}$
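As a rough sketch, the three activation functions can be written as vectorized R functions. The names `step_fn`, `sign_fn`, and `sigmoid_fn` are assumptions chosen to avoid masking base R's `sign()`, and the sign function is written here in its usual form that returns -1 for negative inputs.

```r
# Illustrative definitions; names are not from the slides
step_fn    <- function(x, t = 0) ifelse(x >= t, 1, 0)   # threshold t
sign_fn    <- function(x) ifelse(x >= 0, 1, -1)         # +1 / -1 output
sigmoid_fn <- function(x) 1 / (1 + exp(-x))             # smooth, in (0, 1)

x <- c(-2, 0, 2)
step_out    <- step_fn(x, t = 1)
sign_out    <- sign_fn(x)
sigmoid_out <- sigmoid_fn(x)
```

Unlike the step and sign functions, the sigmoid is differentiable everywhere, which is what gradient-based weight fitting relies on.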

## Types

• Perceptron
• Single unit
• Linearity assumption
• Incapable of learning problems that are not linearly separable (e.g. XOR)
• Multi-layer
• Feed-forward (acyclic)
• Recurrent (cyclic)
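The perceptron's linearity limitation can be illustrated with a minimal sketch of the classic perceptron learning rule. The helper names `train_perceptron` and `predict_perceptron`, the zero initial weights, and the epoch count are assumptions for this example. A single unit learns AND, which is linearly separable, but cannot fit XOR, which is not.

```r
# Classic perceptron rule on 0/1 targets (illustrative helper names)
train_perceptron <- function(X, y, epochs = 100, lr = 1) {
  Xb <- cbind(1, X)               # prepend a constant column for the bias
  w  <- rep(0, ncol(Xb))          # weights start at zero (an assumption)
  for (e in seq_len(epochs)) {
    for (i in seq_len(nrow(Xb))) {
      pred <- as.numeric(sum(w * Xb[i, ]) >= 0)
      w <- w + lr * (y[i] - pred) * Xb[i, ]   # update only on mistakes
    }
  }
  w
}

predict_perceptron <- function(w, X) as.numeric(cbind(1, X) %*% w >= 0)

X     <- matrix(c(0, 0,  0, 1,  1, 0,  1, 1), ncol = 2, byrow = TRUE)
y_and <- c(0, 0, 0, 1)
y_xor <- c(0, 1, 1, 0)

and_ok <- all(predict_perceptron(train_perceptron(X, y_and), X) == y_and)
xor_ok <- all(predict_perceptron(train_perceptron(X, y_xor), X) == y_xor)
c(AND = and_ok, XOR = xor_ok)
```

No choice of weights lets one linear threshold separate the XOR classes, which is why multi-layer networks are needed.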

## Feed-forward

Feed-forward Artificial Neural Network

## What to Know?

• ANNs are universal function approximators
• They can approximate any continuous function arbitrarily well if provided a suitable architecture
• Two major handicaps
• You have to guess the architecture
• How many layers?
• How many nodes in each layer?
• Initial weights
• Expensive computation
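One common way to deal with guessing the architecture is to try a few candidate sizes and compare them on held-out data. The sketch below assumes the `nnet` package and the built-in `iris` data. The candidate sizes, the split, and the variable names are illustrative choices, not recommendations from the slides.

```r
library(nnet)

set.seed(1024)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
valid <- iris[-idx, ]

sizes <- c(1, 2, 4, 8)   # candidate numbers of hidden nodes (arbitrary)

# Fit one network per candidate size and record validation accuracy
val_acc <- sapply(sizes, function(s) {
  fit <- nnet(Species ~ ., data = train, size = s,
              trace = FALSE, maxit = 500)
  mean(predict(fit, valid, type = "class") == valid$Species)
})
names(val_acc) <- sizes

best_size <- sizes[which.max(val_acc)]
val_acc
```

Each candidate requires a full fit, so the search itself adds to the computational cost noted above.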

## Artificial Neural Networks - Implementations

• nnet (a recommended package included in the standard R installation)
• single hidden layer
• RSNNS (Bergmeir and Benitez, 2012)
• FCNN4R (Klima, 2016)
• neuralnet (Fritsch et al., 2012)

## Classification Example

data(iris)
rndSample <- sample(1:nrow(iris), 100)
tr <- iris[rndSample, ]
ts <- iris[-rndSample, ]
n <- nnet(Species ~ ., tr, size = 6, trace = FALSE, maxit = 1000)
ps <- predict(n, ts, type="class")
(cm <- table(ps, ts$Species))
##
## ps           setosa versicolor virginica
##   setosa         12          0         0
##   versicolor      1         20         0
##   virginica       0          1        16

• parameter decay sets the weight decay (a penalty on large weights), not a learning rate
• initial weights are randomized in $$[-rang, rang]$$ (rang defaults to 0.7)
• two consecutive runs can result in different outputs
• unless seed is fixed

## Regression Example

data(Boston, package = "MASS")
sp <- sample(1:nrow(Boston), 354)
tr <- Boston[sp, ]
ts <- Boston[-sp, ]
nr <- nnet(medv ~ ., tr, linout = TRUE, trace = FALSE, size = 6, decay = 0.01, maxit = 2000)
psnr <- predict(nr, ts)
mean(abs(psnr - ts$medv))
## [1] 3.198552
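Because the initial weights are random, a mean absolute error like the one above is only reproducible when the seed is fixed. The following sketch checks this on a small regression fit. It assumes the `nnet` package and uses the built-in `mtcars` data instead of Boston. The helpers `fit_once` and `mae` are hypothetical names introduced for this example.

```r
library(nnet)

mae <- function(pred, obs) mean(abs(pred - obs))

# Fit a small regression network after fixing the seed; return predictions
fit_once <- function(seed) {
  set.seed(seed)
  d <- as.data.frame(scale(mtcars[, c("wt", "hp")]))  # standardized inputs
  d$mpg <- mtcars$mpg
  fit <- nnet(mpg ~ wt + hp, data = d, size = 3, linout = TRUE,
              decay = 0.01, trace = FALSE, maxit = 500)
  as.vector(predict(fit, d))
}

p1 <- fit_once(1024)
p2 <- fit_once(1024)   # same seed: same initial weights, same fit

same_fit <- identical(p1, p2)
train_mae <- mae(p1, mtcars$mpg)
```

Calling `fit_once()` with a different seed will generally give different predictions, so a reported error should be read together with the seed that produced it.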
plot(ts$medv, psnr)
abline(0, 1)

## Deep Learning

• Each consecutive layer defines a more complex set of features
• Use an unsupervised learning method at each layer to learn the features
• Apply ANN to the obtained structure
• DLNNs are ANNs with many hidden layers
• They are very popular now, because
• Hardware improvements
• Methodological improvements (Hinton, 2006)

## Deep Learning

• Many implementations exist
• h2o (Aiello et al., 2016)
• mxnet
• darch (Drees, 2013)
• deepnet (Rong, 2014)

## Deep Learning

h2oInstance <- h2o.init(ip = "localhost") # start H2O instance locally
##
## H2O is not running yet, starting it now...
##
## Note: In case of errors look at the following log files:
## /tmp/RtmphH8qFr/filecd5c7a5a11e5/h2o_bgenc_started_from_r.out
## /tmp/RtmphH8qFr/filecd5c2dc88072/h2o_bgenc_started_from_r.err
##
##
## Starting H2O JVM and connecting: . Connection successful!
##
## R is connected to the H2O cluster:
## H2O cluster uptime: 1 seconds 106 milliseconds
## H2O cluster timezone: Europe/Istanbul
## H2O data parsing timezone: UTC
## H2O cluster version: 3.30.0.1
## H2O cluster version age: 1 month and 28 days
## H2O cluster name: H2O_started_from_R_bgenc_qbt439
## H2O cluster total nodes: 1
## H2O cluster total memory: 3.88 GB
## H2O cluster total cores: 12
## H2O cluster allowed cores: 12
## H2O cluster healthy: TRUE
## H2O Connection ip: localhost
## H2O Connection port: 54321
## H2O Connection proxy: NA
## H2O Internal Security: FALSE
## H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
## R Version: R version 4.0.0 (2020-04-24)

## Deep Learning

rndSample <- sample(1:nrow(iris), 100)
trH <- as.h2o(iris[rndSample, ], "trH")
tsH <- as.h2o(iris[-rndSample, ], "tsH")
mdl <- h2o.deeplearning(x = 1:4, y = 5, training_frame = trH)
preds <- h2o.predict(mdl, tsH)[, "predict"]
(cm <- table(as.vector(preds), as.vector(tsH$Species)))
• Analyze outputs in console
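One way to analyze a confusion matrix such as `cm` is to compute overall accuracy and per-class recall. Since the h2o output is not reproduced in these slides, the sketch below rebuilds the matrix from the counts of the earlier nnet classification example. The object names are illustrative.

```r
classes <- c("setosa", "versicolor", "virginica")

# Counts copied from the nnet confusion matrix (rows: predicted, cols: actual)
cm_example <- matrix(c(12, 1, 0,
                       0, 20, 1,
                       0, 0, 16),
                     nrow = 3,
                     dimnames = list(predicted = classes, actual = classes))

accuracy <- sum(diag(cm_example)) / sum(cm_example)   # correct / total
recall   <- diag(cm_example) / colSums(cm_example)    # per actual class

accuracy
recall
```

With these counts, 48 of the 50 test cases sit on the diagonal, so the accuracy is 0.96. The same two lines apply unchanged to the `cm` table produced from the h2o predictions.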

## Deep Learning

data(Boston, package="MASS")
trH <- as.h2o(Boston[sp, ],"trH")
tsH <- as.h2o(Boston[-sp, ],"tsH")
mdl <- h2o.deeplearning(x=1:13, y=14, training_frame=trH,
hidden = c(100, 100, 100, 100), epochs = 500)
preds <- as.vector(h2o.predict(mdl,tsH))
mean(abs(preds - as.vector(tsH$medv)))
## [1] 2.339476

## Deep Learning

plot(as.vector(tsH$medv), preds)
abline(0, 1)

plot(as.vector(tsH$medv), preds)
points(as.vector(tsH$medv), psnr, col = "red")
abline(0, 1)
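The plot compares the two models visually. The two mean absolute errors reported earlier in these slides can also be compared numerically, as in the short sketch below. It only restates the reported figures. They come from a single train/test split, so the gap should be read as indicative rather than conclusive.

```r
mae_nnet <- 3.198552   # single-hidden-layer nnet, as reported in the slides
mae_h2o  <- 2.339476   # h2o deep network, as reported in the slides

abs_gain <- mae_nnet - mae_h2o    # reduction in MAE
rel_gain <- abs_gain / mae_nnet   # reduction relative to the nnet error

round(c(absolute = abs_gain, relative = rel_gain), 3)
```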