07/06/2020

Seed used in these slides

set.seed(1024)

Libraries used in these slides

library(performanceEstimation)
library(e1071)
library(randomForest)
library(DMwR2)
library(rpart)

Evaluation

Performance Evaluation

  • We need to evaluate the performance of our predictions
  • Bad idea:
    • Evaluate your performance on the training set
  • Good idea:
    • Evaluate your performance on a never-before-seen set, a.k.a. the test set

Evaluation Methodology

  • Split your data into
    • Training Set
    • Test Set
  • Train Model
  • Evaluate Error -> \(E_i\)
  • Do this k times to ensure statistical reliability (a minimal sketch follows after this list)
  • Calculate mean prediction error \[\bar E = \frac{1}{k}\sum_{i=1}^{k}{E_i}\]
  • Also check the standard error of the mean: \[SE(\bar E) = \frac{s_E}{\sqrt{k}}\] where \(s_E\) is the sample standard deviation of the error \[s_E = \sqrt{\frac{1}{k-1}\sum_{i=1}^{k}{(E_i-\bar E)^2}}\]
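A minimal sketch of this procedure, assuming an illustrative regression task (the Boston housing data) and a plain rpart tree as the learner:

# k repeated 70/30 holdout splits, then the mean error and its standard error
data(Boston, package = "MASS")                 # illustrative data
k <- 10
E <- numeric(k)
for (i in 1:k) {
  idx   <- sample(nrow(Boston), round(0.7 * nrow(Boston)))  # 70% training rows
  model <- rpart(medv ~ ., Boston[idx, ])
  preds <- predict(model, Boston[-idx, ])
  E[i]  <- mean((Boston$medv[-idx] - preds)^2)              # test-set MSE, E_i
}
mean(E)           # mean prediction error
sd(E) / sqrt(k)   # standard error of the mean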

Automated Evaluation

  • We will investigate the performanceEstimation package.
  • The caret and mlr packages provide similar functionality.

Holdout and Random Subsampling

Holdout and Random Subsampling

  • The holdout method splits the data into two parts
    • Training set: Usually 70%
    • Test set: Usually 30%
  • What if the dataset size is too small?
    • Not enough data left to train a good model
    • Not enough data left to test accurately
  • This method works best for large datasets

Holdout and Random Subsampling

  • Random subsampling repeats the holdout split many times
  • This way we get a set of test scores
    • which gives a more reliable error estimate (a mean with a standard error) than a single split

Example - Holdout

  • We will use the performanceEstimation() function
data(iris)
r <- performanceEstimation(
  PredTask(Species ~ ., iris),
  Workflow(learner = "svm"),
  EstimationTask(metrics = "err",
                 method = Holdout(hldSz = 0.3)))
## 
## 
## ##### PERFORMANCE ESTIMATION USING  HOLD OUT  #####
## 
## ** PREDICTIVE TASK :: iris.Species
## 
## ++ MODEL/WORKFLOW :: svm 
## Task for estimating  err  using
##  1 x 70 % / 30 % Holdout
##   Run with seed =  1234 
## Iteration :  1
  • Use the learner.pars argument of Workflow() to pass specific parameter values to the learner, as sketched below
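A minimal sketch (the cost and gamma values here are purely illustrative):

Workflow(learner = "svm",
         learner.pars = list(cost = 10, gamma = 0.01))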

Example - Holdout

summary(r)
## 
## == Summary of a  Hold Out Performance Estimation Experiment ==
## 
## Task for estimating  err  using
##  1 x 70 % / 30 % Holdout
##   Run with seed =  1234 
## 
## * Predictive Tasks ::  iris.Species
## * Workflows  ::  svm 
## 
## -> Task:  iris.Species
##   *Workflow: svm 
##                err
## avg     0.02222222
## std             NA
## med     0.02222222
## iqr     0.00000000
## min     0.02222222
## max     0.02222222
## invalid 0.00000000

Example - Random Subsampling

data(Boston, package="MASS")
r <- performanceEstimation(
  PredTask(medv ~ ., Boston),
  Workflow(learner = "randomForest"),
  EstimationTask(metrics = "mse",
                 method = Holdout(nReps = 3, hldSz = 0.3)))
## 
## 
## ##### PERFORMANCE ESTIMATION USING  HOLD OUT  #####
## 
## ** PREDICTIVE TASK :: Boston.medv
## 
## ++ MODEL/WORKFLOW :: randomForest 
## Task for estimating  mse  using
##  3 x 70 % / 30 % Holdout
##   Run with seed =  1234 
## Iteration :  1  2  3

Example - Random Subsampling

summary(r) 
## 
## == Summary of a  Hold Out Performance Estimation Experiment ==
## 
## Task for estimating  mse  using
##  3 x 70 % / 30 % Holdout
##   Run with seed =  1234 
## 
## * Predictive Tasks ::  Boston.medv
## * Workflows  ::  randomForest 
## 
## -> Task:  Boston.medv
##   *Workflow: randomForest 
##               mse
## avg     11.032014
## std      1.852507
## med     10.245758
## iqr      1.722827
## min      9.702315
## max     13.147969
## invalid  0.000000

Cross Validation

Cross Validation

  • Instead of drawing k random test sets, we partition the data into k equal-sized folds: each fold serves once as the test set while the remaining k-1 folds form the training set (see the sketch below)
  • Works best for medium-sized datasets
    • a few hundred to a few thousand rows
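A minimal sketch of the fold mechanics, again using the Boston data and an rpart tree purely for illustration:

# 10-fold CV: every row appears in exactly one test fold
data(Boston, package = "MASS")
k <- 10
folds <- sample(rep(1:k, length.out = nrow(Boston)))  # random fold labels
E <- numeric(k)
for (i in 1:k) {
  model <- rpart(medv ~ ., Boston[folds != i, ])  # train on the other k-1 folds
  preds <- predict(model, Boston[folds == i, ])   # test on the held-out fold
  E[i]  <- mean((Boston$medv[folds == i] - preds)^2)
}
mean(E)   # the cross-validation estimate of MSE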

Cross Validation - Example

r <- performanceEstimation(
  PredTask(medv ~ ., Boston),
  workflowVariants(learner = "rpartXse",
                   learner.pars = list(se = c(0, 0.25, 0.5, 1, 2))),
  EstimationTask(metrics = c("mse", "mae"),
                 method = CV(nReps = 8, nFolds = 10)))
## 
## 
## ##### PERFORMANCE ESTIMATION USING  CROSS VALIDATION  #####
## 
## ** PREDICTIVE TASK :: Boston.medv
## 
## ++ MODEL/WORKFLOW :: rpartXse.v1 
## Task for estimating  mse,mae  using
##  8 x 10 - Fold Cross Validation
##   Run with seed =  1234 
## Iteration :********************************************************************************
## 
## 
## ++ MODEL/WORKFLOW :: rpartXse.v2 
## Task for estimating  mse,mae  using
##  8 x 10 - Fold Cross Validation
##   Run with seed =  1234 
## Iteration :********************************************************************************
## 
## 
## ++ MODEL/WORKFLOW :: rpartXse.v3 
## Task for estimating  mse,mae  using
##  8 x 10 - Fold Cross Validation
##   Run with seed =  1234 
## Iteration :********************************************************************************
## 
## 
## ++ MODEL/WORKFLOW :: rpartXse.v4 
## Task for estimating  mse,mae  using
##  8 x 10 - Fold Cross Validation
##   Run with seed =  1234 
## Iteration :********************************************************************************
## 
## 
## ++ MODEL/WORKFLOW :: rpartXse.v5 
## Task for estimating  mse,mae  using
##  8 x 10 - Fold Cross Validation
##   Run with seed =  1234 
## Iteration :********************************************************************************

Cross Validation - Example

rankWorkflows(r, top = 3)
## $Boston.medv
## $Boston.medv$mse
##      Workflow Estimate
## 1 rpartXse.v1 18.15666
## 2 rpartXse.v2 18.70882
## 3 rpartXse.v3 20.24466
## 
## $Boston.medv$mae
##      Workflow Estimate
## 1 rpartXse.v1 2.846930
## 2 rpartXse.v2 2.954981
## 3 rpartXse.v3 3.055003

Cross Validation - Example

getWorkflow("rpartXse.v1", r)
## Workflow Object:
##  Workflow ID       ::  rpartXse.v1 
##  Workflow Function ::  standardWF
##       Parameter values:
##       learner.pars  -> se=0 
##       learner  -> rpartXse

Cross Validation - Example

plot(r)

Custom Workflows

foo <- function(form, train, test, maxdep, cpar)
{
  # grow a tree with the given maximum depth and complexity parameter
  treemodel <- rpart(form, train,
                     control = rpart.control(maxdepth = maxdep, cp = cpar))
  predictions <- predict(treemodel, test)
  # a custom workflow must return the true and predicted values of the test set
  list(trues = responseValues(form, test),
       preds = predictions)
}

r <- performanceEstimation(
  PredTask(medv ~ ., Boston),
  workflowVariants(
    wf = "foo", 
    maxdep = c(2, 5, 8), 
    cpar = c(0.01, 0.005, 0.002, 0.001)),
  EstimationTask(
    metrics = "mse",
    method = CV(
      nFolds = 10, seed = 1001)))
## 
## 
## ##### PERFORMANCE ESTIMATION USING  CROSS VALIDATION  #####
## 
## ** PREDICTIVE TASK :: Boston.medv
## 
## ++ MODEL/WORKFLOW :: foo.v1 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v2 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v3 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v4 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v5 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v6 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v7 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v8 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v9 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v10 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v11 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********
## 
## 
## ++ MODEL/WORKFLOW :: foo.v12 
## Task for estimating  mse  using
##  1 x 10 - Fold Cross Validation
##   Run with seed =  1001 
## Iteration :**********

Cross Validation - Example

rankWorkflows(r)
## $Boston.medv
## $Boston.medv$mse
##   Workflow Estimate
## 1  foo.v11 20.94134
## 2  foo.v12 20.97372
## 3   foo.v8 21.05089
## 4   foo.v9 21.05812
## 5   foo.v5 22.54060
topPerformer(r, metric = "mse", task = "Boston.medv")
## Workflow Object:
##  Workflow ID       ::  foo.v11 
##  Workflow Function ::  foo
##       Parameter values:
##       maxdep  -> 5 
##       cpar  -> 0.001

Cross Validation - Example

plot(r)

Bootstrap Estimates

  • Construct the training set by random sampling with replacement (n draws from the n rows)
  • On average this selects about 63.2% of the distinct rows, since \(1 - 1/e \approx 0.632\)
  • The remaining rows (the out-of-bag cases) become the test set
  • Applied with many repetitions
  • Works best for small datasets
datasize <- 10000
training <- sample(datasize, replace = TRUE)
length(unique(training)) / datasize
## [1] 0.6296
head(training)
## [1] 7583 4572 2754 5903 1496 6312
test <- (1:datasize)[-unique(training)]
head(test)
## [1]  4  5  6  7 11 12
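Putting the pieces together, one full bootstrap repetition trains on the resampled rows and tests on the out-of-bag rows; Boston and rpart are again purely illustrative choices:

# one bootstrap repetition: train on the resample, test out-of-bag
data(Boston, package = "MASS")
idx   <- sample(nrow(Boston), replace = TRUE)  # sampling with replacement
oob   <- setdiff(1:nrow(Boston), idx)          # out-of-bag rows = test set
model <- rpart(medv ~ ., Boston[idx, ])
preds <- predict(model, Boston[oob, ])
mean((Boston$medv[oob] - preds)^2)             # one bootstrap MSE estimate

Repeating this many times and averaging the results gives the \(\epsilon_0\) estimate described next.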

Bootstrap Estimates

  • There are two versions
    • \(\epsilon_0\) estimates
      • the average of the k bootstrap (out-of-bag) error estimates
    • \(.632\) estimates
      • a weighted average of \(\epsilon_0\) and \(\epsilon_r\)
      • \(\epsilon_r\) is the resubstitution estimate, obtained by training on the full dataset and then testing on that same data, giving \[\epsilon_{.632} = 0.368\epsilon_r + 0.632\epsilon_0\]
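The combination itself is plain arithmetic; a sketch with purely hypothetical error values:

e0 <- 0.045                # hypothetical average of the k out-of-bag errors
er <- 0.010                # hypothetical resubstitution error
0.368 * er + 0.632 * e0    # the .632 estimate
## [1] 0.03212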

Bootstrap Estimates - Example

data(BreastCancer, package="mlbench")
bc <- cbind(knnImputation(BreastCancer[, -c(1,11)]), 
            Class = BreastCancer$Class)
r <- performanceEstimation(
  PredTask(Class ~ ., bc),
  workflowVariants(learner = "svm",
                   learner.pars = list(cost = c(1, 5, 10),
                                       gamma = c(0.01, 0.001))),
  EstimationTask(metrics = c("acc", "tnr"),
                 method = Bootstrap(nReps = 100, type = ".632")))

Bootstrap Estimates - Example

topPerformers(r, maxs = TRUE)
## $bc.Class
##     Workflow Estimate
## acc   svm.v3    0.966
## tnr   svm.v1    0.962
topPerformer(r, 
             max = TRUE, 
             metric = "acc",
             task = "bc.Class")
## Workflow Object:
##  Workflow ID       ::  svm.v3 
##  Workflow Function ::  standardWF
##       Parameter values:
##       learner.pars  -> cost=10 gamma=0.01 
##       learner  -> svm
topPerformer(r, 
             max = TRUE, 
             metric = "tnr",
             task = "bc.Class")
## Workflow Object:
##  Workflow ID       ::  svm.v1 
##  Workflow Function ::  standardWF
##       Parameter values:
##       learner.pars  -> cost=1 gamma=0.01 
##       learner  -> svm