Why and how to use random forest variable importance measures
(and how you shouldn't)

Carolin Strobl (LMU München) and Achim Zeileis (WU Wien)
[email protected]
useR! 2008, Dortmund
Introduction

Random forests
- have become increasingly popular in, e.g., genetics and the neurosciences
  [imagine a long list of references here]
- can deal with "small n, large p" problems, high-order interactions, and
  correlated predictor variables
- are used not only for prediction, but also to assess variable importance
(Small) random forest

[Figure: plots of the individual classification trees in a small random forest;
each tree splits on the predictors Start, Age, and Number (all splits with p < 0.001).]
Construction of a random forest

- draw ntree bootstrap samples from the original sample
- fit a classification tree to each bootstrap sample => ntree trees

This creates a diverse set of trees because
- trees are unstable w.r.t. changes in the learning data
  => ntree different-looking trees (bagging)
- mtry splitting variables are randomly preselected in each split
  => ntree even more different-looking trees (random forest)
Random forests in R

randomForest (pkg: randomForest)
- reference implementation based on CART trees
  (Breiman, 2001; Liaw and Wiener, 2008)
- (-) for variables of different types: biased in favor of continuous variables
  and variables with many categories (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)

cforest (pkg: party)
- based on unbiased conditional inference trees (Hothorn, Hornik, and Zeileis, 2006)
- (+) for variables of different types: unbiased when subsampling, instead of
  bootstrap sampling, is used (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)
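As a quick illustration (not part of the original slides), the following sketch fits both forest flavours on the kyphosis data from the rpart package; that data set, the seed, and the ntree/mtry settings are assumptions chosen only because its predictors Age, Number, and Start match the tree plots shown earlier. Both objects are reused in later sketches.

library("randomForest")
library("party")
data("kyphosis", package = "rpart")

set.seed(42)  # arbitrary seed for reproducibility

## CART-based reference implementation
rf <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   ntree = 500, importance = TRUE)

## conditional inference forest; cforest_unbiased() uses subsampling
## (without replacement), the setting recommended on this slide
cf <- cforest(Kyphosis ~ Age + Number + Start, data = kyphosis,
              controls = cforest_unbiased(ntree = 500, mtry = 2))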
Measuring variable importance

Gini importance
- mean Gini gain produced by Xj over all trees
- obj <- randomForest(..., importance=TRUE)
  obj$importance, column MeanDecreaseGini
  importance(obj, type=2)
- (-) for variables of different types: biased in favor of continuous variables
  and variables with many categories
Measuring variable importance

permutation importance
- mean decrease in classification accuracy after permuting Xj, over all trees
- obj <- randomForest(..., importance=TRUE)
  obj$importance, column MeanDecreaseAccuracy
  importance(obj, type=1)
- obj <- cforest(...)
  varimp(obj)
- (+) for variables of different types: unbiased only when subsampling is used,
  as in cforest(..., controls = cforest_unbiased())
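A consolidated usage sketch of these calls, assuming the objects rf and cf fitted above:

## Gini importance (randomForest only)
importance(rf, type = 2)                 # column MeanDecreaseGini

## permutation importance
importance(rf, type = 1)                 # MeanDecreaseAccuracy, scaled by default
importance(rf, type = 1, scale = FALSE)  # raw, unscaled version

## permutation importance of the conditional inference forest
varimp(cf)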
The permutation importance
within each tree t

VI^{(t)}(x_j) = \frac{\sum_{i \in \bar{B}^{(t)}} I(y_i = \hat{y}_i^{(t)})}{|\bar{B}^{(t)}|}
              - \frac{\sum_{i \in \bar{B}^{(t)}} I(y_i = \hat{y}_{i,\pi_j}^{(t)})}{|\bar{B}^{(t)}|}

where \bar{B}^{(t)} is the out-of-bag sample of tree t,
\hat{y}_i^{(t)} = f^{(t)}(x_i) = predicted class before permuting,
\hat{y}_{i,\pi_j}^{(t)} = f^{(t)}(x_{i,\pi_j}) = predicted class after permuting X_j, with
x_{i,\pi_j} = (x_{i,1}, \ldots, x_{i,j-1}, x_{\pi_j(i),j}, x_{i,j+1}, \ldots, x_{i,p}).

Note: VI^{(t)}(x_j) = 0 by definition if X_j is not in tree t.
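To make the formula concrete, here is a didactic, hand-rolled sketch of VI^{(t)} for a single tree. It is not the package's own implementation; the refit with keep.inbag, the helper name vi_tree, and the choice of variable and tree are all assumptions for illustration.

set.seed(42)
rf_ib <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis,
                      ntree = 500, keep.inbag = TRUE)

vi_tree <- function(forest, data, y, j, t) {
  oob <- forest$inbag[, t] == 0              # out-of-bag sample of tree t
  d_perm <- data
  d_perm[oob, j] <- sample(d_perm[oob, j])   # permute X_j within the OOB sample
  ## per-tree predictions before and after permuting X_j
  pred      <- predict(forest, data,   predict.all = TRUE)$individual[, t]
  pred_perm <- predict(forest, d_perm, predict.all = TRUE)$individual[, t]
  ## OOB accuracy before minus OOB accuracy after permuting
  mean(pred[oob] == y[oob]) - mean(pred_perm[oob] == y[oob])
}

vi_tree(rf_ib, kyphosis, kyphosis$Kyphosis, j = "Age", t = 1)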
The permutation importance
over all trees

1. raw importance

VI(x_j) = \frac{\sum_{t=1}^{ntree} VI^{(t)}(x_j)}{ntree}

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=FALSE)
The permutation importance
over all trees

2. scaled importance (z-score)

z_j = \frac{VI(x_j)}{\hat{\sigma} / \sqrt{ntree}}

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=TRUE) (the default)
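A small, hedged check of this relation in randomForest, assuming the object rf from above and relying on the documented importanceSD slot, which (if I read the package correctly) stores the per-variable standard error sigma_hat / sqrt(ntree):

imp_raw    <- importance(rf, type = 1, scale = FALSE)[, 1]
imp_scaled <- importance(rf, type = 1, scale = TRUE)[, 1]

## the scaled z-score should equal the raw importance divided by its standard error
all.equal(imp_scaled, imp_raw / rf$importanceSD[, "MeanDecreaseAccuracy"])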
Tests for variable importance
for variable selection purposes

- Breiman and Cutler (2008): simple significance test based on normality of the z-score
  => randomForest, scale=TRUE + (1 - α)-quantile of N(0,1)
- Diaz-Uriarte and Alvarez de Andres (2006): backward elimination (throw out the
  least important variables until the out-of-bag prediction accuracy drops)
  => varSelRF (pkg: varSelRF), depends on randomForest
- Diaz-Uriarte (2007) and Rodenburg et al. (2008): plots and significance test
  (randomly permute the response values to mimic the overall null hypothesis that
  none of the predictor variables is relevant = baseline)
Tests for variable importance

problems of these approaches:
- (at least) Breiman and Cutler (2008): strange statistical properties
  (Strobl and Zeileis, 2008)
- all: preference for correlated predictor variables (see also Nicodemus and
  Shugart, 2007; Archer and Kimes, 2008)
Breiman and Cutler's test

under the null hypothesis of zero importance:

z_j \sim N(0, 1) (asymptotically)

=> if z_j exceeds the (1 - α)-quantile of N(0,1), reject the null hypothesis of
zero importance for variable X_j
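A minimal sketch of this naive test, assuming rf from above and a significance level alpha = 0.05 (shown only to make the procedure concrete; the following slides argue against using it):

alpha <- 0.05
z <- importance(rf, type = 1, scale = TRUE)[, 1]   # z-scores
## reject H0 of zero importance where z exceeds the (1 - alpha) quantile of N(0, 1)
z > qnorm(1 - alpha)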
Raw importance

[Figure: mean raw permutation importance plotted against the relevance of the
variable (0 to 0.4), in panels for ntree = 100, 200, 500 and sample sizes
100, 200, 500.]
z-score and power

[Figure: z-score and power of the test plotted against the relevance of the
variable (0 to 0.4), in panels for ntree = 100, 200, 500 and sample sizes
100, 200, 500.]
Findings

The z-score and the power of the test are affected by
- an increase in ntree
- a decrease in sample size

=> rather use the raw, unscaled permutation importance!
importance(obj, type=1, scale=FALSE)
varimp(obj)
What null hypothesis were we testing in the first place?

obs    Y      X_j (permuted)       Z
1      y_1    x_{\pi_j(1),j}       z_1
...    ...    ...                  ...
i      y_i    x_{\pi_j(i),j}       z_i
...    ...    ...                  ...
n      y_n    x_{\pi_j(n),j}       z_n

H_0: X_j \perp Y, Z   (i.e., X_j \perp Y and X_j \perp Z)

P(Y, X_j, Z) \overset{H_0}{=} P(Y, Z) \cdot P(X_j)
What null hypothesis were we testing in the first place?

- the current null hypothesis reflects independence of X_j from both Y and the
  remaining predictor variables Z
- a high variable importance can result from a violation of either one!
Suggestion: Conditional permutation scheme

obs    Y       X_j (permuted within Z)    Z
1      y_1     x_{\pi_{j|Z=a}(1),j}       z_1 = a
3      y_3     x_{\pi_{j|Z=a}(3),j}       z_3 = a
27     y_27    x_{\pi_{j|Z=a}(27),j}      z_27 = a
6      y_6     x_{\pi_{j|Z=b}(6),j}       z_6 = b
14     y_14    x_{\pi_{j|Z=b}(14),j}      z_14 = b
33     y_33    x_{\pi_{j|Z=b}(33),j}      z_33 = b
...    ...     ...                        ...

H_0: X_j \perp Y \mid Z

P(Y, X_j \mid Z) \overset{H_0}{=} P(Y \mid Z) \cdot P(X_j \mid Z)
or equivalently  P(Y \mid X_j, Z) \overset{H_0}{=} P(Y \mid Z)
Technically

- use any partition of the feature space for conditioning
- here: use the binary partition already learned by the tree
  (use the cutpoints as bisectors of the feature space)
- condition on correlated variables or select some

=> Strobl et al. (2008)
available in cforest from version 0.9-994: varimp(obj, conditional = TRUE)
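Usage sketch, assuming the cforest object cf fitted above; conditional = TRUE applies the conditional permutation scheme described on the previous slides:

set.seed(42)
varimp(cf)                       # marginal (unconditional) permutation importance
varimp(cf, conditional = TRUE)   # conditional permutation importance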
Simulation study

dgp: y_i = \beta_1 x_{i,1} + \ldots + \beta_{12} x_{i,12} + \varepsilon_i,
     \varepsilon_i \overset{i.i.d.}{\sim} N(0, 0.5)

X_1, \ldots, X_{12} \sim N(0, \Sigma) with \Sigma_{jj} = 1,
\Sigma_{jk} = 0.9 for j \neq k \in \{1, 2, 3, 4\}, and \Sigma_{jk} = 0 otherwise
(i.e., only the first four predictors are block-correlated)

X_j:    X_1   X_2   X_3   X_4   X_5   X_6   X_7   X_8  ...  X_12
beta_j:   5     5     2     0    -5    -5    -2     0  ...     0
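A sketch of this data-generating process in R, using the coefficient vector as given above; the sample size, the seed, and reading N(0, 0.5) as a standard deviation of 0.5 are assumptions, and the short cforest fit at the end is only meant to mirror the comparison shown on the next slide:

library("mvtnorm")   # for rmvnorm()
set.seed(42)
p <- 12
n <- 500                                      # assumed sample size
Sigma <- diag(p)
Sigma[1:4, 1:4] <- 0.9                        # block correlation among X1-X4
diag(Sigma) <- 1
beta <- c(5, 5, 2, 0, -5, -5, -2, rep(0, 5))
X <- rmvnorm(n, sigma = Sigma)
y <- drop(X %*% beta) + rnorm(n, sd = 0.5)
dat <- data.frame(y = y, X)

cf_sim <- cforest(y ~ ., data = dat,
                  controls = cforest_unbiased(ntree = 100, mtry = 3))
varimp(cf_sim)                      # unconditional importance
varimp(cf_sim, conditional = TRUE)  # conditional importance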
Results

[Figure: simulated variable importance for the twelve predictor variables,
in panels for mtry = 1, 3, 8.]
Peptide-binding data

[Figure: conditional vs. unconditional permutation importance for the
peptide-binding data, with predictors including h2y8, flex8, and pol3.]
Summary

- if your predictor variables are of different types:
  use cforest (pkg: party) with the default option controls = cforest_unbiased()
  and the permutation importance varimp(obj)

- otherwise: feel free to use cforest (pkg: party)
  with the permutation importance varimp(obj),
  or randomForest (pkg: randomForest)
  with the permutation importance importance(obj, type=1)
  or the Gini importance importance(obj, type=2),
  but don't fall for the z-score! (i.e., set scale=FALSE)

- if your predictor variables are highly correlated:
  use the conditional importance in cforest (pkg: party)
References

Archer, K. J. and R. V. Kimes (2008). Empirical characterization of random forest
variable importance measures. Computational Statistics & Data Analysis 52(4), 2249-2260.

Breiman, L. (2001). Random forests. Machine Learning 45(1), 5-32.

Breiman, L. and A. Cutler (2008). Random forests classification manual. Website
accessed in 1/2008; http://www.math.usu.edu/~adele/forests.

Breiman, L., A. Cutler, A. Liaw, and M. Wiener (2006). Breiman and Cutler's Random
Forests for Classification and Regression. R package version 4.5-16.

Diaz-Uriarte, R. (2007). GeneSrF and varSelRF: A web-based tool and R package for
gene selection and classification using random forest. BMC Bioinformatics 8:328.

Hothorn, T., K. Hornik, and A. Zeileis (2006). Unbiased recursive partitioning:
A conditional inference framework. Journal of Computational and Graphical
Statistics 15(3), 651-674.

Strobl, C., A.-L. Boulesteix, A. Zeileis, and T. Hothorn (2007). Bias in random
forest variable importance measures: Illustrations, sources and a solution.
BMC Bioinformatics 8:25.

Strobl, C. and A. Zeileis (2008). Danger: High power! Exploring the statistical
properties of a test for random forest variable importance. In Proceedings of the
18th International Conference on Computational Statistics, Porto, Portugal.

Strobl, C., A.-L. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis (2008).
Conditional variable importance for random forests. BMC Bioinformatics 9:307.