ANALYSIS - without cross-validation, 1000 trees

FUNCTION:

```
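library(randomForest)  # load once, outside the loops

# Fit a random forest for every seed / mtry / nodesize combination and
# store each 2 x 3 confusion matrix (two classes plus class.error).
InvestRandomForest2.f <- function(x, Y, NoOfSeeds, mtry.v, nodesize.v, ntree) {
  k <- length(mtry.v)
  m <- length(nodesize.v)
  Table.arr <- array(0, c(2, 3, NoOfSeeds, k, m))
  for (j in seq_len(NoOfSeeds)) {
    set.seed(99 + j)
    for (i in seq_len(k)) {
      for (ind in seq_len(m)) {
        # FitObj is local to this function and is overwritten on every pass
        FitObj <- randomForest(x, Y,
                               mtry = mtry.v[i],
                               nodesize = nodesize.v[ind],
                               ntree = ntree)
        Table.arr[, , j, i, ind] <- FitObj$confusion
      }
    }
  }
  Table.arr
}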
```

APPLY FUNCTION:

```
Table.arr <- InvestRandomForest2.f(x, Y, NoOfSeeds = 50,
                                   mtry.v = 1:5,
                                   nodesize.v = 1:20,
                                   ntree = 1000)
save(Table.arr, file = "Table.arr")
Table.arr
```

USING APPLY STATEMENT, PREPARE FOR GRAPHS

```
# Misclassification rate from a confusion matrix; column 3 (class.error) is dropped
Error2.f <- function(mat)
  1 - sum(diag(mat[, 1:2])) / sum(mat[, 1:2])
load("Table.arr")
Error.arr <- apply(Table.arr, 3:5, Error2.f)
Error.arr # 50 x 5 x 20 array: seeds x mtry values x nodesize values
```
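As a quick sanity check on Error2.f, here is a minimal sketch that applies it to the single confusion matrix printed in the FitObj output in this post. The matrix is typed in by hand from that output, and `conf` and `err` are names made up for the illustration.

```r
# Same definition as above, repeated so the snippet runs on its own
Error2.f <- function(mat)
  1 - sum(diag(mat[, 1:2])) / sum(mat[, 1:2])

# Confusion matrix copied by hand from the printed FitObj output;
# the third column is class.error and is ignored by Error2.f
conf <- matrix(c(167, 56, 33, 89, 0.1650000, 0.3862069), nrow = 2,
               dimnames = list(c("0", "1"), c("0", "1", "class.error")))

err <- Error2.f(conf)   # (33 + 56) / 345
round(err, 3)           # 0.258, in line with the 25.8% OOB estimate
```

So the function reproduces the overall error rate that randomForest reports for that one fit.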

DO MATPLOT GRAPHICS HERE
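One way the matplot step could look, as a rough sketch only. It assumes Error.arr has dimensions 50 x 5 x 20 (seeds x mtry x nodesize). So that it runs on its own, it uses a placeholder array of random numbers called `Error.demo`, which is not real output. `MeanErr` is likewise a name invented here.

```r
# Placeholder with the same shape as Error.arr (seeds x mtry x nodesize).
# The values are random numbers for illustration only, NOT real results.
set.seed(1)
Error.demo <- array(runif(50 * 5 * 20, min = 0.2, max = 0.3), c(50, 5, 20))

# Average over the seed dimension: one row per mtry, one column per nodesize
MeanErr <- apply(Error.demo, c(2, 3), mean)

# matplot draws one line per column, so transpose to get one line per mtry
matplot(1:20, t(MeanErr), type = "l", lty = 1, col = 1:5, lwd = 2,
        xlab = "nodesize", ylab = "mean OOB error rate")
legend("topright", legend = paste("mtry =", 1:5),
       lty = 1, col = 1:5, lwd = 2)
```

With the real data, Error.arr would take the place of Error.demo.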

FitObj OUTPUT SPECIFIES 500 TREES, NOT 1000?

```
FitObj

Call:
 randomForest(x = x, y = Y, type = "prob")
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2

        OOB estimate of error rate: 25.8%
Confusion matrix:
    0  1 class.error
0 167 33   0.1650000
1  56 89   0.3862069
```

PLOT ALSO SHOWS 500 TREES, NOT 1000?

```
plot(FitObj, lwd = 2)
abline(h = 0.26, lty = 3, lwd = 2)
legend(x = "topright",
       legend = c("without cross-validation", "1000 trees"),
       lty = c(1),    # Line types
       col = c(3, 1), # Line colors
       lwd = 2)
```

OUTPUT:

```
Call:
 randomForest(x = x, y = Y, type = "prob")
               Type of random forest: classification
                     Number of trees: 500      <- should be 1000?
No. of variables tried at each split: 2

        OOB estimate of error rate: 25.8%
Confusion matrix:
    0  1 class.error
0 167 33   0.1650000
1  56 89   0.3862069
```

WHAT IS GOING ON HERE?
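One thing worth checking, offered as a guess based only on what is shown: the Call line in the printed output is `randomForest(x = x, y = Y, type = "prob")`, which has no ntree, mtry or nodesize arguments and so does not match the call inside InvestRandomForest2.f. That would be consistent with the FitObj printed at the top level being a different object, left over from an earlier fit with the default number of trees. The FitObj created inside the function is local and is never returned. The sketch below illustrates the scoping behaviour in base R. `fake_fit` and `run_fits` are hypothetical stand-ins, not part of randomForest.

```r
# Hypothetical stand-in for a model fit: it only records the ntree it was given
fake_fit <- function(ntree = 500) list(ntree = ntree)

# A top-level object created earlier with the default setting
FitObj <- fake_fit()

# Mirrors the structure of InvestRandomForest2.f: this FitObj is local
run_fits <- function(ntree) {
  FitObj <- fake_fit(ntree = ntree)  # does not touch the top-level FitObj
  FitObj$ntree
}

inside <- run_fits(ntree = 1000)
inside        # 1000: the value used inside the function
FitObj$ntree  # 500: the top-level object is unchanged
```

If this is what happened, inspecting `FitObj$call` and `FitObj$ntree` at the top level should show the earlier call. Having the function also return one of its fitted objects would allow a direct check of the 1000-tree fits.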