Hello,
I'm using Shiny to build a small app that lets users experiment with some variables and fit Logistic Regression, Classification Tree, Random Forest, and kNN models on a dataset.
I can reference, use, and render output to the UI for a non-pruned tree. When I attempt the same with a pruned tree, something seems to happen to the dataset object they both use, and the pruned version always returns 'object trainSet not found'.
Please take a look; I appreciate any time spent on this. I may be missing a reactivity concept.
Side note: I'm unsure how to store multiple variables in a single reactive function so they can be referenced later, so you may see redundant code (tips here would be great).
I'm working from a CSV, but for reproducibility I've recreated an equivalent data frame in place of the read_csv command.
# Packages used by the code in this post
library(shiny)
library(caret)  # createDataPartition()
library(tree)   # tree(), cv.tree(), prune.misclass()
library(dplyr)  # %>%, group_by(), arrange()

shinyServer(function(input, output, session) {

  getData <- reactive({
    # Stand-in for the CSV: an equivalent data frame
    npData <- data.frame(
      match_id = c(1234, 5678, 1245),
      start_time = c(1657339909, 1657357190, 1657366555),
      win = c('false', 'false', 'true'),
      hero_id = '53',
      account_id = c(2345438, 259803, 438689),
      leaguename = c('Titus', 'Destiny', 'Ultras'),
      gold_per_min = c(569, 549, 654),
      net_worth = c(15770, 15317, 16985),
      gold = c(1140, 1559, 3210),
      kills = c(4, 12, 7),
      tower_damage = c(1245, 19823, 4599),
      duration = c(1754, 1759, 1829),
      lane = c(1, 3, 3),
      lane_role = c(1, 3, 3)
    )
    npData$gold <- as.numeric(npData$gold)

    # Convert the response and the lane columns to factors for modelling
    dataModel <- data.frame(npData)
    dataModel$win <- as.factor(dataModel$win)
    dataModel$lane <- as.factor(dataModel$lane)
    dataModel$lane_role <- as.factor(dataModel$lane_role)
    dataModel
  })
The section below takes inputs from the user and subsets the data accordingly. It is largely irrelevant to the problem. For reproducibility, I've added comments to the right with realistic values that simulate a user's inputs.
  getTrainSetSize <- reactive({ # Training set size from the UI
    trainSetSize <- input$train # .3
    trainSetSize
  })

  output$trainSetSize <- renderText({
    trainSetSize <- getTrainSetSize()
    trainSetSize
  })

  getModelPreds <- reactive({ # Variables selected by the user for modelling from the UI
    keepVars <- input$preds # c('duration','kills','gold_per_min')
    keepVars
  })

  output$modelPreds <- renderText({
    modelPreds <- getModelPreds()
    modelPreds
  })

  getDataModel <- eventReactive(input$modelButton, { # When the user clicks the model button
    keepVars <- getModelPreds()
    keepVars <- c(keepVars, "win")
    dataModel <- getData()
    dataModel <- dataModel[, names(dataModel) %in% keepVars]
    dataModel
  })

  output$dataModel <- renderDataTable({
    dataModel <- getDataModel()
    dataModel
  })
The next section partitions the data into a training set and a testing set.
  dataIndex <- reactive({ # Create our index for training from getDataModel()
    dataModel <- getDataModel()
    trainSetSize <- getTrainSetSize()
    index <- createDataPartition(dataModel$win, p = trainSetSize, list = FALSE)
    index
  })

  output$dataIndex <- renderDataTable({
    index <- dataIndex()
    index
  })

  dataTrain <- reactive({ # Set our training data to a variable
    dataModel <- getDataModel()
    dataIndex <- dataIndex()
    dataTrain <- dataModel[dataIndex, ]
    dataTrain
  })

  output$dataTrain <- renderDataTable({
    dataTrain <- dataTrain()
    dataTrain
  })

  dataTest <- reactive({ # Set our testing data to a variable
    dataModel <- getDataModel()
    dataIndex <- dataIndex()
    dataTest <- dataModel[-dataIndex, ]
    dataTest
  })

  output$dataTest <- renderDataTable({
    dataTest <- dataTest()
    dataTest
  })
Note that the dataTrain object above returns perfectly fine and is viewable inside the app.
Now we are fitting our classification tree.
  classTree <- reactive({ # Create our model
    trainSet <- dataTrain()
    classTreeFit <- tree(win ~ ., data = trainSet)
    classTreeFit
  })

  output$classTreeSumm <- renderPrint({ # Output summary of model
    classTreeSum <- classTree()
    summary(classTreeSum)
  })
Note that the code chunk above also works perfectly fine in the app.
Now for the pruning.
  pruneTree <- reactive({ # Creating the pruned model
    classTreeFit <- classTree()
    pruneFit <- cv.tree(classTreeFit, FUN = prune.misclass)
    pruneFit
  })

  pruneStats <- reactive({ # Generate best pruned model
    pruneFit <- pruneTree()
    # classTreeFit was never defined inside this reactive, so fetch it here
    classTreeFit <- classTree()
    # Ordering things so that the best value is always in the first slot of dfPruneFit$size
    dfPruneFit <- cbind(size = pruneFit$size, dev = pruneFit$dev)
    dfPruneFit <- data.frame(dfPruneFit)
    dfPruneFit <- dfPruneFit %>% group_by(size) %>% arrange(size) %>% arrange(dev)
    bestVal <- dfPruneFit$size[1]
    pruneFitFinal <- prune.misclass(classTreeFit, best = bestVal)
    pruneFitFinal
  })
  pruneProgression <- reactive({ # Create the progression of nodes to show viewers how pruning works
    pruneFit <- pruneTree()
    # Ordering things so that the best value is always in the first slot of dfPruneFit$size
    dfPruneFit <- cbind(size = pruneFit$size, dev = pruneFit$dev)
    dfPruneFit <- data.frame(dfPruneFit)
    dfPruneFit <- dfPruneFit %>% group_by(size) %>% arrange(size) %>% arrange(dev)
    dfPruneFit
  })

  output$pruneProg <- renderDataTable({ # Output progression of the tree fit
    pruneProg <- pruneProgression()
    pruneProg
  })
When I attempt to render this, the app reports that it cannot find the object trainSet.
Following the chain of calls from latest to earliest, it goes: pruneProgression() -> pruneTree() -> classTree() -> dataTrain()
dataTrain() is the origin point for the trainSet object.
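To narrow this down, here is a minimal sketch outside Shiny that mirrors the chain above. It assumes the tree package is installed, uses the built-in iris data in place of my dataset, and fitLocalTree() is a hypothetical stand-in for the classTree() reactive. My guess, which I have not confirmed, is that cv.tree() re-evaluates the stored model call somewhere trainSet no longer exists, which would explain why summary() works but pruning does not.

```r
library(tree)  # assumes the tree package is installed

# Hypothetical stand-in for the classTree() reactive:
# trainSet exists only inside this function's environment.
fitLocalTree <- function() {
  trainSet <- iris  # built-in data, used purely for illustration
  tree(Species ~ ., data = trainSet)
}

fit <- fitLocalTree()
print(summary(fit))  # relies only on what is stored in the fitted object

# cv.tree() needs the training data again. If it gets there by
# re-evaluating the stored call, trainSet is out of scope at this point.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  cv.tree(fit, FUN = prune.misclass),
  error = function(e) conditionMessage(e)
)
print(result)
```

If result comes back as an error message mentioning trainSet, the problem would be about where the data lives when cv.tree() runs rather than about Shiny reactivity itself.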
Here is the repo if you want to play with the code in its entirety: DATA607/Projects/Final Project - Predicting Wins/DotaNP at main · d-ev-craig/DATA607 · GitHub
Thanks for any help