credit scoring with the scorecard package

fcas80 · October 24, 2023, 2:12pm

Hi. I am trying to understand credit scoring with the scorecard package. The package converts independent variables into weight-of-evidence (WOE) bins, and calculates a baseline score and individual variable scores for each observation. My question is: how does it create the baseline score, especially since at this point I have not calculated probabilities for each observation? Thank you.

https://cran.r-project.org/web/packages/scorecard/scorecard.pdf

library(scorecard)
data("germancredit")
df <- data.frame(germancredit)
df$creditability <- ifelse(df$creditability == "bad", 0, 1)

# filter out variables with low IV

dt_sel = scorecard::var_filter(df, "creditability")
set.seed(1)
indexes = sample(1:nrow(df), size=0.7*nrow(df))
train = dt_sel[indexes,]
test = dt_sel[-indexes,]
bins = woebin(train, "creditability")
card = scorecard2(bins, dt = train, y = 'creditability')
print(card$basepoints)

323

nirgrahamuk · October 24, 2023, 2:19pm

It fits a GLM on the WOE data, according to the source code at: https://github.com/ShichenXie/scorecard/blob/master/R/scorecard.R

fcas80 · October 24, 2023, 7:50pm

Thank you, nirgrahamuk. You are correct. It seems the algorithm fit a GLM without my telling it to do so, presumably inside the scorecard2 call. I assume it used some default values. I need those default values and the GLM intercept. But now that I know what the algorithm did, I can recreate the baseline score:

library(scorecard)
data("germancredit")
df <- data.frame(germancredit)
df$creditability <- ifelse(df$creditability == "bad", 0, 1)

dt_sel = scorecard::var_filter(df, "creditability")
set.seed(1)
indexes = sample(1:nrow(df), size=0.7*nrow(df))
train = dt_sel[indexes,]
test = dt_sel[-indexes,]
bins = woebin(train, "creditability")

dt_woe_train = woebin_ply(train, bins) # creates data table with woe values
dt_woe_test = woebin_ply(test, bins)
model3 <- glm(creditability ~ ., family = binomial(), data = dt_woe_train) # these are WOE vars
s <- summary(model3)
b0 <- s$coefficients[1,1] # fitted intercept

card = scorecard2(bins, dt = train, y = 'creditability')
print(card$basepoints) # 323

points0 = 600 # target points at the target odds
odds0 = 1/19 # target odds
pdo = 50 # points to double the odds
b = pdo/log(2)
a = points0 + b*log(odds0)
Intercept = b0
basepoints = a - b*Intercept
basepoints # 323
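
For anyone reading this later, the arithmetic above can be wrapped in a small helper. This is only a sketch: `base_points` is a hypothetical name (it is not part of the scorecard package), and its default arguments simply mirror the `points0`, `odds0`, and `pdo` values used above. The intercept in the usage line is a made-up number for illustration, not the fitted value from this model.

```r
# Hypothetical helper (not from the scorecard package): reproduces the
# base-point arithmetic shown above for a given GLM intercept.
base_points <- function(intercept, points0 = 600, odds0 = 1/19, pdo = 50) {
  b <- pdo / log(2)              # points per unit of log-odds
  a <- points0 + b * log(odds0)  # offset so that odds0 maps to points0
  a - b * intercept              # unrounded base points
}

# Illustrative intercept only; with a fitted model, pass coef(model3)[1] instead.
base_points(0.5)
```

One way to sanity-check the scaling: lowering the intercept by `log(2)` should raise the base points by exactly `pdo`, since `b * log(2)` equals `pdo` by construction.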
