Every R problem can usefully be thought of as the interaction of three objects: an existing object, x; a desired object, y; and a function, f, that will return a value of y given x as an argument. In other words, school algebra: f(x) = y. Any of the objects can be composites.
In this case, x is your database and y is your database augmented by an additional variable, an intercept value from a regression model. Both x and y are data frames: each contains observations of an object of interest, crsp_fundno, arranged row-wise, along with variables, some of which will be used as arguments to lm. That call returns an object of class lm, call it fit, containing the value of interest, the intercept, fit$coefficients[1].
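To make the shape of fit concrete, here is a minimal sketch using the built-in mtcars data as a stand-in (an assumption for illustration only; it has nothing to do with fund returns). It only shows where the intercept lives in an object of class lm.

```r
# mtcars ships with R; it is a stand-in here, not the poster's data
fit <- lm(mpg ~ wt + hp, data = mtcars)

class(fit)                  # the fitted model is an object of class "lm"
names(fit$coefficients)     # the intercept is the first coefficient
fit$coefficients[1]         # extract by position, as in the post
coef(fit)[["(Intercept)"]]  # the same value, extracted by name
```

Extracting by name is a little more robust than extracting by position, since a formula written without an intercept (for example with `- 1`) has no such coefficient.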
Using these pieces we can construct f.
The first thing to note is that functions are first-class objects, which means that they can be given as arguments to other functions. It is convenient to work inside outwards and to create an auxiliary function:
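get_intercept <- function(x) {
  # x is a crsp_fundno; fit only that fund's rows
  # (assumes mexret is a column of your_data)
  fit <- lm(mexret ~ Mkt_RF + SMB + HML,
            data = your_data[your_data$crsp_fundno == x, ])
  fit$coefficients[1]  # the first coefficient is the intercept
}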
NB: syntactic variable names cannot contain blanks or operators, so Mkt-RF is changed to Mkt_RF. Also, we would normally parameterize your_data and the other arguments, rather than hardwiring them.
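As a rough sketch of what parameterizing might look like, the variant below takes the data frame and the model formula as arguments. The name get_intercept_from and the simulated toy data frame are hypothetical, introduced only so the example runs on its own; the column names follow the post.

```r
# Hypothetical toy data standing in for your_data; the values are simulated
set.seed(42)
toy <- data.frame(
  crsp_fundno = rep(c(101, 102), each = 24),
  Mkt_RF = rnorm(48),
  SMB = rnorm(48),
  HML = rnorm(48)
)
toy$mexret <- 0.5 + 1.2 * toy$Mkt_RF + rnorm(48, sd = 0.1)

# Parameterized variant: the fund, the data, and the model are all arguments
get_intercept_from <- function(fund, data,
                               model = mexret ~ Mkt_RF + SMB + HML) {
  rows <- data[data$crsp_fundno == fund, ]  # keep only this fund's rows
  fit <- lm(model, data = rows)
  unname(coef(fit)[1])                      # intercept as a bare number
}

get_intercept_from(101, toy)
```

Because nothing is hardwired, the same function can be pointed at a different data frame or a different factor model without editing its body.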
get_intercept takes an argument, x (the crsp_fundno of interest, distinct from the formal object x above), and returns the value of a linear regression's intercept coefficient, which is the portion of fit to add to each selected crsp_fundno.
Thus
get_intercept(64487)
will return the value of the intercept to be placed, the Fama-French three-factor alpha, which I'll call ff3fa. It is best to provision this new variable beforehand:
your_data[, "ff3fa"] <- NA
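Another helper function will make the placement. Because it modifies your_data, which lives outside the function, it needs the superassignment operator <<- rather than =:
place_intercept <- function(x)
  your_data[your_data$crsp_fundno == x, "ff3fa"] <<- get_intercept(x)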
We now have a way to place the intercept for a single crsp_fundno into y:
place_intercept(64487)
An auxiliary object, fund_list, can be used to identify the specific crsp_fundno values to be processed:
fund_list <- c(
97403,62638,98168,92509,93172,69885,87073,51929,
81727,64998,68432,87733,78200,92599,59821,59391,
51450,56856,94761,65606,60274,94622,50572,65734,
91201,59542,72588,87752,97495,62544,90312,81084,
83960,84608,70966,80280,74213,98558,66360,61703,
96572,98795,71403,94230,90321,81786,85710,92169
)
From there
lapply(fund_list, place_intercept)
which leads to f and its application
add_intercepts <- function(x) lapply(x, place_intercept)
add_intercepts(fund_list)
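For comparison, here is a sketch of the same f written without side effects: it takes x and a vector of funds and returns y as a new data frame, instead of modifying a global object. The names intercept_for, add_intercepts_to, toy and toy_funds are hypothetical, and the data are simulated; swapping vapply in for lapply is my choice here, not something the steps above require.

```r
# Simulated stand-in for your_data, with three made-up fund numbers
set.seed(1)
toy <- data.frame(
  crsp_fundno = rep(c(101, 102, 103), each = 24),
  Mkt_RF = rnorm(72), SMB = rnorm(72), HML = rnorm(72)
)
toy$mexret <- 0.3 + 0.9 * toy$Mkt_RF + rnorm(72, sd = 0.1)

# Intercept of the three-factor regression for one fund
intercept_for <- function(fund, data) {
  fit <- lm(mexret ~ Mkt_RF + SMB + HML,
            data = data[data$crsp_fundno == fund, ])
  unname(coef(fit)[1])
}

# f: returns the data frame augmented with ff3fa
add_intercepts_to <- function(funds, data) {
  # one regression per fund; vapply insists on one number per fund
  alphas <- vapply(funds, intercept_for, numeric(1), data = data)
  # rows of funds that were not processed get NA
  data$ff3fa <- alphas[match(data$crsp_fundno, funds)]
  data
}

toy_funds <- c(101, 103)
toy_y <- add_intercepts_to(toy_funds, toy)
head(toy_y)
```

The original toy is left untouched, which makes it easier to rerun the step with a different list of funds.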
See the FAQ: How to do a minimal reproducible example (reprex) for beginners, which explains why this specific code may not be reliable in the absence of a representative data object on which to test it. Also, I express no opinion as to the appropriateness of any intended application of the intercept in this case.