Datasets: Comparing character variables in the same dataset

TJ37043 · December 14, 2021, 9:15pm

Is there a way to run a time series analysis that compares the series for each value of a character variable (here, Region) without splitting the data into individual datasets?

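library(readr)

# Read the CSV, parsing EndDate as a date in month/day/year format
ClarksvilleRedFinData <- read_csv("ClarksvilleRedFinData.csv",
    col_types = cols(EndDate = col_date(format = "%m/%d/%Y")))
View(ClarksvilleRedFinData)  # opens the data viewer (interactive sessions only)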

print(ClarksvilleRedFinData)

## # A tibble: 442 x 11
##    Region      EndDate    Sales PendingSales Med_Sale_Price  Ppsf Off2w   SAL
##    <chr>       <date>     <dbl>        <dbl>          <dbl> <dbl> <chr> <dbl>
##  1 Clarksville 2012-07-01   266          151         151000  88.9 0.126 0.09 
##  2 Clarksville 2012-08-01   245          149         151000  89.9 0.154 0.078
##  3 Clarksville 2012-09-01   202          108         156000  89.8 0.148 0.064
##  4 Clarksville 2012-10-01   243          140         152000  89.7 0.2   0.049
##  5 Clarksville 2012-11-01   202          112         154000  88.3 0.205 0.064
##  6 Clarksville 2012-12-01   225           84         150000  86.2 0.19  0.08 
##  7 Clarksville 2013-01-01   144           83         146000  86.4 0.217 0.049
##  8 Clarksville 2013-02-01   196          119         149000  86.5 0.16  0.036
##  9 Clarksville 2013-03-01   222          134         148000  86.2 0.134 0.068
## 10 Clarksville 2013-04-01   254          190         150000  88.0 0.158 0.079
## # ... with 432 more rows, and 3 more variables: New_Listings <dbl>,
## #   Active_Listings <dbl>, DOM <dbl>

tail(ClarksvilleRedFinData)

## # A tibble: 6 x 11
##   Region        EndDate    Sales PendingSales Med_Sale_Price  Ppsf Off2w    SAL
##   <chr>         <date>     <dbl>        <dbl>          <dbl> <dbl> <chr>  <dbl>
## 1 ZipCode_37043 2021-05-01   404           95         326000  145. 66.30% 0.374
## 2 ZipCode_37043 2021-06-01   423          115         330000  147. 58.30% 0.442
## 3 ZipCode_37043 2021-07-01   443          134         330000  149. 51.50% 0.485
## 4 ZipCode_37043 2021-08-01   459          111         333000  153. 52.30% 0.479
## 5 ZipCode_37043 2021-09-01   450          108         338000  157. 37.00% 0.442
## 6 ZipCode_37043 2021-10-01   415           89         345000  160. 43.80% 0.39 
## # ... with 3 more variables: New_Listings <dbl>, Active_Listings <dbl>,
## #   DOM <dbl>
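
One thing visible in the two printouts: Off2w is read as <chr>, apparently because some rows store it as a proportion ("0.126") and others as a percentage string ("66.30%"). A minimal sketch of one way to put it on a single numeric scale before any analysis is below. It assumes that every value ending in "%" is a percentage and everything else is already a proportion, which the printouts suggest but do not prove. The names `off2w_raw`, `is_pct`, and `off2w` are only for illustration.

```r
library(readr)

# Values copied from the Off2w column in the two printouts above
off2w_raw <- c("0.126", "0.154", "66.30%", "58.30%")

# Assumption: entries ending in "%" are percentages, the rest are proportions
is_pct <- grepl("%$", off2w_raw)

# parse_number() drops the "%" sign and returns the numeric part
off2w <- parse_number(off2w_raw)

# Rescale the percentage entries so everything is on the 0-1 scale
off2w[is_pct] <- off2w[is_pct] / 100
```

On the full data the same steps could be applied to the column inside a `mutate()` call, after checking that no other formats occur.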

technocrat · December 14, 2021, 9:29pm

Yes. Use {tsibble}, which makes it easy to subset individual series from within the same data frame. See Hyndman.
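
A minimal sketch of that approach, not a full analysis. It uses a small stand-in data frame (`toy`, a hypothetical name) built from six rows of the printouts above, and it assumes that Region plus the month uniquely identifies each row, so Region can serve as the tsibble key. The other object names (`toy_ts`, `clarksville`, `region_means`) are also just for illustration.

```r
library(dplyr)
library(tsibble)

# Stand-in for ClarksvilleRedFinData: Region, EndDate and Sales values
# copied from the head and tail printouts above
toy <- tibble(
  Region  = rep(c("Clarksville", "ZipCode_37043"), each = 3),
  EndDate = as.Date(c("2012-07-01", "2012-08-01", "2012-09-01",
                      "2021-05-01", "2021-06-01", "2021-07-01")),
  Sales   = c(266, 245, 202, 404, 423, 443)
)

# One tsibble for all regions: Region is the key, the month is the index
toy_ts <- toy %>%
  mutate(Month = yearmonth(EndDate)) %>%
  select(-EndDate) %>%
  as_tsibble(key = Region, index = Month)

# Pull out a single series with an ordinary filter; no separate dataset needed
clarksville <- toy_ts %>% filter(Region == "Clarksville")

# Compare regions directly, for example mean Sales per region
region_means <- toy_ts %>%
  as_tibble() %>%
  group_by(Region) %>%
  summarise(mean_sales = mean(Sales))
```

The same pattern should carry over to the full data as long as each Region and month combination appears only once; `as_tsibble()` raises an error if it finds duplicates.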

TJ37043 · December 16, 2021, 8:02am

Thank you! I greatly appreciate your help!
