I pulled some monthly operating-profitability data from the Fama-French website. My code, which works, is below.
I have 7 columns of data with headers, including Date. I have successfully graphed one column, but I would like to add several of the other columns to the same chart, each in its own color, and I'm struggling to add them. Suggestions? Thank you.
# Step 1: Set Working Directory
setwd("c:/Users/Dean/OneDrive/Desktop")
# Step 2: Call up Working Directory
getwd()
# Step 3: Call up packages and libraries
install.packages("tidyverse")
library(tidyverse)
install.packages("rio")
library(rio)
install.packages("dplyr")
library("dplyr")
# Step 4: Read FF_OP_Analysis_Since_1963
read.csv("c:/Users/Dean/OneDrive/Desktop/FF_OP_Analysis_Since_1963.csv")
read.csv("FF_OP_Analysis_Since_1963.csv")
# Step 5: Create dataframe
df <- read.csv("FF_OP_Analysis_Since_1963.csv")
# Install ggplot2 (if not already installed)
install.packages("ggplot2")
# Load ggplot2
library(ggplot2)
# Example: Calculate Operating Profit by Date
ggplot(data = df, aes(Date, SMALL_LoOP)) +
  geom_line()
Here are the column names as well: Date, SMALL_LoOP, ME1_OP2, ME1_OP9, SMALL_HiOP, BIG_LoOP, BIG_HiOP
FJCC
January 7, 2024, 6:38pm
3
Try the following. I don't have your data, so I couldn't test the code.
df |> select(Date, SMALL_LoOP, ME1_OP2, ME1_OP9, SMALL_HiOP, BIG_LoOP, BIG_HiOP) |>
  pivot_longer(cols = -Date, names_to = "Variable", values_to = "Value") |>
  ggplot(aes(x = Date, y = Value, color = Variable))
That looks like it set up the table properly, but no lines appear on the chart. It is empty.
The data is a CSV file saved from Excel. Here is a snippet of the table:
Date    SMALL_LoOP   ME1_OP2   ME1_OP9  SMALL_HiOP  BIG_LoOP  BIG_HiOP
196307     -0.9657   -0.8294   -2.2346      2.5642   -1.0687    1.1313
196308      2.4333    2.6256   -0.6244      5.0913    6.2374    5.7975
196309     -2.1246    -0.031    3.6577     -0.7201   -4.9282   -0.5708
196310       1.549    -1.517    4.5372     -0.5986   -2.0294   10.3195
196311     -4.8866    -0.789   -5.7208      -3.999    1.6446   -4.3612
196312      -3.785   -1.3718   -0.1347     -2.1368     3.169    1.6677
196401      7.5529     5.147    2.9066      4.8603    0.2483    3.5718
196402       3.918    4.7447    2.8176     -0.2699    6.1887    2.1659
196403      2.7378   -0.1299    2.3133      3.7433   -1.7251    1.8864
196404     -0.6952   -0.7992   -0.2116     -1.4397    4.1876    1.3738
196405      1.5796    0.7386    2.2588     -1.7503    2.9169    3.6611
196406      2.3038   -3.4242     1.113        3.76    4.4891    0.2734
196407      3.8523    6.0508    1.8796      5.7071   -0.9219    2.3778
196408     -0.3194    2.6002   -2.3171     -1.8086   -5.4859   -1.1284
196409      2.4605    9.7302    5.6853      8.2176    1.8623    2.3715
196410      3.7606   -0.1009    1.3762      2.6976     1.373   -0.6697
196411     -1.4074   -0.9774   -2.6125      1.3333   -6.1009   -1.3031
196412     -1.5714    0.6171   -1.0029     -1.7434   -4.1665    1.2899
196501      8.4073    7.4467    12.068      7.2627    6.9792    6.1292
Here is a copy of the output now...
FJCC
January 8, 2024, 12:40am
6
Oops, I had a copy and paste failure. Try this:
df |> select(Date, SMALL_LoOP, ME1_OP2, ME1_OP9, SMALL_HiOP, BIG_LoOP, BIG_HiOP) |>
  pivot_longer(cols = -Date, names_to = "Variable", values_to = "Value") |>
  ggplot(aes(x = Date, y = Value, color = Variable)) + geom_line()
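One thing to watch with the plot above: Date is read as an integer such as 196307, so ggplot treats it as a plain number, and the step from 196312 to 196401 is drawn as a jump of 89 rather than one month. Here is a minimal sketch of one way around that, assuming the YYYYMM layout shown in the snippet. The toy data frame (a few rows and two columns copied from the snippet) and the column name Month are only for illustration, and I have not run this against the full file.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data: four rows and two columns copied from the snippet in the thread
df <- data.frame(
  Date = c(196311L, 196312L, 196401L, 196402L),
  SMALL_LoOP = c(-4.8866, -3.785, 7.5529, 3.918),
  BIG_HiOP = c(-4.3612, 1.6677, 3.5718, 2.1659)
)

long <- df |>
  # Month is a hypothetical helper column: the first day of each YYYYMM month
  mutate(Month = as.Date(paste0(Date, "01"), format = "%Y%m%d")) |>
  # Keep both date columns out of the pivot so only the return columns stack
  pivot_longer(cols = -c(Date, Month), names_to = "Variable", values_to = "Value")

# Same plot as before, but with a real date on the x axis
p <- ggplot(long, aes(x = Month, y = Value, color = Variable)) +
  geom_line()
p
```

With a Date column on the x axis, ggplot picks a date scale on its own, so months are evenly spaced across year boundaries.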
FJCC, that is AWESOME!!! Thank you.
Do you know what might be the best way to convert monthly data into annual (average) data?
Thank you. This is very helpful for a personal research project.
FJCC
January 8, 2024, 1:19am
8
Here is code for computing a yearly average for each variable. The dates were coming in as integers, so I extracted the year by truncating Date/100 (for example, 196307 becomes 1963).
library(tidyverse)
df <- read.csv("~/R/Play/Dummy.csv")
ByYear <- df |> select(Date, SMALL_LoOP, ME1_OP2, ME1_OP9, SMALL_HiOP, BIG_LoOP, BIG_HiOP) |>
  pivot_longer(cols = -Date, names_to = "Variable", values_to = "Value") |>
  mutate(Year = trunc(Date/100)) |>
  group_by(Year, Variable) |>
  summarize(Avg = mean(Value, na.rm = TRUE))
#> `summarise()` has grouped output by 'Year'. You can override using the
#> `.groups` argument.
ggplot(ByYear, aes(x = Year, y = Avg, color = Variable)) + geom_line() +
  scale_x_continuous(breaks = 1963:1965)
Created on 2024-01-07 with reprex v2.0.2
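The message from summarise() above is informational only. If it is a distraction, a sketch like the following should silence it by dropping the grouping explicitly with `.groups = "drop"`. It also uses integer division (`%/%`) as an alternative to trunc() for pulling the year out of a YYYYMM integer. The toy data frame is four rows of one column copied from the snippet earlier in the thread, not the full file.

```r
library(dplyr)
library(tidyr)

# Toy data: four rows of one column copied from the snippet in the thread
df <- data.frame(
  Date = c(196311L, 196312L, 196401L, 196402L),
  SMALL_LoOP = c(-4.8866, -3.785, 7.5529, 3.918)
)

ByYear <- df |>
  pivot_longer(cols = -Date, names_to = "Variable", values_to = "Value") |>
  # Integer division drops the month digits: 196311 %/% 100 is 1963
  mutate(Year = Date %/% 100) |>
  group_by(Year, Variable) |>
  # .groups = "drop" returns an ungrouped result and suppresses the message
  summarize(Avg = mean(Value, na.rm = TRUE), .groups = "drop")

ByYear
```

Note that `breaks = 1963:1965` in the plot above matches the small sample only. With the full file since 1963, that argument would need a wider range or could simply be left out.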
FJCC, thank you so much…you’ve saved me a ton of frustration and time.