johnn
May 26, 2018, 6:41am
1
Hi all,
I have a meteorological data for 50 years and the format of one-year data is as below:
And I want to align all the three variables in long format as shown below as an example:
Any kind of suggestion will be helpful and highly appreciated.
Regards
John
Hi,
Can you please take couple of rows in your dataset and convert them into reprex?
Why reprex?
Getting unstuck is hard. Your first step here is usually to create a reprex, or reproducible example. The goal of a reprex is to package your code, and information about your problem so that others can run it and feel your pain. Then, hopefully, folks can more easily provide a solution.
What's in a Reproducible Example?
Parts of a reproducible example:
background information - Describe what you are trying to do. What have you already done?
complete set up - include any library() calls and data to reproduce your issue.
data for a reprex: Here's a discussion on setting up data for a reprex
make it run - include the minimal code required to reproduce your error on the data…
Second, take a look at readxl
package. Since you have composite headers (each month is one header and then each variable is another), you first need to read it in properly so that it will make it easier to work with.
Once this is done, converting your initial table to the format you want is straightforward. It'll be a combination of gather
, separate
and spread
from tidyr
package.
Very similar example from that package with solution is available as a demo here - https://github.com/tidyverse/tidyr/blob/master/demo/so-9684671.R .
1 Like
johnn
May 26, 2018, 1:18pm
3
Hi,
Thanks for your suggestions. Here is my sample dataset as generated using reprex.
set.seed(10)
dat <- data.frame(
Max.1 = runif(10),
Min.1 = runif(10),
RF.1 = runif(10),
Max.2 = runif(10),
Min.2 = runif(10),
RF.2 = runif(10),
Max.3 = runif(10),
Min.3 = runif(10),
RF.3 = runif(10),
Max.4 = runif(10),
Min.4 = runif(10),
RF.4= runif(10),
Max.5 = runif(10),
Min.5 = runif(10),
RF.5 = runif(10),
Max.6 = runif(10),
Min.6 = runif(10),
RF.6 = runif(10),
Max.7 = runif(10),
Min.7 = runif(10),
RF.7 = runif(10),
Max.8 = runif(10),
Min.8 = runif(10),
RF.8 = runif(10),
Max.9 = runif(10),
Min.9 = runif(10),
RF.9 = runif(10),
Max.10 = runif(10),
Min.10 = runif(10),
RF.10 = runif(10),
Max.11 = runif(10),
Min.11 = runif(10),
RF.11 = runif(10),
Max.12 = runif(10),
Min.12 = runif(10),
RF.12 = runif(10))
Regards
john
You can use the gather/spread functions.
See:
http://garrettgman.github.io/tidying/
johnn
May 28, 2018, 5:59am
5
Hi,
Thanks for your suggestion! I will try it and post my response accordingly.
I tried to transform my data from wide format to long using reshape package using the following code.
reshape(dat,
varying=c(Max= c(seq(1,34,3)),
Min= c(seq(2,35,3)),
RF= c(seq(3,36,3))),
direction="long")
But the problem for me now is how I will incorporate the month aspect in my code as the number of days for each month changes??
Thanks again!
Regards
John