How to spread my dataset under different categories

hellovivvvv · March 13, 2020, 8:19pm

Hello, I'm still a novice to R. Can anyone help me figure out how to spread my dataset under different categories?
The following is the dataset description: items with item code are my broad categories, and others without code fall in the subcategories. I want to make my broad items together with codes in two separate columns, and associated subcategories (I mean items under the broad categories)in another column. (Like column#1 is "Item Code", column#2 is "Broad category", and column #3 is " Specific items")

Can anyone tell me how to do this? I'm trying to use spread( ) command, but I think it doesn't work, because in that way I will get a new table which is super "wide".

Thank you so much!

dromano · March 13, 2020, 10:19pm

Hi @hellovivvvv, have you imported your dataset into R yet? Or is that something you need help with?

hellovivvvv · March 14, 2020, 3:06am

Hello! Thanks for your reply.
I haven't imported my dataset into R yet, I'm just thinking about my following steps and how to separate the original "Item" column into two. I had experience in doing spread() command, but in this case, it seems doesn't work. Do you know how to put all the "broad category" into one column, and the other items in another column? Just like this (I mean after I do some sorting work in R):

Thank you so much!

dromano · March 14, 2020, 3:21am

It can depend on what happens when you import the file since that may change what the table looks like once it's in R. Could you import it and paste what you get here?

hellovivvvv · March 14, 2020, 2:22pm

Thanks for your patience. Here is what I got.

As I described, values with "item code" fall in "broad category", others are in the subcategories. So now, I wanna separate values in subcategories (that is, values with NA code) and make it into the third column. And I wish the final result could be like the form in my last reply.

Can I specify "broad categories" as a variable, and "subcategories" as another variable, and then, I might be able to spread the table? Not sure.

Could you give me some suggestions? Thanks a lot.

dromano · March 14, 2020, 2:47pm

Great! The next step is to get your table in shape so that folks who want to help can copy and paste into their own R files. One way to start is to copy the code you used to import the table, and paste it here between a pair of triple backticks (```), like this:

```
<--- paste your code here
```

and then do the same with the output you get from applying the function dput() to the name of your table:

```
<--- paste output of dput(your_table) here
```

That will make it easier to make sure folks can recreate your situation and help you more effectively.

hellovivvvv · March 14, 2020, 3:25pm

crop%>%
  select("Item Code (FAO)",Item)->crop.sorted
View(crop.sorted)

dput(crop.sorted)

I'm sorry I haven't done anything, cuz I really have no idea about it.

Is this correct? But my table doesn't show up...

andresrcs · March 14, 2020, 3:33pm

Ideally, you should provide a reproducible example for your issue, you can learn how to make one by reading this FAQ

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

A minimal reproducible example consists of the following items: A minimal dataset, necessary to reproduce the issue The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages. Let's quickly go over each one of these with examples: Minimal Dataset (Sample Data) You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame head(iris) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 1 5.1 3.5 1.4 0.…

Since you are new to R, I'm going to give you an example with some made-up data, this is an extra effort for people trying to help you so next time, please read the guide and try to provide a proper reprex.

library(tidyverse)

sample_df <- data.frame(
    stringsAsFactors = FALSE,
    item_code = c(100, NA, NA, NA, 200, NA, NA, NA),
    item = c("A", "a1", "a2", "a3", "B", "b1", "b2", "b3")
)

sample_df
#>   item_code item
#> 1       100    A
#> 2        NA   a1
#> 3        NA   a2
#> 4        NA   a3
#> 5       200    B
#> 6        NA   b1
#> 7        NA   b2
#> 8        NA   b3

sample_df %>% 
    mutate(item_code = ifelse(!is.na(item_code), item, item_code)) %>% 
    fill(item_code, .direction = "down") %>% 
    filter(item_code != item)
#>   item_code item
#> 1         A   a1
#> 2         A   a2
#> 3         A   a3
#> 4         B   b1
#> 5         B   b2
#> 6         B   b3

^{Created on 2020-03-14 by the reprex package (v0.3.0.9001)}

dromano · March 14, 2020, 5:21pm

If you run the command dput(crop.sorted), you'll see its output in your RStudio console, which is in the lower left pane. That output is what you want copy and paste here, whereas you pasted the command itself. Could you do that?

hellovivvvv · March 14, 2020, 6:29pm

Yeah, I think I got the answer! Thank you so much! I will do this for my next time:) Thank you!

hellovivvvv · March 14, 2020, 6:34pm

Perfect! Thank you so much, it works!
Based on your code, I also found an alternative way, which allows keeping item code while getting the "broad category" and "subcategories".
Here is what I got:

crop.sorted %>% 
  mutate(Category = ifelse(!is.na(Code), Item, Code)) %>% 
  fill(Code,Category) %>% 
  group_by(Code)%>%
  slice(2:n())%>%
  select(Code,Category,Item)-> crop.sorted

brianstamper · March 16, 2020, 7:21pm

Just as a side note as I read this thread in passing, the more idiomatic way to do the assignment in R would be at the beginning, like this:

crop.sorted <- crop.sorted %>% 
  mutate(Category = ifelse(!is.na(Code), Item, Code)) %>% 
  fill(Code, Category) %>% 
  group_by(Code) %>%
  slice(2:n()) %>%
  select(Code, Category, Item)

I prefer to avoid re-assigning a modified object back to the same name, but I also know many people who do so regularly.

system · March 23, 2020, 7:21pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.