New in R, looking for help

daisy · October 21, 2021, 8:50am

I imported my Excel data sheet into R, but working with it afterwards doesn't seem to go very well. Could someone help me out by showing me some basic skills? YouTube and Google did not get me any further.

williaml · October 21, 2021, 9:16am

Hi Daisy,

It might help if you post your code and describe what the issue is. Even post the error message. A reproducible example would be really helpful.

FAQ: How to do a minimal reproducible example (reprex) for beginners (Guides & FAQs)

A minimal reproducible example consists of the following items:

A minimal dataset, necessary to reproduce the issue.
The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages.

Let's quickly go over each one of these with examples:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame

head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.…
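One common way to share a small sample of data in a post is `dput()`, which prints R code that recreates an object. The sketch below uses the built-in iris data only because the real dataset was never posted; `sample_data` is a name chosen here for illustration.

```r
# Take the first few rows so the sample is small enough to paste into a post.
sample_data <- head(iris, 5)

# dput() prints R code that rebuilds the object exactly;
# the printed text can be copied into a forum post.
dput(sample_data)
```

Anyone reading the post can then assign the pasted output to a variable and work with the same rows.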

daisy · October 21, 2021, 9:35am

Hi Williaml,

Thank you for your fast reply. I made a screenshot of what I am working on. It is a dataset in which I would like to know how many values of HbA1c are > 43 and how many values of HbA1cp are > 6.1.
I don't know where to start or whether I imported the dataset correctly (I selected HbA1c and HbA1cp as numeric and Geslacht (gender) as character). Thank you very much for helping me out.

williaml · October 21, 2021, 9:52am

Here is an example using the iris dataset.

library(dplyr)
iris %>% 
  filter(Sepal.Length > 5.5) %>% 
  count()

You'll need to install the dplyr package if you haven't already done so. You can replace iris and Sepal.Length with the variables from your dataset.

By the way, it is best if you post the code rather than a screenshot.
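As a rough sketch of the same count on data shaped like the dataset described above, the block below builds a small hypothetical data frame. The name `labs` and all values are invented for illustration; only the column names HbA1c, HbA1cp and Geslacht come from the thread. It uses base R, so no extra packages are needed.

```r
# Hypothetical stand-in for the imported spreadsheet; values are invented.
labs <- data.frame(
  HbA1c    = c(38, 45, 51, 40, NA, 47),
  HbA1cp   = c(5.6, 6.3, 6.8, 5.8, NA, 6.0),
  Geslacht = c("M", "V", "V", "M", "V", "M")
)

# Check that the columns were read as numeric / character.
str(labs)

# A comparison yields TRUE/FALSE per row; sum() counts the TRUEs.
# na.rm = TRUE skips missing values instead of returning NA.
n_hba1c  <- sum(labs$HbA1c > 43, na.rm = TRUE)
n_hba1cp <- sum(labs$HbA1cp > 6.1, na.rm = TRUE)

n_hba1c
n_hba1cp
```

Note the distinction between the data frame (`labs`) and a column inside it (`labs$HbA1c`); the comparison is applied to the column.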

daisy · October 21, 2021, 10:03am

Thanks again.
I downloaded the dplyr package.

```
HbA1c %>%
  filter(HbA1c > 43)
Error in HbA1c %>% filter(HbA1c > 43) : could not find function "%>%"
count()
Error in count() : could not find function "count"
```

This is what I get when I try to add my variable.

williaml · October 21, 2021, 10:05am

Sorry, "%>%" is from the magrittr package.

If you want, you can just install that.

It would be better though to install the tidyverse package. It contains a lot of other useful functions.

Otherwise, you can also do this:

count(filter(HbA1c, HbA1c > 43))
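A "could not find function" error usually means the package that provides the function has not been attached in the current session. Installing a package only downloads it; `library()` has to be run in each new session to make its functions available. dplyr also re-exports `%>%` from magrittr, so attaching dplyr is normally enough for the pipe. A minimal sketch on the built-in iris data, assuming dplyr may or may not be installed:

```r
# install.packages("dplyr")   # one-time download; run once per machine

if (requireNamespace("dplyr", quietly = TRUE)) {
  # Attach dplyr for this session; this makes %>%, filter() and count() available.
  library(dplyr)

  n_above <- iris %>%
    filter(Sepal.Length > 5.5) %>%
    count()
  print(n_above)
} else {
  message("dplyr is not installed; run install.packages(\"dplyr\") first")
}
```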

daisy · October 21, 2021, 10:10am

Thank you. I installed the tidyverse package.

Now this is what I get:

```
HbA1c %>%
  filter(HbA1c > 43) %>%
  count(filter(HbA1c, HbA1c > 43))
Error in HbA1c %>% filter(HbA1c > 43) %>% count(filter(HbA1c, HbA1c > :
could not find function "%>%"
count(filter(HbA1c, HbA1c > 43))
Error in count(filter(HbA1c, HbA1c > 43)) :
could not find function "count"
```

williaml · October 21, 2021, 10:12am

What happens if you run this?

library(tidyverse)
HbA1c %>% 
  filter(HbA1c > 43) %>% 
  count()

daisy · October 21, 2021, 10:17am

library(tidyverse)
-- Attaching packages ------------------------------------------------ tidyverse 1.3.1 --
v ggplot2 3.3.5 v purrr 0.3.4
v tibble 3.1.5 v dplyr 1.0.7
v tidyr 1.1.4 v stringr 1.4.0
v readr 2.0.2 v forcats 0.5.1
-- Conflicts --------------------------------------------------- tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 4.0.5
2: package ‘ggplot2’ was built under R version 4.0.5
3: package ‘tibble’ was built under R version 4.0.5
4: package ‘tidyr’ was built under R version 4.0.5
5: package ‘readr’ was built under R version 4.0.5
6: package ‘purrr’ was built under R version 4.0.5
7: package ‘dplyr’ was built under R version 4.0.5
8: package ‘stringr’ was built under R version 4.0.5
9: package ‘forcats’ was built under R version 4.0.5
```
HbA1c %>%
  filter(HbA1c > 43) %>%
  count()
# A tibble: 1 x 1
1 36
```

Thank you for your fast reply.
This happened! Does that mean that 36% of the values are above 43?

williaml · October 21, 2021, 10:29am

That's the number of rows in your dataset where HbA1c is greater than 43, not a percentage (filter() keeps only the rows that match the condition, and count() counts them).

You could get percentage using something like this:

library(tidyverse)
filtered <- iris %>% 
  filter(Sepal.Length > 6) %>% 
  count() 

filtered / count(iris) * 100
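An alternative sketch in base R: a comparison gives TRUE/FALSE for each row, and the mean of a logical vector is the proportion of TRUE values. The names `above`, `pct_mean` and `pct_count` are chosen here for illustration.

```r
# TRUE for every row where the condition holds.
above <- iris$Sepal.Length > 6

# Proportion of TRUE values, expressed as a percentage.
pct_mean <- mean(above) * 100

# The same percentage computed explicitly as count / total.
pct_count <- sum(above) / nrow(iris) * 100

pct_mean
pct_count
```

If the column contains missing values, `mean(above, na.rm = TRUE)` leaves them out of both the count and the total.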

daisy · October 21, 2021, 10:36am

This is so exciting!!! I am so grateful for your help and finally getting somewhere.. Thank you so so much!
I got the percentages! So now a whole new question... Can I visualize this data in a graph?

```
library(tidyverse)
filtered <- HbA1c %>%
  filter(HbA1c > 43) %>%
  count()

filtered / count(HbA1c) * 100
         n
1 17.06161
library(tidyverse)
filtered <- HbA1c %>%
  filter(HbA1cp > 6.1) %>%
  count()

filtered / count(HbA1c) * 100
         n
1 13.27014
```

nirgrahamuk · October 21, 2021, 10:39am

As a beginner to R you may benefit from studying this useful book.
https://r4ds.had.co.nz/
Particularly chapters 5 and 3
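In that book, chapter 3 covers data visualisation with ggplot2 and chapter 5 covers data transformation with dplyr. As a rough idea of what a first plot could look like, here is a sketch in base R graphics, so it runs without extra packages. The values are randomly generated and hypothetical, since the real dataset was never posted; only the cutoff of 43 comes from the thread.

```r
# Hypothetical HbA1c values; generated for illustration only.
set.seed(1)
hba1c <- round(rnorm(200, mean = 40, sd = 6))

# Percentage of values at or below versus above the cutoff of 43.
above <- factor(hba1c > 43, levels = c(FALSE, TRUE),
                labels = c("<= 43", "> 43"))
pct <- prop.table(table(above)) * 100

# Bar chart of the two percentages.
barplot(pct, ylab = "Percentage of values",
        main = "HbA1c relative to a cutoff of 43")

# Histogram of all values with the cutoff marked as a dashed line.
hist(hba1c, xlab = "HbA1c", main = "Distribution of HbA1c")
abline(v = 43, lty = 2)
```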
