How do I get a count of values greater than .85 for each variable?

Eli_Vergara · October 30, 2021, 9:45pm

Very new to R (learned basics in Google DA Certificate), so please forgive. I am using dplyr.
I have a dataframe with the following structure and 112 rows:

measure	avg	delta	epsum	flora	light	perry	phoenix	sans	tim	winn	yoda
1	0.984	1	0.97	1	1	1	1	1	1	1	0.87
2	0.997	1	1	0.97	1	1	1	1	1	1	1
3	0.97	0.97	0.97	0.97	0.97	0.97	0.97	0.97	0.97	0.97	0.97
4	1			1	1	1			1		
5	0.983	1	0.98	1	0.96	1	0.99	1	0.93	0.97	1
6	0.981	1	1	0.96	0.94	0.96	1	1	0.95	1	1
7	0.996	1	1	1	0.97	1	0.99	1	1	1	1
8	0.977	0.93	1	0.98	0.99	1	0.99	1	1	1	0.88
9	1	1	1	1	1	1	1	1	1	1	1
10	0.907	1	0.93	0.85	1	0.69	0.9	1	0.87	1	0.83

I am trying to find a way to get a summary with a count, for every variable (column), of the rows that have a score greater than .85.

I have been able to do that with the count() function, but only for one variable at a time. What I want is a single summary covering all the variables, which I can then turn into a dataframe and plot.

Ideally, I would want to see something like this:

measure	avg	 delta	epsum	flora	light	perry	phoenix	sans	tim	winn	yoda
1	0.984	1	  97 	65             73            85          58               96               103         56    96           88

where I get the number of rows where each variable was greater than .85
Is that possible?

williaml · October 30, 2021, 10:26pm

Hi, can you provide a reproducible example of your dataset?

FAQ: How to do a minimal reproducible example (reprex) for beginners

daghj · October 31, 2021, 7:15am

apply(df > .85, 2, sum)
Explanation: apply() applies a function (here, sum) over the rows or the columns of its first argument, depending on the second argument (1 = by row, 2 = by column). The first argument here, df > .85, is a matrix of TRUE/FALSE values, which sum() treats as 1/0, so each column's sum is the number of rows where that variable is above .85.
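One thing to watch for with the data in the question: row 4 has blanks, and if those are read in as NA, a plain sum returns NA for any column that contains one. The identifier column measure would also get counted as if it were a score. Here is a minimal sketch of one way around both, assuming the blanks are NA and using only a handful of the rows and columns from the question (measures 1, 4, 8 and 10). The names scores, counts, counts_df, variable and n_above are made up for illustration.

```r
# Subset of the data from the question; blanks in row 4 are assumed to be NA.
df <- data.frame(
  measure = c(1, 4, 8, 10),
  avg     = c(0.984, 1, 0.977, 0.907),
  delta   = c(1, NA, 0.93, 1),
  flora   = c(1, 1, 0.98, 0.85),
  perry   = c(1, 1, 1, 0.69),
  yoda    = c(0.87, NA, 0.88, 0.83)
)

# Drop the identifier column so only the score columns are compared.
scores <- df[, names(df) != "measure"]

# scores > 0.85 is a logical matrix; colSums() counts the TRUEs per column.
# na.rm = TRUE skips missing values instead of returning NA for that column.
counts <- colSums(scores > 0.85, na.rm = TRUE)

# Long-format data frame (one row per variable), convenient for plotting.
counts_df <- data.frame(
  variable = names(counts),
  n_above  = as.integer(counts)
)
print(counts_df)
```

From there, something like barplot(counts_df$n_above, names.arg = counts_df$variable) should give a quick bar chart. Since you are using dplyr, the same counts can also be written as df %>% summarise(across(-measure, ~ sum(.x > 0.85, na.rm = TRUE))), which returns a one-row data frame shaped like the output you described.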
