I want to compute the mean of a column in a dataset. I followed the instructions I was given, but it doesn't work.
I proceeded as follows:
mean(dataframe$column I want compute the mean of)
R reported that the object was not found.
I had loaded the dataset beforehand by clicking on "Environment" and then "Import Dataset". However, it didn't work with "setwd". Maybe it has something to do with this.
The values in the respective column are all numbers by the way.
# if not a standard built-in dataset, like mtcars, load the library that has it
library(palmerpenguins)
# load the data into the global environment
data("penguins")
# inspect
head(penguins)
#> # A tibble: 6 × 8
#> species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex
#> <fct> <fct> <dbl> <dbl> <int> <int> <fct>
#> 1 Adelie Torge… 39.1 18.7 181 3750 male
#> 2 Adelie Torge… 39.5 17.4 186 3800 fema…
#> 3 Adelie Torge… 40.3 18 195 3250 fema…
#> 4 Adelie Torge… NA NA NA NA <NA>
#> 5 Adelie Torge… 36.7 19.3 193 3450 fema…
#> 6 Adelie Torge… 39.3 20.6 190 3650 male
#> # … with 1 more variable: year <int>
# check for NA values
sum(is.na(penguins))
#> [1] 19
# there are columns with more than one NA, so use na.rm = TRUE
# find numeric columns
str(penguins)
#> tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
#> $ species : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ island : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
#> $ bill_length_mm : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
#> $ bill_depth_mm : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
#> $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
#> $ body_mass_g : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
#> $ sex : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
#> $ year : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
# create data frame of only numeric columns
dat <- penguins[,-c(1,2,7)]
# find the mean of one numeric column; the NA values make the result NA
mean(dat$bill_length_mm)
#> [1] NA
# now with na.rm = TRUE
mean(dat$bill_length_mm, na.rm = TRUE)
#> [1] 43.92193
# find means of each numeric column
# colMeans() works directly on a data frame whose columns are all numeric
colMeans(dat, na.rm = TRUE)
#> bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
#> 43.92193 17.15117 200.91520 4201.75439
#> year
#> 2008.02907
# if the data is NOT in a package, but in a csv file
# EXAMPLE: use the location of your data; the full pathname is not needed if the
# file is in the current working directory
YOUR_CSV <- read.csv("/usr/local/lib/R/site-library/palmerpenguins/extdata/penguins.csv")
# proceed as before with YOUR_CSV in place of penguins
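The column positions in `penguins[,-c(1,2,7)]` are specific to that dataset. For your own data, a minimal sketch that picks the numeric columns by type instead might look like this. The data frame below is made-up toy data, and the names `dat`, `is_num` and `num_means` are only for illustration.

```r
# Toy data standing in for your own data frame (values are made up)
dat <- data.frame(
  group  = c("a", "b", "c", "d"),
  length = c(1.5, 2.5, NA, 4.0),
  mass   = c(10L, 20L, 30L, 40L)
)

# TRUE for every column that holds numbers (integer or double)
is_num <- vapply(dat, is.numeric, logical(1))

# Column means over the numeric columns only, ignoring NA values
num_means <- colMeans(dat[, is_num, drop = FALSE], na.rm = TRUE)
print(num_means)
```

`drop = FALSE` keeps the result a data frame even if only one column is numeric, so `colMeans()` still works.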
On Windows you have to either double the backslashes in the path or replace them with forward slashes:
"C:\\Users\\somewhere\\"
"C:/Users/somewhere/"
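To see that these spellings refer to the same location, here is a small sketch. The folder and the file name `data.csv` are hypothetical, and nothing is read from disk; the code only builds and compares the path strings.

```r
# Hypothetical folder and file name, for illustration only
p1 <- "C:\\Users\\somewhere\\data.csv"                    # doubled backslashes
p2 <- "C:/Users/somewhere/data.csv"                       # forward slashes
p3 <- file.path("C:", "Users", "somewhere", "data.csv")   # built from parts

# cat() shows the string R actually stores: p1 contains single backslashes
cat(p1, "\n")
cat(p2, "\n")

# file.path() joins the parts with forward slashes, so p3 equals p2
same_path <- identical(p2, p3)
print(same_path)
```

Building the path with `file.path()` avoids typing any slashes by hand, which removes one source of typos.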
It seems to be stubborn. When I did this the following notification appeared:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'C:/datapath permission denied
Have you tried pressing Ctrl + Shift + H to set your working directory? Choose the folder where your file is located, then type dir() in R to get an overview of the files in that directory, and copy and paste the file name so you know there are no typos.
As others have said, it might be because of a restriction to open the folder, have you tried another folder?
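The checks suggested above can also be run from the console. This is only a sketch: `check_path()` is a hypothetical helper, not a built-in function, and the demonstration uses a temporary folder rather than your real data. One common cause of "Permission denied" on Windows is a path that points to a folder instead of a file, which is why the helper reports that as well.

```r
# Hypothetical helper: report what R sees at a given path
check_path <- function(path) {
  c(
    exists       = file.exists(path),                         # anything there?
    is_directory = dir.exists(path),                          # a folder?
    readable     = unname(file.access(path, mode = 4) == 0)   # read permission?
  )
}

getwd()        # the folder R is currently working in
list.files()   # the files R can see in that folder (same as dir())

# Demonstration on a temporary folder and a file inside it
tmp_dir  <- tempdir()
tmp_file <- file.path(tmp_dir, "example.csv")
write.csv(data.frame(x = 1:3), tmp_file, row.names = FALSE)

dir_result  <- check_path(tmp_dir)    # a folder: read.csv() cannot read this
file_result <- check_path(tmp_file)   # a file: this is what read.csv() needs
print(dir_result)
print(file_result)
```

If `is_directory` comes back `TRUE` for the path you passed to `read.csv()`, add the file name (including its extension) to the end of the path.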
I tried what you said without success. The folder opens, but it says "No items match your search."
What I don't understand: when I used "Import Dataset" it worked, and the data are shown in a table (but I still can't compute the mean). Is importing the data this way different from the other approaches discussed here?
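As far as the result is concerned there should be no difference: "Import Dataset" runs a read function for you and stores the result under an object name, which is the name listed in the Environment pane. An "object not found" error usually means the name typed before the `$` does not match that name. A minimal sketch, assuming a hypothetical file `my_data.csv` with made-up values:

```r
# Stand-in for the file you imported (hypothetical name, made-up values)
csv_path <- file.path(tempdir(), "my_data.csv")
write.csv(data.frame(id = 1:4, score = c(2, 4, NA, 6)),
          csv_path, row.names = FALSE)

# Roughly what "Import Dataset" does: read the file into a named object
my_data <- read.csv(csv_path)

exists("my_data")   # TRUE: this is the name to use before the $
names(my_data)      # the exact column names to use after the $

# Mean of one column, ignoring missing values
col_mean <- mean(my_data$score, na.rm = TRUE)
print(col_mean)
```

In your session, replace `my_data` and `score` with the object name shown in the Environment pane and the column name returned by `names()`.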