Unable to create a graph to compare variables in R from a .csv file

Hello all :smiley:

I'm very new to this and would appreciate some help here. I am trying to analyse a table from a .csv file and would like to visualise the most frequently occurring value in each column. The columns include start time, end time, trip duration, gender of user, date of birth, etc.
I'm having difficulty figuring out how to create a visualisation that shows the mode (most frequently occurring observation) for each column and compares the results.

For now I am just trying to compare two columns with each other, but I cannot get a graph to appear.

I have installed and loaded both tidyverse and ggplot2, and I'm writing the code in the Source pane (not the console), but I need assistance in figuring out what to write exactly. The .csv file has also been imported.

One of the columns has a date/time data type (unsure if this is a contributing factor) and the other just has numbers. These are just two variables of many that I would like to compare.

I'd appreciate any help with this.

I tried the following, expecting a scatterplot:
data=read.csv('dataset1.csv')

print(data)

plot(x = data$x,y = data$y,
xlab = "x-axis",
ylab = "y-axis",
main = "Plot")

I got the following message with no output:
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

and this:

ggplot(data = bikeshare) +
  geom_point(mapping = aes(x=start_time, y=tripduration))
Error in fortify():
! data must be a <data.frame>, or an object coercible by fortify(), not the string "Divvy_Trips_2019_Q4.csv".
Run rlang::last_trace() to see where the error occurred.

ggplot(df, aes(x=start_time, y=tripduration)) +
  geom_point()
Error in ggplot():
! data cannot be a function.
:information_source: Have you misspelled the data argument in ggplot()
Run rlang::last_trace() to see where the error occurred.

rlang::last_trace()
<error/rlang_error>
Error in ggplot():
! data cannot be a function.
:information_source: Have you misspelled the data argument in ggplot()

#create scatter plot
ggplot(df, aes(x=start_time, y=tripduration))
Error in ggplot():
! data cannot be a function.
:information_source: Have you misspelled the data argument in ggplot()
Run rlang::last_trace() to see where the error occurred.
#create scatter plot
ggplot(data, aes(x=start_time, y=tripduration))
Error in ggplot():
! data cannot be a function.
:information_source: Have you misspelled the data argument in ggplot()
Run rlang::last_trace() to see where the error occurred.

Hi @heyyou, how many rows and columns does your table have?

In R, data is the name of a built-in function.

You can overwrite it with something like

data=read.csv('dataset1.csv')

After that, what you have written should work, but the error suggests that data was still the built-in function when ggplot() ran, so I think the assignment somehow did not take effect or was reset.

Try saving your work, then shut down and restart R and RStudio. Then change the name of your data.frame to something like this

dat1 = read.csv('dataset1.csv')

and see what happens.
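
As a rough sketch of the checks involved: the file contents and the name trips below are made up for illustration, and only the column names start_time and tripduration come from the post. It shows that data and df are functions in a fresh session, reads a CSV into a non-clashing name, and parses the date-time text before plotting. The timestamp format is an assumption and may not match the real file.

```r
# Hypothetical stand-in for the real file: a tiny CSV in a temp location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "start_time,tripduration",
  "2019-10-01 00:01:39,940",
  "2019-10-01 00:02:16,258",
  "2019-10-01 00:04:32,850"
), csv_path)

# In a fresh session, with no user objects of these names, `data` and `df`
# resolve to built-in functions (utils::data and stats::df).
data_is_function <- is.function(data)
df_is_function <- is.function(df)

# Read the file into a name that does not clash with a built-in function.
trips <- read.csv(csv_path)

# Check that the object is a data frame and see which columns it has.
is.data.frame(trips)
names(trips)

# Asking for a column that does not exist gives NULL, and plot() has
# nothing to compute axis limits from in that case.
missing_column <- is.null(trips$x)

# Parse the date-time text (assumed format) so it can go on an axis.
trips$start_time <- as.POSIXct(trips$start_time,
                               format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Base R scatterplot of the two columns.
plot(trips$start_time, trips$tripduration,
     xlab = "Start time", ylab = "Trip duration",
     main = "Trip duration by start time")
```

If names() does not list the columns you expect, that would explain the "need finite 'xlim' values" message from the plot() call in your first attempt, since data$x and data$y would then be NULL.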

It has 704054 rows and 12 columns

Thanks, @heyyou, and in that case, could you execute the following commands, exactly as written?

dataset1 <- read.csv("dataset1.csv")
sink("for_posit.txt")
dput(dataset1[1:50,])
sink()

This will create the file for_posit.txt and write the first 50 rows of your data set to it, in a form that folks on this site can use to help you troubleshoot.

Next, follow the steps below:

  1. Select all of the contents of the file for_posit.txt, and
  2. paste them in a reply, but between a pair of triple backticks, like this:
```
<--- paste copied content here
```

Then it should be relatively easy to help you sort things out.

Hey dromano
Is there any way to chat privately? I did what you said but the contents of the first 50 rows don't appear anywhere.

I don't know of a way to chat, but maybe we can sort things out here: There's another, more error-prone way to do the same, but before that, would you mind clicking on the Files tab in the lower right-hand pane of your RStudio window? The file "for_posit.txt" should show up there, and if it does, you can just click on it to open in RStudio, and you can copy and paste from there. If you don't see it there, then you can also run the commands

dataset1 <- read.csv("dataset1.csv")
dput(dataset1[1:50,])

without using the sink() command, and the output will appear in your RStudio console. Then you can copy the output and paste it into a reply here.
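
Another possible route, sketched here with a made-up two-row data frame standing in for your data set, is to pass a file name to dput() directly and then check where the file ended up. The object names sample_rows, out_file and restored are only for illustration, and the temp folder is used so the sketch does not touch your project folder.

```r
# Hypothetical stand-in for the imported data set.
dataset1 <- data.frame(
  start_time = c("2019-10-01 00:01:39", "2019-10-01 00:02:16"),
  tripduration = c(940, 258)
)

# head() returns at most 50 rows, so it also works on shorter tables.
sample_rows <- head(dataset1, 50)

# Write the dput() text straight to a file instead of wrapping it in sink().
out_file <- file.path(tempdir(), "for_posit.txt")
dput(sample_rows, file = out_file)

# Confirm that the file exists and print its full path.
file.exists(out_file)
normalizePath(out_file)

# dget() reads the text back into an object, which can be compared
# with the rows that were written.
restored <- dget(out_file)
all.equal(restored, sample_rows)
```

With a plain file name such as "for_posit.txt" instead of out_file, the file is written to the current working directory, which getwd() will show.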
