Merging two datasets in R

I got the error message below in R when trying to merge two datasets:
outer_merged <- merge(dailyActivity_merged, dailyIntensities_merged, by = Daily_Activities, all = TRUE)
Error: object 'Daily_Activities' not found

How can I resolve this and move forward?

There must be a Daily_Activities variable in both dailyActivity_merged and dailyIntensities_merged.

What do I do to resolve it, Technocrat? Thanks.

The by = parameter takes a character string naming the column to merge on. You used a bare symbol, so R looked for an object with that name and found none. The solution is to quote it, like so:

outer_merged <- merge(dailyActivity_merged, dailyIntensities_merged,
                      by = "Daily_Activities", all = TRUE)
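To see the difference between a bare symbol and a quoted column name, here is a minimal sketch on toy data. The data frames `activity` and `intensity` and their values are made up for illustration; only `merge()` and its `by` and `all` arguments come from the thread.

```r
# Toy data frames standing in for the real datasets (hypothetical values)
activity <- data.frame(Id = c(1, 2, 3), TotalSteps = c(1000, 2500, 400))
intensity <- data.frame(Id = c(2, 3, 4), SedentaryMinutes = c(700, 650, 800))

# Unquoted: R looks for an object called Id in the workspace, finds none,
# and raises an error. tryCatch() captures the message so the script continues.
unquoted <- tryCatch(
  merge(activity, intensity, by = Id, all = TRUE),
  error = function(e) conditionMessage(e)
)
print(unquoted)

# Quoted: "Id" is read as the name of a column present in both data frames
outer_merged <- merge(activity, intensity, by = "Id", all = TRUE)
print(outer_merged)
```

With all = TRUE, rows that appear in only one of the two data frames are kept and padded with NA.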

I plugged in the code as supplied, but it returned this:
outer_merged <- merge(dailyActivity_merged, dailyIntensities_merged,
                      by = "Daily_Activities", all = TRUE)
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

That means you don't have Daily_Activities in both datasets, which is a requirement if you wish to merge on that column.

Column names in the dailyActivity_merged dataset:
[1] "Id" "ActivityDate"
[3] "TotalSteps" "TotalDistance"
[5] "TrackerDistance" "LoggedActivitiesDistance"
[7] "VeryActiveDistance" "ModeratelyActiveDistance"
[9] "LightActiveDistance" "SedentaryActiveDistance"
[11] "VeryActiveMinutes" "FairlyActiveMinutes"
[13] "LightlyActiveMinutes" "SedentaryMinutes"
[15] "Calories"

Column names in the dailyIntensities_merged dataset:
[1] "Id" "ActivityDay"
[3] "SedentaryMinutes" "LightlyActiveMinutes"
[5] "FairlyActiveMinutes" "VeryActiveMinutes"
[7] "SedentaryActiveDistance" "LightActiveDistance"
[9] "ModeratelyActiveDistance" "VeryActiveDistance"

From this it's clear that 'Daily_Activities' is in neither of your datasets.
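A quick way to run the same check in code, rather than by eye, is to intersect the two sets of column names. This is a minimal sketch on one-row toy data frames (hypothetical values, with only a few of the columns listed above):

```r
# One-row stand-ins for the two datasets (hypothetical values)
activity <- data.frame(Id = 1, ActivityDate = "4/12/2016",
                       TotalSteps = 1000, SedentaryMinutes = 700)
intensity <- data.frame(Id = 1, ActivityDay = "4/12/2016",
                        SedentaryMinutes = 700)

# Columns present in both data frames are the only candidates for `by`
shared_cols <- intersect(names(activity), names(intensity))
print(shared_cols)

# Check whether a proposed key is usable before calling merge()
has_key <- "Daily_Activities" %in% shared_cols
print(has_key)
```

On the real data, the same two calls would be run against dailyActivity_merged and dailyIntensities_merged.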

DailyActivities_merged is one of the file names, as is DailyIntensities_merged.

Daily_Activities needs to be the name of a variable (column) in the dataset, not the name of the dataset.


outer_merged <- merge(dailyActivity_merged, dailyIntensities_merged,
                      by = "Id", all = TRUE)

Hi Nir, please help me solve this. I would be grateful. I am new to R programming and data analytics.

I don't know your data, because I have no context on your task or on what the information in your data frames represents.
But common sense tells me you have IDs in both, and if they are suitable for the task, they would be natural to merge on. This could be completely wrong, though; it's your data, and you should know it and what can in principle be done with it (which is not an R-specific issue).

Hi Nir, to move forward I have to resolve the error below:

Warning message:
In left_join(., dailyIntensities_merged) :
  Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 105 of `x` matches multiple rows in `y`.
ℹ Row 105 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Kindly help, thanks

But the merged dataset is not appearing in the Environment pane, which I think is because of the error I shared above.
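One common cause of a many-to-many match is a join key that does not identify a single row in either table. The sketch below uses made-up toy data and assumes, without confirmation from the thread, that ActivityDate and ActivityDay hold the same dates; it uses base R merge() rather than left_join() so that it runs without extra packages.

```r
# Hypothetical toy data: Id 1 has two days in each table
activity <- data.frame(
  Id = c(1, 1, 2),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/12/2016"),
  TotalSteps = c(1000, 2000, 3000)
)
intensity <- data.frame(
  Id = c(1, 1, 2),
  ActivityDay = c("4/12/2016", "4/13/2016", "4/12/2016"),
  VeryActiveMinutes = c(10, 20, 30)
)

# Joining on Id alone pairs every day of an Id with every day of the same Id,
# so the result has more rows than either input
by_id <- merge(activity, intensity, by = "Id", all = TRUE)
print(nrow(by_id))

# Joining on Id plus the date column (named differently in each table)
# gives one row per Id and day
by_id_date <- merge(activity, intensity,
                    by.x = c("Id", "ActivityDate"),
                    by.y = c("Id", "ActivityDay"),
                    all = TRUE)
print(by_id_date)
```

Note also that a warning, unlike an error, does not stop the result from being created. If nothing shows up in the Environment pane, it is worth checking that the output of the join was assigned to a name with <-.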

I would really encourage you to review the following guide, FAQ: Tips for writing R-related questions.
For example, the guide emphasizes asking coding questions with formatted code-chunks and a reprex.

You may have noticed folks here requesting minimal reprexes; that's because asking questions this way saves answerers a lot of time.

Reproducible Examples:

  • help make your question clear and replicable,
  • increase the probability folks will reach out and try to help,
  • reduce the number of back-and-forths required to understand the question,
  • and make your question and suggested solutions more useful to folks in the future researching similar problems.
