Error message "not found" when trying to knit

Good day, I am facing a problem. I get an error message when I try to knit my Rmd file. The following chunk is one example, but there are many others like it.


sum(duplicated(dailyactivity))

sum(duplicated(minutesleep))

When I knit the Rmd file, the error is reported with the line numbers, as follows.

"Quitting from lines at lines 75-77 [unnamed-chunk-3] (FitBit-Data-Analysis.Rmd)
Error:
! object 'dailyactivity' not found
Backtrace:

  1. base::duplicated(dailyactivity)
    Execution halted."

If I remove the chunk, the error moves to the next chunk. But if I run the chunk on its own, it gives a result with no error.

Welcome to the forum.

We probably need to see most of the .rmd at least up to where the problem is occurring. We also will need some sample data.

Copy the .rmd and paste it here between
```

```

Copy the "raw" text, not the visual mode text.

A handy way to supply some sample data is the dput() function. Just do dput(mydata), where mydata is your data. For a large dataset, something like dput(head(mydata, 100)) should supply the data we need. Copy the output and paste it here between
```

```

Hi @Bouba_Ismaila, this says that the object dailyactivity was never created when the Rmd file was run. Maybe you created it in another script.
Load all the data in a dedicated chunk before using it.
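When the Knit button is used, the document is rendered in a new R session, so objects that only exist in the interactive Environment pane are not visible to it. A minimal sketch of a guard that makes this failure explicit is shown below; `require_object()` is a hypothetical helper written for illustration, not part of knitr or rmarkdown.

```r
# Hypothetical helper: stop with a clear message when an object that a chunk
# depends on has not been created in the environment being checked.
require_object <- function(name, envir = parent.frame()) {
  if (!exists(name, envir = envir, inherits = FALSE)) {
    stop("'", name, "' is not defined here; create it in an earlier chunk.")
  }
  invisible(TRUE)
}

# Demonstration in an isolated, empty environment
demo_env <- new.env(parent = emptyenv())

# Before the object is created the guard fails; tryCatch() captures the message
outcome <- tryCatch(
  require_object("dailyactivity", envir = demo_env),
  error = function(e) conditionMessage(e)
)
print(outcome)

# After the object is created in that environment the guard passes
assign("dailyactivity", data.frame(Id = 1:3), envir = demo_env)
require_object("dailyactivity", envir = demo_env)
```

Calling `require_object("dailyactivity")` at the top of a chunk would then point at the missing data-loading step instead of failing somewhere inside `duplicated()`.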

Hi, thank you for trying to assist. This is part of the Rmd. I replaced some data, so I will need to fix it from where this stops.


title: "Bellabeat Data Analysis"
author: "Bouba Ismaila"
date: "2024-04-27"
output: html_document

Statement of the business task

Assessing consumer usage trends of smart devices to inform Bellabeat marketing strategy

Data sources used

FitBit Fitness Tracker Data was downloaded from Kaggle (FitBit Fitness Tracker Data (kaggle.com)) and stored on my laptop in two sub-folders. In my analysis, I used the relevant tables/dataframes to analyze relationships between total steps, calories, activity minutes, time in bed and sleep time. There are many dataframes, and most of them repeat the same measurements in different units. The dailyActivity_merged dataframe has the advantage of combining many measures in one table, so it was retained. The minuteSleep_merged dataframe was also added. All dataframes had the word "merged" in their names with no distinction or specific meaning, so it was left as it was.
The dataframes used are in long format.

Data cleaning and transformation

The data cleaning follows the installation and loading of the necessary packages.

The function sample() was used to check for data bias.

install.packages("DataEditR")
install.packages("reshape2")
install.packages("SimDesign")
install.packages ("ggplot2")
install.packages("tidyverse")
install.packages("reshape2")
install.packages("SimDesign")
install.packages("janitor")
library(ggplot2)
library(tidyverse)
library(janitor)
library(lubridate)
library(dplyr)
library(reshape2)
library(tidyr)
library(SimDesign)

Cleaning and transforming data

read.csv("dailyactivity.csv")
read.csv("minutesleep.csv")

THIS CHUNK RUNS WITH NO ISSUE; I JUST WANTED TO TEST.

sum(duplicated(dailyactivity))
sum(duplicated(minutesleep))

When I knit, it says 'dailyactivity' not found. If I remove the first line, then it says 'minutesleep' not found, and so on. It means that I would have to remove everything, and then nothing would be left to knit.
The data was downloaded from Kaggle, imported as csv files, and shows in the Environment pane. I just renamed the dataframes after some cleaning. I couldn't upload the data. I will check your instructions on how to upload.
Thanks

The data is large, so I did the 100 as advised, just for one dataset.

dput(head(dailyactivity, 100))
structure(list(Id = c(1503960366, 1503960366, 1503960366, 1503960366,
1503960366, 1503960366, 1503960366, 1503960366, 1503960366, 1503960366,
1503960366, 1503960366, 1503960366, 1503960366, 1503960366, 1503960366,
1503960366, 1503960366, 1503960366, 1503960366, 1503960366, 1503960366,
1503960366, 1503960366, 1503960366, 1503960366, 1503960366, 1503960366,
1503960366, 1503960366, 1503960366, 1624580081, 1624580081, 1624580081,
1624580081, 1624580081, 1624580081, 1624580081, 1624580081, 1624580081,
1624580081, 1624580081, 1624580081, 1624580081, 1624580081, 1624580081,
1624580081, 1624580081, 1624580081, 1624580081, 1624580081, 1624580081,
1624580081, 1624580081, 1624580081, 1624580081, 1624580081, 1624580081,
1624580081, 1624580081, 1624580081, 1624580081, 1644430081, 1644430081,
1644430081, 1644430081, 1644430081, 1644430081, 1644430081, 1644430081,
1644430081, 1644430081, 1644430081, 1644430081, 1644430081, 1644430081,
1644430081, 1644430081, 1644430081, 1644430081, 1644430081, 1644430081,
1644430081, 1644430081, 1644430081, 1644430081, 1644430081, 1644430081,
1644430081, 1644430081, 1644430081, 1644430081, 1844505072, 1844505072,
1844505072, 1844505072, 1844505072, 1844505072, 1844505072, 1844505072
), TotalSteps = c(13162, 10735, 10460, 9762, 12669, 9705, 13019,
15506, 10544, 9819, 12764, 14371, 10039, 15355, 13755, 18134,
13154, 11181, 14673, 10602, 14727, 15103, 11100, 14070, 12159,
11992, 10060, 12022, 12207, 12770, 0, 8163, 7007, 9107, 1510,
5370, 6175, 10536, 2916, 4974, 6349, 4026, 8538, 6076, 6497,
2826, 8367, 2759, 2390, 6474, 36019, 7155, 2100, 2193, 2470,
1727, 2104, 3427, 1732, 2969, 3134, 2971, 10694, 8001, 11037,
5263, 15300, 8757, 7132, 11256, 2436, 1223, 3673, 6637, 3321,
3580, 9919, 3032, 9405, 3176, 18213, 6132, 3758, 12850, 2309,
4363, 9787, 13372, 6724, 6643, 9167, 1329, 6697, 4929, 7937,
3844, 3414, 4525, 4597, 197), TotalDistance = c(8.5, 6.96999979,
6.739999771, 6.28000021, 8.159999847, 6.480000019, 8.590000153,
9.880000114, 6.679999828, 6.340000153, 8.130000114, 9.039999962,
6.409999847, 9.800000191, 8.789999962, 12.21000004, 8.529999733,
7.150000095, 9.25, 6.809999943, 9.710000038, 9.659999847, 7.150000095,
8.899999619, 8.029999733, 7.710000038, 6.579999924, 7.71999979,
7.769999981, 8.130000114, 0, 5.309999943, 4.550000191, 5.920000076,
0.980000019, 3.49000001, 4.059999943, 7.409999847, 1.899999976,
3.230000019, 4.130000114, 2.619999886, 5.550000191, 3.950000048,
4.21999979, 1.840000033, 5.440000057, 1.789999962, 1.549999952,
4.300000191, 28.03000069, 4.929999828, 1.370000005, 1.429999948,
1.610000014, 1.120000005, 1.370000005, 2.230000019, 1.129999995,
1.929999948, 2.039999962, 1.929999948, 7.769999981, 5.820000172,
8.020000458, 3.829999924, 11.11999989, 6.369999886, 5.190000057,
8.180000305, 1.769999981, 0.889999986, 2.670000076, 4.829999924,
2.410000086, 2.599999905, 7.210000038, 2.200000048, 6.840000153,
2.309999943, 13.23999977, 4.460000038, 2.730000019, 9.340000153,
1.679999948, 3.190000057, 7.119999886, 9.720000267, 4.889999866,
4.829999924, 6.659999847, 0.970000029, 4.429999828, 3.25999999,
5.25, 2.539999962, 2.25999999, 2.99000001, 3.039999962, 0.129999995
), TrackerDistance = c(8.5, 6.96999979, 6.739999771, 6.28000021,
8.159999847, 6.480000019, 8.590000153, 9.880000114, 6.679999828,
6.340000153, 8.130000114, 9.039999962, 6.409999847, 9.800000191,
8.789999962, 12.21000004, 8.529999733, 7.150000095, 9.25, 6.809999943,
9.710000038, 9.659999847, 7.150000095, 8.899999619, 8.029999733,
7.710000038, 6.579999924, 7.71999979, 7.769999981, 8.130000114,
0, 5.309999943, 4.550000191, 5.920000076, 0.980000019, 3.49000001,
4.059999943, 7.409999847, 1.899999976, 3.230000019, 4.130000114,
2.619999886, 5.550000191, 3.950000048, 4.21999979, 1.840000033,
5.440000057, 1.789999962, 1.549999952, 4.300000191, 28.03000069,
4.929999828, 1.370000005, 1.429999948, 1.610000014, 1.120000005,
1.370000005, 2.230000019, 1.129999995, 1.929999948, 2.039999962,
1.929999948, 7.769999981, 5.820000172, 8.020000458, 3.829999924,
11.11999989, 6.369999886, 5.190000057, 8.180000305, 1.769999981,
0.889999986, 2.670000076, 4.829999924, 2.410000086, 2.599999905,
7.210000038, 2.200000048, 6.840000153, 2.309999943, 13.23999977,
4.460000038, 2.730000019, 9.340000153, 1.679999948, 3.190000057,
7.119999886, 9.720000267, 4.889999866, 4.829999924, 6.659999847,
0.970000029, 4.429999828, 3.25999999, 5.25, 2.539999962, 2.25999999,
2.99000001, 3.039999962, 0.129999995), LoggedActivitiesDistance = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), VeryActiveDistance = c(1.879999995,
1.570000052, 2.440000057, 2.140000105, 2.710000038, 3.190000057,
3.25, 3.529999971, 1.960000038, 1.340000033, 4.760000229, 2.809999943,
2.920000076, 5.289999962, 2.329999924, 6.400000095, 3.539999962,
1.059999943, 3.559999943, 2.289999962, 3.210000038, 3.730000019,
2.460000038, 2.920000076, 1.970000029, 2.460000038, 3.529999971,
3.450000048, 3.349999905, 2.559999943, 0, 0, 0, 0, 0, 0, 1.029999971,
2.150000095, 0, 0, 0, 0, 0, 1.149999976, 0, 0, 1.110000014, 0,
0, 0.899999976, 21.92000008, 0.860000014, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0.140000001, 2.279999971, 0.360000014, 0.219999999,
4.099999905, 2.25, 1.070000052, 0.360000014, 0, 0, 0, 0, 0, 0.589999974,
0.800000012, 0, 0.200000003, 0, 0.629999995, 0.239999995, 0.07,
0.720000029, 0, 0.519999981, 0.819999993, 3.25999999, 0, 2.390000105,
0.879999995, 0, 0, 0, 0, 0, 0, 0.140000001, 0, 0), ModeratelyActiveDistance = c(0.550000012,
0.689999998, 0.400000006, 1.25999999, 0.409999996, 0.779999971,
0.639999986, 1.320000052, 0.479999989, 0.349999994, 1.120000005,
0.870000005, 0.209999993, 0.569999993, 0.920000017, 0.409999996,
1.159999967, 0.5, 1.419999957, 1.600000024, 0.569999993, 1.049999952,
0.870000005, 1.080000043, 0.25, 2.119999886, 0.319999993, 0.529999971,
1.159999967, 1.00999999, 0, 0, 0, 0, 0, 0, 1.519999981, 0.620000005,
0, 0, 0, 0, 0, 0.910000026, 0, 0, 1.870000005, 0.200000003, 0,
1.279999971, 4.190000057, 0.589999974, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 2.299999952, 0.899999976, 2.559999943, 0.150000006, 1.879999995,
0.569999993, 1.669999957, 2.529999971, 0, 0, 0, 0.579999983,
0, 0.059999999, 1.720000029, 0, 2.319999933, 0, 3.140000105,
0.99000001, 0.310000002, 4.090000153, 0, 0.540000022, 0.270000011,
0.790000022, 0, 0.349999994, 0.810000002, 0, 0, 0, 0, 0, 0, 0.259999991,
0.479999989, 0), LightActiveDistance = c(6.059999943, 4.710000038,
3.910000086, 2.829999924, 5.039999962, 2.50999999, 4.710000038,
5.03000021, 4.239999771, 4.650000095, 2.24000001, 5.360000134,
3.279999971, 3.940000057, 5.539999962, 5.409999847, 3.789999962,
5.579999924, 4.269999981, 2.920000076, 5.920000076, 4.880000114,
3.819999933, 4.880000114, 5.809999943, 3.130000114, 2.730000019,
3.74000001, 3.25999999, 4.550000191, 0, 5.309999943, 4.550000191,
5.909999847, 0.970000029, 3.49000001, 1.49000001, 4.619999886,
1.899999976, 3.230000019, 4.110000134, 2.599999905, 5.539999962,
1.889999986, 4.199999809, 1.830000043, 2.460000038, 1.600000024,
1.549999952, 2.119999886, 1.909999967, 3.470000029, 1.340000033,
1.419999957, 1.580000043, 1.120000005, 1.370000005, 2.220000029,
1.129999995, 1.919999957, 2.039999962, 1.919999957, 5.329999924,
2.640000105, 5.099999905, 3.450000048, 5.090000153, 3.549999952,
2.450000048, 5.300000191, 1.75999999, 0.879999995, 2.660000086,
4.25, 2.410000086, 1.950000048, 4.690000057, 2.200000048, 4.309999943,
2.309999943, 9.460000038, 3.230000019, 2.349999905, 4.539999962,
1.659999967, 2.130000114, 6.010000229, 5.670000076, 4.880000114,
2.089999914, 4.96999979, 0.949999988, 4.429999828, 3.25999999,
5.230000019, 2.539999962, 2.25999999, 2.589999914, 2.559999943,
0.129999995), SedentaryActiveDistance = c(0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0.01, 0, 0, 0.01, 0.01, 0, 0, 0.02, 0, 0.01, 0,
0.02, 0.01, 0, 0, 0, 0.01, 0.02, 0, 0.02, 0, 0.02, 0.01, 0, 0,
0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0.01, 0.01, 0.01, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.02, 0.01, 0.02, 0.01, 0, 0.01,
0.01, 0.01, 0, 0, 0, 0, 0, 0, 0, 0), VeryActiveMinutes = c(25,
21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 41, 39, 73, 31, 78, 48,
16, 52, 33, 41, 50, 36, 45, 24, 37, 44, 46, 46, 36, 0, 0, 0,
0, 0, 0, 15, 17, 0, 0, 0, 0, 0, 16, 0, 0, 17, 0, 0, 11, 186,
7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 30, 5, 3, 51, 29, 15, 5,
0, 0, 0, 0, 0, 8, 11, 0, 3, 0, 9, 3, 1, 10, 0, 6, 11, 41, 0,
32, 12, 0, 0, 0, 0, 0, 0, 2, 0, 0), FairlyActiveMinutes = c(13,
19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21, 5, 14, 23, 11, 28,
12, 34, 35, 15, 24, 22, 24, 6, 46, 8, 11, 31, 23, 0, 0, 0, 0,
0, 0, 22, 7, 0, 0, 0, 0, 0, 18, 0, 0, 36, 5, 0, 23, 63, 6, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 51, 16, 58, 4, 42, 13, 33, 58, 0,
0, 0, 15, 0, 1, 41, 0, 53, 0, 71, 24, 7, 94, 0, 12, 6, 17, 0,
6, 19, 0, 0, 0, 0, 0, 0, 8, 12, 0), LightlyActiveMinutes = c(328,
217, 181, 209, 221, 164, 233, 264, 205, 211, 130, 262, 238, 216,
279, 243, 189, 243, 217, 246, 277, 254, 203, 250, 289, 175, 203,
206, 214, 251, 0, 146, 148, 236, 96, 176, 127, 202, 141, 151,
186, 199, 227, 185, 202, 140, 154, 115, 150, 224, 171, 166, 96,
118, 117, 102, 182, 152, 91, 139, 112, 107, 256, 135, 252, 170,
212, 186, 121, 278, 125, 38, 86, 160, 89, 94, 223, 118, 227,
120, 402, 146, 148, 221, 52, 81, 369, 243, 295, 303, 155, 49,
339, 248, 373, 176, 147, 199, 217, 10), SedentaryMinutes = c(728,
776, 1218, 726, 773, 539, 1149, 775, 818, 838, 1217, 732, 709,
814, 833, 1108, 782, 815, 712, 730, 798, 816, 1179, 857, 754,
833, 574, 835, 746, 669, 1440, 1294, 1292, 1204, 1344, 1264,
1276, 1214, 1299, 1289, 1254, 1241, 1213, 1221, 1238, 1300, 1233,
1320, 1290, 1182, 1020, 1261, 1344, 1322, 1323, 1338, 1258, 1288,
1349, 1301, 1328, 890, 1131, 1259, 1125, 1263, 1135, 1212, 1271,
1099, 1315, 1402, 1354, 1265, 1351, 1337, 1165, 1322, 1157, 1193,
816, 908, 682, 1115, 1388, 1341, 1054, 1139, 991, 1099, 1254,
713, 1101, 1192, 843, 527, 1293, 1231, 1211, 1430), Calories = c(1985,
1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786, 1775, 1827, 1949,
1788, 2013, 1970, 2159, 1898, 1837, 1947, 1820, 2004, 1990, 1819,
1959, 1896, 1821, 1740, 1819, 1859, 1783, 0, 1432, 1411, 1572,
1344, 1463, 1554, 1604, 1435, 1446, 1467, 1470, 1562, 1617, 1492,
1402, 1670, 1401, 1404, 1655, 2690, 1497, 1334, 1368, 1370, 1341,
1474, 1427, 1328, 1393, 1359, 1002, 3199, 2902, 3226, 2750, 3493,
3011, 2806, 3300, 2430, 2140, 2344, 2677, 2413, 2497, 3123, 2489,
3108, 2498, 3846, 2696, 2580, 3324, 2222, 2463, 3328, 3404, 2987,
3008, 2799, 1276, 2030, 1860, 2130, 1725, 1657, 1793, 1814, 1366
), ActivitiesDate = c("12/04/2016", "13/04/2016", "14/04/2016",
"15/04/2016", "16/04/2016", "17/04/2016", "18/04/2016", "19/04/2016",
"20/04/2016", "21/04/2016", "22/04/2016", "23/04/2016", "24/04/2016",
"25/04/2016", "26/04/2016", "27/04/2016", "28/04/2016", "29/04/2016",
"30/04/2016", "01/05/2016", "02/05/2016", "03/05/2016", "04/05/2016",
"05/05/2016", "06/05/2016", "07/05/2016", "08/05/2016", "09/05/2016",
"10/05/2016", "11/05/2016", "12/05/2016", "12/04/2016", "13/04/2016",
"14/04/2016", "15/04/2016", "16/04/2016", "17/04/2016", "18/04/2016",
"19/04/2016", "20/04/2016", "21/04/2016", "22/04/2016", "23/04/2016",
"24/04/2016", "25/04/2016", "26/04/2016", "27/04/2016", "28/04/2016",
"29/04/2016", "30/04/2016", "01/05/2016", "02/05/2016", "03/05/2016",
"04/05/2016", "05/05/2016", "06/05/2016", "07/05/2016", "08/05/2016",
"09/05/2016", "10/05/2016", "11/05/2016", "12/05/2016", "12/04/2016",
"13/04/2016", "14/04/2016", "15/04/2016", "16/04/2016", "17/04/2016",
"18/04/2016", "19/04/2016", "20/04/2016", "21/04/2016", "22/04/2016",
"23/04/2016", "24/04/2016", "25/04/2016", "26/04/2016", "27/04/2016",
"28/04/2016", "29/04/2016", "30/04/2016", "01/05/2016", "02/05/2016",
"03/05/2016", "04/05/2016", "05/05/2016", "06/05/2016", "07/05/2016",
"08/05/2016", "09/05/2016", "10/05/2016", "11/05/2016", "12/04/2016",
"13/04/2016", "14/04/2016", "15/04/2016", "16/04/2016", "17/04/2016",
"18/04/2016", "19/04/2016")), row.names = c(NA, -100L), class = c("tbl_df",
"tbl", "data.frame"))

Hi, thank you for responding. I am not sure I understand what you are saying. All the chunks are there, and the data shows in the Environment pane. See below; I also provided more in my reply to the first answer. You may check that too.

Cleaning and transforming data

read.csv("dailyactivity.csv")
read.csv("minutesleep.csv")

THIS CHUNK RUNS WITH NO ISSUE; I JUST WANTED TO TEST.

sum(duplicated(dailyactivity))
sum(duplicated(minutesleep))

When I knit, it says 'dailyactivity' not found. If I remove the first line, then it says 'minutesleep' not found, and so on. It means that I would have to remove everything, and then nothing would be left to knit.

You read the data but do not assign it to an object.

dailyactivity <- read.csv("dailyactivity.csv")
minutesleep <- read.csv("minutesleep.csv")

# next run the other code.
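The difference between reading a file and storing the result can be sketched with a small temporary file. The file contents and the name `activity_demo` below are made up for illustration; only `read.csv()`, `duplicated()` and the assignment pattern come from the thread.

```r
# Write a tiny CSV to a temporary location so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Id = c(1, 1, 2), TotalSteps = c(100, 100, 250)),
  csv_path,
  row.names = FALSE
)

# Without an assignment the data frame is read, printed and then discarded
read.csv(csv_path)

# With an assignment the data frame is kept under a name later chunks can use
activity_demo <- read.csv(csv_path)

# The first two rows are identical, so one row is flagged as a duplicate
sum(duplicated(activity_demo))
```

The same pattern applies to every file the document reads: each `read.csv()` call needs a `name <-` in front of it inside the Rmd itself.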

I think @M_AcostaCH has found the problem.

However, you should not install packages within an R Markdown document. Install them once from the console, outside the document.

Also

install.packages ("ggplot2")

is redundant as it is installed when you install *tidyverse*.

Likewise, these library calls are redundant, as the packages are all attached when you do *library(tidyverse)* (lubridate only from tidyverse 2.0.0 onward).

library(ggplot2)
library(lubridate)
library(dplyr)
library(tidyr)

It does not do any harm to make those calls but doing so just adds clutter.
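One way to keep installation out of the document is to have the setup chunk only check that the packages are available. This is a sketch under assumptions: the package list is copied from the thread, and the message text is illustrative.

```r
# Package names taken from the library() calls posted in the thread
needed <- c("tidyverse", "janitor", "reshape2", "SimDesign")

# requireNamespace() returns FALSE instead of failing when a package is absent
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

# Report what has to be installed from the console instead of installing here
if (length(missing_pkgs) > 0) {
  message(
    "Install from the console before knitting: ",
    paste(missing_pkgs, collapse = ", ")
  )
}
```

After that check, the `library()` calls can follow in the same chunk, and knitting never triggers a download.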

Thank you for the help. I am applying that to the other chunks. It went OK with two of them, but then the error showed up again at the following chunk, saying 'sleepday' not found.

sleepdayunique <- sleepday %>%
unique()
sum(duplicated(sleepdayunique))

Have you done

sleepday <- read.csv("sleepday.csv")

or otherwise defined sleepday?

Hi,
Thank you all; I managed to go a few steps further, but the knitting was halted again at this chunk.

dailyactivity %>%  
  select(TotalSteps, SedentaryMinutes, FairlyActiveMinutes, VeryActiveMinutes,
         Calories) %>%
  summary()

Quitting from lines at lines 136-140 [unnamed-chunk-12] (FitBit-Data-Analysis.Rmd)

Error in UseMethod():
! no applicable method for 'select' applied to an object of class "c('double', 'numeric')"
Backtrace:

  1. ... %>% summary()
  2. dplyr::select(...)
    Execution halted

What is the result of:

names(dailyactivity)
str(dailyactivity)

I do not know where the problem is but

dailyactivity %>%  
  select(TotalSteps, SedentaryMinutes, FairlyActiveMinutes, VeryActiveMinutes,
         Calories) %>%
  summary()

is working fine for me. I think you need to give us the rest of the .Rmd as you did before.

Hi again,
I have tried so many things, with no luck. At some point, it was saying that the columns don't exist in the 'dailyactivity' dataframe. The data is the same. Here is the code, starting from the first chunk after loading the packages.

Cleaning and transforming data

dailyactivity <- read.csv("dailyactivity.csv")
minutesleep <- read.csv("minutesleep.csv")
sum(duplicated(dailyactivity))
sum(duplicated(minutesleep))
dailyactivity <- colSums(is.na(dailyactivity))
minutesleep<- colSums(is.na(minutesleep))

There are no missing data in the 2 dataframes. However, because the minuteSleep_merged dataframe has 543 duplicates, it was dropped and replaced with sleepDay_merged dataframe with only 3 duplicates that have been removed, and the dataframe renamed to sleepDay_merged_unique.

sleepday   <- read.csv("sleepday.csv")
sleepdaymissingdata <- sum(duplicated("sleepday.csv"))
sleepdayunique <- sleepday %>% 
unique()
sum(duplicated(sleepdayunique))
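As a side note on the chunk above, `duplicated()` behaves differently on a file name than on the data read from that file. The sketch below uses a made-up three-row data frame, `sleep_demo`, to show the difference.

```r
# Made-up data: the first two rows are identical
sleep_demo <- data.frame(
  Id = c(1, 1, 2),
  TotalMinutesAsleep = c(300, 300, 410)
)

# A file name is a single string, so it can never contain a duplicate
count_from_name <- sum(duplicated("sleepday.csv"))

# Applied to the data frame, duplicated() compares whole rows
count_from_data <- sum(duplicated(sleep_demo))

# unique() drops the repeated row and keeps the first occurrence
sleep_unique <- unique(sleep_demo)

print(c(from_name = count_from_name, from_data = count_from_data))
print(nrow(sleep_unique))
```

So a duplicate count intended for the sleep data would be computed on `sleepday` itself rather than on `"sleepday.csv"`.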

The sleepdayunique dataframe date (sleepDay) has been transformed into date format to be the same as the other dataframes transformed date formats.

All dates and/or time in the 2 dataframes were formatted and changed to a uniform name: activitiesdate, and columns renamed as necessary.

names(sleepdayunique)[2] <- 'ActivitiesDate'
head(sleepdayunique)
nrow(dailyactivity)
nrow(sleepdayunique)
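The date conversion described above is not shown in the posted code. A minimal sketch, assuming the dates are day/month/year strings like the ActivitiesDate values in the dput() output, could look like this; `dates_demo` is an illustrative name.

```r
# Two sample values copied from the ActivitiesDate column in the dput() output
dates_demo <- data.frame(ActivitiesDate = c("12/04/2016", "13/04/2016"))

# "13/04/2016" can only be day/month/year, so the format is %d/%m/%Y
dates_demo$ActivitiesDate <- as.Date(dates_demo$ActivitiesDate, format = "%d/%m/%Y")

# The column is now of class Date and prints in ISO order
class(dates_demo$ActivitiesDate)
format(dates_demo$ActivitiesDate)
```

If the sleep data stores its dates in a different layout, the `format` string would have to be adjusted to match.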

Bias

After this data cleaning and transformation process, I assume that any possible bias in the data has been substantially reduced to a minimum.

Insufficient data

Because there is no time to collect new data, and extra data is not available, the analysis will be based on the existing data only.

Data analysis

Summary statistics about each dataframe

For the daily activity dataframe:

dailyactivity %>%
  select(TotalSteps, SedentaryMinutes, FairlyActiveMinutes, VeryActiveMinutes, Calories) %>%
  summary()

For the sleep day unique dataframe:

sleepdayunique %>%  
  select(TotalSleepRecords,
         TotalMinutesAsleep,
         TotalTimeInBed) %>%
  summary()

Plotting some exploration

Let's see what these summaries tell us about this sample of people's activities, starting with the relationship between total steps taken and calories used.

ggplot(data=dailyactivity) + 
  geom_smooth(mapping = aes(x=TotalSteps, y=Calories)) +
  labs(title = 'Total steps taken vs calories burned')
           

Hi, sorry I missed this yesterday. The results showed:
names(dailyactivity )
NULL

str(dailyactivity)
chr(0)
After that I imported the file again, but it is still telling me that 'dailyactivity' is not found.

I think it is telling you that *dailyactivity* exists but is empty: chr(0) is a zero-length character vector, not a data.frame or tibble.

What happens if you read it in again and do

str(dailyactivity)

My guess is that somehow you managed to wipe out the original dataset.
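One way this can happen, judging from the code posted earlier in the thread, is assigning the result of colSums(is.na(...)) back to the same name. The sketch below uses a made-up data frame and hypothetical names (`activity_kept`, `activity_lost`, `na_counts`) to show why that would produce a `select()` error on an object of class numeric.

```r
# Made-up data with one missing value
activity_kept <- data.frame(
  TotalSteps = c(13162, 10735, NA),
  Calories = c(1985, 1797, 1776)
)

# Safe: store the per-column NA counts under a separate name
na_counts <- colSums(is.na(activity_kept))
print(na_counts)
print(is.data.frame(activity_kept))

# Risky: assigning the counts back to the data frame's own name replaces it
activity_lost <- activity_kept
activity_lost <- colSums(is.na(activity_lost))
print(class(activity_lost))
print(is.data.frame(activity_lost))
```

After the second assignment the name holds a numeric vector of counts, so any later `select()`, `nrow()` or `ggplot()` call on it no longer sees the original columns.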

We really need to see ALL of your code.

I deleted the data and imported it again. Still, when knitting, at some point, it says some columns are not found. Earlier, I sent all the code until just after where it stops.

You write

We do not see this in your code. Therefore we are not seeing all of your code. We really need to see everything you are doing, not what you think may be significant.

Hi again
This is the code. I started this in base R first, but apparently some programs were missing when I wanted to knit. Then I moved to RStudio, and had to redo most of the work. I will send that dailyactivity dataset. But it has 940 rows, so I am not sure if I will be able to do so.

sum(duplicated(dailyActivity_merged))
0
sum(duplicated(minuteSleep_merged))
543

Bouba_Ismaila wrote:

"I will send that dailyactivity dataset. But it has 940 rows, so I am not sure if I will be able to do so."

We probably do not need that much data. For a dataset this large, something like dput(head(mydata, 100)), where mydata is your data, should supply what we need. Copy the output and paste it here between
```

```
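For reference, here is a small sketch of what dput() produces and how a reader can turn the pasted text back into a data frame. The two-row `mydata` is made up, using Id and TotalSteps values that appear in the thread.

```r
# Made-up stand-in for the real dataset
mydata <- data.frame(
  Id = c(1503960366, 1624580081),
  TotalSteps = c(13162, 8163)
)

# dput() prints R code that recreates the object; capture it as text
dput_text <- capture.output(dput(head(mydata, 100)))
cat(dput_text, sep = "\n")

# Anyone who receives that text can evaluate it to rebuild the data
rebuilt <- eval(parse(text = paste(dput_text, collapse = "\n")))
str(rebuilt)
```

Because head() never returns more rows than exist, the same call works unchanged on both small and large datasets.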

What we really need is ALL of your code.

Copy the code and paste it here between
```

```