After gathering all the data needed for the 12 months of the year, i.e. 'Frequent Monthly Routes' I am looking to consolidate all this data into one table that depicts the 'Annual Top Routes' that contains all the same data points as the 'Frequent Monthly Routes' for both members and casual users, depicting the annual totals. Currently, all the monthly data (for both members and casuals) are in separate Excel Workbooks organized by month and are displayed in Pivot Tables. What I thought to do, in considering the best method in extracting the top annual routes by users, is to create a separate workbook that organizes the data by user status in two separate sheets i.e. casual user data in one sheet and member user data in another. The data in both respective sheets will be organized by month. What I need to do is simply extract the totals listed throughout the months of the year to solve for the top annual routes. Let me share a snapshot of what I began to create but not quite finished:
Only the months for December, January is currently there. As you can see this sheet shows only the casual user data and is listed by month. The data displayed are the monthly (Dec & Jan) totals for frequented routes. I need to find any matching route_names and tally up the totals to create my annual dataset. For example, as you can see the route 'Michigan Ave & 18th St to Michigan Ave & 18th St' is repeated between Jan & Dec. I would like to figure out a way to implement a count to tally up all matching route_names and their respective data and be able to implement this across all 12 months once listed in the sheet.
If you don't understand what it is I am trying to accomplish I have included a short Loom video explaining what I am trying to accomplish.
Would it make more sense to upload this data into an SQL server to implement code to get totals or how would I implement this with R?
As I am reading this , there are 12 sheets, each sheet has the figures for members and casual riders. Each has the same layout , that is, the same number of columns with the same column names but possibly more or fewer rides ond riders per month. Is this correct?
In the screen shot are you showing us the pivot.tables rather than the raw data? It would probably be best for us to have some idea of the layout of the raw data.
It should be relatively easy to import data from a few individual sheets, combine them and do some number-crunching on the data in R once we have a good idea of the raw data layout. R has a number of packages to read Excel files and the analyses look straight-forward
Actually, as mentioned above. I am consolidating all of the monthly data into two separate sheets - one representing casual users - the other representing member users. Each of those respective sheets has monthly data throughout the year. The data is not in pivot tables, the original data was. The current data are normal values. As mentioned, I am looking to get the totals of all the monthly routes to create the top annual routes. The video should explain in a pretty thorough manner what I am trying to accomplish. The image above shows Dec & Jan in the Casual users sheet. I would like to write code that would return the totals of all the routes.
Okay, I still suspect a language problem between my raw and your normal but do you want to consolidate all the data into two data sets using R or just do it manually in Excel and then do the analysis in R.
Given the amount of data it's probably about the same amount of work either way. If the data is not confidential, would it be possible to post a file somewhere, perhaps at dropbox, for us to have a look at it? I may be worrying over nothing but I am more concerned about leading the data than the analysis which looks pretty straightforward