Quick background: I analysed data extensively with SAS and SPSS in the past, and a couple of years ago I followed a higher-education course that used R. It took some getting used to, but was not that hard. I haven't used it since.
To get back into R I thought I'd try to set up a simple dataset and analysis, with some data management to aggregate over time.
I want to use the daily visitor numbers per OS from the analytics.usa.gov site:
library(readr)
os <- read_csv("https://analytics.usa.gov/data/live/os.csv")
The aggregation should happen by occasionally importing the most recent version, taking the union of the two tables, and appending the date/OS combinations that do not yet occur in the current table.
I'd also need to check whether any new OS names have started being included, and to write monitoring info to a file.
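To make the idea concrete, here is a sketch of the incremental import I have in mind. It assumes (I haven't verified the live file) that the CSV has columns named date, os, and visits, and that the accumulated history is kept in a local file os_history.csv:

```r
library(readr)
library(dplyr)

old <- read_csv("os_history.csv")                                # stored history
new <- read_csv("https://analytics.usa.gov/data/live/os.csv")    # fresh download

# keep only rows whose date/os combination is not yet in the stored table
additions <- anti_join(new, old, by = c("date", "os"))
updated   <- bind_rows(old, additions)
write_csv(updated, "os_history.csv")

# flag OS names that appear for the first time, and append a log line
new_oses <- setdiff(unique(new$os), unique(old$os))
log_line <- sprintf("%s: added %d rows, %d new OS name(s): %s",
                    format(Sys.time()), nrow(additions), length(new_oses),
                    paste(new_oses, collapse = ", "))
cat(log_line, "\n", file = "import_log.txt", append = TRUE)
```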
For the analysis I'd use a categorisation table mapping the OS names to mobile, desktop, gaming-console, or other.
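Something like this lookup-table join is what I mean (the OS names below are just placeholders; the real ones would come from the data, and I'm assuming the history sits in a data frame os_history with an os column):

```r
library(dplyr)

# hypothetical lookup table of OS name -> category
os_categories <- data.frame(
  os       = c("Windows", "Android", "iOS", "Chrome OS", "Linux", "Playstation 4"),
  category = c("desktop", "mobile", "mobile", "desktop", "desktop", "gaming-console")
)

categorised <- os_history %>%
  left_join(os_categories, by = "os") %>%
  mutate(category = coalesce(category, "other"))  # anything unmatched becomes "other"
```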
In the resulting tables or dashboards I'd want the monthly and yearly sums, percentages, etc.
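For the monthly sums and shares, I imagine something like the following, assuming a categorised data frame with date (parsed as a Date), category, and visits columns (again, the column names are my assumption):

```r
library(dplyr)
library(lubridate)

monthly <- categorised %>%
  mutate(month = floor_date(date, "month")) %>%
  group_by(month, category) %>%
  summarise(visits = sum(visits), .groups = "drop") %>%
  group_by(month) %>%
  mutate(pct = 100 * visits / sum(visits)) %>%  # share of that month's total
  ungroup()
```

Yearly sums would be the same with floor_date(date, "year").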
My questions are:
-
I read that data.table is probably best for its SQL-like abilities. But should the data stay there? I'll make separate backups, of course, but I wonder if there is advice on the best way to store the data: base data frames? data.tables?
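For context, the SQL-like ability I mean is data.table's grouping syntax, which I understand works roughly like this (on a hypothetical os_history table with os and visits columns):

```r
library(data.table)

os_dt <- as.data.table(os_history)

# roughly "SELECT os, sum(visits) ... GROUP BY os" in data.table syntax
totals <- os_dt[, .(visits = sum(visits)), by = os]
```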
-
In SAS I could just use a format (showing only the month/year) to allow summarising per month/year, but in SPSS I would need to recode. How are reports grouped by month/year handled in R reporting? Do I need to recode (perhaps with a cast in SQL)?
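My current guess is that in R you derive the grouping column on the fly, which feels closer to an SPSS recode than a SAS format, but without changing the stored data. A sketch, assuming os_history has date (as a Date) and visits columns:

```r
library(dplyr)

per_month <- os_history %>%
  mutate(month = format(date, "%Y-%m")) %>%  # derived column, e.g. "2024-03"
  group_by(month) %>%
  summarise(visits = sum(visits))
```

Is that the idiomatic approach, or is there something more format-like?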
-
I'd probably want a couple of clickable summaries on a web page. This is new to me, but it was made to sound really easy.
Is Shiny the best way to start looking?
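From the examples I've seen, a minimal Shiny app for this would look something like the sketch below. It assumes a pre-computed summary data frame monthly with month, category, and visits columns already exists in the session (my naming, not anything standard):

```r
library(shiny)

ui <- fluidPage(
  selectInput("cat", "Category",
              choices = c("mobile", "desktop", "gaming-console", "other")),
  plotOutput("trend")
)

server <- function(input, output) {
  output$trend <- renderPlot({
    d <- subset(monthly, category == input$cat)
    plot(d$month, d$visits, type = "l", xlab = "Month", ylab = "Visits")
  })
}

shinyApp(ui, server)
```

Is that the right starting point, or is there something simpler for static-ish dashboards?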
I know it is a simple example, but I hope it will prove a simple way to get these basics done. I am also interested in how well Linux and Chromebooks are doing.
I considered using Google Sheets, but you cannot really program and do data management with Sheets, IMHO.
Thanks for your help.
Y.