I have a fairly simple problem that I would appreciate some help with, since I'm new to R. I'm working with a dataset that includes, among other things, experimental studies on a topic and the countries in which those experiments were conducted. I want to create a contingency table with the absolute frequencies of the countries, to show where this issue has been studied the most. The problem is that some studies span two or three countries, and I am not sure how to represent this in the dataset that my code will load.
Any help on how to organize my data and my code would be greatly appreciated. Thank you!
I would treat the multi-country studies as a “virtual” country unless I was confident that the data were proportional to population. Also, do you know about table() for building contingency tables?
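As a rough sketch of the "virtual country" idea, one option is to collapse every multi-country study into a single extra category before calling table(). The data frame, its column names, and the label "Multiple countries" are made up for illustration, and the sketch assumes countries within a study are separated by commas.

```r
# Made-up study data; names and values are illustrative assumptions
studies <- data.frame(
  study = c("Study 1", "Study 2", "Study 3", "Study 4"),
  countries = c("Germany", "USA, Canada", "France", "Portugal, Poland, Italy"),
  stringsAsFactors = FALSE
)

# A study counts as multi-country if its entry contains a comma
is_multi <- grepl(",", studies$countries, fixed = TRUE)

# Collapse all multi-country studies into one "virtual" category
studies$country_group <- ifelse(is_multi, "Multiple countries", studies$countries)

# One count per study, so the totals add up to the number of studies
group_counts <- table(studies$country_group)
print(group_counts)
```

With this layout each study is counted exactly once, at the cost of hiding which countries took part in the multi-country studies.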
Yes, I know how to use the table() function; I'm just not sure how to organize my data so that studies with two or more countries are captured as a categorical variable. Is there a way to create a column in a data frame that accepts more than one value, or should I just manually create extra rows for the "two-country studies"?
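Both options mentioned here are possible in base R, and one can be derived from the other. The following is only a sketch with invented data: a list column holds several values per row, and the "extra rows" (long) layout is generated from it rather than typed by hand. All object and column names are assumptions.

```r
# Invented example data; object and column names are assumptions
studies <- data.frame(
  study = c("Study 1", "Study 2", "Study 3"),
  stringsAsFactors = FALSE
)

# A list column: each row holds a character vector of any length
studies$countries <- list("Germany", c("USA", "Canada"), "France")

# Long format: one row per (study, country) pair, built from the list column
studies_long <- data.frame(
  study = rep(studies$study, lengths(studies$countries)),
  country = unlist(studies$countries),
  stringsAsFactors = FALSE
)

# In the long format, country is an ordinary categorical variable
country_counts <- table(studies_long$country)
print(country_counts)
```

The long format is usually the easier one to work with, since table() and most plotting functions expect one value per cell.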
@Techzill I haven't written my code yet, but you can imagine a dataset that looks like this:
Study 1 -- Germany
Study 2 -- USA, Canada
Study 3 -- France
Study 4 -- Portugal, Poland, Italy
and so on.
So the question is how to organize the data so that I can create a contingency table with the absolute frequencies of all the countries in the dataset, or any other visualization showing where these studies have been carried out.
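For a dataset shaped like the one above, one possible approach is to keep the countries as a single comma-separated string per study and split it just before counting. This is a minimal sketch, assuming the data has already been read into a data frame with the hypothetical columns study and countries, and that a comma is the only separator used.

```r
# The example dataset from this thread, typed in as a data frame
# (column names are assumptions)
studies <- data.frame(
  study = paste("Study", 1:4),
  countries = c("Germany", "USA, Canada", "France", "Portugal, Poland, Italy"),
  stringsAsFactors = FALSE
)

# Split each entry on commas; the result is a list with one vector per study
country_list <- strsplit(studies$countries, ",", fixed = TRUE)

# Flatten to a single vector and strip the spaces left after the commas
all_countries <- trimws(unlist(country_list))

# Absolute frequencies per country, most frequent first
country_freq <- sort(table(all_countries), decreasing = TRUE)
print(country_freq)
```

Here a study that spans three countries contributes one count to each of them, so the totals can exceed the number of studies. Passing the result to `barplot(country_freq, las = 2)` should give a simple bar chart of the same frequencies.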