Multiple values for one attribute


I have a fairly simple problem that I would appreciate some help with, since I'm new to R. I'm working with a dataset that includes, among other things, experimental studies on a topic and the countries in which these experiments were conducted. I want to create a contingency table with the absolute frequencies of the countries, to show where this issue has been studied the most. The problem is that some studies span two or three countries, and I am not sure how to represent this in the dataset that my code will load.

Any help on how I should organize my data and my code would be much appreciated. Thank you!

I would treat the multi-country studies as a “virtual” country unless I had confidence that the data were proportional to population. Also, do you know about table() for building contingency tables?
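A minimal sketch of the “virtual” country idea, assuming a hypothetical data frame `studies` (the names and values are made up for illustration) with one row per study and each country combination stored as a single label. `table()` then counts every distinct label, including a combined one, as its own category:

```r
# Hypothetical example data: one row per study, a combination is kept as one label
studies <- data.frame(
  study   = c("Study 1", "Study 2", "Study 3", "Study 4"),
  country = c("Germany", "USA, Canada", "France", "Germany"),
  stringsAsFactors = FALSE
)

# Each distinct label, including "USA, Canada", is counted as its own category
combo_counts <- table(studies$country)
print(combo_counts)
```

With this layout the individual countries inside a combination are not counted separately, so it suits cases where a multi-country study should stay a single unit.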

Yes, I know how to use the table() function. I'm just not sure how to organize my data so that studies with two or more countries are captured as categorical variables. Is there a way to create an attribute in a data frame that accepts more than one value, or should I just manually create extra rows for the "two-country studies"?
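One possible way to let a single cell hold several values is a list column. The sketch below assumes a hypothetical data frame `studies` with made-up values; it is an illustration of the idea, not the only way to do it:

```r
# Hypothetical data: the country column is a list, so one cell can hold several values
studies <- data.frame(study = c("Study 1", "Study 2", "Study 3"))
studies$country <- list("Germany", c("USA", "Canada"), "Germany")

# Flatten the list column and count how many studies mention each country
country_counts <- table(unlist(studies$country))
print(country_counts)
```

Here a study conducted in two countries contributes one count to each of them, so the counts can add up to more than the number of studies.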

Hi @anton_13, can you share your code?

@Techzill I haven't written any code yet, but you can imagine a dataset that looks like this:

Study 1 -- Germany
Study 2 -- USA, Canada
Study 3 -- France
Study 4 -- Portugal, Poland, Italy

and so on.

The question is how to organize the data to create a contingency table with the absolute frequencies of all the countries in the dataset, or any other visualization showing where these studies have been carried out.
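A minimal base-R sketch of one way to do this, assuming the multi-country studies are entered as comma-separated text in a single column (the data frame and column names are hypothetical). The strings are split into one row per study-country pair, and `table()` is then applied to that long format:

```r
# The example dataset from above, with multi-country studies as comma-separated text
studies <- data.frame(
  study   = c("Study 1", "Study 2", "Study 3", "Study 4"),
  country = c("Germany", "USA, Canada", "France", "Portugal, Poland, Italy"),
  stringsAsFactors = FALSE
)

# Split each string on commas (and any surrounding spaces) into a list of countries
country_list <- strsplit(studies$country, "\\s*,\\s*")

# Long format: one row per study-country pair
long <- data.frame(
  study   = rep(studies$study, lengths(country_list)),
  country = unlist(country_list),
  stringsAsFactors = FALSE
)

# Absolute frequency of each country, sorted from most to least studied
country_counts <- sort(table(long$country), decreasing = TRUE)
print(country_counts)

# Optional: bar chart of the counts
barplot(country_counts, las = 2, ylab = "Number of studies")
```

Keeping the `study` column in the long table means the same data can also be used to count studies per country combination or to join back to other study attributes.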
