Hi all,
I'm struggling to analyse a data set that is predominantly qualitative, categorical data. My data are organised as follows: each entry records the presence of one of three mosquito species (yes/no or 1/0), together with the season, year, month, locality, collection method and sex associated with that entry. I don't know if that makes sense, so I've included some sample data below that will hopefully clarify. I have tried making some of the data quantitative by converting it to count data.
Original collection date | Month | Year | Season | Locality | An. merus | An. arabiensis | An. quadriannulatus | Identification | Count | Sex | Collection method |
---|---|---|---|---|---|---|---|---|---|---|---|
27/10/2009 | October | 2009 | Spring | Block A | 1 | 0 | 0 | An. merus | 1 | Male | Larvae |
27/10/2009 | October | 2009 | Spring | Block A | 1 | 0 | 0 | An. merus | 1 | Male | Larvae |
30/04/2015 | April | 2015 | Autumn | Block A | 0 | 0 | 1 | An. quadriannulatus | 1 | Male | Outdoor pot/bucket |
30/04/2015 | April | 2015 | Autumn | Block A | 0 | 0 | 1 | An. quadriannulatus | 1 | Male | Outdoor pot/bucket |
16/03/2016 | March | 2016 | Autumn | Vlakbult | 0 | 1 | 0 | An. arabiensis | 1 | Female | CO₂ tent |
16/03/2016 | March | 2016 | Autumn | Vlakbult | 0 | 1 | 0 | An. arabiensis | 1 | Female | CO₂ tent |
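
A minimal sketch of what converting to count data could look like, using only the Python standard library. It assumes that each row in the table is a single specimen (Count = 1), so that identical rows can be collapsed into one row with a count; the tuple layout and variable names are illustrative, not the exact steps I used.

```python
from collections import Counter

# Sample rows copied from the table above, one tuple per row:
# (month, year, season, locality, species, sex, collection method)
rows = [
    ("October", 2009, "Spring", "Block A", "An. merus", "Male", "Larvae"),
    ("October", 2009, "Spring", "Block A", "An. merus", "Male", "Larvae"),
    ("April", 2015, "Autumn", "Block A", "An. quadriannulatus", "Male", "Outdoor pot/bucket"),
    ("April", 2015, "Autumn", "Block A", "An. quadriannulatus", "Male", "Outdoor pot/bucket"),
    ("March", 2016, "Autumn", "Vlakbult", "An. arabiensis", "Female", "CO₂ tent"),
    ("March", 2016, "Autumn", "Vlakbult", "An. arabiensis", "Female", "CO₂ tent"),
]

# Assumption: every row is one specimen, so counting identical rows gives
# the number of specimens for each combination of factors.
counts = Counter(rows)

for (month, year, season, locality, species, sex, method), n in counts.items():
    print(species, season, year, locality, sex, n)
```

With the six sample rows this collapses to three combinations of two specimens each.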
Because of the nature of the data, parametric tests aren't appropriate, and a non-parametric test such as Kruskal-Wallis also doesn't work. A friend suggested I try correlation tests, so I tried Spearman's rank correlation and, additionally, Wilcoxon signed-rank tests. I got some statistical results from that, but I'm worried that the assumptions for the tests are not met. According to the Shapiro-Wilk and Levene's tests, my data are neither normally distributed nor homogeneous in variance. I would like to do multivariable, multiple-comparison tests if possible. The Wilcoxon signed-rank test gave me a significant p-value, but I don't know which post-hoc test should follow it.
My research questions are as follows:
- Is there a significant difference in species abundance across seasons? I.e. species 1's abundance in Summer, Winter, Spring and Autumn vs species 2 and species 3.
- Is there a significant difference in species abundance across years? I.e. species 1's abundance in Year 1 etc. vs species 2 and species 3.
- Is there a significant difference in species abundance between collection methods?
- Is there a significant difference in species abundance between locations?
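
To make the questions above concrete, each one can be framed as a table of specimen counts per species within each level of one factor (season, year, collection method or locality). The sketch below builds those tables from the sample rows using only the Python standard library; the `cross_tab` helper and the dictionary keys are hypothetical names chosen for illustration, and it only tabulates counts without running any test.

```python
from collections import Counter, defaultdict

# The three distinct sample rows from the table in this post.
unique_rows = [
    {"species": "An. merus", "season": "Spring", "year": 2009,
     "locality": "Block A", "method": "Larvae"},
    {"species": "An. quadriannulatus", "season": "Autumn", "year": 2015,
     "locality": "Block A", "method": "Outdoor pot/bucket"},
    {"species": "An. arabiensis", "season": "Autumn", "year": 2016,
     "locality": "Vlakbult", "method": "CO₂ tent"},
]
# Each sample row appears twice in the table; assumption: one row = one specimen.
records = [row for row in unique_rows for _ in range(2)]


def cross_tab(records, factor):
    """Count specimens per species within each level of `factor`."""
    table = defaultdict(Counter)
    for rec in records:
        table[rec[factor]][rec["species"]] += 1
    return {level: dict(species_counts) for level, species_counts in table.items()}


# One species-by-factor table per research question.
tables = {
    factor: cross_tab(records, factor)
    for factor in ("season", "year", "method", "locality")
}

for level, species_counts in tables["season"].items():
    print(level, species_counts)
```

Tables of this shape are the usual starting point for count-based comparisons, but which test is appropriate for them is exactly what I'm unsure about.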
Any advice on how to analyse the data would be greatly appreciated!