Hello.
I am trying to plot a linear discriminant analysis (LDA) of my data with ggplot2. I have done this without issues in the past. This time, however, the plotted data appear 'inverted': points that should fall below zero on the y-axis (that is, below the regression line, which I plotted separately beforehand as a frame of reference) appear above it, and vice versa.
My (modified) code is as follows.
First, the initial regression plot, which gives an idea of which points lie above and below the regression line. I include it for completeness, in case I made an error in this part of the code.
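##Load packages: ggplot2 for plotting, ggpubr for stat_regline_equation() and stat_cor(), MASS for lda()
library(ggplot2)
library(ggpubr)
library(MASS)
##Create dataframe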
Size<-c(6,6,6,8,8,8,10,10,10,12,12,12,15,15,15,6,6,8,8,8,10,10,10,12,12,12,15,15,15,6,6,6,8,10,10,10,12,12,12,15,15,6,8,8,8,10,10,10,12,12,15,15)
Category<-c("ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassIII", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassI", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV", "ClassIV")
H<-c(0.4597714,0.3384975,0.2438867,0.5773447,0.5424548,0.5225763,0.5773447,0.5424548,0.5225763,0.6188187,0.5979812,0.5321799,0.6028551,0.4706633,0.4867061,0.3674625,0.3430894,0.3102022,0.4380490,0.4037123,0.3904491,0.3952290,0.3964599,0.5618259,0.5479117,0.6004870,0.5838193,0.5983880,0.5864260,0.6313169,0.5161577,0.5822030,0.6525793,0.4346467,0.4190352,0.4248726,0.5149471,0.5433182,0.4797744,0.5149471,0.5433182,0.3071416,0.3227957,0.5113163,0.5167215,0.3055734,0.2595054,0.2697147,0.1945752,0.1844296,0.4543830,0.4506419)
D<-c(17.060473,17.247823,17.487762,14.783000,13.305876,11.955035,15.569631,16.330392,15.297604,13.801903,13.316480,12.114558,14.744418,16.776991,14.128221,42.428042,40.711409,45.048931,44.613229,34.386670,23.555482,24.578951,22.834340,16.106533,19.230402,18.609950,25.945419,17.957438,24.540131,9.217218,8.346780,8.350304,8.931497,7.871861,7.627603,8.483040,8.952785,7.902581,4.846481,9.441160,9.461342,34.636275,33.427111,36.670034,19.104717,34.539788,44.268683,38.370184,31.623433,33.561326,45.195551,27.661643)
data<-data.frame(Size,Category,H,D)
print(data)
##Create Regression Plot
RegressionPlot <- ggplot(data, aes(x = D, y = H)) +
  geom_point(aes(color = Size, shape = Category), size = 4) +  # columns referenced by name, not data$...
  scale_color_gradient(breaks = c(6, 8, 10, 12, 15), low = "blue1", high = "red1") +
  xlab("D") + ylab("H") +
  theme_classic() + theme(legend.position = "none") +
  geom_smooth(method = "lm", formula = y ~ x) +
  stat_regline_equation(label.x = 30, label.y = .5) +  # regression equation label (ggpubr)
  stat_cor(label.x = 30, label.y = .4)                 # correlation label (ggpubr)
RegressionPlot
For the LDA plot, where I believe the error most likely lies:
varsDH <- cbind(data$H, data$D)
post_hocDH <- lda(data$Category ~ varsDH, CV = FALSE)
##Combine the LDA scores (columns lda.LD1, lda.LD2) with the variables used for plotting
plot_ldaDHbyCategory <- data.frame(H = data$H, Size = data$Size, Category = data$Category, lda = predict(post_hocDH)$x)
ggplot(plot_ldaDHbyCategory) +
  geom_point(aes(x = lda.LD1, y = lda.LD2, color = Size, shape = Category), size = 4) +
  theme_classic() +
  scale_color_gradient(breaks = c(6, 8, 10, 12, 15), low = "blue1", high = "red1") +
  xlab("D/H ratio") + ylab("Deviation from regression line") +
  theme(legend.position = "none")
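One way to check the direction of the axis is to compare the LD2 scores with the signed residuals of the regression. Below is a minimal sketch of that check. It uses a small made-up data frame (`toy`, with hypothetical values and three made-up classes) rather than my real measurements. It does not assume that LD2 equals the residual, only that the two can be correlated. Since the sign of a discriminant axis is arbitrary, a negative correlation would suggest that LD2 is simply flipped relative to the residuals.

```r
library(MASS)  # for lda()

# Toy stand-in data (hypothetical values, NOT the measurements above)
set.seed(1)
toy <- data.frame(
  Category = rep(c("A", "B", "C"), each = 10),
  D = c(rnorm(10, 10, 2), rnorm(10, 20, 2), rnorm(10, 30, 2))
)
toy$H <- 0.6 - 0.005 * toy$D + rnorm(30, 0, 0.05)

# Signed deviation from the regression line: positive means above the line
fit <- lm(H ~ D, data = toy)
toy$resid <- residuals(fit)

# LDA scores; three classes and two predictors give two axes, LD1 and LD2
lda_fit <- lda(Category ~ H + D, data = toy)
scores <- predict(lda_fit)$x

# Correlation between LD2 and the residuals
r <- cor(scores[, "LD2"], toy$resid)

# If the correlation is negative, flip LD2 so that "up" matches "above the line"
ld2_aligned <- if (r < 0) -scores[, "LD2"] else scores[, "LD2"]
print(r)
```

With my real data, the same check would use `data` in place of `toy`, and `ld2_aligned` (a name I made up for this sketch) would replace `lda.LD2` on the y-axis.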
I would like to know where I am going wrong and how to fix the inverted deviations from 0 in my LDA plot: points that should deviate negatively appear as positive deviations, and vice versa.
Thank you.