calculating the distance between two centroids #404
Comments
These two things show slightly different quantities, and the details can be important. Using the following as an example to illustrate:
library(vegan)
data(varespec)
## Bray-Curtis distances between samples
dis <- vegdist(varespec)
## First 16 sites grazed, remaining 8 sites ungrazed
groups <- factor(c(rep(1,16), rep(2,8)), labels = c("grazed","ungrazed"))
## Calculate multivariate dispersions
mod <- betadisper(dis, groups)
The printed output from
> mod
Homogeneity of multivariate dispersions
Call: betadisper(d = dis, group = groups)
No. of Positive Eigenvalues: 15
No. of Negative Eigenvalues: 8
Average distance to median:
grazed ungrazed
0.3926 0.2706
Eigenvalues for PCoA axes:
(Showing 8 of 23 eigenvalues)
PCoA1 PCoA2 PCoA3 PCoA4 PCoA5 PCoA6 PCoA7 PCoA8
1.7552 1.1334 0.4429 0.3698 0.2454 0.1961 0.1751 0.1284
If you look at the boxplots of these distances by group
boxplot(mod)
you'll see that the one for
The same thing applies in your example too. The output from
> TukeyHSD(mod)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = distances ~ group, data = df)
$group
diff lwr upr p adj
ungrazed-grazed -0.1219422 -0.2396552 -0.004229243 0.0429502
where the diff value is the difference between the groups' mean distances to their centres:
> with(mod, tapply(distances, group, mean))
grazed ungrazed
0.3925879 0.2706457
> diff(with(mod, tapply(distances, group, mean)))
ungrazed
-0.1219422
The first line of code above gives the average (sample mean) distance to the group centre for each group, while the second line computes the difference of these two sample means, which is the sample estimate of the difference of means and is what TukeyHSD() reports.
If you want to compute the distance between the centres then
> with(mod, dist(centroids))
grazed
ungrazed 0.5155369
So the above is the Euclidean distance between the group centres, but it does not account for axes with negative eigenvalues, so it is wrong. You'd need to compute the squared Euclidean distance between centroids for the axes with positive eigenvalues separately from the squared distance for the axes with negative eigenvalues and then do:
sqrt(dist.pos - dist.neg)
and also worry about situations where dist.neg > dist.pos.
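A minimal sketch of the sqrt(dist.pos - dist.neg) calculation, using the same varespec example. It assumes that the eigenvalues in mod$eig line up column-for-column with mod$centroids (the code checks the lengths match, but the ordering itself is an assumption), and the names pos, delta and centre.dist are only illustrative. Treat it as a starting point rather than a vetted method.

```r
library(vegan)

data(varespec)
dis <- vegdist(varespec)   # Bray-Curtis distances between samples
groups <- factor(c(rep(1, 16), rep(2, 8)), labels = c("grazed", "ungrazed"))
mod <- betadisper(dis, groups)

## Assumption: mod$eig and the columns of mod$centroids are in the same order
stopifnot(length(mod$eig) == ncol(mod$centroids))
pos <- mod$eig > 0         # TRUE for axes with positive eigenvalues

## Difference between the two group centres on every PCoA axis
delta <- mod$centroids["grazed", ] - mod$centroids["ungrazed", ]

dist.pos <- sum(delta[pos]^2)    # squared distance on positive-eigenvalue axes
dist.neg <- sum(delta[!pos]^2)   # squared distance on negative-eigenvalue axes

## No real square root exists when dist.neg > dist.pos, so return NA there
centre.dist <- if (dist.pos >= dist.neg) sqrt(dist.pos - dist.neg) else NA_real_
centre.dist
```

Note that sqrt(dist.pos + dist.neg) reproduces the plain dist(mod$centroids) value, so the corrected distance can never be larger than the Euclidean one. When dist.neg exceeds dist.pos the sketch returns NA instead of guessing how to handle the negative difference.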
Thanks a lot for the feedback: it was really useful! Just a quick follow-up: when you say "and also worry about situations where dist.neg > dist.pos", does that mean that in those cases I should just invert the order and that's all (I mean dist.neg - dist.pos)? Thank you again for all the effort you put into answering these questions! It really helps me understand better what happens "behind the scenes" in vegan!
Hi, I want to measure the difference between the spatial medians of two groups. The data come from Bray distances between samples. Some samples were under control conditions and others under a drought treatment. I found that betadisper followed by TukeyHSD does the trick. However, when I look at the medians in the boxplots and compare them by eye, they do not seem to correspond to the value provided by TukeyHSD. As an example:
ejemplo<-OTU_abundance6[c(1:10),]
y<-vegdist(ejemplo[,c(1:2113)],method = "bray")
rw<-betadisper(y,ejemplo$Treatment)
If I do boxplot
boxplot(rw)
I get:
Now calculating TukeyHSD;
TukeyHSD(rw)
It returns a negative difference between the spatial medians (i.e. diff):
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = distances ~ group, data = df)
I thought I might be able to calculate by myself the difference between the medians by doing this:
dist(rw$centroids)
which returns a different distance:
Control
Drought 0.1620314
So, in summary, I want to be really sure what the real difference is between the spatial medians of the two groups (i.e. drought vs control).