In this activity, we will be analysing data from an experiment in Drosophila serrata conducted by Jacqueline. She set up a paternal half-sibling breeding design to estimate genetic variance in the male cuticular hydrocarbon (CHC) pheromone profile, and conducted a binomial fitness assay on these same individuals to measure their mating success. D. serrata has 8 well-studied CHCs that are known to be targets of sexual selection via female choice, and to have an important ecological role in preventing desiccation. We have measured these traits of interest in all offspring from the breeding design, and we also assayed the mating success of these individuals. Our goal is to better understand the genetic basis of these traits and whether they are under selection. In this activity we will analyse a subset of 4 CHC traits in order to estimate the genetic variance in, and covariance between, them (i.e. the G matrix), and to estimate directional and non-linear selection acting on these traits.
The original data were published in Sztepanacz, J.L. and Rundle, H.D., 2012. Reduced genetic variance among high fitness individuals: inferring stabilizing selection on male sexual displays in Drosophila serrata. Evolution, 66(10), pp.3101-3110 and the full data set is available on Dryad.
If you have not done so already, open this .Rmd file in RStudio to take advantage of the RMarkdown formatting.
First, set up your R environment by setting your working directory and loading packages.
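A minimal setup sketch follows. The package names are assumptions based on the functions used later in this activity; `fix_ped()`, `draw_ped()`, and `draw_pedA()` come from course-provided helper code, so source whatever file your instructor supplied (the file name below is a placeholder).

```r
# Set the working directory to wherever Data.csv and fullped.csv live
# setwd("path/to/your/working/directory")

library(MCMCglmm)  # animal models via MCMC (also loads coda)
library(dplyr)     # filter(), bind_rows()
library(ggplot2)   # plotting
# source("helper_functions.R")  # assumed file providing fix_ped(), draw_ped(), draw_pedA()
```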
Next, load the data. We have one file of data, contained in `Data.csv`, and one file with the pedigree, contained in `fullped.csv`. Load these data into R.
data <- read.csv("Data.csv", header = T)
ped <- read.csv("fullped.csv", header = T)
Before using the data, inspect the structure of the data in order to ensure that the classes of all variables are correct. Use this code block to convert variables to factors as needed. The data contains the following variables:
head(data)
## Sire Dam animal Column Block Chosen.Rejected Vial l2 l3
## 1 1 81 321 F 1 1 81 1.759736 0.3584508
## 2 1 81 322 F 1 0 81 1.755448 0.4241581
## 3 1 81 323 F 1 0 81 1.718102 0.3642616
## 4 1 81 324 F 1 1 81 1.728217 0.4719437
## 5 1 81 325 F 1 0 81 1.744923 0.4649696
## 6 1 81 326 B 1 1 81 1.728600 0.4677961
## l4 l5
## 1 0.2405727 0.5794495
## 2 0.2724918 0.3975695
## 3 0.3341148 0.3823888
## 4 0.1828450 0.5979854
## 5 0.2766503 0.5733535
## 6 0.5190923 0.9399905
head(ped)
## id dam sire
## 1 1 NA NA
## 2 2 NA NA
## 3 3 NA NA
## 4 4 NA NA
## 5 5 NA NA
## 6 6 NA NA
data$Sire<-as.factor(data$Sire)
data$Dam<-as.factor(data$Dam)
data$animal<-as.factor(data$animal)
data$Column<-as.factor(data$Column)
data$Block<-as.factor(data$Block)
data$Chosen.Rejected<-as.numeric(data$Chosen.Rejected)
data$Vial<-as.factor(data$Vial)
for (x in 1:3) ped[, x] <- as.factor(ped[, x])
#this function manipulates the pedigree to conform to the structure required for MCMCglmm
ped<-fix_ped(ped=ped)
Before jumping in to analyses, take a moment to explore the data. Produce graphs of your choice to visualize patterns in the distributions of the variables and the relationships between them.
plot(data[,8:11])
hist(data$l2)
hist(data$l3)
hist(data$l4)
hist(data$l5)
#summarise the data
summary(data[,8:11])
## l2 l3 l4 l5
## Min. :1.518 Min. :0.004832 Min. :-0.1383 Min. :-0.003662
## 1st Qu.:1.691 1st Qu.:0.259254 1st Qu.: 0.1557 1st Qu.: 0.537238
## Median :1.716 Median :0.313911 Median : 0.2494 Median : 0.625257
## Mean :1.713 Mean :0.317438 Mean : 0.2614 Mean : 0.630429
## 3rd Qu.:1.739 3rd Qu.:0.369115 3rd Qu.: 0.3311 3rd Qu.: 0.714851
## Max. :1.834 Max. :0.893993 Max. : 1.3929 Max. : 2.666013
# standardise the data to mean 0 variance 1
data_std<-scale(data[,8:11])
colnames(data_std)<-c("l2s", "l3s", "l4s", "l5s")
data<-cbind(data, data_std)
cor(data[,12:15])
## l2s l3s l4s l5s
## l2s 1.0000000 0.4195317 0.1084286 0.3970258
## l3s 0.4195317 1.0000000 0.4772407 0.2405112
## l4s 0.1084286 0.4772407 1.0000000 0.2618420
## l5s 0.3970258 0.2405112 0.2618420 1.0000000
Are there any outliers in the data? How are the data distributed? Do they meet the assumptions of the model you would like to fit? Is there anything interesting to note in the summary statistics? Is there anything else we should be looking for when exploring the data?
Do you predict that there is a genetic correlation between any of the traits? If so, why?
If there are any outliers in the data, remove them here, and then standardise the data to a mean of 0 and variance of 1.
pairs(data[,8:11])
#plot l3 vs l5 with animals labelled to figure out which to remove
ggplot(data=data, aes(x = l3, y=l5))+geom_text(aes(label=animal))
#remove outliers
remove<-c("2143", "1093", "2153", "2154")
data <- data %>% filter(!animal %in% remove)
# standardise the data to mean 0 variance 1
data_std<-scale(data[,8:11])
colnames(data_std)<-c("l2s", "l3s", "l4s", "l5s")
data<-cbind(data, data_std)
cor(data[,12:15])
## l2s l3s l4s l5s
## l2s 1.0000000 0.4194997 0.1000275 0.4150147
## l3s 0.4194997 1.0000000 0.4490175 0.1944472
## l4s 0.1000275 0.4490175 1.0000000 0.1976393
## l5s 0.4150147 0.1944472 0.1976393 1.0000000
#explore the pedigree; here we visualize it for our interest
#this code draws the pedigree structure; unfortunately, for a half-sib/full-sib design it doesn't look very interesting
draw_ped(ped)
#this code draws the additive genetic relatedness matrix; again, not very interesting
draw_pedA(ped)
So you can see what a different kind of pedigree might look like, let's draw an example.
#here is an example of a pedigree that does look interesting
data(gryphons)
test.pedigree <- fix_ped(gryphons[,1:3])
## draw the gryphon pedigree by pedigree depth
draw_ped(test.pedigree)
draw_pedA(test.pedigree)
Next, build a model to investigate the genetic variance and covariance between the log-contrast CHC traits. Even though we only have 4 traits, we still need to estimate quite a few parameters: n*(n+1)/2, where n is the number of traits, so 10 in this case.
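As a quick sanity check on that count, the number of distinct elements in an n x n symmetric covariance matrix can be tabulated directly:

```r
# distinct (co)variances in an n x n symmetric matrix, for 1 to 6 traits
n <- 1:6
data.frame(traits = n, parameters = n * (n + 1) / 2)
```

With n = 4 we get 10 parameters, and the model we fit estimates three such covariance structures (animal, Vial, and residual), so 30 (co)variance parameters in total.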
Use this code block to generate the inverse A matrix that you can provide to `MCMCglmm()`.
Ainv <- inverseA(ped)$Ainv
Then, use this code block to specify your prior and run your model using `MCMCglmm()`. In this model, include `Block` and `Column` as fixed effects, and `Vial` as a random effect.
Choose a small number of iterations to run in the interest of time.
Check out this tutorial https://devillemereuil.legtux.org/wp-content/uploads/2012/12/tuto_en.pdf for more information on how to set up a multivariate analysis.
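The run length is controlled by the `nitt`, `burnin`, and `thin` arguments; the number of stored posterior samples is `(nitt - burnin) / thin`. For example, with the MCMCglmm defaults:

```r
# stored posterior samples under the MCMCglmm defaults
# (nitt = 13000, burnin = 3000, thin = 10)
(13000 - 3000) / 10
```

This matches the "Sample size = 1000" reported in the model summary.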
#specify a prior for a 4x4 covariance matrix
prior <- list(
G = list(G1 = list(V = diag(4)/3, nu = 1.002),
G2 = list(V = diag(4)/3, nu = 1.002)),
R = list(V = diag(4)/3, nu = 1.002)
)
#build a model
model <- MCMCglmm(cbind(l2s, l3s, l4s, l5s) ~ trait - 1 + Block + Column,
                  random = ~ us(trait):animal + us(trait):Vial,
                  rcov = ~ us(trait):units,
                  family = c("gaussian", "gaussian", "gaussian", "gaussian"),
                  ginverse = list(animal = Ainv),
                  data = data, prior = prior, verbose = T
                  #to control the run length, also set nitt=, burnin=, thin=
)
##
## MCMC iteration = 0
##
## MCMC iteration = 1000
##
## MCMC iteration = 2000
##
## MCMC iteration = 3000
##
## MCMC iteration = 4000
##
## MCMC iteration = 5000
##
## MCMC iteration = 6000
##
## MCMC iteration = 7000
##
## MCMC iteration = 8000
##
## MCMC iteration = 9000
##
## MCMC iteration = 10000
##
## MCMC iteration = 11000
##
## MCMC iteration = 12000
##
## MCMC iteration = 13000
#summarize the model
summary(model)
##
## Iterations = 3001:12991
## Thinning interval = 10
## Sample size = 1000
##
## DIC: 8959.042
##
## G-structure: ~us(trait):animal
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.animal 0.73013 0.616425 0.8621 37.09
## traitl3s:traitl2s.animal 0.36866 0.224162 0.5136 23.64
## traitl4s:traitl2s.animal 0.10250 -0.038942 0.2207 17.57
## traitl5s:traitl2s.animal 0.34146 0.176057 0.5200 11.82
## traitl2s:traitl3s.animal 0.36866 0.224162 0.5136 23.64
## traitl3s:traitl3s.animal 0.58241 0.363786 0.8533 22.69
## traitl4s:traitl3s.animal 0.22955 0.066578 0.3926 16.23
## traitl5s:traitl3s.animal 0.18952 0.009458 0.3731 14.54
## traitl2s:traitl4s.animal 0.10250 -0.038942 0.2207 17.57
## traitl3s:traitl4s.animal 0.22955 0.066578 0.3926 16.23
## traitl4s:traitl4s.animal 0.42334 0.228743 0.5969 17.84
## traitl5s:traitl4s.animal -0.02578 -0.165363 0.1194 17.15
## traitl2s:traitl5s.animal 0.34146 0.176057 0.5200 11.82
## traitl3s:traitl5s.animal 0.18952 0.009458 0.3731 14.54
## traitl4s:traitl5s.animal -0.02578 -0.165363 0.1194 17.15
## traitl5s:traitl5s.animal 0.43399 0.167340 0.7039 10.13
##
## ~us(trait):Vial
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.Vial 0.14540 0.05981 0.24063 208.74
## traitl3s:traitl2s.Vial 0.01585 -0.07081 0.09102 75.83
## traitl4s:traitl2s.Vial -0.01671 -0.10341 0.06322 35.25
## traitl5s:traitl2s.Vial 0.06243 -0.02613 0.15592 34.60
## traitl2s:traitl3s.Vial 0.01585 -0.07081 0.09102 75.83
## traitl3s:traitl3s.Vial 0.18607 0.06900 0.29903 39.68
## traitl4s:traitl3s.Vial 0.09231 -0.00444 0.18738 34.03
## traitl5s:traitl3s.Vial -0.01054 -0.08892 0.07311 36.07
## traitl2s:traitl4s.Vial -0.01671 -0.10341 0.06322 35.25
## traitl3s:traitl4s.Vial 0.09231 -0.00444 0.18738 34.03
## traitl4s:traitl4s.Vial 0.27150 0.16067 0.40355 75.12
## traitl5s:traitl4s.Vial 0.08774 0.01012 0.17247 50.70
## traitl2s:traitl5s.Vial 0.06243 -0.02613 0.15592 34.60
## traitl3s:traitl5s.Vial -0.01054 -0.08892 0.07311 36.07
## traitl4s:traitl5s.Vial 0.08774 0.01012 0.17247 50.70
## traitl5s:traitl5s.Vial 0.21480 0.10612 0.33037 23.34
##
## R-structure: ~us(trait):units
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.units 0.081419 0.028345 0.13651 31.722
## traitl3s:traitl2s.units 0.016870 -0.063477 0.07606 19.576
## traitl4s:traitl2s.units -0.030458 -0.094219 0.03539 15.471
## traitl5s:traitl2s.units -0.007396 -0.087412 0.08141 9.207
## traitl2s:traitl3s.units 0.016870 -0.063477 0.07606 19.576
## traitl3s:traitl3s.units 0.201377 0.069765 0.31263 18.759
## traitl4s:traitl3s.units 0.079926 0.001138 0.16716 15.045
## traitl5s:traitl3s.units 0.021785 -0.068093 0.11265 13.110
## traitl2s:traitl4s.units -0.030458 -0.094219 0.03539 15.471
## traitl3s:traitl4s.units 0.079926 0.001138 0.16716 15.045
## traitl4s:traitl4s.units 0.192345 0.098933 0.28151 16.244
## traitl5s:traitl4s.units 0.129597 0.055261 0.20123 16.242
## traitl2s:traitl5s.units -0.007396 -0.087412 0.08141 9.207
## traitl3s:traitl5s.units 0.021785 -0.068093 0.11265 13.110
## traitl4s:traitl5s.units 0.129597 0.055261 0.20123 16.242
## traitl5s:traitl5s.units 0.257178 0.102059 0.37786 8.478
##
## Location effects: cbind(l2s, l3s, l4s, l5s) ~ trait - 1 + Block + Column
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## traitl2s 0.09852 -0.06774 0.23949 1000.0 0.198
## traitl3s 0.09211 -0.03830 0.22140 249.2 0.190
## traitl4s 0.06703 -0.06621 0.20729 437.3 0.328
## traitl5s 0.06745 -0.06553 0.19975 515.9 0.338
## Block2 -0.13993 -0.30039 0.03622 101.0 0.116
## ColumnF -0.01096 -0.05464 0.03231 1000.0 0.658
After running your model, assess the MCMC run to see whether it ran for a sufficient length of time. You are looking for a trace plot that looks like a fuzzy caterpillar, and a density plot that looks like a smooth, approximately normal distribution.
#plot the trace and posterior density for the fixed effects and each variance component
plot(model$Sol)
plot(model$VCV[,1:16])  #animal (G) components
plot(model$VCV[,17:32]) #Vial components
plot(model$VCV[,33:48]) #residual (units) components
#calculate autocorrelation between estimates
autocorr.diag(model$VCV)
## traitl2s:traitl2s.animal traitl3s:traitl2s.animal
## Lag 0 1.00000000 1.0000000
## Lag 10 0.77671532 0.8745685
## Lag 50 0.56604086 0.7642979
## Lag 100 0.36219701 0.6425177
## Lag 500 0.05261015 0.2048663
## traitl4s:traitl2s.animal traitl5s:traitl2s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9183136 0.9103215
## Lag 50 0.8046875 0.7607653
## Lag 100 0.6830884 0.6162243
## Lag 500 0.1817047 0.1523944
## traitl2s:traitl3s.animal traitl3s:traitl3s.animal
## Lag 0 1.0000000 1.00000000
## Lag 10 0.8745685 0.91984697
## Lag 50 0.7642979 0.79499829
## Lag 100 0.6425177 0.66316894
## Lag 500 0.2048663 0.08397362
## traitl4s:traitl3s.animal traitl5s:traitl3s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9162679 0.9464407
## Lag 50 0.7961105 0.8452590
## Lag 100 0.6809388 0.7317542
## Lag 500 0.1148138 0.1995741
## traitl2s:traitl4s.animal traitl3s:traitl4s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9183136 0.9162679
## Lag 50 0.8046875 0.7961105
## Lag 100 0.6830884 0.6809388
## Lag 500 0.1817047 0.1148138
## traitl4s:traitl4s.animal traitl5s:traitl4s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9301880 0.9684940
## Lag 50 0.8295949 0.8998259
## Lag 100 0.7343611 0.8189337
## Lag 500 0.2090379 0.3187443
## traitl2s:traitl5s.animal traitl3s:traitl5s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9103215 0.9464407
## Lag 50 0.7607653 0.8452590
## Lag 100 0.6162243 0.7317542
## Lag 500 0.1523944 0.1995741
## traitl4s:traitl5s.animal traitl5s:traitl5s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9684940 0.9658395
## Lag 50 0.8998259 0.8771823
## Lag 100 0.8189337 0.7866685
## Lag 500 0.3187443 0.3137599
## traitl2s:traitl2s.Vial traitl3s:traitl2s.Vial traitl4s:traitl2s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.49687365 0.61312066 0.56278496
## Lag 50 0.19156170 0.35023029 0.35035117
## Lag 100 0.22061045 0.31395526 0.29606998
## Lag 500 -0.00442625 0.08328054 0.05239741
## traitl5s:traitl2s.Vial traitl2s:traitl3s.Vial traitl3s:traitl3s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.56799574 0.61312066 0.53261983
## Lag 50 0.34315889 0.35023029 0.30752863
## Lag 100 0.31270951 0.31395526 0.29030163
## Lag 500 0.06132009 0.08328054 0.05881093
## traitl4s:traitl3s.Vial traitl5s:traitl3s.Vial traitl2s:traitl4s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.48860877 0.60430213 0.56278496
## Lag 50 0.34512310 0.43007083 0.35035117
## Lag 100 0.30117529 0.38894663 0.29606998
## Lag 500 0.07704659 0.08240153 0.05239741
## traitl3s:traitl4s.Vial traitl4s:traitl4s.Vial traitl5s:traitl4s.Vial
## Lag 0 1.00000000 1.0000000 1.0000000
## Lag 10 0.48860877 0.4554379 0.6085302
## Lag 50 0.34512310 0.3405114 0.5142248
## Lag 100 0.30117529 0.2879653 0.4577181
## Lag 500 0.07704659 0.1199511 0.2322306
## traitl2s:traitl5s.Vial traitl3s:traitl5s.Vial traitl4s:traitl5s.Vial
## Lag 0 1.00000000 1.00000000 1.0000000
## Lag 10 0.56799574 0.60430213 0.6085302
## Lag 50 0.34315889 0.43007083 0.5142248
## Lag 100 0.31270951 0.38894663 0.4577181
## Lag 500 0.06132009 0.08240153 0.2322306
## traitl5s:traitl5s.Vial traitl2s:traitl2s.units traitl3s:traitl2s.units
## Lag 0 1.0000000 1.00000000 1.0000000
## Lag 10 0.6395262 0.91665304 0.9552116
## Lag 50 0.5473007 0.65974264 0.8216119
## Lag 100 0.4515577 0.42089420 0.6791017
## Lag 500 0.2139800 0.09396462 0.2410493
## traitl4s:traitl2s.units traitl5s:traitl2s.units traitl2s:traitl3s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9499647 0.9400568 0.9552116
## Lag 50 0.8346657 0.7933036 0.8216119
## Lag 100 0.7085795 0.6496255 0.6791017
## Lag 500 0.2146753 0.1418543 0.2410493
## traitl3s:traitl3s.units traitl4s:traitl3s.units traitl5s:traitl3s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9492174 0.9456094 0.9526415
## Lag 50 0.8154004 0.8159038 0.8422274
## Lag 100 0.6898534 0.7032383 0.7310818
## Lag 500 0.1024166 0.1078273 0.2011584
## traitl2s:traitl4s.units traitl3s:traitl4s.units traitl4s:traitl4s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9499647 0.9456094 0.9434076
## Lag 50 0.8346657 0.8159038 0.8306505
## Lag 100 0.7085795 0.7032383 0.7370967
## Lag 500 0.2146753 0.1078273 0.1957837
## traitl5s:traitl4s.units traitl2s:traitl5s.units traitl3s:traitl5s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9611350 0.9400568 0.9526415
## Lag 50 0.8929816 0.7933036 0.8422274
## Lag 100 0.8117488 0.6496255 0.7310818
## Lag 500 0.3158567 0.1418543 0.2011584
## traitl4s:traitl5s.units traitl5s:traitl5s.units
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9611350 0.9516570
## Lag 50 0.8929816 0.8721305
## Lag 100 0.8117488 0.7828992
## Lag 500 0.3158567 0.3010300
The following bit of code makes a nice table with the estimates and their effective sample sizes beside them. The effective sample size indicates the number of effectively independent samples in your MCMC chain, which you want to be as large as possible. You are looking for effective sample sizes close to the actual number of stored samples from your MCMC chain; smaller numbers indicate autocorrelation between the estimates.
clean.MCMC <- function(x) {
sols <- summary(x)$solutions ## pull out relevant info from model summary
Gcovs <- summary(x)$Gcovariances
Rcovs <- summary(x)$Rcovariances
fixed <- data.frame(row.names(sols), sols, row.names = NULL) ## convert to dataframes with the row.names as the first col
random <- data.frame(row.names(Gcovs), Gcovs, row.names = NULL)
residual <- data.frame(row.names(Rcovs), Rcovs, row.names = NULL)
names(fixed)[names(fixed) == "row.names.sols."] <- "variable" ## change the columns names to variable, so they all match
names(random)[names(random) == "row.names.Gcovs."] <- "variable"
names(residual)[names(residual) == "row.names.Rcovs."] <- "variable"
fixed$effect <- "fixed" ## add ID column for type of effect (fixed, random, residual)
random$effect <- "random"
residual$effect <- "residual"
modelTerms <- as.data.frame(bind_rows(fixed, random, residual)) # merge it all together
}
getName.MCMC <- function(x) {deparse(substitute(x))}
oneModel <- clean.MCMC(model) # get all the info from summary(modelName)
oneModel$modelName <- getName.MCMC(model) # add the model's name in a new column
oneModel
## variable post.mean l.95..CI u.95..CI eff.samp
## 1 traitl2s 0.098523877 -0.067739515 0.23948665 1000.000000
## 2 traitl3s 0.092111321 -0.038296162 0.22139822 249.217209
## 3 traitl4s 0.067033189 -0.066205468 0.20729053 437.267624
## 4 traitl5s 0.067452505 -0.065533190 0.19975484 515.887523
## 5 Block2 -0.139926410 -0.300394334 0.03621734 100.955905
## 6 ColumnF -0.010964564 -0.054644068 0.03231489 1000.000000
## 7 traitl2s:traitl2s.animal 0.730125087 0.616425377 0.86214877 37.093461
## 8 traitl3s:traitl2s.animal 0.368661615 0.224162176 0.51361361 23.638521
## 9 traitl4s:traitl2s.animal 0.102500917 -0.038941847 0.22070311 17.572866
## 10 traitl5s:traitl2s.animal 0.341462850 0.176057127 0.51997719 11.818877
## 11 traitl2s:traitl3s.animal 0.368661615 0.224162176 0.51361361 23.638521
## 12 traitl3s:traitl3s.animal 0.582411656 0.363785741 0.85328352 22.688535
## 13 traitl4s:traitl3s.animal 0.229550587 0.066577668 0.39262077 16.227555
## 14 traitl5s:traitl3s.animal 0.189524695 0.009457925 0.37312718 14.538337
## 15 traitl2s:traitl4s.animal 0.102500917 -0.038941847 0.22070311 17.572866
## 16 traitl3s:traitl4s.animal 0.229550587 0.066577668 0.39262077 16.227555
## 17 traitl4s:traitl4s.animal 0.423342651 0.228743033 0.59694534 17.835616
## 18 traitl5s:traitl4s.animal -0.025779594 -0.165363009 0.11936950 17.145641
## 19 traitl2s:traitl5s.animal 0.341462850 0.176057127 0.51997719 11.818877
## 20 traitl3s:traitl5s.animal 0.189524695 0.009457925 0.37312718 14.538337
## 21 traitl4s:traitl5s.animal -0.025779594 -0.165363009 0.11936950 17.145641
## 22 traitl5s:traitl5s.animal 0.433993830 0.167340435 0.70391727 10.130271
## 23 traitl2s:traitl2s.Vial 0.145401082 0.059806795 0.24063419 208.737909
## 24 traitl3s:traitl2s.Vial 0.015847227 -0.070806942 0.09101862 75.834482
## 25 traitl4s:traitl2s.Vial -0.016707009 -0.103411130 0.06321968 35.246231
## 26 traitl5s:traitl2s.Vial 0.062425221 -0.026127843 0.15592389 34.602775
## 27 traitl2s:traitl3s.Vial 0.015847227 -0.070806942 0.09101862 75.834482
## 28 traitl3s:traitl3s.Vial 0.186070157 0.068998658 0.29903242 39.677978
## 29 traitl4s:traitl3s.Vial 0.092305132 -0.004440188 0.18738007 34.031047
## 30 traitl5s:traitl3s.Vial -0.010540752 -0.088918504 0.07311185 36.072753
## 31 traitl2s:traitl4s.Vial -0.016707009 -0.103411130 0.06321968 35.246231
## 32 traitl3s:traitl4s.Vial 0.092305132 -0.004440188 0.18738007 34.031047
## 33 traitl4s:traitl4s.Vial 0.271504032 0.160671395 0.40354961 75.124679
## 34 traitl5s:traitl4s.Vial 0.087741535 0.010118732 0.17246891 50.698637
## 35 traitl2s:traitl5s.Vial 0.062425221 -0.026127843 0.15592389 34.602775
## 36 traitl3s:traitl5s.Vial -0.010540752 -0.088918504 0.07311185 36.072753
## 37 traitl4s:traitl5s.Vial 0.087741535 0.010118732 0.17246891 50.698637
## 38 traitl5s:traitl5s.Vial 0.214800526 0.106116046 0.33037368 23.341733
## 39 traitl2s:traitl2s.units 0.081418799 0.028345215 0.13650946 31.722363
## 40 traitl3s:traitl2s.units 0.016869837 -0.063476807 0.07606305 19.575880
## 41 traitl4s:traitl2s.units -0.030458447 -0.094218714 0.03538642 15.471428
## 42 traitl5s:traitl2s.units -0.007396331 -0.087411521 0.08141220 9.207197
## 43 traitl2s:traitl3s.units 0.016869837 -0.063476807 0.07606305 19.575880
## 44 traitl3s:traitl3s.units 0.201376778 0.069764615 0.31262855 18.758734
## 45 traitl4s:traitl3s.units 0.079925683 0.001138465 0.16715568 15.045283
## 46 traitl5s:traitl3s.units 0.021785378 -0.068092779 0.11265145 13.110478
## 47 traitl2s:traitl4s.units -0.030458447 -0.094218714 0.03538642 15.471428
## 48 traitl3s:traitl4s.units 0.079925683 0.001138465 0.16715568 15.045283
## 49 traitl4s:traitl4s.units 0.192344997 0.098932560 0.28151483 16.244320
## 50 traitl5s:traitl4s.units 0.129597270 0.055260732 0.20122887 16.242138
## 51 traitl2s:traitl5s.units -0.007396331 -0.087411521 0.08141220 9.207197
## 52 traitl3s:traitl5s.units 0.021785378 -0.068092779 0.11265145 13.110478
## 53 traitl4s:traitl5s.units 0.129597270 0.055260732 0.20122887 16.242138
## 54 traitl5s:traitl5s.units 0.257178096 0.102059017 0.37786124 8.478093
## pMCMC effect modelName
## 1 0.198 fixed model
## 2 0.190 fixed model
## 3 0.328 fixed model
## 4 0.338 fixed model
## 5 0.116 fixed model
## 6 0.658 fixed model
## 7 NA random model
## 8 NA random model
## 9 NA random model
## 10 NA random model
## 11 NA random model
## 12 NA random model
## 13 NA random model
## 14 NA random model
## 15 NA random model
## 16 NA random model
## 17 NA random model
## 18 NA random model
## 19 NA random model
## 20 NA random model
## 21 NA random model
## 22 NA random model
## 23 NA random model
## 24 NA random model
## 25 NA random model
## 26 NA random model
## 27 NA random model
## 28 NA random model
## 29 NA random model
## 30 NA random model
## 31 NA random model
## 32 NA random model
## 33 NA random model
## 34 NA random model
## 35 NA random model
## 36 NA random model
## 37 NA random model
## 38 NA random model
## 39 NA residual model
## 40 NA residual model
## 41 NA residual model
## 42 NA residual model
## 43 NA residual model
## 44 NA residual model
## 45 NA residual model
## 46 NA residual model
## 47 NA residual model
## 48 NA residual model
## 49 NA residual model
## 50 NA residual model
## 51 NA residual model
## 52 NA residual model
## 53 NA residual model
## 54 NA residual model
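You can also pull effective sample sizes for every parameter in one call with coda's `effectiveSize()` (coda is loaded along with MCMCglmm). This sketch assumes `model` is the fit from above.

```r
# effective number of independent samples for each variance component
effectiveSize(model$VCV)
# and for the fixed effects
effectiveSize(model$Sol)
```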
Do you think that your model ran for a sufficient number of MCMC iterations? Describe the features you used to reach your conclusion.
If your model did not run for a sufficient amount of time, import the output from an analysis that I ran previously and continue your analysis using that output.
#load(file = "./activity/MCMCglmm_model_z_Longrun.rda")
Now, let’s look at the estimates for the variance components in the model.
writeLines("Variance components:")
#calculate the modes of the posterior distribution for the variance components
posterior.mode(model$VCV)
## Variance components:
## traitl2s:traitl2s.animal traitl3s:traitl2s.animal traitl4s:traitl2s.animal
## 0.663548211 0.341182067 0.116515542
## traitl5s:traitl2s.animal traitl2s:traitl3s.animal traitl3s:traitl3s.animal
## 0.336772596 0.341182067 0.598548365
## traitl4s:traitl3s.animal traitl5s:traitl3s.animal traitl2s:traitl4s.animal
## 0.331569494 0.161049892 0.116515542
## traitl3s:traitl4s.animal traitl4s:traitl4s.animal traitl5s:traitl4s.animal
## 0.331569494 0.488468971 -0.061777253
## traitl2s:traitl5s.animal traitl3s:traitl5s.animal traitl4s:traitl5s.animal
## 0.336772596 0.161049892 -0.061777253
## traitl5s:traitl5s.animal traitl2s:traitl2s.Vial traitl3s:traitl2s.Vial
## 0.446580528 0.152470459 0.034317339
## traitl4s:traitl2s.Vial traitl5s:traitl2s.Vial traitl2s:traitl3s.Vial
## -0.043478773 0.058389242 0.034317339
## traitl3s:traitl3s.Vial traitl4s:traitl3s.Vial traitl5s:traitl3s.Vial
## 0.129727694 0.079436964 -0.027774004
## traitl2s:traitl4s.Vial traitl3s:traitl4s.Vial traitl4s:traitl4s.Vial
## -0.043478773 0.079436964 0.277259806
## traitl5s:traitl4s.Vial traitl2s:traitl5s.Vial traitl3s:traitl5s.Vial
## 0.109707865 0.058389242 -0.027774004
## traitl4s:traitl5s.Vial traitl5s:traitl5s.Vial traitl2s:traitl2s.units
## 0.109707865 0.219878700 0.098643556
## traitl3s:traitl2s.units traitl4s:traitl2s.units traitl5s:traitl2s.units
## -0.002707819 -0.046813581 -0.029243644
## traitl2s:traitl3s.units traitl3s:traitl3s.units traitl4s:traitl3s.units
## -0.002707819 0.169693333 0.053737027
## traitl5s:traitl3s.units traitl2s:traitl4s.units traitl3s:traitl4s.units
## 0.023220952 -0.046813581 0.053737027
## traitl4s:traitl4s.units traitl5s:traitl4s.units traitl2s:traitl5s.units
## 0.160809569 0.166652525 -0.029243644
## traitl3s:traitl5s.units traitl4s:traitl5s.units traitl5s:traitl5s.units
## 0.023220952 0.166652525 0.352830271
Estimate the heritability of each trait
#estimate heritability of l2-l5
heritability.l2 <- model$VCV[, "traitl2s:traitl2s.animal"] / (model$VCV[, "traitl2s:traitl2s.animal"] +model$VCV[, "traitl2s:traitl2s.Vial"]+ model$VCV[, "traitl2s:traitl2s.units"])
heritability.l3 <- model$VCV[, "traitl3s:traitl3s.animal"] / (model$VCV[, "traitl3s:traitl3s.animal"] +model$VCV[, "traitl3s:traitl3s.Vial"]+ model$VCV[, "traitl3s:traitl3s.units"])
heritability.l4 <- model$VCV[, "traitl4s:traitl4s.animal"] / (model$VCV[, "traitl4s:traitl4s.animal"] + model$VCV[, "traitl4s:traitl4s.Vial"]+ model$VCV[, "traitl4s:traitl4s.units"])
heritability.l5 <- model$VCV[, "traitl5s:traitl5s.animal"] / (model$VCV[, "traitl5s:traitl5s.animal"] +model$VCV[, "traitl5s:traitl5s.Vial"]+ model$VCV[, "traitl5s:traitl5s.units"])
#calculate the mode of the posterior distributions for heritability and the HPD intervals
h2.l2<-c(posterior.mode(heritability.l2), HPDinterval(heritability.l2))
h2.l3<-c(posterior.mode(heritability.l3), HPDinterval(heritability.l3))
h2.l4<-c(posterior.mode(heritability.l4), HPDinterval(heritability.l4))
h2.l5<-c(posterior.mode(heritability.l5), HPDinterval(heritability.l5))
#arrange in a data frame and produce a table
heritabilities<-rbind(h2.l2, h2.l3, h2.l4, h2.l5)
colnames(heritabilities)<-c("h2", "lowerHPD", "upperHPD")
knitr::kable(heritabilities[,1:3], "simple", digits=2)
|       |   h2 | lowerHPD | upperHPD |
|:------|-----:|---------:|---------:|
| h2.l2 | 0.81 |     0.64 |     0.88 |
| h2.l3 | 0.56 |     0.36 |     0.82 |
| h2.l4 | 0.44 |     0.26 |     0.67 |
| h2.l5 | 0.46 |     0.20 |     0.74 |
What is your estimate of the heritability for each trait? Include the 95% HPD.
Let’s arrange the estimates from our model into a G-matrix
#put the posterior mode of the VCV elements in a matrix
pointG<-matrix(posterior.mode(model$VCV[,1:16]),4,4)
pointG
## [,1] [,2] [,3] [,4]
## [1,] 0.6635482 0.3411821 0.11651554 0.33677260
## [2,] 0.3411821 0.5985484 0.33156949 0.16104989
## [3,] 0.1165155 0.3315695 0.48846897 -0.06177725
## [4,] 0.3367726 0.1610499 -0.06177725 0.44658053
#turn into correlation matrix
pointGcor<-cov2cor(pointG)
pointGcor
## [,1] [,2] [,3] [,4]
## [1,] 1.0000000 0.5413777 0.2046582 0.6186580
## [2,] 0.5413777 1.0000000 0.6132061 0.3115021
## [3,] 0.2046582 0.6132061 1.0000000 -0.1322697
## [4,] 0.6186580 0.3115021 -0.1322697 1.0000000
What is the genetic covariance between each pair of traits? Include the 95% HPD.
What are the genetic correlations between all traits?
What would you conclude about the genetic architecture of these traits?
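One way to answer the first two questions: compute each genetic covariance or correlation at every posterior sample, then summarise the resulting chain with its posterior mode and 95% HPD interval. A sketch for the l2-l3 pair, assuming `model` is the fit from above (the same pattern works for any pair of traits):

```r
# posterior chain of the l2-l3 genetic correlation: covariance divided by
# the square root of the product of the two genetic variances, per sample
rG.l2l3 <- model$VCV[, "traitl3s:traitl2s.animal"] /
  sqrt(model$VCV[, "traitl2s:traitl2s.animal"] *
       model$VCV[, "traitl3s:traitl3s.animal"])
posterior.mode(rG.l2l3)
HPDinterval(rG.l2l3)

# the corresponding genetic covariance with its 95% HPD interval
posterior.mode(model$VCV[, "traitl3s:traitl2s.animal"])
HPDinterval(model$VCV[, "traitl3s:traitl2s.animal"])
```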
Now that we have estimated G, we would like to determine how genetic variance is distributed across multivariate space.
Start by doing an eigenanalysis of G and produce a scree plot of the eigenvalues.
eigen(pointG)
## eigen() decomposition
## $values
## [1] 1.2409757 0.6417662 0.1842224 0.1301817
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] 0.6324326 0.3553261 0.6731876 -0.1434953
## [2,] 0.5981946 -0.3351711 -0.4981884 -0.5306900
## [3,] 0.3316509 -0.6768211 0.1803805 0.6319682
## [4,] 0.3635931 0.5507421 -0.5158389 0.5462540
#plot the eigenvalues
plot(eigen(pointG)$values)
#plot the eigenvalues with uncertainty
#save an object of the MCMC samples of G
Gmcmc<-model$VCV[,1:16]
#eigenanalysis of each posterior sample
eigs<-array(dim=c(1000,4))
for (i in 1:1000){
eigs[i,]<-eigen(matrix(Gmcmc[i,],4,4))$values
}
eig1l<-quantile(eigs[,1], 0.025)
eig1u<-quantile(eigs[,1], 0.975)
eig2l<-quantile(eigs[,2], 0.025)
eig2u<-quantile(eigs[,2], 0.975)
eig3l<-quantile(eigs[,3], 0.025)
eig3u<-quantile(eigs[,3], 0.975)
eig4l<-quantile(eigs[,4], 0.025)
eig4u<-quantile(eigs[,4], 0.975)
#plot
eigen_plot<-data.frame(seq(1:4),eigen(pointG)$values, rbind(eig1l, eig2l, eig3l, eig4l), rbind(eig1u, eig2u, eig3u, eig4u))
colnames(eigen_plot)<-c("eigenvalue", "value", "lci", "uci")
p<-ggplot(data = eigen_plot, aes(x = eigenvalue, y = value)) +
geom_point(cex=5) +
geom_errorbar(aes(x = eigenvalue, ymin = lci, ymax = uci))+
scale_y_continuous(expand = c(0, 0), limits = c(-1, 2))
p
Are there any multivariate trait combinations that have low genetic variance? How many?
What is the effective dimensionality of the matrix?
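One commonly used summary (among several possible choices of metric) is Kirkpatrick's effective number of dimensions: the sum of the eigenvalues divided by the largest eigenvalue, computed here from the point estimate `pointG`.

```r
# effective dimensionality n_D = sum(lambda) / max(lambda);
# n_D = 1 if all genetic variance lies along a single axis,
# n_D = 4 if variance is spread evenly across all four axes
lambda <- eigen(pointG)$values
sum(lambda) / max(lambda)
```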
We would like to know whether our estimates of genetic variance and covariance differ from what sampling error alone would produce (i.e. whether they are statistically significant). MCMCglmm constrains estimates of G to be positive-definite, which means that estimates of variances, including eigenvalues, must be greater than 0. Therefore, the lower bound of an HPD interval cannot be less than 0, which means we can't compare it to 0 to determine statistical support.
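You can verify this constraint directly on the posterior samples: every stored draw of G should have strictly positive eigenvalues. A quick check, assuming `model` is the fit from above:

```r
# smallest eigenvalue across all posterior draws of G;
# positive-definiteness means this should be greater than 0
min(apply(model$VCV[, 1:16], 1,
          function(g) min(eigen(matrix(g, 4, 4))$values)))
```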
We need to fit a null-model to compare our estimates to. There are different ways one could go about doing this, but the most straightforward is to randomise the phenotypes across the pedigree to keep the same data structure, but remove any genetic signal, and then re-fit the model. There are technical questions about the best way to do this randomisation in the presence of fixed effects, but let’s ignore those questions for now.
#randomise the phenotypes to break any genetic links in the data
shuffled_data<-data[sample(1:nrow(data)), 12:15]
colnames(shuffled_data)<-c("l2sr", "l3sr", "l4sr", "l5sr")
data<-cbind(data, shuffled_data)
#fit the same model that you used to estimate G
model.r <- MCMCglmm(cbind(l2sr, l3sr, l4sr, l5sr) ~ trait - 1+Block+Column,
random = ~ us(trait):animal + us(trait):Vial,
rcov = ~ us(trait):units,
family = c("gaussian", "gaussian", "gaussian", "gaussian"),
pedigree= ped,
data = data, prior = prior, verbose = T
)
##
## MCMC iteration = 0
##
## MCMC iteration = 1000
##
## MCMC iteration = 2000
##
## MCMC iteration = 3000
##
## MCMC iteration = 4000
##
## MCMC iteration = 5000
##
## MCMC iteration = 6000
##
## MCMC iteration = 7000
##
## MCMC iteration = 8000
##
## MCMC iteration = 9000
##
## MCMC iteration = 10000
##
## MCMC iteration = 11000
##
## MCMC iteration = 12000
##
## MCMC iteration = 13000
#summarize the model
summary(model.r)
##
## Iterations = 3001:12991
## Thinning interval = 10
## Sample size = 1000
##
## DIC: 21295.08
##
## G-structure: ~us(trait):animal
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.animal 0.043954 0.023325 0.06785 72.33
## traitl3sr:traitl2sr.animal 0.009183 -0.008883 0.02930 56.75
## traitl4sr:traitl2sr.animal -0.002219 -0.021557 0.01428 53.87
## traitl5sr:traitl2sr.animal 0.005932 -0.011404 0.02426 38.98
## traitl2sr:traitl3sr.animal 0.009183 -0.008883 0.02930 56.75
## traitl3sr:traitl3sr.animal 0.045331 0.023323 0.07367 52.55
## traitl4sr:traitl3sr.animal 0.013522 -0.006030 0.03699 49.41
## traitl5sr:traitl3sr.animal 0.002356 -0.017841 0.02298 47.58
## traitl2sr:traitl4sr.animal -0.002219 -0.021557 0.01428 53.87
## traitl3sr:traitl4sr.animal 0.013522 -0.006030 0.03699 49.41
## traitl4sr:traitl4sr.animal 0.053278 0.021184 0.08631 42.20
## traitl5sr:traitl4sr.animal 0.004471 -0.016795 0.02353 96.07
## traitl2sr:traitl5sr.animal 0.005932 -0.011404 0.02426 38.98
## traitl3sr:traitl5sr.animal 0.002356 -0.017841 0.02298 47.58
## traitl4sr:traitl5sr.animal 0.004471 -0.016795 0.02353 96.07
## traitl5sr:traitl5sr.animal 0.045057 0.023527 0.07179 48.00
##
## ~us(trait):Vial
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.Vial 0.0318932 0.0184069 0.04789 532.4
## traitl3sr:traitl2sr.Vial 0.0073445 -0.0040313 0.02002 541.6
## traitl4sr:traitl2sr.Vial -0.0009259 -0.0132805 0.00975 524.3
## traitl5sr:traitl2sr.Vial 0.0041812 -0.0058038 0.01667 548.6
## traitl2sr:traitl3sr.Vial 0.0073445 -0.0040313 0.02002 541.6
## traitl3sr:traitl3sr.Vial 0.0316813 0.0172933 0.04656 632.9
## traitl4sr:traitl3sr.Vial 0.0112068 0.0001019 0.02440 606.1
## traitl5sr:traitl3sr.Vial 0.0026549 -0.0095074 0.01245 651.8
## traitl2sr:traitl4sr.Vial -0.0009259 -0.0132805 0.00975 524.3
## traitl3sr:traitl4sr.Vial 0.0112068 0.0001019 0.02440 606.1
## traitl4sr:traitl4sr.Vial 0.0367322 0.0199655 0.05477 491.1
## traitl5sr:traitl4sr.Vial 0.0039065 -0.0075782 0.01608 639.3
## traitl2sr:traitl5sr.Vial 0.0041812 -0.0058038 0.01667 548.6
## traitl3sr:traitl5sr.Vial 0.0026549 -0.0095074 0.01245 651.8
## traitl4sr:traitl5sr.Vial 0.0039065 -0.0075782 0.01608 639.3
## traitl5sr:traitl5sr.Vial 0.0311723 0.0191224 0.04629 652.6
##
## R-structure: ~us(trait):units
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.units 0.9613 0.89975 1.0263 1000.0
## traitl3sr:traitl2sr.units 0.4098 0.36201 0.4594 740.1
## traitl4sr:traitl2sr.units 0.1132 0.05973 0.1578 719.3
## traitl5sr:traitl2sr.units 0.3945 0.34610 0.4423 637.2
## traitl2sr:traitl3sr.units 0.4098 0.36201 0.4594 740.1
## traitl3sr:traitl3sr.units 0.9604 0.89798 1.0242 873.6
## traitl4sr:traitl3sr.units 0.4599 0.41316 0.5161 848.7
## traitl5sr:traitl3sr.units 0.2393 0.19445 0.2871 1000.0
## traitl2sr:traitl4sr.units 0.1132 0.05973 0.1578 719.3
## traitl3sr:traitl4sr.units 0.4599 0.41316 0.5161 848.7
## traitl4sr:traitl4sr.units 0.9465 0.88042 1.0168 304.2
## traitl5sr:traitl4sr.units 0.2576 0.21072 0.3075 842.2
## traitl2sr:traitl5sr.units 0.3945 0.34610 0.4423 637.2
## traitl3sr:traitl5sr.units 0.2393 0.19445 0.2871 1000.0
## traitl4sr:traitl5sr.units 0.2576 0.21072 0.3075 842.2
## traitl5sr:traitl5sr.units 0.9617 0.90304 1.0275 840.5
##
## Location effects: cbind(l2sr, l3sr, l4sr, l5sr) ~ trait - 1 + Block + Column
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## traitl2sr 0.01437 -0.06101 0.08890 1000.0 0.726
## traitl3sr 0.01183 -0.06019 0.08631 1000.0 0.744
## traitl4sr 0.01562 -0.06330 0.08258 1000.0 0.688
## traitl5sr 0.01216 -0.07025 0.08856 1000.0 0.754
## Block2 0.01226 -0.06169 0.08467 1000.0 0.772
## ColumnF -0.03749 -0.08936 0.03397 887.7 0.230
#this will take a while to run, so load the output from a model that I have run for you
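Before trusting these estimates, note that the effective sample sizes (eff.samp) for the animal-level terms above are low (roughly 40-100 against a nominal 1000 stored samples), suggesting strong autocorrelation in the chain. A sketch of the standard coda diagnostics you could run (coda is loaded along with MCMCglmm); this assumes the fitted `model.r` object from above:

```r
#trace plots of the first few genetic (co)variance components;
#healthy chains look like fuzzy caterpillars with no trend
plot(model.r$VCV[, 1:4])
#effective sample size for every (co)variance component;
#values far below the number of stored samples indicate autocorrelation
effectiveSize(model.r$VCV)
#autocorrelation between successive stored samples
autocorr.diag(model.r$VCV[, 1:4])
```

If mixing is poor, a longer run (larger `nitt`, `burnin`, and `thin`) when re-fitting usually helps.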
Produce a scree plot of the eigenvalues from your estimated G and your null G, with 95% HPD intervals, on the same figure.
#put the posterior mode of the VCV elements in a matrix
pointG.r<-matrix(posterior.mode(model.r$VCV[,1:16]),4,4)
pointG.r
## [,1] [,2] [,3] [,4]
## [1,] 0.036623329 0.0038493353 -0.002957749 0.0053837592
## [2,] 0.003849335 0.0393369165 0.013745495 -0.0004053919
## [3,] -0.002957749 0.0137454954 0.047734297 0.0015486248
## [4,] 0.005383759 -0.0004053919 0.001548625 0.0406155395
#turn into correlation matrix
pointG.rcor<-cov2cor(pointG.r)
pointG.rcor
## [,1] [,2] [,3] [,4]
## [1,] 1.00000000 0.10141605 -0.07074033 0.13959202
## [2,] 0.10141605 1.00000000 0.31720861 -0.01014212
## [3,] -0.07074033 0.31720861 1.00000000 0.03517101
## [4,] 0.13959202 -0.01014212 0.03517101 1.00000000
#save an object of the MCMC samples of G
Gmcmc.r<-model.r$VCV[,1:16]
#eigenanalysis of each posterior sample
eigs.r<-array(dim=c(1000,4))
for (i in 1:1000){
eigs.r[i,]<-eigen(matrix(Gmcmc.r[i,],4,4))$values
}
eig1l.r<-quantile(eigs.r[,1], 0.025)
eig1u.r<-quantile(eigs.r[,1], 0.975)
eig2l.r<-quantile(eigs.r[,2], 0.025)
eig2u.r<-quantile(eigs.r[,2], 0.975)
eig3l.r<-quantile(eigs.r[,3], 0.025)
eig3u.r<-quantile(eigs.r[,3], 0.975)
eig4l.r<-quantile(eigs.r[,4], 0.025)
eig4u.r<-quantile(eigs.r[,4], 0.975)
#plot the eigenvalues
eigen_plot.r<-data.frame(seq(1:4),eigen(pointG.r)$values, rbind(eig1l.r, eig2l.r, eig3l.r, eig4l.r), rbind(eig1u.r, eig2u.r, eig3u.r, eig4u.r))
colnames(eigen_plot.r)<-c("eigenvalue", "value", "lci", "uci")
p + geom_point(data = eigen_plot.r, aes(x = eigenvalue, y = value), color = "purple", cex=5) +
geom_errorbar(data = eigen_plot.r, aes(x = eigenvalue, ymin = lci, ymax = uci))
Do any of your previous interpretations change now that you have compared your estimates to the null model?
Next we will estimate how selection acts on these 4 CHC traits. The component of fitness that we will use is mating success. The standard Lande-Arnold approach regresses relative fitness on trait values to estimate selection.
First, calculate the relative fitness of individuals in the data.
#calculate relative fitness and add it to the data
str(data)
## 'data.frame': 1985 obs. of 19 variables:
## $ Sire : Factor w/ 80 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Dam : Factor w/ 240 levels "81","82","83",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ animal : Factor w/ 1985 levels "321","322","323",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Column : Factor w/ 2 levels "B","F": 2 2 2 2 2 1 1 1 1 1 ...
## $ Block : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
## $ Chosen.Rejected: num 1 0 0 1 0 1 1 1 0 1 ...
## $ Vial : Factor w/ 240 levels "81","82","83",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l2 : num 1.76 1.76 1.72 1.73 1.74 ...
## $ l3 : num 0.358 0.424 0.364 0.472 0.465 ...
## $ l4 : num 0.241 0.272 0.334 0.183 0.277 ...
## $ l5 : num 0.579 0.398 0.382 0.598 0.573 ...
## $ l2s : num 1.224 1.111 0.13 0.396 0.835 ...
## $ l3s : num 0.439 1.142 0.501 1.654 1.579 ...
## $ l4s : num -0.1219 0.0647 0.425 -0.4594 0.0891 ...
## $ l5s : num -0.308 -1.407 -1.498 -0.196 -0.345 ...
## $ l2sr : num 1.756 -0.363 -0.93 -0.276 -1.143 ...
## $ l3sr : num 2.5476 0.0644 -0.1564 -0.219 -0.2436 ...
## $ l4sr : num 0.7084 0.3317 -0.65 0.0359 -0.0586 ...
## $ l5sr : num 0.0195 -0.7992 -0.1465 0.7825 -0.0699 ...
data$relfit<-data$Chosen.Rejected/mean(data$Chosen.Rejected)
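Relative fitness defined this way has a mean of 1 by construction, which is what makes the regression coefficients directly interpretable as selection differentials and gradients. A tiny synthetic check (the fitness scores here are made up for illustration):

```r
#made-up binomial fitness scores
w <- c(1, 0, 0, 1, 1)
#divide by the mean to get relative fitness
relw <- w / mean(w)
mean(relw)  #always 1 by construction
## [1] 1
```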
Fit the regression model using the lm function to estimate the selection differential for each trait. Don't forget to include block and column as fixed effects.
#regress relative fitness on each trait separately and save the estimate and standard error
sel.dif<-rbind(
summary(lm(relfit~Block+Column+l2s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l3s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l4s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l5s, data=data))$coefficients[4,1:2])
We fit a linear regression to estimate selection, even though the response variable was not normally distributed. That is totally fine, and these are the correct estimates of the gradients! However, to determine which of these coefficients are statistically significant, we should fit a binomial model with a logit link function to get the appropriate p-values.
#fit a model using the glm function
sel.dif.p<-rbind(
summary(glm(Chosen.Rejected ~Block+Column+l2s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l3s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l4s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l5s, family=binomial(link=logit), data=data))$coefficients[4,4])
sel.dif.table<-data.frame(cbind(sel.dif, sel.dif.p))
colnames(sel.dif.table)<-c("estimate", "se", "p-value")
knitr::kable(sel.dif.table, "simple", digits=2)
| trait | estimate | se | p-value |
|---|---|---|---|
| l2s | -0.02 | 0.02 | 0.50 |
| l3s | 0.04 | 0.02 | 0.08 |
| l4s | 0.03 | 0.02 | 0.19 |
| l5s | -0.04 | 0.02 | 0.12 |
Which trait is under the strongest directional selection? Weakest?
Now let's use multiple regression to disentangle the effects of direct versus indirect selection. Use the lm function to fit a multiple regression of relative fitness on all 4 traits. Don't forget to include block and column as fixed effects.
#fit multiple regression model
sel.grad<-lm(relfit~Block+Column+l2s+l3s+l4s+l5s, data=data)
beta<-sel.grad$coefficients[4:7]
print(beta)
## l2s l3s l4s l5s
## -0.02422153 0.05050675 0.02761503 -0.04785819
#fit a model using glm to determine statistical significance for directional selection overall
a3<-glm(Chosen.Rejected ~Block+Column+l2s+l3s+l4s+l5s, family=binomial(link=logit), data=data)
a4<-glm(Chosen.Rejected ~Block+Column, family=binomial(link=logit), data=data)
anova(a3,a4)
## Analysis of Deviance Table
##
## Model 1: Chosen.Rejected ~ Block + Column + l2s + l3s + l4s + l5s
## Model 2: Chosen.Rejected ~ Block + Column
## Resid. Df Resid. Dev Df Deviance
## 1 1978 2740.1
## 2 1982 2749.5 -4 -9.3565
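The anova call above reports the change in deviance but, by default, no p-value (alternatively, passing `test = "Chisq"` to anova prints one). The drop in deviance from adding the four traits can be compared to a chi-square distribution with 4 degrees of freedom; a sketch using the deviance printed above:

```r
#likelihood-ratio test: deviance difference of 9.3565 on 4 df
pchisq(9.3565, df = 4, lower.tail = FALSE)
## [1] 0.05278 (approximately): overall directional selection is marginal
```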
#make a table of the selection gradients and p-values for individual coefficients
sel.grad.table<-cbind(beta, summary(a3)$coefficients[4:7,4])
colnames(sel.grad.table)<-c("selection gradient", "p-value")
knitr::kable(sel.grad.table, "simple", digits=2)
| | selection gradient | p-value |
|---|---|---|
| l2s | -0.02 | 0.37 |
| l3s | 0.05 | 0.08 |
| l4s | 0.03 | 0.34 |
| l5s | -0.05 | 0.07 |
Now which trait is under the strongest directional selection? Weakest?
What is the total strength of directional selection on CHCs? hint: calculate the length of the vector
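The length (L2 norm) of the selection gradient vector measures the total strength of directional selection. A sketch of the calculation, simply copying the gradient estimates printed above into a vector:

```r
#beta copied from the multiple-regression output above
beta_hat <- c(-0.02422153, 0.05050675, 0.02761503, -0.04785819)
#the length of the vector is the square root of the sum of squared elements
sqrt(sum(beta_hat^2))
## [1] 0.07868044
```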
We might also be interested in whether there is stabilising or disruptive selection on each CHC trait. We will fit the same type of model as before, but now estimate the quadratic terms (ie. the square of each trait). Don’t forget to include the linear terms as covariates in the model, or to double the quadratic coefficients! (see Stinchcombe J. R., A. F. Agrawal, P. A. Hohenlohe, S. J. Arnold, and M. W. Blows. 2008. Estimating nonlinear selection gradients using quadratic regression coefficients: double or nothing? Evolution 62:2435-2440).
a5<-lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), data=data)
summary(lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), data=data))
##
## Call:
## lm(formula = relfit ~ Block + Column + l2s + l3s + l4s + l5s +
## I(l2s^2) + I(l3s^2) + I(l4s^2) + I(l5s^2), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4679 -0.9944 -0.6388 1.0119 1.8313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.979855 0.043509 22.521 < 2e-16 ***
## Block2 0.079896 0.052228 1.530 0.12624
## ColumnF 0.047844 0.046535 1.028 0.30402
## l2s -0.041607 0.028877 -1.441 0.14978
## l3s 0.062248 0.028948 2.150 0.03165 *
## l4s 0.101355 0.035056 2.891 0.00388 **
## l5s 0.015142 0.030539 0.496 0.62007
## I(l2s^2) -0.017450 0.015246 -1.145 0.25253
## I(l3s^2) 0.033031 0.014787 2.234 0.02561 *
## I(l4s^2) -0.047947 0.012952 -3.702 0.00022 ***
## I(l5s^2) -0.008717 0.006483 -1.345 0.17890
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.008 on 1974 degrees of freedom
## Multiple R-squared: 0.0163, Adjusted R-squared: 0.01131
## F-statistic: 3.27 on 10 and 1974 DF, p-value: 0.0003306
quad_coeffs<-a5$coefficients[8:11]*2
quad_coeffs
## I(l2s^2) I(l3s^2) I(l4s^2) I(l5s^2)
## -0.03490008 0.06606194 -0.09589340 -0.01743486
#fit the quadratic model with glm to obtain appropriate p-values
#(named a6.glm to avoid clashing with the a6 object fitted below)
a6.glm<-glm(Chosen.Rejected~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), family=binomial(link=logit), data=data)
quad.sel.table<-cbind(quad_coeffs, summary(a6.glm)$coefficients[8:11,4])
colnames(quad.sel.table)<-c("quad.grad", "p-value")
knitr::kable(quad.sel.table, "simple", digits=2)
| | quad.grad | p-value |
|---|---|---|
| I(l2s^2) | -0.03 | 0.26 |
| I(l3s^2) | 0.07 | 0.02 |
| I(l4s^2) | -0.10 | 0.00 |
| I(l5s^2) | -0.02 | 0.18 |
Are any traits under stabilising selection?
Are there any traits under disruptive selection?
The previous analysis looks at individual traits in isolation, but there might be selection on combinations of traits (ie. correlational selection). To estimate quadratic and correlational selection we need to expand our model to include both quadratic and correlational terms. With 4 traits, that is already quite a few parameters! 4 linear + 10 quadratic and correlational terms = 14. The matrix of quadratic and correlational selection coefficients that we are going to estimate is called ‘gamma’.
a6<-lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2)+I(l2s*l3s)+I(l2s*l4s)+I(l2s*l5s)+I(l3s*l4s)+I(l3s*l5s)+I(l4s*l5s), data=data)
summary(a6)
##
## Call:
## lm(formula = relfit ~ Block + Column + l2s + l3s + l4s + l5s +
## I(l2s^2) + I(l3s^2) + I(l4s^2) + I(l5s^2) + I(l2s * l3s) +
## I(l2s * l4s) + I(l2s * l5s) + I(l3s * l4s) + I(l3s * l5s) +
## I(l4s * l5s), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4241 -0.9927 -0.6462 1.0126 1.7353
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.987064 0.046919 21.038 < 2e-16 ***
## Block2 0.080932 0.052797 1.533 0.12547
## ColumnF 0.047449 0.046669 1.017 0.30941
## l2s -0.042522 0.029413 -1.446 0.14842
## l3s 0.066099 0.029265 2.259 0.02402 *
## l4s 0.096242 0.035878 2.682 0.00737 **
## l5s 0.013870 0.031818 0.436 0.66294
## I(l2s^2) -0.032168 0.019953 -1.612 0.10708
## I(l3s^2) 0.016914 0.024314 0.696 0.48673
## I(l4s^2) -0.058577 0.023023 -2.544 0.01103 *
## I(l5s^2) -0.005635 0.010770 -0.523 0.60086
## I(l2s * l3s) 0.044487 0.035498 1.253 0.21028
## I(l2s * l4s) -0.012360 0.032256 -0.383 0.70163
## I(l2s * l5s) 0.019347 0.030104 0.643 0.52051
## I(l3s * l4s) 0.026367 0.036084 0.731 0.46504
## I(l3s * l5s) -0.056696 0.032922 -1.722 0.08521 .
## I(l4s * l5s) 0.026389 0.033702 0.783 0.43371
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.009 on 1968 degrees of freedom
## Multiple R-squared: 0.01833, Adjusted R-squared: 0.01035
## F-statistic: 2.296 on 16 and 1968 DF, p-value: 0.002439
# arrange the coefficients into the gamma matrix:
# diagonal = doubled quadratic coefficients; off-diagonal = cross-product coefficients
gamma<-diag(a6$coefficients[8:11]*2)
gamma[2:4, 1]<-a6$coefficients[12:14]
gamma[3:4, 2]<-a6$coefficients[15:16]
gamma[4, 3]<-a6$coefficients[17]
gamma<-gamma+t(gamma)-diag(diag(gamma))
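A quick sanity check on the assembly, sketched here using the point estimates printed in the regression output above: the result should be symmetric, and its trace should equal the sum of the doubled quadratic coefficients.

```r
#coefficients copied from the summary(a6) output above
quad  <- c(-0.032168, 0.016914, -0.058577, -0.005635)                     #I(trait^2) terms
cross <- c(0.044487, -0.012360, 0.019347, 0.026367, -0.056696, 0.026389)  #cross-products
g <- diag(quad * 2)
g[lower.tri(g)] <- cross   #column-major fill matches the coefficient order
g <- g + t(g) - diag(diag(g))
isSymmetric(g)             #TRUE
sum(diag(g))               #about -0.159, the sum of the doubled quadratics
```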
What can you say about patterns of correlational selection on CHCs?
In the same way that it can be difficult to interpret G just by looking at it, it can also be difficult to interpret the multivariate pattern of selection just by looking at gamma. We can conduct an eigenanalysis of gamma the same way that we did with G to get a better understanding of multivariate selection. This is called the canonical rotation (see Blows for original paper, and x for a criticism of the approach).
eigen(gamma)
## eigen() decomposition
## $values
## [1] 0.089622429 -0.005774695 -0.100663656 -0.142117580
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] -0.1381846 -0.5890903 0.79258768 0.07538182
## [2,] -0.7857326 -0.4241542 -0.44869245 -0.03731817
## [3,] -0.2432062 0.2838662 0.08070198 0.92399023
## [4,] 0.5517065 -0.6264869 -0.40492844 0.37305092
#plot the eigenvalues of gamma
plot(eigen(gamma)$values)
abline(h=0, lty=2)
How many multivariate trait combinations are under disruptive selection?
How many multivariate trait combinations are under stabilising selection?
Overall, is there more disruptive or stabilising selection?
Note: In a real analysis you will need to account for the uncertainty of your estimates of G and selection. For this exercise, however, we will use the point estimates only. Use the posterior mode for G, and use the estimated selection vector from your linear model for beta.
First we will calculate the predicted response to directional selection on the 4 CHC traits
deltazbar<-pointG %*% beta
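To see what this matrix multiplication is doing, here is a made-up two-trait illustration (the numbers are hypothetical, not from this data set): genetic covariance can redirect, or even entirely cancel, the response to direct selection on a trait.

```r
#hypothetical two-trait G matrix and selection gradient
G2    <- matrix(c(2.0, 0.5,
                  0.5, 1.0), 2, 2)
beta2 <- c(0.10, -0.05)
G2 %*% beta2
#trait 1 responds (0.175) but trait 2 shows no net response (0),
#because the indirect response via the covariance cancels direct selection
```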
To calculate correlations between vectors, they need to be standardised to unit length first. Check whether they are, and if not, standardise them to unit length before answering the following questions.
#the length (L2 norm) of beta
beta_length<-sqrt(sum(beta^2))
beta_length
## [1] 0.07868044
beta_std<-beta/beta_length
beta_std<-matrix(beta_std, 4, 1)
gmax<-eigen(pointG)$vectors[,1]
#check whether it's unit length
sqrt(sum(gmax^2))
## [1] 1
#check beta_std for unit length
sqrt(sum(beta_std^2))
## [1] 1
Is directional selection aligned with the major axis of genetic variance? hint: calculate the vector correlation between gmax and the directional selection gradient
#calculate the angle between beta_std and gmax
#for unit-length vectors, the vector correlation is their dot product;
#taking the absolute value keeps the angle between 0 and 90 degrees
angle<-acos(abs(sum(beta_std*gmax)))*180/pi
angle
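As a sanity check on the geometry, the cosine of the angle between two unit-length vectors is their dot product. A tiny 2-D example with a known 45-degree answer (the vectors here are illustrative only):

```r
#two unit-length vectors 45 degrees apart
a <- c(1, 0)
b <- c(1, 1) / sqrt(2)
acos(sum(a * b)) * 180 / pi
## [1] 45
```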
How much genetic variance is there in the direction of linear selection? hint: project the selection gradient through G
#project the unit-length beta through G to get the genetic variance
#in the direction of directional selection
t(beta_std) %*% pointG %*% beta_std
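One way to calibrate this projection: if the unit-length gradient were perfectly aligned with an eigenvector of G, the projection would return that eigenvector's eigenvalue. A made-up diagonal example (hypothetical numbers, not from this data set):

```r
#hypothetical diagonal G: its eigenvectors are the trait axes themselves
G2 <- diag(c(2, 1))
b  <- c(1, 0)   #unit vector along the first axis
t(b) %*% G2 %*% b
## [1,] 2, i.e. the first eigenvalue
```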
Is that a lot or not much? How did you come to your conclusion?