In this activity, we will be analysing data from an experiment in Drosophila serrata conducted by Jacqueline. She set up a paternal half-sibling breeding design to estimate genetic variance in the male cuticular hydrocarbon (CHC) pheromone profile, and conducted a binomial fitness assay on these same individuals to measure their mating success. D. serrata has 8 well-studied CHCs that are known to be targets of sexual selection via female choice, and to have an important ecological role in preventing desiccation. We have measured these traits of interest in all offspring from the breeding design, and we also assayed the mating success of these individuals. Our goal is to better understand the genetic basis of these traits and whether they are under selection. In this activity we will analyse a subset of 4 CHC traits in order to estimate the genetic variance in, and covariance between, them (i.e. the G matrix), and to estimate directional and non-linear selection acting on these traits.
The original data were published in Sztepanacz, J.L. and Rundle, H.D., 2012. Reduced genetic variance among high fitness individuals: inferring stabilizing selection on male sexual displays in Drosophila serrata. Evolution, 66(10), pp.3101-3110 and the full data set is available on Dryad.
If you have not done so already, open this .Rmd file in RStudio to take advantage of the RMarkdown formatting.
First, set up your R environment by setting your working directory and loading packages.
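A minimal setup sketch follows. The package names are assumptions based on the functions used later in this activity; `fix_ped()`, `draw_ped()`, and `draw_pedA()` come from course-provided helper code, so source whatever file your instructor supplied (the file name below is a placeholder).

```r
# Set the working directory to wherever Data.csv and fullped.csv live
# setwd("path/to/your/working/directory")

library(MCMCglmm)  # animal models via MCMC (also loads coda)
library(dplyr)     # filter(), bind_rows()
library(ggplot2)   # plotting
# source("helper_functions.R")  # assumed file providing fix_ped(), draw_ped(), draw_pedA()
```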
Next, load the data. We have one file of data, contained in `Data.csv`, and one file with the pedigree, contained in `fullped.csv`. Load these data into R.
data <- read.csv("Data.csv", header = T)
ped <- read.csv("fullped.csv", header = T)
Before using the data, inspect the structure of the data in order to ensure that the classes of all variables are correct. Use this code block to convert variables to factors as needed. The data contains the following variables:
head(data)
## Sire Dam animal Column Block Chosen.Rejected Vial l2 l3
## 1 1 81 321 F 1 1 81 1.759736 0.3584508
## 2 1 81 322 F 1 0 81 1.755448 0.4241581
## 3 1 81 323 F 1 0 81 1.718102 0.3642616
## 4 1 81 324 F 1 1 81 1.728217 0.4719437
## 5 1 81 325 F 1 0 81 1.744923 0.4649696
## 6 1 81 326 B 1 1 81 1.728600 0.4677961
## l4 l5
## 1 0.2405727 0.5794495
## 2 0.2724918 0.3975695
## 3 0.3341148 0.3823888
## 4 0.1828450 0.5979854
## 5 0.2766503 0.5733535
## 6 0.5190923 0.9399905
head(ped)
## id dam sire
## 1 1 NA NA
## 2 2 NA NA
## 3 3 NA NA
## 4 4 NA NA
## 5 5 NA NA
## 6 6 NA NA
data$Sire<-as.factor(data$Sire)
data$Dam<-as.factor(data$Dam)
data$animal<-as.factor(data$animal)
data$Column<-as.factor(data$Column)
data$Block<-as.factor(data$Block)
data$Chosen.Rejected<-as.numeric(data$Chosen.Rejected)
data$Vial<-as.factor(data$Vial)
for (x in 1:3) ped[, x] <- as.factor(ped[, x])
#this function manipulates the pedigree to conform to the structure required for MCMCglmm
ped<-fix_ped(ped=ped)
Before jumping in to analyses, take a moment to explore the data. Produce graphs of your choice to visualize patterns in the distributions of the variables and the relationships between them.
plot(data[,8:11])
hist(data$l2)
hist(data$l3)
hist(data$l4)
hist(data$l5)
#summarise the data
summary(data[,8:11])
## l2 l3 l4 l5
## Min. :1.518 Min. :0.004832 Min. :-0.1383 Min. :-0.003662
## 1st Qu.:1.691 1st Qu.:0.259254 1st Qu.: 0.1557 1st Qu.: 0.537238
## Median :1.716 Median :0.313911 Median : 0.2494 Median : 0.625257
## Mean :1.713 Mean :0.317438 Mean : 0.2614 Mean : 0.630429
## 3rd Qu.:1.739 3rd Qu.:0.369115 3rd Qu.: 0.3311 3rd Qu.: 0.714851
## Max. :1.834 Max. :0.893993 Max. : 1.3929 Max. : 2.666013
# standardise the data to mean 0 variance 1
data_std<-scale(data[,8:11])
colnames(data_std)<-c("l2s", "l3s", "l4s", "l5s")
data<-cbind(data, data_std)
cor(data[,12:15])
## l2s l3s l4s l5s
## l2s 1.0000000 0.4195317 0.1084286 0.3970258
## l3s 0.4195317 1.0000000 0.4772407 0.2405112
## l4s 0.1084286 0.4772407 1.0000000 0.2618420
## l5s 0.3970258 0.2405112 0.2618420 1.0000000
Are there any outliers in the data? How are the data distributed? Do they meet the assumptions of the model you would like to fit? Is there anything interesting to note in the summary statistics? Is there anything else we should be looking for when exploring the data?
Do you predict that there is a genetic correlation between any of the traits? If so, why?
If there are any outliers in the data, remove them here, and then standardise the data to a mean of 0 and variance of 1.
pairs(data[,8:11])
#plot l3 vs l5 with animals labelled to figure out which to remove
ggplot(data=data, aes(x = l3, y=l5))+geom_text(aes(label=animal))
#remove outliers
remove<-c("2143", "1093", "2153", "2154")
data <- data %>% filter(!animal %in% remove)
# standardise the data to mean 0 variance 1
data_std<-scale(data[,8:11])
colnames(data_std)<-c("l2s", "l3s", "l4s", "l5s")
data<-cbind(data, data_std)
cor(data[,12:15])
## l2s l3s l4s l5s
## l2s 1.0000000 0.4194997 0.1000275 0.4150147
## l3s 0.4194997 1.0000000 0.4490175 0.1944472
## l4s 0.1000275 0.4490175 1.0000000 0.1976393
## l5s 0.4150147 0.1944472 0.1976393 1.0000000
#explore the pedigree; here we visualize it for our interest
#this code draws the pedigree structure; unfortunately, for a half-sib/full-sib design it doesn't look very interesting
draw_ped(ped)
#this code draws the additive genetic relatedness matrix; again, not very interesting
draw_pedA(ped)
So you can see what a different kind of pedigree might look like, let's draw an example.
#here is an example of a pedigree that does look interesting
data(gryphons)
test.pedigree <- fix_ped(gryphons[,1:3])
## draw the gryphon pedigree by pedigree depth
draw_ped(test.pedigree)
draw_pedA(test.pedigree)
Next, build a model to investigate the genetic variance and covariance between the log-contrast CHC traits. Even though we only have 4 traits, we still need to estimate quite a few parameters: n*(n+1)/2, where n is the number of traits, so 10 in this case.
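As a quick sanity check on that count, the number of distinct elements in an n x n symmetric covariance matrix can be tabulated directly:

```r
# distinct (co)variances in an n x n symmetric matrix, for 1 to 6 traits
n <- 1:6
data.frame(traits = n, parameters = n * (n + 1) / 2)
```

With n = 4 we get 10 parameters, and the model we fit estimates three such covariance structures (animal, Vial, and residual), so 30 (co)variance parameters in total.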
Use this code block to generate the inverse A matrix that you can provide to `MCMCglmm()`.
Ainv <- inverseA(ped)$Ainv
Then, use this code block to specify your prior and run your model using `MCMCglmm()`. In this model, include `Block` and `Column` as fixed effects, and `Vial` as a random effect.
Choose a small number of iterations to run in the interest of time.
Check out this tutorial https://devillemereuil.legtux.org/wp-content/uploads/2012/12/tuto_en.pdf for more information on how to set up a multivariate analysis.
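The run length is controlled by the `nitt`, `burnin`, and `thin` arguments; the number of stored posterior samples is `(nitt - burnin) / thin`. For example, with the MCMCglmm defaults:

```r
# stored posterior samples under the MCMCglmm defaults
# (nitt = 13000, burnin = 3000, thin = 10)
(13000 - 3000) / 10
```

This matches the "Sample size = 1000" reported in the model summary.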
#specify a prior for a 4x4 covariance matrix
prior <- list(
G = list(G1 = list(V = diag(4)/3, nu = 1.002),
G2 = list(V = diag(4)/3, nu = 1.002)),
R = list(V = diag(4)/3, nu = 1.002)
)
#build a model
model <- MCMCglmm(cbind(l2s, l3s, l4s, l5s) ~ trait - 1 + Block + Column,
                  random = ~ us(trait):animal + us(trait):Vial,
                  rcov = ~ us(trait):units,
                  family = c("gaussian", "gaussian", "gaussian", "gaussian"),
                  ginverse = list(animal = Ainv),
                  data = data, prior = prior, verbose = T
                  #to control the run length, also set nitt=, burnin=, thin=
)
##
## MCMC iteration = 0
##
## MCMC iteration = 1000
##
## MCMC iteration = 2000
##
## MCMC iteration = 3000
##
## MCMC iteration = 4000
##
## MCMC iteration = 5000
##
## MCMC iteration = 6000
##
## MCMC iteration = 7000
##
## MCMC iteration = 8000
##
## MCMC iteration = 9000
##
## MCMC iteration = 10000
##
## MCMC iteration = 11000
##
## MCMC iteration = 12000
##
## MCMC iteration = 13000
#summarize the model
summary(model)
##
## Iterations = 3001:12991
## Thinning interval = 10
## Sample size = 1000
##
## DIC: 8959.042
##
## G-structure: ~us(trait):animal
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.animal 0.73013 0.616425 0.8621 37.09
## traitl3s:traitl2s.animal 0.36866 0.224162 0.5136 23.64
## traitl4s:traitl2s.animal 0.10250 -0.038942 0.2207 17.57
## traitl5s:traitl2s.animal 0.34146 0.176057 0.5200 11.82
## traitl2s:traitl3s.animal 0.36866 0.224162 0.5136 23.64
## traitl3s:traitl3s.animal 0.58241 0.363786 0.8533 22.69
## traitl4s:traitl3s.animal 0.22955 0.066578 0.3926 16.23
## traitl5s:traitl3s.animal 0.18952 0.009458 0.3731 14.54
## traitl2s:traitl4s.animal 0.10250 -0.038942 0.2207 17.57
## traitl3s:traitl4s.animal 0.22955 0.066578 0.3926 16.23
## traitl4s:traitl4s.animal 0.42334 0.228743 0.5969 17.84
## traitl5s:traitl4s.animal -0.02578 -0.165363 0.1194 17.15
## traitl2s:traitl5s.animal 0.34146 0.176057 0.5200 11.82
## traitl3s:traitl5s.animal 0.18952 0.009458 0.3731 14.54
## traitl4s:traitl5s.animal -0.02578 -0.165363 0.1194 17.15
## traitl5s:traitl5s.animal 0.43399 0.167340 0.7039 10.13
##
## ~us(trait):Vial
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.Vial 0.14540 0.05981 0.24063 208.74
## traitl3s:traitl2s.Vial 0.01585 -0.07081 0.09102 75.83
## traitl4s:traitl2s.Vial -0.01671 -0.10341 0.06322 35.25
## traitl5s:traitl2s.Vial 0.06243 -0.02613 0.15592 34.60
## traitl2s:traitl3s.Vial 0.01585 -0.07081 0.09102 75.83
## traitl3s:traitl3s.Vial 0.18607 0.06900 0.29903 39.68
## traitl4s:traitl3s.Vial 0.09231 -0.00444 0.18738 34.03
## traitl5s:traitl3s.Vial -0.01054 -0.08892 0.07311 36.07
## traitl2s:traitl4s.Vial -0.01671 -0.10341 0.06322 35.25
## traitl3s:traitl4s.Vial 0.09231 -0.00444 0.18738 34.03
## traitl4s:traitl4s.Vial 0.27150 0.16067 0.40355 75.12
## traitl5s:traitl4s.Vial 0.08774 0.01012 0.17247 50.70
## traitl2s:traitl5s.Vial 0.06243 -0.02613 0.15592 34.60
## traitl3s:traitl5s.Vial -0.01054 -0.08892 0.07311 36.07
## traitl4s:traitl5s.Vial 0.08774 0.01012 0.17247 50.70
## traitl5s:traitl5s.Vial 0.21480 0.10612 0.33037 23.34
##
## R-structure: ~us(trait):units
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2s:traitl2s.units 0.081419 0.028345 0.13651 31.722
## traitl3s:traitl2s.units 0.016870 -0.063477 0.07606 19.576
## traitl4s:traitl2s.units -0.030458 -0.094219 0.03539 15.471
## traitl5s:traitl2s.units -0.007396 -0.087412 0.08141 9.207
## traitl2s:traitl3s.units 0.016870 -0.063477 0.07606 19.576
## traitl3s:traitl3s.units 0.201377 0.069765 0.31263 18.759
## traitl4s:traitl3s.units 0.079926 0.001138 0.16716 15.045
## traitl5s:traitl3s.units 0.021785 -0.068093 0.11265 13.110
## traitl2s:traitl4s.units -0.030458 -0.094219 0.03539 15.471
## traitl3s:traitl4s.units 0.079926 0.001138 0.16716 15.045
## traitl4s:traitl4s.units 0.192345 0.098933 0.28151 16.244
## traitl5s:traitl4s.units 0.129597 0.055261 0.20123 16.242
## traitl2s:traitl5s.units -0.007396 -0.087412 0.08141 9.207
## traitl3s:traitl5s.units 0.021785 -0.068093 0.11265 13.110
## traitl4s:traitl5s.units 0.129597 0.055261 0.20123 16.242
## traitl5s:traitl5s.units 0.257178 0.102059 0.37786 8.478
##
## Location effects: cbind(l2s, l3s, l4s, l5s) ~ trait - 1 + Block + Column
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## traitl2s 0.09852 -0.06774 0.23949 1000.0 0.198
## traitl3s 0.09211 -0.03830 0.22140 249.2 0.190
## traitl4s 0.06703 -0.06621 0.20729 437.3 0.328
## traitl5s 0.06745 -0.06553 0.19975 515.9 0.338
## Block2 -0.13993 -0.30039 0.03622 101.0 0.116
## ColumnF -0.01096 -0.05464 0.03231 1000.0 0.658
After running your model, assess the MCMC run to see whether it ran for a sufficient length of time. You are looking for a trace plot that looks like a fuzzy caterpillar, and a density plot that looks like a smooth, approximately normal distribution.
#plot the trace and posterior density for the fixed effects and each variance component
plot(model$Sol)
plot(model$VCV[,1:16])  #animal (G) components
plot(model$VCV[,17:32]) #Vial components
plot(model$VCV[,33:48]) #residual (units) components
#calculate autocorrelation between estimates
autocorr.diag(model$VCV)
## traitl2s:traitl2s.animal traitl3s:traitl2s.animal
## Lag 0 1.00000000 1.0000000
## Lag 10 0.77671532 0.8745685
## Lag 50 0.56604086 0.7642979
## Lag 100 0.36219701 0.6425177
## Lag 500 0.05261015 0.2048663
## traitl4s:traitl2s.animal traitl5s:traitl2s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9183136 0.9103215
## Lag 50 0.8046875 0.7607653
## Lag 100 0.6830884 0.6162243
## Lag 500 0.1817047 0.1523944
## traitl2s:traitl3s.animal traitl3s:traitl3s.animal
## Lag 0 1.0000000 1.00000000
## Lag 10 0.8745685 0.91984697
## Lag 50 0.7642979 0.79499829
## Lag 100 0.6425177 0.66316894
## Lag 500 0.2048663 0.08397362
## traitl4s:traitl3s.animal traitl5s:traitl3s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9162679 0.9464407
## Lag 50 0.7961105 0.8452590
## Lag 100 0.6809388 0.7317542
## Lag 500 0.1148138 0.1995741
## traitl2s:traitl4s.animal traitl3s:traitl4s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9183136 0.9162679
## Lag 50 0.8046875 0.7961105
## Lag 100 0.6830884 0.6809388
## Lag 500 0.1817047 0.1148138
## traitl4s:traitl4s.animal traitl5s:traitl4s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9301880 0.9684940
## Lag 50 0.8295949 0.8998259
## Lag 100 0.7343611 0.8189337
## Lag 500 0.2090379 0.3187443
## traitl2s:traitl5s.animal traitl3s:traitl5s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9103215 0.9464407
## Lag 50 0.7607653 0.8452590
## Lag 100 0.6162243 0.7317542
## Lag 500 0.1523944 0.1995741
## traitl4s:traitl5s.animal traitl5s:traitl5s.animal
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9684940 0.9658395
## Lag 50 0.8998259 0.8771823
## Lag 100 0.8189337 0.7866685
## Lag 500 0.3187443 0.3137599
## traitl2s:traitl2s.Vial traitl3s:traitl2s.Vial traitl4s:traitl2s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.49687365 0.61312066 0.56278496
## Lag 50 0.19156170 0.35023029 0.35035117
## Lag 100 0.22061045 0.31395526 0.29606998
## Lag 500 -0.00442625 0.08328054 0.05239741
## traitl5s:traitl2s.Vial traitl2s:traitl3s.Vial traitl3s:traitl3s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.56799574 0.61312066 0.53261983
## Lag 50 0.34315889 0.35023029 0.30752863
## Lag 100 0.31270951 0.31395526 0.29030163
## Lag 500 0.06132009 0.08328054 0.05881093
## traitl4s:traitl3s.Vial traitl5s:traitl3s.Vial traitl2s:traitl4s.Vial
## Lag 0 1.00000000 1.00000000 1.00000000
## Lag 10 0.48860877 0.60430213 0.56278496
## Lag 50 0.34512310 0.43007083 0.35035117
## Lag 100 0.30117529 0.38894663 0.29606998
## Lag 500 0.07704659 0.08240153 0.05239741
## traitl3s:traitl4s.Vial traitl4s:traitl4s.Vial traitl5s:traitl4s.Vial
## Lag 0 1.00000000 1.0000000 1.0000000
## Lag 10 0.48860877 0.4554379 0.6085302
## Lag 50 0.34512310 0.3405114 0.5142248
## Lag 100 0.30117529 0.2879653 0.4577181
## Lag 500 0.07704659 0.1199511 0.2322306
## traitl2s:traitl5s.Vial traitl3s:traitl5s.Vial traitl4s:traitl5s.Vial
## Lag 0 1.00000000 1.00000000 1.0000000
## Lag 10 0.56799574 0.60430213 0.6085302
## Lag 50 0.34315889 0.43007083 0.5142248
## Lag 100 0.31270951 0.38894663 0.4577181
## Lag 500 0.06132009 0.08240153 0.2322306
## traitl5s:traitl5s.Vial traitl2s:traitl2s.units traitl3s:traitl2s.units
## Lag 0 1.0000000 1.00000000 1.0000000
## Lag 10 0.6395262 0.91665304 0.9552116
## Lag 50 0.5473007 0.65974264 0.8216119
## Lag 100 0.4515577 0.42089420 0.6791017
## Lag 500 0.2139800 0.09396462 0.2410493
## traitl4s:traitl2s.units traitl5s:traitl2s.units traitl2s:traitl3s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9499647 0.9400568 0.9552116
## Lag 50 0.8346657 0.7933036 0.8216119
## Lag 100 0.7085795 0.6496255 0.6791017
## Lag 500 0.2146753 0.1418543 0.2410493
## traitl3s:traitl3s.units traitl4s:traitl3s.units traitl5s:traitl3s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9492174 0.9456094 0.9526415
## Lag 50 0.8154004 0.8159038 0.8422274
## Lag 100 0.6898534 0.7032383 0.7310818
## Lag 500 0.1024166 0.1078273 0.2011584
## traitl2s:traitl4s.units traitl3s:traitl4s.units traitl4s:traitl4s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9499647 0.9456094 0.9434076
## Lag 50 0.8346657 0.8159038 0.8306505
## Lag 100 0.7085795 0.7032383 0.7370967
## Lag 500 0.2146753 0.1078273 0.1957837
## traitl5s:traitl4s.units traitl2s:traitl5s.units traitl3s:traitl5s.units
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 10 0.9611350 0.9400568 0.9526415
## Lag 50 0.8929816 0.7933036 0.8422274
## Lag 100 0.8117488 0.6496255 0.7310818
## Lag 500 0.3158567 0.1418543 0.2011584
## traitl4s:traitl5s.units traitl5s:traitl5s.units
## Lag 0 1.0000000 1.0000000
## Lag 10 0.9611350 0.9516570
## Lag 50 0.8929816 0.8721305
## Lag 100 0.8117488 0.7828992
## Lag 500 0.3158567 0.3010300
The following bit of code makes a nice table with the estimates and their effective sample sizes beside them. The effective sample size indicates the number of effectively independent samples in your MCMC chain, which you want to be as large as possible. You are looking for effective sample sizes close to the actual number of stored samples from your MCMC chain; smaller numbers indicate autocorrelation between the estimates.
clean.MCMC <- function(x) {
sols <- summary(x)$solutions ## pull out relevant info from model summary
Gcovs <- summary(x)$Gcovariances
Rcovs <- summary(x)$Rcovariances
fixed <- data.frame(row.names(sols), sols, row.names = NULL) ## convert to dataframes with the row.names as the first col
random <- data.frame(row.names(Gcovs), Gcovs, row.names = NULL)
residual <- data.frame(row.names(Rcovs), Rcovs, row.names = NULL)
names(fixed)[names(fixed) == "row.names.sols."] <- "variable" ## change the columns names to variable, so they all match
names(random)[names(random) == "row.names.Gcovs."] <- "variable"
names(residual)[names(residual) == "row.names.Rcovs."] <- "variable"
fixed$effect <- "fixed" ## add ID column for type of effect (fixed, random, residual)
random$effect <- "random"
residual$effect <- "residual"
modelTerms <- as.data.frame(bind_rows(fixed, random, residual)) # merge it all together
}
getName.MCMC <- function(x) {deparse(substitute(x))}
oneModel <- clean.MCMC(model) # get all the info from summary(modelName)
oneModel$modelName <- getName.MCMC(model) # add the model's name in a new column
oneModel
## variable post.mean l.95..CI u.95..CI eff.samp
## 1 traitl2s 0.098523877 -0.067739515 0.23948665 1000.000000
## 2 traitl3s 0.092111321 -0.038296162 0.22139822 249.217209
## 3 traitl4s 0.067033189 -0.066205468 0.20729053 437.267624
## 4 traitl5s 0.067452505 -0.065533190 0.19975484 515.887523
## 5 Block2 -0.139926410 -0.300394334 0.03621734 100.955905
## 6 ColumnF -0.010964564 -0.054644068 0.03231489 1000.000000
## 7 traitl2s:traitl2s.animal 0.730125087 0.616425377 0.86214877 37.093461
## 8 traitl3s:traitl2s.animal 0.368661615 0.224162176 0.51361361 23.638521
## 9 traitl4s:traitl2s.animal 0.102500917 -0.038941847 0.22070311 17.572866
## 10 traitl5s:traitl2s.animal 0.341462850 0.176057127 0.51997719 11.818877
## 11 traitl2s:traitl3s.animal 0.368661615 0.224162176 0.51361361 23.638521
## 12 traitl3s:traitl3s.animal 0.582411656 0.363785741 0.85328352 22.688535
## 13 traitl4s:traitl3s.animal 0.229550587 0.066577668 0.39262077 16.227555
## 14 traitl5s:traitl3s.animal 0.189524695 0.009457925 0.37312718 14.538337
## 15 traitl2s:traitl4s.animal 0.102500917 -0.038941847 0.22070311 17.572866
## 16 traitl3s:traitl4s.animal 0.229550587 0.066577668 0.39262077 16.227555
## 17 traitl4s:traitl4s.animal 0.423342651 0.228743033 0.59694534 17.835616
## 18 traitl5s:traitl4s.animal -0.025779594 -0.165363009 0.11936950 17.145641
## 19 traitl2s:traitl5s.animal 0.341462850 0.176057127 0.51997719 11.818877
## 20 traitl3s:traitl5s.animal 0.189524695 0.009457925 0.37312718 14.538337
## 21 traitl4s:traitl5s.animal -0.025779594 -0.165363009 0.11936950 17.145641
## 22 traitl5s:traitl5s.animal 0.433993830 0.167340435 0.70391727 10.130271
## 23 traitl2s:traitl2s.Vial 0.145401082 0.059806795 0.24063419 208.737909
## 24 traitl3s:traitl2s.Vial 0.015847227 -0.070806942 0.09101862 75.834482
## 25 traitl4s:traitl2s.Vial -0.016707009 -0.103411130 0.06321968 35.246231
## 26 traitl5s:traitl2s.Vial 0.062425221 -0.026127843 0.15592389 34.602775
## 27 traitl2s:traitl3s.Vial 0.015847227 -0.070806942 0.09101862 75.834482
## 28 traitl3s:traitl3s.Vial 0.186070157 0.068998658 0.29903242 39.677978
## 29 traitl4s:traitl3s.Vial 0.092305132 -0.004440188 0.18738007 34.031047
## 30 traitl5s:traitl3s.Vial -0.010540752 -0.088918504 0.07311185 36.072753
## 31 traitl2s:traitl4s.Vial -0.016707009 -0.103411130 0.06321968 35.246231
## 32 traitl3s:traitl4s.Vial 0.092305132 -0.004440188 0.18738007 34.031047
## 33 traitl4s:traitl4s.Vial 0.271504032 0.160671395 0.40354961 75.124679
## 34 traitl5s:traitl4s.Vial 0.087741535 0.010118732 0.17246891 50.698637
## 35 traitl2s:traitl5s.Vial 0.062425221 -0.026127843 0.15592389 34.602775
## 36 traitl3s:traitl5s.Vial -0.010540752 -0.088918504 0.07311185 36.072753
## 37 traitl4s:traitl5s.Vial 0.087741535 0.010118732 0.17246891 50.698637
## 38 traitl5s:traitl5s.Vial 0.214800526 0.106116046 0.33037368 23.341733
## 39 traitl2s:traitl2s.units 0.081418799 0.028345215 0.13650946 31.722363
## 40 traitl3s:traitl2s.units 0.016869837 -0.063476807 0.07606305 19.575880
## 41 traitl4s:traitl2s.units -0.030458447 -0.094218714 0.03538642 15.471428
## 42 traitl5s:traitl2s.units -0.007396331 -0.087411521 0.08141220 9.207197
## 43 traitl2s:traitl3s.units 0.016869837 -0.063476807 0.07606305 19.575880
## 44 traitl3s:traitl3s.units 0.201376778 0.069764615 0.31262855 18.758734
## 45 traitl4s:traitl3s.units 0.079925683 0.001138465 0.16715568 15.045283
## 46 traitl5s:traitl3s.units 0.021785378 -0.068092779 0.11265145 13.110478
## 47 traitl2s:traitl4s.units -0.030458447 -0.094218714 0.03538642 15.471428
## 48 traitl3s:traitl4s.units 0.079925683 0.001138465 0.16715568 15.045283
## 49 traitl4s:traitl4s.units 0.192344997 0.098932560 0.28151483 16.244320
## 50 traitl5s:traitl4s.units 0.129597270 0.055260732 0.20122887 16.242138
## 51 traitl2s:traitl5s.units -0.007396331 -0.087411521 0.08141220 9.207197
## 52 traitl3s:traitl5s.units 0.021785378 -0.068092779 0.11265145 13.110478
## 53 traitl4s:traitl5s.units 0.129597270 0.055260732 0.20122887 16.242138
## 54 traitl5s:traitl5s.units 0.257178096 0.102059017 0.37786124 8.478093
## pMCMC effect modelName
## 1 0.198 fixed model
## 2 0.190 fixed model
## 3 0.328 fixed model
## 4 0.338 fixed model
## 5 0.116 fixed model
## 6 0.658 fixed model
## 7 NA random model
## 8 NA random model
## 9 NA random model
## 10 NA random model
## 11 NA random model
## 12 NA random model
## 13 NA random model
## 14 NA random model
## 15 NA random model
## 16 NA random model
## 17 NA random model
## 18 NA random model
## 19 NA random model
## 20 NA random model
## 21 NA random model
## 22 NA random model
## 23 NA random model
## 24 NA random model
## 25 NA random model
## 26 NA random model
## 27 NA random model
## 28 NA random model
## 29 NA random model
## 30 NA random model
## 31 NA random model
## 32 NA random model
## 33 NA random model
## 34 NA random model
## 35 NA random model
## 36 NA random model
## 37 NA random model
## 38 NA random model
## 39 NA residual model
## 40 NA residual model
## 41 NA residual model
## 42 NA residual model
## 43 NA residual model
## 44 NA residual model
## 45 NA residual model
## 46 NA residual model
## 47 NA residual model
## 48 NA residual model
## 49 NA residual model
## 50 NA residual model
## 51 NA residual model
## 52 NA residual model
## 53 NA residual model
## 54 NA residual model
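You can also pull effective sample sizes for every parameter in one call with coda's `effectiveSize()` (coda is loaded along with MCMCglmm). This sketch assumes `model` is the fit from above.

```r
# effective number of independent samples for each variance component
effectiveSize(model$VCV)
# and for the fixed effects
effectiveSize(model$Sol)
```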
Do you think that your model ran for a sufficient number of MCMC iterations? Describe the features you used to reach your conclusion.
If your model did not run for a sufficient amount of time, import the output from an analysis that I ran previously and continue your analysis using that output.
#load(file = "./activity/MCMCglmm_model_z_Longrun.rda")
Now, let’s look at the estimates for the variance components in the model.
writeLines("Variance components:")
#calculate the modes of the posterior distribution for the variance components
posterior.mode(model$VCV)
## Variance components:
## traitl2s:traitl2s.animal traitl3s:traitl2s.animal traitl4s:traitl2s.animal
## 0.663548211 0.341182067 0.116515542
## traitl5s:traitl2s.animal traitl2s:traitl3s.animal traitl3s:traitl3s.animal
## 0.336772596 0.341182067 0.598548365
## traitl4s:traitl3s.animal traitl5s:traitl3s.animal traitl2s:traitl4s.animal
## 0.331569494 0.161049892 0.116515542
## traitl3s:traitl4s.animal traitl4s:traitl4s.animal traitl5s:traitl4s.animal
## 0.331569494 0.488468971 -0.061777253
## traitl2s:traitl5s.animal traitl3s:traitl5s.animal traitl4s:traitl5s.animal
## 0.336772596 0.161049892 -0.061777253
## traitl5s:traitl5s.animal traitl2s:traitl2s.Vial traitl3s:traitl2s.Vial
## 0.446580528 0.152470459 0.034317339
## traitl4s:traitl2s.Vial traitl5s:traitl2s.Vial traitl2s:traitl3s.Vial
## -0.043478773 0.058389242 0.034317339
## traitl3s:traitl3s.Vial traitl4s:traitl3s.Vial traitl5s:traitl3s.Vial
## 0.129727694 0.079436964 -0.027774004
## traitl2s:traitl4s.Vial traitl3s:traitl4s.Vial traitl4s:traitl4s.Vial
## -0.043478773 0.079436964 0.277259806
## traitl5s:traitl4s.Vial traitl2s:traitl5s.Vial traitl3s:traitl5s.Vial
## 0.109707865 0.058389242 -0.027774004
## traitl4s:traitl5s.Vial traitl5s:traitl5s.Vial traitl2s:traitl2s.units
## 0.109707865 0.219878700 0.098643556
## traitl3s:traitl2s.units traitl4s:traitl2s.units traitl5s:traitl2s.units
## -0.002707819 -0.046813581 -0.029243644
## traitl2s:traitl3s.units traitl3s:traitl3s.units traitl4s:traitl3s.units
## -0.002707819 0.169693333 0.053737027
## traitl5s:traitl3s.units traitl2s:traitl4s.units traitl3s:traitl4s.units
## 0.023220952 -0.046813581 0.053737027
## traitl4s:traitl4s.units traitl5s:traitl4s.units traitl2s:traitl5s.units
## 0.160809569 0.166652525 -0.029243644
## traitl3s:traitl5s.units traitl4s:traitl5s.units traitl5s:traitl5s.units
## 0.023220952 0.166652525 0.352830271
Estimate the heritability of each trait
#estimate heritability of l2-l5
heritability.l2 <- model$VCV[, "traitl2s:traitl2s.animal"] / (model$VCV[, "traitl2s:traitl2s.animal"] +model$VCV[, "traitl2s:traitl2s.Vial"]+ model$VCV[, "traitl2s:traitl2s.units"])
heritability.l3 <- model$VCV[, "traitl3s:traitl3s.animal"] / (model$VCV[, "traitl3s:traitl3s.animal"] +model$VCV[, "traitl3s:traitl3s.Vial"]+ model$VCV[, "traitl3s:traitl3s.units"])
heritability.l4 <- model$VCV[, "traitl4s:traitl4s.animal"] / (model$VCV[, "traitl4s:traitl4s.animal"] + model$VCV[, "traitl4s:traitl4s.Vial"]+ model$VCV[, "traitl4s:traitl4s.units"])
heritability.l5 <- model$VCV[, "traitl5s:traitl5s.animal"] / (model$VCV[, "traitl5s:traitl5s.animal"] +model$VCV[, "traitl5s:traitl5s.Vial"]+ model$VCV[, "traitl5s:traitl5s.units"])
#calculate the mode of the posterior distributions for heritability and the HPD intervals
h2.l2<-c(posterior.mode(heritability.l2), HPDinterval(heritability.l2))
h2.l3<-c(posterior.mode(heritability.l3), HPDinterval(heritability.l3))
h2.l4<-c(posterior.mode(heritability.l4), HPDinterval(heritability.l4))
h2.l5<-c(posterior.mode(heritability.l5), HPDinterval(heritability.l5))
#arrange in a data frame and produce a table
heritabilities<-rbind(h2.l2, h2.l3, h2.l4, h2.l5)
colnames(heritabilities)<-c("h2", "lowerHPD", "upperHPD")
knitr::kable(heritabilities[,1:3], "simple", digits=2)
|       |   h2 | lowerHPD | upperHPD |
|:------|-----:|---------:|---------:|
| h2.l2 | 0.81 |     0.64 |     0.88 |
| h2.l3 | 0.56 |     0.36 |     0.82 |
| h2.l4 | 0.44 |     0.26 |     0.67 |
| h2.l5 | 0.46 |     0.20 |     0.74 |
What is your estimate of the heritability for each trait? Include the 95% HPD.
Let’s arrange the estimates from our model into a G-matrix
#put the posterior mode of the VCV elements in a matrix
pointG<-matrix(posterior.mode(model$VCV[,1:16]),4,4)
pointG
## [,1] [,2] [,3] [,4]
## [1,] 0.6635482 0.3411821 0.11651554 0.33677260
## [2,] 0.3411821 0.5985484 0.33156949 0.16104989
## [3,] 0.1165155 0.3315695 0.48846897 -0.06177725
## [4,] 0.3367726 0.1610499 -0.06177725 0.44658053
#turn into correlation matrix
pointGcor<-cov2cor(pointG)
pointGcor
## [,1] [,2] [,3] [,4]
## [1,] 1.0000000 0.5413777 0.2046582 0.6186580
## [2,] 0.5413777 1.0000000 0.6132061 0.3115021
## [3,] 0.2046582 0.6132061 1.0000000 -0.1322697
## [4,] 0.6186580 0.3115021 -0.1322697 1.0000000
What is the genetic covariance between each pair of traits? Include the 95% HPD.
What are the genetic correlations between all traits?
What would you conclude about the genetic architecture of these traits?
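One way to answer the first two questions: compute each genetic covariance or correlation at every posterior sample, then summarise the resulting chain with its posterior mode and 95% HPD interval. A sketch for the l2-l3 pair, assuming `model` is the fit from above (the same pattern works for any pair of traits):

```r
# posterior chain of the l2-l3 genetic correlation: covariance divided by
# the square root of the product of the two genetic variances, per sample
rG.l2l3 <- model$VCV[, "traitl3s:traitl2s.animal"] /
  sqrt(model$VCV[, "traitl2s:traitl2s.animal"] *
       model$VCV[, "traitl3s:traitl3s.animal"])
posterior.mode(rG.l2l3)
HPDinterval(rG.l2l3)

# the corresponding genetic covariance with its 95% HPD interval
posterior.mode(model$VCV[, "traitl3s:traitl2s.animal"])
HPDinterval(model$VCV[, "traitl3s:traitl2s.animal"])
```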
Now that we have estimated G, we would like to determine how genetic variance is distributed across multivariate space.
Start by doing an eigenanalysis of G and produce a scree plot of the eigenvalues.
eigen(pointG)
## eigen() decomposition
## $values
## [1] 1.2409757 0.6417662 0.1842224 0.1301817
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] 0.6324326 0.3553261 0.6731876 -0.1434953
## [2,] 0.5981946 -0.3351711 -0.4981884 -0.5306900
## [3,] 0.3316509 -0.6768211 0.1803805 0.6319682
## [4,] 0.3635931 0.5507421 -0.5158389 0.5462540
#plot the eigenvalues
plot(eigen(pointG)$values)
#plot the eigenvalues with uncertainty
#save an object of the MCMC samples of G
Gmcmc<-model$VCV[,1:16]
#eigenanalysis of each posterior sample
eigs<-array(dim=c(1000,4))
for (i in 1:1000){
eigs[i,]<-eigen(matrix(Gmcmc[i,],4,4))$values
}
eig1l<-quantile(eigs[,1], 0.025)
eig1u<-quantile(eigs[,1], 0.975)
eig2l<-quantile(eigs[,2], 0.025)
eig2u<-quantile(eigs[,2], 0.975)
eig3l<-quantile(eigs[,3], 0.025)
eig3u<-quantile(eigs[,3], 0.975)
eig4l<-quantile(eigs[,4], 0.025)
eig4u<-quantile(eigs[,4], 0.975)
#plot
eigen_plot<-data.frame(seq(1:4),eigen(pointG)$values, rbind(eig1l, eig2l, eig3l, eig4l), rbind(eig1u, eig2u, eig3u, eig4u))
colnames(eigen_plot)<-c("eigenvalue", "value", "lci", "uci")
p<-ggplot(data = eigen_plot, aes(x = eigenvalue, y = value)) +
geom_point(cex=5) +
geom_errorbar(aes(x = eigenvalue, ymin = lci, ymax = uci))+
scale_y_continuous(expand = c(0, 0), limits = c(-1, 2))
p
Are there any multivariate trait combinations that have low genetic variance? How many?
What is the effective dimensionality of the matrix?
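One commonly used summary (among several possible choices of metric) is Kirkpatrick's effective number of dimensions: the sum of the eigenvalues divided by the largest eigenvalue, computed here from the point estimate `pointG`.

```r
# effective dimensionality n_D = sum(lambda) / max(lambda);
# n_D = 1 if all genetic variance lies along a single axis,
# n_D = 4 if variance is spread evenly across all four axes
lambda <- eigen(pointG)$values
sum(lambda) / max(lambda)
```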
We would like to know whether our estimates of genetic variance and covariance differ from what sampling error alone would produce (i.e. whether they are statistically significant). MCMCglmm constrains estimates of G to be positive-definite, which means that estimates of variances, including eigenvalues, must be greater than 0. Therefore, the lower bound of an HPD interval cannot be less than 0, which means we can't compare it to 0 to determine statistical support.
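You can verify this constraint directly on the posterior samples: every stored draw of G should have strictly positive eigenvalues. A quick check, assuming `model` is the fit from above:

```r
# smallest eigenvalue across all posterior draws of G;
# positive-definiteness means this should be greater than 0
min(apply(model$VCV[, 1:16], 1,
          function(g) min(eigen(matrix(g, 4, 4))$values)))
```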
We need to fit a null-model to compare our estimates to. There are different ways one could go about doing this, but the most straightforward is to randomise the phenotypes across the pedigree to keep the same data structure, but remove any genetic signal, and then re-fit the model. There are technical questions about the best way to do this randomisation in the presence of fixed effects, but let’s ignore those questions for now.
#randomise the phenotypes to break any genetic links in the data
shuffled_data<-data[sample(1:nrow(data)), 12:15]
colnames(shuffled_data)<-c("l2sr", "l3sr", "l4sr", "l5sr")
data<-cbind(data, shuffled_data)
#fit the same model that you used to estimate G
model.r <- MCMCglmm(cbind(l2sr, l3sr, l4sr, l5sr) ~ trait - 1+Block+Column,
random = ~ us(trait):animal + us(trait):Vial,
rcov = ~ us(trait):units,
family = c("gaussian", "gaussian", "gaussian", "gaussian"),
pedigree= ped,
data = data, prior = prior, verbose = T
)
##
## MCMC iteration = 0
##
## MCMC iteration = 1000
##
## MCMC iteration = 2000
##
## MCMC iteration = 3000
##
## MCMC iteration = 4000
##
## MCMC iteration = 5000
##
## MCMC iteration = 6000
##
## MCMC iteration = 7000
##
## MCMC iteration = 8000
##
## MCMC iteration = 9000
##
## MCMC iteration = 10000
##
## MCMC iteration = 11000
##
## MCMC iteration = 12000
##
## MCMC iteration = 13000
#summarize the model
summary(model.r)
##
## Iterations = 3001:12991
## Thinning interval = 10
## Sample size = 1000
##
## DIC: 21295.08
##
## G-structure: ~us(trait):animal
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.animal 0.043954 0.023325 0.06785 72.33
## traitl3sr:traitl2sr.animal 0.009183 -0.008883 0.02930 56.75
## traitl4sr:traitl2sr.animal -0.002219 -0.021557 0.01428 53.87
## traitl5sr:traitl2sr.animal 0.005932 -0.011404 0.02426 38.98
## traitl2sr:traitl3sr.animal 0.009183 -0.008883 0.02930 56.75
## traitl3sr:traitl3sr.animal 0.045331 0.023323 0.07367 52.55
## traitl4sr:traitl3sr.animal 0.013522 -0.006030 0.03699 49.41
## traitl5sr:traitl3sr.animal 0.002356 -0.017841 0.02298 47.58
## traitl2sr:traitl4sr.animal -0.002219 -0.021557 0.01428 53.87
## traitl3sr:traitl4sr.animal 0.013522 -0.006030 0.03699 49.41
## traitl4sr:traitl4sr.animal 0.053278 0.021184 0.08631 42.20
## traitl5sr:traitl4sr.animal 0.004471 -0.016795 0.02353 96.07
## traitl2sr:traitl5sr.animal 0.005932 -0.011404 0.02426 38.98
## traitl3sr:traitl5sr.animal 0.002356 -0.017841 0.02298 47.58
## traitl4sr:traitl5sr.animal 0.004471 -0.016795 0.02353 96.07
## traitl5sr:traitl5sr.animal 0.045057 0.023527 0.07179 48.00
##
## ~us(trait):Vial
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.Vial 0.0318932 0.0184069 0.04789 532.4
## traitl3sr:traitl2sr.Vial 0.0073445 -0.0040313 0.02002 541.6
## traitl4sr:traitl2sr.Vial -0.0009259 -0.0132805 0.00975 524.3
## traitl5sr:traitl2sr.Vial 0.0041812 -0.0058038 0.01667 548.6
## traitl2sr:traitl3sr.Vial 0.0073445 -0.0040313 0.02002 541.6
## traitl3sr:traitl3sr.Vial 0.0316813 0.0172933 0.04656 632.9
## traitl4sr:traitl3sr.Vial 0.0112068 0.0001019 0.02440 606.1
## traitl5sr:traitl3sr.Vial 0.0026549 -0.0095074 0.01245 651.8
## traitl2sr:traitl4sr.Vial -0.0009259 -0.0132805 0.00975 524.3
## traitl3sr:traitl4sr.Vial 0.0112068 0.0001019 0.02440 606.1
## traitl4sr:traitl4sr.Vial 0.0367322 0.0199655 0.05477 491.1
## traitl5sr:traitl4sr.Vial 0.0039065 -0.0075782 0.01608 639.3
## traitl2sr:traitl5sr.Vial 0.0041812 -0.0058038 0.01667 548.6
## traitl3sr:traitl5sr.Vial 0.0026549 -0.0095074 0.01245 651.8
## traitl4sr:traitl5sr.Vial 0.0039065 -0.0075782 0.01608 639.3
## traitl5sr:traitl5sr.Vial 0.0311723 0.0191224 0.04629 652.6
##
## R-structure: ~us(trait):units
##
## post.mean l-95% CI u-95% CI eff.samp
## traitl2sr:traitl2sr.units 0.9613 0.89975 1.0263 1000.0
## traitl3sr:traitl2sr.units 0.4098 0.36201 0.4594 740.1
## traitl4sr:traitl2sr.units 0.1132 0.05973 0.1578 719.3
## traitl5sr:traitl2sr.units 0.3945 0.34610 0.4423 637.2
## traitl2sr:traitl3sr.units 0.4098 0.36201 0.4594 740.1
## traitl3sr:traitl3sr.units 0.9604 0.89798 1.0242 873.6
## traitl4sr:traitl3sr.units 0.4599 0.41316 0.5161 848.7
## traitl5sr:traitl3sr.units 0.2393 0.19445 0.2871 1000.0
## traitl2sr:traitl4sr.units 0.1132 0.05973 0.1578 719.3
## traitl3sr:traitl4sr.units 0.4599 0.41316 0.5161 848.7
## traitl4sr:traitl4sr.units 0.9465 0.88042 1.0168 304.2
## traitl5sr:traitl4sr.units 0.2576 0.21072 0.3075 842.2
## traitl2sr:traitl5sr.units 0.3945 0.34610 0.4423 637.2
## traitl3sr:traitl5sr.units 0.2393 0.19445 0.2871 1000.0
## traitl4sr:traitl5sr.units 0.2576 0.21072 0.3075 842.2
## traitl5sr:traitl5sr.units 0.9617 0.90304 1.0275 840.5
##
## Location effects: cbind(l2sr, l3sr, l4sr, l5sr) ~ trait - 1 + Block + Column
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## traitl2sr 0.01437 -0.06101 0.08890 1000.0 0.726
## traitl3sr 0.01183 -0.06019 0.08631 1000.0 0.744
## traitl4sr 0.01562 -0.06330 0.08258 1000.0 0.688
## traitl5sr 0.01216 -0.07025 0.08856 1000.0 0.754
## Block2 0.01226 -0.06169 0.08467 1000.0 0.772
## ColumnF -0.03749 -0.08936 0.03397 887.7 0.230
#this will take a while to run, so load the output from a model that I have run for you
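Before trusting these estimates, note that the effective sample sizes (eff.samp) for the animal-level terms above are low (roughly 40-100 against a nominal 1000 stored samples), suggesting strong autocorrelation in the chain. A sketch of the standard coda diagnostics you could run (coda is loaded along with MCMCglmm); this assumes the fitted `model.r` object from above:

```r
#trace plots of the first few genetic (co)variance components;
#healthy chains look like fuzzy caterpillars with no trend
plot(model.r$VCV[, 1:4])
#effective sample size for every (co)variance component;
#values far below the number of stored samples indicate autocorrelation
effectiveSize(model.r$VCV)
#autocorrelation between successive stored samples
autocorr.diag(model.r$VCV[, 1:4])
```

If mixing is poor, a longer run (larger `nitt`, `burnin`, and `thin`) when re-fitting usually helps.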
Produce a scree plot of the eigenvalues from your estimated G and your null G, with 95% HPD intervals, on the same figure.
#put the posterior mode of the VCV elements in a matrix
pointG.r<-matrix(posterior.mode(model.r$VCV[,1:16]),4,4)
pointG.r
## [,1] [,2] [,3] [,4]
## [1,] 0.036623329 0.0038493353 -0.002957749 0.0053837592
## [2,] 0.003849335 0.0393369165 0.013745495 -0.0004053919
## [3,] -0.002957749 0.0137454954 0.047734297 0.0015486248
## [4,] 0.005383759 -0.0004053919 0.001548625 0.0406155395
#turn into correlation matrix
pointG.rcor<-cov2cor(pointG.r)
pointG.rcor
## [,1] [,2] [,3] [,4]
## [1,] 1.00000000 0.10141605 -0.07074033 0.13959202
## [2,] 0.10141605 1.00000000 0.31720861 -0.01014212
## [3,] -0.07074033 0.31720861 1.00000000 0.03517101
## [4,] 0.13959202 -0.01014212 0.03517101 1.00000000
#save an object of the MCMC samples of G
Gmcmc.r<-model.r$VCV[,1:16]
#eigenanalysis of each posterior sample
eigs.r<-array(dim=c(1000,4))
for (i in 1:1000){
eigs.r[i,]<-eigen(matrix(Gmcmc.r[i,],4,4))$values
}
eig1l.r<-quantile(eigs.r[,1], 0.025)
eig1u.r<-quantile(eigs.r[,1], 0.975)
eig2l.r<-quantile(eigs.r[,2], 0.025)
eig2u.r<-quantile(eigs.r[,2], 0.975)
eig3l.r<-quantile(eigs.r[,3], 0.025)
eig3u.r<-quantile(eigs.r[,3], 0.975)
eig4l.r<-quantile(eigs.r[,4], 0.025)
eig4u.r<-quantile(eigs.r[,4], 0.975)
#plot the eigenvalues
eigen_plot.r<-data.frame(seq(1:4),eigen(pointG.r)$values, rbind(eig1l.r, eig2l.r, eig3l.r, eig4l.r), rbind(eig1u.r, eig2u.r, eig3u.r, eig4u.r))
colnames(eigen_plot.r)<-c("eigenvalue", "value", "lci", "uci")
p + geom_point(data = eigen_plot.r, aes(x = eigenvalue, y = value), color = "purple", cex=5) +
geom_errorbar(data = eigen_plot.r, aes(x = eigenvalue, ymin = lci, ymax = uci))
Do any of your previous interpretations change now that you have compared your estimates to the null model?
Next we will estimate how selection acts on these 4 CHC traits. The component of fitness that we will use is mating success. The standard Lande-Arnold approach regresses relative fitness on trait values to estimate selection.
First, calculate the relative fitness of individuals in the data.
#calculate relative fitness and add it to the data
str(data)
## 'data.frame': 1985 obs. of 19 variables:
## $ Sire : Factor w/ 80 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Dam : Factor w/ 240 levels "81","82","83",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ animal : Factor w/ 1985 levels "321","322","323",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Column : Factor w/ 2 levels "B","F": 2 2 2 2 2 1 1 1 1 1 ...
## $ Block : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
## $ Chosen.Rejected: num 1 0 0 1 0 1 1 1 0 1 ...
## $ Vial : Factor w/ 240 levels "81","82","83",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l2 : num 1.76 1.76 1.72 1.73 1.74 ...
## $ l3 : num 0.358 0.424 0.364 0.472 0.465 ...
## $ l4 : num 0.241 0.272 0.334 0.183 0.277 ...
## $ l5 : num 0.579 0.398 0.382 0.598 0.573 ...
## $ l2s : num 1.224 1.111 0.13 0.396 0.835 ...
## $ l3s : num 0.439 1.142 0.501 1.654 1.579 ...
## $ l4s : num -0.1219 0.0647 0.425 -0.4594 0.0891 ...
## $ l5s : num -0.308 -1.407 -1.498 -0.196 -0.345 ...
## $ l2sr : num 1.756 -0.363 -0.93 -0.276 -1.143 ...
## $ l3sr : num 2.5476 0.0644 -0.1564 -0.219 -0.2436 ...
## $ l4sr : num 0.7084 0.3317 -0.65 0.0359 -0.0586 ...
## $ l5sr : num 0.0195 -0.7992 -0.1465 0.7825 -0.0699 ...
data$relfit<-data$Chosen.Rejected/mean(data$Chosen.Rejected)
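Relative fitness defined this way has a mean of 1 by construction, which is what makes the regression coefficients directly interpretable as selection differentials and gradients. A tiny synthetic check (the fitness scores here are made up for illustration):

```r
#made-up binomial fitness scores
w <- c(1, 0, 0, 1, 1)
#divide by the mean to get relative fitness
relw <- w / mean(w)
mean(relw)  #always 1 by construction
## [1] 1
```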
Fit the regression model using the lm function to estimate the selection differential for each trait. Don't forget to include block and column as fixed effects.
#regress relative fitness on each trait separately and save the estimate and standard error
sel.dif<-rbind(
summary(lm(relfit~Block+Column+l2s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l3s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l4s, data=data))$coefficients[4,1:2],
summary(lm(relfit~Block+Column+l5s, data=data))$coefficients[4,1:2])
We fit a linear regression to estimate selection, even though the response variable was not normally distributed. That is totally fine, and these are the correct estimates of the gradients! However, to determine which of these coefficients are statistically significant, we should fit a binomial model with a logit link function to get the appropriate p-values.
#fit a model using the glm function
sel.dif.p<-rbind(
summary(glm(Chosen.Rejected ~Block+Column+l2s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l3s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l4s, family=binomial(link=logit), data=data))$coefficients[4,4],
summary(glm(Chosen.Rejected ~Block+Column+l5s, family=binomial(link=logit), data=data))$coefficients[4,4])
sel.dif.table<-data.frame(cbind(sel.dif, sel.dif.p))
colnames(sel.dif.table)<-c("estimate", "se", "p-value")
knitr::kable(sel.dif.table, "simple", digits=2)
| trait | estimate | se | p-value |
|---|---|---|---|
| l2s | -0.02 | 0.02 | 0.50 |
| l3s | 0.04 | 0.02 | 0.08 |
| l4s | 0.03 | 0.02 | 0.19 |
| l5s | -0.04 | 0.02 | 0.12 |
Which trait is under the strongest directional selection? Weakest?
Now let's use multiple regression to disentangle the effects of direct versus indirect selection. Use the lm function to fit a multiple regression of relative fitness on all 4 traits. Don't forget to include block and column as fixed effects.
#fit multiple regression model
sel.grad<-lm(relfit~Block+Column+l2s+l3s+l4s+l5s, data=data)
beta<-sel.grad$coefficients[4:7]
print(beta)
## l2s l3s l4s l5s
## -0.02422153 0.05050675 0.02761503 -0.04785819
#fit a model using glm to determine statistical significance for directional selection overall
a3<-glm(Chosen.Rejected ~Block+Column+l2s+l3s+l4s+l5s, family=binomial(link=logit), data=data)
a4<-glm(Chosen.Rejected ~Block+Column, family=binomial(link=logit), data=data)
anova(a3,a4)
## Analysis of Deviance Table
##
## Model 1: Chosen.Rejected ~ Block + Column + l2s + l3s + l4s + l5s
## Model 2: Chosen.Rejected ~ Block + Column
## Resid. Df Resid. Dev Df Deviance
## 1 1978 2740.1
## 2 1982 2749.5 -4 -9.3565
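The anova call above reports the change in deviance but, by default, no p-value (alternatively, passing `test = "Chisq"` to anova prints one). The drop in deviance from adding the four traits can be compared to a chi-square distribution with 4 degrees of freedom; a sketch using the deviance printed above:

```r
#likelihood-ratio test: deviance difference of 9.3565 on 4 df
pchisq(9.3565, df = 4, lower.tail = FALSE)
## [1] 0.05278 (approximately): overall directional selection is marginal
```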
#make a table of the selection gradients and p-values for individual coefficients
sel.grad.table<-cbind(beta, summary(a3)$coefficients[4:7,4])
colnames(sel.grad.table)<-c("selection gradient", "p-value")
knitr::kable(sel.grad.table, "simple", digits=2)
| | selection gradient | p-value |
|---|---|---|
| l2s | -0.02 | 0.37 |
| l3s | 0.05 | 0.08 |
| l4s | 0.03 | 0.34 |
| l5s | -0.05 | 0.07 |
Now which trait is under the strongest directional selection? Weakest?
What is the total strength of directional selection on CHCs? hint: calculate the length of the vector
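The length (L2 norm) of the selection gradient vector measures the total strength of directional selection. A sketch of the calculation, simply copying the gradient estimates printed above into a vector:

```r
#beta copied from the multiple-regression output above
beta_hat <- c(-0.02422153, 0.05050675, 0.02761503, -0.04785819)
#the length of the vector is the square root of the sum of squared elements
sqrt(sum(beta_hat^2))
## [1] 0.07868044
```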
We might also be interested in whether there is stabilising or disruptive selection on each CHC trait. We will fit the same type of model as before, but now estimate the quadratic terms (ie. the square of each trait). Don’t forget to include the linear terms as covariates in the model, or to double the quadratic coefficients! (see Stinchcombe J. R., A. F. Agrawal, P. A. Hohenlohe, S. J. Arnold, and M. W. Blows. 2008. Estimating nonlinear selection gradients using quadratic regression coefficients: double or nothing? Evolution 62:2435-2440).
a5<-lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), data=data)
summary(lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), data=data))
##
## Call:
## lm(formula = relfit ~ Block + Column + l2s + l3s + l4s + l5s +
## I(l2s^2) + I(l3s^2) + I(l4s^2) + I(l5s^2), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4679 -0.9944 -0.6388 1.0119 1.8313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.979855 0.043509 22.521 < 2e-16 ***
## Block2 0.079896 0.052228 1.530 0.12624
## ColumnF 0.047844 0.046535 1.028 0.30402
## l2s -0.041607 0.028877 -1.441 0.14978
## l3s 0.062248 0.028948 2.150 0.03165 *
## l4s 0.101355 0.035056 2.891 0.00388 **
## l5s 0.015142 0.030539 0.496 0.62007
## I(l2s^2) -0.017450 0.015246 -1.145 0.25253
## I(l3s^2) 0.033031 0.014787 2.234 0.02561 *
## I(l4s^2) -0.047947 0.012952 -3.702 0.00022 ***
## I(l5s^2) -0.008717 0.006483 -1.345 0.17890
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.008 on 1974 degrees of freedom
## Multiple R-squared: 0.0163, Adjusted R-squared: 0.01131
## F-statistic: 3.27 on 10 and 1974 DF, p-value: 0.0003306
quad_coeffs<-a5$coefficients[8:11]*2
quad_coeffs
## I(l2s^2) I(l3s^2) I(l4s^2) I(l5s^2)
## -0.03490008 0.06606194 -0.09589340 -0.01743486
#fit the quadratic model with glm to obtain appropriate p-values
#(named a6.glm to avoid clashing with the a6 object fitted below)
a6.glm<-glm(Chosen.Rejected~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2), family=binomial(link=logit), data=data)
quad.sel.table<-cbind(quad_coeffs, summary(a6.glm)$coefficients[8:11,4])
colnames(quad.sel.table)<-c("quad.grad", "p-value")
knitr::kable(quad.sel.table, "simple", digits=2)
| | quad.grad | p-value |
|---|---|---|
| I(l2s^2) | -0.03 | 0.26 |
| I(l3s^2) | 0.07 | 0.02 |
| I(l4s^2) | -0.10 | 0.00 |
| I(l5s^2) | -0.02 | 0.18 |
Are any traits under stabilising selection?
Are there any traits under disruptive selection?
The previous analysis looks at individual traits in isolation, but there might be selection on combinations of traits (ie. correlational selection). To estimate quadratic and correlational selection we need to expand our model to include both quadratic and correlational terms. With 4 traits, that is already quite a few parameters! 4 linear + 10 quadratic and correlational terms = 14. The matrix of quadratic and correlational selection coefficients that we are going to estimate is called ‘gamma’.
a6<-lm(relfit~Block+Column + l2s+l3s+l4s+l5s+I(l2s^2)+I(l3s^2)+I(l4s^2)+I(l5s^2)+I(l2s*l3s)+I(l2s*l4s)+I(l2s*l5s)+I(l3s*l4s)+I(l3s*l5s)+I(l4s*l5s), data=data)
summary(a6)
##
## Call:
## lm(formula = relfit ~ Block + Column + l2s + l3s + l4s + l5s +
## I(l2s^2) + I(l3s^2) + I(l4s^2) + I(l5s^2) + I(l2s * l3s) +
## I(l2s * l4s) + I(l2s * l5s) + I(l3s * l4s) + I(l3s * l5s) +
## I(l4s * l5s), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4241 -0.9927 -0.6462 1.0126 1.7353
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.987064 0.046919 21.038 < 2e-16 ***
## Block2 0.080932 0.052797 1.533 0.12547
## ColumnF 0.047449 0.046669 1.017 0.30941
## l2s -0.042522 0.029413 -1.446 0.14842
## l3s 0.066099 0.029265 2.259 0.02402 *
## l4s 0.096242 0.035878 2.682 0.00737 **
## l5s 0.013870 0.031818 0.436 0.66294
## I(l2s^2) -0.032168 0.019953 -1.612 0.10708
## I(l3s^2) 0.016914 0.024314 0.696 0.48673
## I(l4s^2) -0.058577 0.023023 -2.544 0.01103 *
## I(l5s^2) -0.005635 0.010770 -0.523 0.60086
## I(l2s * l3s) 0.044487 0.035498 1.253 0.21028
## I(l2s * l4s) -0.012360 0.032256 -0.383 0.70163
## I(l2s * l5s) 0.019347 0.030104 0.643 0.52051
## I(l3s * l4s) 0.026367 0.036084 0.731 0.46504
## I(l3s * l5s) -0.056696 0.032922 -1.722 0.08521 .
## I(l4s * l5s) 0.026389 0.033702 0.783 0.43371
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.009 on 1968 degrees of freedom
## Multiple R-squared: 0.01833, Adjusted R-squared: 0.01035
## F-statistic: 2.296 on 16 and 1968 DF, p-value: 0.002439
# arrange the coefficients into the gamma matrix:
# diagonal = doubled quadratic coefficients; off-diagonal = cross-product coefficients
gamma<-diag(a6$coefficients[8:11]*2)
gamma[2:4, 1]<-a6$coefficients[12:14]
gamma[3:4, 2]<-a6$coefficients[15:16]
gamma[4, 3]<-a6$coefficients[17]
gamma<-gamma+t(gamma)-diag(diag(gamma))
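A quick sanity check on the assembly, sketched here using the point estimates printed in the regression output above: the result should be symmetric, and its trace should equal the sum of the doubled quadratic coefficients.

```r
#coefficients copied from the summary(a6) output above
quad  <- c(-0.032168, 0.016914, -0.058577, -0.005635)                     #I(trait^2) terms
cross <- c(0.044487, -0.012360, 0.019347, 0.026367, -0.056696, 0.026389)  #cross-products
g <- diag(quad * 2)
g[lower.tri(g)] <- cross   #column-major fill matches the coefficient order
g <- g + t(g) - diag(diag(g))
isSymmetric(g)             #TRUE
sum(diag(g))               #about -0.159, the sum of the doubled quadratics
```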
What can you say about patterns of correlational selection on CHCs?
In the same way that it can be difficult to interpret G just by looking at it, it can also be difficult to interpret the multivariate pattern of selection just by looking at gamma. We can conduct an eigenanalysis of gamma the same way that we did with G to get a better understanding of multivariate selection. This is called the canonical rotation (see Blows for original paper, and x for a criticism of the approach).
eigen(gamma)
## eigen() decomposition
## $values
## [1] 0.089622429 -0.005774695 -0.100663656 -0.142117580
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] -0.1381846 -0.5890903 0.79258768 0.07538182
## [2,] -0.7857326 -0.4241542 -0.44869245 -0.03731817
## [3,] -0.2432062 0.2838662 0.08070198 0.92399023
## [4,] 0.5517065 -0.6264869 -0.40492844 0.37305092
#plot the eigenvalues of gamma
plot(eigen(gamma)$values)
abline(h=0, lty=2)
How many multivariate trait combinations are under disruptive selection?
How many multivariate trait combinations are under stabilising selection?
Overall, is there more disruptive or stabilising selection?
Note: In a real analysis you will need to account for the uncertainty of your estimates of G and selection. For this exercise, however, we will use the point estimates only. Use the posterior mode for G, and use the estimated selection vector from your linear model for beta.
First we will calculate the predicted response to directional selection on the 4 CHC traits
deltazbar<-pointG %*% beta
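To see what this matrix multiplication is doing, here is a made-up two-trait illustration (the numbers are hypothetical, not from this data set): genetic covariance can redirect, or even entirely cancel, the response to direct selection on a trait.

```r
#hypothetical two-trait G matrix and selection gradient
G2    <- matrix(c(2.0, 0.5,
                  0.5, 1.0), 2, 2)
beta2 <- c(0.10, -0.05)
G2 %*% beta2
#trait 1 responds (0.175) but trait 2 shows no net response (0),
#because the indirect response via the covariance cancels direct selection
```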
To calculate correlations between vectors, they need to be standardised to unit length first. Check whether they are, and if not, standardise them to unit length before answering the following questions.
#the length (L2 norm) of beta
beta_length<-sqrt(sum(beta^2))
beta_length
## [1] 0.07868044
beta_std<-beta/beta_length
beta_std<-matrix(beta_std, 4, 1)
gmax<-eigen(pointG)$vectors[,1]
#check whether it's unit length
sqrt(sum(gmax^2))
## [1] 1
#check beta_std for unit length
sqrt(sum(beta_std^2))
## [1] 1
Is directional selection aligned with the major axis of genetic variance? hint: calculate the vector correlation between gmax and the directional selection gradient
#calculate the angle between beta_std and gmax
#for unit-length vectors, the vector correlation is their dot product;
#taking the absolute value keeps the angle between 0 and 90 degrees
angle<-acos(abs(sum(beta_std*gmax)))*180/pi
angle
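As a sanity check on the geometry, the cosine of the angle between two unit-length vectors is their dot product. A tiny 2-D example with a known 45-degree answer (the vectors here are illustrative only):

```r
#two unit-length vectors 45 degrees apart
a <- c(1, 0)
b <- c(1, 1) / sqrt(2)
acos(sum(a * b)) * 180 / pi
## [1] 45
```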
How much genetic variance is there in the direction of linear selection? hint: project the selection gradient through G
#project the unit-length beta through G to get the genetic variance
#in the direction of directional selection
t(beta_std) %*% pointG %*% beta_std
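One way to calibrate this projection: if the unit-length gradient were perfectly aligned with an eigenvector of G, the projection would return that eigenvector's eigenvalue. A made-up diagonal example (hypothetical numbers, not from this data set):

```r
#hypothetical diagonal G: its eigenvectors are the trait axes themselves
G2 <- diag(c(2, 1))
b  <- c(1, 0)   #unit vector along the first axis
t(b) %*% G2 %*% b
## [1,] 2, i.e. the first eigenvalue
```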
Is that a lot or not much? How did you come to your conclusion?