Thursday, April 3, 2014

FST distributions, between vs. within clades

The pattern of among-clade divergence becomes more interesting when I compare the clades. Here I looked at FST within the white oaks, splitting all the white oaks (except Virentes) from the Mexican / AZ clade; and FST within the red oaks, splitting all the red oaks except Q. palustris and allies from the Mexican / AZ clade. These were the outcomes I imagined, and why I suspected each of them:

  1. If diversification is neutral, then I expect conserved regions of the genome to be, on average, shared between the white and red oaks. This would be reflected by a correlation in FST within the white oaks compared to FST within the red oaks.
  2. If diversification is driven by divergent selection on some regions of the genome but not others, I might find either:

    a. positive correlation, if the same regions of the genome were under divergent selection during diversification of the major white oak clades as were under selection during divergence of the major red oak clades; or

    b. negative correlation, if white oak divergence was driven by selection at different loci than red oak divergence.

I subsetted the ca. 33,000 RADseq loci from my last clustering analysis, keeping loci that were present in at least 5 members of each of the two white oak clades or at least 5 members of each of the two red oak clades, and variable at one or more nucleotide positions. I then mapped these back to the Q. robur SNP map, 800-bp contigs, as described here: 2014-03-10. The set of markers that maps successfully is not the same in each case. Here are the markers that map in the two groups:

FST by linkage group within Quercus section Lobatae,
where the two populations are the eastern North American
red oaks (excluding Q. palustris and allies) and the Mexican /
Arizonian red oaks.

FST by linkage group within Quercus section Quercus,
where the two populations are the eastern North American
white oaks (excluding Virentes and the Roburoids)
and the Mexican / Arizonian white oaks.
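
The locus filter described above (minimum representation in each clade, plus at least one variable site) can be sketched roughly as follows. This is a minimal illustration, not the code actually used for the analysis: the data layout (a dict mapping locus IDs to per-sample aligned sequences), the function names, and the choice to count only unambiguous A/C/G/T calls as variation are all assumptions.

```python
def is_variable(seqs):
    """Return True if any aligned column holds more than one unambiguous base."""
    for column in zip(*seqs):
        bases = {b.upper() for b in column} & set("ACGT")
        if len(bases) > 1:
            return True
    return False


def filter_loci(loci, clade_a, clade_b, min_per_clade=5):
    """Keep loci sampled in >= min_per_clade members of BOTH clades and variable.

    loci    : dict of locus_id -> dict of sample_name -> aligned sequence
              (hypothetical layout, assumed for this sketch)
    clade_a : set of sample names in the first clade
    clade_b : set of sample names in the second clade
    """
    kept = []
    for locus_id, seqs in loci.items():
        n_a = sum(1 for sample in seqs if sample in clade_a)
        n_b = sum(1 for sample in seqs if sample in clade_b)
        if n_a < min_per_clade or n_b < min_per_clade:
            continue
        if is_variable(list(seqs.values())):
            kept.append(locus_id)
    return kept
```

The filter would be run once per clade pair (white oaks, then red oaks), which is consistent with the two groups ending up with different sets of mapped markers.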

I looked first to see whether there is any kind of correlation on a locus-by-locus basis, but this is noisy:

Biplot of FST within section Quercus (y axis) against section
Lobatae (x axis)

But we have a map! Binning by 3 cM, and averaging FST within those bins, still gives a pretty noisy plot:
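
The binning step can be sketched like this. It is an illustration under stated assumptions, not the original analysis code: markers are taken to be (linkage group, map position in cM, FST) tuples, windows are fixed-width and start at 0 cM on each linkage group, and no window spans two linkage groups.

```python
from collections import defaultdict


def window_means(markers, window_cm=3.0):
    """Average FST within fixed-width map windows.

    markers   : iterable of (linkage_group, position_cM, fst) tuples
                (hypothetical layout, assumed for this sketch)
    window_cm : window width in centimorgans
    Returns a dict of (linkage_group, window_index) -> mean FST.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of FST, marker count]
    for lg, pos, fst in markers:
        key = (lg, int(pos // window_cm))
        totals[key][0] += fst
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}
```

Calling `window_means(markers, window_cm=30.0)` gives the coarser windows; windows that contain no markers simply do not appear in the result.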


But, remarkably, the story seems to be a lot cleaner at bin sizes of 30 cM or higher:


Map position of divergence in 30 cM windows,
red oaks and white oaks, with all 12 linkage
groups concatenated.

Biplot, FST of red oaks on FST of
white oaks, 30 cM windows.
Pearson's r = -0.583, P = 0.0028.
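
For reference, the window-by-window correlation can be sketched as below. This is a minimal version under assumptions: each group's windowed FST is held in a dict keyed by window, only windows present in both groups are compared, and the function names are hypothetical. It computes Pearson's r only, not the P value.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlate_shared_windows(fst_red, fst_white):
    """Correlate FST over windows present in both groups.

    fst_red, fst_white : dict of window key -> mean FST (assumed layout)
    Returns (r, number of shared windows).
    """
    shared = sorted(set(fst_red) & set(fst_white))
    r = pearson_r([fst_red[k] for k in shared], [fst_white[k] for k in shared])
    return r, len(shared)
```

Restricting to shared windows matters here because the markers that map successfully differ between the two groups, so some windows may have data for only one of them.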

I was expecting this effect, if present, to show up at fine scales, due to selection at the scale of individual genes. In both groups we are looking at a cladogenetic split that is probably > 10 million years old (the split between the red and white oaks is about 30 million years old). Is there any chance, though, that we are really picking up a strong difference in the selective pressures driving divergence at the bases of the red and white oak clades, one that affected divergence patterns across blocks of the genome half a chromosome in size?