Oly Genetics 104, NF GenePop Analysis

Tried to do the html to .md trick for this notebook, but it did not function. No biggie, since there are no pretty plots in this notebook. Original notebooks: R markdown version, NF-GenePop-Analysis.Rmd; HTML version, NF-GenePop-Analysis.html

In this notebook I will use the GenePop R program to analyze the 2016/2017 Fidalgo Bay (NF) Ostrea lurida microsatellite data; the results from each analysis are housed in a series of .txt files.

Prior to importing the data, prepared the 2016/2017 NF data in Excel and exported into GenePop format; resulting file is available on 2018-01-22-Preparing-for-Genepop.md

First, install and load the genepop R package:

    install.packages("genepop")
    library(genepop)

Pull basic information on allele and genotype frequencies per locus and per sample

    basic_info(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", outputFile = "Analyses/NF-Basic-Info.txt", verbose = TRUE)

Resulting file: “NF-Basic-Info.txt”. Hetero- and homozygosity counts per locus pasted here:

NF Wild

    Loci                    Olur10  Olur11  Olur12  Olur13  Olur15  Olur19
    Expected Homozygotes      4.99   24.59   11.83    6.68    6.66    6.64
    Observed Homozygotes         2      28      12       7       6       7
    Expected Heterozygotes   92.01   70.41   86.17   91.32   91.34   90.36
    Observed Heterozygotes      95      67      86      91      92      90

NF Hatchery

    Loci                    Olur10  Olur11  Olur12  Olur13  Olur15  Olur19
    Expected Homozygotes      5.02   23.03   13.44    7.14    7.22    7.10
    Observed Homozygotes         7      27       5       6       6       7
    Expected Heterozygotes   92.98   70.97   86.56   88.86   90.78   88.90
    Observed Heterozygotes      91      67      95      90      92      89
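As a quick sanity check on the counts above, Fis can be approximated per locus as 1 - (observed heterozygotes / expected heterozygotes). This is only a sketch: the vector and function names are mine, and the simple ratio is not the Weir & Cockerham estimator that GenePop reports, though the two should land close to each other for these data.

```r
loci <- c("Olur10", "Olur11", "Olur12", "Olur13", "Olur15", "Olur19")

# Heterozygote counts copied from the NF-Basic-Info.txt tables above
wild_expected  <- c(92.01, 70.41, 86.17, 91.32, 91.34, 90.36)
wild_observed  <- c(95, 67, 86, 91, 92, 90)
hatch_expected <- c(92.98, 70.97, 86.56, 88.86, 90.78, 88.90)
hatch_observed <- c(91, 67, 95, 90, 92, 89)

# Simple approximation (an assumption for illustration, not GenePop's estimator):
# negative values mean a heterozygote excess, positive values a deficit
fis_simple <- function(observed, expected) 1 - observed / expected

fis <- data.frame(
  locus    = loci,
  wild     = round(fis_simple(wild_observed, wild_expected), 4),
  hatchery = round(fis_simple(hatch_observed, hatch_expected), 4)
)
print(fis)
```

Values near zero in both columns are what you would expect if the loci are close to Hardy-Weinberg proportions.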

Assess whether loci are in Hardy-Weinberg Equilibrium

    test_HW(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", which="Proba", outputFile = "Analyses/NF-HWE.txt", enumeration = FALSE, dememorization = 10000, batches = 500, iterations = 2000, verbose = interactive())

Resulting file: “NF-HWE.txt”. All P-values across loci in each population are > 0.05, so I do not reject the null hypothesis that the loci are in HWE.

Pop : NFW-2017
-----------------------------------------
                             Fis estimates
                            ---------------
locus       P-val   S.E.    W&C     R&H     Steps 
----------- ------- ------- ------- ------- ------
Olur10      0.2885  0.0155  -0.0327 -0.0097  22774 switches
Olur11      0.7817  0.0096   0.0487  0.0823  32358 switches
Olur12      0.9079  0.0055   0.0020 -0.0093  77677 switches
Olur13      0.3355  0.0108   0.0035  0.0086  83321 switches
Olur15      0.3031  0.0111  -0.0073 -0.0015  68376 switches
Olur19      0.3528  0.0109   0.0040  0.0092  82550 switches

All (Fisher's method):
 Chi2 :    9.8276
 Df   :    12.0000
 Prob :    0.6311

Pop : NFH-2016
-----------------------------------------
                             Fis estimates
                            ---------------
locus       P-val   S.E.    W&C     R&H     Steps 
----------- ------- ------- ------- ------- ------
Olur10      0.6622  0.0165   0.0214  0.0184  19734 switches
Olur11      0.7706  0.0105   0.0563  0.0169  29904 switches
Olur12      0.8312  0.0070  -0.0981 -0.0520  86460 switches
Olur13      0.1129  0.0057  -0.0129 -0.0057 112034 switches
Olur15      0.0795  0.0057  -0.0135 -0.0123  93919 switches
Olur19      0.1101  0.0065  -0.0012  0.0012  89921 switches

All (Fisher's method):
 Chi2 :    15.5530
 Df   :    12.0000
 Prob :    0.2126
==========================================
 All locus, all populations 
==========================================
All (Fisher's method) :
 Chi2 :    25.3806
 Df   :    24.0000
 Prob :    0.3853
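The "All (Fisher's method)" lines combine the per-locus P-values into one test per population: the statistic is -2 times the sum of the log P-values, compared against a chi-square distribution with 2k degrees of freedom for k loci. A minimal sketch that reproduces this from the rounded P-values above (the function name `fisher_combine` is mine; small differences from the GenePop output come from rounding):

```r
# Per-locus HWE P-values copied from NF-HWE.txt above
p_wild     <- c(0.2885, 0.7817, 0.9079, 0.3355, 0.3031, 0.3528)
p_hatchery <- c(0.6622, 0.7706, 0.8312, 0.1129, 0.0795, 0.1101)

# Fisher's method: -2 * sum(log(p)) is chi-square distributed
# with 2 * (number of tests) degrees of freedom under the null
fisher_combine <- function(p) {
  chi2 <- -2 * sum(log(p))
  df   <- 2 * length(p)
  c(chi2 = chi2, df = df, prob = pchisq(chi2, df, lower.tail = FALSE))
}

res_wild     <- fisher_combine(p_wild)
res_hatchery <- fisher_combine(p_hatchery)
print(round(rbind(wild = res_wild, hatchery = res_hatchery), 4))
```

The same function applied to all twelve P-values together gives the "All locus, all populations" test with 24 degrees of freedom.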

Assess whether any loci are linked

    test_LD(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", outputFile = "Analyses/NF-LD.txt", dememorization = 10000, batches = 100, iterations = 1000, verbose = TRUE)

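Resulting file: “NF-LD.txt”. Interesting: results from this linkage disequilibrium test indicate significant non-random association between several locus pairs, i.e. there are, in fact, loci that appear linked (flagged with arrows below):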

Pop             Locus#1  Locus#2    P-Value      S.E.     Switches
----------      -------  -------    --------     -------- --------
NFW-2017        Olur10   Olur11     0.32418      0.045606      216
NFW-2017        Olur10   Olur12     0.372000     0.048411       65
NFW-2017        Olur11   Olur12     1.000000     0.000000      435
NFW-2017        Olur10   Olur13     1.000000     0.000000       42
NFW-2017        Olur11   Olur13     0.990730     0.006526      279
NFW-2017        Olur12   Olur13     0.144450     0.034977       90
NFW-2017        Olur10   Olur15     1.000000     0.000000       33
NFW-2017        Olur11   Olur15     0.975760     0.014058      232
NFW-2017        Olur12   Olur15     1.000000     0.000000       96
NFW-2017        Olur13   Olur15     0.065770     0.024413       35
NFW-2017        Olur10   Olur19     1.000000     0.000000       35
NFW-2017        Olur11   Olur19     1.000000     0.000000      229
NFW-2017        Olur12   Olur19     1.000000     0.000000       91
NFW-2017        Olur13   Olur19     0.060010     0.023868       16
NFW-2017        Olur15   Olur19     0.000000     0.000000       28  <------ Wild 15 & 19 linked
NFH-2016        Olur10   Olur11     1.000000     0.000000      154
NFH-2016        Olur10   Olur12     0.208400     0.040527       72
NFH-2016        Olur11   Olur12     0.923220     0.024857      520
NFH-2016        Olur10   Olur13     1.000000     0.000000       45
NFH-2016        Olur11   Olur13     0.726700     0.043512      284
NFH-2016        Olur12   Olur13     0.000000     0.000000      151 <------ Hatchery 12 & 13 linked
NFH-2016        Olur10   Olur15     1.000000     0.000000       36
NFH-2016        Olur11   Olur15     0.715690     0.043815      301
NFH-2016        Olur12   Olur15     0.049120     0.021532      165
NFH-2016        Olur13   Olur15     0.000000     0.000000       68 <------ Hatchery 13 & 15 linked
NFH-2016        Olur10   Olur19     1.000000     0.000000       40
NFH-2016        Olur11   Olur19     0.716270     0.042213      282
NFH-2016        Olur12   Olur19     0.000000     0.000000      149 <------ Hatchery 12 & 19 linked
NFH-2016        Olur13   Olur19     0.000000     0.000000       54 <------ Hatchery 13 & 19 linked
NFH-2016        Olur15   Olur19     0.000000     0.000000       44 <------ Hatchery 15 & 19 linked

P-value for each locus pair across all populations
(Fisher's method)
-----------------------------------------------------
Locus pair                    Chi2      df   P-Value
--------------------          --------  ---  --------
Olur10        & Olur11        2.252913  4    0.689355
Olur10        & Olur12        5.114315  4    0.275768
Olur11        & Olur12        0.159775  4    0.996974
Olur10        & Olur13        0.000000  4    1.000000
Olur11        & Olur13        0.657110  4    0.956511
Olur12        & Olur13        >35.754413  4    <0.000000 <------ 12 & 13 linked
Olur10        & Olur15        0.000000  4    1.000000
Olur11        & Olur15        0.718094  4    0.949079
Olur12        & Olur15        6.026978  4    0.197143
Olur13        & Olur15        >37.327952  4    <0.000000 <------ 13 & 15 linked
Olur10        & Olur19        0.000000  4    1.000000
Olur11        & Olur19        0.667396  4    0.955288
Olur12        & Olur19        >31.884769  4    <0.000002 <------ 12 & 19 linked
Olur13        & Olur19        >37.511258  4    <0.000000 <------ 13 & 19 linked
Olur15        & Olur19        >63.769539  4    <0.000000 <------ 15 & 19 linked
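With 15 locus pairs tested per population, one way to screen the per-population table is to compare raw P-values against a multiple-testing-adjusted threshold. The sketch below is my own illustration and was not part of the original analysis: it applies a Bonferroni adjustment with base R's `p.adjust` to the hatchery P-values copied from the table above; the pair labels and object names are hypothetical shorthand.

```r
# Hatchery (NFH-2016) pairwise LD P-values copied from NF-LD.txt above;
# pair labels are shorthand for the Olur locus numbers
ld_hatchery <- data.frame(
  pair = c("10-11", "10-12", "11-12", "10-13", "11-13",
           "12-13", "10-15", "11-15", "12-15", "13-15",
           "10-19", "11-19", "12-19", "13-19", "15-19"),
  p    = c(1.000000, 0.208400, 0.923220, 1.000000, 0.726700,
           0.000000, 1.000000, 0.715690, 0.049120, 0.000000,
           1.000000, 0.716270, 0.000000, 0.000000, 0.000000)
)

# Bonferroni adjustment: each P-value is multiplied by the number of tests (capped at 1)
ld_hatchery$p_bonf <- p.adjust(ld_hatchery$p, method = "bonferroni")

sig_raw  <- ld_hatchery$pair[ld_hatchery$p < 0.05]       # unadjusted screen
sig_bonf <- ld_hatchery$pair[ld_hatchery$p_bonf < 0.05]  # adjusted screen

print(sig_raw)
print(sig_bonf)
```

Under this adjustment the Olur12 & Olur15 pair (raw P = 0.049) drops out, leaving the five hatchery pairs already flagged with arrows.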

Assess for null alleles

    nulls(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", outputFile = "Analyses/NF-null.txt", nullAlleleMethod = "B96", CIcoverage = 0.95, verbose = TRUE)

Resulting file: “NF-null.txt”. Null allele frequencies are low (at most 0.0159); will compare with results from MicroChecker to confirm.

(Locus by population) table of estimated null allele frequencies
================================================================
Locus:     Populations (! names truncated to 6 characters):
           NFW-20 NFH-20 
           -----------------------------------------------------
Olur10     0.0000 0.0079 
Olur11     0.0125 0.0159 
Olur12     0.0000 0.0000 
Olur13     0.0000 0.0000 
Olur15     0.0000 0.0000 
Olur19     0.0000 0.0000 
================================================================


Confidence intervals for null allele frequencies
=================================================
                       Frequency   0.0250   0.9750 
Locus      Population   estimate   bound    bound
-------------------------------------------------
Olur10     NFW-2017    0.0000     (No info for CI)
           NFH-2016    0.0079     0.0000  0.0422  
Olur11     NFW-2017    0.0125     0.0000  0.0694  
           NFH-2016    0.0159     0.0000  0.0724  
Olur12     NFW-2017    0.0000     (No info for CI)
           NFH-2016    0.0000     (No info for CI)
Olur13     NFW-2017    0.0000     (No info for CI)
           NFH-2016    0.0000     (No info for CI)
Olur15     NFW-2017    0.0000     (No info for CI)
           NFH-2016    0.0000     (No info for CI)
Olur19     NFW-2017    0.0000     (No info for CI)
           NFH-2016    0.0000     (No info for CI)
=================================================

Exact conditional contingency-table test for genotypic differentiation.

Assesses the distribution of diploid genotypes in the various populations. The null hypothesis tested is Ho: “genotypes are drawn from the same distribution in all populations”

    test_diff(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", outputFile = "Analyses/NF-Diff.txt", genic=FALSE, pairs=TRUE, dememorization = 10000, batches = 100, iterations = 1000, verbose = TRUE)

Resulting file: “NF-Diff.txt”. P > 0.05 for all loci, so I do not reject the null hypothesis that genotypes are drawn from the same distribution in both populations.

Locus        Population pair        P-Value  S.E.     Switches
-----------  ---------------------  -------  -------  --------
Olur10       NFH-2016  & NFW-2017   0.32784  0.01552     27018
Olur11       NFH-2016  & NFW-2017   0.54191  0.01155     36148
Olur12       NFH-2016  & NFW-2017   0.67662  0.01140     38491
Olur13       NFH-2016  & NFW-2017   0.08560  0.00745     32174
Olur15       NFH-2016  & NFW-2017   0.06505  0.00713     32414
Olur19       NFH-2016  & NFW-2017   0.15165  0.01146     32502


P-value for each population pair across all loci
(Fisher's method)
-----------------------------------------------------
Population pair               Chi2      df   P-Value
--------------------          --------  ---  --------
NFW-2017      & NFH-2016      18.39076  12   0.104331

Calculate pairwise Fst between populations

Fst is a measure of genetic structure (developed by Sewall Wright, 1969, 1978), and is related to statistical analysis of variance (ANOVA). Fst is the proportion of the total genetic variance (the T subscript) that is due to differences among subpopulations (the S subscript). In theory values range from 0 to 1, although sample estimates can come out slightly negative, which is read as no detectable differentiation. High Fst implies a considerable degree of differentiation among populations.
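To make the definition concrete, here is a toy calculation of Fst as (Ht - Hs) / Ht for a single biallelic locus in two equally sized subpopulations. The allele frequencies are made up for illustration, and this heterozygosity-based formula is a simplification rather than the estimator GenePop uses.

```r
# Hypothetical frequencies of one allele in two subpopulations (not real data)
p <- c(0.6, 0.4)

# Hs: mean expected heterozygosity within subpopulations
h_s <- mean(2 * p * (1 - p))

# Ht: expected heterozygosity if the subpopulations were pooled into one
p_bar <- mean(p)
h_t   <- 2 * p_bar * (1 - p_bar)

# Fst: share of total heterozygosity explained by differences between subpopulations
fst <- (h_t - h_s) / h_t
print(fst)
```

If both subpopulations had the same allele frequency, Hs would equal Ht and Fst would be zero.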

    Fst(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", outputFile = "Analyses/NF-Fst.txt", sizes = FALSE, pairs = TRUE, dataType = "Diploid", verbose = TRUE)

Resulting file: “NF-Fst.txt”. Values are very close to zero (0.0008 across all loci), which indicates that there is little genetic differentiation between the wild and hatchery populations.

Indices for populations:
----     -------------
1        NFW-2017
2        NFH-2016
----------------------

Estimates for each locus:
========================
  Locus: Olur10
---------------------------------
pop      1       
2     -0.0007 

  Locus: Olur11
---------------------------------
pop      1       
2     -0.0027 

  Locus: Olur12
---------------------------------
pop      1       
2     -0.0020 

  Locus: Olur13
---------------------------------
pop      1       
2      0.0030 

  Locus: Olur15
---------------------------------
pop      1       
2      0.0036 

  Locus: Olur19
---------------------------------
pop      1       
2      0.0027 

Estimates for all loci (diploid):
=========================
pop      1       
2      0.0008 

Generate gene diversity and Fis statistics per locus and per sample. Interpretation TBD.

    genedivFis(inputFile = "Data/Oly2016NFH+2017NFW_Merged.txt", sizes = FALSE, outputFile = "Analyses/NF-DivFis.txt", dataType = "Diploid", verbose = interactive())

Resulting file: “NF-DivFis.txt”

Written on January 22, 2018