Returns a \(Z_{\alpha}^{BetaCDF}\) value for each SNP location supplied to the function, based on the expected \(r^2\) values given an LD profile and genetic distances. For more information about the \(Z_{\alpha}^{BetaCDF}\) statistic, see Jacobs et al. (2016).

The \(Z_{\alpha}^{BetaCDF}\) statistic is defined as:

$${Z_{\alpha}^{BetaCDF}}=\frac{{|L| \choose 2}^{-1}\sum_{i,j \in L}\frac{B(r^2_{i,j};a,b)}{B(a,b)} + {|R| \choose 2}^{-1}\sum_{i,j \in R}\frac{B(r^2_{i,j};a,b)}{B(a,b)}}{2}$$

where \(|L|\) and \(|R|\) are the numbers of SNPs to the left and right of the current locus within the given window ws, \(r^2_{i,j}\) is the squared correlation between SNPs \(i\) and \(j\), and \(\frac{B(r^2_{i,j};a,b)}{B(a,b)}\) is the cumulative distribution function of the Beta distribution evaluated at \(r^2_{i,j}\), given the a and b parameters estimated from the LD profile.
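The ratio \(\frac{B(r^2;a,b)}{B(a,b)}\) is the regularised incomplete beta function, which base R provides as pbeta(). The following is a minimal sketch of the formula for a single locus, assuming the pairwise \(r^2\) values and the matching a and b parameters have already been collected for each side. The helper mean_beta_cdf and all input values are hypothetical and are not part of the package.

```r
# Hypothetical helper: average Beta CDF over all SNP pairs in one set.
# r2, a and b are vectors with one entry per SNP pair (i, j).
mean_beta_cdf <- function(r2, a, b) {
  # pbeta(q, a, b) is B(q; a, b) / B(a, b)
  mean(pbeta(r2, shape1 = a, shape2 = b))
}

# Illustrative values only: 4 SNPs on each side give choose(4, 2) = 6 pairs per set
r2_L <- c(0.10, 0.35, 0.50, 0.05, 0.80, 0.20)
r2_R <- c(0.60, 0.15, 0.40, 0.90, 0.25, 0.30)

# With a = b = 1 the Beta distribution is uniform, so pbeta(r2, 1, 1) equals r2.
# Real parameters would come from the LD profile bin of each pair's genetic distance.
a <- rep(1, 6)
b <- rep(1, 6)

# Average of the left and right terms, as in the definition above
z <- (mean_beta_cdf(r2_L, a, b) + mean_beta_cdf(r2_R, a, b)) / 2
z
```

Taking mean() over one entry per unordered pair corresponds to the \({|L| \choose 2}^{-1}\sum_{i,j \in L}\) term in the definition.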

Zalpha_BetaCDF(
  pos,
  ws,
  x,
  dist,
  LDprofile_bins,
  LDprofile_Beta_a,
  LDprofile_Beta_b,
  minRandL = 4,
  minRL = 25,
  X = NULL
)

Arguments

pos

A numeric vector of SNP locations.

ws

The window size over which the \(Z_{\alpha}^{BetaCDF}\) statistic will be calculated. This should be on the same scale as the pos vector.

x

A matrix of SNP values. Columns represent chromosomes; rows are SNP locations. Hence, the number of rows should equal the length of the pos vector. SNPs should all be biallelic.

dist

A numeric vector of genetic distances (e.g. cM, LDU). This should be the same length as pos.

LDprofile_bins

A numeric vector containing the lower bound of the bins used in the LD profile. These should be of equal size.

LDprofile_Beta_a

A numeric vector containing the first estimated Beta parameter for the corresponding bin in the LD profile.

LDprofile_Beta_b

A numeric vector containing the second estimated Beta parameter for the corresponding bin in the LD profile.

minRandL

Minimum number of SNPs in each set R and L for the statistic to be calculated. Default is 4.

minRL

Minimum value for the product of the set sizes for R and L. Default is 25.

X

Optional. A region of the chromosome for which to calculate \(Z_{\alpha}^{BetaCDF}\), given in the format c(startposition, endposition). The start position and the end position should be within the extremes of the positions given in the pos vector. If not supplied, the function will calculate \(Z_{\alpha}^{BetaCDF}\) for every SNP in the pos vector.
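The constraints described for the arguments above can be checked before calling the function. The sketch below is one way to do that in base R; check_inputs and enough_snps are hypothetical helpers written for illustration, and the package's own validation and thresholding may differ in detail.

```r
# Hypothetical pre-flight checks; parameter names mirror the function's arguments
check_inputs <- function(pos, x, dist, LDprofile_bins, X = NULL) {
  stopifnot(
    is.numeric(pos), is.numeric(dist),
    nrow(x) == length(pos),       # one row of x per SNP
    length(dist) == length(pos)   # one genetic distance per SNP
  )
  # Bins should be evenly sized: gaps between lower bounds are (nearly) equal
  gaps <- diff(LDprofile_bins)
  stopifnot(all(abs(gaps - gaps[1]) < 1e-8))
  if (!is.null(X)) {
    # X = c(start, end) should lie within the range of pos
    stopifnot(length(X) == 2, X[1] <= X[2], X[1] >= min(pos), X[2] <= max(pos))
  }
  invisible(TRUE)
}

# Assumed reading of minRandL and minRL: both sets need at least minRandL SNPs,
# and the product of the set sizes must reach minRL
enough_snps <- function(nL, nR, minRandL = 4, minRL = 25) {
  nL >= minRandL && nR >= minRandL && nL * nR >= minRL
}

# Illustrative inputs: 20 SNPs on 10 chromosomes
pos  <- seq(100, 2000, by = 100)
x    <- matrix(rep(c(0, 1), 100), nrow = 20)
dist <- seq(0.0001, 0.002, length.out = 20)
bins <- seq(0, 0.0049, by = 0.0001)

check_inputs(pos, x, dist, bins, X = c(600, 1500))
enough_snps(4, 15)   # 4 SNPs to the left, 15 to the right
enough_snps(3, 16)   # too few SNPs on the left
```

Under this reading, a SNP with only three neighbours on one side of the window fails the minRandL test, which is consistent with the NA values at both ends of the output in the Examples section.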

Value

A list containing the SNP positions and the \(Z_{\alpha}^{BetaCDF}\) values for those SNPs.

Details

The LD profile describes the expected correlation between SNPs at a given genetic distance, generated using simulations or real data. Care should be taken to use an LD profile that is representative of the population in question. The LD profile should consist of evenly sized bins of distances (for example 0.0001 cM per bin), where the value given is the (inclusive) lower bound of the bin. Ideally, an LD profile would be generated using data from a null population with no selection; however, one can also be generated from the same data that is being analysed. See the create_LDprofile function for more information on how to create an LD profile.
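To make the binning concrete, the sketch below looks up the Beta parameters for one pair of SNPs from a small, made-up LD profile. It assumes the lookup is a simple match of the pair's genetic distance to the bin whose inclusive lower bound precedes it; the profile values are invented for illustration, and how the package treats distances beyond the last bin is not covered here.

```r
# Illustrative LD profile: lower bounds of evenly sized bins and made-up Beta parameters
bins   <- c(0, 0.0001, 0.0002, 0.0003)
Beta_a <- c(0.9, 0.7, 0.6, 0.5)
Beta_b <- c(1.2, 1.5, 1.9, 2.4)

# Genetic distance between two SNPs, in the same units as the bins (e.g. cM)
d <- abs(0.00125 - 0.00100)

# findInterval() returns the index of the last lower bound <= d,
# which matches bins defined by inclusive lower bounds
idx <- findInterval(d, bins)
a <- Beta_a[idx]
b <- Beta_b[idx]

# Beta CDF of an observed r^2 of 0.4 under the expected distribution for that distance
pbeta(0.4, a, b)
```

Here the distance of 0.00025 falls in the third bin, which covers 0.0002 up to (but not including) 0.0003.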

References

Jacobs, G.S., T.J. Sluckin, and T. Kivisild, Refining the Use of Linkage Disequilibrium as a Robust Signature of Selective Sweeps. Genetics, 2016. 203(4): p. 1807

Examples

## load the snps and LDprofile example datasets
data(snps)
data(LDprofile)
## run Zalpha_BetaCDF over all the SNPs with a window size of 3000 bp
Zalpha_BetaCDF(snps$bp_positions,3000,as.matrix(snps[,3:12]),snps$cM_distances,
  LDprofile$bin,LDprofile$Beta_a,LDprofile$Beta_b)
#> $position
#>  [1]  100  200  300  400  500  600  700  800  900 1000 1100 1200 1300 1400 1500
#> [16] 1600 1700 1800 1900 2000
#>
#> $Zalpha_BetaCDF
#>  [1]        NA        NA        NA        NA 0.5553324 0.6013140 0.6135663
#>  [8] 0.6114601 0.6153488 0.6073154 0.5895785 0.6060782 0.5871007 0.6161310
#> [15] 0.5889128 0.5730982        NA        NA        NA        NA
#>
## only return results for SNPs between locations 600 and 1500 bp
Zalpha_BetaCDF(snps$bp_positions,3000,as.matrix(snps[,3:12]),snps$cM_distances,
  LDprofile$bin,LDprofile$Beta_a,LDprofile$Beta_b,X=c(600,1500))
#> $position
#>  [1]  600  700  800  900 1000 1100 1200 1300 1400 1500
#>
#> $Zalpha_BetaCDF
#>  [1] 0.6013140 0.6135663 0.6114601 0.6153488 0.6073154 0.5895785 0.6060782
#>  [8] 0.5871007 0.6161310 0.5889128
#>