Genome Uniqueness Functions

calcGU(alleles, threshold = 1, byID = FALSE, pop = NULL)

Arguments

alleles

A data frame containing an AlleleTable, a table of allele information produced by geneDrop(). An AlleleTable records the alleles each ego has inherited and contains the following columns:

  • id --- A character vector of IDs for a set of animals.

  • parent --- A factor with levels of sire and dam.

  • V1 --- Unnamed integer column representing allele 1.

  • V2 --- Unnamed integer column representing allele 2.

  • ... --- Unnamed integer columns representing alleles.

  • Vn --- Unnamed integer column representing allele n.
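
To make the column layout above concrete, here is a minimal hand-built sketch of a data frame with the same shape. The IDs and allele values are invented for illustration and were not produced by geneDrop(); the name toy_alleles is hypothetical.

```r
# Hypothetical toy AlleleTable: two animals, two rows each (sire and dam allele),
# and three integer allele columns V1..V3. All values are made up.
toy_alleles <- data.frame(
  id = rep(c("A1", "A2"), each = 2),
  parent = factor(rep(c("sire", "dam"), times = 2), levels = c("sire", "dam")),
  V1 = c(1L, 2L, 1L, 3L),
  V2 = c(1L, 2L, 4L, 4L),
  V3 = c(2L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)

# Inspect the structure: character id, factor parent, integer allele columns.
str(toy_alleles)
```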

threshold

an integer indicating the maximum number of copies of an allele that can be present in the population for it to be considered rare. Default is 1.

byID

logical variable of length 1 that is passed through to alleleFreq(), which calculates the count of each allele in the provided vector. If byID is TRUE and IDs are provided, each allele is counted at most once per individual (a homozygous pair is counted as 1).
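
The difference between the two counting modes can be sketched in base R. This is only an illustration of the idea described above, assuming one allele column and two rows per animal; it is not the code used by alleleFreq(), and the vectors ids and alleles are invented.

```r
# One allele column for three animals (two rows per animal). Values are made up.
ids <- c("A1", "A1", "A2", "A2", "A3", "A3")
alleles <- c(5L, 5L, 5L, 6L, 7L, 8L)

# Raw copy counts: every row contributes, so A1's homozygous 5/5 counts twice.
raw_counts <- table(alleles)

# Per-individual counts: drop duplicate (id, allele) pairs first,
# so a homozygous animal contributes only one count for that allele.
pairs <- unique(data.frame(id = ids, allele = alleles))
by_id_counts <- table(pairs$allele)

# Compare the count for allele 5 under the two modes.
as.integer(raw_counts["5"])
as.integer(by_id_counts["5"])
```

With threshold = 1, alleles 6, 7 and 8 would be treated as rare under either mode in this toy data, since each occurs once.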

pop

character vector of animal IDs to consider as the population of interest. If NULL (the default), all animals are considered.

Value

A data frame with one row per id and a single column, gu, containing genome uniqueness values as percentages. Row names are set to the 'id' values that are part of the population.

Details

Part of Genetic Value Analysis

The following functions calculate genome uniqueness according to the equation described in Ballou & Lacy (1995).

It should be noted, however, that this function differs slightly in that it does not distinguish between founders and non-founders when calculating the statistic.

Ballou & Lacy describe genome uniqueness as "the proportion of simulations in which an individual receives the only copy of a founder allele." We have interpreted this as meaning that genome uniqueness should only be calculated for living, non-founder animals. Alleles possessed by living founders are not considered when calculating genome uniqueness.

We take a differing view, however, since a living founder can still contribute to the population. This function calculates genome uniqueness for all living animals and considers all alleles. It does not ignore living founders and their alleles.

Our results for genome uniqueness will, therefore, differ slightly from those returned by Pedscope. Pedscope calculates genome uniqueness only for non-founders and ignores the contribution of any founders in the population. As a result, Pedscope's genome uniqueness estimates for non-founders may be slightly higher than those calculated by this function.

The estimates of genome uniqueness for founders within the population calculated by this function should match the "founder genome uniqueness" measure calculated by Pedscope.
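
The idea described in this section can be sketched in base R as follows. This is a simplified illustration, not the implementation used by calcGU(): the helper sketch_gu() and the toy data are hypothetical, and the sketch assumes that an allele is rare when its copy count in a column is at most threshold, and that the result is the percentage of an animal's allele slots (two per column) holding a rare allele. Every animal in the table is scored and every allele is counted, with no distinction between founders and non-founders.

```r
# Hypothetical toy AlleleTable (values made up, not produced by geneDrop()).
toy_alleles <- data.frame(
  id = rep(c("A1", "A2"), each = 2),
  parent = factor(rep(c("sire", "dam"), times = 2), levels = c("sire", "dam")),
  V1 = c(1L, 2L, 1L, 3L),
  V2 = c(1L, 2L, 4L, 4L),
  V3 = c(2L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of nprcgenekeepr.
sketch_gu <- function(alleles, threshold = 1L) {
  sims <- setdiff(names(alleles), c("id", "parent"))
  # For each allele column, flag rows whose allele has at most `threshold`
  # copies across all rows of that column.
  rare <- vapply(sims, function(col) {
    counts <- table(alleles[[col]])
    as.integer(counts[as.character(alleles[[col]])]) <= threshold
  }, logical(nrow(alleles)))
  rare <- matrix(rare, nrow = nrow(alleles))
  # Sum rare alleles over columns, then over each animal's two rows.
  rare_per_id <- tapply(rowSums(rare), alleles$id, sum)
  # Assumption: express as a percentage of the animal's 2 * n allele slots.
  gu <- 100 * rare_per_id / (2 * length(sims))
  data.frame(gu = as.numeric(gu), row.names = names(gu))
}

sketch_gu(toy_alleles, threshold = 1L)
```

In this toy data each animal holds a rare allele in three of its six allele slots, so both rows of the result are 50. Raising threshold to 2 makes every allele rare and gives 100 for both.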

References

Ballou JD, Lacy RC. 1995. Identifying genetically important individuals for management of genetic variation in pedigreed populations, p 77-111. In: Ballou JD, Gilpin M, Foose TJ, editors. Population management for survival and recovery. New York (NY): Columbia University Press.

Examples

# \donttest{
library(nprcgenekeepr)
ped1Alleles <- nprcgenekeepr::ped1Alleles
gu_1 <- calcGU(ped1Alleles, threshold = 1, byID = FALSE, pop = NULL)
gu_2 <- calcGU(ped1Alleles, threshold = 3, byID = FALSE, pop = NULL)
gu_3 <- calcGU(ped1Alleles, threshold = 3, byID = FALSE,
               pop = ped1Alleles$id[20:60])
# }