Friday, November 20, 2015

Statistical Tests for Clonality

SUMMARY

Cancer investigators frequently conduct studies to examine tumor samples from pairs of apparently independent primary tumors with a view to determining if they share a “clonal” origin. The genetic fingerprints of the tumors are compared using a panel of markers, often representing loss of heterozygosity (LOH) at distinct genetic loci. In this article we evaluate candidate significance tests for this purpose. The relevant information derives from the observed correlation of the tumors with respect to the occurrence of LOH at individual loci, a phenomenon that can be evaluated using Fisher’s Exact Test. Information is also available from the extent to which losses at the same locus occur on the same parental allele. Data from these combined sources of information can be evaluated using a simple adaptation of Fisher’s Exact Test. The test statistic is the total number of loci at which concordant mutations occur on the same parental allele, with higher values providing more evidence in favor of a clonal origin for the two tumors. The test is shown to have high power for detecting clonality for plausible models of the alternative (clonal) hypothesis, and for reasonable numbers of informative loci, preferably located on distinct chromosomal arms. The method is illustrated using studies to identify clonality in contralateral breast cancer. Interpretation of the results of these tests requires caution due to simplifying assumptions regarding the possible variability in mutation probabilities between loci, and possible imbalances in the mutation probabilities between parental alleles. Nonetheless, we conclude that the method represents a simple, powerful strategy for distinguishing independent tumors from those of clonal origin.
Keywords: Clonality, Permutation test, Second primary cancers

Clonality: A Package for Clonality testing
Statistical Challenges in Testing Clonal





Molecular Evolution

Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data

Abstract

Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
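
As a rough illustration of the kind of summary a Gaussian drift model works from, the sketch below computes a covariance matrix of allele frequencies between populations after centering each SNP on its mean across populations. This is only an assumed, simplified starting point written in plain Python; it is not TreeMix's code, it ignores sample-size corrections, and the function name and input layout are hypothetical.

```python
def allele_freq_covariance(freqs):
    """Covariance of mean-centered allele frequencies between populations.

    freqs[k][i] is the allele frequency at SNP k in population i
    (hypothetical layout chosen for this sketch).
    """
    n_snps = len(freqs)
    n_pops = len(freqs[0])
    cov = [[0.0] * n_pops for _ in range(n_pops)]
    for row in freqs:
        mean = sum(row) / n_pops
        centered = [x - mean for x in row]
        for i in range(n_pops):
            for j in range(n_pops):
                cov[i][j] += centered[i] * centered[j]
    # Average the per-SNP products over all SNPs.
    return [[value / n_snps for value in r] for r in cov]

if __name__ == "__main__":
    # Populations 0 and 1 have identical frequencies; population 2 differs.
    example = [[0.1, 0.1, 0.5], [0.2, 0.2, 0.8]]
    for r in allele_freq_covariance(example):
        print([round(v, 4) for v in r])
```

Populations that share drift history tend to show positive covariance in such a matrix, and a tree or graph of populations implies a particular covariance pattern that can be compared against it.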

Abstract

Phylogenies of highly genetically variable viruses such as HIV-1 are potentially informative of epidemiological dynamics. Several studies have demonstrated the presence of clusters of highly related HIV-1 sequences, particularly among recently HIV-infected individuals, which have been used to argue for a high transmission rate during acute infection. Using a large set of HIV-1 subtype B pol sequences collected from men who have sex with men, we demonstrate that virus from recent infections tends to be phylogenetically clustered at a greater rate than virus from patients with chronic infection (‘excess clustering’) and also tends to cluster with other recent HIV infections rather than chronic, established infections (‘excess co-clustering’), consistent with previous reports. To determine the role that a higher infectivity during acute infection may play in excess clustering and co-clustering, we developed a simple model of HIV infection that incorporates an early period of intensified transmission, and explicitly considers the dynamics of phylogenetic clusters alongside the dynamics of acute and chronic infected cases. We explored the potential for clustering statistics to be used for inference of acute stage transmission rates and found that no single statistic explains very much variance in parameters controlling acute stage transmission rates. We demonstrate that high transmission rates during the acute stage are not the main cause of excess clustering of virus from patients with early/acute infection compared to chronic infection, which may simply reflect the shorter time since transmission in acute infection. Higher transmission during acute infection can result in excess co-clustering of sequences, while the extent of clustering observed is most sensitive to the fraction of infections sampled.

A general linear model-based approach for inferring selection to climate


Estimation of Population Genetic Structure Software and Methods

Useful articles for estimating population structure:

On Identifying the Optimal Number of Population Clusters via the Deviance Information Criterion

Detecting correlation between allele frequencies and environmental variables as a signature of selection. A fast computational approach for genome-wide studies

Detecting and measuring selection from gene frequency data












GenClone 2.0

GenClone: a computer program to analyze genotypic data, test for clonality and describe spatial clonal organization
Arnaud-Haond Sophie and Belkhir Khalid
«Team MAREE» - CCMAR, Algarve University, FCMA, Gambelas, 8005-139 Faro, PORTUGAL
«Génome, Populations, Interactions »-Université Montpellier II, Place Eugène Bataillon ; 34090 Montpellier Cedex, FRANCE

Link: GenClone