Elucidating the role of 8q24 in colorectal cancer
Depth of sequencing coverage in the CG panel was high across each of the target regions, at 48x–58x (Fig. …). Concordance between Illumina Omni Express genotype and sequencing data in 84 samples was 99%. In total, 139 668 SNPs and 16 173 indels and substitutions were catalogued within the 16.2 Mb region.
Of these, 96 195 were also present in the 1000 Genomes panel, and a further 11 653 were monomorphic in the five GWASs.
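As an illustration of how a genotype concordance figure such as the 99% above can be computed, the following minimal sketch compares array and sequencing calls site by site. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with missing calls as None; the function name and the toy data are hypothetical and are not taken from the study.

```python
def genotype_concordance(array_calls, seq_calls):
    """Return the fraction of sites where both platforms give the same call.

    Sites where either call is missing (None) are excluded from the comparison.
    """
    compared = 0
    matched = 0
    for array_gt, seq_gt in zip(array_calls, seq_calls):
        if array_gt is None or seq_gt is None:
            continue  # skip sites that were not called on both platforms
        compared += 1
        if array_gt == seq_gt:
            matched += 1
    if compared == 0:
        raise ValueError("no sites with non-missing calls on both platforms")
    return matched / compared


# Toy data for one sample (hypothetical): alternate-allele counts per site.
array_calls = [0, 1, 2, 1, None, 0, 2, 1]
seq_calls = [0, 1, 2, 0, 1, 0, 2, 1]

concordance = genotype_concordance(array_calls, seq_calls)
print(f"concordance = {concordance:.3f}")
```

In practice the same comparison would be run across all overlapping sites and samples; restricting it to sites that are heterozygous in the reference calls gives a heterozygote-specific accuracy.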
Fine-mapping of disease loci has traditionally been undertaken using a combination of re-sequencing and direct typing of SNPs within regions of association.
This strategy is, however, costly and time-consuming, and the ability to impute unobserved genotypes in GWAS data sets using a reference panel provides an attractive and practical alternative.
Our analysis did not provide evidence that any of the associations at the 16 loci are a consequence of synthetic associations rather than linkage disequilibrium with a common risk variant.
Recent genome-wide association studies (GWASs) have validated the hypothesis that part of the heritable risk of CRC is attributable to common variation, identifying susceptibility loci at 1q41, 3q26.2, 6p21.2, 8q23.3, 8q24.21, 10p14, 11q13.4, 11q23.1, 12q13, 14q22.2, 15q13.3, 16q22.1, 18q21.1, 19q13.11, 20p12.3, 20q13.33 and Xp22.2 (4–11).
While the associations identified by GWAS provide novel insights into the development of CRC, for example by highlighting the role of TGF-β signalling in disease aetiology, elucidating their functional basis is challenging because the tag single nucleotide polymorphisms (tag SNPs) genotyped are generally not strong candidates for causality.
This analysis revealed very similar results, with 94.9% and 88.6% of heterozygotes being correctly imputed in the imputations with and without the CG panel, respectively. Table 2 shows, for each region, the tag SNP and the most associated SNP along with their respective pairwise LD metrics.
Under such a scenario, many risk variants will have carrier frequencies below the threshold of representation in sequencing of population-based reference panels. To maximize the utility of imputation as a means of fine-mapping CRC loci, it is therefore highly desirable to also use high-coverage sequencing data on CRC cases to ensure adequate representation of risk variants.
In addition, in 10 of the 16 regions, the most associated SNP in the imputation was more than an order of magnitude more strongly associated with CRC.
In 4 of the 16 regions (1q41, 15q13.3, 18q21.1 and 20q13.33), imputation results were consistent with and without the CG panel and a variant significantly more associated ( […] Functional annotation of the most associated variants, using publicly available data from ENCODE, revealed that they reside within potential regulatory regions of DNA.
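To make the pairwise LD metrics between a tag SNP and the most associated SNP concrete, the sketch below computes r² and D′ from phased haplotype counts. It assumes the metrics in question are the standard r² and D′, which the text does not state explicitly; the function name and the example counts are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (r2, d_prime) for two biallelic SNPs from haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are counts of the four possible haplotypes,
    where A/a are the alleles at SNP 1 and B/b the alleles at SNP 2.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    # D is the deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B

    variance_product = p_A * (1 - p_A) * p_B * (1 - p_B)
    if variance_product == 0:
        raise ValueError("LD is undefined when either SNP is monomorphic")
    r2 = d * d / variance_product

    # D' rescales |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    return r2, d_prime


# Hypothetical haplotype counts for a tag SNP and a candidate variant.
r2, d_prime = pairwise_ld(40, 10, 10, 40)
print(f"r2 = {r2:.2f}, D' = {d_prime:.2f}")
```

A high D′ with a low r² is the pattern expected when a rarer variant sits on the background of a common tag allele, which is why both metrics are usually reported together when comparing a tag SNP with a candidate variant.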