Simulation of Genomic Variants and Identification of Genomic Patterns

It has long been appreciated that genomic variants contribute in part to human cancer. Two main types of variants, single nucleotide polymorphisms (SNPs) and copy number alterations (CNAs), have been widely explored in recent genome-wide association studies (GWASs). Characterizing these variants helps us understand the genesis and progression of tumors and provides valuable information for the diagnosis and treatment of human cancer. For this purpose, we simulate genomic variants and identify genomic patterns (i.e., significant genomic variants and the structures among them) with respect to SNPs and CNAs. The five key contributions of the thesis are summarized below.

1. For a clear understanding of the development of genome simulation, we make a comprehensive comparison of existing simulators w.r.t. evolutionary and demographic scenarios, computational efficiency, and applicability in genome studies. This work will help researchers make an informed choice of simulator and will support progress on new simulation methods.

2. An important issue in existing genome simulators is that efficiency and flexibility cannot both be handled well at the same time. We propose a new algorithm, SIMLD, to simulate real linkage disequilibrium (LD) patterns and case-control samples. The main features of SIMLD are two-fold: (1) fewer evolutionary generations are required to converge to real LD patterns; and (2) various disease models can be flexibly incorporated to produce phenotypes.

3. To search for susceptibility SNPs and epistatic models that underlie human cancer, we propose a novel SNP association study method based on probability theory, called ProbSNP. Experiments on simulated and real data show that ProbSNP compares favorably with other methods. The main features of ProbSNP are three-fold: (1) the joint probability between SNPs and phenotypes is modelled to assess the importance of SNPs; (2) the stability of the SNP selection is validated through a resampling process; and (3) the search space for epistatic models is reduced by the preceding step of individual SNP selection.

4. In addition to SNPs, somatic copy number alterations (CNAs) underlie almost all human cancers. To distinguish significant consensus events (SCEs) from random background CNAs, we develop a novel algorithm, called iSCE, which uses a permutation test to determine significance based on a new statistic. The experimental results show that iSCE outperforms other methods, achieving a larger area under the receiver operating characteristic (ROC) curve. The novel features of iSCE are three-fold: (1) iSCE accounts for the strong correlation among neighboring probes and therefore assigns a score to each region instead of to each single probe; (2) iSCE permutes whole CNA segments rather than single probes across samples; and (3) iSCE iteratively performs significance assessment and SCE-exclusive permutations.

5. To identify subtype-specific SCEs in heterogeneous diseases, we analyze two types of ovarian cancer, primary-recurrent ovarian cancer and high-grade ovarian cancer, w.r.t. CNAs, using clustering and the iSCE algorithm. The identified patterns show biological significance when compared with regions known to be associated with oncogenes (EGFR, KRAS, MYC, etc.) and tumor suppressor genes (CDKN2A/B, PTEN, etc.). The results will be helpful for exploring subtype-specific diagnosis and treatment.
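The permutation idea behind the significance assessment of consensus events can be illustrated with a small sketch. It is not the iSCE algorithm: the abstract does not define the iSCE statistic, so the sketch assumes a simple stand-in (the fraction of samples altered at each probe), and it assumes circular shifts as one way to permute a sample while keeping its altered segments contiguous. The function names are hypothetical.

```python
import random

def consensus_frequency(samples):
    """Fraction of samples altered at each probe (samples: list of 0/1 lists)."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def circular_shift(row, k):
    """Rotate a probe vector by k positions so altered segments stay contiguous."""
    k %= len(row)
    return row[-k:] + row[:-k] if k else list(row)

def permutation_pvalues(samples, n_perm=1000, seed=0):
    """Per-probe p-values against the null distribution of the maximum frequency.

    Assumption: the statistic is the plain alteration frequency, not the
    R score used in the thesis.
    """
    rng = random.Random(seed)
    n_probes = len(samples[0])
    observed = consensus_frequency(samples)
    null_max = []
    for _ in range(n_perm):
        # Shift every sample independently to break cross-sample alignment.
        shuffled = [circular_shift(row, rng.randrange(n_probes)) for row in samples]
        null_max.append(max(consensus_frequency(shuffled)))
    # Add-one correction keeps p-values strictly positive.
    return [(1 + sum(m >= obs for m in null_max)) / (n_perm + 1) for obs in observed]
```

Comparing each probe with the null distribution of the maximum frequency is one common way to control for testing many probes at once; an iterative scheme as described in the abstract would additionally exclude regions already called significant before permuting again.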
Abstract
Chapter 1 Introduction
    1.1 Conceptual description of genomic variants
        1.1.1 Single nucleotide polymorphism
        1.1.2 Copy number alteration
    1.2 Significance and status in investigating genomic variants
        1.2.1 Significance
        1.2.2 Current research status
    1.3 Challenges
    1.4 Thesis objective
    1.5 Thesis organization
Chapter 2 Comparative analysis of genome simulators
    2.1 Motivation
    2.2 Comparison of simulators
        2.2.1 Coalescent approach
        2.2.2 Forward-time approach
        2.2.3 Resampling approach
    2.3 Theoretical analysis of the approaches
    2.4 Conclusion
Chapter 3 Simulation of LD patterns and case-control data
    3.1 Motivation
    3.2 The framework of SIMLD
    3.3 Simulation of real LD patterns
        3.3.1 Initialization of population
        3.3.2 Mating and recombination
        3.3.3 Termination criteria
    3.4 Simulation of case-control samples
    3.5 Experimental results
    3.6 Conclusion
Chapter 4 Detection of susceptibility SNPs in cancer
    4.1 Motivation
    4.2 The probability theory based method
        4.2.1 Principle of ProbSNP
        4.2.2 Estimation of probabilities
        4.2.3 SNP selection criterion
        4.2.4 Detection of epistatic models
    4.3 Experimental results
        4.3.1 Results on simulation data
        4.3.2 Comparison with other approaches
        4.3.3 Results on real genome-wide data
    4.4 Conclusion
Chapter 5 Identification of SCEs in copy number alterations
    5.1 Motivation
    5.2 The framework of iSCE
    5.3 Design of copy number alteration regions
        5.3.1 Detection of copy number alteration probes
        5.3.2 Construction of copy number alteration regions
    5.4 Statistic
        5.4.1 Two types of R scores
        5.4.2 Difference between the two R scores
    5.5 Significance assessment
        5.5.1 Permutation
        5.5.2 Calculation of p-value
        5.5.3 Iteration of significance test
    5.6 Experimental results
        5.6.1 Comparison with other methods
        5.6.2 Results on public glioma and lung cancer data
    5.7 Conclusion
Chapter 6 Cancer subtype-specific SCEs detection
    6.1 Motivation
    6.2 Clustering-iSCE based method
    6.3 Experimental results
        6.3.1 Results on JHU ovarian cancer
        6.3.2 Results on TCGA ovarian cancer
    6.4 Conclusion
Chapter 7 Summary and future work
    7.1 Summary
    7.2 Future work
Acknowledgements
References
Appendix A Abbreviations
Appendix B LD comparison using D'
Appendix C List of SNPs identified
Appendix D Power comparison between both R scores
Publications and projects