Bioinformatics for the study of biodiversity
https://ir.soken.ac.jp/records/1221
Name / File | License | Action |
---|---|---|
Abstract and examination summary (330.0 kB)
|
Item type | Thesis or Dissertation(1) | |||||
---|---|---|---|---|---|---|
Publication date | 2010-02-22 | |||||
Title | ||||||
Title | Bioinformatics for the study of biodiversity | |||||
Language | en | |||||
Language | ||||||
Language | eng | |||||
Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_46ec | |||||
Resource type | thesis | |||||
Author name | YANG, Zhong | |||||
Furigana | ヤン, ツォン | |||||
Author | YANG, Zhong | |||||
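Degree-granting institution | ||||||
Degree-granting institution name | 総合研究大学院大学 (The Graduate University for Advanced Studies) | |||||
Degree name | ||||||
Degree name | 博士(理学) (Doctor of Science) | |||||
Degree certificate number | ||||||
Description type | Other | |||||
Description | 総研大乙第162号 | |||||
Graduate school | ||||||
Value | 先導科学研究科 (School of Advanced Sciences) | |||||
Department | ||||||
Value | 21 生命体科学専攻 (Department of Biosystems Science) | |||||
Date of degree conferral | ||||||
Date of degree conferral | 2006-03-24 | |||||
Academic year of degree conferral | ||||||
Value | 2005 | |||||
Abstract | ||||||
Description type | Other | |||||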
Description | This thesis studies phylogenetic approaches and database systems, together with their uses in bioinformatics. It focuses on three main topics: (A) molecular phylogenetic analysis as an effective tool to investigate the evolutionary relationships, rates, and adaptation of two important groups, namely the mangrove family Rhizophoraceae and the severe acute respiratory syndrome (SARS) coronavirus; (B) a statistical model and computer simulation approach for testing hybridization hypotheses based on incongruent gene trees; and (C) a new data model and comparison method for interacting classifications and phylogenetic trees in a taxonomic database.
In Chapter 1, I outlined the advances in today's biodiversity science and bioinformatics, and in the studies of molecular evolution and phylogenetics. To meet the major needs of biodiversity informatics, a newly formed cross-disciplinary field between biodiversity science and bioinformatics, applications of phylogenetic approaches and data models, as well as taxonomic database systems, are needed.
In Chapter 2, I investigated the phylogenetic relationships and evolutionary rate heterogeneity of the family Rhizophoraceae based on the sequences of the chloroplast genes <i>mat</i>K and <i>rbc</i>L, and the ITS regions of nuclear ribosomal DNA. Phylogenetic trees were constructed using the maximum parsimony (MP), neighbor-joining (NJ), and maximum likelihood (ML) methods. The partition-homogeneity tests indicated that the data sets were homogeneous, and the combined analysis showed that four mangrove genera formed a monophyletic group, with the terrestrial genus <i>Pellacalyx</i> as the basal clade. Evolutionary rate heterogeneity for the plastid <i>mat</i>K and <i>rbc</i>L genes in different species of the Rhizophoraceae was analyzed by means of relative-rate tests. 
A number of significant rate differences at synonymous and non-synonymous sites were detected in the two genes. First, two significant contrasts show that the mangrove genus <i>Bruguiera</i> has relatively slower substitution rates than the terrestrial genus <i>Carallia</i> at both synonymous and non-synonymous sites in the <i>mat</i>K sequences. The Mantel tests showed that the synonymous and non-synonymous relative-rate matrices are correlated for the <i>mat</i>K gene, suggesting that selective constraint at non-synonymous sites is fairly constant among evolutionary lineages of the <i>mat</i>K locus. Second, there are 13 significant contrasts at non-synonymous sites in the <i>rbc</i>L sequences. Among them, six indicate that the mangrove genera have relatively faster non-synonymous substitution rates than the related terrestrial groups. However, the terrestrial genus <i>Carallia</i> still shows a relatively faster non-synonymous rate than the mangrove genus <i>Kandelia</i>. Moreover, the <i>rbc</i>L non-synonymous sites also exhibit rate heterogeneity among the terrestrial groups, regardless of their geographical distributions. The Mantel tests show that the <i>rbc</i>L rates at synonymous and non-synonymous sites are uncorrelated. The molecular evolutionary pattern of mangroves and their terrestrial relatives, in which non-synonymous and synonymous substitution rates are uncoupled, suggests that selection is probably an important influence on the rate variation.
In Chapter 3, I detected adaptive evolution in the SARS coronavirus (SARS-CoV) genome. First, 61 SARS-CoV genomic sequences derived from the early, middle, and late phases of the SARS epidemic were analyzed together with two viral sequences from palm civets. 
The neutral mutation rate of the viral genome was constant, but the amino acid substitution rate of the coding sequences slowed during the course of the epidemic. Between the sequences of the palm civets and each of the human SARS-CoV sequences, the ratios of the rates of nonsynonymous to synonymous changes (K<small>A</small>/K<small>S</small>) for the S gene sequences were always greater than 1, indicating an overall positive selection pressure. However, pairwise analysis of K<small>A</small>/K<small>S</small> for the genotypes in each epidemic group shows that the average K<small>A</small>/K<small>S</small> for the early phase was significantly larger than that for the middle phase, which in turn was significantly larger than the ratio for the late phase, which was itself significantly less than 1. These data indicated that the S gene showed the strongest initial responses to positive selection pressures, followed by subsequent purifying selection and eventual stabilization. Second, I further tested the hypothesis that radical amino acid replacements in the spike protein were favored by environmental selective pressure during the process of SARS-CoV interspecific transmission. I investigated 108 complete sequences of the SARS-CoV S gene, reconstructed the most recent common ancestor (MRCA) sequences of the S gene, and detected adaptive evolution in the spike protein. The results showed simultaneous amino acid replacements at three sites, i.e., 360, 665, and 701. 
These sites led to an excess of the observed number of radical substitutions over the corresponding expectation under the assumption of selective neutrality, indicating that they potentially played important roles in the adaptive evolution of the spike protein.
In Chapter 4, I characterized certain distinctions between hybridization and other biological processes, including lineage sorting, paralogy, and lateral gene transfer, that are responsible for topological incongruence between gene trees. Consider two incongruent gene trees with three taxa, A, B, and C, where B is a sister group of A on gene tree 1 but a sister group of C on gene tree 2. With a theoretical model based on the molecular clock, we demonstrated that the time of divergence of each gene between taxa A and C is nearly equal in the case of hybridization (B is a hybrid) or lateral gene transfer, but differs significantly in the case of lineage sorting or paralogy. After developing a bootstrap test to distinguish these alternative hypotheses, we extended the model and test to account for incongruent gene trees with numerous taxa. Computer simulation studies supported the validity of the theoretical model and bootstrap test when each gene evolved at a constant rate. The computer simulation also suggested that the model remained valid as long as the rate heterogeneity occurred proportionally in the same taxa for both genes.
Finally, in Chapter 5, I described an information-theoretic view, <i>i.e.</i>, the taxon-view, which can be applied to biological classification to capture taxonomic concepts as data entities and to develop a system for managing these concepts and the lineage relationships among them. A new data model and methodology for comparing interacting classifications were outlined. 
On the basis of the data model and the comparison and query methods, a prototype taxonomic database system called HICLAS (Hierarchical CLAssification System) was built to query classification data and to compare interacting classifications and phylogenetic trees. | |||||
Holdings | ||||||
Value | Available |
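
The K_A/K_S reasoning summarized for Chapter 3 of the abstract (a ratio above 1 read as positive selection, a ratio below 1 as purifying selection) can be illustrated with a minimal Python sketch. The function names `ka_ks_ratio` and `selection_regime`, the `tolerance` parameter, and every numeric value below are hypothetical and invented for illustration; they are not data or code from the thesis, and the sketch does not reproduce the significance tests reported there.

```python
from statistics import mean


def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (K_A) to synonymous (K_S) substitution rates."""
    if ks <= 0:
        raise ValueError("K_S must be positive to form a ratio")
    return ka / ks


def selection_regime(ratio: float, tolerance: float = 0.0) -> str:
    """Interpret a K_A/K_S ratio relative to the neutral expectation of 1.

    `tolerance` is a hypothetical band around 1 treated as neutral; a real
    analysis would use a statistical test instead of a fixed band.
    """
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical pairwise (K_A, K_S) estimates per epidemic phase.
# These numbers are made up for illustration only.
phases = {
    "early": [(0.0030, 0.0020), (0.0036, 0.0024)],
    "middle": [(0.0022, 0.0020), (0.0024, 0.0020)],
    "late": [(0.0010, 0.0020), (0.0008, 0.0020)],
}

# Average the pairwise ratios within each phase, then classify each average.
average_ratio = {
    phase: mean(ka_ks_ratio(ka, ks) for ka, ks in pairs)
    for phase, pairs in phases.items()
}
regime = {phase: selection_regime(r) for phase, r in average_ratio.items()}

for phase in phases:
    print(phase, round(average_ratio[phase], 3), regime[phase])
```

With these invented inputs the average ratio decreases from the early to the late phase, mirroring the qualitative pattern described in the abstract; whether such differences are significant would have to be established separately.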