Statistical analysis of viral sequences : bridging sampling design, molecular phylogenetics and population genetics

徐, 泰健; セオ, タエクン; SEO, Tae-Kun


https://ir.soken.ac.jp/records/1204

Name / File	License	Action
Abstract, Screening Result (315.4 kB)

Item type

Thesis or Dissertation(1)

Publication date

2010-02-22

Title

Statistical analysis of viral sequences : bridging sampling design, molecular phylogenetics and population genetics

Language

eng

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_46ec

Resource type

thesis

Author name

徐, 泰健

Phonetic reading (furigana)

セオ, タエクン

Author

SEO, Tae-Kun

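Degree-granting institution

Degree-granting institution name

The Graduate University for Advanced Studies (総合研究大学院大学)

Degree name

Doctor of Philosophy (博士（学術）)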

Degree certificate number

Description type

Other

Description

総研大甲第623号 (SOKENDAI Kō No. 623)

Graduate school

Value

School of Advanced Sciences (先導科学研究科)

Department

Value

21 Department of Biosystems Science (生命体科学専攻)

Date of degree conferral

2002-03-22

Academic year of degree conferral

Value

2001

Abstract

Description type

Other

Description

The high pace of viral sequence change means that variation in the times at which sequences are sampled can have a profound effect both on the ability to detect trends over time in evolutionary rates and on the power to reject the molecular clock hypothesis. Trends in viral evolutionary rates are of particular interest because their detection may allow connections to be established between a patient's treatment or condition and the process of evolution. Variation in sequence isolation times also affects the uncertainty associated with estimates of divergence times and evolutionary rates. Isolation times can be chosen deliberately to increase the power of hypothesis tests and to reduce the uncertainty of evolutionary parameter estimates, but this fact has received little previous attention. I provide approximations for the power to reject the molecular clock hypothesis when the alternative is that rates change linearly over time and when the alternative is that rates differ randomly among branches.

When the evolutionary rate changes linearly, it can be written as $r(t) = a(t - t_1) + r$, where $t$ is the current time, $t_1$ is the time of origin, $a$ is the increase or decrease in rate per unit time, and $r$ is the rate at the time of origin (the value of $r(t)$ at $t = t_1$). For a given $a$, the power to reject the null hypothesis ($H_0$: $a = 0$) can be calculated from the fact that the statistic $2\Delta\log L = 2\log\left[L(\hat{\theta})/L(\hat{\hat{\theta}})\right]$, with $\theta$ denoting the model parameters, tends to a non-central $\chi^2$ distribution under the alternative hypothesis ($H_1$: $a \neq 0$). Here the single circumflex ($\hat{\ }$) and the double circumflex ($\hat{\hat{\ }}$) denote maximum likelihood estimators (m.l.e.'s) under $H_1$ and $H_0$, respectively.

When rates differ randomly among branches, the gamma distribution can serve as a model of rate variation. If the number of substitutions on each branch is further assumed to follow a Poisson distribution, the number of substitutions follows a negative binomial distribution. The power to reject the null hypothesis ($H_0$: the evolutionary rate does not vary) can be calculated using the non-central $\chi^2$ distribution.

When the evolutionary rate is constant, the standard deviations of estimated evolutionary rates and divergence times can be approximated using the Fisher information matrix. I illustrate how these approximations can be exploited to determine which viral sample should be sequenced when samples representing different dates are available.

Using pseudo-maximum likelihood approaches to phylogenetic inference and coalescent theory, I develop a computationally tractable method of estimating effective population size from serially sampled viral data. The method adopts a two-stage estimation procedure: the vector of internal node times is first estimated from the sequence data, and these estimated node times then serve as the basis for inferring effective population size. Because the main interest is effective population size and not the internal node times, the node times are nuisance parameters in my analysis, and their number increases with the number of sequences.

The variance of the maximum likelihood estimator of effective population size is approximated as (numerical formula omitted in the repository record), where $n$ is the number of sequences.

I show that the variance of the maximum likelihood estimator of effective population size depends on the serial sampling design only because internal node times on a coalescent genealogy can be better estimated with some designs than with others. Given the internal node times and the number of sequences sampled, the variance of the maximum likelihood estimator is independent of the serial sampling design.

I estimate the effective size of the HIV-1 population within nine hosts. If I assume that the mutation rate is $2.5 \times 10^{-5}$ substitutions per generation and is the same in all patients, estimated generation lengths vary from 0.73 to 2.43 days per generation, and the mean (1.47) is similar to the generation lengths estimated by other researchers. If I assume that generation length is 1.47 days and is the same in all patients, mutation rate estimates vary from $1.52 \times 10^{-5}$ to $5.02 \times 10^{-5}$. The results indicate that effective viral population size and evolutionary rate per year are negatively correlated among HIV-1 patients.

(Figure 1(a) and (b) omitted in the repository record.) Figure 1: A negative correlation between the evolutionary rate per year and the effective population size, (a) assuming a generation length of 1.47 days and (b) assuming a mutation rate of $2.5 \times 10^{-5}$ substitutions per generation.
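Two statements in the abstract can be illustrated numerically: that the power of the likelihood-ratio test follows from a non-central chi-square distribution, and that Poisson substitution counts with gamma-distributed rates follow a negative binomial distribution. The following is a minimal sketch and not the thesis's own code. It assumes a test with one degree of freedom (natural for H0: a = 0, but an assumption here) and takes the noncentrality parameter as a given input, since the abstract does not state how it depends on the sampling design. All function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def lrt_power_1df(noncentrality, alpha=0.05):
    """Power of a 1-df likelihood-ratio test when the statistic is
    non-central chi-square with the given noncentrality (assumed input)."""
    z = NormalDist()
    # Square root of the central chi-square(1) critical value.
    root_crit = z.inv_cdf(1 - alpha / 2)
    shift = math.sqrt(noncentrality)
    # A non-central chi-square(1) variable is (Z + sqrt(lambda))**2, Z ~ N(0, 1),
    # so the rejection probability is a two-tailed normal probability.
    return (1 - z.cdf(root_crit - shift)) + z.cdf(-root_crit - shift)


def neg_binomial_pmf(k, shape, mean):
    """P(K = k) when K | r ~ Poisson(r) and r ~ Gamma(shape, scale=mean/shape)."""
    p = shape / (shape + mean)
    log_pmf = (math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
               + shape * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)


def gamma_poisson_pmf_numeric(k, shape, mean, steps=20000, upper=200.0):
    """The same probability by midpoint-rule integration of
    Poisson(k | r) * Gamma(r) over r, as an independent check."""
    scale = mean / shape
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        log_poisson = k * math.log(r) - r - math.lgamma(k + 1)
        log_gamma = ((shape - 1) * math.log(r) - r / scale
                     - math.lgamma(shape) - shape * math.log(scale))
        total += math.exp(log_poisson + log_gamma) * h
    return total


if __name__ == "__main__":
    # Hypothetical noncentrality values standing in for two sampling designs.
    for lam in (2.0, 8.0):
        print(f"noncentrality={lam}: power={lrt_power_1df(lam):.3f}")
    # Closed-form negative binomial versus numerical gamma-Poisson mixture.
    print(neg_binomial_pmf(3, shape=2.0, mean=5.0),
          gamma_poisson_pmf_numeric(3, shape=2.0, mean=5.0))
```

Under these assumptions, comparing sampling designs reduces to comparing their noncentrality parameters: a design that yields a larger noncentrality gives higher power at the same significance level.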

Holdings

Value

Yes (held)
