@misc{oai:ir.soken.ac.jp:00001441, author = {増山, 和花 and マスヤマ, ワカ and MASUYAMA, Waka}, month = {2016-02-17}, note = {Protein domains are considered to be fundamental units of protein evolution. Many proteins consist of multiple domains, and domain combinations are closely linked with the function and evolutionary history of proteins. Domain loss and gene loss may lead to immediate loss of gene function. Domain combinations of genes could also potentially contribute to the diversity of organisms. My purpose is to search for human-, primate-, rodent-, and mammalian-specific domain combinations created through domain gain and loss. I retrieved domain combinations of vertebrate proteins from genome data, and defined the repertoire of domain combinations of an organism as the set of combinations encoded in its genome. Pfam (http://pfam.sanger.ac.uk/) was used as the domain database for the analysis. I extracted domain information and functions of vertebrate proteins from GTOP (http://spock.genes.nig.ac.jp/~genome/gtop.html). In order to examine the phylogeny of domain combinations, I followed nine procedures. At procedure 1, I enumerated 17,358 domain combinations from a data set of 37 metazoan genomes. At procedure 2, I selected domain combinations of eight species with steady genome data (human, chimpanzee, rhesus macaque, mouse, rat, dog, opossum, and chicken). Opossum and chicken were used as outgroups to placental mammals. At procedure 3, I removed domain combinations of outgroup species from the list of domain combinations left after procedure 2. At procedure 4, I selected lineage-specific domain combinations (human-, primate-, rodent-, and mammalian-specific). At procedure 5, I selected only the protein sequences existing in SWISS-PROT (http://au.expasy.org/sprot/). At procedure 6, I selected data with high BLASTP scores as orthologue candidates. At procedure 7, I retained well-aligned sequences by using ClustalW multiple alignment. 
At procedure 8, I constructed phylogenetic trees to identify orthologues. At procedure 9, I eliminated domain combinations that were also found in more than one outgroup Metazoan species.
 After following procedures 1 to 9, I selected 34 mammalian-specific, 17 primate-specific, 10 human-specific, and 14 rodent-specific domain combinations. As the branch in the species tree becomes shorter, the frequency of appearance of novel domains and novel domain combinations tends to decrease. This is why the number of lineage-specific domain combinations decreases as we go down the species tree from mammalian ancestors via primate ancestors to humans.
 I classified the remaining proteins into seven groups as follows: group 1: emergence of a new domain, group 2: insertion of one (or more) domains into an existing protein, group 3: deletion of one (or more) domains from an existing protein, group 4: change of domain order, group 5: increase of domain copy number, group 6: decrease of domain copy number, and group 5/6: cases in which it is not possible to distinguish an increase from a decrease of domain copy number. As a result, the domain combinations of the four lineages were decomposed into these seven groups: mammal-specific (14, 10, 1, 0, 7, 2 and 0), primate-specific (3, 4, 1, 0, 4, 5 and 0), human-specific (1, 0, 0, 0, 8, 0 and 1), and rodent-specific (0, 0, 1, 0, 0, 0 and 13), with the counts in parentheses given in the order of groups 1, 2, 3, 4, 5, 6, and 5/6. There were no group 4 proteins. Mammal-specific group 1 proteins include various interleukins and macrophage scavenger receptor. This implies that mammals upgraded the immune system using these new proteins. Out of the seventeen proteins belonging to group 2 and group 3 across all four lineages, twelve did not in fact undergo the deletion or insertion of domains that defines these two groups. These sequences gradually changed via amino acid substitutions as the species evolved, and the regions came to be recognized as domains at certain points in evolution. To put it another way, even orthologous proteins that appear to have no domains actually have "pre-domain" sequences, which can be detected with thresholds lower than the default value of the domain search program. This was revealed by the fact that, for these proteins, the taxonomic ranges of domain annotation vary depending on the thresholds fed into the domain search program. Other than the above-mentioned proteins, group 2 also includes signal transduction proteins, for example, regulator of G-protein signaling 3 protein, LIM kinase 2 and ubiquitin carboxyl terminal hydrolase. 
These proteins may have been involved in the evolution of lineage-specific signaling networks. Interestingly, lineage-specific domain combinations caused by insertions or deletions of domains are always accompanied by alternative splicing products or paralogous proteins of the ancestral forms, which seem to alleviate the risk of impairing the original functions, an impairment that could lead to lethality. Such backups appear to work as a kind of insurance.
 Aside from my main procedures 1 through 9, I also conducted secondary analyses and found several domain combinations that do not belong to the seven groups mentioned above. Examples of proteins with such domain combinations are: sex-determining region Y protein (SRY), which is conserved throughout therians but is absent in other vertebrates; uricase and L-gulonolactone oxidase, which are absent only in primates.
 In this research, I focused my analysis on mammals. Compared with the number of lineage-specific new domains observed when bacteria are included, the appearance of novel domains in the mammalian lineage is rare because of the short evolutionary time. Even so, I found 34 mammal-specific, 17 primate-specific, 10 human-specific, and 14 rodent-specific domain combinations. As a result of detailed analysis, I discovered five cases where domain insertion occurred., 総研大甲第1248号}, title = {Evolutionary Analysis of Protein Domains in Mammals}, year = {} }