@misc{oai:ir.soken.ac.jp:00000906, author = {深川, 竜郎 and フカガワ, タツオ and FUKAGAWA, Tatsuo}, month = {2016-02-17}, note = {Studies on a boundary of long-range G+C% mosaic domains in the human genome: characterization of pseudoautosomal boundary-like sequence (PABL) found near the boundary
The human genome, like those of warm-blooded vertebrates in general, is composed of long-range G+C% (GC%) mosaic structures related to chromosome bands. Several groups have shown that the Giemsa-dark G bands are composed mainly of AT-rich sequences, and T bands (a subgroup of Giemsa-pale R bands) mainly of GC-rich sequences; ordinary R bands are heterogeneous and appear to be intermediate. Gene density, CpG island density, codon usage, chromosome condensation, DNA replication timing, repeat sequence density, and other chromosome behaviors such as recombination and mutation rates are related to chromosome bands and to long-range GC% mosaic domains. Gene-dense T bands with loose chromatin structures replicate early in S phase and are rich in Alu repeats, while G bands with condensed chromatin structures replicate late and are rich in LINE-1 repeats. Because chromosome bands are structures observed under the microscope, locating their boundaries precisely may seem meaningless. However, considering the various genome behaviors connected with chromosome bands, it appears possible to locate their boundaries precisely by placing informative landmarks on genomic DNA. Boundaries may be structurally assigned as clear GC% transition points, and signals that punctuate and/or differentiate the respective functions (e.g., a switching signal from early to late DNA replication) may be found at the boundaries.
The human major histocompatibility complex (MHC) comprises classes I (about 2 Mb), III (1 Mb), and II (1 Mb), in that order from telomere to centromere. Ikemura and his colleagues had found the human MHC to be a typical example of long-range GC% mosaic structures by analyzing MHC sequences compiled in the GenBank database, and had predicted a possible boundary of the Mb-level domains within an under-characterized 450-kb region containing the junction of MHC classes II and III. To clone the mosaic boundary, bidirectional chromosome walking from the class III CYP21 toward class II and from the class II HLADRA toward class III was conducted in this study, and contiguous clones covering the 450-kb region bridging classes II and III were obtained. To analyze the base-compositional distribution of the walked area, insert DNAs of the clones were purified and digested with nuclease P1, and GC% was measured by an HPLC method. About 150 kb from HLADRA was fairly homogeneously AT-rich (mostly less than 40% GC), showing that the AT-rich sequences extend from the class II side. A sharp transition to about 50% GC then occurred, and this GC-rich level continued to the class III CYP21. To analyze the structures near and at the GC% transition, the cosmid and λ clones harboring the transitional region were sequenced. The following three types of characteristic structures were found: Alu repeats densely clustered in a 20-kb region, five LINE-1 repeats also clustered in a 30-kb region, and a sequence highly homologous with the pseudoautosomal boundary sequences of the short arms of the human sex chromosomes (PABX1 and PABY1); PABX1 and PABY1 are the interface sequences between the sex-specific and pseudoautosomal regions. The author designated the sequence highly homologous with PABXY1 "PABL". It is possible that this organization, a dense Alu cluster - a dense LINE-1 cluster - a PABL, is characteristic of certain types of long-range GC% mosaic boundaries and of band boundaries.
The author focused on characterization of the newly found sequence, PABL.
Human sex chromosomes are divided into two functionally distinct regions, sex-specific sequences and pseudoautosomal regions (PARs); during each male meiosis, the X and Y chromosomes exchange DNA sequences by homologous recombination in the PARs, and PAR sequences are thus practically identical between the X and Y chromosomes because of this obligatory recombination. The interface between PAR1 (about 2.6 Mb) and the sex-specific region is the pseudoautosomal boundary (PAB1), and PAB1 is therefore the proximal (centromeric) limit to X-Y homologous recombination in PAR1: PAB1 is unique in the human genome as a strict physical site at which the unusually high-frequency recombination in the 2.6 Mb of PAR1 (known to be 20-fold greater than the genome average) terminates abruptly. Ellis and his colleagues reported the sequences around the interface, i.e., the PABX1 and PABY1 sequences. Interestingly, the sequence found near the boundary of the long-range GC% mosaic domains in the MHC is highly homologous (about 80% nucleotide identity) with the PABXY1 sequences, which constitute the functional interface in the sex chromosomes. Using the sequence in the MHC as a probe, multiple copies of pseudoautosomal boundary-like sequences (PABLs) were detected through Southern blot hybridization against genomic DNAs and through cosmid cloning. The author defined a ca. 650-nt consensus sequence of the PABL core by determining and comparing eleven independent PABL sequences. Although GenBank genomic sequences showing significant homology with the PABLs were confined to PABXY1, several human ESTs (expressed sequence tags) showed evident homology with separate portions of PABLs, indicating that some, if not all, PABLs are transcribable. To clarify the characteristics of the predicted PABL transcripts, six human cDNA libraries from different tissues and cells were screened using the PABL segment as a probe. Positive clones were obtained from all six libraries.
Sequence analysis of the six cDNA clones again showed a 650-nt core sequence, which corresponds to that defined by the genomic PABLs. No ORFs of significant size could be found in the obtained cDNAs, either in the PABL core sequences or in their flanks. When the cDNA sequences were searched with the BLASTX program against the protein sequence database, no significant homology with known proteins was detected. GRAIL, a computer program trained to identify protein-coding ORFs in human DNA, also could not detect reliable protein-coding capacity. These results suggest that the functional form of the transcripts, if any, is an RNA molecule. To estimate the intact sizes of PABL transcripts, northern blot analysis of human total or poly(A)+ RNA fractions was conducted using the PABL probe. Broad bands, estimated to be 5-10 kb in length, were detected.
To study the evolutionary processes involved in forming the present PABXY1 and PABLs, their phylogenetic relationships were examined. In order to estimate evolutionary rates, the reported PABXY1 sequences of great apes and Old World monkeys were included; the divergence between great apes and Old World monkeys is postulated here to have occurred 25 million years ago. Phylogenetic trees were constructed using the neighbor-joining method. Using the evolutionary distance between human and Old World monkey PABX1, as well as that for PABY1, the divergence time of PABLs including PABXY1 was estimated to be 60-120 million years ago: this is consistent with the result obtained through Southern hybridization that PABLs are present in the bovine genome. The evolutionary rates of individual PABLs and PABXY1 were then calculated using the divergence time. The rates of some PABLs were far less than 1 × 10⁻⁹ substitutions per site per year, indicating that evolutionary and functional constraints have acted on PABL sequences. Taking the phylogenetic relationships between PABLs and PABXY1 into consideration, an evolutionary process for the formation of the present pseudoautosomal boundary PAB1 is proposed by postulating an illegitimate recombination between two PABLs., 総研大甲第159号}, title = {ヒトゲノムに存在する巨大G+C%区分構造の境界に関する研究:境界領域に見い出した偽常染色体部位境界様配列(PABL)の解析}, year = {} }