Exploring Semantic roles for Named Entity Recognition in the Molecular biology domain

https://ir.soken.ac.jp/records/843
2017f53f-9ccb-426c-bf55-8d1148f24ba2
Name / File  License  Action
甲905_要旨.pdf  Abstract and examination summary (234.9 kB)
甲905_本文.pdf  Full text (15.9 MB)
Item type  Thesis or Dissertation(1)
Publication date  2010-02-22
Title
Title  Exploring Semantic roles for Named Entity Recognition in the Molecular biology domain
Language  en
Language
Language  eng
Resource type
Resource type identifier  http://purl.org/coar/resource_type/c_46ec
Resource type  thesis
Author name  WATTARUJEEKRIT, Tuangthong
Furigana (reading)  ワッタールジークリット, ツァンチョン
Author  WATTARUJEEKRIT, Tuangthong (en)
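Degree-granting institution
Degree-granting institution name  総合研究大学院大学 (The Graduate University for Advanced Studies, SOKENDAI)
Degree name
Degree name  博士(情報学) (Doctorate, Informatics)
Degree number
Description type  Other
Description  総研大甲第905号 (SOKENDAI Kō No. 905)
School
Value  複合科学研究科 (School of Multidisciplinary Sciences)
Department
Value  17 情報学専攻 (Department of Informatics)
Date of degree conferral
Date of degree conferral  2005-09-30
Academic year of degree conferral
Value  2005
Abstract
Description type  Other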
Description  Named entity recognition (NER) in the molecular biology domain, the task of identifying and categorizing molecular entities appearing in text, is one of the most important tasks in a biological text mining engine. It is generally taken as the first step towards the more ambitious tasks of molecular event extraction (relation extraction) and, eventually, pathway discovery. However, although NER appears to be the easiest of the text mining tasks, its performance in this scientific domain remains quite low. In the most recent shared-task evaluation of NER in this domain (JNLPBA-2004), the best F1-score was only 72.6. This is far below what NER systems achieve in the newswire domain (F1-score of about 96%), which is near the human level of performance. At present, most NER systems employ term-internal features (e.g., lexical and morphological features) and co-occurrence information as term-external features. The lack of a molecular naming convention leads to terminological variation and to polysemy (i.e., the sharing of names between different entities), and such features are insufficient to handle these difficulties. Obtaining a complete set of rules for the lexical patterns of molecular names seems impossible, so term-external features other than co-occurrence information are of interest.

In this thesis, the semantic relationships between a predicate and its arguments, expressed as semantic roles, are proposed as a way to enhance NER systems in the molecular biology domain. Semantic role information is derived from a predicate-argument structure (PAS), which is a higher level of sentence representation than the syntactic relation and surface form levels. Semantic roles are therefore more consistent than co-occurrence information derived from the surface level. To employ semantic roles in an NER system, they are realized as various sets of syntactic features, which were used by a machine learning model to explore the most efficient way of letting this knowledge have the highest positive effect on NER.

As a result, the best feature set is composed of 6 lexical features (i.e., surface word, lemma form, orthographic feature, part-of-speech, phrase chunk and head word of the NP chunk) and 4 PAS-related features representing an argument's semantic role (i.e., the predicate's surface form, the predicate's lemma, voice, and the united feature of the subject-object head's lemma and the transitive-intransitive sense). Moreover, semantic roles show positive effects only for predicates that meet the following criteria. Either the predicate's arguments, as both agent and theme, have a higher probability of belonging to a named entity class than to the non-named entity class; or they have a lower probability of belonging to a named entity class than to the non-named entity class, and the number of training examples for the predicate is large enough (empirically, at least 270 sentences). The improvement in performance obtained by the NER system using PAS-related features, compared with not using them, affirms that semantic roles can enhance an NER system.
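
The predicate criteria stated in the abstract can be read as a simple filter. The following is a minimal sketch of that reading, not the thesis's implementation: the `PredicateStats` record, its field names, and the example predicates and numbers are hypothetical, and it assumes a single "named entity vs. non-named entity" split, so that "higher probability of belonging to a named entity class" becomes a probability above 0.5. Only the 270-sentence threshold comes from the abstract.

```python
from dataclasses import dataclass

# Threshold quoted in the abstract ("at least 270 sentences").
MIN_TRAINING_SENTENCES = 270


@dataclass
class PredicateStats:
    """Hypothetical per-predicate statistics estimated from training data."""
    lemma: str
    p_agent_is_entity: float  # P(agent argument belongs to a named entity class)
    p_theme_is_entity: float  # P(theme argument belongs to a named entity class)
    n_sentences: int          # training sentences containing this predicate


def is_useful_predicate(stats: PredicateStats) -> bool:
    """Return True if the predicate meets the criteria as interpreted here."""
    # Assumption: with a binary entity / non-entity split, "higher than the
    # non-named entity class" is equivalent to a probability above 0.5.
    both_higher = stats.p_agent_is_entity > 0.5 and stats.p_theme_is_entity > 0.5
    both_lower = stats.p_agent_is_entity < 0.5 and stats.p_theme_is_entity < 0.5
    enough_data = stats.n_sentences >= MIN_TRAINING_SENTENCES
    # Case 1: both roles lean towards entities (no data-size condition stated).
    # Case 2: both roles lean away from entities, and there is enough data.
    return both_higher or (both_lower and enough_data)


# Illustrative values only; they are not taken from the thesis.
examples = [
    PredicateStats("activate", 0.8, 0.7, 50),
    PredicateStats("suggest", 0.2, 0.3, 100),
    PredicateStats("suggest", 0.2, 0.3, 300),
    PredicateStats("contain", 0.8, 0.2, 1000),
]
selected = [is_useful_predicate(e) for e in examples]
```

Under this reading, a predicate whose agent and theme point in different directions (the last example) is never selected, regardless of how much training data is available.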
Holdings
Value  Held
Format
Description type  Other
Description  application/pdf
Versions

Ver.1 2023-06-20 16:10:04.332558
Powered by WEKO3

