AUTOMATIC EXTRACTION OF LOGICALLY CONSISTENT ONTOLOGIES FROM TEXT CORPORA
https://ir.soken.ac.jp/records/1503
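Name / File | License | Action |
---|---|---|
要旨・審査要旨 (abstract and examination summary): https://ir.soken.ac.jp/record/1503/files/甲1288_要旨.pdf (308.8 kB) | | |
本文 (full text): https://ir.soken.ac.jp/record/1503/files/甲1288_本文.pdf (2.1 MB) | | |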
Item type | Thesis or Dissertation (1) |
---|---|
Publication date | 2010-06-09 |
Title | AUTOMATIC EXTRACTION OF LOGICALLY CONSISTENT ONTOLOGIES FROM TEXT CORPORA |
Title language | en |
Language | eng |
Resource type identifier | http://purl.org/coar/resource_type/c_46ec |
Resource type | thesis |
Author name | McCRAE, John Philip |
Furigana (name reading) | マックレー, ジョン フィリップ |
Author | McCRAE, John Philip |
Degree-granting institution | 総合研究大学院大学 (The Graduate University for Advanced Studies, SOKENDAI) |
Degree name | 博士(情報学) (Doctorate, Informatics) |
Diploma number | 総研大甲第1288号 (description type: Other) |
Graduate school | 複合科学研究科 (School of Multidisciplinary Sciences) |
Department | 17 情報学専攻 (Department of Informatics) |
Date of degree conferral | 2009-09-30 |
Academic year of degree conferral | 2009 |
Abstract (description type: Other) | Ontologies provide a structured description of the concepts and terminology used in a particular domain and provide valuable knowledge for a range of natural language processing applications. However, for many domains and languages ontologies do not exist, and manual creation is a difficult and resource-intensive process. As such, automatic methods to extract, expand or aid the construction of these resources are of significant interest.<br /><br />There are a number of methods for extracting semantic information about how terms are related from raw text, most notably the approach of Hearst [1992], who used <i>patterns</i> to extract hypernym information. This method was manual and it is not clear how to automatically generate patterns, which are specific to a given relationship and domain. I present a novel method for developing patterns based on the use of alignments between patterns. Alignment works well as it is closely related to the concept of a <i>join-set</i> of patterns, which minimally generalise over-fitting patterns. I show that join-sets can be viewed as a reduction of the search space of patterns, while resulting in no loss of accuracy. I then show the results can be combined by a <i>support vector machine</i> to obtain a classifier, which can decide if a pair of terms is related. I applied this to several data sets and conclude that this method produces a precise result, with reasonable recall.<br /><br />The system I developed, like many semantic relation systems, produces only a binary decision of whether a term pair is related. Ontologies have a structure that limits the forms of networks they represent. As the relation extraction is generally noisy and incomplete, it is unlikely that the extracted relations will match the structure of the ontology. As such, I represent the structure of the ontology as a set of logical statements, and form a consistent ontology by finding the network closest to the relation extraction system's output that is consistent with these restrictions. This gives a novel <i>NP-hard</i> optimisation problem, for which I develop several algorithms. I present simple greedy approaches, and branch and bound approaches, which my results show are not sufficient for this problem. I then use resolution to show how this problem can be stated as an <i>integer programming problem</i>, which can be efficiently solved by relaxing it to a <i>linear programming problem</i>. I show that this result can efficiently solve the problem, and furthermore when applied to the result of the relation extraction system, this improves the quality of the extraction as well as converting it to an ontological structure. |
Holdings | Yes (held) |
Format | application/pdf (description type: Other) |
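
The abstract credits Hearst [1992] with using hand-written lexical patterns to extract hypernym pairs from raw text. As a rough illustration of that baseline only (not the thesis's alignment or join-set method), the sketch below applies one simplified "such as" pattern. The regular expression, the function name `extract_hypernym_pairs`, and the sample sentence are assumptions made for this example, and the pattern only handles single-word terms.

```python
import re

# Hypothetical, simplified lexical pattern (an assumption for illustration):
#   "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
# Only single-word terms are matched; real systems work on noun phrases.
SUCH_AS = re.compile(r"(\w+) such as ((?:\w+, )*\w+(?: (?:and|or) \w+)?)")


def extract_hypernym_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        # Split the enumerated list on commas and on the final and/or.
        hyponyms = re.split(r", | and | or ", match.group(2))
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms if hyponym)
    return pairs


sentence = "He plays instruments such as guitar, piano and violin."
print(extract_hypernym_pairs(sentence))
# [('guitar', 'instruments'), ('piano', 'instruments'), ('violin', 'instruments')]
```

A single hand-written pattern like this one is tied to one relation and one phrasing, which is the limitation the abstract cites as motivation for generating patterns automatically.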