@misc{oai:ir.soken.ac.jp:00001503, author = {McCRAE, John Philip and マックレー, ジョン フィリップ}, month = {2016-02-17}, note = {Ontologies provide a structured description of the concepts and terminology
used in a particular domain and provide valuable knowledge for a range of natural
language processing applications. However, for many domains and languages
ontologies do not exist and manual creation is a difficult and resource-intensive
process. As such, automatic methods to extract, expand or aid the construction
of these resources are of significant interest.
  There are a number of methods for extracting semantic information about
how terms are related from raw text, most notably the approach of Hearst
[1992], who used patterns to extract hypernym information. This method was
manual and it is not clear how to automatically generate patterns, which are
specific to a given relationship and domain. I present a novel method for developing
patterns based on the use of alignments between patterns. Alignment
works well as it is closely related to the concept of a join-set of patterns, which
minimally generalise over-fitting patterns. I show that join-sets can be viewed
as a reduction of the search space of patterns, while resulting in no loss of
accuracy. I then show the results can be combined by a support vector machine
to obtain a classifier, which can decide whether a pair of terms is related. I applied
this to several data sets and conclude that this method produces a precise result,
with reasonable recall.
  The system I developed, like many semantic relation systems, produces only
a binary decision of whether a term pair is related. Ontologies have a structure
that limits the forms of networks they can represent. As the relation extraction is
generally noisy and incomplete, it is unlikely that the extracted relations will
match the structure of the ontology. As such, I represent the structure of the ontology
as a set of logical statements, and form a consistent ontology by finding the
network closest to the relation extraction system's output, which is consistent
with these restrictions. This gives a novel NP-hard optimisation problem, for
which I develop several algorithms. I present simple greedy approaches, and
branch and bound approaches, which my results show are not sufficient for this
problem. I then use resolution to show how this problem can be stated as an
integer programming problem, which can be efficiently solved by relaxing it to
a linear programming problem. I show that this approach can efficiently solve the
problem and, furthermore, that when applied to the output of the relation extraction
system, it improves the quality of the extraction as well as converting it to an
ontological structure., application/pdf, 総研大甲第1288号}, title = {AUTOMATIC EXTRACTION OF LOGICALLY CONSISTENT ONTOLOGIES FROM TEXT CORPORA}, year = {} }