Pascale Sébillot, Pierrette Bouillon, Vincent Claveau, Cécile Fabre, Laurence Jacqmin, Jacques Nicolas,
Apprentissage en corpus de couples nom-verbe pour la construction d'un lexique génératif,
JADT 2000 (Journées d'analyse de données textuelles), Lausanne, Switzerland, March 2000.
Abstract: NLP systems that involve disambiguation and rephrasing require a fine-grained description of the semantics of lexical units. In this paper we describe a means of automatically extracting such information from corpora, within the framework of Pustejovsky's Generative Lexicon. In one component of this lexical model, called the qualia structure, words are described in terms of semantic roles. The qualia structure of a noun consists mainly of verbal associations that encode relational information. For example, the French verb mesurer (to measure) expresses the telic role of the noun jaugeur. Our aim is, for a given noun (N), to automatically extract from a corpus the verbs (V) that could belong to its qualia structure. More precisely, we describe a method based on learning techniques within the Inductive Logic Programming framework that lets us distinguish, in the corpus, between N-V pairs that are linked by a semantic relation and pairs that are not. Results compared with a chi-square (Khi2) score indicate that the method is very promising, not only because a large proportion of relevant pairs are detected, but also because it provides information that can be used to build linguistic rules.
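
The abstract does not say how the chi-square (Khi2) baseline is computed. As a rough illustration only, the sketch below applies the standard Pearson chi-square statistic to a 2x2 contingency table of noun-verb co-occurrence counts. The function name, its parameters, and the choice of "context" as the counting unit are assumptions made for this example, not details taken from the paper.

```python
def chi_square_2x2(n_nv, n_n, n_v, total):
    """Pearson chi-square for one noun-verb pair (hypothetical helper).

    n_nv  : number of contexts containing both the noun and the verb
    n_n   : number of contexts containing the noun
    n_v   : number of contexts containing the verb
    total : total number of contexts in the corpus
    """
    # Cells of the 2x2 contingency table.
    a = n_nv                        # noun and verb
    b = n_n - n_nv                  # noun without the verb
    c = n_v - n_nv                  # verb without the noun
    d = total - n_n - n_v + n_nv    # neither
    if min(a, b, c, d) < 0:
        raise ValueError("inconsistent counts")

    # Product of the four marginal totals.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A marginal is empty, so no association can be measured.
        return 0.0
    return total * (a * d - b * c) ** 2 / denom


# Made-up counts, for illustration only.
score = chi_square_2x2(n_nv=10, n_n=20, n_v=20, total=100)
```

A higher score means the noun and the verb co-occur more (or less) often than independence would predict. The statistic alone says nothing about which semantic relation holds, which is the kind of information the learned rules are meant to supply.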