
Sylwia Ozdowska, Vincent Claveau,
Alignement de mots par apprentissage artificiel de règles de propagation syntaxique en corpus de taille restreinte,
Actes de la 12ème conférence de Traitement automatique des langues naturelles (TALN'05), Dourdan, France, June 2005,

Abstract: This paper presents and evaluates an original and effective approach to automatically aligning bitexts at the word level. The approach relies on a syntactic dependency analysis of the texts, provided by the SYNTEX tools, and uses a machine-learning technique, inductive logic programming, to automatically infer rules called propagation rules. These rules exploit the syntactic information to align words precisely. The approach is entirely automatic, and the results obtained on the data of the HLT evaluation campaign rival those of the best existing alignment systems. Moreover, our system needs very little training data: only hundreds of sentences, compared with millions for existing systems. Furthermore, syntactic isomorphisms between the two languages considered are easily identified through a linguistic examination of the inferred propagation rules.
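
To give a rough idea of what a propagation rule does, here is a minimal Python sketch. It is not the system described in the paper: the two rules are hand-written for illustration (the paper infers its rules by inductive logic programming), and the `Dep` type, the `propagate` function, the relation labels and the example sentence pair are all assumptions made up for this sketch. The idea shown is only that an alignment between two anchor words can be extended along dependency relations that match in both languages.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    """One dependency relation (hypothetical representation, not SYNTEX output)."""
    head: str       # governor word
    rel: str        # relation label, e.g. "SUBJ" or "OBJ"
    dependent: str  # governed word


def propagate(anchors, src_deps, tgt_deps):
    """Extend anchor word pairs along dependency relations with the same label.

    Hand-written illustrative rules (assumptions, not learned rules):
      1. aligned heads + same relation      => align the dependents
      2. aligned dependents + same relation => align the heads
    Words are plain strings, so repeated words in a sentence are not
    distinguished, and nothing prevents one-to-many links.
    """
    aligned = set(anchors)
    changed = True
    while changed:  # repeat until no rule adds a new pair
        changed = False
        for s in src_deps:
            for t in tgt_deps:
                if s.rel != t.rel:
                    continue
                heads = (s.head, t.head)
                deps = (s.dependent, t.dependent)
                if heads in aligned and deps not in aligned:
                    aligned.add(deps)
                    changed = True
                if deps in aligned and heads not in aligned:
                    aligned.add(heads)
                    changed = True
    return aligned


# Invented example: "the community bans imports" /
# "la communauté interdit les importations"
src = [Dep("bans", "SUBJ", "community"), Dep("bans", "OBJ", "imports")]
tgt = [Dep("interdit", "SUBJ", "communauté"), Dep("interdit", "OBJ", "importations")]
result = propagate({("community", "communauté")}, src, tgt)
```

Starting from the single anchor pair, the second rule links the two verbs through the shared SUBJ relation, and the first rule then links the two objects. Real propagation rules would also need to handle non-isomorphic structures, which this sketch ignores.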