Ingrid's publications

Natural Language Processing

[1] Ingrid Falk, Claire Gardent, Evelyne Jacquey, and Fabienne Venant. Synonym Sense Disambiguation, accepted. In eLexicography in the 21st century, eLex, Louvain-la-Neuve, Belgium, 10 2009. [ bib ]
[2] Ingrid Falk, Samuel Cruz Lara, Nadia Bellalem, and Lotfi Bellalem. Linguistic Information for Multilinguality in the SEMbySEM Project. In IC3K - International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, Funchal, Madeira, Portugal, 10 2009. [ bib | .pdf | .pdf (longer version) | .pdf (poster) ]
In this paper we discuss ways to handle multilingual linguistic information within the framework of the SEMbySEM project (http://www.sembysem.org, a research project within the European ITEA2 programme). The SEMbySEM project aims at defining tools and standards for the supervision and management of complex and dynamic systems by using a semantic abstract representation of the system to be supervised or managed. As we want our system to conform to an end-user's point of view, the conceptual information must be available and presentable in the end-user's language. At the same time, the need for and benefits of more accurate linguistic information associated with ontological knowledge representations have recently become more evident, and models of how this articulation could be achieved have emerged. Two of these models are LexInfo (Buitelaar et al. 2009) and LIR, the Linguistic Information Repository (Montiel-Ponsoda et al. 2008). In this paper we explore these models with a view to putting one or both into practice in the setting of the SEMbySEM project.

Keywords: knowledge representation; ontology engineering; multilingual linguistic representation; ontology and lexicon
[3] Ingrid Falk, Claire Gardent, Evelyne Jacquey, and Fabienne Venant. Grouping synonyms by definitions, accepted. In Recent Advances in Natural Language Processing, RANLP, Borovets, Bulgaria, 9 2009. [ bib | .pdf ]
[4] Ingrid Falk, Claire Gardent, Evelyne Jacquey, and Fabienne Venant. Sens, synonymes et définitions. In Conférence sur le Traitement Automatique du Langage Naturel - TALN'2009, Senlis France, 2009. [ bib | http ]
We present a method for grouping the synonyms of a word into sets representing the possible meanings of that word. The possible meanings are given by the definitions of a general dictionary for French, the TLFi (Trésor de la langue française informatisé), and the method is applied to the synonyms of 5 synonym dictionaries. To evaluate the method, we manually constructed a gold standard where, for each (word, definition) pair, 4 lexicographers specified the set of synonyms they judged adequate. The method scores an F-measure of 0.602 when no distinction is made between pronominal and non-pronominal use and 0.706 when it is.

[5] Ingrid Falk. Automated Semantic Classification of French Verbs. Master's thesis, École doctorale IAEM Lorraine, Nancy, France, 06 2008. [ bib | .pdf ]
[6] Ingrid Falk, Gil Francopoulo, and Claire Gardent. Évaluer SynLex. In Traitement Automatique de la Langue Naturelle - TALN 2007, Toulouse, France, 06 2007. Funded by CPER. [ bib | http ]
SYNLEX is a syntactic lexicon extracted semi-automatically from the LADL tables. Like the other syntactic lexicons for French which are both available and usable for NLP (LEFFF, DICOVALENCE), it is incomplete and its recall and precision with respect to a gold standard are unknown. We present an approach which goes some way towards addressing these shortcomings. The approach draws on methods used for the automatic acquisition of syntactic lexicons. First, a new syntactic lexicon is acquired from an 82-million-word corpus. This lexicon is then used to validate and extend SYNLEX. Finally, the recall and precision of the extended version of SYNLEX are computed based on a gold standard extracted from DICOVALENCE.

Keywords: Syntactic lexicon; Evaluation
[7] Claire Gardent, Bruno Guillaume, Guy Perrier, and Ingrid Falk. Extraction d'information de sous-catégorisation à partir des tables du LADL. In Traitement Automatique de la Langue Naturelle - TALN 2006, Leuven, Belgium, 04 2006. [ bib | http ]
Maurice Gross' grammar lexicon contains rich and exhaustive information about the morphosyntactic and syntactic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is hampered both by its non-standard encoding and by a structure that is partly implicit and partly underspecified. In this paper, we present a method for translating this information into a format more amenable to use by NLP systems, we discuss the results obtained so far, we compare our approach with related work, and we identify the possible further uses that can be made of the reformatted information.

Keywords: grammar lexicon; M. Gross; subcategorisation
[8] Claire Gardent, Bruno Guillaume, Guy Perrier, and Ingrid Falk. Extracting subcategorisation information from Maurice Gross' grammar lexicon. Archives of Control Sciences, pages 289-300, 2005. [ bib | http ]
Maurice Gross' grammar lexicon contains rich and exhaustive information about the morphosyntactic and semantic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is hampered both by its non-standard encoding and by a structure that is partly implicit and partly underspecified. In this paper, we present a method for translating this information into a format more amenable to use by NLP systems, we discuss the results obtained so far, we compare our approach with related work, and we identify the possible further uses that can be made of the reformatted information.

Keywords: computational linguistics; syntactic lexicon
[9] Claire Gardent, Bruno Guillaume, Guy Perrier, and Ingrid Falk. Maurice Gross' grammar lexicon and Natural Language Processing. In Language and Technology Conference, Poznań, Poland, 04 2005. [ bib | http ]
Maurice Gross' grammar lexicon contains extremely rich and exhaustive information about the morphosyntactic and semantic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is still restricted. In this paper, we first argue that the information contained in the grammar lexicon is potentially useful for Natural Language Processing (NLP). We then sketch a way to translate this information into a format which is arguably more amenable to use by NLP systems.

[10] Claire Gardent, Bruno Guillaume, Ingrid Falk, and Guy Perrier. Le lexique-grammaire de M. Gross et le traitement automatique des langues. Talk given at the ATALA workshop day: Interface lexique-grammaire et lexiques syntaxiques et sémantiques (http://www.atala.org/article.php3?id_article=240), 2005. [ bib | http ]
We outline a method for translating Maurice Gross' grammar lexicon into a format suited to natural language processing systems.


This file has been generated by bibtex2html 1.88.

Document Engineering

[1] Abdel Belaid, Ingrid Falk, and Yves Rangoni. XML data representation in Document Image Analysis. In Flavio Bortolozzi and Robert Sabourin, editors, 9th International Conference on Document Analysis and Recognition - ICDAR'07, pages 78-82, Curitiba, Brazil, 2007. IEEE Computer Society. [ bib | http ]
This paper presents the XML-based formats ALTO, TEI, and METS, which are used for Digital Libraries, and their relevance to data representation in a Document Image Analysis and Recognition (DIAR) process. In the first part we briefly present these formats, with a focus on their adequacy for structural representation and modeling of DIAR data. The second part shows how these formats can be used in a reverse engineering process. Their implementation as a data representation framework is also shown.

Keywords: XML; TEI; ALTO; METS; Document Image Analysis and Recognition; XSLT; Reverse Engineering; Document Class Model
[2] Ingrid Falk. Représentation et Stockage des données de la numérisation du dictionnaire Trévoux. Technical report, Loria, Nancy, France, 03 2006. [ bib | .pdf ]




Last update: Thursday, October 15, 2009