PhD student in computer science in Rennes (France)
Text mining and information extraction in clinical data
My thesis is part of the BigClin project, which aims to develop a new representation of clinical records
based on fine-grained semantic annotation, made possible by new NLP tools dedicated to French clinical narratives. The project also addresses distributed-systems issues
such as scalability, management of uncertain data and privacy, and stream processing at runtime.
My role in this context is to develop and test NLP methods and tools for processing unstructured clinical data in French. These methods have to be based on algorithms that capture the semantics of the texts efficiently. Targeted tasks include indexing medical content, text mining, information extraction, and detection of uncertainty and negation, among others. These NLP tasks will rely on precise semantic annotation.
During my master's degree internship, I worked on the Accordys project under the supervision of a PhD student. The main task of this internship was to evaluate the performance of document indexing models such as TF-IDF, LSI, and LDA at the document level, in order to compute the similarity between several cases of fetal malformation. To do so, we used a small corpus composed of fetoplacental examination reports written in free text.
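To illustrate the kind of document-level comparison described above, here is a minimal sketch of a TF-IDF representation combined with cosine similarity. It is not the code used during the internship: the function names `tfidf_vectors` and `cosine`, the whitespace tokenization, the smoothed IDF formula, and the three example sentences are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Naive tokenization (assumption): lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, so terms present in every document keep a positive weight.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented toy sentences, not taken from any real clinical corpus.
reports = [
    "placenta shows mild calcification",
    "placenta shows severe calcification and infarction",
    "normal cardiac anatomy observed",
]
vecs = tfidf_vectors(reports)
print(cosine(vecs[0], vecs[1]))  # the two placenta reports share vocabulary
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

LSI and LDA would replace the raw term weights with lower-dimensional topic representations before the same similarity computation; they are left out here to keep the sketch dependency-free.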
I received my master’s degree in Linguistic Research and Applied Computations from Bordeaux-Montaigne University. I took many courses in linguistic theory (discourse analysis, syntax, semantics, etc.) and applied linguistics (natural language processing, corpus linguistics, etc.).
Dalloux, C., V. Claveau, and N. Grabar (2017). Détection de la négation : corpus français et apprentissage supervisé. In SIIM 2017 - Symposium sur l’Ingénierie de l’Information Médicale, Toulouse, France, pp. 1–8.
Dalloux, C. (2017). Détection de l’incertitude et de la négation : un état de l’art. In 19es REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017), pp. 94–107.