Brigitte Bigi


I'm Brigitte Bigi (PhD), working in Aix-en-Provence, France. I'm a CNRS researcher at Laboratoire Parole et Langage.

I am absolutely convinced that research should be public and should primarily be a cooperative effort. That's why I share all my research work under Open Source licenses.

I'm a computer scientist working on Artificial Intelligence systems. I'm interested in Applied Computational Linguistics and Corpus Linguistics. I previously worked on speech technologies (ASR) and natural language processing (NLP).

My research topics are related to multimodal corpora:

Since 2011, all my research has been implemented, tested, documented and freely distributed, resulting in a software tool named SPPAS. It is under continuous development, with the aim of providing robust and reliable software for the automatic annotation of recordings and the analysis of annotated data. As its primary functionality, SPPAS offers a set of automatic or semi-automatic annotations of recordings. It also offers features for managing corpora of annotated files; in particular, it includes a tool to filter multi-level annotations. Other tools are dedicated to the analysis of time-aligned data, for example to estimate descriptive statistics.

Open Science information
Events and organizations about Open Science
Award Ceremony 2022 (audio/video, in French)
Photo: Open Science award presentation
Screenshot of the award ceremony: Brigitte Bigi and Sylvie Retailleau, MESRI Minister

Main projects and activities

  • Leader of the project "AutoCuedSpeech" (2023-2026), granted by FIRAH.
  • Contributor to Vapvisio (2018-2021), granted by ANR.
  • Contributor to Physocial (2018-2019), granted by Aix-Marseille Univ.
  • Leader of a Procore project, in collaboration with Poly U of Hong Kong, granted by Campus France (2015-2016).
  • Contributor to Variamu (2014-2015), granted by Aix-Marseille Univ.
  • Contributor to ORTOLANG (2012-2015), Equipex.
  • Contributor to TYPALOC, granted by ANR.
  • Member of CoFee (2012-2014), granted by ANR.


Past research topics are related to text corpora:

  • Formalization and constitution of corpora, including for under-resourced languages
  • Application to information retrieval (classification)
  • Application to automatic speech recognition (French, Vietnamese, Khmer)
  • Application to statistical translation (French-English, French-Vietnamese)

I graduated from Avignon University with a PhD in Computer Science. From 1997 to 2000, I worked with Professor Renato De Mori at LIA, France, on statistical language modelling for automatic speech recognition and information retrieval. I introduced a new, effective model for topic identification.

From 2000 to 2002, I worked with Professor Jean-Paul Haton and Professor Kamel Smaïli at LORIA, Nancy, France. My work focused on topic identification in newspaper articles and e-mails.

From 2002 to 2009, I worked at LIG on statistical language modelling for automatic speech recognition and statistical machine translation.

Since 2009, at LPL (Laboratoire Parole et Langage, Aix-en-Provence, France), my research has focused on corpus creation and the annotation of speech recordings. I favor language-independent approaches to the development of tools and systems, so that they can be used both for languages with few available data resources and for languages with an unexpectedly large amount of (unnecessary) data.

Professional Experience

  • 2009-present, CR1 CNRS at LPL Aix-en-Provence, France
  • 2006-2009, CR1 CNRS at LIG Grenoble, France
  • 2002-2006, CR2 CNRS at LIG Grenoble, France
    • 2004 (1st semester), invited researcher at ICSI Berkeley, CA, USA
  • 2000-2002, ATER (Assistant) at LORIA Nancy, France
  • 1997-2000, PhD student at LIA Avignon, France