The automatic annotation and analysis of speech

A scientific research software tool

SPPAS offers open-source, cross-platform, customizable solutions for the automatic annotation and analysis of audio and video media.

SPPAS received an award from the French Ministry of Higher Education, Research and Innovation at the 2022 Open Source Research Software Competition.

Brigitte Bigi is the author of SPPAS. She is a computer scientist and researcher at Laboratoire Parole et Langage, Aix-en-Provence, France.

Honourable Mention to the Special Jury Prize


SPPAS automatically produces annotations from an audio recording and its orthographic transcription, and/or from a video.


SPPAS helps with the analysis of annotated files: statistics, queries, and viewing and editing files for manual annotation.


SPPAS converts annotated files from/to a wide range of formats: xra, TextGrid, eaf, trs...

Hot topics

2023: Annotate manually on the "Edit" page

The editor page of the Graphical User Interface will support manual annotation. This interface already offers an easy and powerful way to modify the labels of annotations. In the 2023 releases, it will be possible to adjust the boundaries of video annotations very precisely in a convenient window displaying a sequence of 3 video frames.


SPPAS 4.2: The Edit page with an audio file, a video file, the manual orthographic transcription and 2 automatic annotations.

2023-2026: Cued Speech Automatic Generation

A Cued Speech key generator was first introduced in version 3.9 (August 2021). A proof of concept (PoC) of an augmented reality system was then proposed in version 4.2 (January 2022).

The PoC will be turned into a stable version in 2023. Models based on the analyses of CLeLfPC will then be implemented (statistical distribution analyses, machine learning, ...).

This is part of a project funded by FIRAH.

This video demonstrates the automatic generation of LPC (Cued Speech) keys by the SPPAS software.
Result of the proof-of-concept of "Cued Speech automatic annotation" (SPPAS 4.3)

Why should you trust SPPAS?


The SPPAS software tool is reliable: the application performs the features described in the documentation. It tolerates user mistakes and unexpected uses of the software: in such cases, an error identifier is displayed with an error message. Its performance is good enough for the required use cases, under the expected load and data volume.

SPPAS is installed on your computer: your corpus is not transferred over the web. No statistics and no personal data are collected.

SPPAS is actively maintained by its author: fixing bugs, keeping its systems operational, investigating failures, checking it on 3 platforms, adapting it to new use cases, adding new features and, last but not least, adding new language resources and updating the documentation and the website.

SPPAS is an open-source package: you can inspect the source code of the software tool, modify it, redistribute it, etc.

Key Facts and Figures

  • Documentation:
    • The SPPAS documentation: 160 pages
    • The resources documentation: 60 pages
    • The XRA file format: 15 pages
    • The transcription convention: 5 pages
  • References: 29
    • 4 about the software tool itself
    • 18 about the annotations
    • 3 about the linguistic resources construction
    • 2 about the analyses
    • 2 about the data representation

How to cite?


Current ones

Past ones

Previously, SPPAS was partly supported by the following projects:


ORTOLANG - Investissement d'Avenir


ORTOLANG receives state aid under the « Investissements d’avenir » program (ANR–11–EQPX–0032)


CoFee - Conversational Feedback


Multidimensional analyses and modeling (ANR-12-JCJC-JSH2-006-01)


Variamu - Variations in Action


a Multilingual approach (Campus France - Procore PHC)


Adding Cantonese into SPPAS


In collaboration with PolyU (Campus France - Procore PHC)


NaijaSynCor - Common Nigerian Pidgin


a corpus-based study of the nature and functions of Naija in Nigeria (ANR-16-CE27-0007)

VAPVISIO



the training of language trainers in online environments using videoconferencing (ANR-18-CE28-0011)