SPPAS — the automatic annotation and analysis of speech

SPPAS is an open-source annotation tool for creating, visualizing and searching annotations of audio/video data. Among other things, it can automatically produce speech segmentation annotations from a recorded speech sound and its transcription. It also offers special features for managing corpora of annotated files.

SPPAS's data files are XML-based, and SPPAS is compatible with Praat, Elan, Transcriber, and others. SPPAS runs on Windows, macOS and Unix platforms.

More about SPPAS...

AudiooPy

AudiooPy stands for "Audio manager in Python Object-Oriented Programming." It is an open-source library that provides a range of useful operations for sound files and audio fragments. It processes audio at the frame level, working with signed integer samples of 8, 16, or 32 bits, stored in bytes-like objects.

Key features include:

Reading and writing WAV files using Python's standard library.
A scientifically validated method for automatically detecting sound segments in speech.
Manipulation of raw audio data.
Audio mixing capabilities.
Automated computation of statistical descriptors for audio data.
Channel extraction.
Channel mixing.

AudiooPy is entirely self-contained and does not rely on any external libraries.
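
To make the frame-level view of audio concrete, here is a minimal sketch that uses only Python's standard library (`wave`, `struct`, `math`). It does not use AudiooPy's own API: the helper names `make_stereo_wav`, `extract_channel`, `read_channel` and `rms` are hypothetical and chosen only for this illustration. It assumes 16-bit signed little-endian samples and shows channel extraction from interleaved frames plus one simple statistical descriptor (root mean square).

```python
import io
import math
import struct
import wave


def make_stereo_wav(left, right, framerate=16000):
    """Build an in-memory 16-bit stereo WAV from two lists of samples."""
    # Frames are interleaved: left sample, right sample, left sample, ...
    frames = b"".join(struct.pack("<hh", l, r) for l, r in zip(left, right))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit signed integers
        wav.setframerate(framerate)
        wav.writeframes(frames)
    buffer.seek(0)
    return buffer


def extract_channel(frames, nchannels, index):
    """Return the 16-bit samples of one channel from interleaved frames."""
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return list(samples[index::nchannels])


def read_channel(source, index):
    """Read a WAV file (path or file object) and return one channel."""
    with wave.open(source, "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        return extract_channel(frames, wav.getnchannels(), index)


def rms(samples):
    """Root mean square of a sequence of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    wav_file = make_stereo_wav([100, -100, 100, -100], [0, 0, 0, 0])
    left = read_channel(wav_file, 0)
    print(left, rms(left))
```

The same slicing idea extends to 8-bit or 32-bit samples by changing the `struct` format character and the sample width.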

Link to AudiooPy...

WhakerPy

WhakerPy is an open-source library for creating dynamic HTML content and web-based applications. It lets you create and manipulate HTML with the power of Python:

Easy to learn, consistent, simple syntax;
Flexible and easy usage;
Create HTML pages dynamically;
Can save as static files; and/or
Run locally with its HTTPD server, or WSGI service, and its response "bakery" system.
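
The general idea of building HTML as a tree of Python objects and serializing it to a string can be sketched as follows. This is not WhakerPy's API: the `Node` class and the `build_page` function are hypothetical names used only for this illustration, and the sketch relies solely on the standard library.

```python
from html import escape


class Node:
    """A hypothetical HTML element with attributes, text and children."""

    def __init__(self, tag, text="", **attributes):
        self.tag = tag
        self.text = text
        self.attributes = attributes
        self.children = []

    def append(self, child):
        """Add a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def serialize(self):
        """Return the element and all its descendants as an HTML string."""
        attrs = "".join(
            ' %s="%s"' % (key, escape(str(value), quote=True))
            for key, value in self.attributes.items()
        )
        inner = escape(self.text) + "".join(c.serialize() for c in self.children)
        return "<%s%s>%s</%s>" % (self.tag, attrs, inner, self.tag)


def build_page(title, paragraphs):
    """Create a full page dynamically from Python data."""
    html = Node("html", lang="en")
    head = html.append(Node("head"))
    head.append(Node("title", title))
    body = html.append(Node("body"))
    body.append(Node("h1", title))
    for text in paragraphs:
        body.append(Node("p", text))
    return "<!DOCTYPE html>\n" + html.serialize()


if __name__ == "__main__":
    print(build_page("Hello", ["Generated from Python.", "Ampersands & co."]))
```

The returned string could equally be written to a static file or sent as the body of an HTTP response.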

Link to WhakerPy...

Whakerexa

Whakerexa is a set of CSS frameworks and JavaScript scripts. It aims to make creating accessible web content as simple as possible, and to minimize the use of CSS classes so that the HTML code stays readable.

It was designed to be easily customizable, allowing users to adjust properties such as fonts, colors, borders, etc. Most of the properties are stored in variables, which makes it possible to redefine them and obtain a custom style.

Link to Whakerexa...

See Whakerexa in action

ClammingPy

ClammingPy is an open-source library to convert a Python class or module into Markdown or HTML for documentation purposes. It supports reStructuredText and Epydoc formats. Docstrings are analyzed with flexibility rather than completeness.

ClammingPy generates HTML5 with a high WCAG 2.1 conformity level.
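
The principle of turning a Python class into Markdown documentation can be illustrated with the standard `inspect` module. This is only a sketch of the idea, not ClammingPy's implementation or API: `class_to_markdown` and the `Greeter` example class are hypothetical, and the docstrings are copied as-is rather than parsed as reStructuredText or Epydoc.

```python
import inspect


class Greeter:
    """Say hello to people."""

    def greet(self, name):
        """Return a greeting for the given name."""
        return "Hello, " + name


def class_to_markdown(cls):
    """Return a Markdown description of a class and its public methods."""
    lines = ["## " + cls.__name__, "", inspect.getdoc(cls) or "", ""]
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        lines.append("### " + name + str(inspect.signature(member)))
        lines.append("")
        lines.append(inspect.getdoc(member) or "")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(class_to_markdown(Greeter))
```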

More about ClammingPy...

Deprecated tools

CLIPS Corpus Filtering Toolkit (version 2.6)

XML output example

Download:

Various tools

DistKLCount.awk: Estimation of the Kullback-Leibler distance between two files (word frequencies)
EvalVocab.awk: Estimation of vocabulary quality (see JEP 2006)
txt2sgm.csh: Convert ASCII text (one sentence per line) to SGM (MTEVAL)
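
For readers unfamiliar with the measure estimated by DistKLCount.awk, the Kullback-Leibler divergence between two word-frequency distributions P and Q is the sum over words w of P(w) * log(P(w) / Q(w)). The sketch below is a Python illustration of that formula, not a port of the awk script: the function names are hypothetical, and the additive smoothing (`alpha`) over the joint vocabulary is an assumption made here so that Q(w) is never zero.

```python
import math
from collections import Counter


def word_distribution(text, vocabulary, alpha=1.0):
    """Smoothed relative frequency of each vocabulary word in a text."""
    counts = Counter(text.split())
    total = sum(counts.values()) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def kl_divergence(text_p, text_q, alpha=1.0):
    """Kullback-Leibler divergence D(P || Q) between two texts, in nats."""
    vocabulary = set(text_p.split()) | set(text_q.split())
    p = word_distribution(text_p, vocabulary, alpha)
    q = word_distribution(text_q, vocabulary, alpha)
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocabulary)


if __name__ == "__main__":
    print(kl_divergence("the cat sat on the mat", "the dog sat on the log"))
```

Note that the measure is not symmetric: D(P || Q) generally differs from D(Q || P).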

[For FRENCH ONLY] Software to edit/modify/search/validate annotations

Software to view/modify one or two TextGrid files. It is mainly dedicated to editing transcriptions. It includes features specific to transcriptions in TOE form (enriched orthographic transcription): color display of elisions, particular pronunciations, etc., as well as a tool for diagnosing the TOE syntax.

Download

Screencasts of all the features:

brigitte.bigi[at]cnrs.fr