Support of a new language?
All automatic annotations included in SPPAS are implemented with language-independent algorithms.
This means that adding a new language to SPPAS only consists of adding the linguistic resources
related to the annotation (lexicons, dictionaries, models, sets of rules, etc.).
Linguistic resources can be edited, modified, changed or deleted by any user.
Support for new languages is added step by step, by adding linguistic resources, and
constructing these resources requires collaborating with linguists. So any help is welcome!
Find more details in Chapter 3: each annotation requiring a specific resource has a section "Support of a new language".
[ ERROR ] End TimePoint must be greater than Begin TimePoint
This error message means that the given file contains a degenerate interval, i.e. an interval
whose end time is not greater than its begin time.
The file has to be corrected before using SPPAS.
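To locate such intervals before running SPPAS, a check along the following lines may help. This is only a minimal sketch: it does not use the SPPAS API, the function name find_degenerate_intervals is hypothetical, and it assumes the intervals have already been extracted from the annotated file as (begin, end) pairs in seconds.

```python
def find_degenerate_intervals(intervals):
    """Return (index, begin, end) for each interval whose end is not greater than its begin.

    `intervals` is assumed to be a sequence of (begin, end) pairs in seconds.
    """
    return [(i, begin, end)
            for i, (begin, end) in enumerate(intervals)
            if end <= begin]


# Hypothetical tier content: the 2nd and 4th intervals are degenerate.
intervals = [(0.0, 1.2), (1.2, 1.2), (1.2, 2.5), (3.0, 2.8)]
for index, begin, end in find_degenerate_intervals(intervals):
    print("Interval {:d} is degenerate: begin={:.3f}, end={:.3f}".format(index, begin, end))
```

Each reported interval then has to be fixed by hand in the annotation editor that produced the file.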
Julius failed to time-align
The procedure outcome report contains an error message, and the IPU is not time-aligned.
This can happen for several reasons: Julius is not installed,
the audio signal quality is poor, the orthographic transcription does not really match
the audio signal, there are errors in the phonetization, etc. You can then try to identify the
problem and solve it.
You can also enable the "basic" option: SPPAS will then assign the same duration
to each phoneme of that specific IPU.
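The idea of assigning the same duration to each phoneme of an IPU can be sketched as follows. This is an illustration only, not the SPPAS implementation: the function name equal_duration_alignment and the (phoneme, begin, end) output format are assumptions made for the example.

```python
def equal_duration_alignment(phonemes, ipu_begin, ipu_end):
    """Split the IPU [ipu_begin, ipu_end] into equal parts, one per phoneme.

    Returns a list of (phoneme, begin, end) tuples, with times in seconds.
    """
    if ipu_end <= ipu_begin:
        raise ValueError("The end of the IPU must be greater than its begin.")
    if not phonemes:
        return []

    step = (ipu_end - ipu_begin) / len(phonemes)
    aligned = []
    for i, phoneme in enumerate(phonemes):
        begin = ipu_begin + i * step
        # Use the IPU end for the last phoneme to avoid rounding drift.
        end = ipu_end if i == len(phonemes) - 1 else ipu_begin + (i + 1) * step
        aligned.append((phoneme, begin, end))
    return aligned


# Hypothetical IPU from 1.0 s to 3.0 s with 4 phonemes: 0.5 s each.
for phoneme, begin, end in equal_duration_alignment(["b", "O~", "Z", "u"], 1.0, 3.0):
    print("{:s}: {:.2f} - {:.2f}".format(phoneme, begin, end))
```

Such an alignment is only a fallback: the boundaries do not reflect the audio signal, so the IPU should be checked manually if precise timing matters.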
How long can the audio files processed by SPPAS be?
SPPAS can process audio files of any length, as long as the computer has enough memory.
[WARNING] Unknown word phonetization
When a token is missing from the pronunciation dictionary, SPPAS tries to phonetize it
by analogy with other entries of the dictionary, and reports this warning message.
If the proposed phonetization is correct, you can ignore the warning. If not,
you can edit the dictionary, then run the phonetization again.
See Chapter 3, Section 5 for details and the "Resources Documentation".
Can SPPAS transcribe automatically?
No, and it won't! SPPAS is not an Automatic Speech Recognition (ASR) system.
No existing ASR system is able to produce the high-quality
orthographic transcription which is required for further reliable
analyses!
Orthographic transcription has to be done manually, segmented into IPUs, and it must follow
the convention briefly described in Chapter 3, Section 3 and
detailed here.
SPPAS doesn't create annotated files:
Since SPPAS version 2.3, when I run a Python script, SPPAS doesn't create annotated files.
My input file name is "bcf2e54f-16ea-46d2-a969-38bda8b9265e.wav". When I remove the "-" characters, it works.
If my file name is "1-1-1.wav", it also works. Prior to this version, SPPAS did not behave
this way.
There's no way for the workspace manager to tell whether the character "-" is part
of a filename or a "root-pattern" separator.
I recommend using "_" instead.
In detail: this is not a bug but the consequence of a new feature. Since version 2.3,
annotations of SPPAS are based on the use of "Workspaces". All the advantages of this new
feature are available in the new GUI (requires python3 + wx4). In workspaces, filenames like "oriana1.wav",
"oriana1.TextGrid" or "oriana1-token.TextGrid" all share the same file root, which
is "oriana1", and annotations append a pattern like "-token" to that root.
When annotating, the workspace manager searches for the file root in the given
filename, so:
- for "oriana1-token.wav" the root is "oriana1", because "-token" is identified as a pattern.
- for "bcf2e54f-16ea-46d2-a969-38bda8b9265e.wav" the root is "bcf2e54f-16ea-46d2-a969",
because "-38bda8b9265e" is identified as a pattern. SPPAS then searches for annotated files
like bcf2e54f-16ea-46d2-a969.TextGrid, bcf2e54f-16ea-46d2-a969-token.TextGrid, etc.
- for "1-1-1.wav" the root is "1-1-1", because a pattern cannot be shorter than 2 characters.
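
The behavior in these three cases can be reproduced with the short sketch below. It is not the code of the SPPAS workspace manager: the function name split_root_pattern is hypothetical, and the sketch assumes that the pattern starts at the last "-" of the file name and that the minimum length of 2 characters is counted without the "-" separator.

```python
import os

SEPARATOR = "-"        # root-pattern separator
MIN_PATTERN_LEN = 2    # assumption: length counted without the separator


def split_root_pattern(filename):
    """Return (root, pattern) for a file name; pattern is "" if none is found."""
    base = os.path.splitext(os.path.basename(filename))[0]
    pos = base.rfind(SEPARATOR)
    # A pattern is accepted only if something precedes the separator
    # and enough characters follow it.
    if pos > 0 and len(base) - pos - 1 >= MIN_PATTERN_LEN:
        return base[:pos], base[pos:]
    return base, ""


for name in ("oriana1-token.wav",
             "bcf2e54f-16ea-46d2-a969-38bda8b9265e.wav",
             "1-1-1.wav"):
    root, pattern = split_root_pattern(name)
    print("{:s}: root={!r}, pattern={!r}".format(name, root, pattern))
```

Renaming the second file with "_" instead of "-" leaves no separator in its name, so the whole name without extension is kept as the root.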