SPPAS implements an Application Programming Interface (API), named anndata, to deal with annotated files.
anndata is a free and open source Python library to access and search data in annotated files of the supported formats (XRA, TextGrid, EAF…). It can be used either with Python 2.7 or Python 3.4+. This API is PEP8 and PEP257 compliant, and the internationalization of its messages is implemented (English and French are available in the po directory).
In this chapter, it is assumed that a version of Python is installed and configured. It is also assumed that the Python IDLE is ready-to-use. For more details about Python, see:
The Python Website: http://www.python.org
This chapter first introduces basic programming concepts, then it gradually shows how to write scripts with Python. Readers who are familiar with programming in Python can go directly to the last section, which describes the anndata API and how to use it in Python scripts.
This API can convert file formats like Elan's EAF, Praat's TextGrid and others into a sppasTranscription object, and convert this object into any of these formats. This object allows unified access to linguistic data from a wide range of sources.
This chapter includes exercises. The solution scripts are included in
the package directory *documentation*, folder *scripting_solutions*.
This section includes examples in the Python programming language. You may want to try out some of the examples that come with the description. In order to do this, execute the Python IDLE - available in the Application menu of your operating system - and write the examples after the prompt >>>.
To get information about the IDLE, see the IDLE documentation.
Writing any program consists of writing statements using a programming language. A statement is often known as a line of code; it can be, among others, a comment, an assignment, a condition or a loop.
Lines of code are grouped into blocks. Depending on the programming language, blocks are delimited by brackets, braces, or by indentation.
Each language has its own syntax for writing these lines, and the user has to follow this syntax strictly so that the program can be interpreted. However, there is a lot of freedom in the use of capital letters, whitespace and so on. Recommendations for the Python language are available in PEP8 - the Style Guide for Python Code.
A variable is a name to give to a piece of memory with some information inside. Assignment is then the action of setting a variable to a value. The equal sign (=) is used to assign values to variables.
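For example, the following assignments create a few variables of different kinds (the exact values are illustrative; the variable names match those used in the next paragraphs):

>>> a = 1
>>> b = 1.0
>>> c = 'c'
>>> cc = u'c'
>>> hello = 'Hello world!'
>>> vrai = True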
In the previous example, a, b, c, hello and vrai are variables, and a = 1 is an assignment.
Assignments to variables with Python language can be performed with the following operators:
>>> a = 10 # simple assignment operator
>>> a += 2 # add and assignment operator, so a is 12
>>> a -= 7 # minus and assignment, so a is 5
>>> a *= 20 # multiply and assignment, so a is 100
>>> a /= 10 # divide and assignment, so a is 10 (10.0 under Python 3, where / always returns a float)
>>> a # verify the value of a...
10
Basic operators are used to manipulate variables. The following is the list of operators that can be used with Python, i.e. equal (assignment), plus, minus, multiply, divide:
>>> a = 10
>>> b = 20 # assignment
>>> a + b # addition
>>> a - b # subtraction
>>> a * b # multiplication
>>> a / b # division
Variables have a data type. For example, the assignments a=1 and a=1.0 respectively assign an integer and a real number. In Python, the built-in function type() returns the type of a variable, as in the following:
>>> type(a)
<type 'int'>
>>> type(b)
<type 'float'>
>>> type(c)
<type 'str'>
>>> type(cc)
<type 'unicode'>
>>> type(vrai)
<type 'bool'>
Here is a list of some fundamental data types, and their characteristics:

- int: an integer number, like 1;
- float: a real number, like 1.0;
- str: a string of characters, like 'Hello world!';
- unicode: a string of unicode characters (Python 2 only; in Python 3, str is already unicode);
- bool: a boolean value, either True (=1) or False (=0).

Python assigns data types dynamically. As a consequence, the result of the sum of an int and a float is a float. The next examples illustrate that the type of a variable has to be carefully managed.
>>> a = 10
>>> a += 0.
>>> a
10.0
>>> a += True
>>> a
11.0
>>> a += "a"
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: unsupported operand type(s) for +=: 'float' and 'str'
>>> a = "a"
>>> a *= 5
>>> a
'aaaaa'
The type of a variable can be explicitly changed. This is called a cast:
>>> a = 10
>>> b = 2
>>> a/b # 5 under Python 2 (integer division); 5.0 under Python 3
5
>>> float(a) / float(b)
5.0
>>> a = 1
>>> b = 1
>>> str(a) + str(b)
'11'
Complex data types are often used to store variables sharing the same properties, like a list of numbers, and so on. Common types in languages are lists/arrays and dictionaries. The following is the assignment of a list named fruits, then the assignment of a sub-part of the list to the to_buy list:
>>> fruits = ['apples', 'tomatoes', 'pears', 'bananas', 'lemons']
>>> to_buy = fruits[1:3]
>>> to_buy
['tomatoes', 'pears']
Conditions aim to test whether a statement is True or False. The statement of the condition can include a variable, or be a variable itself, and is written with operators. The following shows examples of conditions/comparisons in Python. Notice that comparing variables of different data types is possible (but not recommended!).
>>> var = 100
>>> if var == 100:
... print("Value of expression is 100.")
...
Value of expression is 100.
>>> if var == "100":
... print("This message won't be printed out.")
...
Conditions can be expressed in a more complex way like:
>>> if a == b:
...     print('a and b are equal')
... elif a > b:
...     print('a is greater than b')
... else:
...     print('b is greater than a')
The simple operators for comparisons are summarized in the next examples:
>>> a == b # check if equals
>>> a != b # check if different
>>> a > b # check if a is greater than b
>>> a >= b # check if a is greater or equal to b
>>> a < b # check if a is lesser than b
>>> a <= b # check if a is lesser or equal to b
It is also possible to use the following operators:

- and: the logical AND operator. If both operands are true, then the condition becomes true.
- or: the logical OR operator. If at least one of the two operands is true, then the condition becomes true.
- not: the logical NOT operator. It reverses the logical state of its operand: if a condition is true, then not makes it false.
- in: evaluates to true if it finds a variable in the specified sequence, and false otherwise.

>>> if a == "apples" and b == "pears":
...     print("You need to buy fruits.")
>>> if a == "apples" or b == "apples":
...     print("You already have bought apples.")
>>> if "tomatoes" not in to_buy:
...     print("You don't have to buy tomatoes.")
The for loop statement iterates over the items of any sequence. The next Python lines of code print the items of a list on the screen:
>>> to_buy = ['fruits', 'viande', 'poisson', 'oeufs']
>>> for item in to_buy:
...     print(item)
...
fruits
viande
poisson
oeufs
A while loop statement repeatedly executes a target statement as long as a given condition returns True. The following example prints exactly the same result as the previous one:
>>> to_buy = ['fruits', 'viande', 'poisson', 'oeufs']
>>> i = 0
>>> while i < len(to_buy):
...     print(to_buy[i])
...     i += 1
...
fruits
viande
poisson
oeufs
A dictionary is a very useful data type. It consists of pairs of keys and their corresponding values. fruits['apples'] is a way to get the value - i.e. 3 - of the 'apples' key. However, an error is raised if the key is unknown, like fruits['bananas']. Alternatively, the get function can be used, like fruits.get("bananas", 0), which returns 0 instead of raising an error.
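For instance, such a dictionary can be created as follows (the fruit names and counts are illustrative):

>>> fruits = {'apples': 3, 'tomatoes': 1}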
The next example shows how to use a simple dictionary:
>>> for key in fruits:
...     value = fruits.get(key, 0)
...     if value < 3:
...         print("You have to buy new {:s}.".format(key))
...
You have to buy new tomatoes.
To learn more about data structures and how to manage them, see the Python documentation.
This section describes how to create simple Python lines of code in separated files commonly called scripts, and how to run them. Practical exercises, appropriate to the content of each part, are proposed along the way, and test exercises are suggested at the end of the section.
To practice, you first have to create a new folder on your computer - on your Desktop for example - with the name pythonscripts for example, and to execute the Python IDLE.
For an advanced use of Python, the installation of a dedicated IDE is very useful. SPPAS is developed with PyCharm: see the PyCharm Help webpage.
Comments are not required for the program to work. But comments are necessary! Comments are expected to be appropriate, useful, relevant, adequate and always reasonable.
# This script is doing this and that.
# It is under the terms of a license.
# and I can continue to write what I want after the # symbol
# except that it's not the right way to tell the story of my life
The documentation of a program complements the comments. They do not share the same goal: comments are used in all kinds of programs, whereas documentation is added to comments for larger programs and/or projects. Documentation is automatically extracted and formatted thanks to dedicated tools. Documentation is required for sharing the program. See the Docstring Conventions for details. Documentation must follow a convention, like for example the markup language reST - reStructuredText. Both conventions are used in the SPPAS API, programs and scripts.
In the IDLE, create a new empty file either by clicking on the File menu, then New File, or with the shortcut CTRL+N.
Copy the following line of code in this newly created file:
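print("Hello world!")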
Then, save the file in the pythonscripts folder. By convention, Python source files end with a .py extension, so the name 01_helloworld.py could be fine.
To execute the program, you can do one of the following:

- open the Run menu, then click on Run Module;
- use the keyboard shortcut F5.

The expected output is as follows:
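Hello world!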
A better practice while writing scripts is to describe by whom, what and why this script was done. A nifty trick is to create a skeleton for any future script that will be written. Such a ready-to-use script is available in the SPPAS package with the name skeleton.py.
Blocks in Python are created by the indentation. Tabs and spaces can be used, but using spaces is recommended.
A function does something: it starts with its definition, which is followed by its lines of code in a block.
Here is an example of function:
def print_vowels():
    """ Print the list of French vowels on the screen. """
    vowels = ['a', 'e', 'E', 'i', 'o', 'u', 'y', '@', '2', '9', 'a~', 'o~', 'U~']
    print("List of French vowels:")
    for v in vowels:
        print(v)
What is the print_vowels() function doing? This function declares a list named vowels. Each item of the list is a string representing a vowel of French, encoded in X-SAMPA. Of course, this list can be overridden with any other set of strings. The next line prints a message. Then, a loop prints each item of the list.
At this stage, if a script with this function is executed, it will do… nothing! Actually, the function is created, but it must be invoked in the main part of the program to be interpreted by Python. The main is as follows:
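if __name__ == '__main__':
    # this block is only executed when the script is run directly
    print_vowels()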
Practice: create a copy of the file skeleton.py, then make a function to print Hello World! (solution: ex01_hello_world.py).
Practice: Create a function to print plosives and call it in the main function (solution: ex02_functions.py).
One can also create a function to print glides, another one to print affricates, and so on. Hum… this sounds a little bit tedious!
Rather than writing the same lines of code with only a minor difference over and over, we can declare parameters to the function to make it more generic. Notice that the number of parameters of a function is not limited!
In the example, we can replace the print_vowels() function and the print_plosives() function by a single function print_list(mylist), where mylist can be any list containing strings or characters. If the list contains variables of other types, like numerical values, they must be converted to strings to be printed out. This can result in the following function:
def print_list(mylist, message=" -"):
    """ Print a list on the screen.
    :param mylist: (list) the list to print
    :param message: (string) an optional message to print before each element
    """
    for item in mylist:
        print("{:s} {:s}".format(message, item))
Functions are used to do a specific job, and the result of the function can be captured by the program. In the following example, the function returns a boolean value, i.e. True if the given string contains no character.
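A minimal sketch of such a function could be (the name is_empty is illustrative):

def is_empty(mystr):
    """ Return True if the given string contains no character. """
    # strip() removes leading/trailing whitespace, so a string made of spaces is also considered empty
    return len(mystr.strip()) == 0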
Practice: Add this function in a new script and try to print various lists (solution: ex03_functions.py)
Now, we'll try to get data from a file. Create a new empty file with the following lines - and add as many lines as you want - then save it with the name phonemes.csv using UTF-8 encoding:
occlusives ; b ; b
occlusives ; d ; d
fricatives ; f ; f
liquids ; l ; l
nasals ; m ; m
nasals ; n ; n
occlusives ; p ; p
glides ; w ; w
vowels ; a ; a
vowels ; e ; e
The following statements are typical statements used to read the content of a file. The first parameter of the open function is the name of the file, including the path (relative or absolute); and the second argument is the opening mode (r is the default value, used for reading).
Practice: Add these lines of code in a new script and try it (solution: ex04_reading_simple.py)
fp = open("phonemes.csv", 'r')
for line in fp:
# do something with the line stored in variable l
print(line.strip())
f.close()
The following is a solution with the ability to deal with various file encodings, thanks to the codecs library:
import codecs

def read_file(filename):
    """ Get the content of a file.
    :param filename: (string) Name of the file to read, including path.
    :returns: List of lines
    """
    with codecs.open(filename, 'r', encoding="utf8") as fp:
        return fp.readlines()
In the previous code, the codecs.open function takes 3 parameters: the name of the file, the opening mode, and the encoding. The readlines() function reads each line of the file and stores it in a list.
Practice: Write a script to print the content of a file (solution: ex05_reading_file.py)
Notice that the Python os module provides useful methods to perform file-processing operations, such as renaming and deleting. See the Python documentation for details: https://docs.python.org/3.8/
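For example, renaming then deleting a file could be done as follows (the file names are illustrative):

import os

os.rename("phonemes.csv", "phonemes-old.csv")  # rename a file
os.remove("phonemes-old.csv")                  # delete a file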
Writing a file requires opening it in a writing mode:

- 'w' is the mode to write data; it will erase any existing file;
- 'a' is the mode to append data to an existing file.
A file can be opened in an encoding and saved in another one. This could be useful to write a script to convert the encoding of a set of files. The following could help to create such script:
# Converting the encoding of a file:
import codecs

file_stream = codecs.open(file_location, 'r', file_encoding)
file_output = codecs.open(file_location+'utf8', 'w', 'utf-8')
for line in file_stream:
    file_output.write(line)
file_stream.close()
file_output.close()
Here is a list of web sites with tutorials, from the easiest to the most complete:
Exercise 1: How many vowels are in a list of phonemes? (solution: ex06_list.py)
Exercise 2: Write a X-SAMPA to IPA converter. (solution: ex07_dict.py)
Exercise 3: Compare 2 sets of data using NLP techniques (Zipf law, Tf.Idf) (solution: ex08_counter.py)
We are now going to write Python scripts using the anndata API included in SPPAS. This API is useful to read, write and manipulate files annotated with various annotation tools like SPPAS, Praat or Elan.
First of all, it is important to understand the data structure included into the API to be able to use it efficiently.
In the Linguistics field, multimodal annotations contain information ranging from general linguistic information to domain-specific information. Some are annotated with automatic tools, and some are manually annotated. In annotation tools, annotated data are mainly represented in the form of tiers or tracks of annotations. Tiers are mostly series of intervals defined by:

- a time point to represent the beginning of the interval;
- a time point to represent the end of the interval;
- a label to represent the annotated information.
Of course, depending on the annotation tool, the internal data representation and the file formats are different. In Praat, tiers can be made of a single point in time (such tiers are named PointTiers) or of two points (IntervalTiers). In Elan, points are not supported; but contrary to Praat, unlabelled intervals are neither represented nor saved.
The anndata API was designed to be able to manipulate all data in the same way, regardless of the file type. It makes it possible to merge data and annotations from a wide range of heterogeneous data sources.
anndata API class diagram
After opening/loading a file, its content is stored in a sppasTranscription object. A sppasTranscription has a name, and a list of sppasTier objects. Tiers can't share the same name, the list of tiers can be empty, and a hierarchy between tiers can be defined. Actually, subdivision relations can be established between tiers. For example, a tier with phonemes is a subdivision reference for syllables, or for tokens; and tokens are a subdivision reference for the orthographic transcription in IPUs. Such subdivisions can be of two categories: alignment or association.
A sppasTier object has a name, and a list of sppasAnnotation objects. It can also be associated with a controlled vocabulary, or a media.
All these objects contain a set of metadata.
An annotation is made of 2 objects:

- a sppasLocation object,
- a list of sppasLabel objects.

A sppasLabel object represents the content of the annotation. It is a list of sppasTag, each one associated with a score.
A sppasLocation represents where this annotation occurs in the media. A sppasLocation is made of a list of localizations, each one associated with a score. A localization is one of:

- a sppasPoint object; or
- a sppasInterval object, which is made of 2 sppasPoint objects; or
- a sppasDisjoint object, which is a list of sppasInterval.

Each annotation holds a series of 0..N labels. A label is also an object made of a list of sppasTag, each one with a score. A sppasTag is mainly represented in the form of a string, freely written by the annotator, but it can also be a boolean (True/False), an integer, a floating point number, a point with (x, y) coordinates and an optional radius, or a rectangle with (x, y, w, h) coordinates and an optional radius value.
In the anndata API, a sppasPoint is considered as an imprecise value. It is possible to characterize a point in a space immediately allowing its vagueness by using:

- a midpoint value;
- a radius value.
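As a sketch, and assuming a constructor of the form sppasPoint(midpoint, radius) - as in the other snippets of this chapter, the import of the class from the anndata API is not shown - a point located at 1.5 seconds with a vagueness of 10 ms could be created like this:

# assumption: sppasPoint(midpoint, radius), i.e. a time anywhere in [1.49;1.51] seconds
p = sppasPoint(1.5, 0.01)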
The screenshot below shows an example of multimodal annotated data, imported from 3 different annotation tools. Each sppasPoint is represented by a vertical dark-blue line with a gradient color to refer to the radius value.
In the screenshot the following radius values were assigned:
To practice, you first have to create a new folder on your computer - on your Desktop for example - with the name sppasscripts for example, and to execute the Python IDLE.
Open a File Explorer window and go to the SPPAS folder location. Then, copy the sppas directory into the newly created sppasscripts folder. Then, go to the solution directory and copy/paste the files skeleton-sppas.py and F_F_B003-P9-merge.TextGrid into your sppasscripts folder. Then, open the skeleton script with the Python IDLE and execute it. It will do… nothing! But now, you are ready to do something with the API of SPPAS!
When using the API, if something forbidden is attempted, the object will raise an Exception which means the program will stop.
We are going to open/read an annotated file of any format (XRA, TextGrid, EAF, …) and store its content into a sppasTranscription object instance. Then, the object will be saved into another file.
# Create a parser object then parse the input file.
parser = sppasRW(input_filename)
trs = parser.read()
# Save the sppasTranscription object into a file.
parser.set_filename(output_filename)
parser.write(trs)
Only these few lines of code are required to convert a file from one format to another! The appropriate parsing system is selected from the extension of the file name.
To get the list of accepted extensions that the API can read, just use parser.extensions_in(). The list of accepted extensions that the API can write is given by parser.extensions_out().
Practice: Write a script to convert a TextGrid file into CSV (solution: ex10_read_write.py)
The most useful functions to manage the tiers of a sppasTranscription object are:

- create_tier() to create an empty tier and to append it;
- append(tier) to add a tier into the sppasTranscription;
- pop(index) to remove a tier from the sppasTranscription;
- find(name, case_sensitive=True) to find a tier from its name.

Below is a piece of code to browse through the list of tiers:
for tier in trs:
    # below, do something with the tier:
    print(tier.get_name())
# Search for a specific tier,
# None is returned if not found.
phons_tier = trs.find("PhonAlign")
Practice: Write a script to select a set of tiers of a file and save them into a new file (solution: ex11_transcription.py).
A tier is made of a name, a list of annotations, and optionally a controlled vocabulary and a media. To get the name of a tier, or to set a new one, the easiest way is to use tier.get_name() and tier.set_name(). The following block of code shows how to get a tier and change its name.
# Get the first tier, with index=0
tier = trs[0]
print(tier.get_name())
tier.set_name("NewName")
print(tier.get_name())
The most useful functions to manage annotations of a sppasTier object are:

- create_annotation(location, labels) to create and add a new annotation (see the sketch below);
- append(annotation) to add a new annotation at the end of the list;
- add(annotation) to add a new annotation;
- pop(index) to delete the annotation of a given index;
- remove(begin, end) to remove annotations of a given localization range;
- is_disjoint(), is_interval(), is_point() to know the type of location;
- is_string(), is_int(), is_float(), is_bool(), is_fuzzypoint(), is_fuzzyrect() to know the type of labels;
- find(begin, end) to get annotations in a given localization range;
- get_first_point(), get_last_point() to get respectively the point with the lowest or the highest localization;
- set_radius(radius) to fix the same vagueness value to each localization point.

Practice: Write a script to open an annotated file and print information about its tiers (solution: ex12_tiers_info.py)
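The sketch below illustrates how create_annotation() could be used to add a new interval annotation to a tier; it assumes the sppasLocation, sppasInterval, sppasPoint, sppasLabel and sppasTag objects described above, and the time values and tag content are arbitrary:

# an interval from 1.5 to 2.3 seconds (arbitrary values)
location = sppasLocation(sppasInterval(sppasPoint(1.5), sppasPoint(2.3)))
# a label made of a single string tag
label = sppasLabel(sppasTag("dog"))
# create the annotation and add it to the tier
ann = tier.create_annotation(location, [label])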
An annotation is a container for a location and optionally a list of labels. It can be used to manage the labels and tags with the following methods:
- is_labelled() returns True if at least a sppasTag exists and is not None;
- append_label(label) to add a label at the end of the list of labels;
- get_labels_best_tag() returns a list with the best tag of each label;
- add_tag(tag, score, label_index) to add a tag into a label;
- remove_tag(tag, label_index) to remove a tag from a label.

An annotation object can also be copied with the method copy(). The location, the labels and the metadata are all copied, and the id of the returned annotation is then the same. It is expected that each annotation of a tier has its own id, but the API doesn't check this.
Practice: Write a script to print information about annotations of a tier (solution: ex13_tiers_info.py)
This section focuses on the problem of searching and retrieving data from annotated corpora.
The filter implementation can only be used together with the sppasTier() class. The idea is that each sppasTier() can contain a set of filters, each of which reduces the full list of annotations to a subset.
The SPPAS filtering system proposes two main axes to filter such data:

- filtering on the content of the annotations, i.e. on their tags, duration or localization (single filters);
- filtering on the time-relations between the annotations of two tiers (relation filters).
A set of filters can be created and combined to get the expected result. To be able to apply filters to a tier, some data must be loaded first: a new sppasTranscription() has to be created when loading a file. Then, the tier(s) to apply filters on must be fixed. Finally, if the input file was NOT an XRA, it is widely recommended to fix a radius value before using a relation filter.
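Putting these steps together, the preparation before filtering could look like the following sketch (the file name, the tier name and the radius value are illustrative):

# load the file and get the tier to be filtered
parser = sppasRW("F_F_B003-P9-merge.TextGrid")
trs = parser.read()
tier = trs.find("TokensAlign")
# the input is not an XRA file: fix a vagueness value on the localization points
tier.set_radius(0.005)
# the tier is now ready to be filtered
f = sppasFilter(tier)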
When a filter is applied, it returns an instance of sppasAnnSet, which is the set of annotations matching the request. It also contains a value which is the list of functions that are truly matching for each annotation. Finally, sppasAnnSet objects can be combined with the operators | and &, and exported to a sppasTier instance.
The following matching names are proposed to select annotations:
- exact: means that a tag is valid if it strictly corresponds to the expected pattern;
- contains: means that a tag is valid if it contains the expected pattern;
- startswith: means that a tag is valid if it starts with the expected pattern;
- endswith: means that a tag is valid if it ends with the expected pattern;
- regexp: to define regular expressions.
All these matches can be reversed, to represent does not exactly match, does not contain, does not start with or does not end with. Moreover, they can be made case-insensitive by adding i at the beginning, like iexact, etc. The full list of tag matching functions is obtained by invoking sppasTagCompare().get_function_names().
The next examples illustrate how to work with such a pattern matching filter. In the example below, ann_set_a is the set of phonemes with the exact label a, whereas ann_set_aA is the set of phonemes matching a with a case-insensitive comparison (iexact means insensitive-exact), so that both a and A match.
tier = trs.find("PhonAlign")
f = sppasFilter(tier)
ann_set_a = f.tag(exact='a')
ann_set_aA = f.tag(iexact='a')
The next example illustrates how to write a complex request. Notice that r1 is equal to r2, but getting r1 is faster:
tier = trs.find("TokensAlign")
f = sppasFilter(tier)
r1 = f.tag(startswith="pa", not_endswith='a', logic_bool="and")
r2 = f.tag(startswith="pa") & f.tag(not_endswith='a')
With this notation in hand, it is easy to formulate queries like, for example: extract words starting by ch or sh:
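# the | operator combines the two resulting annotation sets (union)
result = f.tag(startswith="ch") | f.tag(startswith="sh")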
Practice: Write a script to extract phonemes /a/, then phonemes /a/, /e/, /A/ and /E/ (solution: ex15_annotation_label_filter.py).
The following matching names are proposed to select annotations:
- lt: means that the duration of the annotation is lower than the given one;
- le: means that the duration of the annotation is lower than or equal to the given one;
- gt: means that the duration of the annotation is greater than the given one;
- ge: means that the duration of the annotation is greater than or equal to the given one;
- eq: means that the duration of the annotation is equal to the given one;
- ne: means that the duration of the annotation is not equal to the given one.
The full list of duration matching functions is obtained by invoking sppasDurationCompare().get_function_names().
The next example shows how to get phonemes lasting between 30 ms and 70 ms. Notice that r1 and r2 are equal!
tier = trs.find("PhonAlign")
f = sppasFilter(tier)
r1 = f.dur(ge=0.03) & f.dur(le=0.07)
r2 = f.dur(ge=0.03, le=0.07, logic_bool="and")
Practice: Extract phonemes a or e lasting more than 100 ms (solution: ex16_annotation_dur_filter.py).
The following matching names are proposed to select annotations from their localization:

- rangefrom: to fix the begin time value of the range;
- rangeto: to fix the end time value of the range.
The next example allows to extract phonemes a within the first 5 seconds:
tier = trs.find("PhonAlign")
f = sppasFilter(tier)
result = f.tag(exact='a') & f.loc(rangefrom=0., rangeto=5., logic_bool="and")
Relations between annotations are crucial if we want to extract multimodal data. The aim here is to select intervals of a tier depending on what is represented in another tier.
James Allen, in 1983, proposed an algebraic framework named Interval Algebra (IA) for qualitative reasoning with time intervals, where the binary relationship between a pair of intervals is represented by a subset of 13 atomic relations. These relations are:

- distinct, because no pair of definite intervals can be related by more than one of the relationships;
- exhaustive, because any pair of definite intervals is described by one of the relations;
- qualitative (rather than quantitative), because no numeric time spans are considered.

These relations and the operations on them form Allen's Interval Algebra.
Pujari, Kumari and Sattar proposed INDU in 1999: an Interval & Duration network. They extended the IA to model qualitative information about intervals and durations in a single binary constraint network. Duration relations are: greater, lower and equal. INDU comprises 25 basic relations between a pair of intervals.
anndata implements the 13 Allen interval relations: before, after, meets, met by, overlaps, overlapped by, starts, started by, finishes, finished by, contains, during and equals; and it also contains the relations proposed in the INDU model. The full list of matching functions is obtained by invoking sppasIntervalCompare().get_function_names().
Moreover, in the implementation of anndata, some functions accept options:

- before and after accept a max_delay value;
- overlaps and overlappedby accept an overlap_min value and a boolean percent which defines whether the value is absolute or is a percentage.

The next example returns monosyllabic tokens and tokens that are overlapping a syllable (only if the overlap lasts more than 40 ms):
tier = trs.find("TokensAlign")
other_tier = trs.find("Syllables")
f = sppasFilter(tier)
f.rel(other_tier, "equals", "overlaps", "overlappedby", min_overlap=0.04)
Below is another example of implementing a request. Which syllables stretch across 2 words?
# Get tiers from a sppasTranscription object
tier_syll = trs.find("Syllables")
tier_toks = trs.find("TokensAlign")
f = sppasFilter(tier_syll)
# Apply the filter with the relation function
ann_set = f.rel(tier_toks, "overlaps", "overlappedby")
# To convert filtered data into a tier:
tier = ann_set.to_tier("SyllStretch")
Practice 1: Create a script to get tokens followed by a silence. (solution: ex17_annotations_relation_filter1.py).
Practice 2: Create a script to get tokens preceded by OR followed by a silence. (solution: ex17_annotations_relation_filter2.py).
Practice 3: Create a script to get tokens preceded by AND followed by a silence. (solution: ex17_annotations_relation_filter3.py).
In addition to anndata, SPPAS contains several other APIs. They are all free and open source Python libraries, with documentation and a set of tests.
Among others: