anndata package¶
Subpackages¶
- anndata.aio package
- Submodules
- anndata.aio.aioutils module
- anndata.aio.annotationpro module
- anndata.aio.anvil module
- anndata.aio.audacity module
- anndata.aio.basetrsio module
- anndata.aio.elan module
- anndata.aio.htk module
- anndata.aio.phonedit module
- anndata.aio.praat module
- anndata.aio.readwrite module
- anndata.aio.sclite module
- anndata.aio.subtitle module
- anndata.aio.table module
- anndata.aio.text module
- anndata.aio.transcriber module
- anndata.aio.xra module
- anndata.aio.xtrans module
- Module contents
- anndata.ann package
- Subpackages
- anndata.ann.annlabel package
- anndata.ann.annlocation package
- Submodules
- anndata.ann.annlocation.disjoint module
- anndata.ann.annlocation.duration module
- anndata.ann.annlocation.durationcompare module
- anndata.ann.annlocation.interval module
- anndata.ann.annlocation.intervalcompare module
- anndata.ann.annlocation.localization module
- anndata.ann.annlocation.localizationcompare module
- anndata.ann.annlocation.location module
- anndata.ann.annlocation.point module
- Module contents
- Submodules
- anndata.ann.annotation module
- anndata.ann.annset module
- Module contents
- Subpackages
Submodules¶
anndata.anndataexc module¶
- filename
sppas.src.anndata.anndataexc.py
- author
Brigitte Bigi
- contact
- summary
Exceptions for the package anndata.
- exception anndata.anndataexc.AioEmptyTierError(file_format, tier_name)[source]¶
Bases:
OSError
:ERROR 1525:.
The file format {!s:s} does not support to save empty tiers: {:s}.
- exception anndata.anndataexc.AioEncodingError(filename, error_msg, encoding='utf-8')[source]¶
Bases:
UnicodeDecodeError
:ERROR 1500:.
The file {filename} contains non {encoding} characters: {error}.
- exception anndata.anndataexc.AioError(filename)[source]¶
Bases:
OSError
:ERROR 1400:.
No such file: ‘{!s:s}’.
- exception anndata.anndataexc.AioFormatError(line)[source]¶
Bases:
OSError
:ERROR 1521:.
Unexpected format about ‘{!s:s}’.
- exception anndata.anndataexc.AioLineFormatError(number, line)[source]¶
Bases:
OSError
:ERROR 1520:.
Unexpected format string at line {:d}: ‘{!s:s}’.
- exception anndata.anndataexc.AioLocationTypeError(file_format, location_type)[source]¶
Bases:
TypeError
:ERROR 1530:.
The file format {!s:s} does not support tiers with {:s}.
- exception anndata.anndataexc.AioMultiTiersError(file_format)[source]¶
Bases:
OSError
:ERROR 1510:.
The file format {!s:s} does not support multi-tiers.
- exception anndata.anndataexc.AioNoTiersError(file_format)[source]¶
Bases:
OSError
:ERROR 1515:.
The file format {!s:s} does not support to save no tiers.
- exception anndata.anndataexc.AnnDataEqError(v1, v2)[source]¶
Bases:
Exception
:ERROR 1010:.
Values are expected to be equals but are {:s!s} and {:s!s}.
- exception anndata.anndataexc.AnnDataEqTypeError(obj, obj_ref)[source]¶
Bases:
TypeError
:ERROR 1105:.
{!s:s} is not of the same type as {!s:s}.
- exception anndata.anndataexc.AnnDataError[source]¶
Bases:
Exception
:ERROR 1000:.
No annotated data file is defined.
- exception anndata.anndataexc.AnnDataIndexError(index)[source]¶
Bases:
IndexError
:ERROR 1200:.
Invalid index value {:d}.
- exception anndata.anndataexc.AnnDataKeyError(data_name, value)[source]¶
Bases:
KeyError
:ERROR 1250:.
Invalid key ‘{!s:s}’ for data ‘{!s:s}’.
- exception anndata.anndataexc.AnnDataNegValueError(value)[source]¶
Bases:
ValueError
:ERROR 1310:.
Expected a positive value. Got ‘{:f}’.
- exception anndata.anndataexc.AnnDataTypeError(rtype, expected)[source]¶
Bases:
TypeError
:ERROR 1100:.
{!s:s} is not of the expected type ‘{:s}’.
- exception anndata.anndataexc.AnnDataValueError(data_name, value)[source]¶
Bases:
ValueError
:ERROR 1300:.
Invalid value ‘{!s:s}’ for ‘{!s:s}’.
- exception anndata.anndataexc.AnnUnkTypeError(rtype)[source]¶
Bases:
TypeError
:ERROR 1050:.
{!s:s} is not a valid type.
- exception anndata.anndataexc.CtrlVocabContainsError(tag)[source]¶
Bases:
ValueError
:ERROR 1130:.
{:s} is not part of the controlled vocabulary.
- exception anndata.anndataexc.CtrlVocabSetTierError(vocab_name, tier_name)[source]¶
Bases:
ValueError
:ERROR 1132:.
The controlled vocabulary {:s} can’t be associated to the tier {:s}.
- exception anndata.anndataexc.HierarchyAlignmentError(parent_tier_name, child_tier_name)[source]¶
Bases:
ValueError
:ERROR 1170:.
Can’t create a time alignment between tiers: ‘{:s}’ is not a superset of ‘{:s}’.
- exception anndata.anndataexc.HierarchyAncestorTierError(child_tier_name, parent_tier_name)[source]¶
Bases:
ValueError
:ERROR 1178:.
The tier can’t be added into the hierarchy: ‘{:s}’ is an ancestor of ‘{:s}’.
- exception anndata.anndataexc.HierarchyAssociationError(parent_tier_name, child_tier_name)[source]¶
Bases:
ValueError
:ERROR 1172:.
Can’t create a time association between tiers: ‘{:s}’ and ‘{:s}’ are not supersets of each other.
- exception anndata.anndataexc.HierarchyChildTierError(tier_name)[source]¶
Bases:
ValueError
:ERROR 1176:.
The tier ‘{:s}’ can’t be added into the hierarchy: a tier can’t be its own child.
- exception anndata.anndataexc.HierarchyParentTierError(child_tier_name, parent_tier_name, link_type)[source]¶
Bases:
ValueError
:ERROR 1174:.
The tier can’t be added into the hierarchy: ‘{:s}’ has already a link of type {:s} with its parent tier ‘{:s}’.
- exception anndata.anndataexc.IntervalBoundsError(begin, end)[source]¶
Bases:
ValueError
:ERROR 1120:.
The begin must be strictly lesser than the end in an interval. Got: [{:s};{:s}].
- exception anndata.anndataexc.TagValueError(tag_str)[source]¶
Bases:
ValueError
:ERROR 1190:.
{!s:s} is not a valid tag.
- exception anndata.anndataexc.TierAddError(index)[source]¶
Bases:
ValueError
:ERROR 1142:.
Can’t add annotation. An annotation with the same location is already in the tier at index {:d}.
- exception anndata.anndataexc.TierAppendError(cur_end, ann_end)[source]¶
Bases:
ValueError
:ERROR 1140:.
Can’t append annotation. Current end {!s:s} is highest than the given one {!s:s}.
- exception anndata.anndataexc.TierHierarchyError(name)[source]¶
Bases:
ValueError
:ERROR 1144:.
Attempt a modification in tier ‘{:s}’ that invalidates its hierarchy.
- exception anndata.anndataexc.TrsAddError(tier_name, transcription_name)[source]¶
Bases:
ValueError
:ERROR 1150:.
Can’t add: ‘{:s}’ is already in ‘{:s}’.
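Each exception listed above pairs a numeric error code with a message template. The pattern can be sketched as follows. The class name `IntervalBoundsErrorSketch`, the helper `check_bounds` and the hard-coded template are assumptions made for illustration; the package itself may build its messages differently.

```python
class IntervalBoundsErrorSketch(ValueError):
    """Hypothetical stand-in: an error carrying a code and a formatted message."""

    CODE = 1120  # error number taken from the listing above
    TEMPLATE = ("The begin must be strictly lesser than the end "
                "in an interval. Got: [{:s};{:s}].")

    def __init__(self, begin, end):
        # Convert to str first: the '{:s}' format spec only accepts strings.
        self.parameter = ":ERROR {:d}: ".format(self.CODE) + \
            self.TEMPLATE.format(str(begin), str(end))
        super().__init__(self.parameter)

    def __str__(self):
        return self.parameter


def check_bounds(begin, end):
    """Hypothetical helper: raise the sketch error unless begin < end."""
    if begin >= end:
        raise IntervalBoundsErrorSketch(begin, end)
    return True
```

Because the sketch derives from ValueError, callers can catch either the specific class or the built-in base, which mirrors the "Bases" entries documented for each exception.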
anndata.ctrlvocab module¶
- filename
sppas.src.anndata.ctrlvocab.py
- author
Brigitte Bigi
- contact
- summary
Represent a controlled vocabulary.
- class anndata.ctrlvocab.sppasCtrlVocab(name, description='')[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a controlled vocabulary.
A controlled Vocabulary is a set of tags. It is used to restrict the use of tags in a label: only the accepted tags can be set to a label.
A controlled vocabulary is made of an identifier name, a description and a list of pairs tag/description.
- __init__(name, description='')[source]¶
Create a new sppasCtrlVocab instance.
- Parameters
name – (str) Identifier name of the controlled vocabulary
description – (str)
- add(tag, description='')[source]¶
Add a tag to the controlled vocab.
- Parameters
tag – (sppasTag): the tag to add.
description – (str)
- Returns
Boolean
- contains(tag)[source]¶
Test if a tag is in the controlled vocabulary.
Note: the test compares the data content of the tag, not the identity of the instance.
- Parameters
tag – (sppasTag) the tag to check.
- Returns
Boolean
- get_tag_description(tag)[source]¶
Return the unicode string of the description of an entry.
- Parameters
tag – (sppasTag) the tag to get the description.
- Returns
(str)
- remove(tag)[source]¶
Remove a tag from the controlled vocab.
- Parameters
tag – (sppasTag) the tag to remove.
- Returns
Boolean
- set_description(description='')[source]¶
Set the description of the controlled vocabulary.
- Parameters
description – (str)
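The behaviour described for add(), contains() and remove() can be illustrated with a small dictionary-backed class. This is a sketch only: `CtrlVocabSketch` is a hypothetical name, tags are plain strings rather than sppasTag instances, and the meaning given to the returned Boolean (False when nothing changed) is an assumption.

```python
class CtrlVocabSketch:
    """Hypothetical, dictionary-backed controlled vocabulary (not sppasCtrlVocab)."""

    def __init__(self, name, description=""):
        self.name = name
        self.description = description
        self._tags = {}  # tag content -> description of the tag

    def add(self, tag, description=""):
        """Add a tag; return False if it is already in the vocabulary."""
        if tag in self._tags:
            return False
        self._tags[tag] = description
        return True

    def contains(self, tag):
        """Compare the tag content, not the object identity."""
        return tag in self._tags

    def get_tag_description(self, tag):
        """Return the description stored with a tag ('' if unknown)."""
        return self._tags.get(tag, "")

    def remove(self, tag):
        """Remove a tag; return False if it was not in the vocabulary."""
        if tag not in self._tags:
            return False
        del self._tags[tag]
        return True
```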
anndata.hierarchy module¶
- filename
sppas.src.anndata.hierarchy.py
- author
Brigitte Bigi
- contact
- summary
Represent a hierarchy, i.e. constraints among tiers.
- class anndata.hierarchy.sppasHierarchy[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a hierarchy between tiers.
Two types of hierarchy are considered:
TimeAssociation: the points of a child tier are all equal to the points of a reference tier, for example:
parent: Words | l’ | âne | est | là |
child: Lemmas | le | âne | être | là |
TimeAlignment: the points of a child tier are all included in the set of points of a reference tier, for example:
parent: Phonemes | l | a | n | e | l | a |
child: Words | l’ | âne | est | là |
parent: Phonemes | l | a | n | e | l | a |
child: Syllables | l.a | n.e | l.a |
In that example, notice that there’s no hierarchy link between “Tokens” and “Syllables” and notice that “Phonemes” is the grand-parent of “Lemmas”.
And the following obvious rules are applied:
A child can have ONLY ONE parent!
A parent can have as many children as wanted.
A hierarchy is a tree, not a graph.
Still to do: handle a time association that is not fully completed:
parent: Tokens | l’ | âne | euh | euh | est | là | @ |
child: Lemmas | le | âne | | être | là |
- add_link(link_type, parent_tier, child_tier)[source]¶
Validate and add a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (sppasTier) The parent tier
child_tier – (sppasTier) The child tier to be linked to parent
- get_ancestors(child_tier)[source]¶
Return all the direct ancestors of a tier.
- Parameters
child_tier – (sppasTier)
- Returns
List of tiers with parent, grand-parent, grand-grand-parent…
- get_children(parent_tier, link_type=None)[source]¶
Return the list of children of a tier, for a given type.
- Parameters
parent_tier – (sppasTier) The parent tier whose children are returned
link_type – (str) The type of hierarchy
- Returns
List of tiers
- get_hierarchy_type(child_tier)[source]¶
Return the hierarchy type between a child tier and its parent.
- Returns
(str) one of the hierarchy types
- get_parent(child_tier)[source]¶
Return the parent tier for a given child tier.
- Parameters
child_tier – (sppasTier) The child tier whose parent is returned
- static infer_hierarchy_type(tier1, tier2)[source]¶
Test if tier1 can be a parent tier for tier2.
- Returns
One of hierarchy types or an empty string
- remove_child(child_tier)[source]¶
Remove a hierarchy link between a parent and a child.
- Parameters
child_tier – (sppasTier) The tier linked to a reference
- remove_parent(parent_tier)[source]¶
Remove all hierarchy links between a parent and its children.
- Parameters
parent_tier – (sppasTier) The parent tier
- remove_tier(tier)[source]¶
Remove all occurrences of a tier inside the hierarchy.
- Parameters
tier – (sppasTier) The tier to remove as parent or child.
- types = {'TimeAlignment', 'TimeAssociation'}¶
- validate_link(link_type, parent_tier, child_tier)[source]¶
Validate a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (sppasTier) The parent tier
child_tier – (sppasTier) The child tier to be linked to parent
- Raises
AnnDataTypeError, HierarchyParentTierError, HierarchyChildTierError, HierarchyAncestorTierError, HierarchyAlignmentError, HierarchyAssociationError
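The two link types and the tree rules of this module can be sketched on plain data. In the sketch below, tiers are reduced to collections of time points and to names; `infer_link_type` and `HierarchySketch` are hypothetical names, and the real sppasHierarchy works on sppasTier objects and raises the dedicated exceptions listed above rather than ValueError.

```python
def infer_link_type(parent_points, child_points):
    """Return a link type name for two collections of time points, or ''."""
    parent, child = set(parent_points), set(child_points)
    if not child or not child <= parent:
        return ""                    # the parent is not a superset of the child
    if child == parent:
        return "TimeAssociation"     # identical sets of points
    return "TimeAlignment"           # child points strictly included


class HierarchySketch:
    """Hypothetical child -> (parent, link type) map: one parent per child."""

    def __init__(self):
        self._links = {}

    def add_link(self, link_type, parent, child):
        if parent == child:
            raise ValueError("a tier can't be its own child")
        if child in self._links:
            raise ValueError("the child already has a parent")
        # Walk up from the parent: the child must not be one of its ancestors.
        node = parent
        while node in self._links:
            node = self._links[node][0]
            if node == child:
                raise ValueError("the child is an ancestor of the parent")
        self._links[child] = (parent, link_type)

    def get_ancestors(self, child):
        """Return the parent, grand-parent, ... of a tier name."""
        ancestors = []
        while child in self._links:
            child = self._links[child][0]
            ancestors.append(child)
        return ancestors
```

Storing a single entry per child is what makes the structure a tree: a second parent for the same child has nowhere to go, and the upward walk rejects cycles.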
anndata.media module¶
- filename
sppas.src.anndata.media.py
- author
Brigitte Bigi
- contact
- summary
Represent a media – a recording file.
- class anndata.media.sppasMedia(filename, media_id=None, mime_type=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a media file.
anndata.metadata module¶
- filename
sppas.src.anndata.metadata.py
- author
Brigitte Bigi
- contact
- summary
Represent a set of metadata and an identifier.
- class anndata.metadata.sppasDefaultMeta[source]¶
Bases:
anndata.metadata.sppasMetaData
Dictionary of default meta data in SPPAS.
Many annotation tools use metadata, and each tool encodes its data with its own formalism. The SPPAS aio API stores metadata related to the data it reads so that it can be given back when writing the data, either in the same file format or when exporting to another one. This is possible only if a set of “generic” metadata names is fixed.
- media()[source]¶
Add metadata related to a media.
For compatibility with sclite, xtrans, subtitle, elan, annotation pro.
- class anndata.metadata.sppasMetaData[source]¶
Bases:
object
Dictionary of meta data including a required ‘id’.
Meta data keys and values are unicode strings.
- __init__()[source]¶
Create a sppasMetaData instance.
Add a GUID-like in the dictionary of metadata, with key “id”.
- add_annotator_metadata(name='', version='', version_date='')[source]¶
Add metadata about an annotator.
- Parameters
name – (str)
version – (str)
version_date – (str)
TODO: CHECK IF KEYS ARE NOT ALREADY EXISTING.
- add_language_metadata()[source]¶
Add metadata about the language (und).
TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- add_project_metadata()[source]¶
Add metadata about the project this object is included-in.
Currently do not assign any value. TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- add_software_metadata()[source]¶
Add metadata about this software.
TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- get_meta(entry, default='')[source]¶
Return the value of the given key.
- Parameters
entry – (str) Entry to be checked as a key.
default – (str) Default value to return if entry is not a key.
- Returns
(str) meta data value or default value
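A minimal sketch of a metadata dictionary with a required ‘id’ entry, as described for sppasMetaData. `MetaDataSketch` is a hypothetical name, and the `set_meta` method is an assumption added only so that the example has something to read back; it is not taken from the listing above.

```python
import uuid


class MetaDataSketch:
    """Hypothetical metadata dictionary that always carries an 'id' entry."""

    def __init__(self):
        # A GUID-like string stored under the key 'id'.
        self._meta = {"id": str(uuid.uuid4())}

    def set_meta(self, key, value):
        # Assumed helper: keys and values are kept as strings.
        self._meta[str(key)] = str(value)

    def get_meta(self, entry, default=""):
        """Return the value of the given key, or the default value."""
        return self._meta.get(entry, default)
```

Returning a default instead of raising keeps lookups of optional, format-specific keys simple for writers that only understand some of them.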
anndata.tier module¶
- filename
sppas.src.anndata.tier.py
- author
Brigitte Bigi
- contact
- summary
Represent a tier, i.e. a layer with annotations.
- class anndata.tier.sppasTier(name=None, ctrl_vocab=None, media=None, parent=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Representation of a tier, a structured set of annotations.
Annotations of a tier are sorted depending on their location (from lowest to highest).
A Tier is made of:
a name (used to identify the tier),
a set of metadata,
an array of annotations,
a controlled vocabulary (optional),
a media (optional),
a parent (optional).
- __init__(name=None, ctrl_vocab=None, media=None, parent=None)[source]¶
Create a new sppasTier instance.
- Parameters
name – (str) Name of the tier. It is used as identifier.
ctrl_vocab – (sppasCtrlVocab)
media – (sppasMedia)
parent – (sppasTranscription)
- add(annotation)[source]¶
Add an annotation to the tier in sorted order.
Assign this tier as parent to the annotation.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- Returns
the index of the annotation in the tier
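Adding in sorted order and returning the insertion index can be sketched with the standard bisect module. The function name `add_sorted` and the use of (begin, end) tuples in place of sppasAnnotation objects are assumptions of this sketch; the real method also validates the annotation against the controlled vocabulary and the hierarchy.

```python
import bisect


def add_sorted(intervals, new):
    """Insert a (begin, end) tuple in sorted order and return its index."""
    idx = bisect.bisect_left(intervals, new)
    if idx < len(intervals) and intervals[idx] == new:
        # Same location already present (compare TierAddError, ERROR 1142).
        raise ValueError(
            "an annotation with the same location is at index {:d}".format(idx))
    intervals.insert(idx, new)
    return idx
```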
- append(annotation)[source]¶
Append the given annotation at the end of the tier.
Assign this tier as parent to the annotation.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError, TierAppendError
- create_annotation(location, labels=None)[source]¶
Create and add a new annotation into the tier.
- Parameters
location – (sppasLocation) the location(s) where the annotation happens
labels – (sppasLabel, list) the label(s) to stamp this annot.
- Returns
sppasAnnotation
- create_annotation_after(idx)[source]¶
Create and add a new annotation in the hole after idx.
- Parameters
idx – (int) Index of an existing annotation
- Returns
sppasAnnotation
- Raises
AnnDataTypeError, AnnDataIndexError
- create_annotation_before(idx)[source]¶
Create and add a new annotation in the hole before idx.
- Parameters
idx – (int) Index of an existing annotation
- Returns
sppasAnnotation
- Raises
AnnDataTypeError, AnnDataIndexError
- create_ctrl_vocab(name=None)[source]¶
Create the controlled vocabulary from annotation labels.
Create (or re-create) the controlled vocabulary from the list of already existing annotation labels. The current controlled vocabulary is deleted.
- Parameters
name – (str) Name of the controlled vocabulary. The name of the tier is used by default.
- export_to_intervals(separators)[source]¶
Create a tier with the consecutive filled intervals.
Return an empty tier if ‘self’ is not of type “interval”. The created intervals are not filled.
- Parameters
separators – (list)
- Returns
(sppasTier)
- export_unfilled()[source]¶
Create a tier with the unlabelled/unfilled intervals.
Only for tiers of type Interval. It represents the “NOT tier”, i.e. where this tier is not annotated.
IMPORTANT: Never tested with overlapped annotations, actually not tested at all (but used in the plugin StatGroups).
- Returns
(sppasTier) or None
- find(begin, end, overlaps=True, indexes=False)[source]¶
Return a list of annotations between begin and end.
- Parameters
begin – sppasPoint or None to start from the beginning of the tier
end – sppasPoint or None to end at the end of the tier
overlaps – (bool) Return also overlapped annotations. Not relevant for tiers with points.
indexes – (bool) Return indexes instead of annotations
- Returns
List of sppasAnnotation or list of indexes
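One plausible reading of the `overlaps` and `indexes` options is sketched below on (begin, end) tuples. The exact boundary handling of the real method is not stated here, so the comparisons chosen (strict for overlap, inclusive for containment) are assumptions, as is treating None as an open bound.

```python
def find(intervals, begin=None, end=None, overlaps=True, indexes=False):
    """Return the intervals (or their indexes) located between begin and end."""
    found = []
    for i, (b, e) in enumerate(intervals):
        if overlaps:
            # Keep anything sharing some time with [begin, end].
            keep = (end is None or b < end) and (begin is None or e > begin)
        else:
            # Keep only what lies fully inside [begin, end].
            keep = (begin is None or b >= begin) and (end is None or e <= end)
        if keep:
            found.append(i if indexes else (b, e))
    return found
```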
- fit(other)[source]¶
Select then slice or extend annotations to fit in other tier.
Keep only the annotations of self that have some overlapping time with the given other tier and slice the localization of such selected annotations to exactly match those of the other tier.
Example:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
tier1: |_a_|_b_| |_c_| |_d_| |_e_| |f|
tier2: |_w_| |_______x_______| |_y_| |_z_|
- Parameters
other – (sppasTier)
- Returns
(sppasTier)
- get_annotation(identifier)[source]¶
Find an annotation from its metadata ‘id’.
- Parameters
identifier – (str) Metadata ‘id’ of an annotation.
- Returns
sppasAnnotation or None
- get_annotation_index(ann)[source]¶
Find an annotation.
- Parameters
ann – (sppasAnnotation)
- Returns
(int) -1 if not found
- has_point(point)[source]¶
Return True if the tier contains a given point.
- Parameters
point – (sppasPoint) The point to find in the tier.
- Returns
(bool)
- index(moment)[source]¶
Return the index of the moment (int), or -1.
Only for tier with points.
- Parameters
moment – (sppasPoint)
- is_superset(other)[source]¶
Return True if this tier contains all points of the other tier.
- Parameters
other – (sppasTier)
- Returns
Boolean
- lindex(moment)[source]¶
Return the index of the interval starting at a given moment, or -1.
Only for tier with intervals or disjoint. If the tier contains more than one annotation starting at the same moment, the method returns the first one.
- Parameters
moment – (sppasPoint)
- merge(idx, direction)[source]¶
Merge the annotation at given index with next or previous one.
- if direction > 0:
ann_idx: [begin_idx, end_idx, labels_idx]
next_ann: [begin_n, end_n, labels_n]
result: [begin_idx, end_n, labels_idx + labels_n]
- if direction < 0:
prev_ann: [begin_p, end_p, labels_p]
ann_idx: [begin_idx, end_idx, labels_idx]
result: [begin_p, end_idx, labels_p + labels_idx]
- Parameters
idx – (int) Index of the annotation in the list
direction – (int) Positive for next, Negative for previous
- Returns
(bool) False if direction does not match with index
- Raises
Exception if merged annotation can’t be deleted of the tier
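The two merge layouts given for merge() can be sketched on a list of [begin, end, labels] lists. The free function below is an illustration under that simplified data model, not the tier method itself, which also has to respect the hierarchy.

```python
def merge(anns, idx, direction):
    """Merge anns[idx] with its next (direction > 0) or previous (< 0) neighbour.

    anns is a list of [begin, end, labels] lists. Return False when the
    direction is 0 or does not match an existing neighbour of idx.
    """
    if direction == 0 or not 0 <= idx < len(anns):
        return False
    other = idx + 1 if direction > 0 else idx - 1
    if not 0 <= other < len(anns):
        return False
    first, second = sorted((idx, other))
    begin, _, labels_a = anns[first]
    _, end, labels_b = anns[second]
    # Replace both annotations with a single one spanning them.
    anns[first:second + 1] = [[begin, end, labels_a + labels_b]]
    return True
```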
- mindex(moment, bound=0)[source]¶
Return index of the interval containing the given moment.
Only for tier with intervals or disjoint.
If the tier contains more than one annotation at the same moment, the method returns the first one (i.e. the one which started at first).
- Parameters
moment – (sppasPoint)
bound – (int)
0 to exclude bounds of the interval;
-1 to include begin bound;
+1 to include end bound;
+2 to include both begin/end bounds;
others: the midpoint of moment is strictly inside
- Returns
(int) Index of the 1st annotation containing moment or -1
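The four documented values of `bound` can be sketched on (begin, end) tuples with numeric moments. Plain numbers have no radius, so in this sketch any other value of `bound` simply falls back to the strict comparison; that fallback is an assumption.

```python
def mindex(intervals, moment, bound=0):
    """Return the index of the first (begin, end) containing moment, or -1."""
    for i, (begin, end) in enumerate(intervals):
        # -1 and +2 include the begin bound; +1 and +2 include the end bound.
        after_begin = begin <= moment if bound in (-1, 2) else begin < moment
        before_end = moment <= end if bound in (1, 2) else moment < end
        if after_begin and before_end:
            return i
    return -1
```

With two adjacent intervals, the shared boundary belongs to neither when bound is 0, to the later one when bound is -1, and to the earlier one when bound is +1 or +2 (the first match wins).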
- near(moment, direction=1)[source]¶
Search for the annotation whose localization is closest.
Search for the nearest localization to the given moment into a given direction.
- Parameters
moment – (sppasPoint)
direction – (int) 0 for the nearest in either direction, 1 for the nearest forward, -1 for the nearest backward
- pop(index=-1)[source]¶
Remove the annotation at the given position in the tier.
If no index is specified, pop() removes and returns the last annotation in the tier.
- Parameters
index – (int) Index of the annotation to remove.
- Raises
HierarchyContainsError
- remove(begin, end, overlaps=False)[source]¶
Remove annotation intervals between begin and end.
- Parameters
begin – (sppasPoint)
end – (sppasPoint)
overlaps – (bool)
- Returns
the number of removed annotations
- Raises
HierarchyContainsError
- remove_unlabelled()[source]¶
Remove annotations without labels.
Do not remove an annotation if it invalidates the hierarchy.
- Returns
the number of removed annotations
- rindex(moment)[source]¶
Return the index of the interval ending at the given moment.
Only for tier with intervals or disjoint. If the tier contains more than one annotation ending at the same moment, the method returns the last one.
- Parameters
moment – (sppasPoint)
- set_ctrl_vocab(ctrl_vocab=None)[source]¶
Set a controlled vocabulary to this tier.
- Parameters
ctrl_vocab – (sppasCtrlVocab or None)
- Raises
AnnDataTypeError, CtrlVocabContainsError
- set_media(media)[source]¶
Set a media to the tier.
- Parameters
media – (sppasMedia)
- Raises
AnnDataTypeError
- set_name(name=None)[source]¶
Set the name of the tier.
If no name is given, a GUID is randomly assigned. Important: An empty string is accepted.
- Parameters
name – (str) The identifier name or None.
- Returns
the formatted name
- set_radius(radius)[source]¶
Fix a radius value to all points of the tier.
- Parameters
radius – (int, float) New radius value
- Raises
AnnDataTypeError, AnnDataNegValueError
- split(idx)[source]¶
Split annotation at the given index into 2 annotations.
- Parameters
idx – (int) Index of the annotation to split.
- Returns
newly created annotation at index idx+1
- validate_annotation(annotation)[source]¶
Validate the annotation and set its parent to this tier.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
anndata.transcription module¶
- filename
sppas.src.anndata.transcription.py
- author
Brigitte Bigi
- contact
- summary
The main object to represent transcriptions of recordings.
- class anndata.transcription.sppasTranscription(name=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Representation of a transcription, the root in our framework.
Transcriptions in SPPAS are represented with:
metadata: a list of tuple key/value;
a name (used to identify the transcription);
a list of tiers;
a hierarchy between tiers;
a list of media;
a list of controlled vocabularies.
Inter-tier relations are managed by establishing alignment or association links between 2 tiers:
alignment: annotations of a tier A (child) have only localization instances included in those of annotations of tier B (parent);
association: annotations of a tier A (child) have exactly the same localization instances as the annotations of tier B (parent).
- Example
>>> # Create an instance
>>> trs = sppasTranscription("trs name")
>>> # Create a tier
>>> trs.create_tier("tier name")
>>> # Get a tier of a transcription from its index:
>>> tier = trs[0]
>>> # Get a tier of a transcription from its name
>>> tier = trs.find("tier name")
>>> # Get a tier from its identifier
>>> tier = trs.get_object(guid)
- __init__(name=None)[source]¶
Create a new sppasTranscription instance.
- Parameters
name – (str) Name of the transcription.
- add_ctrl_vocab(new_ctrl_vocab)[source]¶
Add a new controlled vocabulary in the list of ctrl vocab.
- Parameters
new_ctrl_vocab – (sppasCtrlVocab)
- Raises
AnnDataTypeError, TrsAddError
- add_hierarchy_link(link_type, parent_tier, child_tier)[source]¶
Validate and add a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (Tier) The reference tier
child_tier – (Tier) The child tier to be linked to reftier
- add_media(new_media)[source]¶
Add a new media in the list of media.
Does not add the media if a media with the same id is already in self.
- Parameters
new_media – (sppasMedia) The media to add.
- Raises
AnnDataTypeError, TrsAddError
- create_tier(name=None, ctrl_vocab=None, media=None)[source]¶
Create and append a new empty tier.
- Parameters
name – (str) the name of the tier to create
ctrl_vocab – (sppasCtrlVocab)
media – (sppasMedia)
- Returns
newly created empty tier
- find(name, case_sensitive=True)[source]¶
Find a tier from its name.
- Parameters
name – (str) EXACT name of the tier
case_sensitive – (bool)
- Returns
sppasTier or None
- find_id(tier_id)[source]¶
Find a tier from its identifier.
- Parameters
tier_id – (str) Exact identifier of the tier
- Returns
sppasTier or None
- get_ctrl_vocab_from_id(ctrl_vocab_id)[source]¶
Return a sppasCtrlVocab from its id or None.
- Parameters
ctrl_vocab_id – (str) Identifier name of a ctrl vocab
- get_ctrl_vocab_from_name(ctrl_vocab_name)[source]¶
Return a sppasCtrlVocab from its name or None.
- Parameters
ctrl_vocab_name – (str) Identifier name of a ctrl vocabulary
- get_media_from_id(media_id)[source]¶
Return a sppasMedia from its id or None.
- Parameters
media_id – (str) Identifier name of a media
- get_object(identifier)[source]¶
Return the object matching the given identifier.
- Parameters
identifier – (GUID)
- get_tier_from_id(tier_id)[source]¶
Return a sppasTier from its id or None.
- Parameters
tier_id – (str) Identifier name of a tier
- get_tier_index(name, case_sensitive=True)[source]¶
Get the index of a tier from its name.
- Parameters
name – (str) EXACT name of the tier
case_sensitive – (bool)
- Returns
index or -1 if not found
- get_tier_index_id(identifier)[source]¶
Get the index of a tier from its id.
- Parameters
identifier – (str) GUID
- Returns
index or -1 if not found
- pop(index=-1)[source]¶
Remove the tier at the given position in the transcription.
Return it. If no index is specified, pop() removes and returns the last tier in the transcription.
- Parameters
index – (int) Index of the tier to remove.
- Returns
(sppasTier)
- Raises
AnnDataIndexError
- remove_ctrl_vocab(old_ctrl_vocab)[source]¶
Remove a controlled vocabulary of the list of ctrl vocab.
- Parameters
old_ctrl_vocab – (sppasCtrlVocab)
- Raises
AnnDataTypeError, TrsRemoveError
- remove_media(old_media)[source]¶
Remove a media of the list of media.
- Parameters
old_media – (sppasMedia)
- Raises
AnnDataTypeError, TrsRemoveError
- rename_tier(tier)[source]¶
Rename a tier by appending a digit.
- Parameters
tier – (sppasTier) The tier to rename.
- set_ctrl_vocab_list(ctrl_vocab_list)[source]¶
Set the list of controlled vocabularies.
- Parameters
ctrl_vocab_list – (list)
- Returns
list of rejected ctrl_vocab
- set_media_list(media_list)[source]¶
Set the list of media.
- Parameters
media_list – (list)
- Returns
list of rejected media
- set_name(name=None)[source]¶
Set the name of the transcription.
- Parameters
name – (str or None) The identifier or None to set the GUID.
- Returns
the name
- set_tier_index(name, new_index, case_sensitive=True)[source]¶
Set the index of a tier from its name.
THIS SHOULD NEVER BE USED. USE TIER IDENTIFIER INSTEAD OF ITS NAME.
- Parameters
name – (str) EXACT name of the tier
new_index – (int) New index of the tier in self
case_sensitive – (bool)
- Returns
index or -1 if not found
- set_tier_index_id(identifier, new_index)[source]¶
Set the index of a tier from its identifier.
- Parameters
identifier – (str)
new_index – (int) New index of the tier in self
- Returns
index or -1 if not found
- shift(delay)[source]¶
Shift the location of all annotations by a given delay.
- Parameters
delay – (int, float) delay to shift all localizations
- Raises
AnnDataTypeError
- property tiers¶
Return the list of tiers.
Module contents¶
- filename
sppas.src.anndata.__init__.py
- author
Brigitte Bigi
- contact
- summary
Package to manage annotated data.
anndata: management of transcribed data.¶
anndata is a free and open source Python library to access and search annotated data. It can convert file formats like Elan’s EAF, Praat’s TextGrid and others into a sppasTranscription() object and convert that object back into any of these formats. Those objects allow unified access to linguistic data from a wide range of sources.
It requires the following other packages:
config
utils
- class anndata.FileFormatProperty(extension)[source]¶
Bases:
object
Represent one format and its properties.
- anndata.format_label(text, empty='', tag_type='str')[source]¶
Create a label from a text.
Use the “{ | }” system to parse the alternative tags and = for scores.
- Parameters
text – (str)
empty – (str) The text representing an empty tag.
tag_type – (str): One of: (‘str’, ‘int’, ‘float’, ‘bool’).
- Returns
sppasLabel
- anndata.format_labels(text, separator='\n', empty='', tag_type='str')[source]¶
Create a set of labels from a text.
Use the separator to split the text into labels. Use the “{ | }” system to parse the alternative tags.
- Examples
text = “{le|les} {chat|chats}” is 2 labels with 2 tags each
text = “{le=0.6|les=0.4}” is a label with 2 tags and their score
- Parameters
text – (str)
separator – (str) String to separate labels.
empty – (str) The text representing an empty tag.
tag_type – (str): One of: (‘str’, ‘int’, ‘float’, ‘bool’).
- Returns
list of sppasLabel
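The “{ | }” notation with = for scores can be illustrated by a small stand-alone parser. This is a sketch, not the parser used by format_label() or format_labels(): the names `parse_label` and `parse_labels` are hypothetical, labels are returned as lists of (tag, score) tuples instead of sppasLabel objects, and the `empty` and `tag_type` options are left out.

```python
import re


def parse_label(text):
    """Parse one label into a list of (tag, score) pairs; score may be None."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1]
    tags = []
    for alt in text.split("|"):
        content, sep, score = alt.strip().rpartition("=")
        if sep:
            try:
                tags.append((content.strip(), float(score)))
                continue
            except ValueError:
                pass  # the '=' belongs to the tag content, it is not a score
        tags.append((alt.strip(), None))
    return tags


def parse_labels(text, separator="\n"):
    """Split a text into labels: braces group alternatives, the separator splits the rest."""
    labels = []
    for chunk in text.split(separator):
        # Each "{...}" group is one label; text outside braces is one label too.
        for piece in re.findall(r"\{[^{}]*\}|[^{}]+", chunk):
            if piece.strip():
                labels.append(parse_label(piece))
    return labels
```

Applied to the two example texts above, the sketch yields two labels of two unscored tags each for the first, and one label with two scored tags for the second.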
- anndata.serialize_label(label, empty='', alt=True)[source]¶
Convert the label into a string, include or not alternative tags.
Use the “{ | }” system to serialize the alternative tags. Scores of the tags are not returned.
- Parameters
label – (sppasLabel)
empty – (str) The text to return if a tag is empty or not set.
alt – (bool) Include alternative tags
- Returns
(str)
- anndata.serialize_labels(labels, separator='\n', empty='', alt=True)[source]¶
Create a text from a list of labels.
Use the separator to join the serialized labels. Use the “{ | }” system to serialize the alternative tags and = for scores.
- Parameters
labels – (list of sppasLabel)
separator – (str) String separating labels
empty – (str) The text representing an empty tag
alt – (bool) Include alternative tags. If False, only the best tag is serialized.
- Returns
(str)
- class anndata.sppasAnnSet[source]¶
Bases:
sppas.src.structs.basefset.sppasBaseSet
Manager for a set of annotations.
Mainly used with the data that are the result of the tier filter system. A sppasAnnSet() manages a dictionary with:
key: an annotation
value: a list of strings
- copy()[source]¶
Make a deep copy of self.
Overridden to return a sppasAnnSet() instead of a sppasBaseSet().
- to_tier(name='AnnSet', annot_value=False)[source]¶
Create a tier from the data set.
- Parameters
name – (str) Name of the tier to be returned
annot_value – (bool) Format of the resulting annotation label. By default (False), the label of the annotation is used; if True, its value in the data set is used instead.
- Returns
(sppasTier)
- class anndata.sppasAnnotation(location, labels=[])[source]¶
Bases:
sppas.src.anndata.metadata.sppasMetaData
Represents an annotation.
A sppasAnnotation() is a container for:
a sppasLocation()
a list of sppasLabel()
- Example
>>> location = sppasLocation(sppasPoint(1.5, radius=0.01))
>>> labels = sppasLabel(sppasTag("foo"))
>>> ann = sppasAnnotation(location, labels)
- __init__(location, labels=[])[source]¶
Create a new sppasAnnotation instance.
- Parameters
location – (sppasLocation) the location(s) where the annotation happens
labels – (sppasLabel, list) the label(s) to stamp this annotation, or a list of them.
- add_tag(tag, score=None, label_idx=0)[source]¶
Append an alternative tag in a label.
- Parameters
tag – (sppasTag)
score – (float)
label_idx – (int)
- Raises
AnnDataTypeError, IndexError
- append_label(label)[source]¶
Append a label into the list of labels.
- Parameters
label – (sppasLabel)
- contains_localization(localization)[source]¶
Return True if the given localization is in the location.
- contains_tag(tag, function='exact', reverse=False, label_idx=0)[source]¶
Return True if the given tag is in the label.
- Parameters
tag – (sppasTag)
function – Search function
reverse – Reverse the function.
label_idx – (int)
- copy()[source]¶
Return a full copy of the annotation.
The location, the labels and the metadata are all copied. The ‘id’ of the returned annotation is then the same.
- Returns
sppasAnnotation()
- get_best_tag(label_idx=0)[source]¶
Return the tag with the highest score of a label or an empty str.
- Parameters
label_idx – (int)
- remove_tag(tag, label_idx=0)[source]¶
Remove an alternative tag from the label.
- Parameters
tag – (sppasTag) the tag to be removed from the list.
label_idx – (int)
- serialize_labels(separator='\n', empty='', alt=True)[source]¶
DEPRECATED. Return labels serialized into a string.
TODO: REMOVE THIS METHOD. Use aioutils.serialize_labels() instead.
- Parameters
separator – (str) String to separate labels.
empty – (str) The text to return if a tag is empty or not set.
alt – (bool) Include alternative tags
- Returns
(str)
- set_best_localization(localization)[source]¶
Set the best localization of the location.
- Parameters
localization – (sppasBaseLocalization)
- set_labels(labels=[])[source]¶
Fix/reset the list of labels of this annotation.
- Parameters
labels – (sppasLabel, list) the label(s) to stamp this annotation, or a list of them.
- Raises
AnnDataTypeError, TypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- set_parent(parent=None)[source]¶
Set a parent tier.
- Parameters
parent – (sppasTier) The parent tier of this annotation.
- Raises
CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- set_score(score=None)[source]¶
Set or reset the score to this annotation.
- Parameters
score – (float)
- validate()[source]¶
Validate the annotation.
Check if the labels and location match the requirements.
- Raises
TypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- class anndata.sppasCtrlVocab(name, description='')[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a controlled vocabulary.
A controlled Vocabulary is a set of tags. It is used to restrict the use of tags in a label: only the accepted tags can be set to a label.
A controlled vocabulary is made of an identifier name, a description and a list of pairs tag/description.
- __init__(name, description='')[source]¶
Create a new sppasCtrlVocab instance.
- Parameters
name – (str) Identifier name of the controlled vocabulary
description – (str)
- add(tag, description='')[source]¶
Add a tag to the controlled vocab.
- Parameters
tag – (sppasTag): the tag to add.
description – (str)
- Returns
Boolean
- contains(tag)[source]¶
Test if a tag is in the controlled vocabulary.
Attention: the data content of the tag is checked, not the instance.
- Parameters
tag – (sppasTag) the tag to check.
- Returns
Boolean
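The difference between checking the instance and checking the content can be illustrated with a minimal, self-contained sketch. TagSketch and CtrlVocabSketch are hypothetical stand-ins, not the SPPAS classes; the only behaviour taken from the description above is that membership is decided on the content of the tag. Returning False when an equal tag is already registered is an assumption of this sketch.

```python
class TagSketch:
    """Hypothetical stand-in for a tag: it only holds a text content."""

    def __init__(self, content):
        self.content = str(content)

    def __eq__(self, other):
        # Two tags are equal when their contents are equal,
        # even if they are two distinct objects.
        return isinstance(other, TagSketch) and self.content == other.content

    def __hash__(self):
        # Hash on the content so that equal tags share a dictionary slot.
        return hash(self.content)


class CtrlVocabSketch:
    """Hypothetical stand-in for a controlled vocabulary."""

    def __init__(self, name, description=""):
        self.name = name
        self.description = description
        self._entries = {}  # tag -> description of the tag

    def add(self, tag, description=""):
        # Assumption: refuse a tag whose content is already registered.
        if tag in self._entries:
            return False
        self._entries[tag] = description
        return True

    def contains(self, tag):
        # The lookup relies on __eq__/__hash__, i.e. on the content.
        return tag in self._entries


vocab = CtrlVocabSketch("polarity")
vocab.add(TagSketch("yes"), "affirmative answer")
same_content = TagSketch("yes")            # another instance, same content
print(vocab.contains(same_content))        # True
print(vocab.contains(TagSketch("maybe")))  # False
```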
- get_tag_description(tag)[source]¶
Return the unicode string of the description of an entry.
- Parameters
tag – (sppasTag) the tag to get the description.
- Returns
(str)
- remove(tag)[source]¶
Remove a tag from the controlled vocab.
- Parameters
tag – (sppasTag) the tag to remove.
- Returns
Boolean
- set_description(description='')[source]¶
Set the description of the controlled vocabulary.
- Parameters
description – (str)
- class anndata.sppasDisjoint(intervals=None)[source]¶
Bases:
anndata.ann.annlocation.localization.sppasBaseLocalization
Localization of a series of intervals in time.
- __init__(intervals=None)[source]¶
Create a new sppasDisjoint instance.
- Parameters
intervals – (list of sppasInterval)
- append_interval(interval)[source]¶
Append an interval to the list of intervals.
- Parameters
interval – (sppasInterval)
- middle()[source]¶
Return a sppasPoint() at the middle of the time interval.
To be tested.
- Returns
(sppasPoint)
- set(other)[source]¶
Set self members from another sppasDisjoint instance.
- Parameters
other – (sppasDisjoint)
- set_begin(tp)[source]¶
Set the begin sppasPoint instance to new sppasPoint.
- Parameters
tp – (sppasPoint)
- set_end(tp)[source]¶
Set the end sppasPoint instance to new sppasPoint.
- Parameters
tp – (sppasPoint)
- class anndata.sppasDuration(value, vagueness=0.0)[source]¶
Bases:
object
Representation of a duration with vagueness.
Represents a duration identified by 2 float values:
the duration value;
the duration margin.
- __init__(value, vagueness=0.0)[source]¶
Create a new sppasDuration instance.
- Parameters
value – (float) value of the duration.
vagueness – (float) represents the vagueness of the value.
- set(other)[source]¶
Set the value/vagueness from another sppasDuration instance.
- Parameters
other – (sppasDuration)
- class anndata.sppasDurationCompare[source]¶
Bases:
sppas.src.structs.basecompare.sppasBaseCompare
Comparison methods for sppasDuration.
- static eq(duration, x)[source]¶
Return True if duration is equal to x.
- Parameters
duration – (sppasDuration)
x – (int, float)
- Returns
(bool)
- static ge(duration, x)[source]¶
Return True if duration is greater than or equal to x.
- Parameters
duration – (sppasDuration)
x – (int, float)
- Returns
(bool)
- static gt(duration, x)[source]¶
Return True if duration is greater than x.
- Parameters
duration – (sppasDuration)
x – (int, float)
- Returns
(bool)
- static le(duration, x)[source]¶
Return True if duration is lower than or equal to x.
- Parameters
duration – (sppasDuration)
x – (int, float)
- Returns
(bool)
- class anndata.sppasHierarchy[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a hierarchy between tiers.
Two types of hierarchy are considered:
TimeAssociation: the points of a child tier are all equal to the points of a reference tier, as for example:
parent: Words | l’ | âne | est | là |
child: Lemmas | le | âne | être | là |
TimeAlignment: the points of a child tier are all included in the set of points of a reference tier, as for example:
parent: Phonemes | l | a | n | e | l | a |
child: Words | l’ | âne | est | là |
parent: Phonemes | l | a | n | e | l | a |
child: Syllables | l.a | n.e | l.a |
In that example, notice that there’s no hierarchy link between “Words” and “Syllables” and notice that “Phonemes” is the grand-parent of “Lemmas”.
And the following obvious rules are applied:
A child can have ONLY ONE parent!
A parent can have as many children as wanted.
A hierarchy is a tree, not a graph.
Todo: consider a time association that is not fully completed:
parent: Tokens | l’ | âne | euh | euh | est | là | @ |
child: Lemmas | le | âne | | être | là |
- add_link(link_type, parent_tier, child_tier)[source]¶
Validate and add a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (sppasTier) The parent tier
child_tier – (sppasTier) The child tier to be linked to parent
- get_ancestors(child_tier)[source]¶
Return all the direct ancestors of a tier.
- Parameters
child_tier – (sppasTier)
- Returns
List of tiers with parent, grand-parent, grand-grand-parent…
- get_children(parent_tier, link_type=None)[source]¶
Return the list of children of a tier, for a given type.
- Parameters
parent_tier – (sppasTier) The parent tier whose children are returned
link_type – (str) The type of hierarchy
- Returns
List of tiers
- get_hierarchy_type(child_tier)[source]¶
Return the hierarchy type between a child tier and its parent.
- Returns
(str) one of the hierarchy types
- get_parent(child_tier)[source]¶
Return the parent tier for a given child tier.
- Parameters
child_tier – (sppasTier) The child tier whose parent is returned
- static infer_hierarchy_type(tier1, tier2)[source]¶
Test if tier1 can be a parent tier for tier2.
- Returns
One of hierarchy types or an empty string
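The two hierarchy types described for this class can be sketched on plain sets of boundary values. This is only an illustration under stated assumptions: infer_link_type_sketch is a hypothetical function, not the SPPAS implementation; tiers are reduced to lists of numbers and the radius of the points is ignored.

```python
def infer_link_type_sketch(parent_points, child_points):
    """Return the hierarchy type that two sets of points allow.

    Both arguments are iterables of numbers (the boundaries of each tier).
    Returns "TimeAssociation", "TimeAlignment" or an empty string.
    """
    parent = set(parent_points)
    child = set(child_points)
    if not parent or not child:
        # Assumption of this sketch: no link can be inferred from empty tiers.
        return ""
    if child == parent:
        # All the points of the child are equal to those of the parent.
        return "TimeAssociation"
    if child <= parent:
        # All the points of the child are included in those of the parent.
        return "TimeAlignment"
    return ""


# Boundaries chosen to mimic the example tiers of the class description.
phonemes = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
words = [0.0, 0.1, 0.3, 0.4, 0.6]
lemmas = [0.0, 0.1, 0.3, 0.4, 0.6]
syllables = [0.0, 0.2, 0.4, 0.6]

print(infer_link_type_sketch(words, lemmas))     # TimeAssociation
print(infer_link_type_sketch(phonemes, words))   # TimeAlignment
print(infer_link_type_sketch(words, syllables))  # empty string
```

The last call returns an empty string because the syllable boundary 0.2 is not a word boundary, which matches the remark that there is no link between these two tiers.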
- remove_child(child_tier)[source]¶
Remove a hierarchy link between a parent and a child.
- Parameters
child_tier – (sppasTier) The tier linked to a reference
- remove_parent(parent_tier)[source]¶
Remove all hierarchy links between a parent and its children.
- Parameters
parent_tier – (sppasTier) The parent tier
- remove_tier(tier)[source]¶
Remove all occurrences of a tier inside the hierarchy.
- Parameters
tier – (sppasTier) The tier to remove as parent or child.
- types = {'TimeAlignment', 'TimeAssociation'}¶
- validate_link(link_type, parent_tier, child_tier)[source]¶
Validate a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (sppasTier) The parent tier
child_tier – (sppasTier) The child tier to be linked to parent
- Raises
AnnDataTypeError, HierarchyParentTierError, HierarchyChildTierError, HierarchyAncestorTierError, HierarchyAlignmentError, HierarchyAssociationError
- class anndata.sppasInterval(begin, end)[source]¶
Bases:
anndata.ann.annlocation.localization.sppasBaseLocalization
Localization of an interval between two sppasPoint instances.
An interval is identified by two sppasPoint objects:
one is representing the beginning of the interval;
the other is representing the end of the interval.
- __init__(begin, end)[source]¶
Create a new sppasInterval instance.
- Parameters
begin – (sppasPoint)
end – (sppasPoint)
A degenerate interval, i.e. one with begin > end, is forbidden.
- static check_interval_bounds(begin, end)[source]¶
Check bounds of a virtual interval.
- Parameters
begin – (sppasPoint)
end – (sppasPoint)
- static check_types(begin, end)[source]¶
True only if begin and end are both the same types of sppasPoint.
- Parameters
begin – any kind of data
end – any kind of data
- Returns
Boolean
- combine(other)[source]¶
Return a sppasInterval, the combination of two intervals.
- Parameters
other – (sppasInterval) the other interval to combine with.
- duration()[source]¶
Overridden. Return the duration of the time interval.
- Returns
(sppasDuration) Duration and its vagueness.
- middle()[source]¶
Return a sppasPoint() at the middle of the time interval.
To be tested.
- Returns
(sppasPoint)
- middle_value()[source]¶
Return the middle value of the time interval.
Return a float value even if points are integers.
- Returns
(float) value.
- set(other)[source]¶
Set self members from another sppasInterval instance.
- Parameters
other – (sppasInterval)
- set_begin(tp)[source]¶
Set the begin of the interval to a new sppasPoint.
Attention: it is a reference assignment.
- Parameters
tp – (sppasPoint)
- set_end(tp)[source]¶
Set the end of the interval to a new sppasPoint.
Attention: it is a reference assignment.
- Parameters
tp – (sppasPoint)
- set_radius(radius)[source]¶
Set a radius value to begin and end points.
- Parameters
radius – (int or float)
- Raises
ValueError
- class anndata.sppasLabel(tag, score=None)[source]¶
Bases:
object
Represent the content of an annotation.
sppasLabel stores a set of sppasTags with their scores. This class uses a list of lists, i.e. a list of pairs (tag, score), as a compromise between memory usage, speed and readability.
A label is a list of possible sppasTag(), represented as a UNICODE string. A data type can be associated, as sppasTag() can be ‘int’, ‘float’ or ‘bool’.
- __init__(tag, score=None)[source]¶
Create a new sppasLabel instance.
- Parameters
tag – (sppasTag or list of sppasTag or None)
score – (float or list of float or None)
- append(tag, score=None)[source]¶
Add a sppasTag into the list.
Do not add the tag if this alternative is already inside the list, but add the scores.
- Parameters
tag – (sppasTag)
score – (float)
- append_content(content, data_type='str', score=None)[source]¶
Add a text into the list.
- Parameters
content – (str)
data_type – (str) The type of this text content. One of: (str, int, float, bool)
score – (float)
- get_best()[source]¶
Return the best sppasTag, i.e. the one with the highest score.
- Returns
(sppasTag or None)
- get_score(tag)[source]¶
Return the score of a tag or None if tag is not in the label.
- Parameters
tag – (sppasTag)
- Returns
score: (float)
- match(tag_functions, logic_bool='and')[source]¶
Return True if a tag matches all or any of the functions.
- Parameters
tag_functions – list of (function, value, logical_not)
logic_bool – (str) Apply a logical “and” or a logical “or” between the functions.
- Returns
(bool)
function: a function in python with 2 arguments: tag/value
value: the expected value for the tag
logical_not: boolean
- Example
Search if a tag is exactly matching “R”:
>>> l.match([(exact, "R", False)])
- Example
Search if a tag is starting with “p” or starting with “t”:
>>> l.match([(startswith, "p", False),
...          (startswith, "t", False)], logic_bool="or")
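The way a list of (function, value, logical_not) triples is combined can be sketched without the library. Everything below is a hypothetical illustration: tags are plain strings, exact and startswith are local helpers, and the sketch assumes that the logical “and”/“or” is evaluated tag by tag.

```python
def exact(tag, value):
    """Local helper: the tag and the value are the same string."""
    return tag == value


def startswith(tag, value):
    """Local helper: the tag starts with the given value."""
    return tag.startswith(value)


def label_match_sketch(tags, tag_functions, logic_bool="and"):
    """Return True if at least one tag satisfies the combined functions.

    tags: list of str (stand-ins for the tags of a label)
    tag_functions: list of (function, value, logical_not)
    logic_bool: "and" to require all functions, "or" to require any
    """
    combine = all if logic_bool == "and" else any
    for tag in tags:
        results = []
        for func, value, logical_not in tag_functions:
            outcome = func(tag, value)
            # logical_not reverses the result of the function.
            results.append(not outcome if logical_not else outcome)
        if combine(results):
            return True
    return False


print(label_match_sketch(["R"], [(exact, "R", False)]))  # True
print(label_match_sketch(["pa", "ta"],
                         [(startswith, "p", False),
                          (startswith, "t", False)],
                         logic_bool="or"))               # True
```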
- remove(tag)[source]¶
Remove a tag from the list.
- Parameters
tag – (sppasTag) the tag to be removed from the list.
- class anndata.sppasLocation(localization=None, score=None)[source]¶
Bases:
object
Location of an annotation of a tier.
sppasLocation stores a set of localizations with their scores. This class uses a list of lists, i.e. a list of pairs (localization, score), as a compromise between memory usage, speed and readability.
- __init__(localization=None, score=None)[source]¶
Create a new sppasLocation instance and add the entry.
- Parameters
localization – (Localization or list of localizations)
score – (float or list of float)
If a list of alternative localizations is given, the same score is assigned to all items.
- append(localization, score=None)[source]¶
Add a localization into the list.
- Parameters
localization – (Localization) the localization to append
score – (float)
- get_best()[source]¶
Return a copy of the best localization.
- Returns
(sppasLocalization) localization with the highest score.
- get_score(loc)[source]¶
Return the score of a localization or None if it is not in.
- Parameters
loc – (sppasLocalization)
- Returns
score: (float)
- match_duration(dur_functions, logic_bool='and')[source]¶
Return True if a duration matches all or any of the functions.
- Parameters
dur_functions – list of (function, value, logical_not)
logic_bool – (str) Apply a logical “and” or “or”
- Returns
(bool)
function: a function in python with 2 arguments: dur/value
value: the expected value for the duration (int/float/sppasDuration)
logical_not: boolean
- Example
Search if a duration is exactly 30 ms:
>>> d.match_duration([(eq, 0.03, False)])
- Example
Search if a duration is not 30 ms:
>>> d.match_duration([(eq, 0.03, True)])
>>> d.match_duration([(ne, 0.03, False)])
- Example
Search if a duration is between 30 ms and 70 ms:
>>> d.match_duration([(ge, 0.03, False),
...                   (le, 0.07, False)], logic_bool="and")
See sppasDurationCompare() to get a list of functions.
- match_localization(loc_functions, logic_bool='and')[source]¶
Return True if a localization matches all or any of the functions.
- Parameters
loc_functions – list of (function, value, logical_not)
logic_bool – (str) Apply a logical “and” or a logical “or” between the functions.
- Returns
(bool)
function: a function in python with 2 arguments: loc/value
value: the expected value for the localization (int/float/sppasPoint)
logical_not: boolean
- Example
Search if a localization is after (or starts at) 1 minute:
>>> l.match_localization([(rangefrom, 60., False)])
- Example
Search if a localization is before (or ends at) 3 minutes:
>>> l.match_localization([(rangeto, 180., False)])
- Example
Search if a localization is between 1 min and 3 min:
>>> l.match_localization([(rangefrom, 60., False),
...                       (rangeto, 180., False)], logic_bool="and")
See sppasLocalizationCompare() to get a list of functions.
- remove(localization)[source]¶
Remove a localization from the list.
- Parameters
localization – (sppasLocalization) the loc to be removed
- set_radius(radius)[source]¶
Set a radius value to all localizations.
- Parameters
radius – (int, float) New radius value
- Raises
AnnDataTypeError, AnnDataNegValueError
- class anndata.sppasMedia(filename, media_id=None, mime_type=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Generic representation of a media file.
- class anndata.sppasMetaData[source]¶
Bases:
object
Dictionary of meta data including a required ‘id’.
Meta data keys and values are unicode strings.
- __init__()[source]¶
Create a sppasMetaData instance.
Add a GUID-like identifier to the dictionary of metadata, with key “id”.
- add_annotator_metadata(name='', version='', version_date='')[source]¶
Add metadata about an annotator.
- Parameters
name – (str)
version – (str)
version_date – (str)
TODO: CHECK IF KEYS ARE NOT ALREADY EXISTING.
- add_language_metadata()[source]¶
Add metadata about the language (und).
TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- add_project_metadata()[source]¶
Add metadata about the project this object is included in.
Currently does not assign any value. TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- add_software_metadata()[source]¶
Add metadata about this software.
TODO: CHECK IF KEYS NOT ALREADY EXISTING.
- get_meta(entry, default='')[source]¶
Return the value of the given key.
- Parameters
entry – (str) Entry to be checked as a key.
default – (str) Default value to return if entry is not a key.
- Returns
(str) meta data value or default value
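A dictionary of metadata with a required ‘id’ and a lookup with a default value can be sketched as follows. MetaDataSketch is a hypothetical stand-in and not the SPPAS class; the use of uuid4 for the GUID-like identifier and the set_meta helper are assumptions made for the illustration.

```python
import uuid


class MetaDataSketch:
    """Hypothetical stand-in for a dictionary of metadata with an 'id'."""

    def __init__(self):
        # Assumption: the GUID-like identifier is a random UUID string.
        self._metadata = {"id": str(uuid.uuid4())}

    def set_meta(self, key, value):
        # Hypothetical helper: keys and values are stored as strings.
        self._metadata[str(key)] = str(value)

    def get_meta(self, entry, default=""):
        # Return the value of the key, or the default if the key is unknown.
        return self._metadata.get(entry, default)


meta = MetaDataSketch()
print(meta.get_meta("id"))               # a random identifier
print(meta.get_meta("language", "und"))  # und
meta.set_meta("language", "fra")
print(meta.get_meta("language", "und"))  # fra
```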
- class anndata.sppasPoint(midpoint, radius=None)[source]¶
Bases:
anndata.ann.annlocation.localization.sppasBaseLocalization
Localization of a point for any numerical representation.
Represents a point identified by a midpoint value and a radius value. Generally, time is represented in seconds, as a float value; frames are represented by integers, like ranks.
In this class, the 3 relations <, = and > take into account a radius value, that represents the uncertainty of the localization. For a point x, with a radius value of rx, and a point y with a radius value of ry, these relations are defined as:
x = y iff |x - y| <= rx + ry
x < y iff not(x = y) and x < y
x > y iff not(x = y) and x > y
- Example 1
Strictly equals:
x = 1.000, rx=0.
y = 1.000, ry=0.
x = y is true
x = 1.00000000000, rx=0.
y = 0.99999999675, ry=0.
x = y is false
- Example 2
Using the radius:
x = 1.0000000000, rx=0.0005
y = 1.0000987653, ry=0.0005
x = y is true (accepts a margin of 1ms between x and y)
x = 1.0000000, rx=0.0005
y = 1.0011235, ry=0.0005
x = y is false
So an overlap of the vagueness “area” makes the two points equal:
|------------rx----------X-----ry===rx----Y--------ry------|
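The three relations defined above can be written down directly. The class below is a minimal sketch, not the SPPAS implementation: PointSketch is a hypothetical name, and the radius defaults to 0.0 here for simplicity. It replays the values of Examples 1 and 2.

```python
class PointSketch:
    """Hypothetical point with a midpoint and a radius (its vagueness)."""

    def __init__(self, midpoint, radius=0.0):
        self.midpoint = midpoint
        self.radius = radius

    def __eq__(self, other):
        # x = y iff |x - y| <= rx + ry
        return abs(self.midpoint - other.midpoint) <= self.radius + other.radius

    def __lt__(self, other):
        # x < y iff not(x = y) and the midpoint of x is lower
        return not (self == other) and self.midpoint < other.midpoint

    def __gt__(self, other):
        # x > y iff not(x = y) and the midpoint of x is greater
        return not (self == other) and self.midpoint > other.midpoint


# Example 1: strictly equal, then different, without radius.
print(PointSketch(1.000) == PointSketch(1.000))          # True
print(PointSketch(1.0) == PointSketch(0.99999999675))    # False

# Example 2: the radius makes close points equal.
x = PointSketch(1.0, 0.0005)
print(x == PointSketch(1.0000987653, 0.0005))            # True
print(x == PointSketch(1.0011235, 0.0005))               # False
print(x < PointSketch(1.0011235, 0.0005))                # True
```

Note that such an equality is not transitive: two points can both be equal to a third one without being equal to each other.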
- __init__(midpoint, radius=None)[source]¶
Create a sppasPoint instance.
- Parameters
midpoint – (float, int) midpoint value.
radius – (float, int) represents the vagueness of the point.
Radius must be of the same type as midpoint.
- static check_types(x, y)[source]¶
True only if midpoint and radius are both of the same types.
- Parameters
x – any kind of data
y – any kind of data
- Returns
Boolean
- duration()[source]¶
Overrides. Return the duration of the point.
- Returns
(sppasDuration) Duration and its vagueness.
- set(other)[source]¶
Set self members from another sppasPoint instance.
- Parameters
other – (sppasPoint)
- set_midpoint(midpoint)[source]¶
Set the midpoint value.
In versions < 1.9.8, it was required that midpoint >= 0. Negative values are now accepted because some annotations are not properly synchronized and then some of them can be negative.
- Parameters
midpoint – (float, int) is the new midpoint value.
- Raises
AnnDataTypeError
- class anndata.sppasTag(tag_content, tag_type=None)[source]¶
Bases:
object
Represent one of the possible tags of a label.
A sppasTag is a data content of any type. By default, the type of the data is “str” and the content is empty, but internally the sppasTag stores ‘None’ values because None takes 16 bytes whereas an empty string takes 37.
A sppasTag() content can be one of the following types:
string/unicode - (str)
integer - (int)
float - (float)
boolean - (bool)
point - (sppasFuzzyPoint)
rect - (sppasFuzzyRect)
Get access to the content with the get_content() method and to the typed content with get_typed_content().
>>> t1 = sppasTag("2")                      # "2" (str)
>>> t2 = sppasTag(2)                        # "2" (str)
>>> t3 = sppasTag(2, tag_type="int")        # 2 (int)
>>> t4 = sppasTag("2", tag_type="int")      # 2 (int)
>>> t5 = sppasTag("2", tag_type="float")    # 2. (float)
>>> t6 = sppasTag("true", tag_type="bool")  # True (bool)
>>> t7 = sppasTag(0, tag_type="bool")       # False (bool)
>>> t8 = sppasTag((27, 32), tag_type="point")  # x=27, y=32 (point)
>>> t9 = sppasTag((27, 32, 320, 200), tag_type="rect")
- TAG_TYPES = ('str', 'float', 'int', 'bool', 'point', 'rect')¶
- __init__(tag_content, tag_type=None)[source]¶
Initialize a new sppasTag instance.
- Parameters
tag_content – (any) Data content
tag_type – (str): The type of this content. One of: (‘str’, ‘int’, ‘float’, ‘bool’, ‘point’, ‘rect’).
‘str’ is the default tag_type.
- get_content()[source]¶
Return an unicode string corresponding to the content.
Also returns a unicode string in case of a list (elements are separated by a whitespace).
- Returns
(unicode)
- class anndata.sppasTagCompare[source]¶
Bases:
sppas.src.structs.basecompare.sppasBaseCompare
Comparison methods for sppasTag.
The tags of a label can be of 3 types in anndata (str, num, bool); this class provides different comparison methods depending on the type of the tags.
- Example
Three different ways to compare a tag content to a given string
>>> tc = sppasTagCompare()
>>> tc.exact(sppasTag("abc"), u("abc"))
>>> tc.methods['exact'](sppasTag("abc"), u("abc"))
>>> tc.get('exact')(sppasTag("abc"), u("abc"))
- static bool(tag, x)[source]¶
Return True if boolean value of the tag is equal to boolean x.
- Parameters
tag – (sppasTag) Tag to compare.
x – (bool)
- Returns
(bool)
- Raises
AnnDataTypeError
- static contains(tag, text)[source]¶
Test if the first text contains the second text.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static endswith(tag, text)[source]¶
Test if first text ends with the characters of the second text.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static equal(tag, x)[source]¶
Return True if numerical value of the tag is equal to x.
- Parameters
tag – (sppasTag) Tag to compare.
x – (int, float)
- Returns
(bool)
- Raises
AnnDataTypeError
- static exact(tag, text)[source]¶
Test if two texts strictly contain the same characters.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static greater(tag, x)[source]¶
Return True if numerical value of the tag is greater than x.
- Parameters
tag – (sppasTag) Tag to compare.
x – (int, float)
- Returns
(bool)
- Raises
AnnDataTypeError
- static icontains(tag, text)[source]¶
Case-insensitive contains.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static iendswith(tag, text)[source]¶
Case-insensitive endswith.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static iexact(tag, text)[source]¶
Case-insensitive exact.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static istartswith(tag, text)[source]¶
Case-insensitive startswith.
- Parameters
tag – (sppasTag) Tag to compare.
text – (unicode) Unicode string to be compared with.
- Returns
(bool)
- Raises
AnnDataTypeError
- static lower(tag, x)[source]¶
Return True if numerical value of the tag is lower than x.
- Parameters
tag – (sppasTag) Tag to compare.
x – (int, float)
- Returns
(bool)
- Raises
AnnDataTypeError
- class anndata.sppasTier(name=None, ctrl_vocab=None, media=None, parent=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Representation of a tier, a structured set of annotations.
Annotations of a tier are sorted depending on their location (from lowest to highest).
A Tier is made of:
a name (used to identify the tier),
a set of metadata,
an array of annotations,
a controlled vocabulary (optional),
a media (optional),
a parent (optional).
- __init__(name=None, ctrl_vocab=None, media=None, parent=None)[source]¶
Create a new sppasTier instance.
- Parameters
name – (str) Name of the tier. It is used as identifier.
ctrl_vocab – (sppasCtrlVocab)
media – (sppasMedia)
parent – (sppasTranscription)
- add(annotation)[source]¶
Add an annotation to the tier in sorted order.
Assign this tier as parent to the annotation.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- Returns
the index of the annotation in the tier
- append(annotation)[source]¶
Append the given annotation at the end of the tier.
Assign this tier as parent to the annotation.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError, TierAppendError
- create_annotation(location, labels=None)[source]¶
Create and add a new annotation into the tier.
- Parameters
location – (sppasLocation) the location(s) where the annotation happens
labels – (sppasLabel, list) the label(s) to stamp this annot.
- Returns
sppasAnnotation
- create_annotation_after(idx)[source]¶
Create and add a new annotation in the hole after idx.
- Parameters
idx – (int) Index of an existing annotation
- Returns
sppasAnnotation
- Raises
AnnDataTypeError, AnnDataIndexError
- create_annotation_before(idx)[source]¶
Create and add a new annotation in the hole before idx.
- Parameters
idx – (int) Index of an existing annotation
- Returns
sppasAnnotation
- Raises
AnnDataTypeError, AnnDataIndexError
- create_ctrl_vocab(name=None)[source]¶
Create the controlled vocabulary from annotation labels.
Create (or re-create) the controlled vocabulary from the list of already existing annotation labels. The current controlled vocabulary is deleted.
- Parameters
name – (str) Name of the controlled vocabulary. The name of the tier is used by default.
- export_to_intervals(separators)[source]¶
Create a tier with the consecutive filled intervals.
Return an empty tier if ‘self’ is not of type “interval”. The created intervals are not filled.
- Parameters
separators – (list)
- Returns
(sppasTier)
- export_unfilled()[source]¶
Create a tier with the unlabelled/unfilled intervals.
Only for tiers of type Interval. It represents the “NOT tier”, i.e. where this tier is not annotated.
IMPORTANT: Never tested with overlapped annotations, actually not tested at all (but used in the plugin StatGroups).
- Returns
(sppasTier) or None
- find(begin, end, overlaps=True, indexes=False)[source]¶
Return a list of annotations between begin and end.
- Parameters
begin – sppasPoint or None to start from the beginning of the tier
end – sppasPoint or None to end at the end of the tier
overlaps – (bool) Return also overlapped annotations. Not relevant for tiers with points.
indexes – (bool) Return indexes instead of annotations
- Returns
List of sppasAnnotation or list of indexes
- fit(other)[source]¶
Select then slice or extend annotations to fit in other tier.
Keep only the annotations of self that have some overlapping time with the given other tier and slice the localization of such selected annotations to exactly match those of the other tier.
Example:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
tier1: |_a_|_b_| |_c_| |_d_| |_e_| |f|
tier2: |_w_| |_______x_______| |_y_| |_z_|
- Parameters
other – (sppasTier)
- Returns
(sppasTier)
- get_annotation(identifier)[source]¶
Find an annotation from its metadata ‘id’.
- Parameters
identifier – (str) Metadata ‘id’ of an annotation.
- Returns
sppasAnnotation or None
- get_annotation_index(ann)[source]¶
Find an annotation.
- Parameters
ann – (sppasAnnotation)
- Returns
(int) -1 if not found
- has_point(point)[source]¶
Return True if the tier contains a given point.
- Parameters
point – (sppasPoint) The point to find in the tier.
- Returns
(bool)
- index(moment)[source]¶
Return the index of the moment (int), or -1.
Only for tier with points.
- Parameters
moment – (sppasPoint)
- is_superset(other)[source]¶
Return True if this tier contains all points of the other tier.
- Parameters
other – (sppasTier)
- Returns
Boolean
- lindex(moment)[source]¶
Return the index of the interval starting at a given moment, or -1.
Only for tier with intervals or disjoint. If the tier contains more than one annotation starting at the same moment, the method returns the first one.
- Parameters
moment – (sppasPoint)
- merge(idx, direction)[source]¶
Merge the annotation at given index with next or previous one.
- if direction > 0:
ann_idx: [begin_idx, end_idx, labels_idx]
next_ann: [begin_n, end_n, labels_n]
result: [begin_idx, end_n, labels_idx + labels_n]
- if direction < 0:
prev_ann: [begin_p, end_p, labels_p]
ann_idx: [begin_idx, end_idx, labels_idx]
result: [begin_p, end_idx, labels_p + labels_idx]
- Parameters
idx – (int) Index of the annotation in the list
direction – (int) Positive for next, Negative for previous
- Returns
(bool) False if direction does not match with index
- Raises
Exception if merged annotation can’t be deleted from the tier
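The merge rule for both directions can be sketched on plain lists. This is a hypothetical illustration, not the SPPAS code: each annotation is reduced to a list [begin, end, labels], the merge is done in place, and keeping the result at the lower of the two indexes is an assumption of the sketch.

```python
def merge_sketch(annotations, idx, direction):
    """Merge annotations[idx] with its next or previous neighbour, in place.

    annotations: list of [begin, end, labels], sorted by time
    idx: index of the annotation to merge
    direction: positive for the next one, negative for the previous one
    Returns False if there is no neighbour in that direction.
    """
    if direction == 0 or not 0 <= idx < len(annotations):
        return False
    other = idx + 1 if direction > 0 else idx - 1
    if not 0 <= other < len(annotations):
        return False
    # 'first' is the earlier annotation, 'second' the later one.
    first, second = sorted((idx, other))
    begin, _, labels_first = annotations[first]
    _, end, labels_second = annotations[second]
    # Keep the begin of the earlier one, the end of the later one,
    # and concatenate the labels in time order.
    annotations[first] = [begin, end, labels_first + labels_second]
    del annotations[second]
    return True


tier = [[0, 1, ["a"]], [1, 2, ["b"]], [2, 3, ["c"]]]
print(merge_sketch(tier, 0, 1))   # True
print(tier)                       # [[0, 2, ['a', 'b']], [2, 3, ['c']]]
print(merge_sketch(tier, 0, -1))  # False: no previous annotation
```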
- mindex(moment, bound=0)[source]¶
Return index of the interval containing the given moment.
Only for tier with intervals or disjoint.
If the tier contains more than one annotation at the same moment, the method returns the first one (i.e. the one that started first).
- Parameters
moment – (sppasPoint)
bound – (int)
0 to exclude bounds of the interval;
-1 to include begin bound;
+1 to include end bound;
+2 to include both begin/end bounds;
others: the midpoint of moment is strictly inside
- Returns
(int) Index of the 1st annotation containing moment or -1
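The effect of the bound parameter can be sketched on intervals given as (begin, end) pairs of numbers. The two functions below are hypothetical, ignore the radius of the points, and assume the intervals are sorted by their begin value so that the first match is the one that started first.

```python
def contains_moment_sketch(begin, end, moment, bound=0):
    """Return True if moment is inside [begin, end] for the given bound."""
    if bound == -1:
        return begin <= moment < end   # begin bound included
    if bound == 1:
        return begin < moment <= end   # end bound included
    if bound == 2:
        return begin <= moment <= end  # both bounds included
    return begin < moment < end        # 0 or others: strictly inside


def mindex_sketch(intervals, moment, bound=0):
    """Return the index of the first interval containing moment, or -1."""
    for i, (begin, end) in enumerate(intervals):
        if contains_moment_sketch(begin, end, moment, bound):
            return i
    return -1


intervals = [(0, 1), (1, 2), (2, 3)]
print(mindex_sketch(intervals, 1.5))           # 1
print(mindex_sketch(intervals, 1))             # -1
print(mindex_sketch(intervals, 1, bound=-1))   # 1
print(mindex_sketch(intervals, 1, bound=1))    # 0
```

A moment that falls exactly on a shared boundary is therefore found in the left or the right interval depending on which bound is included.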
- near(moment, direction=1)[source]¶
Search for the annotation whose localization is closest.
Search for the nearest localization to the given moment into a given direction.
- Parameters
moment – (sppasPoint)
direction – (int) 0 for the nearest, 1 for the nearest forward, -1 for the nearest backward
- pop(index=-1)[source]¶
Remove the annotation at the given position in the tier.
If no index is specified, pop() removes and returns the last annotation in the tier.
- Parameters
index – (int) Index of the annotation to remove.
- Raises
HierarchyContainsError
- remove(begin, end, overlaps=False)[source]¶
Remove annotation intervals between begin and end.
- Parameters
begin – (sppasPoint)
end – (sppasPoint)
overlaps – (bool)
- Returns
the number of removed annotations
- Raises
HierarchyContainsError
- remove_unlabelled()[source]¶
Remove annotations without labels.
Do not remove an annotation if it invalidates the hierarchy.
- Returns
the number of removed annotations
- rindex(moment)[source]¶
Return the index of the interval ending at the given moment.
Only for tier with intervals or disjoint. If the tier contains more than one annotation ending at the same moment, the method returns the last one.
- Parameters
moment – (sppasPoint)
- set_ctrl_vocab(ctrl_vocab=None)[source]¶
Set a controlled vocabulary to this tier.
- Parameters
ctrl_vocab – (sppasCtrlVocab or None)
- Raises
AnnDataTypeError, CtrlVocabContainsError
- set_media(media)[source]¶
Set a media to the tier.
- Parameters
media – (sppasMedia)
- Raises
AnnDataTypeError
- set_name(name=None)[source]¶
Set the name of the tier.
If no name is given, a GUID is randomly assigned. Important: An empty string is accepted.
- Parameters
name – (str) The identifier name or None.
- Returns
the formatted name
- set_radius(radius)[source]¶
Fix a radius value to all points of the tier.
- Parameters
radius – (int, float) New radius value
- Raises
AnnDataTypeError, AnnDataNegValueError
- split(idx)[source]¶
Split annotation at the given index into 2 annotations.
- Parameters
idx – (int) Index of the annotation to split.
- Returns
newly created annotation at index idx+1
- validate_annotation(annotation)[source]¶
Validate the annotation and set its parent to this tier.
- Parameters
annotation – (sppasAnnotation)
- Raises
AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError
- class anndata.sppasTranscription(name=None)[source]¶
Bases:
anndata.metadata.sppasMetaData
Representation of a transcription, the root in our framework.
Transcriptions in SPPAS are represented with:
metadata: a list of tuple key/value;
a name (used to identify the transcription);
a list of tiers;
a hierarchy between tiers;
a list of media;
a list of controlled vocabularies.
Inter-tier relations are managed by establishing alignment or association links between 2 tiers:
alignment: annotations of a tier A (child) have only localization instances included in those of annotations of tier B (parent);
association: annotations of a tier A have exactly localization instances included in those of annotations of tier B.
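One way to read the two link types is as set relations between the time points of the tiers. The sketch below assumes that reading: alignment is modelled as "the child's boundaries are a subset of the parent's" and association as "both tiers have the same boundaries". The helper names are hypothetical and the tiers are plain lists of (begin, end) tuples, not sppasTier objects.

```python
def boundaries(intervals):
    """Collect every begin and end point of a list of (begin, end) tuples."""
    points = set()
    for begin, end in intervals:
        points.add(begin)
        points.add(end)
    return points


def is_alignment(parent, child):
    """Child boundaries must all exist in the parent (subset test)."""
    return boundaries(child) <= boundaries(parent)


def is_association(parent, child):
    """Child and parent must share exactly the same boundaries."""
    return boundaries(child) == boundaries(parent)


phonemes = [(0.0, 0.4), (0.4, 1.0), (1.0, 1.6)]   # parent tier
tokens = [(0.0, 1.0), (1.0, 1.6)]                 # child tier
aligned = is_alignment(phonemes, tokens)
associated = is_association(phonemes, tokens)
```

With these values the tokens tier is aligned with the phonemes tier but not associated with it, because the boundary at 0.4 exists only in the parent.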
- Example
>>> # Create an instance
>>> trs = sppasTranscription("trs name")
>>> # Create a tier
>>> trs.create_tier("tier name")
>>> # Get a tier of a transcription from its index
>>> tier = trs[0]
>>> # Get a tier of a transcription from its name
>>> tier = trs.find("tier name")
>>> # Get a tier from its identifier
>>> tier = trs.get_object(guid)
- __init__(name=None)[source]¶
Create a new sppasTranscription instance.
- Parameters
name – (str) Name of the transcription.
- add_ctrl_vocab(new_ctrl_vocab)[source]¶
Add a new controlled vocabulary to the list of controlled vocabularies.
- Parameters
new_ctrl_vocab – (sppasCtrlVocab)
- Raises
AnnDataTypeError, TrsAddError
- add_hierarchy_link(link_type, parent_tier, child_tier)[source]¶
Validate and add a hierarchy link between 2 tiers.
- Parameters
link_type – (constant) One of the hierarchy types
parent_tier – (Tier) The reference tier
child_tier – (Tier) The child tier to be linked to parent_tier
- add_media(new_media)[source]¶
Add a new media to the list of media.
Does not add the media if a media with the same id is already in self.
- Parameters
new_media – (sppasMedia) The media to add.
- Raises
AnnDataTypeError, TrsAddError
- create_tier(name=None, ctrl_vocab=None, media=None)[source]¶
Create and append a new empty tier.
- Parameters
name – (str) the name of the tier to create
ctrl_vocab – (sppasCtrlVocab)
media – (sppasMedia)
- Returns
newly created empty tier
- find(name, case_sensitive=True)[source]¶
Find a tier from its name.
- Parameters
name – (str) EXACT name of the tier
case_sensitive – (bool)
- Returns
sppasTier or None
- find_id(tier_id)[source]¶
Find a tier from its identifier.
- Parameters
tier_id – (str) Exact identifier of the tier
- Returns
sppasTier or None
- get_ctrl_vocab_from_id(ctrl_vocab_id)[source]¶
Return a sppasCtrlVocab from its id or None.
- Parameters
ctrl_vocab_id – (str) Identifier name of a ctrl vocab
- get_ctrl_vocab_from_name(ctrl_vocab_name)[source]¶
Return a sppasCtrlVocab from its name or None.
- Parameters
ctrl_vocab_name – (str) Identifier name of a ctrl vocabulary
- get_media_from_id(media_id)[source]¶
Return a sppasMedia from its id or None.
- Parameters
media_id – (str) Identifier name of a media
- get_object(identifier)[source]¶
Return the object matching the given identifier.
- Parameters
identifier – (GUID)
- get_tier_from_id(tier_id)[source]¶
Return a sppasTier from its id or None.
- Parameters
tier_id – (str) Identifier name of a tier
- get_tier_index(name, case_sensitive=True)[source]¶
Get the index of a tier from its name.
- Parameters
name – (str) EXACT name of the tier
case_sensitive – (bool)
- Returns
index or -1 if not found
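The lookup contract (exact name, optional case folding, -1 when absent) can be sketched on a plain list of tier names. The function name is hypothetical, and returning the first match when several tiers share a name is an assumption of this sketch.

```python
def get_tier_index_sketch(names, name, case_sensitive=True):
    """Return the index of the first tier name equal to name, or -1."""
    for i, candidate in enumerate(names):
        if case_sensitive:
            if candidate == name:
                return i
        elif candidate.lower() == name.lower():
            return i
    # Mirrors the documented "-1 if not found" convention.
    return -1


tier_names = ["Tokens", "PhonAlign", "tokens"]
exact = get_tier_index_sketch(tier_names, "tokens")
folded = get_tier_index_sketch(tier_names, "tokens", case_sensitive=False)
absent = get_tier_index_sketch(tier_names, "Syllables")
```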
- get_tier_index_id(identifier)[source]¶
Get the index of a tier from its id.
- Parameters
identifier – (str) GUID
- Returns
index or -1 if not found
- pop(index=-1)[source]¶
Remove the tier at the given position in the transcription.
Return it. If no index is specified, pop() removes and returns the last tier in the transcription.
- Parameters
index – (int) Index of the tier to remove.
- Returns
(sppasTier)
- Raises
AnnDataIndexError
- remove_ctrl_vocab(old_ctrl_vocab)[source]¶
Remove a controlled vocabulary from the list of controlled vocabularies.
- Parameters
old_ctrl_vocab – (sppasCtrlVocab)
- Raises
AnnDataTypeError, TrsRemoveError
- remove_media(old_media)[source]¶
Remove a media from the list of media.
- Parameters
old_media – (sppasMedia)
- Raises
AnnDataTypeError, TrsRemoveError
- rename_tier(tier)[source]¶
Rename a tier by appending a digit.
- Parameters
tier – (sppasTier) The tier to rename.
- set_ctrl_vocab_list(ctrl_vocab_list)[source]¶
Set the list of controlled vocabularies.
- Parameters
ctrl_vocab_list – (list)
- Returns
list of rejected ctrl_vocab
- set_media_list(media_list)[source]¶
Set the list of media.
- Parameters
media_list – (list)
- Returns
list of rejected media
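Both list setters return the entries that were rejected. A minimal sketch of that pattern is shown below, assuming that an entry is rejected when it has no identifier or when its identifier was already seen (the duplicate rule is the one stated for add_media). Entries are modelled as plain dicts with an "id" key, which is an assumption of this sketch and not the sppasMedia type.

```python
def set_list_sketch(items):
    """Return (accepted, rejected) after filtering invalid or duplicate items."""
    accepted = []
    rejected = []
    seen_ids = set()
    for item in items:
        # Assumed validity rule: a dict carrying an "id" entry.
        ident = item.get("id") if isinstance(item, dict) else None
        if ident is None or ident in seen_ids:
            rejected.append(item)
        else:
            seen_ids.add(ident)
            accepted.append(item)
    return accepted, rejected


media = [{"id": "m1"}, {"id": "m2"}, {"id": "m1"}, "not a media"]
kept, refused = set_list_sketch(media)
```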
- set_name(name=None)[source]¶
Set the name of the transcription.
- Parameters
name – (str or None) The identifier or None to set the GUID.
- Returns
the name
- set_tier_index(name, new_index, case_sensitive=True)[source]¶
Set the index of a tier from its name.
THIS SHOULD NEVER BE USED. USE TIER IDENTIFIER INSTEAD OF ITS NAME.
- Parameters
name – (str) EXACT name of the tier
new_index – (int) New index of the tier in self
case_sensitive – (bool)
- Returns
index or -1 if not found
- set_tier_index_id(identifier, new_index)[source]¶
Set the index of a tier from its identifier.
- Parameters
identifier – (str)
new_index – (int) New index of the tier in self
- Returns
index or -1 if not found
- shift(delay)[source]¶
Shift the location of all annotations by a given delay.
- Parameters
delay – (int, float) delay to shift all localizations
- Raises
AnnDataTypeError
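The effect of a shift can be sketched on tiers modelled as lists of (begin, end) tuples. The built-in TypeError stands in for AnnDataTypeError here, and returning new lists instead of modifying the tiers in place is a choice of this sketch, not a statement about the library.

```python
def shift_sketch(tiers, delay):
    """Return a copy of tiers with every localization moved by delay."""
    if not isinstance(delay, (int, float)):
        # Stand-in for the AnnDataTypeError listed above.
        raise TypeError("delay must be an int or a float")
    return [
        [(begin + delay, end + delay) for begin, end in tier]
        for tier in tiers
    ]


tiers = [[(0.0, 1.0), (1.0, 2.0)], [(0.5, 1.5)]]
shifted = shift_sketch(tiers, 0.5)
```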
- property tiers¶
Return the list of tiers.
- class anndata.sppasTrsRW(filename)[source]¶
Bases:
object
Main parser of annotated data: Reader and writer of annotated data.
All 3 types of annotated files are supported: ANNOT, MEASURE, TABLE.
- TRANSCRIPTION_TYPES = {'IntensityTier': <class 'anndata.aio.praat.sppasIntensityTier'>, 'PitchTier': <class 'anndata.aio.praat.sppasPitchTier'>, 'TextGrid': <class 'anndata.aio.praat.sppasTextGrid'>, 'ant': <class 'anndata.aio.annotationpro.sppasANT'>, 'antx': <class 'anndata.aio.annotationpro.sppasANTX'>, 'anvil': <class 'anndata.aio.anvil.sppasAnvil'>, 'arff': <class 'anndata.aio.table.sppasARFF'>, 'aup': <class 'anndata.aio.audacity.sppasAudacity'>, 'csv': <class 'anndata.aio.text.sppasCSV'>, 'ctm': <class 'anndata.aio.sclite.sppasCTM'>, 'eaf': <class 'anndata.aio.elan.sppasEAF'>, 'hz': <class 'anndata.aio.phonedit.sppasSignaix'>, 'lab': <class 'anndata.aio.htk.sppasLab'>, 'mrk': <class 'anndata.aio.phonedit.sppasMRK'>, 'srt': <class 'anndata.aio.subtitle.sppasSubRip'>, 'stm': <class 'anndata.aio.sclite.sppasSTM'>, 'sub': <class 'anndata.aio.subtitle.sppasSubViewer'>, 'tdf': <class 'anndata.aio.xtrans.sppasTDF'>, 'tra': <class 'anndata.aio.table.sppasTRA'>, 'trs': <class 'anndata.aio.transcriber.sppasTRS'>, 'txt': <class 'anndata.aio.text.sppasRawText'>, 'vtt': <class 'anndata.aio.subtitle.sppasWebVTT'>, 'xra': <class 'anndata.aio.xra.sppasXRA'>, 'xrff': <class 'anndata.aio.table.sppasXRFF'>}¶
- static create_trs_from_extension(filename)[source]¶
Return a transcription according to a given filename.
Only the extension of the filename is used.
- Parameters
filename – (str)
- Returns
Transcription()
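Extension-based dispatch of this kind usually comes down to a dictionary lookup. The sketch below uses a small hypothetical mapping from lowercase extensions to reader class names (a subset of the names shown in TRANSCRIPTION_TYPES, stored as strings). Case-insensitive matching and returning None for an unknown extension are assumptions of this sketch.

```python
import os

# Hypothetical subset of the extension table, keyed in lowercase.
READERS = {
    "textgrid": "sppasTextGrid",
    "eaf": "sppasEAF",
    "xra": "sppasXRA",
    "csv": "sppasCSV",
}


def reader_name_from_extension(filename):
    """Pick a reader name using only the extension of filename."""
    # splitext keeps the leading dot, so drop the first character.
    extension = os.path.splitext(filename)[1][1:]
    return READERS.get(extension.lower())


praat = reader_name_from_extension("sample.TextGrid")
elan = reader_name_from_extension("corpus/session1.eaf")
unknown = reader_name_from_extension("README")
```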
- static create_trs_from_heuristic(filename)[source]¶
Return a transcription according to a given filename.
The given file is opened and a heuristic is used to determine its format.
- Parameters
filename – (str)
- Returns
Transcription()