anndata package

Subpackages

Submodules

anndata.anndataexc module

filename

sppas.src.anndata.anndataexc.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Exceptions for the package anndata.

exception anndata.anndataexc.AioEmptyTierError(file_format, tier_name)[source]

Bases: OSError

:ERROR 1525:.

The file format {!s:s} does not support to save empty tiers: {:s}.

__init__(file_format, tier_name)[source]
exception anndata.anndataexc.AioEncodingError(filename, error_msg, encoding='utf-8')[source]

Bases: UnicodeDecodeError

:ERROR 1500:.

The file {filename} contains non {encoding} characters: {error}.

__init__(filename, error_msg, encoding='utf-8')[source]
exception anndata.anndataexc.AioError(filename)[source]

Bases: OSError

:ERROR 1400:.

No such file: ‘{!s:s}’.

__init__(filename)[source]
exception anndata.anndataexc.AioFormatError(line)[source]

Bases: OSError

:ERROR 1521:.

Unexpected format about ‘{!s:s}’.

__init__(line)[source]
exception anndata.anndataexc.AioLineFormatError(number, line)[source]

Bases: OSError

:ERROR 1520:.

Unexpected format string at line {:d}: ‘{!s:s}’.

__init__(number, line)[source]
exception anndata.anndataexc.AioLocationTypeError(file_format, location_type)[source]

Bases: TypeError

:ERROR 1530:.

The file format {!s:s} does not support tiers with {:s}.

__init__(file_format, location_type)[source]
exception anndata.anndataexc.AioMultiTiersError(file_format)[source]

Bases: OSError

:ERROR 1510:.

The file format {!s:s} does not support multi-tiers.

__init__(file_format)[source]
exception anndata.anndataexc.AioNoTiersError(file_format)[source]

Bases: OSError

:ERROR 1515:.

The file format {!s:s} does not support to save no tiers.

__init__(file_format)[source]
exception anndata.anndataexc.AnnDataEqError(v1, v2)[source]

Bases: Exception

:ERROR 1010:.

Values are expected to be equals but are {!s:s} and {!s:s}.

__init__(v1, v2)[source]
exception anndata.anndataexc.AnnDataEqTypeError(obj, obj_ref)[source]

Bases: TypeError

:ERROR 1105:.

{!s:s} is not of the same type as {!s:s}.

__init__(obj, obj_ref)[source]
exception anndata.anndataexc.AnnDataError[source]

Bases: Exception

:ERROR 1000:.

No annotated data file is defined.

__init__()[source]
exception anndata.anndataexc.AnnDataIndexError(index)[source]

Bases: IndexError

:ERROR 1200:.

Invalid index value {:d}.

__init__(index)[source]
exception anndata.anndataexc.AnnDataKeyError(data_name, value)[source]

Bases: KeyError

:ERROR 1250:.

Invalid key ‘{!s:s}’ for data ‘{!s:s}’.

__init__(data_name, value)[source]
exception anndata.anndataexc.AnnDataNegValueError(value)[source]

Bases: ValueError

:ERROR 1310:.

Expected a positive value. Got ‘{:f}’.

__init__(value)[source]
exception anndata.anndataexc.AnnDataTypeError(rtype, expected)[source]

Bases: TypeError

:ERROR 1100:.

{!s:s} is not of the expected type ‘{:s}’.

__init__(rtype, expected)[source]
exception anndata.anndataexc.AnnDataValueError(data_name, value)[source]

Bases: ValueError

:ERROR 1300:.

Invalid value ‘{!s:s}’ for ‘{!s:s}’.

__init__(data_name, value)[source]
exception anndata.anndataexc.AnnUnkTypeError(rtype)[source]

Bases: TypeError

:ERROR 1050:.

{!s:s} is not a valid type.

__init__(rtype)[source]
exception anndata.anndataexc.CtrlVocabContainsError(tag)[source]

Bases: ValueError

:ERROR 1130:.

{:s} is not part of the controlled vocabulary.

__init__(tag)[source]
exception anndata.anndataexc.CtrlVocabSetTierError(vocab_name, tier_name)[source]

Bases: ValueError

:ERROR 1132:.

The controlled vocabulary {:s} can’t be associated to the tier {:s}.

__init__(vocab_name, tier_name)[source]
exception anndata.anndataexc.HierarchyAlignmentError(parent_tier_name, child_tier_name)[source]

Bases: ValueError

:ERROR 1170:.

Can’t create a time alignment between tiers: ‘{:s}’ is not a superset of ‘{:s}’.

__init__(parent_tier_name, child_tier_name)[source]
exception anndata.anndataexc.HierarchyAncestorTierError(child_tier_name, parent_tier_name)[source]

Bases: ValueError

:ERROR 1178:.

The tier can’t be added into the hierarchy: ‘{:s}’ is an ancestor of ‘{:s}’.

__init__(child_tier_name, parent_tier_name)[source]
exception anndata.anndataexc.HierarchyAssociationError(parent_tier_name, child_tier_name)[source]

Bases: ValueError

:ERROR 1172:.

Can’t create a time association between tiers: ‘{:s}’ and ‘{:s}’ are not supersets of each other.

__init__(parent_tier_name, child_tier_name)[source]
exception anndata.anndataexc.HierarchyChildTierError(tier_name)[source]

Bases: ValueError

:ERROR 1176:.

The tier ‘{:s}’ can’t be added into the hierarchy: a tier can’t be its own child.

__init__(tier_name)[source]
exception anndata.anndataexc.HierarchyParentTierError(child_tier_name, parent_tier_name, link_type)[source]

Bases: ValueError

:ERROR 1174:.

The tier can’t be added into the hierarchy: ‘{:s}’ has already a link of type {:s} with its parent tier ‘{:s}’.

__init__(child_tier_name, parent_tier_name, link_type)[source]
exception anndata.anndataexc.IntervalBoundsError(begin, end)[source]

Bases: ValueError

:ERROR 1120:.

The begin must be strictly lesser than the end in an interval. Got: [{:s};{:s}].

__init__(begin, end)[source]
exception anndata.anndataexc.TagValueError(tag_str)[source]

Bases: ValueError

:ERROR 1190:.

{!s:s} is not a valid tag.

__init__(tag_str)[source]
exception anndata.anndataexc.TierAddError(index)[source]

Bases: ValueError

:ERROR 1142:.

Can’t add annotation. An annotation with the same location is already in the tier at index {:d}.

__init__(index)[source]
exception anndata.anndataexc.TierAppendError(cur_end, ann_end)[source]

Bases: ValueError

:ERROR 1140:.

Can’t append annotation. Current end {!s:s} is highest than the given one {!s:s}.

__init__(cur_end, ann_end)[source]
exception anndata.anndataexc.TierHierarchyError(name)[source]

Bases: ValueError

:ERROR 1144:.

Attempt a modification in tier ‘{:s}’ that invalidates its hierarchy.

__init__(name)[source]
exception anndata.anndataexc.TrsAddError(tier_name, transcription_name)[source]

Bases: ValueError

:ERROR 1150:.

Can’t add: ‘{:s}’ is already in ‘{:s}’.

__init__(tier_name, transcription_name)[source]
exception anndata.anndataexc.TrsInvalidTierError(tier_name, transcription_name)[source]

Bases: ValueError

:ERROR 1160:.

{:s} is not a tier of {:s}. It can’t be included in its hierarchy.

__init__(tier_name, transcription_name)[source]
exception anndata.anndataexc.TrsRemoveError(tier_name, transcription_name)[source]

Bases: ValueError

:ERROR 1152:.

Can’t remove: ‘{:s}’ is not in ‘{:s}’.

__init__(tier_name, transcription_name)[source]
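
The exceptions listed above share one pattern: each derives from a built-in exception, carries a numeric error code and fills a message template with its arguments. The sketch below illustrates that pattern only. `SketchIndexError`, its `CODE` and `TEMPLATE` attributes, the `get_item` helper and the way the code is prefixed to the message are assumptions made for illustration, not the SPPAS implementation.

```python
class SketchIndexError(IndexError):
    """Hypothetical stand-in for an exception such as AnnDataIndexError."""

    CODE = 1200                              # documented as ':ERROR 1200:'
    TEMPLATE = "Invalid index value {:d}."   # documented message template

    def __init__(self, index):
        # Assumption: the error code is prefixed to the formatted message.
        self.parameter = ":ERROR {:d}: ".format(self.CODE) + self.TEMPLATE.format(index)
        super().__init__(self.parameter)

    def __str__(self):
        return self.parameter


def get_item(items, index):
    """Return items[index], or raise the sketch exception if out of range."""
    if not 0 <= index < len(items):
        raise SketchIndexError(index)
    return items[index]


try:
    get_item(["a", "b"], 5)
except IndexError as exc:   # caught through the built-in base class
    message = str(exc)

# '!s' applies str() before the 's' format, so any object can be inserted.
converted = "No such file: '{!s:s}'.".format(42)
```

Because the class derives from IndexError, code that already handles the built-in exception keeps working.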

anndata.ctrlvocab module

filename

sppas.src.anndata.ctrlvocab.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Represent a controlled vocabulary.

class anndata.ctrlvocab.sppasCtrlVocab(name, description='')[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a controlled vocabulary.

A controlled vocabulary is a set of tags. It restricts the tags that can be used in a label: only the accepted tags can be set to a label.

A controlled vocabulary is made of an identifier name, a description and a list of pairs tag/description.
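
As a rough illustration of the structure described above, the following sketch stores the tag/description pairs in a dictionary. `SketchCtrlVocab` is a hypothetical stand-in: tags are plain strings instead of sppasTag objects, and the meaning given to the Boolean return values (True when something changed) is an assumption.

```python
class SketchCtrlVocab:
    """Hypothetical stand-in: an identifier name, a description and tags."""

    def __init__(self, name, description=""):
        self.name = name
        self.description = description
        self._entries = {}   # tag content -> description of the tag

    def contains(self, tag):
        # Compare the content of the tag, not the identity of the object.
        return tag in self._entries

    def add(self, tag, description=""):
        """Return True if the tag was added, False if already present."""
        if self.contains(tag):
            return False
        self._entries[tag] = description
        return True

    def remove(self, tag):
        """Return True if the tag was removed, False if it was absent."""
        return self._entries.pop(tag, None) is not None


vocab = SketchCtrlVocab("intonation", "rising or falling contours")
added_first = vocab.add("rise", "rising contour")
added_twice = vocab.add("rise")
# A different object with the same content is still accepted.
label_ok = vocab.contains("".join(["ri", "se"]))
```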

__init__(name, description='')[source]

Create a new sppasCtrlVocab instance.

Parameters
  • name – (str) Identifier name of the controlled vocabulary

  • description – (str)

add(tag, description='')[source]

Add a tag to the controlled vocab.

Parameters
  • tag – (sppasTag): the tag to add.

  • description – (str)

Returns

Boolean

contains(tag)[source]

Test if a tag is in the controlled vocabulary.

Note: the comparison is made on the data content of the tag, not on the instance.

Parameters

tag – (sppasTag) the tag to check.

Returns

Boolean

get_description()[source]

Return the unicode str of the description of the ctrl vocab.

get_name()[source]

Return the name of the controlled vocabulary.

get_tag_description(tag)[source]

Return the unicode string of the description of an entry.

Parameters

tag – (sppasTag) the tag whose description is returned.

Returns

(str)

remove(tag)[source]

Remove a tag from the controlled vocabulary.

Parameters

tag – (sppasTag) the tag to remove.

Returns

Boolean

set_description(description='')[source]

Set the description of the controlled vocabulary.

Parameters

description – (str)

set_tag_description(tag, description)[source]

Set the unicode string of the description of an entry.

Parameters
  • tag – (sppasTag) the tag whose description is set.

  • description – (str)

Returns

(str)

validate_tag(tag)[source]

Check if the given tag can be added to the ctrl vocabulary.

Parameters

tag – (sppasTag) the tag to check.

Returns

Boolean

anndata.hierarchy module

filename

sppas.src.anndata.hierarchy.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Represent a hierarchy, i.e. constraints among tiers.

class anndata.hierarchy.sppasHierarchy[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a hierarchy between tiers.

Two types of hierarchy are considered:

  • TimeAssociation: the points of a child tier are all equal to the points of a reference tier, as for example:

    parent: Words | l’ | âne | est | là |
    child: Lemmas | le | âne | être | là |
  • TimeAlignment: the points of a child tier are all included in the set of points of a reference tier, as for example:

    parent: Phonemes | l | a | n | e | l | a |
    child: Words | l’ | âne | est | là |

    parent: Phonemes | l | a | n | e | l | a |
    child: Syllables | l.a | n.e | l.a |

In that example, notice that there’s no hierarchy link between “Words” and “Syllables”, and that “Phonemes” is the grand-parent of “Lemmas”.

And the following obvious rules are applied:

  • A child can have ONLY ONE parent!

  • A parent can have as many children as wanted.

  • A hierarchy is a tree, not a graph.
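
The two link types can be pictured as set relations between the time points of two tiers. The sketch below is a minimal illustration under that reading, not the library's code: `infer_link_type` is a hypothetical helper, tiers are reduced to lists of boundary times, and the millisecond values are invented to match the example tiers.

```python
def infer_link_type(parent_points, child_points):
    """Return 'TimeAssociation', 'TimeAlignment' or '' (no possible link)."""
    parent, child = set(parent_points), set(child_points)
    if child == parent:
        return "TimeAssociation"   # same points in both tiers
    if child <= parent:
        return "TimeAlignment"     # child points all included in the parent
    return ""


# Hypothetical boundary times, in milliseconds.
phonemes = [0, 100, 200, 300, 400, 500, 600]   # | l | a | n | e | l | a |
words = [0, 100, 300, 400, 600]                # | l' | ane | est | la |
lemmas = [0, 100, 300, 400, 600]               # | le | ane | etre | la |
syllables = [0, 200, 400, 600]                 # | l.a | n.e | l.a |

words_in_phonemes = infer_link_type(phonemes, words)
lemmas_in_words = infer_link_type(words, lemmas)
syllables_in_words = infer_link_type(words, syllables)
```

With these values, words align on phonemes, lemmas are associated with words, and no link is possible between words and syllables.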

TODO: consider a time association that is not fully completed:

parent: Tokens | l’ | âne | euh | euh | est | là | @ |
child: Lemmas | le | âne | | être | là |
__init__()[source]

Create a new sppasHierarchy instance.

Validate and add a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

copy()[source]

Return a deep copy of the hierarchy.

get_ancestors(child_tier)[source]

Return all the direct ancestors of a tier.

Parameters

child_tier – (sppasTier)

Returns

List of tiers with parent, grand-parent, grand-grand-parent…
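
Since a child has only one parent, the ancestors can be collected by walking a child-to-parent mapping upwards. This is a sketch of that idea, assuming tiers are identified by their names; the `parent_of` dictionary and this standalone `get_ancestors` function are illustrative and do not reproduce the class internals.

```python
def get_ancestors(parent_of, child):
    """Return [parent, grand-parent, ...] by walking a child -> parent map."""
    ancestors = []
    current = parent_of.get(child)
    while current is not None:
        ancestors.append(current)
        current = parent_of.get(current)
    return ancestors


# Hypothetical links matching the example tiers of the class description.
parent_of = {"Lemmas": "Words", "Words": "Phonemes", "Syllables": "Phonemes"}
lemmas_ancestors = get_ancestors(parent_of, "Lemmas")
```

The loop terminates because the hierarchy is a tree; a cyclic mapping would have to be rejected when links are added.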

get_children(parent_tier, link_type=None)[source]

Return the list of children of a tier, for a given type.

Parameters
  • parent_tier – (sppasTier) The parent tier whose children are returned

  • link_type – (str) The type of hierarchy

Returns

List of tiers

get_hierarchy_type(child_tier)[source]

Return the hierarchy type between a child tier and its parent.

Returns

(str) one of the hierarchy types

get_parent(child_tier)[source]

Return the parent tier for a given child tier.

Parameters

child_tier – (sppasTier) The child tier whose parent is returned

static infer_hierarchy_type(tier1, tier2)[source]

Test if tier1 can be a parent tier for tier2.

Returns

One of the hierarchy types, or an empty string

remove_child(child_tier)[source]

Remove a hierarchy link between a parent and a child.

Parameters

child_tier – (sppasTier) The tier linked to a reference

remove_parent(parent_tier)[source]

Remove all hierarchy links between a parent and its children.

Parameters

parent_tier – (sppasTier) The parent tier

remove_tier(tier)[source]

Remove all occurrences of a tier inside the hierarchy.

Parameters

tier – (sppasTier) The tier to remove as parent or child.

types = {'TimeAlignment', 'TimeAssociation'}

Validate a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

Raises

AnnDataTypeError, HierarchyParentTierError, HierarchyChildTierError, HierarchyAncestorTierError, HierarchyAlignmentError, HierarchyAssociationError

static validate_time_alignment(parent_tier, child_tier)[source]

Validate a time alignment hierarchy link between 2 tiers.

Parameters
  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

Raises

HierarchyAlignmentError

static validate_time_association(parent_tier, child_tier)[source]

Validate a time association hierarchy link between 2 tiers.

Parameters
  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to the parent

Raises

HierarchyAssociationError

anndata.media module

filename

sppas.src.anndata.media.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Represent a media – a recording file.

class anndata.media.sppasMedia(filename, media_id=None, mime_type=None)[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a media file.

__init__(filename, media_id=None, mime_type=None)[source]

Create a new sppasMedia instance.

Parameters
  • filename – (str) File name of the media

  • media_id – (str) Identifier of the media

  • mime_type – (str) Mime type of the media

get_content()[source]

Return the content of the media.

get_filename()[source]

Return the URL of the media.

get_mime_type()[source]

Return the mime type of the media.

set_content(content)[source]

Set the content of the media.

Parameters

content – (str)

anndata.metadata module

filename

sppas.src.anndata.metadata.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Represent a set of metadata and an identifier.

class anndata.metadata.sppasDefaultMeta[source]

Bases: anndata.metadata.sppasMetaData

Dictionary of default meta data in SPPAS.

Many annotation tools use metadata, and each tool encodes it with its own formalism. The SPPAS aio API uses metadata to store information about the data that was read, so that it can be given back when the data is written, either in the same file format or when exporting to another one. This is possible only if a set of “generic” metadata names is fixed.

__init__()[source]

Instantiate a default set of meta data.

media()[source]

Add metadata related to a media.

For compatibility with sclite, xtrans, subtitle, elan, annotation pro.

speaker()[source]

Add metadata related to a speaker.

For compatibility with sclite, transcriber, xtrans, elan.

tier()[source]

Add metadata related to a tier.

For compatibility with audacity and annotation pro.

class anndata.metadata.sppasMetaData[source]

Bases: object

Dictionary of meta data including a required ‘id’.

Meta data keys and values are unicode strings.

__init__()[source]

Create a sppasMetaData instance.

Add a GUID-like in the dictionary of metadata, with key “id”.

add_annotator_metadata(name='', version='', version_date='')[source]

Add metadata about an annotator.

Parameters
  • name – (str)

  • version – (str)

  • version_date – (str)

TODO: CHECK IF KEYS ARE NOT ALREADY EXISTING.

add_language_metadata()[source]

Add metadata about the language (und).

TODO: CHECK IF KEYS NOT ALREADY EXISTING.

add_license_metadata(idx)[source]

Add metadata about the license applied to the object (GPLv3).

add_project_metadata()[source]

Add metadata about the project this object is included in.

Currently does not assign any value. TODO: CHECK IF KEYS NOT ALREADY EXISTING.

add_software_metadata()[source]

Add metadata about this software.

TODO: CHECK IF KEYS NOT ALREADY EXISTING.

gen_id()[source]

Re-generate an ‘id’.

get_id()[source]

Return the identifier of this object.

get_meta(entry, default='')[source]

Return the value of the given key.

Parameters
  • entry – (str) Entry to be checked as a key.

  • default – (str) Default value to return if entry is not a key.

Returns

(str) meta data value or default value

get_meta_keys()[source]

Return the list of metadata keys.

is_meta_key(entry)[source]

Check if an entry is a key in the list of metadata.

Parameters

entry – (str) Entry to check

Returns

(Boolean)

pop_meta(key)[source]

Remove a metadata entry given its key.

Parameters

key – (str)

set_meta(key, value)[source]

Set or update a metadata.

Parameters
  • key – (str) The key of the metadata.

  • value – (str) The value assigned to the key.

Both key and value are formatted and stored as unicode strings.
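
The behaviour described for this class (a dictionary of strings with a required ‘id’) can be sketched as follows. `SketchMetaData` is a hypothetical stand-in, and the use of `uuid.uuid4()` to produce the GUID-like identifier is an assumption.

```python
import uuid


class SketchMetaData:
    """Hypothetical stand-in: a dictionary of strings with a required 'id'."""

    def __init__(self):
        self._meta = {}
        self.gen_id()

    def gen_id(self):
        # Assumption: a random UUID serves as the GUID-like identifier.
        self._meta["id"] = str(uuid.uuid4())

    def get_id(self):
        return self._meta["id"]

    def set_meta(self, key, value):
        # Keys and values are converted to strings before being stored.
        self._meta[str(key)] = str(value)

    def get_meta(self, entry, default=""):
        return self._meta.get(entry, default)

    def is_meta_key(self, entry):
        return entry in self._meta


meta = SketchMetaData()
meta.set_meta("annotator_version", 4)
stored = meta.get_meta("annotator_version")
fallback = meta.get_meta("missing_key", "n/a")
```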

anndata.tier module

filename

sppas.src.anndata.tier.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Represent a tier, i.e. a layer with annotations.

class anndata.tier.sppasTier(name=None, ctrl_vocab=None, media=None, parent=None)[source]

Bases: anndata.metadata.sppasMetaData

Representation of a tier, a structured set of annotations.

Annotations of a tier are sorted depending on their location (from lowest to highest).

A Tier is made of:

  • a name (used to identify the tier),

  • a set of metadata,

  • an array of annotations,

  • a controlled vocabulary (optional),

  • a media (optional),

  • a parent (optional).

__init__(name=None, ctrl_vocab=None, media=None, parent=None)[source]

Create a new sppasTier instance.

Parameters
  • name – (str) Name of the tier. It is used as identifier.

  • ctrl_vocab – (sppasCtrlVocab)

  • media – (sppasMedia)

  • parent – (sppasTranscription)

add(annotation)[source]

Add an annotation to the tier in sorted order.

Assign this tier as parent to the annotation.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

Returns

the index of the annotation in the tier
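
Adding in sorted order and returning the insertion index can be sketched with a binary search. In this illustration, annotations are reduced to `(begin, end, label)` tuples and `add_sorted` is a hypothetical helper. Rejecting a duplicate location with a ValueError is an assumption that echoes the TierAddError message documented in anndata.anndataexc.

```python
import bisect


def add_sorted(annotations, new_ann):
    """Insert (begin, end, label) in location order and return its index."""
    locations = [(begin, end) for begin, end, _label in annotations]
    location = new_ann[:2]
    index = bisect.bisect_left(locations, location)
    # Assumption: an annotation with the same location is rejected.
    if index < len(locations) and locations[index] == location:
        raise ValueError(
            "An annotation with the same location is already in the tier "
            "at index {:d}.".format(index))
    annotations.insert(index, new_ann)
    return index


tier = [(0.0, 1.0, "a"), (2.0, 3.0, "c")]
new_index = add_sorted(tier, (1.0, 2.0, "b"))
labels = [label for _begin, _end, label in tier]
```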

append(annotation)[source]

Append the given annotation at the end of the tier.

Assign this tier as parent to the annotation.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError, TierAppendError

copy()[source]

Return a deep copy of the tier.

Returns

(sppasTier) including the ‘id’.

create_annotation(location, labels=None)[source]

Create and add a new annotation into the tier.

Parameters
  • location – (sppasLocation) the location(s) where the annotation happens

  • labels – (sppasLabel, list) the label(s) to stamp this annot.

Returns

sppasAnnotation

create_annotation_after(idx)[source]

Create and add a new annotation in the hole after idx.

Parameters

idx – (int) Index of an existing annotation

Returns

sppasAnnotation

Raises

AnnDataTypeError, AnnDataIndexError

create_annotation_before(idx)[source]

Create and add a new annotation in the hole before idx.

Parameters

idx – (int) Index of an existing annotation

Returns

sppasAnnotation

Raises

AnnDataTypeError, AnnDataIndexError

create_ctrl_vocab(name=None)[source]

Create the controlled vocabulary from annotation labels.

Create (or re-create) the controlled vocabulary from the list of already existing annotation labels. The current controlled vocabulary is deleted.

Parameters

name – (str) Name of the controlled vocabulary. The name of the tier is used by default.

export_to_intervals(separators)[source]

Create a tier with the consecutive filled intervals.

Return an empty tier if ‘self’ is not of type “interval”. The created intervals are not filled.

Parameters

separators – (list)

Returns

(sppasTier)

export_unfilled()[source]

Create a tier with the unlabelled/unfilled intervals.

Only for tiers of type Interval. It represents the “NOT tier”, i.e. where this tier is not annotated.

IMPORTANT: Never tested with overlapped annotations, actually not tested at all (but used in the plugin StatGroups).

Returns

(sppasTier) or None

find(begin, end, overlaps=True, indexes=False)[source]

Return a list of annotations between begin and end.

Parameters
  • begin – sppasPoint or None to start from the beginning of the tier

  • end – sppasPoint or None to end at the end of the tier

  • overlaps – (bool) Return also overlapped annotations. Not relevant for tiers with points.

  • indexes – (bool) Return indexes instead of annotations

Returns

List of sppasAnnotation or list of indexes
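
The effect of the overlaps and indexes options can be sketched on plain tuples. This is an illustration under stated assumptions, not the tier's implementation: annotations are `(begin, end, label)` tuples, intervals that only touch at a boundary are not counted as overlapping, and with `overlaps=False` only annotations fully inside the range are kept.

```python
def find(annotations, begin=None, end=None, overlaps=True, indexes=False):
    """Return the annotations (or their indexes) between begin and end."""
    found = []
    for i, (ann_begin, ann_end, _label) in enumerate(annotations):
        # None means "from the beginning" or "until the end" of the tier.
        lo = ann_begin if begin is None else begin
        hi = ann_end if end is None else end
        if overlaps:
            keep = ann_begin < hi and ann_end > lo
        else:
            keep = ann_begin >= lo and ann_end <= hi
        if keep:
            found.append(i if indexes else annotations[i])
    return found


tier = [(0, 2, "a"), (2, 4, "b"), (4, 6, "c")]
with_overlaps = find(tier, 1, 5, overlaps=True, indexes=True)
strictly_inside = find(tier, 1, 5, overlaps=False, indexes=True)
from_start = find(tier, None, 3, overlaps=False)
```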

fit(other)[source]

Select then slice or extend annotations to fit in other tier.

Keep only the annotations of self that have some overlapping time with the given other tier and slice the localization of such selected annotations to exactly match those of the other tier.

Example:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

tier1: |_a_|_b_| |_c_| |_d_| |_e_| |f|
tier2: |_w_| |_______x_______| |_y_| |_z_|

tier1.fit(tier2) result is:

|a|b| |_c_| |_d_| |e|

tier2.fit(tier1) result is:

|w|w| |_x_| |_x_| |y|

Parameters

other – (sppasTier)

Returns

(sppasTier)
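
One way to read the selection and slicing described above is as a pairwise intersection of intervals. The sketch below follows that reading only; it does not cover the "extend" case. The `fit` function, the tuple representation and the time values are assumptions chosen to be consistent with the example results.

```python
def fit(tier, other):
    """Keep the parts of tier's intervals that share time with other."""
    fitted = []
    for begin, end, label in tier:
        for other_begin, other_end, _other_label in other:
            lo, hi = max(begin, other_begin), min(end, other_end)
            if lo < hi:   # the two intervals share some time
                fitted.append((lo, hi, label))
    return fitted


# Hypothetical time values for the tiers of the example.
tier1 = [(0, 2, "a"), (2, 4, "b"), (6, 8, "c"),
         (10, 12, "d"), (13, 15, "e"), (17, 18, "f")]
tier2 = [(1, 3, "w"), (5, 12, "x"), (14, 16, "y"), (19, 21, "z")]

fit_1_in_2 = [label for _lo, _hi, label in fit(tier1, tier2)]
fit_2_in_1 = [label for _lo, _hi, label in fit(tier2, tier1)]
```

Both results cover the same time spans; only the labels differ, which matches the two result lines of the example.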

get_all_points()[source]

Return the list of all points of the tier.

get_annotation(identifier)[source]

Find an annotation from its metadata ‘id’.

Parameters

identifier – (str) Metadata ‘id’ of an annotation.

Returns

sppasAnnotation or None

get_annotation_index(ann)[source]

Find an annotation.

Parameters

ann – (sppasAnnotation)

Returns

(int) -1 if not found

get_ctrl_vocab()[source]

Return the controlled vocabulary of the tier.

get_first_point()[source]

Return the first point of the first annotation.

get_labels_type()[source]

Return the current type of labels, or an empty string.

get_last_point()[source]

Return the last point of the last location.

get_media()[source]

Return the media of the tier.

get_midpoint_intervals()[source]

Return midpoint values of all the intervals.

get_midpoint_points()[source]

Return midpoint values of all the points.

get_name()[source]

Return the identifier name of the tier.

get_nb_filled_labels()[source]

Return the number of annotations with a filled label.

get_parent()[source]

Return the parent of the tier.

has_location(location)[source]

Return True if the tier has the given location.

to be tested.

has_point(point)[source]

Return True if the tier contains a given point.

Parameters

point – (sppasPoint) The point to find in the tier.

Returns

(bool)

index(moment)[source]

Return the index of the moment (int), or -1.

Only for tiers with points.

Parameters

moment – (sppasPoint)

is_bool()[source]

All label tags are boolean values or None.

is_disjoint()[source]

Return True if the tier is made of disjoint localizations.

is_empty()[source]

Return True if the tier does not contain annotations.

is_float()[source]

All label tags are float values or None.

is_int()[source]

All label tags are integer values or None.

is_interval()[source]

Return True if the tier is made of interval localizations.

is_point()[source]

Return True if the tier is made of point localizations.

is_string()[source]

All label tags are string or unicode or None.

is_superset(other)[source]

Return True if this tier contains all points of the other tier.

Parameters

other – (sppasTier)

Returns

Boolean

lindex(moment)[source]

Return the index of the interval starting at a given moment, or -1.

Only for tiers with interval or disjoint localizations. If the tier contains more than one annotation starting at the same moment, the method returns the first one.

Parameters

moment – (sppasPoint)

merge(idx, direction)[source]

Merge the annotation at given index with next or previous one.

if direction > 0:

ann_idx: [begin_idx, end_idx, labels_idx]
next_ann: [begin_n, end_n, labels_n]
result: [begin_idx, end_n, labels_idx + labels_n]

if direction < 0:

prev_ann: [begin_p, end_p, labels_p]
ann_idx: [begin_idx, end_idx, labels_idx]
result: [begin_p, end_idx, labels_p + labels_idx]

Parameters
  • idx – (int) Index of the annotation in the list

  • direction – (int) Positive for next, Negative for previous

Returns

(bool) False if direction does not match with index

Raises

Exception if the merged annotation can’t be deleted from the tier
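
The merge rule can be sketched on a list of `[begin, end, labels]` entries. The standalone `merge` function below is illustrative only. Returning False for an out-of-range index or a zero direction is an assumption; the documentation only states that False is returned when the direction does not match the index.

```python
def merge(annotations, idx, direction):
    """Merge the entry at idx with the next (>0) or previous (<0) one."""
    if direction == 0 or not 0 <= idx < len(annotations):
        return False
    other = idx + 1 if direction > 0 else idx - 1
    if not 0 <= other < len(annotations):
        return False   # no next (or previous) annotation to merge with
    first, second = sorted((idx, other))
    begin, _end, labels_first = annotations[first]
    _begin, end, labels_second = annotations[second]
    # The result spans both entries and concatenates their labels in order.
    annotations[first] = [begin, end, labels_first + labels_second]
    del annotations[second]
    return True


tier = [[0, 1, ["a"]], [1, 2, ["b"]], [2, 3, ["c"]]]
merged_next = merge(tier, 1, 1)      # merge "b" with the next entry
no_previous = merge(tier, 0, -1)     # the first entry has no previous one
```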

mindex(moment, bound=0)[source]

Return index of the interval containing the given moment.

Only for tiers with interval or disjoint localizations.

If the tier contains more than one annotation at the same moment, the method returns the first one (i.e. the one which started at first).

Parameters
  • moment – (sppasPoint)

  • bound – (int) How the bounds of the interval are handled:

    0 to exclude the bounds of the interval;
    -1 to include the begin bound;
    +1 to include the end bound;
    +2 to include both begin and end bounds;
    others: the midpoint of moment is strictly inside.

Returns

(int) Index of the 1st annotation containing moment or -1
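
The role of the bound parameter can be sketched with plain numbers. This is a simplified illustration: intervals are `(begin, end)` tuples, moments are floats rather than sppasPoint instances (so there is no radius), and any other bound value falls back to the strict test.

```python
def mindex(intervals, moment, bound=0):
    """Return the index of the first (begin, end) containing moment, or -1."""
    include_begin = bound in (-1, 2)
    include_end = bound in (1, 2)
    for i, (begin, end) in enumerate(intervals):
        after_begin = begin < moment or (include_begin and moment == begin)
        before_end = moment < end or (include_end and moment == end)
        if after_begin and before_end:
            return i
    return -1


intervals = [(0.0, 2.0), (2.0, 4.0)]
strict = mindex(intervals, 2.0)               # bounds excluded
with_begin = mindex(intervals, 2.0, bound=-1)
with_end = mindex(intervals, 2.0, bound=1)
with_both = mindex(intervals, 2.0, bound=2)
```

A moment lying exactly on a shared boundary is therefore attributed to the following interval, to the preceding one, or to neither, depending on bound.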

near(moment, direction=1)[source]

Search for the annotation whose localization is closest.

Search for the nearest localization to the given moment into a given direction.

Parameters
  • moment – (sppasPoint)

  • direction – (int) 0 for the nearest, 1 for the nearest forward, -1 for the nearest backward

pop(index=-1)[source]

Remove the annotation at the given position in the tier.

If no index is specified, pop() removes and returns the last annotation in the tier.

Parameters

index – (int) Index of the annotation to remove.

Raises

HierarchyContainsError

remove(begin, end, overlaps=False)[source]

Remove annotation intervals between begin and end.

Parameters
  • begin – (sppasPoint)

  • end – (sppasPoint)

  • overlaps – (bool)

Returns

the number of removed annotations

Raises

HierarchyContainsError

remove_unlabelled()[source]

Remove annotations without labels.

Do not remove an annotation if it invalidates the hierarchy.

Returns

the number of removed annotations

rindex(moment)[source]

Return the index of the interval ending at the given moment.

Only for tiers with interval or disjoint localizations. If the tier contains more than one annotation ending at the same moment, the method returns the last one.

Parameters

moment – (sppasPoint)

set_ctrl_vocab(ctrl_vocab=None)[source]

Set a controlled vocabulary to this tier.

Parameters

ctrl_vocab – (sppasCtrlVocab or None)

Raises

AnnDataTypeError, CtrlVocabContainsError

set_media(media)[source]

Set a media to the tier.

Parameters

media – (sppasMedia)

Raises

AnnDataTypeError

set_name(name=None)[source]

Set the name of the tier.

If no name is given, a GUID is randomly assigned. Important: An empty string is accepted.

Parameters

name – (str) The identifier name or None.

Returns

the formatted name

set_parent(parent)[source]

Set the parent of the tier.

Parameters

parent – (sppasTranscription)

set_radius(radius)[source]

Fix a radius value to all points of the tier.

Parameters

radius – (int, float) New radius value

Raises

AnnDataTypeError, AnnDataNegValueError

split(idx)[source]

Split annotation at the given index into 2 annotations.

Parameters

idx – (int) Index of the annotation to split.

Returns

newly created annotation at index idx+1

validate()[source]
validate_annotation(annotation)[source]

Validate the annotation and set its parent to this tier.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

validate_annotation_label(label)[source]

Validate a label.

Parameters

label – (sppasLabel)

Raises

CtrlVocabContainsError

validate_annotation_location(location)[source]

Ask the parent to validate a location.

Parameters

location – (sppasLocation)

Raises

AnnDataTypeError, HierarchyContainsError, HierarchyTypeError

anndata.transcription module

filename

sppas.src.anndata.transcription.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

The main object to represent transcriptions of recordings.

class anndata.transcription.sppasTranscription(name=None)[source]

Bases: anndata.metadata.sppasMetaData

Representation of a transcription, the root in our framework.

Transcriptions in SPPAS are represented with:

  • metadata: a list of key/value tuples;

  • a name (used to identify the transcription);

  • a list of tiers;

  • a hierarchy between tiers;

  • a list of media;

  • a list of controlled vocabularies.

Inter-tier relations are managed by establishing alignment or association links between 2 tiers:

  • alignment: annotations of a tier A (child) have only localization instances included in those of annotations of tier B (parent);

  • association: annotations of a tier A have exactly localization instances included in those of annotations of tier B.

Example
>>> # Create an instance
>>> trs = sppasTranscription("trs name")
>>> # Create a tier
>>> trs.create_tier("tier name")
>>> # Get a tier of a transcription from its index:
>>> tier = trs[0]
>>> # Get a tier of a transcription from its name
>>> tier = trs.find("tier name")
>>> # Get a tier from its identifier
>>> tier = trs.get_object(tier.get_id())
__init__(name=None)[source]

Create a new sppasTranscription instance.

Parameters

name – (str) Name of the transcription.

add_ctrl_vocab(new_ctrl_vocab)[source]

Add a new controlled vocabulary in the list of ctrl vocab.

Parameters

new_ctrl_vocab – (sppasCtrlVocab)

Raises

AnnDataTypeError, TrsAddError

Validate and add a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (Tier) The reference tier

  • child_tier – (Tier) The child tier to be linked to the parent tier

add_media(new_media)[source]

Add a new media in the list of media.

Does not add the media if a media with the same id is already in self.

Parameters

new_media – (sppasMedia) The media to add.

Raises

AnnDataTypeError, TrsAddError

append(tier)[source]

Append a new tier.

Parameters

tier – (sppasTier) the tier to append

contains(tier)[source]

Return True if the given tier is in the list of tiers.

create_tier(name=None, ctrl_vocab=None, media=None)[source]

Create and append a new empty tier.

Parameters
  • name – (str) the name of the tier to create

  • ctrl_vocab – (sppasCtrlVocab)

  • media – (sppasMedia)

Returns

newly created empty tier

find(name, case_sensitive=True)[source]

Find a tier from its name.

Parameters
  • name – (str) EXACT name of the tier

  • case_sensitive – (bool)

Returns

sppasTier or None

find_id(tier_id)[source]

Find a tier from its identifier.

Parameters

tier_id – (str) Exact identifier of the tier

Returns

sppasTier or None

get_ctrl_vocab_from_id(ctrl_vocab_id)[source]

Return a sppasCtrlVocab from its id or None.

Parameters

ctrl_vocab_id – (str) Identifier name of a ctrl vocab

get_ctrl_vocab_from_name(ctrl_vocab_name)[source]

Return a sppasCtrlVocab from its name or None.

Parameters

ctrl_vocab_name – (str) Identifier name of a ctrl vocabulary

get_ctrl_vocab_list()[source]

Return the list of controlled vocabularies.

get_hierarchy()[source]

Return the hierarchy.

get_max_loc()[source]

Return the sppasPoint with the highest value through all tiers.

get_media_from_id(media_id)[source]

Return a sppasMedia from its id or None.

Parameters

media_id – (str) Identifier name of a media

get_media_list()[source]

Return the list of sppasMedia.

get_min_loc()[source]

Return the sppasPoint with the lowest value through all tiers.

get_name()[source]

Return the name of the transcription.

get_object(identifier)[source]

Return the object matching the given identifier.

Parameters

identifier – (GUID)

get_tier_from_id(tier_id)[source]

Return a sppasTier from its id or None.

Parameters

tier_id – (str) Identifier name of a tier

get_tier_index(name, case_sensitive=True)[source]

Get the index of a tier from its name.

Parameters
  • name – (str) EXACT name of the tier

  • case_sensitive – (bool)

Returns

index or -1 if not found
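
The lookup by name, with its case_sensitive option and its -1 result, can be sketched on a list of names. This standalone `get_tier_index` is a hypothetical helper working on strings; comparing lower-cased names for the case-insensitive search is an assumption.

```python
def get_tier_index(tier_names, name, case_sensitive=True):
    """Return the index of the first tier with this exact name, or -1."""
    wanted = name if case_sensitive else name.lower()
    for i, tier_name in enumerate(tier_names):
        candidate = tier_name if case_sensitive else tier_name.lower()
        if candidate == wanted:
            return i
    return -1


names = ["Phonemes", "Words"]
exact = get_tier_index(names, "words")
relaxed = get_tier_index(names, "words", case_sensitive=False)
```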

get_tier_index_id(identifier)[source]

Get the index of a tier from its id.

Parameters

identifier – (str) GUID

Returns

index or -1 if not found

get_tier_list()[source]

Return the list of tiers.

is_empty()[source]

Return True if the transcription does not contain tiers.

pop(index=-1)[source]

Remove the tier at the given position in the transcription and return it.

If no index is specified, pop() removes and returns the last tier in the transcription.

Parameters

index – (int) Index of the tier to remove.

Returns

(sppasTier)

Raise

AnnDataIndexError

remove_ctrl_vocab(old_ctrl_vocab)[source]

Remove a controlled vocabulary from the list of controlled vocabularies.

Parameters

old_ctrl_vocab – (sppasCtrlVocab)

Raises

AnnDataTypeError, TrsRemoveError

remove_media(old_media)[source]

Remove a media from the list of media.

Parameters

old_media – (sppasMedia)

Raises

AnnDataTypeError, TrsRemoveError

rename_tier(tier)[source]

Rename a tier by appending a digit.

Parameters

tier – (sppasTier) The tier to rename.

set_ctrl_vocab_list(ctrl_vocab_list)[source]

Set the list of controlled vocabularies.

Parameters

ctrl_vocab_list – (list)

Returns

list of rejected ctrl_vocab

set_media_list(media_list)[source]

Set the list of media.

Parameters

media_list – (list)

Returns

list of rejected media

set_name(name=None)[source]

Set the name of the transcription.

Parameters

name – (str or None) The identifier or None to set the GUID.

Returns

the name

set_tier_index(name, new_index, case_sensitive=True)[source]

Set the index of a tier from its name.

THIS SHOULD NEVER BE USED. USE TIER IDENTIFIER INSTEAD OF ITS NAME.

Parameters
  • name – (str) EXACT name of the tier

  • new_index – (int) New index of the tier in self

  • case_sensitive – (bool)

Returns

index or -1 if not found

set_tier_index_id(identifier, new_index)[source]

Set the index of a tier from its identifier.

Parameters
  • identifier – (str)

  • new_index – (int) New index of the tier in self

Returns

index or -1 if not found

shift(delay)[source]

Shift the location of all annotations by a given delay.

Parameters

delay – (int, float) delay to shift all localizations

Raise

AnnDataTypeError

property tiers

Return the list of tiers.

validate_annotation_location(tier, location)[source]

Validate a location.

Parameters
  • tier – (Tier) The reference tier

  • location – (sppasLocation)

Raises

AnnDataTypeError, HierarchyContainsError, HierarchyTypeError

validate_hierarchy(tier)[source]

Module contents

filename

sppas.src.anndata.__init__.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Package to manage annotated data.

anndata: management of transcribed data.

anndata is a free and open source Python library to access and search annotated data. It can convert file formats like Elan’s EAF, Praat’s TextGrid and others into a sppasTranscription() object, and convert that object into any of these formats. Those objects allow unified access to linguistic data from a wide range of sources.

It requires the following other packages:

  • config

  • utils

class anndata.FileFormatProperty(extension)[source]

Bases: object

Represent one format and its properties.

__init__(extension)[source]

Create a FileFormatProperty instance.

Parameters

extension – (str) File name extension.

get_extension()[source]

Return the extension, including the initial dot.

get_reader()[source]

Return True if SPPAS can read files of the extension.

get_software()[source]

Return the name of the software matching the extension.

get_trs_type()[source]

Return the transcription type: ANNOT, MEASURE, TABLE or None.

get_writer()[source]

Return True if SPPAS can write files of the extension.

anndata.format_label(text, empty='', tag_type='str')[source]

Create a label from a text.

Use the “{ | }” system to parse the alternative tags and = for scores.

Parameters
  • text – (str)

  • empty – (str) The text representing an empty tag.

  • tag_type – (str): One of: (‘str’, ‘int’, ‘float’, ‘bool’).

Returns

sppasLabel

anndata.format_labels(text, separator='\n', empty='', tag_type='str')[source]

Create a set of labels from a text.

Use the separator to split the text into labels. Use the “{ | }” system to parse the alternative tags.

Examples

text = “{le|les} {chat|chats}” is 2 labels with 2 tags each

text = “{le=0.6|les=0.4}” is a label with 2 tags and their score

Parameters
  • text – (str)

  • separator – (str) String to separate labels.

  • empty – (str) The text representing an empty tag.

  • tag_type – (str): One of: (‘str’, ‘int’, ‘float’, ‘bool’).

Returns

list of sppasLabel
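
The “{ | }” notation with = for scores can be illustrated with a small stand-alone parser. This is a sketch under assumptions, not the code of format_labels(): the names parse_label and parse_labels are hypothetical, tags are returned as plain (content, score) tuples instead of sppasLabel objects, and labels are split on brace groups or whitespace rather than on the separator parameter.

```python
import re


def parse_label(text):
    """Parse one label such as '{le=0.6|les=0.4}' into (tag, score) pairs."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1]
    tags = []
    for alternative in text.split("|"):
        alternative = alternative.strip()
        content, sep, score = alternative.rpartition("=")
        if sep:
            try:
                tags.append((content, float(score)))
                continue
            except ValueError:
                pass  # "=" was part of the tag itself, not a score
        tags.append((alternative, None))
    return tags


def parse_labels(text):
    """Split a text into brace groups or bare tokens, one label each."""
    groups = re.findall(r"\{[^{}]*\}|[^\s{}]+", text)
    return [parse_label(group) for group in groups]


print(parse_labels("{le|les} {chat|chats}"))
print(parse_labels("{le=0.6|les=0.4}"))
```

The first call yields two labels with two unscored tags each; the second yields one label whose tags carry the scores 0.6 and 0.4.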

anndata.serialize_label(label, empty='', alt=True)[source]

Convert the label into a string, include or not alternative tags.

Use the “{ | }” system to serialize the alternative tags. Scores of the tags are not returned.

Parameters
  • label – (sppasLabel)

  • empty – (str) The text to return if a tag is empty or not set.

  • alt – (bool) Include alternative tags

Returns

(str)
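
The serialization described above can be sketched as follows. It is a minimal illustration, assuming a label is a plain list of (content, score) pairs; the name serialize_tags is hypothetical, and treating a missing score as the lowest one when picking the best tag is an assumption, not documented behaviour.

```python
def serialize_tags(tags, empty="", alt=True):
    """Serialize (content, score) pairs; scores are never written out."""
    if not tags:
        return empty
    if not alt or len(tags) == 1:
        # Assumption: a tag without score ranks below any scored tag.
        best = max(tags, key=lambda pair: float("-inf") if pair[1] is None else pair[1])
        return best[0] if best[0] else empty
    # Alternative tags are wrapped in braces and separated by "|".
    return "{" + "|".join(content if content else empty for content, _ in tags) + "}"


label = [("le", 0.6), ("les", 0.4)]
print(serialize_tags(label))             # {le|les}
print(serialize_tags(label, alt=False))  # le
```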

anndata.serialize_labels(labels, separator='\n', empty='', alt=True)[source]

Create a text from a list of labels.

Use the separator to join the labels into a text. Use the “{ | }” system to serialize the alternative tags and = for scores.

Parameters
  • labels – (list of sppasLabel)

  • separator – (str) String separating labels

  • empty – (str) The text representing an empty tag

  • alt – (bool) Include alternative tags. If False, only the best tag is serialized.

Returns

(str)

class anndata.sppasAnnSet[source]

Bases: sppas.src.structs.basefset.sppasBaseSet

Manager for a set of annotations.

Mainly used with the data that are the result of the tier filter system. A sppasAnnSet() manages a dictionary with:

  • key: an annotation

  • value: a list of strings

__init__()[source]

Create a sppasAnnSet instance.

copy()[source]

Make a deep copy of self.

Overridden to return a sppasAnnSet() instead of a sppasBaseSet().

to_tier(name='AnnSet', annot_value=False)[source]

Create a tier from the data set.

Parameters
  • name – (str) Name of the tier to be returned

  • annot_value – (bool) Format of the resulting annotation label. By default, the label of the annotation is used. If True, its value in the data set is used instead.

Returns

(sppasTier)

class anndata.sppasAnnotation(location, labels=[])[source]

Bases: sppas.src.anndata.metadata.sppasMetaData

Represents an annotation.

A sppasAnnotation() is a container for:

  • a sppasLocation()

  • a list of sppasLabel()

Example

>>> location = sppasLocation(sppasPoint(1.5, radius=0.01))
>>> labels = sppasLabel(sppasTag("foo"))
>>> ann = sppasAnnotation(location, labels)

__init__(location, labels=[])[source]

Create a new sppasAnnotation instance.

Parameters
  • location – (sppasLocation) the location(s) where the annotation happens

  • labels – (sppasLabel, list) the label(s) to stamp this annotation, or a list of them.

add_tag(tag, score=None, label_idx=0)[source]

Append an alternative tag in a label.

Parameters
  • tag – (sppasTag)

  • score – (float)

  • label_idx – (int)

Raises

AnnDataTypeError, IndexError

append_label(label)[source]

Append a label into the list of labels.

Parameters

label – (sppasLabel)

contains_localization(localization)[source]

Return True if the given localization is in the location.

contains_tag(tag, function='exact', reverse=False, label_idx=0)[source]

Return True if the given tag is in the label.

Parameters
  • tag – (sppasTag)

  • function – Search function

  • reverse – Reverse the function.

  • label_idx – (int)

copy()[source]

Return a full copy of the annotation.

The location, the labels and the metadata are all copied. The ‘id’ of the returned annotation is then the same.

Returns

sppasAnnotation()

get_all_points()[source]

Return a list of copies of all points of this annotation.

get_best_tag(label_idx=0)[source]

Return the tag with the highest score of a label or an empty str.

Parameters

label_idx – (int)

get_highest_localization()[source]

Return a copy of the sppasPoint with the highest loc.

get_label_type()[source]

Return the current type of tags, or an empty string.

get_labels()[source]

Return the list of sppasLabel() of this annotation.

get_labels_best_tag()[source]

Return a list with the best tag of each label.

get_location()[source]

Return the sppasLocation() of this annotation.

get_lowest_localization()[source]

Return a copy of the sppasPoint with the lowest localization.

get_parent()[source]

Return the parent tier or None.

get_score()[source]

Return the score of this annotation or None if no score is set.

is_labelled()[source]

Return True if at least one sppasTag exists and is not None.

label_is_bool()[source]

Return True if the type of the labels is ‘bool’.

label_is_filled()[source]

Return True if at least one BEST tag is filled.

label_is_float()[source]

Return True if the type of the labels is ‘float’.

label_is_int()[source]

Return True if the type of the labels is ‘int’.

label_is_string()[source]

Return True if the type of the labels is ‘str’.

location_is_disjoint()[source]

Return True if the location is made of sppasDisjoint locs.

location_is_interval()[source]

Return True if the location is made of sppasInterval locs.

location_is_point()[source]

Return True if the location is made of sppasPoint locs.

remove_tag(tag, label_idx=0)[source]

Remove an alternative tag from the label.

Parameters
  • tag – (sppasTag) the tag to be removed from the list.

  • label_idx – (int)

serialize_labels(separator='\n', empty='', alt=True)[source]

DEPRECATED. Return labels serialized into a string.

TODO: REMOVE THIS METHOD. Use aioutils.serialize_labels() instead.

Parameters
  • separator – (str) String to separate labels.

  • empty – (str) The text to return if a tag is empty or not set.

  • alt – (bool) Include alternative tags

Returns

(str)

set_best_localization(localization)[source]

Set the best localization of the location.

Parameters

localization – (sppasBaseLocalization)

set_labels(labels=[])[source]

Fix/reset the list of labels of this annotation.

Parameters

labels – (sppasLabel, list) the label(s) to stamp this annotation, or a list of them.

Raises

AnnDataTypeError, TypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

set_parent(parent=None)[source]

Set a parent tier.

Parameters

parent – (sppasTier) The parent tier of this annotation.

Raises

CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

set_score(score=None)[source]

Set or reset the score to this annotation.

Parameters

score – (float)

validate()[source]

Validate the annotation.

Check if the labels and location match the requirements.

Raises

TypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

validate_label(label)[source]

Validate the label.

Check if the label matches the requirements of this annotation.

Raises

CtrlVocabContainsError, TypeError

validate_location()[source]

Validate the location of the annotation.

Raises

class anndata.sppasCtrlVocab(name, description='')[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a controlled vocabulary.

A controlled Vocabulary is a set of tags. It is used to restrict the use of tags in a label: only the accepted tags can be set to a label.

A controlled vocabulary is made of an identifier name, a description and a list of pairs tag/description.
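
The idea of restricting tags to an accepted set can be sketched with a dictionary mapping each tag to its description. This is an illustration only: the class name CtrlVocabSketch is hypothetical, tags are plain strings instead of sppasTag objects, and the real class also inherits metadata handling that is left out here.

```python
class CtrlVocabSketch:
    """Toy controlled vocabulary: tag content -> description."""

    def __init__(self, name, description=""):
        self.name = name
        self.description = description
        self._entries = {}

    def add(self, tag, description=""):
        """Add a tag; return False if it is already present."""
        if tag in self._entries:
            return False
        self._entries[tag] = description
        return True

    def contains(self, tag):
        """Compare by content, not by object identity."""
        return tag in self._entries

    def remove(self, tag):
        """Remove a tag; return False if it was not present."""
        if tag not in self._entries:
            return False
        del self._entries[tag]
        return True


vocab = CtrlVocabSketch("intensity")
vocab.add("low", "weak intensity")
vocab.add("high", "strong intensity")
print(vocab.contains("high"))    # True
print(vocab.contains("medium"))  # False
```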

__init__(name, description='')[source]

Create a new sppasCtrlVocab instance.

Parameters
  • name – (str) Identifier name of the controlled vocabulary

  • description – (str)

add(tag, description='')[source]

Add a tag to the controlled vocab.

Parameters
  • tag – (sppasTag): the tag to add.

  • description – (str)

Returns

Boolean

contains(tag)[source]

Test if a tag is in the controlled vocabulary.

Attention: Do not check the instance but the data content of the tag.

Parameters

tag – (sppasTag) the tag to check.

Returns

Boolean

get_description()[source]

Return the unicode str of the description of the ctrl vocab.

get_name()[source]

Return the name of the controlled vocabulary.

get_tag_description(tag)[source]

Return the unicode string of the description of an entry.

Parameters

tag – (sppasTag) the tag to get the description.

Returns

(str)

remove(tag)[source]

Remove a tag from the controlled vocab.

Parameters

tag – (sppasTag) the tag to remove.

Returns

Boolean

set_description(description='')[source]

Set the description of the controlled vocabulary.

Parameters

description – (str)

set_tag_description(tag, description)[source]

Set the unicode string of the description of an entry.

Parameters
  • tag – (sppasTag) the tag whose description is set.

  • description – (str)

Returns

(str)

validate_tag(tag)[source]

Check if the given tag can be added to the ctrl vocabulary.

Parameters

tag – (sppasTag) the tag to check.

Returns

Boolean

class anndata.sppasDisjoint(intervals=None)[source]

Bases: anndata.ann.annlocation.localization.sppasBaseLocalization

Localization of a series of intervals in time.

__init__(intervals=None)[source]

Create a new sppasDisjoint instance.

Parameters

intervals – (list of sppasInterval)

append_interval(interval)[source]

Append an interval to the list of intervals.

Parameters

interval – (sppasInterval)

copy()[source]

Return a deep copy of self.

duration()[source]

Return the sppasDuration.

Sum the durations of all intervals.
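
As an illustration, summing the durations means that the gaps between the intervals do not count. The sketch below assumes intervals are plain (begin, end) pairs of floats and ignores the vagueness carried by a sppasDuration; the name disjoint_duration is hypothetical.

```python
def disjoint_duration(intervals):
    """Sum the durations of (begin, end) pairs; gaps between them are ignored."""
    return sum(end - begin for begin, end in intervals)


intervals = [(0.0, 1.5), (2.0, 2.75), (4.0, 5.0)]
print(disjoint_duration(intervals))  # 3.25
```

The result (3.25) is smaller than the span from the first begin to the last end (5.0), which is the difference between a disjoint localization and a single interval.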

get_begin()[source]

Return the first sppasPoint instance.

get_end()[source]

Return the last sppasPoint instance.

get_interval(index)[source]

Return the sppasInterval at the given index.

Parameters

index – (int)

get_intervals()[source]

Return the list of intervals.

is_bound(point)[source]

Return True if point is a bound of an interval.

is_disjoint()[source]

Return True because self represents a set of disjoint intervals.

middle()[source]

Return a sppasPoint() at the middle of the time interval.

To be tested.

Returns

(sppasPoint)

set(other)[source]

Set self members from another sppasDisjoint instance.

Parameters

other – (sppasDisjoint)

set_begin(tp)[source]

Set the begin sppasPoint instance to new sppasPoint.

Parameters

tp – (sppasPoint)

set_end(tp)[source]

Set the end sppasPoint instance to new sppasPoint.

Parameters

tp – (sppasPoint)

set_intervals(intervals)[source]

Set a new list of intervals.

Parameters

intervals – list of sppasInterval.

set_radius(radius)[source]

Set radius value to all points.

shift(delay)[source]

Shift all the intervals to a given delay.

Parameters

delay – (int, float) delay to shift bounds

Raise

AnnDataTypeError

class anndata.sppasDuration(value, vagueness=0.0)[source]

Bases: object

Representation of a duration with vagueness.

Represents a duration identified by 2 float values:

  • the duration value;

  • the duration margin.

__init__(value, vagueness=0.0)[source]

Create a new sppasDuration instance.

Parameters
  • value – (float) value of the duration.

  • vagueness – (float) represents the vagueness of the value.

copy()[source]

Return a deep copy of self.

get()[source]

Return myself.

get_margin()[source]

Return the vagueness of the duration (float).

get_value()[source]

Return the duration value (float).

set(other)[source]

Set the value/vagueness of another sppasDuration instance.

Parameters

other – (sppasDuration)

set_margin(vagueness)[source]

Fix the vagueness margin of the duration.

Parameters

vagueness – (float) the duration margin.

set_value(value)[source]

Set the duration to a new value.

Parameters

value – (float) the new duration value.

class anndata.sppasDurationCompare[source]

Bases: sppas.src.structs.basecompare.sppasBaseCompare

Comparison methods for sppasDuration.

__init__()[source]

Create a sppasDurationCompare instance.

static eq(duration, x)[source]

Return True if duration is equal to x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

static ge(duration, x)[source]

Return True if duration is greater than or equal to x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

static gt(duration, x)[source]

Return True if duration is greater than x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

static le(duration, x)[source]

Return True if duration is lower than or equal to x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

static lt(duration, x)[source]

Return True if duration is lower than x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

static ne(duration, x)[source]

Return True if duration is different from x.

Parameters
  • duration – (sppasDuration)

  • x – (int, float)

Returns

(bool)

class anndata.sppasHierarchy[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a hierarchy between tiers.

Two types of hierarchy are considered:

  • TimeAssociation: the points of a child tier are all equal to the points of a reference tier, for example:

    parent: Words | l’ | âne | est | là |
    child: Lemmas | le | âne | être | là |
  • TimeAlignment: the points of a child tier are all included in the set of points of a reference tier, for example:

    parent: Phonemes | l | a | n | e | l | a |
    child: Words | l’ | âne | est | là |

    parent: Phonemes | l | a | n | e | l | a |
    child: Syllables | l.a | n.e | l.a |

In these examples, notice that there is no hierarchy link between “Words” and “Syllables”, and that “Phonemes” is the grand-parent of “Lemmas”.

The following rules apply:

  • A child can have ONLY ONE parent!

  • A parent can have as many children as wanted.

  • A hierarchy is a tree, not a graph.
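
The two hierarchy types can be told apart by comparing sets of boundary points. The sketch below is a simplified illustration, not the code of infer_hierarchy_type(): the function name infer_link_type is hypothetical, tiers are reduced to lists of integer boundaries, and points are compared exactly, without any radius.

```python
def infer_link_type(parent_points, child_points):
    """Return the hierarchy type the child could have with the parent, or ''."""
    parent, child = set(parent_points), set(child_points)
    if child == parent:
        return "TimeAssociation"  # same points in both tiers
    if child <= parent:
        return "TimeAlignment"    # child points are a subset of parent points
    return ""


# Boundaries of the example tiers, counted in phonemes.
phonemes = [0, 1, 2, 3, 4, 5, 6]   # l | a | n | e | l | a
words = [0, 1, 3, 4, 6]            # l' | ane | est | la
lemmas = [0, 1, 3, 4, 6]           # le | ane | etre | la
syllables = [0, 2, 4, 6]           # l.a | n.e | l.a

print(infer_link_type(words, lemmas))       # TimeAssociation
print(infer_link_type(phonemes, words))     # TimeAlignment
print(infer_link_type(words, syllables))    # (empty string)
```

The last call returns an empty string because the syllable boundary at 2 is not a word boundary, so no link is possible between these two tiers.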

TODO: consider a time association that is not fully completed:

parent: Tokens | l’ | âne | euh | euh | est | là | @ |
child: Lemmas | le | âne | | être | là |

__init__()[source]

Create a new sppasHierarchy instance.

Validate and add a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

copy()[source]

Return a deep copy of the hierarchy.

get_ancestors(child_tier)[source]

Return all the direct ancestors of a tier.

Parameters

child_tier – (sppasTier)

Returns

List of tiers with parent, grand-parent, grand-grand-parent…
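
Because a child has only one parent, the ancestors can be collected by walking up a child-to-parent mapping. This is a sketch under assumptions: the function name ancestors and the dictionary representation are hypothetical, and tiers are identified by their names instead of sppasTier objects.

```python
def ancestors(parent_of, child):
    """Return [parent, grand-parent, ...] for a child in a child -> parent map."""
    result = []
    current = parent_of.get(child)
    # The hierarchy is a tree, so this walk ends at a tier without parent.
    while current is not None:
        result.append(current)
        current = parent_of.get(current)
    return result


parent_of = {"Lemmas": "Words", "Words": "Phonemes", "Syllables": "Phonemes"}
print(ancestors(parent_of, "Lemmas"))    # ['Words', 'Phonemes']
print(ancestors(parent_of, "Phonemes"))  # []
```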

get_children(parent_tier, link_type=None)[source]

Return the list of children of a tier, for a given type.

Parameters
  • parent_tier – (sppasTier) The parent tier whose children are searched

  • link_type – (str) The type of hierarchy

Returns

List of tiers

get_hierarchy_type(child_tier)[source]

Return the hierarchy type between a child tier and its parent.

Returns

(str) one of the hierarchy types

get_parent(child_tier)[source]

Return the parent tier for a given child tier.

Parameters

child_tier – (sppasTier) The child tier whose parent is searched

static infer_hierarchy_type(tier1, tier2)[source]

Test if tier1 can be a parent tier for tier2.

Returns

One of hierarchy types or an empty string

remove_child(child_tier)[source]

Remove a hierarchy link between a parent and a child.

Parameters

child_tier – (sppasTier) The tier linked to a reference

remove_parent(parent_tier)[source]

Remove all hierarchy links between a parent and its children.

Parameters

parent_tier – (sppasTier) The parent tier

remove_tier(tier)[source]

Remove all occurrences of a tier inside the hierarchy.

Parameters

tier – (sppasTier) The tier to remove as parent or child.

types = {'TimeAlignment', 'TimeAssociation'}

Validate a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

Raises

AnnDataTypeError, HierarchyParentTierError, HierarchyChildTierError, HierarchyAncestorTierError, HierarchyAlignmentError, HierarchyAssociationError

static validate_time_alignment(parent_tier, child_tier)[source]

Validate a time alignment hierarchy link between 2 tiers.

Parameters
  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to parent

Raises

HierarchyAlignmentError

static validate_time_association(parent_tier, child_tier)[source]

Validate a time association hierarchy link between 2 tiers.

Parameters
  • parent_tier – (sppasTier) The parent tier

  • child_tier – (sppasTier) The child tier to be linked to the parent

Raises

HierarchyAssociationError

class anndata.sppasInterval(begin, end)[source]

Bases: anndata.ann.annlocation.localization.sppasBaseLocalization

Localization of an interval between two sppasPoint instances.

An interval is identified by two sppasPoint objects:

  • one is representing the beginning of the interval;

  • the other is representing the end of the interval.

__init__(begin, end)[source]

Create a new sppasInterval instance.

Parameters
  • begin – (sppasPoint)

  • end – (sppasPoint)

A degenerate interval, i.e. one with begin > end, is forbidden.

static check_interval_bounds(begin, end)[source]

Check bounds of a virtual interval.

Parameters
  • begin – (sppasPoint)

  • end – (sppasPoint)

static check_types(begin, end)[source]

Return True only if begin and end are both of the same type of sppasPoint.

Parameters
  • begin – any kind of data

  • end – any kind of data

Returns

Boolean

combine(other)[source]

Return a sppasInterval, the combination of two intervals.

Parameters

other – (sppasInterval) the other interval to combine with.

copy()[source]

Return a deep copy of self.

duration()[source]

Overridden. Return the duration of the time interval.

Returns

(sppasDuration) Duration and its vagueness.

get_begin()[source]

Return the begin sppasPoint instance.

get_end()[source]

Return the end sppasPoint instance.

is_bound(point)[source]

Return True if point is the begin or the end of the interval.

is_float()[source]
is_int()[source]
is_interval()[source]

Overrides. Return True, because self represents an interval.

middle()[source]

Return a sppasPoint() at the middle of the time interval.

To be tested.

Returns

(sppasPoint)

middle_value()[source]

Return the middle value of the time interval.

Return a float value even if points are integers.

Returns

(float) value.

set(other)[source]

Set self members from another sppasInterval instance.

Parameters

other – (sppasInterval)

set_begin(tp)[source]

Set the begin of the interval to a new sppasPoint.

Attention: it is a reference assignment.

Parameters

tp – (sppasPoint)

set_end(tp)[source]

Set the end of the interval to a new sppasPoint.

Attention: it is a reference assignment.

Parameters

tp – (sppasPoint)

set_radius(radius)[source]

Set a radius value to begin and end points.

Parameters

radius – (int or float)

Raise

ValueError

shift(delay)[source]

Shift the interval to a given delay.

Parameters

delay – (int, float) delay to shift bounds

Raise

AnnDataTypeError

union(other)[source]

Return a sppasInterval representing the union of two intervals.

Parameters

other – (sppasInterval) the other interval to merge with.

class anndata.sppasLabel(tag, score=None)[source]

Bases: object

Represent the content of an annotation.

sppasLabel stores a set of sppasTag instances with their scores. This class uses a list of lists, i.e. a list of pairs (tag, score). This is the best compromise between memory usage, speed and readability.

A label is a list of possible sppasTag(), represented as a UNICODE string. A data type can be associated, as sppasTag() can be ‘int’, ‘float’ or ‘bool’.

__init__(tag, score=None)[source]

Create a new sppasLabel instance.

Parameters
  • tag – (sppasTag or list of sppasTag or None)

  • score – (float or list of float or None)

append(tag, score=None)[source]

Add a sppasTag into the list.

Do not add the tag if this alternative is already inside the list, but add the scores.

Parameters
  • tag – (sppasTag)

  • score – (float)
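
The list of (tag, score) pairs and the merging rule of append() can be sketched as below. It is an illustration only: the class name LabelSketch is hypothetical, tags are plain strings, and the handling of a missing score (kept as None until a real score is added) is an assumption.

```python
class LabelSketch:
    """Toy label: a list of [tag, score] pairs."""

    def __init__(self):
        self._tags = []

    def append(self, tag, score=None):
        """Add a tag, or add the score to an already existing alternative."""
        for pair in self._tags:
            if pair[0] == tag:
                if score is not None:
                    pair[1] = score if pair[1] is None else pair[1] + score
                return
        self._tags.append([tag, score])

    def get_score(self, tag):
        """Return the score of a tag, or None if the tag is unknown."""
        for content, score in self._tags:
            if content == tag:
                return score
        return None

    def get_best(self):
        """Return the tag with the highest score, or None if empty."""
        if not self._tags:
            return None
        return max(self._tags, key=lambda p: float("-inf") if p[1] is None else p[1])[0]


label = LabelSketch()
label.append("le", 0.25)
label.append("les", 0.4)
label.append("le", 0.25)      # same alternative: scores are added
print(label.get_score("le"))  # 0.5
print(label.get_best())       # le
```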

append_content(content, data_type='str', score=None)[source]

Add a text into the list.

Parameters
  • content – (str)

  • data_type – (str): The type of this text content. One of: (str, int, float, bool)

  • score – (float)

copy()[source]

Return a deep copy of the label.

get_best()[source]

Return the best sppasTag, i.e. the one with the highest score.

Returns

(sppasTag or None)

get_score(tag)[source]

Return the score of a tag or None if tag is not in the label.

Parameters

tag – (sppasTag)

Returns

score: (float)

get_type()[source]

Return the type of the tags content.

is_bool()[source]

Return True if tags are of type “bool”.

Return False if no tag is set.

is_float()[source]

Return True if tags are of type “float”.

Return False if no tag is set.

is_int()[source]

Return True if tags are of type “int”.

Return False if no tag is set.

is_point()[source]

Return True if tags are of type “point”.

Return False if no tag is set.

is_string()[source]

Return True if tags are string or unicode.

Return False if no tag is set.

is_tagged()[source]

Return False if no tag is set.

match(tag_functions, logic_bool='and')[source]

Return True if a tag matches all or any of the functions.

Parameters
  • tag_functions – list of (function, value, logical_not)

  • logic_bool – (str) Apply a logical “and” or a logical “or” between the functions.

Returns

(bool)

  • function: a function in python with 2 arguments: tag/value

  • value: the expected value for the tag

  • logical_not: boolean
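
The way such (function, value, logical_not) triples are combined can be sketched as follows. This is an illustration, not the SPPAS implementation: match_tags is a hypothetical free function working on plain strings, and the two comparison functions are simple stand-ins for the ones provided by the library.

```python
def exact(tag, value):
    """Stand-in comparison: the tag equals the value."""
    return tag == value


def startswith(tag, value):
    """Stand-in comparison: the tag starts with the value."""
    return tag.startswith(value)


def match_tags(tags, tag_functions, logic_bool="and"):
    """Return True if one tag satisfies all ("and") or any ("or") functions."""
    combine = all if logic_bool == "and" else any
    for tag in tags:
        # "!= logical_not" negates the result when logical_not is True.
        if combine(func(tag, value) != logical_not
                   for func, value, logical_not in tag_functions):
            return True
    return False


functions = [(startswith, "p", False), (startswith, "t", False)]
print(match_tags(["pa", "ka"], functions, logic_bool="or"))   # True
print(match_tags(["pa", "ka"], functions, logic_bool="and"))  # False
print(match_tags(["R"], [(exact, "R", True)]))                # False
```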

Example

Search if a tag is exactly matching “R”:

>>> l.match([(exact, "R", False)])

Example

Search if a tag is starting with “p” or starting with “t”:

>>> l.match([(startswith, "p", False),
...          (startswith, "t", False), ], logic_bool="or")

remove(tag)[source]

Remove a tag from the list.

Parameters

tag – (sppasTag) the tag to be removed from the list.

serialize(empty='', alt=True)[source]

Convert the label into a string, include or not alternative tags.

DEPRECATED. Use aioutils.serialize_label() instead.

set_score(tag, score)[source]

Set a score to a given tag.

Parameters
  • tag – (sppasTag)

  • score – (float)

class anndata.sppasLocation(localization=None, score=None)[source]

Bases: object

Location of an annotation of a tier.

sppasLocation stores a set of localizations with their scores. This class uses a list of lists, i.e. a list of pairs (localization, score). This is the best compromise between memory usage, speed and readability.

__init__(localization=None, score=None)[source]

Create a new sppasLocation instance and add the entry.

Parameters
  • localization – (Localization or list of localizations)

  • score – (float or list of float)

If a list of alternative localizations is given, the same score is assigned to all items.

append(localization, score=None)[source]

Add a localization into the list.

Parameters
  • localization – (Localization) the localization to append

  • score – (float)

contains(point)[source]

Return True if the localization point is in the list.

copy()[source]

Return a deep copy of the location.

get_best()[source]

Return a copy of the best localization.

Returns

(sppasLocalization) localization with the highest score.

get_highest_localization()[source]

Return a copy of the sppasPoint with the highest loc.

get_lowest_localization()[source]

Return a copy of the sppasPoint with the lowest localization.

get_score(loc)[source]

Return the score of a localization or None if it is not in.

Parameters

loc – (sppasLocalization)

Returns

score: (float)

is_disjoint()[source]

Return True if the location is made of sppasDisjoint locs.

is_interval()[source]

Return True if the location is made of sppasInterval locs.

is_point()[source]

Return True if the location is made of sppasPoint localizations.

match_duration(dur_functions, logic_bool='and')[source]

Return True if a duration matches all or any of the functions.

Parameters
  • dur_functions – list of (function, value, logical_not)

  • logic_bool – (str) Apply a logical “and” or “or”

Returns

(bool)

  • function: a function in python with 2 arguments: dur/value

  • value: the expected value for the duration (int/float/sppasDuration)

  • logical_not: boolean

Example

Search if a duration is exactly 30 ms:

>>> d.match([(eq, 0.03, False)])

Example

Search if a duration is not 30 ms:

>>> d.match([(eq, 0.03, True)])
>>> d.match([(ne, 0.03, False)])

Example

Search if a duration is between 30 ms and 70 ms:

>>> d.match([(ge, 0.03, False),
...          (le, 0.07, False)], logic_bool="and")

See sppasDurationCompare() to get a list of functions.

match_localization(loc_functions, logic_bool='and')[source]

Return True if a localization matches all or any of the functions.

Parameters
  • loc_functions – list of (function, value, logical_not)

  • logic_bool – (str) Apply a logical “and” or a logical “or” between the functions.

Returns

(bool)

  • function: a function in python with 2 arguments: loc/value

  • value: the expected value for the localization (int/float/sppasPoint)

  • logical_not: boolean

Example

Search if a localization is after (or starts at) 1 minute:

>>> l.match([(rangefrom, 60., False)])

Example

Search if a localization is before (or ends at) 3 minutes:

>>> l.match([(rangeto, 180., False)])

Example

Search if a localization is between 1 min and 3 min:

>>> l.match([(rangefrom, 60., False),
...          (rangeto, 180., False)], logic_bool="and")

See sppasLocalizationCompare() to get a list of functions.

remove(localization)[source]

Remove a localization from the list.

Parameters

localization – (sppasLocalization) the loc to be removed

set_radius(radius)[source]

Set a radius value to all localizations.

Parameters

radius – (int, float) New radius value

Raise

AnnDataTypeError, AnnDataNegValueError

set_score(loc, score)[source]

Set a score to a given localization.

Parameters
  • loc – (sppasLocalization)

  • score – (float)

shift(delay)[source]

Shift the location to a given delay.

Parameters

delay – (int, float) delay to shift all localizations

Raise

AnnDataTypeError

class anndata.sppasMedia(filename, media_id=None, mime_type=None)[source]

Bases: anndata.metadata.sppasMetaData

Generic representation of a media file.

__init__(filename, media_id=None, mime_type=None)[source]

Create a new sppasMedia instance.

Parameters
  • filename – (str) File name of the media

  • media_id – (str) Identifier of the media

  • mime_type – (str) Mime type of the media

get_content()[source]

Return the content of the media.

get_filename()[source]

Return the URL of the media.

get_mime_type()[source]

Return the mime type of the media.

set_content(content)[source]

Set the content of the media.

Parameters

content – (str)

class anndata.sppasMetaData[source]

Bases: object

Dictionary of meta data including a required ‘id’.

Meta data keys and values are unicode strings.
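
A dictionary with a required 'id' entry can be sketched with the standard uuid module. This is only an illustration: the class name MetaDataSketch is hypothetical, and using uuid4 for the GUID-like identifier is an assumption about how such an id may be generated.

```python
import uuid


class MetaDataSketch:
    """Toy metadata holder: a dict of strings with a required 'id' key."""

    def __init__(self):
        self._meta = {"id": str(uuid.uuid4())}

    def get_id(self):
        return self._meta["id"]

    def is_meta_key(self, entry):
        return entry in self._meta

    def get_meta(self, entry, default=""):
        """Return the value of the key, or the default if it is not a key."""
        return self._meta.get(entry, default)

    def set_meta(self, key, value):
        """Keys and values are both stored as strings."""
        self._meta[str(key)] = str(value)


meta = MetaDataSketch()
meta.set_meta("language", "und")
print(meta.get_meta("language"))           # und
print(meta.get_meta("license", "unset"))   # unset
print(meta.is_meta_key("id"))              # True
```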

__init__()[source]

Create a sppasMetaData instance.

Add a GUID-like identifier to the dictionary of metadata, with key “id”.

add_annotator_metadata(name='', version='', version_date='')[source]

Add metadata about an annotator.

Parameters
  • name – (str)

  • version – (str)

  • version_date – (str)

TODO: CHECK IF KEYS ARE NOT ALREADY EXISTING.

add_language_metadata()[source]

Add metadata about the language (und).

TODO: CHECK IF KEYS NOT ALREADY EXISTING.

add_license_metadata(idx)[source]

Add metadata about the license applied to the object (GPLv3).

add_project_metadata()[source]

Add metadata about the project this object is included in.

Currently does not assign any value. TODO: CHECK IF KEYS NOT ALREADY EXISTING.

add_software_metadata()[source]

Add metadata about this software.

TODO: CHECK IF KEYS NOT ALREADY EXISTING.

gen_id()[source]

Re-generate an ‘id’.

get_id()[source]

Return the identifier of this object.

get_meta(entry, default='')[source]

Return the value of the given key.

Parameters
  • entry – (str) Entry to be checked as a key.

  • default – (str) Default value to return if entry is not a key.

Returns

(str) meta data value or default value

get_meta_keys()[source]

Return the list of metadata keys.

is_meta_key(entry)[source]

Check if an entry is a key in the list of metadata.

Parameters

entry – (str) Entry to check

Returns

(Boolean)

pop_meta(key)[source]

Remove a metadata entry given its key.

Parameters

key – (str)

set_meta(key, value)[source]

Set or update a metadata.

Parameters
  • key – (str) The key of the metadata.

  • value – (str) The value assigned to the key.

Both key and value are formatted and stored as unicode.
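
As a rough illustration of the behaviour described for sppasMetaData, the sketch below keeps metadata in a plain dictionary that always holds an 'id'. The class name MetaSketch and the use of uuid.uuid4() for the GUID-like identifier are assumptions made for this example; they are not taken from the SPPAS source.

```python
import uuid


class MetaSketch:
    """Hypothetical stand-in for a metadata dictionary with a required 'id'."""

    def __init__(self):
        self._meta = {}
        self.gen_id()

    def gen_id(self):
        # Assumption: a random UUID plays the role of the GUID-like identifier.
        self._meta["id"] = str(uuid.uuid4())

    def get_id(self):
        return self._meta["id"]

    def is_meta_key(self, entry):
        return entry in self._meta

    def get_meta(self, entry, default=""):
        # Fall back to the default when the entry is not a key.
        return self._meta.get(entry, default)

    def set_meta(self, key, value):
        # Keys and values are stored as text strings.
        self._meta[str(key)] = str(value)


meta = MetaSketch()
meta.set_meta("language", "und")
print(meta.get_meta("language"))        # und
print(meta.get_meta("missing", "n/a"))  # n/a
```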

class anndata.sppasPoint(midpoint, radius=None)[source]

Bases: anndata.ann.annlocation.localization.sppasBaseLocalization

Localization of a point for any numerical representation.

Represents a point identified by a midpoint value and a radius value. Generally, time is represented in seconds, as a float value; frames are represented by integers, like ranks.

In this class, the 3 relations <, = and > take into account a radius value, that represents the uncertainty of the localization. For a point x, with a radius value of rx, and a point y with a radius value of ry, these relations are defined as:

  • x = y iff |x - y| <= rx + ry

  • x < y iff not(x = y) and x < y

  • x > y iff not(x = y) and x > y

Example 1

Strictly equals:

  • x = 1.000, rx=0.

  • y = 1.000, ry=0.

  • x = y is true

  • x = 1.00000000000, rx=0.

  • y = 0.99999999675, ry=0.

  • x = y is false

Example 2

Using the radius:

  • x = 1.0000000000, rx=0.0005

  • y = 1.0000987653, ry=0.0005

  • x = y is true (accepts a margin of 1ms between x and y)

  • x = 1.0000000, rx=0.0005

  • y = 1.0011235, ry=0.0005

  • x = y is false

So an overlap of the vagueness “areas” makes the two points equal:

|------------rx----------X-----ry===rx----Y--------ry------|
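
The three relations above can be sketched in a few lines. PointSketch is a hypothetical class written for this example, not the sppasPoint implementation, and it assumes that a missing radius counts as zero vagueness.

```python
class PointSketch:
    """Hypothetical point with a midpoint and a radius (vagueness)."""

    def __init__(self, midpoint, radius=None):
        self.midpoint = midpoint
        # Assumption: a missing radius counts as zero vagueness.
        self.radius = 0 if radius is None else radius

    def __eq__(self, other):
        # Equal when the two vagueness areas touch or overlap.
        return abs(self.midpoint - other.midpoint) <= self.radius + other.radius

    def __lt__(self, other):
        # Strictly lower only when the points are not considered equal.
        return not self == other and self.midpoint < other.midpoint

    def __gt__(self, other):
        return not self == other and self.midpoint > other.midpoint


# Values of Example 2: the two radii add up to a margin of 1 ms.
x = PointSketch(1.0, 0.0005)
print(x == PointSketch(1.0000987653, 0.0005))  # True
print(x == PointSketch(1.0011235, 0.0005))     # False
print(x < PointSketch(1.0011235, 0.0005))      # True
```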

__init__(midpoint, radius=None)[source]

Create a sppasPoint instance.

Parameters
  • midpoint – (float, int) midpoint value.

  • radius – (float, int) represents the vagueness of the point.

Radius must be of the same type as midpoint.

static check_types(x, y)[source]

Return True only if the two given values (typically a midpoint and a radius) are of the same type.

Parameters
  • x – any kind of data

  • y – any kind of data

Returns

Boolean

copy()[source]

Return a deep copy of self.

duration()[source]

Overrides. Return the duration of the point.

Returns

(sppasDuration) Duration and its vagueness.

get_midpoint()[source]

Return the midpoint value.

get_radius()[source]

Return the radius value (float or None).

is_float()[source]

Return True if the value of the point is a float.

is_int()[source]

Return True if the value of the point is an integer.

is_point()[source]

Override. Return True, because self represents a point.

set(other)[source]

Set self members from another sppasPoint instance.

Parameters

other – (sppasPoint)

set_midpoint(midpoint)[source]

Set the midpoint value.

In versions < 1.9.8, it was required that midpoint >= 0. Negative values are now accepted because some annotations are not properly synchronized and then some of them can be negative.

Parameters

midpoint – (float, int) is the new midpoint value.

Raise

AnnDataTypeError

set_radius(radius=None)[source]

Set the radius value, i.e. the vagueness of the point.

The midpoint value must be set first.

Parameters

radius – (float, int, None) the radius value

Raise

AnnDataTypeError, AnnDataNegValueError

shift(delay)[source]

Shift the point by a given delay.

Parameters

delay – (int, float) delay to shift midpoint

Raise

AnnDataTypeError

class anndata.sppasTag(tag_content, tag_type=None)[source]

Bases: object

Represent one of the possible tags of a label.

A sppasTag is a data content of any type. By default, the type of the data is “str” and the content is empty, but internally the sppasTag stores ‘None’ values because None is 16 bytes and an empty string is 37.

A sppasTag() content can be one of the following types:

  1. string/unicode - (str)

  2. integer - (int)

  3. float - (float)

  4. boolean - (bool)

  5. point - (sppasFuzzyPoint)

  6. rect - (sppasFuzzyRect)

Get access to the content with the get_content() method and to the typed content with get_typed_content().

>>> t1 = sppasTag("2")                      # "2" (str)
>>> t2 = sppasTag(2)                        # "2" (str)
>>> t3 = sppasTag(2, tag_type="int")        # 2 (int)
>>> t4 = sppasTag("2", tag_type="int")      # 2 (int)
>>> t5 = sppasTag("2", tag_type="float")    # 2. (float)
>>> t6 = sppasTag("true", tag_type="bool")  # True (bool)
>>> t7 = sppasTag(0, tag_type="bool")       # False (bool)
>>> t8 = sppasTag((27, 32), tag_type="point")  # x=27, y=32 (point)
>>> t9 = sppasTag((27, 32, 320, 200), tag_type="rect")
TAG_TYPES = ('str', 'float', 'int', 'bool', 'point', 'rect')
__init__(tag_content, tag_type=None)[source]

Initialize a new sppasTag instance.

Parameters
  • tag_content – (any) Data content

  • tag_type – (str): The type of this content. One of: (‘str’, ‘int’, ‘float’, ‘bool’, ‘point’, ‘rect’).

‘str’ is the default tag_type.

copy()[source]

Return a deep copy of self.

get_content()[source]

Return a unicode string corresponding to the content.

Also returns a unicode string in case of a list (elements are separated by a whitespace).

Returns

(unicode)

get_type()[source]

Return the type of the tag content.

get_typed_content()[source]

Return the content value, in its appropriate type.

Except for strings, which are always returned as unicode.
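
The conversions shown in the class examples (t1 to t7) can be approximated as follows. The helper typed_content is hypothetical and only covers the 'str', 'int', 'float' and 'bool' types; the rule used for booleans is an assumption chosen to reproduce those examples, and the 'point' and 'rect' types are left out.

```python
def typed_content(content, tag_type="str"):
    """Hypothetical helper: convert a tag content to the requested type."""
    if tag_type == "str":
        return str(content)
    if tag_type == "int":
        return int(content)
    if tag_type == "float":
        return float(content)
    if tag_type == "bool":
        # Assumption: for text, only "true" (any case) is True;
        # non-text values follow Python truthiness.
        if isinstance(content, str):
            return content.strip().lower() == "true"
        return bool(content)
    raise ValueError("unsupported tag type: {}".format(tag_type))


print(typed_content(2))               # 2 (as a str)
print(typed_content("2", "int"))      # 2
print(typed_content("2", "float"))    # 2.0
print(typed_content("true", "bool"))  # True
print(typed_content(0, "bool"))       # False
```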

is_dummy()[source]

Return True if the tag is a dummy label.

is_empty()[source]

Return True if the tag is an empty string.

is_laugh()[source]

Return True if the tag is a laugh.

is_noise()[source]

Return True if the tag is a noise.

is_pause()[source]

Return True if the tag is a short pause.

is_silence()[source]

Return True if the tag is a silence.

is_speech()[source]

Return True if the tag is not a silence.

set(other)[source]

Set self members from another sppasTag instance.

Parameters

other – (sppasTag)

set_content(tag_content, tag_type=None)[source]

Change content of this sppasTag.

Parameters
  • tag_content – (any) New text content for this sppasTag

  • tag_type – The type of this tag. Default is ‘str’ to represent a unicode string.

Raise

AnnUnkTypeError, AnnDataTypeError

class anndata.sppasTagCompare[source]

Bases: sppas.src.structs.basecompare.sppasBaseCompare

Comparison methods for sppasTag.

Label tags can be of 3 types in anndata (str, num, bool), so this class provides different comparison methods depending on the type of the tags.

Example

Three different ways to compare a tag content to a given string:

>>> tc = sppasTagCompare()
>>> tc.exact(sppasTag("abc"), u("abc"))
>>> tc.methods['exact'](sppasTag("abc"), u("abc"))
>>> tc.get('exact')(sppasTag("abc"), u("abc"))
__init__()[source]

Create a sppasTagCompare instance.

static bool(tag, x)[source]

Return True if boolean value of the tag is equal to boolean x.

Parameters
  • tag – (sppasTag) Tag to compare.

  • x – (bool)

Returns

(bool)

Raises

AnnDataTypeError

static contains(tag, text)[source]

Test if the first text contains the second text.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static endswith(tag, text)[source]

Test if first text ends with the characters of the second text.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static equal(tag, x)[source]

Return True if numerical value of the tag is equal to x.

Parameters
  • tag – (sppasTag) Tag to compare.

  • x – (int, float)

Returns

(bool)

Raises

AnnDataTypeError

static exact(tag, text)[source]

Test if two texts strictly contain the same characters.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static greater(tag, x)[source]

Return True if numerical value of the tag is greater than x.

Parameters
  • tag – (sppasTag) Tag to compare.

  • x – (int, float)

Returns

(bool)

Raises

AnnDataTypeError

static icontains(tag, text)[source]

Case-insensitive contains.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static iendswith(tag, text)[source]

Case-insensitive endswith.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static iexact(tag, text)[source]

Case-insensitive exact.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static istartswith(tag, text)[source]

Case-insensitive startswith.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError

static lower(tag, x)[source]

Return True if numerical value of the tag is lower than x.

Parameters
  • tag – (sppasTag) Tag to compare.

  • x – (int, float)

Returns

(bool)

Raises

AnnDataTypeError

static regexp(tag, pattern)[source]

Test if text matches pattern.

Parameters
  • tag – (sppasTag) Tag to compare.

  • pattern – (unicode) Pattern to search.

Returns

(bool)

Raises

AnnDataTypeError

static startswith(tag, text)[source]

Test if first text starts with the characters of the second text.

Parameters
  • tag – (sppasTag) Tag to compare.

  • text – (unicode) Unicode string to be compared with.

Returns

(bool)

Raises

AnnDataTypeError
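
The three calling styles shown in the sppasTagCompare example (direct call, the methods dictionary, and get()) suggest a registry of comparison functions. The sketch below illustrates that pattern on plain strings; CompareSketch and its reduced set of methods are hypothetical and do not reproduce the SPPAS classes.

```python
class CompareSketch:
    """Hypothetical registry of string comparison functions."""

    def __init__(self):
        # Each comparison is reachable by name through this dictionary.
        self.methods = {
            "exact": CompareSketch.exact,
            "iexact": CompareSketch.iexact,
            "startswith": CompareSketch.startswith,
            "contains": CompareSketch.contains,
        }

    def get(self, name):
        return self.methods[name]

    @staticmethod
    def exact(content, text):
        return content == text

    @staticmethod
    def iexact(content, text):
        # Case-insensitive variant of exact.
        return content.lower() == text.lower()

    @staticmethod
    def startswith(content, text):
        return content.startswith(text)

    @staticmethod
    def contains(content, text):
        return text in content


tc = CompareSketch()
print(tc.exact("abc", "abc"))             # True
print(tc.methods["exact"]("abc", "abd"))  # False
print(tc.get("iexact")("abc", "ABC"))     # True
```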

class anndata.sppasTier(name=None, ctrl_vocab=None, media=None, parent=None)[source]

Bases: anndata.metadata.sppasMetaData

Representation of a tier, a structured set of annotations.

Annotations of a tier are sorted depending on their location (from lowest to highest).

A Tier is made of:

  • a name (used to identify the tier),

  • a set of metadata,

  • an array of annotations,

  • a controlled vocabulary (optional),

  • a media (optional),

  • a parent (optional).

__init__(name=None, ctrl_vocab=None, media=None, parent=None)[source]

Create a new sppasTier instance.

Parameters
  • name – (str) Name of the tier. It is used as identifier.

  • ctrl_vocab – (sppasCtrlVocab)

  • media – (sppasMedia)

  • parent – (sppasTranscription)

add(annotation)[source]

Add an annotation to the tier in sorted order.

Assign this tier as parent to the annotation.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

Returns

the index of the annotation in the tier
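
A sorted insertion that returns the index of the new element can be sketched with the standard bisect module. Annotations are modelled here as (begin, end, label) tuples and add_sorted is a hypothetical helper; the validation steps performed by add() are not reproduced.

```python
import bisect


def add_sorted(annotations, new_ann):
    """Insert a (begin, end, label) tuple, keeping the list sorted by location."""
    keys = [(begin, end) for begin, end, _ in annotations]
    # Find the position after any annotation with the same location.
    index = bisect.bisect_right(keys, (new_ann[0], new_ann[1]))
    annotations.insert(index, new_ann)
    return index


tier = [(0.0, 1.0, "a"), (2.0, 3.0, "c")]
print(add_sorted(tier, (1.0, 2.0, "b")))  # 1
print([label for _, _, label in tier])    # ['a', 'b', 'c']
```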

append(annotation)[source]

Append the given annotation at the end of the tier.

Assign this tier as parent to the annotation.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError, TierAppendError

copy()[source]

Return a deep copy of the tier.

Returns

(sppasTier) including the ‘id’.

create_annotation(location, labels=None)[source]

Create and add a new annotation into the tier.

Parameters
  • location – (sppasLocation) the location(s) where the annotation happens

  • labels – (sppasLabel, list) the label(s) to assign to this annotation.

Returns

sppasAnnotation

create_annotation_after(idx)[source]

Create and add a new annotation in the hole after idx.

Parameters

idx – (int) Index of an existing annotation

Returns

sppasAnnotation

Raises

AnnDataTypeError, AnnDataIndexError

create_annotation_before(idx)[source]

Create and add a new annotation in the hole before idx.

Parameters

idx – (int) Index of an existing annotation

Returns

sppasAnnotation

Raises

AnnDataTypeError, AnnDataIndexError

create_ctrl_vocab(name=None)[source]

Create the controlled vocabulary from annotation labels.

Create (or re-create) the controlled vocabulary from the list of already existing annotation labels. The current controlled vocabulary is deleted.

Parameters

name – (str) Name of the controlled vocabulary. The name of the tier is used by default.

export_to_intervals(separators)[source]

Create a tier with the consecutive filled intervals.

Return an empty tier if ‘self’ is not of type “interval”. The created intervals are not filled.

Parameters

separators – (list)

Returns

(sppasTier)

export_unfilled()[source]

Create a tier with the unlabelled/unfilled intervals.

Only for tiers of type Interval. It represents the “NOT tier”, i.e. where this tier is not annotated.

IMPORTANT: Never tested with overlapped annotations, actually not tested at all (but used in the plugin StatGroups).

Returns

(sppasTier) or None
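
One possible reading of the “NOT tier” is the list of holes between consecutive intervals. The sketch below works on plain (begin, end) tuples, assumes the intervals do not overlap, and ignores anything before the first or after the last interval; unfilled is a hypothetical name.

```python
def unfilled(intervals):
    """Return the gaps between consecutive (begin, end) intervals."""
    gaps = []
    ordered = sorted(intervals)
    # Compare the end of each interval with the begin of the next one.
    for (_, prev_end), (next_begin, _) in zip(ordered, ordered[1:]):
        if prev_end < next_begin:
            gaps.append((prev_end, next_begin))
    return gaps


print(unfilled([(0, 1), (2, 3), (3, 5), (7, 8)]))  # [(1, 2), (5, 7)]
```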

find(begin, end, overlaps=True, indexes=False)[source]

Return a list of annotations between begin and end.

Parameters
  • begin – sppasPoint or None to start from the beginning of the tier

  • end – sppasPoint or None to end at the end of the tier

  • overlaps – (bool) Return also overlapped annotations. Not relevant for tiers with points.

  • indexes – (bool) Return indexes instead of annotations

Returns

List of sppasAnnotation or list of indexes
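
The effect of the overlaps and indexes options can be illustrated on (begin, end, label) tuples. This is a simplified sketch with a hypothetical find function: it assumes that overlaps=False keeps only the annotations lying entirely between begin and end, and it ignores radius values.

```python
def find(annotations, begin=None, end=None, overlaps=True, indexes=False):
    """Select (begin, end, label) tuples located between begin and end."""
    selected = []
    for i, (ann_begin, ann_end, _) in enumerate(annotations):
        # None means "no limit" on that side.
        low = ann_begin if begin is None else begin
        high = ann_end if end is None else end
        if overlaps:
            keep = ann_begin < high and ann_end > low
        else:
            keep = ann_begin >= low and ann_end <= high
        if keep:
            selected.append(i if indexes else annotations[i])
    return selected


tier = [(0, 2, "a"), (2, 4, "b"), (4, 6, "c")]
print(find(tier, 1, 4, indexes=True))                  # [0, 1]
print(find(tier, 1, 4, overlaps=False, indexes=True))  # [1]
```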

fit(other)[source]

Select then slice or extend annotations to fit in other tier.

Keep only the annotations of self that have some overlapping time with the given other tier and slice the localization of such selected annotations to exactly match those of the other tier.

Example:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

tier1: |_a_|_b_| |_c_| |_d_| |_e_| |f|

tier2: |_w_| |_______x_______| |_y_| |_z_|

tier1.fit(tier2) result is:

|a|b| |_c_| |_d_| |e|

tier2.fit(tier1) result is:

|w|w| |_x_| |_x_| |y|

Parameters

other – (sppasTier)

Returns

(sppasTier)
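
The selection and slicing performed by fit() can be pictured as a pairwise intersection of intervals in which the label of self is kept. The sketch below uses (begin, end, label) tuples with made-up values; fit here is a hypothetical function, not the sppasTier method.

```python
def fit(self_tier, other_tier):
    """Keep the parts of self_tier intervals that overlap other_tier intervals."""
    result = []
    for begin1, end1, label in self_tier:
        for begin2, end2, _ in other_tier:
            begin, end = max(begin1, begin2), min(end1, end2)
            # A non-empty intersection yields one sliced annotation.
            if begin < end:
                result.append((begin, end, label))
    return result


tier1 = [(0, 2, "a"), (2, 4, "b"), (6, 8, "c")]
tier2 = [(1, 3, "w"), (5, 9, "x")]
print(fit(tier1, tier2))  # [(1, 2, 'a'), (2, 3, 'b'), (6, 8, 'c')]
print(fit(tier2, tier1))  # [(1, 2, 'w'), (2, 3, 'w'), (6, 8, 'x')]
```

As in the example above, an annotation of self that overlaps two annotations of the other tier appears twice in the result.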

get_all_points()[source]

Return the list of all points of the tier.

get_annotation(identifier)[source]

Find an annotation from its metadata ‘id’.

Parameters

identifier – (str) Metadata ‘id’ of an annotation.

Returns

sppasAnnotation or None

get_annotation_index(ann)[source]

Find an annotation.

Parameters

ann – (sppasAnnotation)

Returns

(int) -1 if not found

get_ctrl_vocab()[source]

Return the controlled vocabulary of the tier.

get_first_point()[source]

Return the first point of the first annotation.

get_labels_type()[source]

Return the current type of labels, or an empty string.

get_last_point()[source]

Return the last point of the last location.

get_media()[source]

Return the media of the tier.

get_midpoint_intervals()[source]

Return midpoint values of all the intervals.

get_midpoint_points()[source]

Return midpoint values of all the points.

get_name()[source]

Return the identifier name of the tier.

get_nb_filled_labels()[source]

Return the number of annotations with a filled label.

get_parent()[source]

Return the parent of the tier.

has_location(location)[source]

Return True if the tier has the given location.

To be tested.

has_point(point)[source]

Return True if the tier contains a given point.

Parameters

point – (sppasPoint) The point to find in the tier.

Returns

(bool)

index(moment)[source]

Return the index of the moment (int), or -1.

Only for tiers with points.

Parameters

moment – (sppasPoint)

is_bool()[source]

All label tags are boolean values or None.

is_disjoint()[source]

Return True if the tier is made of disjoint localizations.

is_empty()[source]

Return True if the tier does not contain annotations.

is_float()[source]

All label tags are float values or None.

is_int()[source]

All label tags are integer values or None.

is_interval()[source]

Return True if the tier is made of interval localizations.

is_point()[source]

Return True if the tier is made of point localizations.

is_string()[source]

All label tags are string or unicode or None.

is_superset(other)[source]

Return True if this tier contains all points of the other tier.

Parameters

other – (sppasTier)

Returns

Boolean

lindex(moment)[source]

Return the index of the interval starting at a given moment, or -1.

Only for tiers with interval or disjoint localizations. If the tier contains more than one annotation starting at the same moment, the method returns the first one.

Parameters

moment – (sppasPoint)

merge(idx, direction)[source]

Merge the annotation at given index with next or previous one.

if direction > 0:

  ann_idx: [begin_idx, end_idx, labels_idx]
  next_ann: [begin_n, end_n, labels_n]
  result: [begin_idx, end_n, labels_idx + labels_n]

if direction < 0:

  prev_ann: [begin_p, end_p, labels_p]
  ann_idx: [begin_idx, end_idx, labels_idx]
  result: [begin_p, end_idx, labels_p + labels_idx]

Parameters
  • idx – (int) Index of the annotation in the list

  • direction – (int) Positive for next, Negative for previous

Returns

(bool) False if direction does not match with index

Raise

Exception if the merged annotation can’t be deleted from the tier
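
The merge rules above can be sketched on a list of [begin, end, labels] entries. merge is a hypothetical function for illustration; it returns False when there is no neighbour in the requested direction and does not model the hierarchy checks.

```python
def merge(annotations, idx, direction):
    """Merge the [begin, end, labels] entry at idx with its neighbour."""
    if direction == 0 or not 0 <= idx < len(annotations):
        return False
    other = idx + 1 if direction > 0 else idx - 1
    if not 0 <= other < len(annotations):
        # No next (or previous) annotation to merge with.
        return False
    first, second = sorted((idx, other))
    begin, _, labels_first = annotations[first]
    _, end, labels_second = annotations[second]
    # The merged entry spans both and concatenates the labels in order.
    annotations[first] = [begin, end, labels_first + labels_second]
    del annotations[second]
    return True


tier = [[0, 1, ["a"]], [1, 2, ["b"]], [2, 3, ["c"]]]
print(merge(tier, 0, 1))   # True
print(tier)                # [[0, 2, ['a', 'b']], [2, 3, ['c']]]
print(merge(tier, 0, -1))  # False
```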

mindex(moment, bound=0)[source]

Return index of the interval containing the given moment.

Only for tiers with interval or disjoint localizations.

If the tier contains more than one annotation at the same moment, the method returns the first one (i.e. the one which started first).

Parameters
  • moment – (sppasPoint)

  • bound – (int) Which bounds of the interval to include: 0 to exclude both bounds; -1 to include the begin bound; +1 to include the end bound; +2 to include both begin/end bounds; any other value: the midpoint of moment must be strictly inside.

Returns

(int) Index of the 1st annotation containing moment or -1
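
The role of the bound parameter can be shown on plain (begin, end) tuples with numeric moments. mindex below is a hypothetical sketch that ignores radius values.

```python
def mindex(intervals, moment, bound=0):
    """Return the index of the first (begin, end) interval containing moment."""
    for i, (begin, end) in enumerate(intervals):
        # -1 and +2 include the begin bound; +1 and +2 include the end bound.
        after_begin = begin <= moment if bound in (-1, 2) else begin < moment
        before_end = moment <= end if bound in (1, 2) else moment < end
        if after_begin and before_end:
            return i
    return -1


intervals = [(0, 1), (1, 2)]
print(mindex(intervals, 1))      # -1
print(mindex(intervals, 1, -1))  # 1
print(mindex(intervals, 1, 1))   # 0
print(mindex(intervals, 1, 2))   # 0
```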

near(moment, direction=1)[source]

Search for the annotation whose localization is closest.

Search for the nearest localization to the given moment into a given direction.

Parameters
  • moment – (sppasPoint)

  • direction – (int) 0 for the nearest, 1 for the nearest forward, -1 for the nearest backward

pop(index=-1)[source]

Remove the annotation at the given position in the tier.

If no index is specified, pop() removes and returns the last annotation in the tier.

Parameters

index – (int) Index of the annotation to remove.

Raises

HierarchyContainsError

remove(begin, end, overlaps=False)[source]

Remove annotation intervals between begin and end.

Parameters
  • begin – (sppasPoint)

  • end – (sppasPoint)

  • overlaps – (bool)

Returns

the number of removed annotations

Raises

HierarchyContainsError

remove_unlabelled()[source]

Remove annotations without labels.

Do not remove an annotation if it invalidates the hierarchy.

Returns

the number of removed annotations

rindex(moment)[source]

Return the index of the interval ending at the given moment.

Only for tiers with interval or disjoint localizations. If the tier contains more than one annotation ending at the same moment, the method returns the last one.

Parameters

moment – (sppasPoint)

set_ctrl_vocab(ctrl_vocab=None)[source]

Set a controlled vocabulary to this tier.

Parameters

ctrl_vocab – (sppasCtrlVocab or None)

Raises

AnnDataTypeError, CtrlVocabContainsError

set_media(media)[source]

Set a media to the tier.

Parameters

media – (sppasMedia)

Raises

AnnDataTypeError

set_name(name=None)[source]

Set the name of the tier.

If no name is given, a GUID is randomly assigned. Important: An empty string is accepted.

Parameters

name – (str) The identifier name or None.

Returns

the formatted name

set_parent(parent)[source]

Set the parent of the tier.

Parameters

parent – (sppasTranscription)

set_radius(radius)[source]

Set the same radius value for all points of the tier.

Parameters

radius – (int, float) New radius value

Raise

AnnDataTypeError, AnnDataNegValueError

split(idx)[source]

Split annotation at the given index into 2 annotations.

Parameters

idx – (int) Index of the annotation to split.

Returns

newly created annotation at index idx+1

validate()[source]
validate_annotation(annotation)[source]

Validate the annotation and set its parent to this tier.

Parameters

annotation – (sppasAnnotation)

Raises

AnnDataTypeError, CtrlVocabContainsError, HierarchyContainsError, HierarchyTypeError

validate_annotation_label(label)[source]

Validate a label.

Parameters

label – (sppasLabel)

Raises

CtrlVocabContainsError

validate_annotation_location(location)[source]

Ask the parent to validate a location.

Parameters

location – (sppasLocation)

Raises

AnnDataTypeError, HierarchyContainsError, HierarchyTypeError

class anndata.sppasTranscription(name=None)[source]

Bases: anndata.metadata.sppasMetaData

Representation of a transcription, the root in our framework.

Transcriptions in SPPAS are represented with:

  • metadata: a list of key/value tuples;

  • a name (used to identify the transcription);

  • a list of tiers;

  • a hierarchy between tiers;

  • a list of media;

  • a list of controlled vocabularies.

Inter-tier relations are managed by establishing alignment or association links between 2 tiers:

  • alignment: annotations of a tier A (child) have only localization instances included in those of annotations of tier B (parent);

  • association: annotations of a tier A have exactly localization instances included in those of annotations of tier B.
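
One way to read these two link types is in terms of the sets of points of each tier: an alignment requires the child's points to be a subset of the parent's points, and an association requires both sets to be identical. The sketch below encodes that interpretation on (begin, end) tuples; the function names are hypothetical and the interpretation itself is an assumption.

```python
def all_points(tier):
    """Collect the begin and end values of (begin, end) intervals."""
    return {point for interval in tier for point in interval}


def is_aligned(child, parent):
    # Every point of the child also exists in the parent.
    return all_points(child) <= all_points(parent)


def is_associated(child, parent):
    # Both tiers have exactly the same points.
    return all_points(child) == all_points(parent)


phonemes = [(0, 1), (1, 2), (2, 3)]
tokens = [(0, 2), (2, 3)]
print(is_aligned(tokens, phonemes))     # True
print(is_associated(tokens, phonemes))  # False
```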

Example
>>> # Create an instance
>>> trs = sppasTranscription("trs name")
>>> # Create a tier
>>> trs.create_tier("tier name")
>>> # Get a tier of a transcription from its index:
>>> tier = trs[0]
>>> # Get a tier of a transcription from its name
>>> tier = trs.find("tier name")
>>> # Get a tier from its identifier
>>> tier = trs.get_object(guid)
__init__(name=None)[source]

Create a new sppasTranscription instance.

Parameters

name – (str) Name of the transcription.

add_ctrl_vocab(new_ctrl_vocab)[source]

Add a new controlled vocabulary in the list of ctrl vocab.

Parameters

new_ctrl_vocab – (sppasCtrlVocab)

Raises

AnnDataTypeError, TrsAddError

Validate and add a hierarchy link between 2 tiers.

Parameters
  • link_type – (constant) One of the hierarchy types

  • parent_tier – (Tier) The reference tier

  • child_tier – (Tier) The child tier to be linked to the parent tier

add_media(new_media)[source]

Add a new media in the list of media.

Does not add the media if a media with the same id is already in self.

Parameters

new_media – (sppasMedia) The media to add.

Raises

AnnDataTypeError, TrsAddError

append(tier)[source]

Append a new tier.

Parameters

tier – (sppasTier) the tier to append

contains(tier)[source]

Return True if the given tier is in the list of tiers.

create_tier(name=None, ctrl_vocab=None, media=None)[source]

Create and append a new empty tier.

Parameters
  • name – (str) the name of the tier to create

  • ctrl_vocab – (sppasCtrlVocab)

  • media – (sppasMedia)

Returns

newly created empty tier

find(name, case_sensitive=True)[source]

Find a tier from its name.

Parameters
  • name – (str) EXACT name of the tier

  • case_sensitive – (bool)

Returns

sppasTier or None

find_id(tier_id)[source]

Find a tier from its identifier.

Parameters

tier_id – (str) Exact identifier of the tier

Returns

sppasTier or None

get_ctrl_vocab_from_id(ctrl_vocab_id)[source]

Return a sppasCtrlVocab from its id or None.

Parameters

ctrl_vocab_id – (str) Identifier name of a ctrl vocab

get_ctrl_vocab_from_name(ctrl_vocab_name)[source]

Return a sppasCtrlVocab from its name or None.

Parameters

ctrl_vocab_name – (str) Identifier name of a ctrl vocabulary

get_ctrl_vocab_list()[source]

Return the list of controlled vocabularies.

get_hierarchy()[source]

Return the hierarchy.

get_max_loc()[source]

Return the sppasPoint with the highest value through all tiers.

get_media_from_id(media_id)[source]

Return a sppasMedia from its id or None.

Parameters

media_id – (str) Identifier name of a media

get_media_list()[source]

Return the list of sppasMedia.

get_min_loc()[source]

Return the sppasPoint with the lowest value through all tiers.

get_name()[source]

Return the name of the transcription.

get_object(identifier)[source]

Return the object matching the given identifier.

Parameters

identifier – (GUID)

get_tier_from_id(tier_id)[source]

Return a sppasTier from its id or None.

Parameters

tier_id – (str) Identifier name of a tier

get_tier_index(name, case_sensitive=True)[source]

Get the index of a tier from its name.

Parameters
  • name – (str) EXACT name of the tier

  • case_sensitive – (bool)

Returns

index or -1 if not found

get_tier_index_id(identifier)[source]

Get the index of a tier from its id.

Parameters

identifier – (str) GUID

Returns

index or -1 if not found

get_tier_list()[source]

Return the list of tiers.

is_empty()[source]

Return True if the transcription does not contain tiers.

pop(index=-1)[source]

Remove the tier at the given position in the transcription.

Return it. If no index is specified, pop() removes and returns the last tier in the transcription.

Parameters

index – (int) Index of the tier to remove.

Returns

(sppasTier)

Raise

AnnDataIndexError

remove_ctrl_vocab(old_ctrl_vocab)[source]

Remove a controlled vocabulary from the list of ctrl vocab.

Parameters

old_ctrl_vocab – (sppasCtrlVocab)

Raises

AnnDataTypeError, TrsRemoveError

remove_media(old_media)[source]

Remove a media from the list of media.

Parameters

old_media – (sppasMedia)

Raises

AnnDataTypeError, TrsRemoveError

rename_tier(tier)[source]

Rename a tier by appending a digit.

Parameters

tier – (sppasTier) The tier to rename.

set_ctrl_vocab_list(ctrl_vocab_list)[source]

Set the list of controlled vocabularies.

Parameters

ctrl_vocab_list – (list)

Returns

list of rejected ctrl_vocab

set_media_list(media_list)[source]

Set the list of media.

Parameters

media_list – (list)

Returns

list of rejected media

set_name(name=None)[source]

Set the name of the transcription.

Parameters

name – (str or None) The identifier or None to set the GUID.

Returns

the name

set_tier_index(name, new_index, case_sensitive=True)[source]

Set the index of a tier from its name.

THIS SHOULD NEVER BE USED. USE TIER IDENTIFIER INSTEAD OF ITS NAME.

Parameters
  • name – (str) EXACT name of the tier

  • new_index – (int) New index of the tier in self

  • case_sensitive – (bool)

Returns

index or -1 if not found

set_tier_index_id(identifier, new_index)[source]

Set the index of a tier from its identifier.

Parameters
  • identifier – (str)

  • new_index – (int) New index of the tier in self

Returns

index or -1 if not found

shift(delay)[source]

Shift the locations of all annotations by a given delay.

Parameters

delay – (int, float) delay to shift all localizations

Raise

AnnDataTypeError

property tiers

Return the list of tiers.

validate_annotation_location(tier, location)[source]

Validate a location.

Parameters
  • tier – (Tier) The reference tier

  • location – (sppasLocation)

Raises

AnnDataTypeError, HierarchyContainsError, HierarchyTypeError

validate_hierarchy(tier)[source]
class anndata.sppasTrsRW(filename)[source]

Bases: object

Main parser of annotated data: Reader and writer of annotated data.

All 3 types of annotated files are supported: ANNOT, MEASURE, TABLE.

TRANSCRIPTION_TYPES = {'IntensityTier': <class 'anndata.aio.praat.sppasIntensityTier'>, 'PitchTier': <class 'anndata.aio.praat.sppasPitchTier'>, 'TextGrid': <class 'anndata.aio.praat.sppasTextGrid'>, 'ant': <class 'anndata.aio.annotationpro.sppasANT'>, 'antx': <class 'anndata.aio.annotationpro.sppasANTX'>, 'anvil': <class 'anndata.aio.anvil.sppasAnvil'>, 'arff': <class 'anndata.aio.table.sppasARFF'>, 'aup': <class 'anndata.aio.audacity.sppasAudacity'>, 'csv': <class 'anndata.aio.text.sppasCSV'>, 'ctm': <class 'anndata.aio.sclite.sppasCTM'>, 'eaf': <class 'anndata.aio.elan.sppasEAF'>, 'hz': <class 'anndata.aio.phonedit.sppasSignaix'>, 'lab': <class 'anndata.aio.htk.sppasLab'>, 'mrk': <class 'anndata.aio.phonedit.sppasMRK'>, 'srt': <class 'anndata.aio.subtitle.sppasSubRip'>, 'stm': <class 'anndata.aio.sclite.sppasSTM'>, 'sub': <class 'anndata.aio.subtitle.sppasSubViewer'>, 'tdf': <class 'anndata.aio.xtrans.sppasTDF'>, 'tra': <class 'anndata.aio.table.sppasTRA'>, 'trs': <class 'anndata.aio.transcriber.sppasTRS'>, 'txt': <class 'anndata.aio.text.sppasRawText'>, 'vtt': <class 'anndata.aio.subtitle.sppasWebVTT'>, 'xra': <class 'anndata.aio.xra.sppasXRA'>, 'xrff': <class 'anndata.aio.table.sppasXRFF'>}
__init__(filename)[source]

Create a Transcription reader-writer.

Parameters

filename – (str)

static annot_extensions()[source]

Return the list of ANNOT extensions (case sensitive).

static create_trs_from_extension(filename)[source]

Return a transcription according to a given filename.

Only the extension of the filename is used.

Parameters

filename – (str)

Returns

Transcription()
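
Choosing a reader-writer from the extension alone amounts to a dictionary lookup, much like the TRANSCRIPTION_TYPES mapping listed above. In the sketch below, the two placeholder classes, the FORMATS table and the case-insensitive comparison are all assumptions made for illustration.

```python
import os


class TextGridSketch:
    """Hypothetical placeholder for a Praat TextGrid reader-writer."""


class CsvSketch:
    """Hypothetical placeholder for a CSV reader-writer."""


# Maps an extension to the class that handles it.
FORMATS = {"TextGrid": TextGridSketch, "csv": CsvSketch}


def create_from_extension(filename):
    """Return an instance of the class registered for the file extension."""
    extension = os.path.splitext(filename)[1][1:]
    for known, handler in FORMATS.items():
        # Assumption: the lookup ignores the case of the extension.
        if known.lower() == extension.lower():
            return handler()
    raise ValueError("unknown extension: {!r}".format(extension))


print(type(create_from_extension("sample.TextGrid")).__name__)  # TextGridSketch
print(type(create_from_extension("table.CSV")).__name__)        # CsvSketch
```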

static create_trs_from_heuristic(filename)[source]

Return a transcription according to a given filename.

The given file is opened and a heuristic is used to determine the format.

Parameters

filename – (str)

Returns

Transcription()

static extensions()[source]

Return the whole list of supported extensions (case sensitive).

static extensions_in()[source]

Return the list of supported extensions if the reader exists.

static extensions_out()[source]

Return the list of supported extensions if the writer exists.

get_filename()[source]

Return the filename.

static measure_extensions()[source]

Return the list of MEASURE extensions (case sensitive).

read(heuristic=False)[source]

Read a transcription from a file.

Parameters

heuristic – (bool) If the extension of the file is unknown, use a heuristic to detect the format, then to choose the reader-writer.

Returns

sppasTranscription reader-writer

set_filename(filename)[source]

Set a new filename.

Parameters

filename – (str)

static table_extensions()[source]

Return the list of TABLE extensions (case sensitive).

write(transcription)[source]

Write a transcription into a file.

Parameters

transcription – (sppasTranscription)