The `utterance` Module

An utterance is a unit of speech, usually running from the start of speaking until a significant pause. In speech recognition, audio is segmented into utterances based on the level and duration of voice activity.
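To make the idea of segmenting on level and duration concrete, here is a minimal sketch of threshold-based segmentation. It is an illustration only, not the recognizer's actual algorithm: the function name `segment_utterances`, the per-frame `levels` input, and the `threshold` and `min_pause` parameters are all assumptions made for this example.

```python
def segment_utterances(levels, threshold=0.1, min_pause=3):
    """Group frames into (start, end) index ranges of voice activity.

    A frame counts as active when its level exceeds `threshold`. An open
    utterance is closed once `min_pause` frames have passed since the last
    active frame. The `end` index is exclusive.
    """
    segments = []
    start = None        # first active frame of the utterance being built
    last_active = None  # most recent active frame seen so far
    for i, level in enumerate(levels):
        if level > threshold:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active >= min_pause:
            # The pause is long enough: close the current utterance.
            segments.append((start, last_active + 1))
            start = None
    if start is not None:
        # Audio ended while an utterance was still open.
        segments.append((start, last_active + 1))
    return segments


# Hypothetical per-frame voice activity levels.
levels = [0.0, 0.5, 0.6, 0.0, 0.4, 0.0, 0.0, 0.0, 0.7, 0.8, 0.0]
segments = segment_utterances(levels)
print(segments)
```

The single quiet frame at index 3 is shorter than `min_pause`, so it stays inside the first utterance, while the three quiet frames that follow are long enough to split the audio in two.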

An Utterance object is the decoded data for a given audio utterance. It contains a time-ordered list of the most likely decoded events as well as the speech recognition lattice.

Each event contains the decoded word with start and end time stamps, and the system’s confidence that the entry is the correct decoding.

`Utterance` Objects

class client.utterance.Utterance(metadata=None, lattice=None)

An Utterance object holds decoded data: the list of decoded words, the time-ordered list of events, and the associated client.lattice.Lattice.

words

List of decoded words.

events

Time-ordered list of UtteranceEvent objects.

lattice

Decoded speech Lattice object.

metadata

Dictionary of tags associated with this utterance.

start

Time stamp of the first contained event.

end

Time stamp of the last contained event.

text()

Return the text string for this utterance.

id()

Return the id string of this utterance in the form <source>[c<channel id>][u<utterance number>].
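As a rough illustration of that format, the sketch below builds an id string from its parts. It assumes the square brackets mark optional components and that the parts are concatenated without separators; neither is confirmed here. The helper `format_utterance_id` is hypothetical and not part of the `client.utterance` module.

```python
def format_utterance_id(source, channel=None, number=None):
    """Build an id of the form <source>[c<channel id>][u<utterance number>].

    Assumption: bracketed parts are optional and simply omitted when absent.
    """
    parts = [source]
    if channel is not None:
        parts.append("c{}".format(channel))
    if number is not None:
        parts.append("u{}".format(number))
    return "".join(parts)


# Hypothetical source name, channel id and utterance number.
print(format_utterance_id("call01", channel=2, number=17))
```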

merge(*utts)

Merge this utterance with the given utterances. Merging several utterances in a single call is more efficient than merging them one by one.
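One plausible reason a single call is cheaper: several time-ordered event lists can be combined in one k-way pass, whereas repeated pairwise merges copy the earlier events again each time. The sketch below shows that idea with the standard library's `heapq.merge`. It is not the module's implementation, and it ignores lattices and metadata; the `Event` tuple and `merge_events` function are stand-ins invented for this example.

```python
import heapq
from collections import namedtuple

# Hypothetical stand-in for UtteranceEvent, not the real class.
Event = namedtuple("Event", "word start end confidence")


def merge_events(*event_lists):
    """Merge several time-ordered event lists into one in a single pass.

    Each input list must already be sorted by start time.
    """
    return list(heapq.merge(*event_lists, key=lambda e: e.start))


a = [Event("good", 0.0, 0.4, 0.9), Event("morning", 0.4, 1.0, 0.8)]
b = [Event("how", 2.0, 2.2, 0.7)]
c = [Event("everyone", 1.1, 1.8, 0.6)]

merged = merge_events(a, b, c)
print([e.word for e in merged])
```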

`UtteranceEvent` Objects

class client.utterance.UtteranceEvent

Describes a single utterance event. Each event has the following attributes:

word

The word for this event.

start

The time stamp at which the event starts.

end

The time stamp at which the event ends.

confidence

The system’s level of confidence that this entry is correct.
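To show how the two classes relate, here is a minimal sketch built from stand-in data classes. `SketchEvent` and `SketchUtterance` are hypothetical and are not the real `client.utterance` classes. The sketch assumes that time stamps are in seconds, that confidence lies between 0 and 1, that `start` and `end` come from the first and last events, and that `text()` joins the words with spaces. The `low_confidence` helper is an addition for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchEvent:
    """Hypothetical stand-in for UtteranceEvent."""
    word: str
    start: float       # assumed to be seconds
    end: float         # assumed to be seconds
    confidence: float  # assumed to lie in [0, 1]


@dataclass
class SketchUtterance:
    """Hypothetical stand-in for Utterance, without lattice or metadata."""
    events: List[SketchEvent] = field(default_factory=list)

    @property
    def words(self) -> List[str]:
        return [e.word for e in self.events]

    @property
    def start(self) -> Optional[float]:
        # Assumption: the start time of the first event.
        return self.events[0].start if self.events else None

    @property
    def end(self) -> Optional[float]:
        # Assumption: the end time of the last event.
        return self.events[-1].end if self.events else None

    def text(self) -> str:
        # Assumption: words joined by single spaces.
        return " ".join(self.words)

    def low_confidence(self, threshold: float) -> List[SketchEvent]:
        """Return events whose confidence falls below `threshold`."""
        return [e for e in self.events if e.confidence < threshold]


utt = SketchUtterance([
    SketchEvent("hello", 0.25, 0.5, 0.75),
    SketchEvent("world", 0.5, 1.0, 0.5),
])
print(utt.text(), utt.start, utt.end)
```

Filtering on `confidence`, as `low_confidence` does, is one way to flag words that may need review.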
