The lattice Module

A speech recognition lattice is the matrix of speech possibilities explored by the recognizer for a given audio source. For each time division, or frame, of audio, the best possible word matches are included as entries in the lattice. Each entry is given a confidence score representing the system’s level of confidence in that particular match. Using the lattice, you can explore alternatives to the system’s top word hypothesis. This is especially useful when the top entries in a time frame have very similar scores.
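
The idea of looking for close-scoring alternatives within a frame can be sketched as follows. This is an illustration only: Candidate and close_alternatives are hypothetical names, not part of client.lattice, and the sketch assumes that a higher score means higher confidence and that entries can be grouped by the frame in which they end.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a lattice entry (not the real LatticeEntry)."""
    word: str
    frame: int
    score: float  # assumption: higher means more confident


def close_alternatives(candidates, margin):
    """Return {frame: [candidates]} for frames where more than one
    candidate scores within `margin` of that frame's best score."""
    by_frame = defaultdict(list)
    for c in candidates:
        by_frame[c.frame].append(c)

    result = {}
    for frame, group in sorted(by_frame.items()):
        best = max(c.score for c in group)
        near = [c for c in group if best - c.score <= margin]
        if len(near) > 1:
            # Best-scoring candidate first.
            result[frame] = sorted(near, key=lambda c: c.score, reverse=True)
    return result


# Made-up example data, for illustration only.
candidates = [
    Candidate("RECOGNIZE", 42, 0.91),
    Candidate("WRECK", 42, 0.89),
    Candidate("RECKON", 42, 0.40),
    Candidate("SPEECH", 80, 0.95),
]
ambiguous = close_alternatives(candidates, margin=0.05)
for frame, group in ambiguous.items():
    print(frame, [c.word for c in group])
```

Frames with a single strong candidate are left out, so the result lists only the places where the top hypothesis is worth a second look.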

A Lattice object provides access to the speech recognition lattice for a particular utterance. The lattice is stored as a list of time-ordered LatticeEntry objects. Each LatticeEntry encapsulates a possible word match.

Lattice.bestpathlist() returns the system’s most probable sequence of entries in the Lattice. This is the same list as client.utterance.Utterance.words.

Lattice Objects

class client.lattice.Lattice(start=None)

A Lattice object represents the speech possibilities explored by the recognizer. A Lattice is a matrix of possible words for each time division (frame) in the audio stream, stored as a list of time-ordered LatticeEntry objects.

entries
LatticeEntry instances ordered by time.
start
timestamp in audio stream for the beginning of this Lattice.
end
timestamp in audio stream for the end of this Lattice.
rindex(word, offset=0, toupper=True)
Return the highest index in entries where the entry’s word matches word. Start the search at offset from the end. If toupper is true, uppercase word before matching.
bestpath()
Return the LatticeEntry of the most probable path through the lattice.
bestpathlist()
Return a time-ordered list of the LatticeEntry objects corresponding to the most probable path through the lattice. Special word entries are omitted.
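
The behavior described for rindex() and bestpathlist() can be approximated with a small, self-contained sketch. It does not use the real client.lattice classes: SketchEntry, best_path_list, and rindex_sketch are hypothetical stand-ins, and the special-word tokens are made up. It also assumes that offset=k skips the last k entries and that a failed search raises ValueError (as str.rindex does); the actual module may behave differently on both points.

```python
class SketchEntry:
    """Hypothetical stand-in for LatticeEntry with only the fields used here."""

    def __init__(self, word, frame, prev=None, special=False):
        self.word = word
        self.frame = frame
        self.prev = prev        # most likely preceding entry, or None
        self.special = special  # True for silence, coughs, and similar

    def isspecial(self):
        return self.special


def best_path_list(last_entry):
    """Follow prev pointers back from the final entry, skip special
    entries, and return the remaining entries in time order."""
    path = []
    entry = last_entry
    while entry is not None:
        if not entry.isspecial():
            path.append(entry)
        entry = entry.prev
    path.reverse()
    return path


def rindex_sketch(entries, word, offset=0, toupper=True):
    """Search backwards for `word`, skipping the last `offset` entries
    (assumed meaning of offset). Raises ValueError if nothing matches
    (assumed failure behavior)."""
    if toupper:
        word = word.upper()
    for i in range(len(entries) - 1 - offset, -1, -1):
        if entries[i].word == word:
            return i
    raise ValueError("word not found in entries")


# Made-up entries; "<sil>" and "<cough>" are illustrative special tokens.
sil = SketchEntry("<sil>", 10, special=True)
hello = SketchEntry("HELLO", 55, prev=sil)
cough = SketchEntry("<cough>", 70, prev=hello, special=True)
world = SketchEntry("WORLD", 120, prev=cough)
hello2 = SketchEntry("HELLO", 160, prev=world)
entries = [sil, hello, cough, world, hello2]

best_words = [e.word for e in best_path_list(hello2)]
last_hello = rindex_sketch(entries, "hello")
earlier_hello = rindex_sketch(entries, "hello", offset=1)
print(best_words, last_hello, earlier_hello)
```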

LatticeEntry Objects

class client.lattice.LatticeEntry(word, endframe, score, prev, latstartdate)

A Lattice comprises one or more LatticeEntry objects. A LatticeEntry contains a word, a frame, a score, and a pointer to the previous word.

Each LatticeEntry includes a pointer to the most likely word occurring before it in the lattice. Entries’ frames are used to calculate the start and end timestamps. A frame is the smallest division of time within an audio stream (set by the recognition server).

word
word string identified by the speech recognition server.
frame
frame location for the end of word.
prev
pointer to the most likely word occurring before it in the lattice.
latstartdate
timestamp in audio stream for the beginning of the Lattice.
isspecial()
Return True if this is a special word entry. A special word represents a cough, silence, or another non-speech sound.
prevwordentry()
Return preceding non-special word entry or None (if first word entry).
start()
Return the timestamp for the start of this LatticeEntry.
end()
Return the timestamp for the end of this LatticeEntry.
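
As a rough illustration of how frames and prev pointers could yield timestamps and the preceding word, consider the sketch below. It rests on several assumptions that the text above does not confirm: a fixed frame duration of 10 ms (the real value is set by the recognition server), latstartdate held as a datetime, and each entry starting where its immediate predecessor ends. TimedEntry is a hypothetical stand-in, not the real LatticeEntry.

```python
from datetime import datetime, timedelta

# Assumption: 10 ms per frame; the real value is set by the recognition server.
FRAME_DURATION = timedelta(milliseconds=10)


class TimedEntry:
    """Hypothetical stand-in for LatticeEntry, for illustration only."""

    def __init__(self, word, frame, prev, latstartdate, special=False):
        self.word = word
        self.frame = frame                # frame at which the word ends
        self.prev = prev                  # most likely preceding entry, or None
        self.latstartdate = latstartdate  # start of the lattice in the stream
        self.special = special

    def isspecial(self):
        return self.special

    def prevwordentry(self):
        """Walk back over special entries to the previous real word."""
        entry = self.prev
        while entry is not None and entry.isspecial():
            entry = entry.prev
        return entry

    def end(self):
        """End timestamp: lattice start plus the entry's end frame."""
        return self.latstartdate + self.frame * FRAME_DURATION

    def start(self):
        """Start timestamp (assumed): where the immediate predecessor
        ends, or the lattice start for the first entry."""
        if self.prev is None:
            return self.latstartdate
        return self.prev.end()


# Made-up data; "<sil>" and "<cough>" are illustrative special tokens.
t0 = datetime(2020, 1, 1, 12, 0, 0)
sil = TimedEntry("<sil>", 10, None, t0, special=True)
hello = TimedEntry("HELLO", 55, sil, t0)
cough = TimedEntry("<cough>", 70, hello, t0, special=True)
world = TimedEntry("WORLD", 120, cough, t0)

print(world.prevwordentry().word, world.start(), world.end())
```

Here the entry for WORLD skips the cough when looking for the preceding word, but its start time still follows the cough, because the sketch derives start times from the immediate predecessor rather than the previous real word.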
