A speech recognition lattice is the matrix of speech possibilities explored by the recognizer for a given audio source. For each time division, or frame, of audio, the best possible word matches are included as entries in the lattice. Each entry is given a confidence score representing the system’s level of confidence in that particular match. Using the lattice, you can explore alternatives to the system’s top word hypothesis. This is especially useful when the top entries in a time frame have very similar scores.
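As an illustration of spotting frames whose top entries score very close together, here is a minimal sketch. It does not use the real client classes: `Entry`, `close_alternatives`, and the `margin` threshold are hypothetical names introduced for this example, and it assumes that a higher score means higher confidence.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a lattice entry (not the real LatticeEntry)."""
    word: str
    frame: int
    score: float  # assumption: higher means more confident


def close_alternatives(entries, margin=0.05):
    """Map each ambiguous frame to its candidates within `margin` of the top score."""
    by_frame = defaultdict(list)
    for entry in entries:
        by_frame[entry.frame].append(entry)

    ambiguous = {}
    for frame, candidates in by_frame.items():
        ranked = sorted(candidates, key=lambda e: e.score, reverse=True)
        # A frame is ambiguous when the runner-up is within `margin` of the best.
        if len(ranked) >= 2 and ranked[0].score - ranked[1].score < margin:
            ambiguous[frame] = [
                e for e in ranked if ranked[0].score - e.score < margin
            ]
    return ambiguous


# Made-up sample data for the sketch.
entries = [
    Entry("recognize", 0, 0.91),
    Entry("wreck", 0, 0.40),
    Entry("speech", 12, 0.62),
    Entry("beach", 12, 0.60),
]
result = close_alternatives(entries)
for frame, candidates in result.items():
    print(frame, [e.word for e in candidates])
```

With this sample data only frame 12 is reported, because "speech" and "beach" differ by less than the margin while the two candidates in frame 0 are far apart.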
A Lattice object provides access to the speech recognition lattice for a particular utterance. The lattice is stored as a list of time-ordered LatticeEntry objects. Each LatticeEntry encapsulates a possible word match.
Lattice.bestpathlist returns the system’s most probable sequence of entries in the Lattice. This is the same as the client.utterance.Utterance.words list.
A Lattice object represents the various speech possibilities explored by the recognizer. A Lattice is a matrix of possible words for each time division (frame) in the audio stream, stored as a list of ordered LatticeEntry objects.
A Lattice consists of one or more LatticeEntry objects. A LatticeEntry contains a word, a frame, a score, and a pointer to the previous word.
Each LatticeEntry includes a pointer to the most likely word occurring before it in the lattice. Entries’ frames are used to calculate the start and end timestamps. A frame is the smallest division of time within an audio stream (set by the recognition server).
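The pointer chain and the frame-to-time conversion can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `Entry`, its `previous` attribute, `trace_back`, `frame_to_seconds`, and the rate of 100 frames per second are all hypothetical, since the actual frame size is set by the recognition server. The sketch also assumes that an entry's frame marks where its word starts, so a word ends where the next one begins.

```python
from dataclasses import dataclass
from typing import Optional

FRAMES_PER_SECOND = 100  # assumed rate; the real frame size is set by the server


@dataclass
class Entry:
    """Hypothetical stand-in for a lattice entry (not the real LatticeEntry)."""
    word: str
    frame: int
    score: float
    previous: Optional["Entry"] = None  # most likely preceding entry


def trace_back(last):
    """Follow previous-pointers from the last entry; return entries in time order."""
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = node.previous
    path.reverse()
    return path


def frame_to_seconds(frame, frames_per_second=FRAMES_PER_SECOND):
    """Convert a frame index to a timestamp in seconds."""
    return frame / frames_per_second


# Made-up sample chain for the sketch.
first = Entry("speech", 0, 0.9)
second = Entry("recognition", 40, 0.8, previous=first)
third = Entry("lattice", 95, 0.7, previous=second)

path = trace_back(third)
words = [e.word for e in path]

# Assumption: a word starts at its own frame and ends at the next entry's frame.
spans = [
    (a.word, frame_to_seconds(a.frame), frame_to_seconds(b.frame))
    for a, b in zip(path, path[1:])
]
print(words)
print(spans)
```

The last word has no end time in this sketch because nothing follows it; a real lattice would need the end of the audio or an explicit end frame to close that span.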