segment-nbest
NAME
segment-nbest - rescore and segment N-best lists using hidden segment N-gram model
SYNOPSIS
segment-nbest [ -help ] option ... nbest-file-list ...
DESCRIPTION
segment-nbest
processes a series of consecutive N-best lists from a speech
recognizer
and applies a hidden segment N-gram language model to them.
The language model is a standard backoff N-gram model in ARPA
ngram-format(5)
that models sentence segmentation using the boundary tags <s> and </s>.
The program reads in all N-best lists and outputs the
hypotheses that have the highest aggregate (combined acoustic
and language model) score.
Hypothesized sentence boundaries are marked by <s> tags.
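As an illustration of a typical invocation, the following Python sketch writes a file list (one N-best filename per line, in waveform order) and assembles a command line from the -order, -lm, -nbest-files and -bias options documented below. The helper name build_segment_nbest_command and all file names are hypothetical and chosen only for this example; the sketch builds the argument list but does not run the program.

```python
import os
import tempfile


def build_segment_nbest_command(nbest_files, lm_path, order=3, bias=None):
    """Write the N-best file list and return (argv, list_path).

    Hypothetical helper: only assembles arguments, it does not run anything.
    """
    # The list file holds one N-best filename per line, in waveform order.
    fd, list_path = tempfile.mkstemp(suffix=".list", text=True)
    with os.fdopen(fd, "w") as handle:
        for name in nbest_files:
            handle.write(name + "\n")

    argv = ["segment-nbest", "-order", str(order), "-lm", lm_path,
            "-nbest-files", list_path]
    if bias is not None:
        # -bias 0 disables hidden sentence boundaries altogether.
        argv += ["-bias", str(bias)]
    return argv, list_path


# Example file names are placeholders, not files shipped with the program.
argv, list_path = build_segment_nbest_command(
    ["utt001.nbest.gz", "utt002.nbest.gz"], "segments.3bo.gz")
print(" ".join(argv))
```

Once the program is installed, the resulting argument list could be passed to subprocess.run(argv) to perform the actual rescoring.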
OPTIONS
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or "-" to indicate
stdin/stdout.
- -help
-
Print option summary.
- -version
-
Print version information.
- -order n
-
Set the maximal N-gram order to be used (default 3).
NOTE: The order of the model is not set automatically when a model
file is read, so the same file can be used at various orders.
- -debug level
-
Set the debugging output level (0 means no debugging output).
Debugging messages are sent to stderr.
- -lm file
-
Read the N-gram model from
file.
- -tolower
-
Map all vocabulary to lowercase.
Useful if case conventions for N-best lists and language model differ.
- -mix-lm file
-
Read a second, standard N-gram model for interpolation purposes.
- -lambda weight
-
Set the weight of the main model when interpolating with
-mix-lm.
Default value is 0.5.
- -bayes length
-
Interpolate the second and the main model using posterior probabilities
for local N-gram-contexts of length
length.
The
-lambda
value is used as a prior mixture weight in this case.
- -bayes-scale scale
-
Set the exponential scale factor on the context likelihood in conjunction
with the
-bayes
function.
Default value is 1.0.
- -nbest-files list
-
Specifies a list of N-best files.
The file
list
should contain a list of filenames, one per line,
each corresponding to an N-best file in one of the formats
described in
nbest-format(5).
The N-best files should correspond to consecutive speech waveforms
in the order listed.
- -fb-rescore
-
Perform forward-backward rescoring.
This generates new N-best lists
as output whose LM scores reflect the posterior probability of each
hypothesis.
The default is to perform Viterbi rescoring and output only the
best combined hypothesis.
- -write-nbest-dir dir
-
Write rescored N-best lists to directory
dir
instead of to stdout.
The filenames from the input are preserved.
- -max-nbest n
-
Limits the number of hypotheses read from each N-best list to the first
n.
- -max-rescore m
-
Only choose among the top
m
hypotheses of each list (after reordering hypotheses, see below).
This is an effective way to limit the quadratic computation
of the Viterbi or forward/backward dynamic programming.
- -no-reorder
-
Do not reorder the hypotheses before limiting the computation to
the top
m.
By default the hypotheses will first be sorted according to the
acoustic and language model scores recorded in the N-best lists.
- -rescore-lmw weight
-
Specifies the language model weight to be used in combining
acoustic and language model scores to select the best hypotheses.
- -rescore-wtw weight
-
Specifies the word transition weight to be used in selecting the
best hypotheses.
- -noise noise-tag
-
Designate
noise-tag
as a vocabulary item that is to be ignored by the LM.
(This is typically used to identify a noise marker.)
- -noise-vocab file
-
Read several noise tags from
file,
instead of, or in addition to, the single noise tag specified by
-noise.
- -decipher-lm model-file
-
Designates the N-gram backoff model (typically a bigram) that was used by the
Decipher(TM) recognizer in computing composite scores.
Used to compute acoustic scores from the composite scores if the
N-best lists are in "NBestList1.0" format.
- -decipher-lmw weight
-
Specifies the language model weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
- -decipher-wtw weight
-
Specifies the word transition weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
- -stag string
-
Use
string
to mark segment boundaries in the output.
Default is the start-of-sentence symbol defined in the language model (<s>).
- -bias b
-
Make a segment boundary a priori more likely by a factor of
b.
If
b
is 0, the dynamic programming algorithm is restricted to never consider
hidden sentence boundaries; this is useful when
segment-nbest
is used merely for its ability to apply the LM across N-best boundaries.
- -start-tag string
-
Insert a tag
string
at the front of every N-best hypothesis read in.
- -end-tag string
-
Insert a tag
string
at the end of every N-best hypothesis read in.
This and the previous option are useful if the LM marks acoustic
waveform boundaries with a special tag.
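The way the weights given with -rescore-lmw, -rescore-wtw, -decipher-lmw and -decipher-wtw enter the computation can be sketched as follows. The sketch assumes a conventional log-linear combination (acoustic score plus weighted LM score plus a per-word transition term); this formula is an assumption made for illustration and is not spelled out in this manual page. The names Hypothesis, combined_score, acoustic_from_composite and top_m are likewise hypothetical.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    acoustic: float   # acoustic log score recorded in the N-best list
    lm: float         # language model log score recorded in the N-best list
    words: List[str]


def combined_score(hyp, lmw, wtw):
    """Assumed log-linear combination: acoustic + lmw * LM + wtw * #words."""
    return hyp.acoustic + lmw * hyp.lm + wtw * len(hyp.words)


def acoustic_from_composite(composite, recognizer_lm, num_words, lmw, wtw):
    """Invert the assumed combination to recover the acoustic score,
    as needed when a list records only composite scores."""
    return composite - lmw * recognizer_lm - wtw * num_words


def top_m(hyps, m, lmw, wtw, reorder=True):
    """Keep the top m hypotheses, mirroring -max-rescore.

    With reorder=False (compare -no-reorder) the input order is kept."""
    if reorder:
        hyps = sorted(hyps, key=lambda h: combined_score(h, lmw, wtw),
                      reverse=True)
    return list(hyps[:m])


# Made-up scores for three competing hypotheses.
hyps = [
    Hypothesis(-100.0, -20.0, ["a", "b"]),
    Hypothesis(-90.0, -25.0, ["a", "c"]),
    Hypothesis(-120.0, -10.0, ["d"]),
]
best_two = top_m(hyps, 2, lmw=1.0, wtw=0.0)
```

Restricting each list to its top m hypotheses bounds the number of hypothesis pairs the dynamic programming step has to consider across adjacent lists, which is why -max-rescore is described as limiting a quadratic computation.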
segment-nbest
will also process any command line arguments following the options
as lists of N-best lists, as with the
-nbest-files
option.
Each
nbest-file-list
will be processed in turn,
with individual output delimited by a line of the form
<nbestfile nbest-file-list>
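When several file lists are processed in one run, downstream scripts need to separate the output again. The following sketch groups output lines by their delimiter line and splits a hypothesis into segments at the boundary tag. It assumes that each delimiter line precedes the output for its list and that the default tag <s> is in use; the function names and the sample lines are made up for illustration and are not actual program output.

```python
import re

# Matches delimiter lines of the form <nbestfile NAME>.
DELIMITER = re.compile(r"^<nbestfile (.+)>$")


def split_by_list(lines):
    """Group output lines under the file list named by the preceding delimiter."""
    sections = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        match = DELIMITER.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def split_segments(hypothesis, stag="<s>"):
    """Split one output hypothesis into word lists at each boundary tag."""
    segments = []
    for token in hypothesis.split():
        if token == stag:
            segments.append([])          # a tag opens a new segment
        elif segments:
            segments[-1].append(token)
        else:
            segments.append([token])     # words before the first tag
    return [seg for seg in segments if seg]


# Hypothetical output for two file lists.
sample = [
    "<nbestfile lists/a.list>",
    "<s> hello there <s> how are you",
    "<nbestfile lists/b.list>",
    "<s> fine thanks",
]
sections = split_by_list(sample)
segments = split_segments(sections["lists/a.list"][0])
```

If a different boundary marker was chosen with -stag, the same string should be passed as the stag argument.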
SEE ALSO
ngram-count(1), segment(1), ngram-format(5), nbest-format(5).
A. Stolcke, "Modeling Linguistic Segment and Turn Boundaries for N-best
Rescoring of Spontaneous Speech," Proc. Eurospeech, 2779-2782, 1997.
BUGS
N-gram models of arbitrary order can be used, but the context at the
beginning of a hypothesis never extends beyond the words from the preceding
N-best list.
AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1997-2004 SRI International