Reference of HMM-based Speech Synthesis Engine
hts_engine API version 1.10
HTS Working Group
December 25, 2015
1 Engine structures
1.1 Audio
HTS_Audio - Audio output wrapper.
size_t sampling_frequency - sampling frequency
size_t max_buff_size - buffer size of audio output device
short *buff - current buffer
size_t buff_size - current buffer size
void *audio_interface - audio interface specified in compile step
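The relationship between buff, buff_size and max_buff_size can be pictured as a staging buffer that is flushed to the device when it fills. The following is only a sketch under that assumption: AudioSketch, audio_sketch_write and audio_sketch_flush are hypothetical stand-ins, not library types or functions, and the "device write" merely counts samples.

```c
#include <stddef.h>

/* Local stand-in mirroring some HTS_Audio fields (not the library type). */
typedef struct {
   size_t max_buff_size;   /* capacity of buff, in samples */
   short *buff;            /* staging buffer */
   size_t buff_size;       /* samples currently staged */
   size_t flushed;         /* hypothetical counter replacing a real device write */
} AudioSketch;

/* Hypothetical device write: counts the staged samples and empties the buffer. */
static void audio_sketch_flush(AudioSketch *a)
{
   a->flushed += a->buff_size;
   a->buff_size = 0;
}

/* Stage one sample and flush once the buffer is full. */
static void audio_sketch_write(AudioSketch *a, short sample)
{
   a->buff[a->buff_size++] = sample;
   if (a->buff_size >= a->max_buff_size)
      audio_sketch_flush(a);
}
```

A final call to audio_sketch_flush is needed after the last sample so that a partially filled buffer is not lost.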
1.2 Model
HTS_Window - Window coefficients to calculate dynamic features.
size_t size - # of windows (static + deltas)
int *l_width - left width of windows
int *r_width - right width of windows
double **coefficient - window coefficients
size_t max_width - maximum width of windows
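To illustrate how a window with a left and right width yields a dynamic feature, the sketch below applies one window to a static feature track. The function name apply_window, the flat coefficient layout and the zero treatment of out-of-range frames are assumptions of this example, not the engine's internal code.

```c
#include <stddef.h>

/* Apply one window to the static track c[0..T-1] at frame t.
   coef holds (r_width - l_width + 1) values for offsets l_width..r_width,
   with l_width <= 0 <= r_width. Frames outside the track count as zero
   (an assumption of this sketch). */
static double apply_window(const double *c, size_t T, size_t t,
                           const double *coef, int l_width, int r_width)
{
   double sum = 0.0;
   for (int k = l_width; k <= r_width; k++) {
      long idx = (long) t + k;
      if (idx >= 0 && (size_t) idx < T)
         sum += coef[k - l_width] * c[idx];
   }
   return sum;
}
```

With coefficients {-0.5, 0.0, 0.5} and widths -1 and 1, the result is a central-difference delta; with the single coefficient {1.0} and widths 0 and 0, it returns the static value itself.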
HTS_Pattern - List of patterns in a question and a tree.
char *string - pattern string
HTS_Pattern *next - pointer to the next pattern
HTS_Question - List of questions in a tree.
char *string - name of this question
HTS_Pattern *head - pointer to the head of pattern list
HTS_Question *next - pointer to the next question
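A question is answered by walking its pattern list until one pattern matches the label. The sketch below assumes the usual HTS wildcard convention ('*' for any run of characters, '?' for exactly one character); PatternSketch, wildcard_match and question_match are hypothetical names for illustration, not library symbols.

```c
#include <stddef.h>

/* Return 1 if str matches pat, where '*' matches any run of characters
   (including none) and '?' matches exactly one character. */
static int wildcard_match(const char *str, const char *pat)
{
   if (*pat == '\0')
      return *str == '\0';
   if (*pat == '*') {
      /* Let '*' absorb 0, 1, 2, ... characters in turn. */
      for (;; str++) {
         if (wildcard_match(str, pat + 1))
            return 1;
         if (*str == '\0')
            return 0;
      }
   }
   if (*str != '\0' && (*pat == '?' || *pat == *str))
      return wildcard_match(str + 1, pat + 1);
   return 0;
}

/* Stand-in for a pattern list: string plus pointer to the next pattern. */
typedef struct PatternSketch {
   const char *string;
   struct PatternSketch *next;
} PatternSketch;

/* A question holds if any pattern in its list matches the label. */
static int question_match(const PatternSketch *head, const char *label)
{
   for (; head != NULL; head = head->next)
      if (wildcard_match(label, head->string))
         return 1;
   return 0;
}
```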
HTS_Node - List of tree nodes in a tree.
int index - index of this node
size_t pdf - index of PDF for this node (leaf node only)
HTS_Node *yes - pointer to its child node (yes)
HTS_Node *no - pointer to its child node (no)
HTS_Node *next - pointer to the next node
HTS_Question *quest - question applied at this node
HTS_Tree - List of decision trees in a model.
HTS_Pattern *head - pointer to the head of pattern list for this tree
HTS_Tree *next - pointer to the next tree
HTS_Node *root - root node of this tree
size_t state - state index of this tree
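The yes/no pointers and the leaf-only PDF index suggest the familiar decision-tree descent: answer the question at each node, follow the matching child, and read the PDF index at the leaf. The following sketch uses a hypothetical NodeSketch type whose question is a plain function pointer, which is a simplification of the pattern-list question described above.

```c
#include <stddef.h>

/* Stand-in for a tree node; quest is a hypothetical predicate on the label. */
typedef struct NodeSketch {
   size_t pdf;                          /* PDF index (meaningful at leaves only) */
   struct NodeSketch *yes;              /* child taken when the question holds */
   struct NodeSketch *no;               /* child taken otherwise */
   int (*quest)(const char *label);     /* question applied at this node */
} NodeSketch;

/* Example question used for illustration: does the label start with 'a'? */
static int starts_with_a(const char *label)
{
   return label[0] == 'a';
}

/* Descend from the root until a leaf (no children) is reached. */
static size_t find_pdf(const NodeSketch *node, const char *label)
{
   while (node->yes != NULL && node->no != NULL)
      node = node->quest(label) ? node->yes : node->no;
   return node->pdf;
}
```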
HTS_Model - Set of PDFs, decision trees and questions.
size_t vector_length - vector length (static features only)
size_t num_windows - # of windows for delta
HTS_Boolean is_msd - flag for MSD
size_t ntree - # of trees
size_t *npdf - # of PDFs at each tree
float ***pdf - PDFs
HTS_Tree *tree - pointer to the list of trees
HTS_Question *question - pointer to the list of questions
HTS_ModelSet - Set of duration models, HMMs and GV models.
char *hts_voice_version - version of HTS voice format
size_t sampling_frequency - sampling frequency
size_t frame_period - frame period
size_t num_voices - # of HTS voices
size_t num_states - # of HMM states
size_t num_streams - # of streams
char *stream_type - stream type
char *fullcontext_format - fullcontext label format
char *fullcontext_version - version of fullcontext label
HTS_Question *gv_off_context - GV switch
char **option - options for each stream
HTS_Model *duration - duration PDFs and trees
HTS_Window *window - window coefficients for delta
HTS_Model **stream - parameter PDFs and trees
HTS_Model **gv - GV PDFs and trees
1.3 Label
HTS_LabelString - Individual label string with time information.
HTS_LabelString *next - pointer to the next label string
char *name - label string
double start - start frame specified in the given label
double end - end frame specified in the given label
HTS_Label - List of label strings.
HTS_LabelString *head - pointer to the head of label string
size_t size - # of label strings
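Because start and end are stored as frame positions, times written in a label file have to be converted. The sketch below assumes the common HTS label layout "start end name" with times in units of 100 ns, so that frames = time * sampling_frequency / (fperiod * 1e7). LabelSketch and parse_label_line are hypothetical names, and the time unit is an assumption of this example.

```c
#include <stdio.h>

/* Stand-in for one label entry with times already converted to frames. */
typedef struct {
   double start;      /* start frame */
   double end;        /* end frame */
   char name[256];    /* label string */
} LabelSketch;

/* Parse "start end name" (times assumed to be in 100 ns units).
   Returns 1 on success, 0 if the line does not have three fields. */
static int parse_label_line(const char *line, size_t sampling_frequency,
                            size_t fperiod, LabelSketch *out)
{
   double start, end;
   const double rate = (double) sampling_frequency / ((double) fperiod * 1e7);
   if (sscanf(line, "%lf %lf %255s", &start, &end, out->name) != 3)
      return 0;
   out->start = start * rate;
   out->end = end * rate;
   return 1;
}
```

For example, at 48000 Hz with a frame shift of 240 points (5 ms), an end time of 500000 (50 ms) corresponds to frame 10.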
1.4 State stream
HTS_SStream - Individual state stream.
size_t vector_length - vector length (static features only)
double **mean - mean vector sequence
double **vari - variance vector sequence
double *msd - MSD parameter sequence
size_t win_size - # of windows (static + deltas)
int *win_l_width - left width of windows
int *win_r_width - right width of windows
double **win_coefficient - window coefficients
size_t win_max_width - maximum width of windows
double *gv_mean - mean vector of GV
double *gv_vari - variance vector of GV
HTS_Boolean *gv_switch - GV flag sequence
HTS_SStreamSet - Set of state streams.
HTS_SStream *sstream - state streams
size_t nstream - # of streams
size_t nstate - # of states
size_t *duration - duration sequence
size_t total_state - total # of states
size_t total_frame - total # of frames
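The duration sequence and the frame total are related in the obvious way: each state lasts a whole number of frames, and the total is their sum. A minimal sketch, with sum_durations as a hypothetical helper name:

```c
#include <stddef.h>

/* Sum the per-state durations (in frames) over all states. */
static size_t sum_durations(const size_t *duration, size_t total_state)
{
   size_t total_frame = 0;
   for (size_t i = 0; i < total_state; i++)
      total_frame += duration[i];
   return total_frame;
}
```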
1.5 PDF stream
HTS_SMatrices - Matrices/Vectors used in the speech parameter generation algorithm.
double **mean - mean vector sequence
double **ivar - inverse diagonal variance sequence
double *g - vector used in the forward substitution
double **wuw - W^T U^-1 W
double *wum - W^T U^-1 m
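For orientation, the two products stored in wuw and wum are the two sides of the linear system solved in maximum-likelihood parameter generation. The notation below is the standard formulation and is an assumption here rather than a quotation from the engine: W is the window matrix, U^{-1} the diagonal inverse covariance (ivar), mu the mean sequence (mean), and c the static parameter sequence being solved for.

```latex
\[
  \underbrace{W^{\top} U^{-1} W}_{\texttt{wuw}} \; c
  \;=\;
  \underbrace{W^{\top} U^{-1} \mu}_{\texttt{wum}}
\]
```

Since the left-hand matrix is banded and symmetric, the system is typically solved with a triangular factorization followed by forward and backward substitution; g holds the intermediate result of the forward pass.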
HTS_PStream - Individual PDF stream.
size_t vector_length - vector length (static features only)
size_t length - stream length
size_t width - width of dynamic window
double **par - output parameter vector
HTS_SMatrices sm - matrices for parameter generation
size_t win_size - # of windows (static + deltas)
int *win_l_width - left width of windows
int *win_r_width - right width of windows
double **win_coefficient - window coefficients
HTS_Boolean *msd_flag - Boolean sequence for MSD
double *gv_mean - mean vector of GV
double *gv_vari - variance vector of GV
HTS_Boolean *gv_switch - GV flag sequence
size_t gv_length - frame length for GV calculation
HTS_PStreamSet - Set of PDF streams.
HTS_PStream *pstream - PDF streams
size_t nstream - # of PDF streams
size_t total_frame - total # of frames
1.6 Generated parameter stream
HTS_GStream - Generated parameter stream.
size_t vector_length - vector length (static features only)
double **par - generated parameter
HTS_GStreamSet - Set of generated parameter streams.
size_t total_nsample - total # of samples
size_t total_frame - total # of frames
size_t nstream - # of streams
HTS_GStream *gstream - generated parameter streams
double *gspeech - generated speech
1.7 Engine
HTS_Condition - Synthesis condition.
size_t sampling_frequency - sampling frequency
size_t fperiod - frame period
size_t audio_buff_size - audio buffer size (for audio device)
HTS_Boolean stop - stop flag
double volume - volume
double *msd_threshold - MSD thresholds
double *gv_weight - GV weights
HTS_Boolean phoneme_alignment_flag - flag for using phoneme alignment in label
double speed - speech speed
size_t stage - if stage = 0 then gamma = 0 else gamma = -1/stage
HTS_Boolean use_log_gain - log gain flag (for LSP)
double alpha - all-pass constant
double beta - postfiltering coefficient
double additional_half_tone - additional half tone
double *duration_iw - weights for duration interpolation
double **parameter_iw - weights for parameter interpolation
double **gv_iw - weights for GV interpolation
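The stage field encodes the gamma parameter as an integer. The sketch below assumes the sign convention gamma = -1/stage that is customary for mel-generalized cepstral analysis; gamma_from_stage is a hypothetical helper, not a library function.

```c
#include <stddef.h>

/* Map the integer stage to gamma: 0 stays 0, otherwise gamma = -1/stage
   (sign convention assumed for this sketch). */
static double gamma_from_stage(size_t stage)
{
   return stage == 0 ? 0.0 : -1.0 / (double) stage;
}
```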
HTS_Engine - Engine itself.
HTS_Condition condition - synthesis condition
HTS_Audio audio - audio output
HTS_ModelSet ms - set of duration models, HMMs and GV models
HTS_Label label - label
HTS_SStreamSet sss - set of state streams
HTS_PStreamSet pss - set of PDF streams
HTS_GStreamSet gss - set of generated parameter streams
2 Engine functions
2.1 Initialize engine
2.1.1 HTS_Engine_initialize
Type
void
Use
Initialize engine.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!
To start the engine, you must call this function first.
2.2 Load models
2.2.1 HTS_Engine_load
Type
HTS_Boolean
Use
Load HTS voices from files using given file names.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
char **voices - HTS voice file names
size_t num_voices - # of HTS voices
Attention!!
You must initialize the engine using HTS_Engine_initialize before calling this function.
2.3 Synthesize speech and set/get synthesis parameters
2.3.1 HTS_Engine_set_sampling_frequency
Type
void
Use
Set sampling frequency.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t i - sampling frequency (Hz), 1 <= i
2.3.2 HTS_Engine_get_sampling_frequency
Type
size_t
Use
Get sampling frequency.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.3 HTS_Engine_set_fperiod
Type
void
Use
Set frame shift.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t i - frame shift (point), 1 <= i
2.3.4 HTS_Engine_get_fperiod
Type
size_t
Use
Get frame shift.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.5 HTS_Engine_set_audio_buff_size
Type
void
Use
Set buffer size for direct audio output.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t i - buffer size (sample)
Attention!!
Default value is 0. If i = 0, direct audio play is turned off.
2.3.6 HTS_Engine_get_audio_buff_size
Type
size_t
Use
Get buffer size for direct audio output.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!
Default value is 0. If the value is 0, direct audio play is turned off.
2.3.7 HTS_Engine_set_stop_flag
Type
void
Use
Set stop flag.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
HTS_Boolean b - flag
Attention!!
Default value is FALSE.
2.3.8 HTS_Engine_get_stop_flag
Type
HTS_Boolean
Use
Get stop flag.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!
Default value is FALSE.
2.3.9 HTS_Engine_set_volume
Type
void
Use
Set volume in dB.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
double f - volume in dB
Attention!!
Default value is 0.0.
2.3.10 HTS_Engine_get_volume
Type
double
Use
Get volume in dB.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
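Since the volume is given in decibels, the default of 0.0 means unity gain. Assuming the usual amplitude convention for decibels, a volume f corresponds to the linear gain g applied to the waveform as follows (g is a symbol introduced here for illustration):

```latex
\[
  g = 10^{f/20} = \exp\!\left(f \cdot \frac{\ln 10}{20}\right),
  \qquad f = 0 \;\Rightarrow\; g = 1 .
\]
```

Under this convention, +6 dB roughly doubles the amplitude and -6 dB roughly halves it.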
2.3.11 HTS_Engine_set_msd_threshold
Type
void
Use
Set MSD threshold.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
double f - threshold
2.3.12 HTS_Engine_get_msd_threshold
Type
double
Use
Get MSD threshold.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
2.3.13 HTS_Engine_set_gv_weight
Type
void
Use
Set GV weight.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
double f - GV weight
Attention!!
Default value is 1.0.
2.3.14 HTS_Engine_get_gv_weight
Type
double
Use
Get GV weight.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
2.3.15 HTS_Engine_set_speed
Type
void
Use
Set speech speed.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
double f - speed
Attention!!
Default value is 1.0.
2.3.16 HTS_Engine_set_phoneme_alignment_flag
Type
void
Use
Set flag to use phoneme alignment in label.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
HTS_Boolean b - flag
Attention!!
Default value is FALSE.
2.3.17 HTS_Engine_set_alpha
Type
void
Use
Set frequency warping parameter alpha.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
double f - alpha, 0.0 <= f <= 1.0
2.3.18 HTS_Engine_get_alpha
Type
double
Use
Get frequency warping parameter alpha.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.19 HTS_Engine_set_beta
Type
void
Use
Set postfiltering coefficient parameter beta.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
double f - beta, 0.0 <= f <= 1.0
Attention!!
Default value is 0.0.
2.3.20 HTS_Engine_get_beta
Type
double
Use
Get postfiltering coefficient parameter beta.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!
Default value is 0.0.
2.3.21 HTS_Engine_add_half_tone
Type
void
Use
Add half tone.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
double f - half tone
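A half tone is one twelfth of an octave, so adding f half tones scales the fundamental frequency by a fixed ratio. Assuming equal temperament and that the shift is applied to F0 (the symbols F_0 and F_0' are introduced here for illustration):

```latex
\[
  F_0' = F_0 \cdot 2^{f/12}
  \quad\Longleftrightarrow\quad
  \log F_0' = \log F_0 + f \cdot \frac{\ln 2}{12} .
\]
```

For example, f = 12 raises the pitch by one octave and f = -12 lowers it by one octave.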
2.3.22 HTS_Engine_set_duration_interpolation_weight
Type
void
Use
Set weight for duration interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of duration models
double f - interpolation weight
2.3.23 HTS_Engine_get_duration_interpolation_weight
Type
double
Use
Get weight for duration interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of duration models
2.3.24 HTS_Engine_set_parameter_interpolation_weight
Type
void
Use
Set weight for parameter interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of parameter models
size_t stream_index - index of streams
double f - interpolation weight
2.3.25 HTS_Engine_get_parameter_interpolation_weight
Type
double
Use
Get weight for parameter interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of parameter models
size_t stream_index - index of streams
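Interpolation weights are given per voice, which suggests a weighted combination of the corresponding values from each loaded voice. The sketch below assumes a plain weighted sum with weights that add up to 1; interpolate_sketch is a hypothetical helper and does not reproduce the engine's internal handling of means and variances.

```c
#include <stddef.h>

/* Weighted sum of one value per voice; weights are assumed to sum to 1. */
static double interpolate_sketch(const double *values, const double *weights,
                                 size_t num_voices)
{
   double sum = 0.0;
   for (size_t i = 0; i < num_voices; i++)
      sum += weights[i] * values[i];
   return sum;
}
```

With two voices whose values are 100.0 and 200.0 and weights 0.25 and 0.75, the interpolated value lies three quarters of the way toward the second voice.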
2.3.26 HTS_Engine_set_gv_interpolation_weight
Type
void
Use
Set weight for GV interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of GV models
size_t stream_index - index of streams
double f - interpolation weight
2.3.27 HTS_Engine_get_gv_interpolation_weight
Type
double
Use
Get weight for GV interpolation.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t voice_index - index of GV models
size_t stream_index - index of streams
2.3.28 HTS_Engine_get_total_state
Type
size_t
Use
Get total # of states.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.29 HTS_Engine_set_state_mean
Type
void
Use
Set mean value of state.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
size_t state_index - index of states
size_t vector_index - index of vector
double f - mean value
2.3.30 HTS_Engine_get_state_mean
Type
double
Use
Get mean value of state.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
size_t state_index - index of states
size_t vector_index - index of vector
2.3.31 HTS_Engine_get_state_duration
Type
size_t
Use
Get state duration.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t state_index - index of states
2.3.32 HTS_Engine_get_nvoices
Type
size_t
Use
Get # of HTS voices.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.33 HTS_Engine_get_nstream
Type
size_t
Use
Get # of streams.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.34 HTS_Engine_get_nstate
Type
size_t
Use
Get # of states.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.35 HTS_Engine_get_fullcontext_label_format
Type
const char *
Use
Get fullcontext label format defined in HTS voice.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.36 HTS_Engine_get_fullcontext_label_version
Type
const char *
Use
Get fullcontext label version defined in HTS voice.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.37 HTS_Engine_get_total_frame
Type
size_t
Use
Get total # of frames.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.38 HTS_Engine_get_nsamples
Type
size_t
Use
Get # of samples.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
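The sample count follows from the frame count and the frame shift: each frame contributes one frame shift's worth of samples. The sketch below assumes exactly that relationship; frames_to_samples and samples_to_seconds are hypothetical helper names.

```c
#include <stddef.h>

/* Each frame advances the waveform by fperiod samples. */
static size_t frames_to_samples(size_t total_frame, size_t fperiod)
{
   return total_frame * fperiod;
}

/* Convert a sample count to seconds at the given sampling frequency (Hz). */
static double samples_to_seconds(size_t nsamples, size_t sampling_frequency)
{
   return (double) nsamples / (double) sampling_frequency;
}
```

For instance, 200 frames at a frame shift of 240 points give 48000 samples, which is one second of audio at 48000 Hz.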
2.3.39 HTS_Engine_get_generated_parameter
Type
double
Use
Get generated parameter.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t stream_index - index of streams
size_t frame_index - index of frames
size_t vector_index - index of vector
2.3.40 HTS_Engine_get_generated_speech
Type
double
Use
Get generated speech.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
size_t index - index of samples
2.3.41 HTS_Engine_synthesize_from_fn
Type
HTS_Boolean
Use
Synthesize speech from file name.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
char *fn - label file name
2.3.42 HTS_Engine_synthesize_from_strings
Type
HTS_Boolean
Use
Synthesize speech from string list.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
char **lines - label string list
size_t num_lines - # of lines
2.3.43 HTS_Engine_generate_from_fn
Type
HTS_Boolean
Use
Generate state sequence from file name (1/3 synthesis step).
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
char *fn - label file name
2.3.44 HTS_Engine_generate_from_strings
Type
HTS_Boolean
Use
Generate state sequence from string list (1/3 synthesis step).
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
char **lines - label string list
size_t num_lines - # of lines
2.3.45 HTS_Engine_generate_parameter_sequence
Type
HTS_Boolean
Use
Generate parameter sequence (2/3 synthesis step).
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.46 HTS_Engine_generate_sample_sequence
Type
HTS_Boolean
Use
Generate sample sequence (3/3 synthesis step).
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
2.3.47 HTS_Engine_save_information
Type
void
Use
Output trace information.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
FILE *fp - output file pointer
Attention!!
2.3.48 HTS_Engine_save_label
Type
void
Use
Output label with time.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
FILE *fp - output file pointer
Attention!!
2.3.49 HTS_Engine_save_generated_parameter
Type
void
Use
Output generated parameter.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
FILE *fp - output file pointer
Attention!!
2.3.50 HTS_Engine_save_generated_speech
Type
void
Use
Output generated speech.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
FILE *fp - output file pointer
Attention!!
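Generated speech is held as double values, while audio devices and raw PCM files usually expect 16-bit integers. A common way to bridge the two, sketched here as an assumption rather than as the engine's exact code, is to clip each sample to the 16-bit range before truncating; to_pcm16 is a hypothetical helper name.

```c
/* Clip a sample to the signed 16-bit range, then truncate toward zero. */
static short to_pcm16(double x)
{
   if (x > 32767.0)
      return 32767;
   if (x < -32768.0)
      return -32768;
   return (short) x;
}
```

Clipping before the cast matters because converting an out-of-range double to short is not well defined in C.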
2.3.51 HTS_Engine_save_riff
Type
void
Use
Output RIFF format file.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
FILE *fp - output file pointer
Attention!!
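For readers unfamiliar with the RIFF container, the sketch below builds the canonical 44-byte WAVE header that precedes the sample data. It assumes mono 16-bit PCM, which is an assumption of this example; make_wav_header and put_le are hypothetical helpers and not part of the library.

```c
#include <stddef.h>
#include <string.h>

/* Store v in little-endian order using nbytes bytes. */
static void put_le(unsigned char *p, unsigned long v, int nbytes)
{
   for (int i = 0; i < nbytes; i++)
      p[i] = (unsigned char) ((v >> (8 * i)) & 0xFF);
}

/* Fill a 44-byte RIFF/WAVE header for mono 16-bit PCM (assumed format). */
static void make_wav_header(unsigned char h[44],
                            unsigned long sampling_frequency,
                            unsigned long nsamples)
{
   const unsigned long data_bytes = nsamples * 2;   /* 2 bytes per sample */
   memcpy(h, "RIFF", 4);
   put_le(h + 4, 36 + data_bytes, 4);               /* file size minus 8 */
   memcpy(h + 8, "WAVE", 4);
   memcpy(h + 12, "fmt ", 4);
   put_le(h + 16, 16, 4);                           /* size of fmt chunk */
   put_le(h + 20, 1, 2);                            /* format tag: PCM */
   put_le(h + 22, 1, 2);                            /* channels: mono */
   put_le(h + 24, sampling_frequency, 4);           /* samples per second */
   put_le(h + 28, sampling_frequency * 2, 4);       /* bytes per second */
   put_le(h + 32, 2, 2);                            /* block align */
   put_le(h + 34, 16, 2);                           /* bits per sample */
   memcpy(h + 36, "data", 4);
   put_le(h + 40, data_bytes, 4);                   /* size of sample data */
}
```

The header would be written first, followed by the samples as little-endian 16-bit integers.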
2.3.52 HTS_Engine_refresh
Type
void
Use
Free the label, state streams, PDF streams and generated parameter streams after each synthesis run.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!
2.4 Free engine
2.4.1 HTS_Engine_clear
Type
void
Use
Free engine.
Arguments
HTS_Engine *engine - pointer to HTS_Engine structure
Attention!!