Taylor P. — Text-to-Speech Synthesis
Title: Text-to-Speech Synthesis
Author: Taylor P.
Abstract: Text-to-Speech Synthesis provides a complete, end-to-end account of the process of generating speech by computer. Giving an in-depth explanation of all aspects of current speech synthesis technology, it assumes no specialized prior knowledge. Introductory chapters on linguistics, phonetics, signal processing and speech signals lay the foundation, with subsequent material explaining how this knowledge is put to use in building practical systems that generate speech. Including coverage of the very latest techniques such as unit selection, hidden Markov model synthesis, and statistical text analysis, explanations of the more traditional techniques such as formant synthesis and synthesis by rule are also provided. Weaving together the various strands of this multidisciplinary field, the book is designed for graduate students in electrical engineering, computer science, and linguistics. It is also an ideal reference for practitioners in the fields of human communication interaction and telephony.
Language:
Category: Computer science /
Subject index status: index with page numbers is ready
Year of publication: 2009
Number of pages: 597
Added to catalog: 31.10.2010
Subject index
Lexicons, grapheme-to-phoneme algorithms 208
Lexicons, language lexicons 207
Lexicons, memorising the data 209
Lexicons, offline lexicon 213—214
Lexicons, orthographic and pronunciation variants 210—212
Lexicons, orthography-pronunciation lexicons 207
Lexicons, over-fitting data 209
Lexicons, quality of 215—216
Lexicons, rules for 208—210
Lexicons, simple dictionary formats 210
Lexicons, speaker's lexicons 207
Lexicons, system lexicon 214—215
Lexicons, unknown word problems 216—218
Liljencrants-Fant model for glottal flow 332 374 376
Limited-domain synthesis systems 44
Line-spectrum frequencies (LSFs) 367—369
Linear filters, assumptions concerning 337
Linear time-invariant (LTI) filters 288 310
Linear time-invariant (LTI) filters for nasalised vowels 333—334
Linear-prediction (LP) PSOLA 423—424
Linear-prediction (LP) speech analysis 357—365
Linear-prediction (LP) speech analysis, about linear prediction 357—358
Linear-prediction (LP) speech analysis, autocorrelation method 360—361
Linear-prediction (LP) speech analysis, Cholesky decomposition method 359
Linear-prediction (LP) speech analysis, covariance method for finding coefficients 358—360
Linear-prediction (LP) speech analysis, Levinson-Durbin recursion technique 361—362
Linear-prediction (LP) speech analysis, perceptual linear prediction 370
Linear-prediction (LP) speech analysis, spectra for 362—365
Linear-prediction (LP) speech analysis, Toeplitz matrix 361
Linear-prediction (LP) synthesis see "Classical linear-prediction (LP) synthesis" "Residual-excited
Linear-prediction cepstra 369
Linguistic levels 16—17
Linguistic levels, morphemes/morphology 16
Linguistic levels, phonetics/phonology 16
Linguistic levels, pragmatics 17
Linguistic levels, semantics 16—17
Linguistic levels, speech acoustics 16
Linguistic levels, syntax 16
Linguistic-analysis TTS models 39
Linguistics/speech technology relationship 533—536
Linguistics/speech technology relationship, future of 537—538
Log area ratios 367
Log power spectrum 343—344
Logographic writing 34
Logotomes/nonsense words 415
Lookup tables 72
Lossless tube, assumptions 338
LP see "Linear-prediction (LP) ..."
Lumped-parameter speech generation model 389
Machine translation 1
Machine-readable phonetic alphabet (MRPA) phoneme inventory 204—205
Macro-concatenation 497
Magnetic resonance imaging (MRI) 156
Manhattan distance 486
Marginal distributions 543
Markup languages 68—69
Markup languages, Java speech markup language 69
Markup languages, speech synthesis markup language (SSML) 69
Markup languages, spoken text markup language 69
Markup languages, VoiceXML 69
Maximal onset principle 185
MBROLA technique 429
Mean opinion score 524
Meaning-to-speech system 42—43
Meaning/form/signal, and communication 12—13
Medium (means of conversion) 13
Mel-frequency cepstral coefficients (MFCCs) 370 429—431 439
Mel-scale 351
Memorising the data (machine learning) 209
Memory-based learning 220—221
Message/form-to-speech synthesis 42
Messages 18
Messages, message generation 20—21
Metrical phonology 114 120 183
Metrical stress 188
Micro-prosody 229
Minimal pair principle/analysis 163 197—199 204
Mis-spellings, decoding 98
Model effectiveness, and synthesis with vocal-tract models 407
Models of TTS 37—41
Models of TTS, common-form model 38
Models of TTS, comparisons 40—41
Models of TTS, complete prosody generation 40
Models of TTS, full linguistic-analysis 39—40
Models of TTS, grapheme form 39
Models of TTS, phoneme form 39
Models of TTS, pipelined 39
Models of TTS, prosody from the text 40
Models of TTS, signal-to-signal 39
Models of TTS, text-as-language 39
Modified rhyme test (MRT) 523 524
Modified TIMIT ASCII character set 166
Modularity, and synthesis with vocal-tract models 407
Moments of a PMF 542
Monophthongs 153
Morphemes/morphology 16
Morphology 222—223
Morphology and scope 59
Morphology, derivational 59 222
Morphology, inflectional 222
Morphology, morphological decomposition 222—223
MRPA phoneme inventory 204—205
Multi-band-excitation (MBE) 427
Multi-centroid analysis 371
Multi-pass searching 509
n-gram model 91
Naive Bayes' classifier 86—87
Names, pronunciation 223
Nasal and oral sounds 150
Nasal cavity modelling 333—335
Nasal stops 153
Nasalisation colouring 167
Natural phonology 183
Natural-language parsing 102—105
Natural-language parsing, Cocke-Younger-Kasami (CYK) algorithm 104
Natural-language parsing, context-free grammars (CFGs) 102—104
Natural-language parsing, probabilistic parsers 104—105
Natural-language parsing, statistical parsing 105
Natural-language text decoding 46—47 97—101
Natural-language text decoding, about natural-language text 97—98
Natural-language text decoding, acronyms 99
Natural-language text decoding, homograph disambiguation 99—101
Natural-language text decoding, letter sequences 99
Natural-language text decoding, non-homographs 101
Naturalness issues/tests 3 47—48 510 523 524
Naturalness issues/tests, mean opinion score 524
NetTalk algorithm 219—220
Neural networks, and G2P algorithms 219—220
Neutral vowel sound 152—153
NextGen system (AT&T) 513—514
Non-linear phonology 183
Non-linguistic issues 32—33
Non-natural-language text decoding 92—97
Non-natural-language text decoding, about non-natural-language text 92
Non-natural-language text decoding, parsing 95
Non-natural-language text decoding, semiotic classification 92—94
Non-natural-language text decoding, semiotic decoding 95
Non-natural-language text decoding, verbalisation 95—97
Non-standard words (NSWs) 106
Non-uniform unit synthesis 480
Nonsense words/logotomes 415
Nuclear accents 230
Null/neutral prosody 18
Number-communication systems 33
Nyquist frequency 279
Observations for HMMs 436—438
Observations for HMMs, covariance matrix 437
Observations for HMMs, Gaussian/normal distribution/bell curve 436—438
Observations for HMMs, multivariate Gaussian 437
Observations for HMMs, probabilistic models 436
Observations for HMMs, probability density functions (pdfs) 436
Observations for HMMs, standard deviation 436
Observations for HMMs, variance 436
Obstruent consonants 154
Offline lexicon 213—214
Open-phase analysis 377—378
Optimal coupling 432
Optimality Theory 183
Oral and nasal sounds 150
Oral cavity 152
Oral cavity, sound source positions 335
Oral stops 153
Over-fitting data 209
Overwrite paradigm 71
Palatalisation 180
Parameterisation of glottal-flow signals 379
Parsing/parsers 53 95 103 see
Parsing/parsers, probabilistic parsers 104—105
Parsing/parsers, statistical parsing 105
Part-of-speech (POS) tagging 82 88—92
Part-of-speech (POS) tagging, generative models 89—90
Part-of-speech (POS) tagging, hidden Markov model (HMM) 89—91
Part-of-speech (POS) tagging, n-gram model 91
Part-of-speech (POS) tagging, observation probabilities 90
Part-of-speech (POS) tagging, POS homographs 88
Part-of-speech (POS) tagging, syntactic homonyms 88—89
Part-of-speech (POS) tagging, transition probabilities 90
Part-of-speech (POS) tagging, Viterbi algorithm 92
Partial-synthesis function 493
Perceptual linear prediction 370
Perceptual substitutability principle 485
Periodic signals 262—269 305—307
Phase mismatch issues 431—432
Phase shift 264
Phrase-splicing systems 44
Phone-class join costs 498—499
Phoneme inventories 204—205
Phoneme inventories, British English MRPA 554
Phoneme inventories, modified TIMIT for General American 553
Phoneme TTS models 39
Phonemes and graphemes 28
Phonemes and verbal communication 14 16 161—164
Phones, about phones 162—164
Phones, definitions 553—555
Phonetic similarity principle 197—199
Phonetics/phonology 16 see "Phonology" "Phonotactics" "Speech
Phonetics/phonology, phonetic context 167
Phonetics/phonology, phonetic variants 57
Phonological theories 181—184
Phonological theories, articulatory phonology 183
Phonological theories, autosegmental phonology 183
Phonological theories, computational phonology 184
Phonological theories, context sensitive rules 182
Phonological theories, dependency phonology 183
Phonological theories, feature geometry 183
Phonological theories, government phonology 183
Phonological theories, metrical phonology 183
Phonological theories, natural phonology 183
Phonological theories, non-linear phonology 183
Phonological theories, optimality theory 183
Phonological theories, The Sound Pattern of English (SPE) 181—182
Phonology 172—189 see "Phonotactics"
Phonology, about phonology 172 189—191
Phonology, lexical stress 186—189
Phonology, maximal onset principle 185
Phonology, metrical phonology 114
Phonology, palatalisation 180
Phonology, phonological phrases 114
Phonology, syllabic consonants 184
Phonology, syllables 184—186
Phonology, word formation/lexical phonology 179—181
Phonotactics 172—179
Phonotactics, distinctive features 174
Phonotactics, feature structure 174—179
Phonotactics, phonotactic grammar 172—177 207
Phonotactics, primitives issues 176
Phonotactics, syllable structures 176—177
Phrasing prediction 129—136
Phrasing prediction, classifier approaches 132—133
Phrasing prediction, deterministic approaches 130—131
Phrasing prediction, experimental formulation 129—130
Phrasing prediction, HMM approaches 133—135
Phrasing prediction, hybrid approaches 135—136
Phrasing/prosodic phrasing 112—115
Phrasing/prosodic phrasing, about phrasing 112—113
Phrasing/prosodic phrasing, phrasing models 113—115
Pictographic writing 34
Pipelined TTS models 39
Pitch accents 230 234—236
Pitch accents, alignment factors 235—236 534—535
Pitch accents, height factors 235
Pitch detection/tracking 379—381
Pitch detection/tracking, pitch-detection algorithms (PDAs) 379
Pitch range 233—234
Pitch-accent languages 121 227
Pitch-marking 381
Pitch-synchronous overlap and add (PSOLA) techniques 415—421
Pitch-synchronous overlap and add (PSOLA) techniques, about PSOLA 415—416 421
Pitch-synchronous overlap and add (PSOLA) techniques, epoch manipulation 417—420
Pitch-synchronous overlap and add (PSOLA) techniques, time-domain PSOLA (TD-PSOLA) 416—417
Pitch-synchronous speech analysis 347 381
Polynomial analysis (poles and zeros) 294—297
Post-lexical processing 223—234
Pragmatics 17
Pre-processing 52
Pre-recorded prompt systems 43
Precision and recall scheme 130
Probabilistic and sequence join function 501—502
Probabilistic models 436
Probabilistic parsers 104—105
Probability density functions (pdfs) 436
Probability mass functions (PMFs) 541
Probability theory, continuous random variables 547—550
Probability theory, continuous random variables, cumulative density functions 549—550
Probability theory, continuous random variables, expected values 548—549
Probability theory, continuous random variables, Gaussian (normal) distribution 549
Probability theory, continuous random variables, uniform distribution 549
Probability theory, discrete probabilities 540—542
Probability theory, discrete probabilities, discrete random variables 540—541
Probability theory, discrete probabilities, expected values 541—542
Probability theory, discrete probabilities, moments of a PMF 542
Probability theory, discrete probabilities, probability mass functions (PMFs) 541
Probability theory, pairs of continuous random variables 550—552
Probability theory, pairs of continuous random variables, entropy for 552
Probability theory, pairs of continuous random variables, independent versus uncorrelated 551
Probability theory, pairs of continuous random variables, Kullback-Leibler distance 552
Probability theory, pairs of continuous random variables, sum of two 551—552
Probability theory, pairs of discrete random variables 542—557
Probability theory, pairs of discrete random variables, Bayes' rule 545
Probability theory, pairs of discrete random variables, chain rule 546
Probability theory, pairs of discrete random variables, conditional probability 545
Probability theory, pairs of discrete random variables, correlation 544
Probability theory, pairs of discrete random variables, entropy 546—547
Probability theory, pairs of discrete random variables, expected values 543
Probability theory, pairs of discrete random variables, higher-order moments and covariance 544
Probability theory, pairs of discrete random variables, independence 543
Probability theory, pairs of discrete random variables, marginal distributions 543
Probability theory, pairs of discrete random variables, moments of a joint distribution 544
Probability theory, pairs of discrete random variables, sum of random variables 545—546
Problems in text-to-speech 44—50
Problems in text-to-speech, adapting systems 50
Problems in text-to-speech, assumed intent for prosody 49—50
Problems in text-to-speech, auxiliary generation for prosody 49—50
Problems in text-to-speech, homograph ambiguity 46
Problems in text-to-speech, intelligibility issues 48—49
Problems in text-to-speech, natural language text decoding 46—47
Problems in text-to-speech, naturalness 47—48
Problems in text-to-speech, syntactic ambiguity 46—47
Problems in text-to-speech, text classification/semiotic systems 44—46
Processing documents 68—71 see