Witten I.H., Moffat A., Bell T.C. — Managing Gigabytes: Compressing and Indexing Documents and Images
Title: Managing Gigabytes: Compressing and Indexing Documents and Images
Authors: Witten I.H., Moffat A., Bell T.C.
Abstract: In this fully updated second edition of the highly acclaimed Managing Gigabytes, authors Witten, Moffat, and Bell continue to provide unparalleled coverage of state-of-the-art techniques for compressing and indexing data. Whatever your field, if you work with large quantities of information, this book is essential reading: an authoritative theoretical resource and a practical guide to meeting the toughest storage and access challenges. It covers the latest developments in compression and indexing and their application on the Web and in digital libraries. It also details dozens of powerful techniques supported by mg, the authors' own system for compressing, storing, and retrieving text, images, and textual images. mg's source code is freely available on the Web.
Category: Technology
Subject index status: Index with page numbers is complete
Edition: Second edition
Year of publication: 1999
Number of pages: 519
Added to catalog: 15.11.2009
Subject index
δ code 118—119 128 150 236
γ code 117—119 128 150 236 248
γ code in mg 421
φ see "Fibonacci sequence"
4-connectivity see "Connectivity"
8-connectivity see "Connectivity"
ACB (Associative Coder of Buyanovsky) 101
Accumulators 201 209—210 222
Accumulators in hash table 426
Accumulators, continue strategy 207 209 426 428
Accumulators, memory 206—207
Accumulators, normalizing 205
Accumulators, quit strategy 207 209 426 428
Acyclic graph 165—168
Adams, John 432
Adaptive model 22 28—30
Adaptive model, DMC 69
Adaptive model, images 331
Adaptive model, synchronization of 85 90
Adult Web sites 196
Advertising in the World Wide Web 196—197 221
Agent 440—441
Alice in Wonderland 453
Aliweb 438
AltaVista 194 196 439
Answer (to query) 153
Antialiasing 266
Antony, Mark 432 438
Approximate weights 203—205 222
Approximate weights in mg 425
Approximate weights, calculation of 203—205
Approximate weights, ranking using 205
Archimedes 15
Arithmetic coding 22 51—61 100
Arithmetic coding for bilevel image compression 276
Arithmetic coding for mg 394
Arithmetic coding in JBIG 282 287
Arithmetic coding in JPEG 301
Arithmetic coding, adaptive 52
Arithmetic coding, approximate 57—59 282 287
Arithmetic coding, binary alphabets 59
Arithmetic coding, char program 407
Arithmetic coding, compared with Huffman coding 395
Arithmetic coding, cumulative counts 59—61
Arithmetic coding, example 53
Arithmetic coding, inefficiency of 55
Arithmetic coding, interface 56—57
Arithmetic coding, semi-static 54
Arithmetic coding, speed of 98
Arithmetic coding, static 54
Arithmetic coding, synchronization of 87
Arnold, Matthew 3
Audio index 476
Autocorrelation function 363—365 388
B-tree 169 259
Barney 196
Batched frequency model 125 128
Beethoven 11
Bell, Timothy A.H. xxix 235
Bender, Todd K. 2 20
Bernoulli model, global 119—121
Bernoulli model, local 121—122
Bible collection 107 150
Bible collection, inverting 224 230
Bible collection, time to build database 420
Bible, Cruden's concordance 12 19
Bible, earliest concordance 19
Bible, Moulton's Greek concordance 1
Bible, Strong's concordance 14 20 149
Bible, Young's concordance 20
Bigram indexing see "n-gram indexing"
Bilevel image compression 268—288 309 see "Clairvoyant "Two-level
Bilevel image compression in mg 416 466
Bilevel image compression, context-based 273—281
Bilevel image compression, performance 273
Binary coding 116 126 246
Binary coding for mg 394
Binary coding, compared with Huffman coding 41 395
Binary coding, FELICS 292
Birthday paradox 162 164
Bitmap 122 140—141
Bitmap, comparison with other methods 143—145
Bitmap, compression of 141—142
Bitmap, construction of 226 255—256
Bitmap, random access to 142—143
Bitsliced signature file 133—135
Bitsliced signature file, construction of 254—255
BitVector see "Bitmap"
Block sorting see "Burrows-Wheeler transform"
Boolean query 111 153 174—180
Boolean query, conjunctive 174—179
Boolean query, disadvantages of 154—155
Boolean query, evaluation 174—175
Boolean query, fuzzy 222
Boolean query, nonconjunctive 180
Boolean query, skipped inverted file see "Skipping"
Boolean query, term processing order 175—176
Boolean query, using bitmap 140
Boolean query, using signature file 131—133
Boundary tracing 320—323
Braille 21 23
Burrows-Wheeler transform 65—69 101
Bush, V. 6 442 447
Byron 2
bzip2 program 91 407
bzip2 program, memory 410
bzip2 program, speed 391
Calgary corpus 92 102
CALIC 294—297 309 310
Candidate 174 198
Canonical Huffman coding 32—41 73 see
Canonical Huffman coding in gzip 79
Canonical Huffman coding in huffword 91 406—407
Canonical Huffman coding in pack 90
Canonical Huffman coding, example 34—38
Canterbury corpus 92 101 406
Carroll, Lewis 453
Case folding 105 146 see
Case folding in NZDL 470
Case folding, effect on index size 147
CCITT 264
CCITT fax standard 268—272 310
CCITT fax standard, Group 3 268—273
CCITT fax standard, Group 4 268—273 347
CCITT fax standard, Huffman coding in 269
CCITT fax standard, performance of 272
CCITT, test images 273 281 310 319
CD-ROM 5 422
Centroid alignment (template matching) 418
Centroid alignment (template matching), performance 334
char program 91 102 407
City Council Minutes 476
Clairvoyant compression 279—281 330 342—343
Clarke, Mrs. Mary Cowden 2 19
Classification of regions 385—388
Cleopatra 432 438
Clustering of words 122
Codebook see "Compression dictionary-based"
Coding 23 30—61 see "Binary "Canonical " " "Golomb "Huffman "READ "Unary "Ziv
Coding in mg 394—395
Coding, Braille 21 23
Coding, length-limited 401—405 428
Coding, Morse 21 27
Coding, self-synchronizing 87—90
Colinear sets of points 359
Color 265
Color image compression see "JPEG"
Comact collection 256
Comact collection, description of 107 150
Comact collection, expansion of text 413
Comact collection, page numbers in 394
Comact collection, restricted decoding memory 411
Comact collection, synchronization experiments 407—409
Comité Consultatif International Téléphonique et Télégraphique see "CCITT"
Compact disc see "CD-ROM"
compact program 97
Complexity-based induction 446
compress program 82 84 91 96 102
Compression see also "Bilevel image compression" "Grayscale "Image "Index "JBIG" "JPEG" "Lossless "Lossy "Text "Textual
Compression, dictionary-based 23 74—84
Compression, grammar-based 27
Compression, leakage 236—237 247—248 394
Compression, statistical 23
Compression, symbolwise 23 61—74
Computer Science Technical Reports 469
Computer typesetting 436
Computists' Communique 472
Concordance xxvi 1—3 19—20 109 see "Inverted
Connectivity 321—323
Context-based image compression 273—282 290 295
Corner alignment (template matching) 418
Cosine measure 155 186—187
Cosine measure in mg 425—428
Cosine measure, implementation of 198—213
Cosine measure, use in dynamic collections 260
Cross-entropy 331 335
Cross-reference see "Inverted file"
Cruden, A. 19
d-gap 115 246
da Vinci, Leonardo 465
Darwin, Charles 3
Data Compression Conference 99
Data mining 441—442
Dead links 220
Decode tree 31 36 38
Descriptor see "Signature file"
Dictionary 228 231 237
Dictionary, memory required 237 244
Dictionary-based models 23 74—84
Digital journal 20 436
Digital libraries 442—444 469—483
Digital search tree 80—81
Digitization 313 see
Digitization, effect on segmenting document 361
Digitization, errors 263 307
Digitization, halftone image 369
Digram coding 74
Dinosaur 196
Discrete cosine transform 298—303
Disk access, random 230
Disk access, sequential 231
Distributed retrieval 218—221
Dithering 285
DMC 69—72 101
dmc program 91 102
dmc program, compression performance 391 407
dmc program, memory 99
Docstrum 370—372 386 388
Document database xxiv 6 104 see "Comact "GNUbib "TREC
Document database, definition of document 104
Document database, examples 106—109
Document database, final compressed size 422
Document database, time to build 420
Document grammar 383 388
Document image see "Textual image"
Document length 186 see "Document"
Document spectrum see "Docstrum"
Dynabook 447 450
Dynamic collections 226 256—260 412—415 429
Dynamic collections, expanding the index 257—259
Dynamic collections, expanding the text 256—257 412—415
Dynamic Markov compression see "DMC"
Eight queens 196
Entropy 24 51 335 see "Self-entropy"
Entropy of English 94 102
Error map 326
Escape symbol for uncompressed text 256
Escape symbol in PPM 62
Escape symbol in word-based compression 73 410—415
Euler's constant 124 212
Excite 194 439
Exclusions (in PPM) 62
Extended precision arithmetic 399
False ligature 344—346 419
False match 220
False match in n-gram index 171
False match in signature file 131—132 134—139 143 144 151
False match, because of granularity 112
False match, because of word definition 109
FAQ archive 472
Far from the Madding Crowd 28 34 61 63 65 66 77 84
FELICS 291—294 309 310
FELICS in mg 417 466—467
Fibonacci sequence 125 397—398 400 428
Filling, flood 323
Filling, region 323—325
Finite-context model 26 61—72 275—279
Finite-state model 26 69—72
Fragmentation 358
Free Software Foundation 78
Frequency matrix 224
Front coding 159—161 410
Full-text retrieval xxiii 8 20 431
Full-text retrieval for gray literature 443
Full-text retrieval, Computer Science Technical Reports 469
Full-text retrieval, future developments 444
Full-text retrieval, library catalog 476
Full-text retrieval, mg system 451
Full-text retrieval, NZDL 469 481
Furness, Mrs. Horace Howard 2 14 19
Geden, A. 1
Geometric distribution 119
GIF 289—290 309
GLIMPSE system 429
GNUbib collection 107 150
Golomb coding 119—121 150 247—248
Golomb coding in FELICS 293
Golomb coding in JPEG-LS 297
Golomb coding in mg 421
Golomb coding, average case 120
Golomb coding, worst case 247
Golomb, Solomon 119
Gopher 438
Gradient-adjusted prediction 295 297
Grammar-based compression 27
Granularity 105 112 171 423 459
Gray bars xxv
Gray codes 281
Gray literature 443
Grayscale image compression 309 see "FELICS"
Grayscale image compression in mg 417 466
Grayscale image compression, use of JBIG 281
Greedy parsing 79 101
gzip program 78—79 102 290 453
gzip program, performance 91 98 407
Halftone 266
Halftone, identifying graphics 355 373 386
Halftone, JBIG 285
Halftone, READ coding 274
Hamlet 458
Hapax legomena 34 36 50 63 122
Hardy, Thomas 28
Hash function 151 161 254 255
Hash function, choice of 137
Hash function, collision 162
Hash function, evaluation time 255
Hash function, minimal perfect 162 see
Hash function, order-preserving 162
Hash function, perfect 162
Hash table 162 228 229