Clarke C.L.A., Cormack G.V. — Information Retrieval: Implementing and Evaluating Search Engines
Title: Information Retrieval: Implementing and Evaluating Search Engines
Authors: Clarke C.L.A., Cormack G.V.
Abstract: Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus, a multi-user open-source information retrieval system developed by one of the authors and available online, provides model implementations and a basis for student work.
The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems implementation perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. Additionally, professionals in computer science, computer engineering, and software engineering will find Information Retrieval a valuable reference.
After an introduction to the basics of information retrieval, the text covers three major topic areas — indexing, retrieval, and evaluation — in self-contained parts. The final part of the book draws on and extends the general material in the earlier parts, treating specific application areas, including parallel search engines, link analysis, crawling, and information retrieval over collections of XML documents. End-of-chapter references point to further reading; end-of-chapter exercises range from pencil and paper problems to substantial programming projects.
Language:
Category: Technology
Subject index status: Index with page numbers is complete
Year of publication: 2010
Number of pages: 632
Added to catalog: 18.06.2014
Subject index
Reciprocal rank 409 461
Reciprocal rank fusion 380
Redundancy 496—498
Refresh policy 548
Region algebra 160—168 169 567
Relevance 8 24 67 261 442
Relevance feedback 273—275 319 326 354
Relevance ranking 3
Relevance, binary 8 407
Relevance, graded 8 395 451—453
Replacement algorithm 482
Replication 471 497
Replication, dormant 498
Replication, partial 497
Resampling 424
Research hypothesis 427
Response time 8 75 470 476
Restart probability (PageRank) 518
Retrieval model, Boolean 63
Retrieval model, language modeling 258 286
Retrieval model, probabilistic 258
Retrieval model, vector space 54
Retrieval status value see "Score"
Rice code see "Golomb code"
Robertson, Stephen 258
Robertson/Spärck Jones weighting formula 265
Robots Exclusion Protocol 544
Robots.txt 544
ROC curve 329 397
Rocchio classifier 354
Rocchio feedback 280
Romeo and Juliet 50 58
Routing 4 310
RSV see "Score"
Run (TREC) 73 411
Russian 94
S stemmer 97
SALSA algorithm 532—534 554
Salton, Gerald 54
Sample 420
Scalar product 55
Scheduling algorithm 478
Schema-dependent index 48
Schema-independent index 33 48 49
Score 7 59
Scottish Gaelic 94
Seek latency 493 592
Segmentation 94 95
Selective dissemination of information 4 310
Selector 193 208
Self-indexing 112
Self-information 299
Semi-static coding 177
Semi-supervised learning 336
Sensitivity 332
SEO 553
Sequential scan 40
SERP 510
Service rate see "Throughput"
Service time 470
SGML 11 568
Shakespeare in Love 303
Shakespeare, William 9 33 51 89 160 263 278 302 536
Shannon's Theorem see "Source coding theorem"
Shannon, Claude 180 191
Shape property 141
Shard 500
Shingle 550
Shortest job first 478
Signature file 77 131
Significance 425
Significance level 416
Significance test 430—438
Significant inversion rate 445
Simple-9 207
SJF see "Shortest job first"
SMART 78
Smoothing 20—21 264 290 340 450
Smoothing, Dirichlet 291 295
Smoothing, Jelinek-Mercer 291 295
Smoothing, linear 291
Snippet 7 131 540
Snippet, generation 302
Snowball stemmer 95 97
Source coding theorem 180 188
Source population 415
Spärck Jones, Karen 258
Spam 507 555
Spam filtering 325 342
Spanish 94 95 98
Spec 469
Specificity 8 332 584
Spelling correction 98
Splits 384
Stacking 376 381–385
Standard error 386 422 429
Standard Generalized Markup Language see "SGML"
Starvation 479
Static coding 177
Static page 510
Static rank 54 517—535
Stationary distribution of a Markov chain 530
Steady state 231 248 249
Stemming 84 86—89 95 97
STL 120
Stochastic matrix 22
Stochastically independent 268
Stopping 84
Stopword 85 89—90
Structural index 585
Structural metadata 568
Student's t-distribution 423
Suffix array 77 131
Suffix tree 77 131 133
Summarization 5
Supervised learning 319 336
Support vector machine see "SVM"
Support vectors 353
SVM 353 368
SVM, multicategory 393
SVM, ranking 396
Swedish 94 95
Symbol 14 176
Synchronization point 112 195 219
Synonymy 78
Systematic error 413
t-distribution 423
Table-driven decoding 208—209
Target population 415
TDT see "Topic detection and tracking"
Teleport (PageRank) 523
Term 6 15
Term descriptor 217
Term frequency 48 53 57 266
Term partitioning 493—495 496
Term proximity 54 60—63 302—304
Term selection value 274
Term vector 51
Test collection 23—26 411 453
Test collection, ClueWeb09 25
Test collection, construction 73—75
Test collection, GOV2 25
Test collection, INEX 583
Test collection, TREC45 26
Text REtrieval Conference see "TREC"
TF see "Term frequency"
TF-IDF 57 270 293
The Merry Wives of Windsor 20
The Winter's Tale 14
Thompson, Ken 97
Throughput 8 75 470 477
Token 13
Toolbars 526
Topic 5
Topic detection and tracking 5
Traffic intensity see "Utilization"
Transactional query 514
Transductive learning 336
Transfer function 356 422
Transferability 415
Transition matrix 22
TREC 23—26 67 98 272 410—412
TREC, Filtering Track 314
TREC, Million Query Track 25
TREC, Public Spam Corpus 325
TREC, Robust Track 282
TREC, Spam Track 371
TREC, Terabyte Track 25 213 539
TREC45 collection 26
Trigram 115
True negative 332
True negative rate 332
True positive 332
True positive rate 332
Trust bias 540
TrustRank 555
TSV see "Term selection value"
Turkish 95
Twig 585
Two-Poisson model 267
Type 1 error 331
Type 2 error 331
Unary code 192
Unicode 13 91 95 97
Universal codeword set 223
University of Massachusetts 27
Unsupervised learning 337
URL 525
User intent 513
User satisfaction 410 453 470
UTF-8 13 91 97
Utilization 476 477 493
Validity 406 413 434—438
Validity, external 415
Validity, internal 415
vByte 205—206 213 220 223 253
Vector space model 54—60 78
Vocabulary 14
W3C 564 572
Warmup period (cache) 480
Web crawler 507 541—552 556
Web crawler, crawler trap 557
Web crawler, incremental 547
Web graph 508
Web query 507 513
Web search evaluation 538
Web spam 507 555
Web, hidden 511
Web, indexable 511
Webster, John 18
Wikipedia 3 27 277
Wilcoxon T distribution 433
Winnie the Pooh 80
Wisdom of crowds 376
Word-aligned coding 206—207
World Wide Web Consortium 564
WUMPUS 28
XCG 584
XML 11 29 160 565—570
XML Query 574
XML schema 568 570
XML, declaration 565
XML, DTD 568
XML, empty-element tag 566
XML, exhaustivity 584
XML, overlap 580
XML, ranked retrieval 579—584
XML, specificity 584
XML, well-formed document 568
XML, XCG 584
XML, XML Schema 568
XPath 564 571—572
XQuery 564 574—576
XSL 572
Zelazny, Roger 39
Zero-order model 178 286
Zipf's law 16 107 121 237 239 480 513
Zipf, George 16
Ziv-Lempel compression 191 220