                    Clarke C.L.A., Cormack G.V. — Information Retrieval: Implementing and Evaluating Search Engines 
                  
                
                    
                        
                            
                                
 
                                 
                                
Title:   Information Retrieval: Implementing and Evaluating Search Engines
Authors:   Clarke C.L.A., Cormack G.V.
Abstract:  Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus, a multi-user open-source information retrieval system developed by one of the authors and available online, provides model implementations and a basis for student work.
 The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems implementation perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. Additionally, professionals in computer science, computer engineering, and software engineering will find Information Retrieval a valuable reference.
 After an introduction to the basics of information retrieval, the text covers three major topic areas — indexing, retrieval, and evaluation — in self-contained parts. The final part of the book draws on and extends the general material in the earlier parts, treating specific application areas, including parallel search engines, link analysis, crawling, and information retrieval over collections of XML documents. End-of-chapter references point to further reading; end-of-chapter exercises range from pencil and paper problems to substantial programming projects.
 
Language:
Category:  Technology
Subject index status:  Index with page numbers is complete
Year of publication:  2010
Number of pages:  632
Added to catalog:  18.06.2014
                                 
                             
                        
                     
                 
                                                                
			         
	          
                
Subject index
                  
                
                    
Heap        128   141   184
Hidden Markov model        306    
Hidden Web        511    
HITS algorithm       532—534   554    
Holdout validation       383    
Holistic twig joins       585    
Home page finding       539    
Host crowding       493    
HTML        9   525   567    
HTML anchor        277   536    
HTML body        277    
HTML header        277    
Huffman code        181—185   189   200    
Huffman code, canonical       184—185   199   201    
Huffman code, length-limited       185   201   209    
Hungarian        94    
Hybrid index maintenance       238—239    
Hyperlinks        9    
HyperText Markup Language        see "HTML"    
Hypothesis test        427—429    
IDF        see "Inverse document frequency"    
IE       see "Information extraction"    
Impact ordering       153   494    
Implicit user feedback       526   535   540   555    
In-degree        509    
Incremental crawling       547    
Independence assumption        261    
Index block size       116    
Index construction        118—131    
Index construction, in-memory       119—125    
Index construction, merge-based       127—131   229    
Index construction, sort-based       125—127    
Index construction, two-pass       123    
Index partition        127   228   240   471   488    
Index pruning       153—160   495    
Index types        46—51    
Index updates, distributed       490    
Index updates, incremental       231—242    
Index updates, non-incremental       243—251    
Indexable Web        511    
Indexing time       105    
Indri        27—28    
INEX        565   579
INEX, CAS task       579    
INEX, CO task       579    
infAP       449—450    
Inference network model       280    
Inferred average precision       see "infAP"    
Information extraction        5    
Information gain        366    
Information need        5    
Informational query        514    
Inner product        55    
Insert-at-back heuristic       121    
Inter-query parallelism       488    
Interactive search and judging       443    
Interpolative coding       202—204   213   223    
Intra-query parallelism       489   494    
Intranet        511    
Invalidation list       243—244    
Inverse document frequency        57   264   581    
Inverted index        33
Inverted index, docid       49    
Inverted index, frequency       49    
Inverted index, positional       49    
Inverted index, schema-dependent       48    
Inverted index, schema-independent       33   48   49    
Irish        94    
Italian        94   95    
Japanese        95   98    
JavaScript        13    
Jelinek-Mercer smoothing       291   295
Jump vector (PageRank)       523    
Kendall's τ        445
Kendall's notation       474    
Kendall, David        474    
Kendall, Maurice        445    
Kernel trick        358    
KL divergence        see "Kullback-Leibler divergence"
Korean        95
Kullback-Leibler divergence        156   286   296   527
Lam        328    
Landmark-diff       252    
Language model        17—23    
Language modeling        258   286   287—298    
Laplace's law of succession        301    
Laplace, Pierre-Simon        298    
Latency        8   470
Latent Semantic Analysis        78    
Lazy evaluation        244    
Learning, on-line        337
Learning, semi-supervised       336
Learning, supervised        336
Learning, transductive       336
Learning, unsupervised        337
Learning to rank        312   376   394—400    
Learning, incremental        337    
Legal search        46    
Lemma        87    
Lemmatization        87    
Length normalization        see "Document length normalization"    
LETOR       399    
Lexeme        87    
LFU        482    
Lightweight structure       160—168    
Likelihood ratio        333   341    
Linear classifier        349    
Link analysis        517—534   554    
Link function        356    
Linked list        122    
Linked list, unrolled       123   124   130    
List compression, batched       196    
List compression, global       195    
List compression, local       195   210    
ListNet       399    
Little's Law        475   476    
Little, John        475    
LLRUN       200—201   209   212   253    
LLRUN-k       202    
LOG        422    
Log-odds        260    
Logical document structure       11    
Logistic regression        346   383   389    
Logistic regression, gradient descent       348    
Logistic regression, multicategory       392    
logit        260   422    
Logit average       328    
Long tail        480   513    
Lookup table        208    
Lovins stemmer       97    
LRU        482    
LSA        78    
Lucene        27    
m-cover        303    
M/M/1 queueing model        475—477    
Macbeth        9   33   290   508   567   577    
Machine learning        312   336    
Macro-average       322    
MAP        71—74   137   409   444   447   584    
MapReduce        498—503    
Markov chain        23   529    
Markov chain, aperiodic        529    
Markov chain, continuous        475    
Markov chain, irreducible        529    
Markov chain, periodic        529    
Markov model        21—23   362    
Maximal marginal relevance       461   493    
Maximum likelihood        17   289   297    
MaxScore       143—145   491    
Mean average precision       see "MAP"    
Mean reciprocal rank       see "MRR"    
Mean, arithmetic        44   68   409    
Mean, geometric        44   409    
Mean, harmonic        68    
Mean, weighted harmonic       68    
Merge operation, cascaded       126   129    
Merge operation, multiway       126   128   241    
Meta-analysis        415   439—441    
Metalanguage        11    
Metasearch        380    
Micro-average       322    
Microsoft Office        13    
Monty Python's Flying Circus       78    
Morphology        86    
Move-to-front heuristic        121    
Move-to-front pooling       444    
MRR        322   409   539    
Multicategory classification        388—394    
Multicategory ranking       388—394    
N       48    
N-gram        92—93   95   96    
Naive Bayes       334
Named page finding       538    
Navigational query       513   539    
nDCG       451—453   538    
Near-duplicate Web page       549    
New Oxford English Dictionary       160   169    
NEXI        564   572—573    
next        33
nextDoc       49    
NIST        23    
No Merge        232   233    
Nonparametric code       192—195   216    
Normal distribution        417    
Normalized Discounted Cumulative Gain       see "nDCG"    
Novelty        455—460   537   549    
NTCIR       98    
Nugget        459
Null hypothesis        427    
Obama, Barack       441   515    
OCR        97   see "Optical Character Recognition"
Odds        333    
Odds ratio        333    
ODP        see "Open Directory Project"    
Offset        48    
Okapi BM25       see "BM25"    
Okapi BM25F       see "BM25F"    
Omega code       see "ω code"
On-line indexing        see "Index updates"    
Open Directory Project        526   547    
Open source        27    
Open Source IR Systems       27—28    
Optical Character Recognition        4   85    
Order-preserving        260    
Orthography        94    
Out-degree        509    
Overfitting        338   349    
Overlap        580    
p-value        426    
Package-Merge        185    
PageRank        105   517—532   554    
PageRank, focused       526    
PageRank, personalized       526    
PageRank, topic-oriented       526    
Parametric code        195—201   216    
Passage retrieval        302—305    
Path expressions        571    
PCA        554    
PDF        11    
Pearson, Karl        427    
Per-term index       112   133    
Perceptron algorithm        352—353   357    
Perron-Frobenius theorem        530
Phrase search        35—39   111    
Physical document structure       11    
Pike, Rob        97    
Pinyin        96    
Pivoted document length normalization       78    
Poisson distribution        267—268   473   548    
Poisson, Siméon Denis        268
Polish        94    
Pooling (TREC)       73—75   411   441   443—448    
Popper, Karl Raimund        427    
Population        414    
Porter stemmer        87    
Porter, Martin       87   95    
Portuguese        94    
Position tree        see "Suffix tree"    
Positional index        49    
Postings list       33   110—114   161    
PostScript        11    
Power        406   434—438    
Power method        530    
PPM        190    
Pre-allocation factor       123    
Pre-allocation, proportional       123   236    
Preamble        184   186   212   223    
Precision        67—68   318   328   407
Precision at k documents       69   408    
Precision of measurement        413    
Precision, interpolated       70    
Prefix query       106   110   113   133    
Prefix-free        178    
prev        33    
prevDoc       49    
PRF        see "Pseudo-relevance feedback"    
Principal component analysis        554    
Prior odds        334    
Probabilistic model        258    
Probability density function        341   417   473    
Probability density function, cumulative        417    
Probability distribution        see "Distribution"
Probability Ranking Principle       8   259   287    
Proper binary tree       179    
Prosecutor's fallacy        332    
Proximity ranking        see "Term proximity"    
PRP        see "Probability Ranking Principle"    
Pseudo-frequencies       279    
Pseudo-relevance feedback       131   156   275—277   469    
qrels       24   411   441   443    
Query        6    
Query abandonment       540    
Query arrival rate       473    
Query drift        277    
Query execution plan        244    
Query expansion        273   297    
Query log        98   472   480   513    
Query processing, document-at-a-time       139—145    
Query processing, term-at-a-time       145—151   493    
Query processing, top-k       142—145    
Query reformulation       540    
Query term frequency       271    
Query time       105    
Question answering        5   302   457    
Queue discipline        474   478    
Queueing theory        472—477    
Random access        35   111   116   196   216    
Random error        413    
Range encoding       223    
Rank effectiveness        454    
Rank-biased precision       461    
Rank-equivalent       260    
Rank-preserving        260    
RankBoost       399    
RankEff       454    
RankSVM       399    
realloc        123   124    
Recall        67—68   88   138   318   328   407    
Recall-precision curve       70    
Receptionist        490    
                            
                     
                  
			 
		          