383—384 387
θ-subsumption 302
θ-subsumption, relationship with entailment and more_general_than partial ordering 299—300
Absorbing state 371
ABSTRIPS 329
Acyclic neural networks See "Multilayer feedforward networks"
Adaline rule See "Delta rule"
Additive Chernoff bounds 210—211
Adalines 123
Agents in reinforcement learning 368
Agnostic learning 210—211 225
ALVINN system 82—84
Analytical learning 307—330
Analytical learning, inductive learning, comparison with 310 328—329 334—336 362
Analytical-inductive learning See "Inductive-analytical learning"
ANN learning See "Neural network learning"
ANNs See "Neural networks artificial"
Antecedents of Horn clause 285
AQ algorithm 279—280
AQ14 algorithm, comparison with GABIL 256 258
Arbitrary functions, representation by feedforward networks 105—106
Artificial intelligence, influence on machine learning 4
Artificial Neural Networks See "Neural networks artificial"
Assistant 77
Astronomical structures, machine learning classification of 3
Attributes, choice of, in sequential vs. simultaneous covering algorithms 280—281
Attributes, continuous-valued 72—73
Attributes, cost-sensitive measures 75—76
Attributes, discrete-valued 72
Attributes, measures for selection of 73—74 77
Attributes, missing values, strategies for 75
Autonomous vehicles 3 4 82—84
Average reward 371
Backgammon learning program See "TD-Gammon"
Backpropagation algorithm 83 97 124
Backpropagation algorithm in Q learning 384
Backpropagation algorithm, applications of 81 84 85 96 113
Backpropagation algorithm, convergence and local minima 104—105
Backpropagation algorithm, definition of 98
Backpropagation algorithm, discovery of hidden layer representations 106—109 123
Backpropagation algorithm, feedforward networks as hypothesis space 105—106
Backpropagation algorithm, gradient descent search 89 115—116 123
Backpropagation algorithm, inductive bias of 106
Backpropagation algorithm, KBANN algorithm, comparison with 344—345
Backpropagation algorithm, KBANN algorithm, use in 339
Backpropagation algorithm, momentum, addition of 100 104
Backpropagation algorithm, overfitting in 108 110—111
Backpropagation algorithm, search of hypothesis space 97 106 122—123
Backpropagation algorithm, search of hypothesis space by genetic algorithms, comparison with 259
Backpropagation algorithm, search of hypothesis space by KBANN and TangentProp algorithms, comparison with 350—351
Backpropagation algorithm, search of hypothesis space in decision tree learning, comparison with 106
Backpropagation algorithm, stochastic gradient descent version 98—100 104—105 107—108
Backpropagation algorithm, TangentProp algorithm, comparison with 349
Backpropagation algorithm, weight update rule for hidden unit weights 103
Backpropagation algorithm, weight update rule for output unit weights 102—103 171
Backpropagation algorithm, weight update rule in KBANN algorithm 343—344
Backpropagation algorithm, weight update rule, alternative error functions 117—118
Backpropagation algorithm, weight update rule, derivation of 101—102
Backpropagation algorithm, weight update rule, optimization methods 119
Backtracking, ID3 algorithm and 62
Backward chaining search for explanation generation 314
Baldwin effect 250 267
Baldwin effect, computational models for 267—268
Bayes classifier, naive See "Naive Bayes classifier"
Bayes optimal classifier 174—176 197 222
Bayes optimal classifier, learning Boolean concepts using version spaces 176
Bayes optimal learner See "Bayes optimal classifier"
Bayes rule See "Bayes theorem"
Bayes theorem 4 156—159
Bayes theorem in Brute-Force MAP Learning algorithm 160—162
Bayes theorem in inductive-analytical learning 338
Bayes theorem, concept learning and 158—163
Bayesian belief networks 184—191
Bayesian belief networks, choice among alternative networks 190
Bayesian belief networks, conditional independence in 185
Bayesian belief networks, constraint-based approaches in 191
Bayesian belief networks, gradient ascent search in 188—190
Bayesian belief networks, inference methods 187—188
Bayesian belief networks, joint probability distribution representation 185—187
Bayesian belief networks, learning from training data 188—191
Bayesian belief networks, naive Bayes classifier, comparison with 186
Bayesian belief networks, representation of causal knowledge 187
Bayesian classifiers 198 See "Naive
Bayesian learning 154—198
Bayesian learning, decision tree learning, comparison with 198
Bayesian methods, influence on machine learning 4
Beam search, general-to-specific See "General-to-specific beam search"
Beam search, generate-and-test See "Generate-and-test beam search"
Bellman residual errors 385
Bellman-Ford shortest path algorithm 386 387
Bellman's equation 385—386
BFS-ID3 algorithm 63
Binomial distribution 133—137 143 151
Biological evolution 249 250 266—267
Biological neural networks, comparison with artificial neural networks 82
Bit strings 252—253 258—259 269
Blocks, stacking of See "Stacking problems"
Body of Horn clause 285
Boolean conjunctions, PAC learning of 211—212
Boolean functions, representation by feedforward networks 105—106
Boolean functions, representation by perceptrons 87—88
Boundary set representation for version spaces 31—36
Boundary set representation for version spaces, definition of 31
Bounds, one-sided 141 144
Bounds, two-sided 141
Brain, neural activity in 82
Breadth first search in ID3 algorithm 63
Brute-Force MAP Learning algorithm 159—162
Brute-Force MAP Learning algorithm, Bayes theorem in 160—162
C4.5 algorithm 55 77
C4.5 algorithm, GABIL, comparison with 256 258
C4.5 algorithm, missing attribute values, method for handling 75
C4.5 algorithm, rule post-pruning in 71—72
CADET system 241—244
Candidate specializations, generated by FOCL algorithm 357—361
Candidate specializations, generated by FOIL algorithm 287—288 357—358
Candidate-Elimination algorithm 29—37 45—47
Candidate-Elimination algorithm, applications of 29 302
Candidate-Elimination algorithm, Bayesian interpretation of 163
Candidate-Elimination algorithm, computation of version spaces 32—36
Candidate-Elimination algorithm, computation of version spaces, definition of 33
Candidate-Elimination algorithm, ID3 algorithm, comparison with 61—64
Candidate-Elimination algorithm, inductive bias of 43—46 63—64
Candidate-Elimination algorithm, limitations of 29 37 41 42 46
Candidate-Elimination algorithm, search of hypothesis space 64
CART system 77
Cascade-Correlation algorithm 121—123
Case-based reasoning 231 240—244 246 247
Case-based reasoning, advantages of 243—244
Case-based reasoning, applications of 240
Case-based reasoning, other instance-based learning methods, comparison with 240
Causal knowledge, representation by Bayesian belief networks 187
Central limit theorem 133 142—143 167
Checkers learning program 2—3 5—14 387
Checkers learning program as sequential control process 369
Checkers learning program, algorithms for 14
Checkers learning program, design 13
Chemical mass spectroscopy, Candidate-Elimination algorithm in 29
Chess learning program 308—310
Chess learning program, explanation-based learning in 325
Chunking 327 330
Cigol 302
Circuit design, genetic programming in 265—266
Circuit layout, genetic algorithms in 256
Classification problems 54
Classify_naive_Bayes_text 182—183
CLAUDIEN 302
Clauses 284 285
CLS See "Concept Learning System"
Clustering 191
CN2 algorithm 278 301
CN2 algorithm, choice of attribute-pairs in 280—281
Complexity, sample See "Sample complexity"
Computational complexity 202
Computational complexity theory, influence on machine learning 4
Computational learning theory 201—227
Concept learning 20—47
Concept Learning System 77
Concept learning, algorithms for 47
Concept learning, Bayes theorem and 158—163
Concept learning, definition of 21
Concept learning, genetic algorithms in 256
Concept learning, ID3 algorithm specialized for 56
Concept learning, notation for 22—23
Concept learning, search of hypothesis space 23—25 46—47
Concept learning, task design in 21—22
Concepts, partially learned 38—39
Conditional independence 185
Conditional independence in Bayesian belief networks 186—187
Confidence intervals 133 138—141 150 151
Confidence intervals for discrete-valued hypotheses 131—132 140—141
Confidence intervals for discrete-valued hypotheses, derivation of 142—143
Confidence intervals, one-sided 144 145
Conjugate gradient method 119
Conjunction of Boolean literals, PAC learning of 211—212
Consequent of Horn clause 285
Consistent learners 162—163
Consistent learners, bound on sample complexity 207—210 225
Consistent learners, bound on sample complexity, equation for 209
Constants in logic 284 285
Constraint-based approaches in Bayesian belief networks 191
Constructive induction 292
Continuous functions, representation by feedforward networks 105—106
Continuous-valued hypotheses, training error of 89—90
Continuous-valued target function 197
Continuous-valued target function, maximum likelihood (ML) hypothesis for 164—167
Control theory, influence on machine learning 4
Convergence of Q learning algorithm in deterministic environments 377—380 386
Convergence of Q learning algorithm in nondeterministic environments 382—383 386
Credit assignment 5
Critic 12 13
Cross entropy 170
Cross entropy, minimization of 118
Cross-validation 111—112
Cross-validation for comparison of learning algorithms 145—151
Cross-validation in k-Nearest Neighbor algorithm 235
Cross-validation in neural network learning 111—112
Cross-validation, k-fold See "k-fold cross-validation"
Cross-validation, leave-one-out 235
Crossover mask 254
Crossover operators 252—254 261 262
Crossover operators, single-point 254 261
Crossover operators, two-point 254 257—258
Crossover operators, uniform 255
Crowding 259
Cumulative reward 371
Curse of dimensionality 235
Data Mining 17
Decision tree learning 52—77
Decision tree learning, algorithms for 55 77 See "ID3
Decision tree learning, applications of 54
Decision tree learning, Bayesian learning, comparison with 198
Decision tree learning, impact of pruning on accuracy 128—129
Decision tree learning, inductive bias in 63—66
Decision tree learning, k-Nearest Neighbor algorithm, comparison with 235
Decision tree learning, Minimum Description Length principle in 173—174
Decision tree learning, neural network learning, comparison with 85
Decision tree learning, overfitting in 67—69 76—77 111
Decision tree learning, post-pruning in 68—69 77
Decision tree learning, reduced-error pruning in 69—71
Decision tree learning, rule post-pruning in 71—72 281
Decision tree learning, search of hypothesis space 60—62
Decision tree learning, search of hypothesis space by Backpropagation algorithm, comparison with 106
Deductive learning 321—322
Degrees of freedom 147
Delayed learning methods, comparison with eager learning 244—245
Delayed reward in reinforcement learning 369
Delta rule 11 88—90 94 99 123
Demes 268
Determinations 325
Deterministic environments, Q learning algorithm for 375
Directed acyclic neural networks See "Multilayer feedforward networks"
Discounted cumulative reward 371
Discrete-valued hypotheses, confidence intervals for 131—132 140—141
Discrete-valued hypotheses, confidence intervals for, derivation of 142—143
Discrete-valued hypotheses, training error of 205
Discrete-valued target functions, approximation by decision tree learning 52
Disjunctive sets of rules, learning by sequential covering algorithms 275—276
Distance-weighted k-Nearest Neighbor algorithm 233—234
Domain theory 310 329 See "Perfect "Prior
Domain theory as KBANN neural network 342—343
Domain theory in analytical learning 311—312
Domain theory in Prolog-EBG 322
Domain theory, weighting of components in EBNN 351—352
Domain-independent learning algorithms 336
Dyna 380
Dynamic programming, applications to reinforcement learning 380
Dynamic programming, reinforcement learning and 385—387
Eager learning methods, comparison with lazy learning 244—245
EBG algorithm 313
EBNN algorithm 351—356 362 387
EBNN algorithm, other explanation-based learning methods, comparison with 356
EBNN algorithm, prior knowledge and gradient descent in 339
EBNN algorithm, TangentProp algorithm in 353
EBNN algorithm, weighting of inductive-analytical components in 355 362
EGGS algorithm 313
EM algorithm 190—196 197
EM algorithm, applications of 191 194
EM algorithm, derivation of algorithm for k-means 195—196
EM algorithm, search for maximum likelihood (ML) hypothesis 194—195
Entailment 321n
Entailment, relationship with θ-subsumption and more_general_than partial ordering 299—300
Entropy 55—57 282
Entropy of optimal code 172n
Environment in reinforcement learning 368
Equivalent sample size 179—180
Error bars for discrete-valued hypotheses See "Confidence intervals for discrete-valued hypotheses"
Error of hypotheses, sample See "Sample error"
Error of hypotheses, training See "Training error"
Error of hypotheses, true See "True error"
Estimation bias 133 137—138 151
Estimator 133 137—138 143 150—151
Evolution of populations in genetic algorithms 260—262
Evolution of populations, argument for Occam's razor 66
Evolutionary computation 250 262
Evolutionary computation, applications of 269
Example-driven search, comparison with generate-and-test beam search 281
Expected value 133 136
Experiment generator 12—13
Explanation-based learning 312—330
Explanation-based learning, applications of 325—328
Explanation-based learning, derivation of new features 320—321
Explanation-based learning, inductive bias in 322—323
Explanation-based learning, inductive learning and 330
Explanation-based learning, lazy methods in 328
Explanation-based learning, limitations of 308 329
Explanation-based learning, prior knowledge in 308—309
Explanation-based learning, reinforcement learning and 330
Explanation-based learning, utility analysis in 327—328
Explanations generated by backward chaining search 314
Explicit prior knowledge 329
Exploration in reinforcement learning 369
Face recognition 17
Face recognition, Backpropagation algorithm in 81 112—117
Feedforward networks See "Multilayer feedforward networks"
Find-S algorithm 26—28 46
Find-S algorithm, Bayesian interpretation of 162—163
Find-S algorithm, definition of 26
Find-S algorithm, inductive bias of 45
Find-S algorithm, limitations of 28—29