Maximum a posteriori (MAP) hypothesis, output of consistent learners 162—163
Maximum likelihood (ML) hypothesis 157
Maximum likelihood (ML) hypothesis, EM algorithm search for 194—195
Maximum likelihood (ML) hypothesis, least-squared error hypothesis and 164—167
Maximum likelihood (ML) hypothesis, prediction of probabilities with 167—170
MDP See "Markov decision processes"
Mean error 143
Mean value 133 136
Means-ends planner 326
Mechanical design, case-based reasoning in 240—244
Medical diagnosis, attribute selection measure 76
Medical diagnosis, Bayes theorem in 157—158
Meta-DENDRAL 302
mFOIL 302
Minimum Description Length principle 66 69 171—173 197 198
Minimum Description Length principle in decision tree learning 173—174
Minimum Description Length principle in inductive logic programming 292—293
MIS 302
Mistake bounds, optimal See "Optimal mistake bounds"
Mistake-bound learning 202 220 226
Mistake-bound learning in Candidate-Elimination algorithm 221—222
Mistake-bound learning in Find-S algorithm 220—221
Mistake-bound learning in Halving algorithm 221—222
Mistake-bound learning in List-Then-Eliminate algorithm 221—222
Mistake-bound learning in Weighted-Majority algorithm 224—225
ML hypothesis See "Maximum likelihood hypothesis"
Momentum, addition to Backpropagation algorithm 100 104
More_general_than partial ordering 24—28 46
More_general_than partial ordering in Candidate-Elimination algorithm 29
More_general_than partial ordering in Find-S algorithm 26—28
More_general_than partial ordering in version spaces 31
More_general_than partial ordering, θ-subsumption, entailment, and 299—300
Multilayer feedforward networks, Backpropagation algorithm in 95—101
Multilayer feedforward networks, function representation in 105—106 115
Multilayer feedforward networks, representation of decision surfaces 96
Multilayer feedforward networks, training of multiple networks 105
Multilayer feedforward networks, VC dimension of 218—220
Mutation operator 252 253 255 257 262
Naive Bayes classifier 154—155 177—179 197
Naive Bayes classifier, Bayesian belief network, comparison with 186
Naive Bayes classifier, maximum a posteriori (MAP) hypothesis and 178
Naive Bayes classifier, use in text classification 180—184
Naive Bayes learner See "Naive Bayes classifier"
Negation-as-failure strategy 279 319 321n
Negative literal 284 285
Neural network learning 81—124 See also "Cascade-Correlation algorithm", "EBNN algorithm", "KBANN algorithm", "TangentProp algorithm"
Neural network learning in Q learning 384
Neural network learning, applications of 83 85
Neural network learning, applications of, in face recognition 113
Neural network learning, cross-validation in 111—112
Neural network learning, decision tree learning, comparison with 85
Neural network learning, discovery of hidden layer representations in 107
Neural network learning, overfitting in 123
Neural network learning, representation in 82—84 105—106
Neural networks, artificial 81—124 See also "Radial basis function networks", "Recurrent networks"
Neural networks, artificial, biological neural networks, comparison with 82
Neural networks, artificial, creation by KBANN algorithm 342—343
Neural networks, artificial, VC dimension of 218—220
Neural networks, biological 82
Neurobiology, influence on machine learning 4 82
New features, derivation in Backpropagation algorithm 106—109 123
New features, derivation in explanation-based learning 320—321
NewsWeeder system 183—184
Nondeterministic environments, Q learning in 381—383
Normal distribution 133 139—140 143 151 165
Normal distribution for noise 167
Normal distribution in paired tests 149
Occam's razor 4 65—66 171
Offline learning systems 385
One-sided bounds 141 144
Online learning systems 385
Optimal brain damage approach 122
Optimal code 172
Optimal mistake bounds 222—223
Optimal policy for selecting actions 371—372
Optimization problems, explanation-based learning in 325
Optimization problems, genetic algorithms in 256 269
Optimization problems, reinforcement learning in 256
Output encoding in face recognition 114—115
Output units, Backpropagation weight update rule for 102—103
Overfitting 123
Overfitting in Backpropagation algorithm 108 110—111
Overfitting in decision tree learning 66—69 76—77 111
Overfitting in neural network learning 123
Overfitting, definition of 67
Overfitting, Minimum Description Length principle and 174
PAC learning 203—207 225 226
PAC learning of boolean conjunctions 211—212
PAC learning, definition of 206—207
PAC learning, training error in 205
PAC learning, true error in 204—205
Paired tests 147—150 152
Parallelization in genetic algorithms 268
Partially learned concepts 38—39
Partially observable states in reinforcement learning 369—370
Perceptron training rule 88—89 94 95
Perceptrons 86 95 96 123
Perceptrons, representation of boolean functions 87—88
Perceptrons, VC dimension of 219
Perceptrons, weight update rule 88—89 94 95
Perfect domain theory 312—313
Performance measure 6
Performance system 11—13
Philosophy, influence on machine learning 4
Planning problems, case-based reasoning in 240—241
Planning problems, Prodigy in 327
Policy for selecting actions 370—372
Population evolution in genetic algorithms 260—262
Positive literal 284 285
Post-pruning in decision tree learning 68—69 77 281
Post-pruning in FOIL algorithm 291
Post-pruning in Learn-one-rule 281
Posterior probability 155—156 162
Power law of practice 4
Power set 40—42
Predicates 284 285
Preference bias 64 76 77
Prior knowledge 155—156 336 See
Prior knowledge in Bayesian learning 155
Prior knowledge in explanation-based learning 308—309
Prior knowledge in human learning 330
Prior knowledge in Prolog-EBG 313
Prior knowledge to augment search operators 357—361
Prior knowledge, derivatives of target function 346—356 362
Prior knowledge, explicit, use in learning 329
Prior knowledge, initialize-the-hypothesis approach 339—346 362
Prior knowledge, search alteration in inductive-analytical learning 339—340 362
Prior knowledge, weighting in inductive-analytical learning 338 362
Prioritized sweeping 380
Probabilistic reasoning 163
Probabilities, estimation of 179—180
Probabilities, formulas 159
Probabilities, maximum likelihood (ML) hypothesis for prediction of 167—170
Probability density 165
Probability distribution 133 See also "Normal distribution"
Probably approximately correct (PAC) learning See "PAC learning"
Process control in manufacturing 17
Prodigy 326—327 330
Product rule 159
PROGOL 300—302
PROLOG 275 302 330
Prolog-EBG 313—321 328—329
Prolog-EBG, applications of 325
Prolog-EBG, deductive learning in 321—322
Prolog-EBG, definition of 314
Prolog-EBG, derivation of new features in 320—321
Prolog-EBG, domain theory in 322
Prolog-EBG, EBNN algorithm, comparison with 356
Prolog-EBG, explanation of training examples 314—318
Prolog-EBG, explanation of training examples, weakest preimage in 329
Prolog-EBG, inductive bias in 322—323
Prolog-EBG, inductive logic programming, comparison with 322
Prolog-EBG, limitations of 329
Prolog-EBG, perfect domain theory in 313
Prolog-EBG, prior knowledge in 313
Prolog-EBG, properties of 319
Prolog-EBG, regression process in 316—318
Propositional rules, learning by sequential covering algorithms 275
Propositional rules, learning first-order rules, comparison with 283
Psychology, influence on machine learning 4
Q function in deterministic environments 374
Q function in deterministic environments, convergence of Q learning towards 377—380
Q function in nondeterministic environments 381
Q function in nondeterministic environments, convergence of Q learning towards 382
Q learning algorithm 372—376 See
Q learning algorithm in deterministic environments 375
Q learning algorithm in deterministic environments, convergence 377—380
Q learning algorithm in deterministic environments, training rule 375—376
Q learning algorithm in nondeterministic environments 381—383
Q learning algorithm in nondeterministic environments, convergence 382—383
Q learning algorithm in nondeterministic environments, training rule 382
Q learning algorithm, advantages of 386
Q learning algorithm, experimentation strategies in 379
Q learning algorithm, lookup table, neural network substitution for 384
Q learning algorithm, updating sequence 379
Query strategies 37—38
Radial basis function networks 231 238—240 245—247
Radial basis function networks, advantages of 240
Random variable 133 134 137 151
Randomized method 150
Rank selection 256
RBF networks See "Radial basis function networks"
RDT program 303
Real-valued target function See "Continuous-valued target function"
Recurrent networks 119—121 See also "Neural networks, artificial"
Recursive rules 284
Recursive rules, learning by FOIL algorithm 290
Reduced-error pruning in decision tree learning 69—71
Regress algorithm 317—318
Regression 236
Regression in Prolog-EBG 316—318
Reinforcement learning 367—387 See
Reinforcement learning, applications of 387
Reinforcement learning, differences from other methods 369—370
Reinforcement learning, dynamic programming and 380 385—387
Reinforcement learning, explanation-based learning and 330
Reinforcement learning, function approximation algorithms in 384—385
Relational descriptions, learning of 302
Relative frequency 282
Relative mistake bound for Weighted-Majority algorithm 224—225
Residual 236
Resolution rule 293—294
Resolution rule, first-order 296—297
Resolution rule, inverse entailment operator and 294—296
Resolution rule, propositional 294
Restriction bias 64
Reward function in reinforcement learning 368
Robot control by Backpropagation and EBNN algorithms, comparison of 356
Robot control, genetic programming in 269
Robot driving See "Autonomous vehicles"
Robot perception, attribute cost measures in 76
Robot planning problems, explanation-based learning in 327
Rote-Learner algorithm, inductive bias of 44—45
Roulette wheel selection 255
Rule for estimating training values 10 383
Rule learning 274—303
Rule learning by FOCL algorithm 357—360
Rule learning by genetic algorithms 256—259 269—270 274
Rule learning in decision trees 71—72
Rule learning in explanation-based learning 311—319
Rule post-pruning in decision tree learning 71—72
Rules, disjunctive sets of, learning by sequential covering algorithms 275—276
Rules, first-order See "First-order rules"
Rules, propositional See "Propositional rules"
SafeToStack 310—312
Sample complexity 202 See
Sample complexity for finite hypothesis spaces 207—214
Sample complexity for infinite hypothesis spaces 214—220
Sample complexity of k-term CNF and DNF expressions 213—214
Sample complexity of unbiased concepts 212—213
Sample complexity, bound for consistent learners 207—210 225
Sample complexity, bound for consistent learners, equation for 209
Sample complexity, VC dimension bound 217—218
Sample error 130—131 133—134 143
Sample error, training error and 205
Sampling theory 132—141
Scheduling problems, case-based reasoning in 241
Scheduling problems, explanation-based learning in 325
Scheduling problems, Prodigy in 327
Scheduling problems, reinforcement learning in 368
Schema theorem 260—262
Schema theorem, genetic operators in 261—262
Search bias See "Preference bias"
Search control problems as sequential control processes 369
Search control problems, explanation-based learning in 325—330
Search control problems, explanation-based learning in, limitations of 327—328
Search of hypothesis space See "Hypothesis space search"
Sequential control processes 368—369
Sequential control processes, learning task in 370—373
Sequential control processes, search control problems in 369
Sequential covering algorithms 274—279 301 313 363
Sequential covering algorithms, choice of attribute-pairs in 280—282
Sequential covering algorithms, definition of 276
Sequential covering algorithms, FOIL algorithm, comparison with 287 301—302
Sequential covering algorithms, ID3 algorithm, comparison with 280—281
Sequential covering algorithms, simultaneous covering algorithms, comparison with 280—282
Sequential covering algorithms, variations of 279—280 286
Shattering 214—215
Shepard's method 234
Sigmoid function 97 104
Sigmoid units 95—96 115
Simultaneous covering algorithms, choice of attributes in 280—281
Simultaneous covering algorithms, sequential covering algorithms, comparison with 280—282
Single-point crossover operator 254 261
Soar 327 330
Specific-to-general search 281
Specific-to-general search in FOIL algorithm 287
Speech recognition 3
Speech recognition, Backpropagation algorithm in 81
Speech recognition, representation by multilayer network 95 96
Speech recognition, weight sharing in 118
Speedup learning 325 330
Sphinx 3
Split information 73—74
Squashing function 96
Stacking problems See also "SafeToStack"
Stacking problems, analytical learning in 310
Stacking problems, explanation-based learning in 310
Stacking problems, genetic programming in 263—265
Stacking problems, Prodigy in 327
Standard deviation 133 136—137
State-transition function 380
Statistics, basic definitions 133
Statistics, influence on machine learning 4
Stochastic gradient descent 93—94 98—100 104—105
Student t tests 147—150 152
Substitution 285 296
Sum rule 159
t tests 147—150 152
TangentProp algorithm 347—350 362
TangentProp algorithm in EBNN algorithm 352
TangentProp algorithm, Backpropagation algorithm, comparison with 349
TangentProp algorithm, search of hypothesis space by KBANN and Backpropagation algorithms, comparison with 350—351
Tanh function 97
Target concept 22—23 40—41
Target concept, PAC learning of 211—213
Target function 7—8 17
Target function, continuous-valued See "Continuous-valued target function"
Target function, representation of 8—9 14 17
TD-Gammon 3 14 369 383