Bertsekas D.P. — Dynamic programming and optimal control (Vol. 2)

Title: Dynamic programming and optimal control (Vol. 2)

Author: Bertsekas D.P.

Annotation:

This is a modest revision of Vol. 2 of the 1995 best-selling two-volume dynamic programming book by Bertsekas. DP is a central algorithmic method for optimal control, sequential decision making under uncertainty, and combinatorial optimization. The treatment focuses on basic unifying themes and conceptual foundations, and illustrates the power of the method with many examples and applications from engineering, operations research, and economics. Among its special features, the book: (a) provides a unifying framework for sequential decision making; (b) develops the theory of deterministic optimal control, including the Pontryagin Minimum Principle; (c) describes neuro-dynamic programming techniques for practical application of DP to complex problems that involve the dual curse of large dimension and lack of an accurate mathematical model; (d) provides a comprehensive treatment of infinite horizon problems in the second volume, and an introductory treatment in the first volume; and (e) contains many exercises, with solutions to the most theoretical ones posted on the book's www page.
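The subject index below points to Bellman's equation and value iteration as the book's core computational ideas. As a rough illustration only (not taken from the book), value iteration for a small discounted MDP can be sketched as follows; the two-state costs and transition probabilities are entirely hypothetical.

```python
# Value iteration sketch: iterate the Bellman operator
#   J(i) <- min_u [ g(i,u) + alpha * sum_j p_ij(u) * J(j) ]
# on a hypothetical two-state, two-action discounted problem.

cost = [[2.0, 0.5], [1.0, 3.0]]          # cost[i][u]: stage cost at state i under action u
P = [
    [[0.8, 0.2], [0.1, 0.9]],            # P[0][u][j]: transition probs from state 0
    [[0.5, 0.5], [0.9, 0.1]],            # P[1][u][j]: transition probs from state 1
]
alpha = 0.9                               # discount factor

J = [0.0, 0.0]
for _ in range(1000):
    J_new = [
        min(cost[i][u] + alpha * sum(P[i][u][j] * J[j] for j in range(2))
            for u in range(2))
        for i in range(2)
    ]
    if max(abs(a - b) for a, b in zip(J, J_new)) < 1e-10:
        J = J_new
        break                             # successive approximations have converged
    J = J_new

# Greedy policy with respect to the (near-)optimal costs J
policy = [
    min(range(2),
        key=lambda u: cost[i][u] + alpha * sum(P[i][u][j] * J[j] for j in range(2)))
    for i in range(2)
]
print(J, policy)
```

Since the Bellman operator is a contraction with modulus alpha, the iterates converge to the unique fixed point, and the greedy policy with respect to it is optimal.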


Language: en

Category: Computer science/

Subject index status: index with page numbers is ready


Year of publication: 1995

Number of pages: 292

Added to catalog: 04.12.2005

Subject index
$\epsilon$-optimal policy      172
Admissible policy      3
Advantage updating      122 132
Aggregation      44 104 219
Approximation in policy space      117
Asset, selling      157 275
Asynchronous algorithms      30 74 120
Average cost problem      184 249 266
Basis functions      51 65 103
Bellman's equation      8 11 83 108 137 180 191 196 225 247 268
Blackwell optimal policy      193 233
Bold Strategy      102
Chess      102 117
Column reduction      67
Consistently improving policies      90 122 127
Contraction mappings      52 65 80 128
Controllability      151 228
Cost approximation      51 101 225
Data transformations      72 263 271
Differential cost      186 192
Dijkstra's algorithm      90 122
Discounted cost      9 186 213 202
Discretization      65
Distributed computation      74 120
Duality      65 222
Error bounds      19 69 209 213 234 239
Feature extraction      103
Feature vectors      103
Feature-based aggregation      101
Gambling      100 173 180
Gauss — Seidel method      28 88 208
Improper policy      80
INDEX function      56
Index of a project      55
Index rule      55 65
Inventory control      153 170
Irreducible Markov chain      211
Jacobi method      68
Label collecting method      90
Linear programming      19 150 221
Linear quadratic problems      150 176—178 228 235
LLL strategy      90
Measurability issues      64 172
Minimax problems      72
Monotone Convergence Theorem      130
Monte-Carlo simulation      90 112 120 131 223
Multiarmed bandit problem      54 250
Multiple-rank corrections      48 61
Negative DP model      134
Neuro-dynamic programming      122
Newton's method      71
Nonstationary problems      107
Observability      151 228
One-step-lookahead rule      157 159 160
Optimistic policy iteration      110 122
Parallel computation      64 71 120
Periodic problems      107 171 177 179
Policy      3
Policy evaluation      30 214
Policy existence      100 172 182 220
Policy improvement      30 214
Policy iteration      35 71 73 91 149 180 213 223
Policy iteration, approximate      41 91 112 115
Policy iteration, modified      39 91
Polynomial approximations      102
Positive DP model      131
Priority assignment      254
Proper policy      80
Q-factor      99 132
Q-learning      10 99 122 224 230 239
Quadratic cost      150 170—178 228 235
Queueing control      250 205
Randomized policy      222
Rank-one correction      30 68
Reachability      181 182
Reinforcement learning      122
Relative cost      180 192
Replacement problems      14 200 270
Riccati equation      151 228
Robbins — Monro method      98
Routing      257
Scheduling problems      54
Semi-Markov problems      201
Sequential hypothesis testing      158
Sequential probability ratio      158
Sequential space decomposition      125
Shortest path problem      78 90 126
Simulation-based methods      10 78 94 222
SLF strategy      90
Stationary policy      3
Stationary policy, existence      13 83 143 100 172 182 227
Stochastic shortest paths      78 185 230—239
Stochastic approximation method      98
Stopping problems      87 155
Successive approximation      19
Temporal differences      10 97 115 122 223
Tetris      105 111
Threshold policies      73
Unbounded costs per stage      134
Uncontrollable state components      105 125
Undiscounted problems      134 249
Unichain policy      190
Uniformization      212 271
Value iteration      19 88 144 180 202 211 224 238
Value iteration, approximate      33
Value iteration, relative      204 211 229 232
Value iteration, termination      23 89
Weighted sup norm      86 128
© Electronic library of the Board of Trustees of the Faculty of Mechanics and Mathematics, Moscow State University, 2004-2025