CIS 603. Artificial Intelligence

Reinforcement Learning

1. The Problem

learn what to do from reward and punishment

optimize utility or fitness function with selected actions

learning about actions
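
The problem statement above can be illustrated with a minimal sketch in which an agent learns which action to prefer purely from noisy reward and punishment. Everything in it is an assumption made for illustration: the three-action setting, the reward means, the epsilon-greedy rule, and the name run_bandit are hypothetical and are not taken from the course material.

```python
import random

def run_bandit(reward_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection over a hypothetical set of actions."""
    rng = random.Random(seed)
    k = len(reward_means)
    estimates = [0.0] * k   # running estimate of each action's utility
    counts = [0] * k        # how often each action has been selected
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(k)                            # explore
        else:
            action = max(range(k), key=lambda a: estimates[a])   # exploit
        # Reward (positive) or punishment (negative), with Gaussian noise.
        reward = rng.gauss(reward_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update of the utility estimate.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Assumed reward means: action 0 is punished, action 2 is rewarded.
estimates, counts = run_bandit([-1.0, 0.0, 2.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

The agent is never told which action is correct; it only sees the consequences of the actions it selects, which is what separates this setting from supervised learning.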

2. Proposed Solutions

"Reinforcement learning" as studied in machine learning

Sequential decision making according to algorithmic probability

Evolutionary learning
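
As one concrete instance of the machine learning approach, here is a minimal sketch of tabular Q-learning, a standard algorithm from that literature. The five-state chain world, the step function, and all parameter values are hypothetical choices made for illustration.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical chain world)
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right

def step(state, action):
    """Return (next_state, reward, done); reward arrives only at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)            # explore
            else:
                # Exploit, breaking ties at random so early episodes can wander.
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrap from the best estimated value of the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q = q_learning()
# Greedy policy for the non-terminal states (1 means "move right").
policy = [max(range(2), key=lambda i: q[s][i]) for s in range(N_STATES - 1)]
```

The learned table assigns value to states several steps away from the goal even though only the final transition is rewarded.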

3. Issues

various assumptions about the environment and the system

delayed feedback, credit/blame assignment

incomplete description, non-existing function, non-repeated decision

evolution and intelligence
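
The delayed feedback and credit/blame assignment issue listed above can be made concrete with a small sketch. It uses a discounted return, one common way (assumed here, not prescribed by these notes) to spread a late reward back over the earlier actions that led to it. The function name and the discount factor 0.9 are illustrative.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every time step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step sees the discounted sum of what follows.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Only the last of four actions is rewarded directly.
returns = discounted_returns([0.0, 0.0, 0.0, 1.0])
# Earlier steps receive progressively smaller credit:
# approximately [0.729, 0.81, 0.9, 1.0].
```

Under this scheme an action taken three steps before the reward still receives some credit, but less than the action taken immediately before it.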

4. Reading

Sections 17.2-3, 21.1-3, 4.3

5. Ideas

solving problems with action sequences: static vs. dynamic [Computation and Intelligence in Problem Solving]

knowledge about action:

changing the knowledge and resource restrictions may completely change the problem

intelligence and evolution: