Introduction to Artificial Intelligence. Lecture 9: Reinforcement Learning. Prof. Gilles Louppe. ... Image credits: CS188, UC Berkeley. Detour: Q-values. The state-value V(s) of the state s is the expected utility starting in s and acting optimally.
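To make the definition of V(s) concrete, here is a minimal sketch that computes state-values by value iteration on a tiny, made-up MDP. Everything in it is an assumption for illustration and does not come from the lecture: the two states, the actions "stay" and "go", the rewards, the discount GAMMA = 0.5, and the helper names q_value and value_iteration. It also assumes the standard definition of a Q-value, Q(s, a), as the expected utility of taking action a in s and acting optimally afterwards, so that V(s) is the maximum of Q(s, a) over actions.

```python
# Hypothetical toy MDP: MDP[state][action] is a list of
# (probability, next_state, reward) outcomes. Not taken from the lecture.
GAMMA = 0.5  # assumed discount factor

MDP = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}


def q_value(mdp, values, state, action, gamma=GAMMA):
    """Expected utility of taking `action` in `state`, then following `values`."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in mdp[state][action])


def value_iteration(mdp, gamma=GAMMA, sweeps=100):
    """Repeatedly apply V(s) <- max_a Q(s, a), starting from V = 0."""
    values = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        values = {
            s: max(q_value(mdp, values, s, a, gamma) for a in mdp[s])
            for s in mdp
        }
    return values


V = value_iteration(MDP)
Q = {(s, a): q_value(MDP, V, s, a) for s in MDP for a in MDP[s]}

for s in MDP:
    print(f"V({s}) = {V[s]:.3f}")
for (s, a), q in Q.items():
    print(f"Q({s}, {a}) = {q:.3f}")
```

On this toy problem the values settle at approximately V(0) = 3 and V(1) = 4, and in each state the largest Q-value equals the state-value, which is the relationship between V and Q that the sketch is meant to show.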

