Programming Reinforcement Learning