Description
Companion Video: https://youtu.be/Wy7HDj2igPo
Tabular Q-Learning can be used to give an NPC intentional behavior such as avoiding enemy players, collecting health points, and most behaviors a human player could exhibit within the game environment. With this system, instead of creating complex hand-crafted behavior trees, you simply reward the actions you want the agent to take, and it learns strategies on its own to receive those rewards.
Q-learning is also the foundation for more advanced systems of intelligent behavior such as those found in the AI Emotions Toolkit and the MindMaker DRL Engine. It is best suited to low-dimensional game environments with a limited number of actions and objects for the AI to learn from. For more complex games, try the Neurostudio Learning Engine.
In Q-Learning, behavior is broken up into an exploration and an exploitation phase. During exploration, the agent acquires knowledge about the effects of its actions. In the exploitation phase, it uses that knowledge to make strategic decisions. In this example project, the Q-learning algorithm is used to solve a 'match-to-sample' puzzle in which the NPC learns that it must activate a switch within the level at the same time that a light is on in order to receive a "food reward".
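To illustrate the idea behind the plugin (this is a minimal Python sketch, not the project's actual Blueprint/C++ code), the snippet below shows a toy tabular Q-learner for a simplified version of the match-to-sample puzzle. The state encoding, reward of 1.0, hyperparameters, and helper names (choose_action, step, update) are all assumptions made for the example.

```python
# Minimal tabular Q-learning sketch (illustrative only, not the plugin's code).
# State: whether the light is on (0/1). Actions: 0 = wait, 1 = press switch.
# Pressing the switch while the light is on yields a "food reward".
import random
from collections import defaultdict

ALPHA = 0.1       # learning rate (assumed value)
GAMMA = 0.9       # discount factor (assumed value)
EPSILON = 0.2     # exploration rate for epsilon-greedy action selection
ACTIONS = [0, 1]  # 0 = wait, 1 = press switch

q_table = defaultdict(float)  # maps (state, action) -> Q-value

def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def step(state, action):
    """Toy environment: reward only when the switch is pressed while the light is on."""
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    next_state = random.choice([0, 1])  # the light toggles at random each step
    return next_state, reward

def update(state, action, reward, next_state):
    """Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (td_target - q_table[(state, action)])

state = 0
for _ in range(5000):
    action = choose_action(state)          # exploration/exploitation phase
    next_state, reward = step(state, action)
    update(state, action, reward, next_state)
    state = next_state

# After training, the agent prefers pressing the switch only while the light is on.
print({s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in (0, 1)})
```

In the actual project, the state and reward would come from the game world rather than a toy step function, but the table update and epsilon-greedy selection follow the same pattern.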
Included Formats
- versions