Neural Networks used in Deep Q-learning


Frequently Asked Questions

A deep learning architecture trains neural networks on large sets of labeled data so that they learn features directly from the data, without manual feature extraction. A deep neural network can have as many as 150 hidden layers, whereas a traditional neural network typically has only 2 to 3.

Yes. Q-learning repeatedly updates its Q-values using the Bellman optimality equation until the Q-function (the action-value function) stops changing, that is, until it converges. This procedure is a form of value iteration.
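For reference, the update can be written out as follows. The symbols are assumptions introduced here, not notation from the text above: s and a are the current state and action, r is the reward received, s' is the next state, α is a learning rate, and γ is a discount factor between 0 and 1.

```latex
% Bellman optimality equation for the action-value function
Q^{*}(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^{*}(s', a') \,\right]

% Q-learning update applied after each observed transition (s, a, r, s')
Q(s, a) \leftarrow Q(s, a) + \alpha \left[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\right]
```

The bracketed term in the update is the gap between the current estimate and the one-step Bellman target. When that gap is close to zero for every state-action pair, the Q-function has stabilized.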

Using the Bellman update, the Q-learning algorithm learns a Q-table ('Q' for quality) that stores the expected future reward for each state-action pair. By choosing the action with the highest Q-value in a given state, the agent makes the best available choice for that state.
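A minimal sketch of how such a Q-table might be learned and then used to pick actions. Everything here is an illustrative assumption rather than part of the original answer: the five-state chain environment, the `step` helper, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters


def step(state, action):
    """Return (next_state, reward, done) for the toy chain environment."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Learn a Q-table (one row per state, one column per action)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore occasionally, otherwise act greedily
            # (ties between equal Q-values are broken at random).
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bellman target: reward plus discounted best value of the next state.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned policy moves right in every state, because moving right leads to the only reward. A table like this only works when the states can be enumerated. Deep Q-learning replaces the table with a neural network that estimates the Q-values.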

