August 1, 2021
In this project, uniform experience replay is applied using a deep Q-network, in which transitions are uniformly sampled from the agents’ replay memory.
In the latter part, proportional prioritization is applied to replay the transitions that are categorized as more important, at a higher frequency. Prioritization gives a more robust and effective learning system.
Following is the network architecture implemented in PyTorch using novel agent and Q-Table classes.
GitHub Repository