PACMAN, using Advanced Deep Q-Learning Networks

Leonardo Zaim
Apr 23, 2021
Photo by Kirill Sharkovski on Unsplash

Reinforcement Learning (RL) has sure caught our attention

Machine Learning

It's been around since the 1950s, when Arthur Samuel asked whether computers could actually learn a behavior instead of being explicitly programmed step by step. Rather than writing specific instructions, we give the computer the data and tools it needs to solve a specific problem. So, one way or another, we're giving the machine the ability to learn.

Machine learning is a branch of artificial intelligence in which our program learns to find patterns in large datasets.

There are four main types: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

In supervised learning we use labeled data to show the machine the output we expect. The fact that the data is labeled is the major difference between this kind of machine learning and unsupervised or reinforcement methods. Each training example pairs an input with a label, and the label is the result we want. Using a training set, the machine gets the idea that there is a direct connection between specific inputs and the output, and builds a model from it. Separate held-out data can then be used to check and tune the model so that its results become more accurate. Supervised learning problems can be categorized into binary classification, multiclass classification, and regression.
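To make the idea concrete, here is a minimal sketch of supervised learning in plain Python. It is only an illustration: the toy data, the labels "small" and "large", and the nearest-centroid classifier are all assumptions chosen for brevity, not something this article's Pacman agent uses.

```python
# Toy labeled data: (feature, label). All values are invented for illustration.
train = [(1.0, "small"), (1.5, "small"), (2.0, "small"),
         (8.0, "large"), (8.5, "large"), (9.0, "large")]
# Held-out labeled data, used only to check the model.
test = [(1.2, "small"), (8.8, "large"), (2.4, "small")]

def fit_centroids(samples):
    """Learn one mean feature value (centroid) per label."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

centroids = fit_centroids(train)
# Fraction of held-out examples the model labels correctly.
accuracy = sum(predict(centroids, x) == y for x, y in test) / len(test)
print(centroids, accuracy)
```

The training set teaches the connection between inputs and labels, and the held-out set tells us how well that connection generalizes.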

Unsupervised learning uses algorithms that let the machine find connections by analyzing unlabeled data. Since there are no labels to check against, the key is that the machine studies the data and comes up with its own observations, such as groups of similar items, usually refining them over several passes. This kind of machine learning generally needs a large dataset to get good results.
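As a rough sketch of what "finding its own observations" can look like, here is a tiny k-means clustering loop in plain Python. K-means is just one possible unsupervised algorithm, picked here as an example; the points and the starting centers are made up.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D points, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# No labels anywhere: the algorithm only sees raw numbers.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

Nobody told the program that there are "small" and "large" values; it discovers the two groups from the data alone.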

Semi-supervised learning is used when labeled data is scarce and unsupervised learning alone wouldn't be enough. The machine starts with a small labeled dataset for training, then we feed it the rest of the data, which is unlabeled. When the goal is a general model that can also handle data it has never seen, this is called inductive reasoning. We also have transductive reasoning, where the goal is only to label the specific unlabeled data we already have, by looking at it in the larger context of the whole dataset.
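One simple way to picture this is self-training, sketched below in plain Python. Self-training is an assumption on my part, chosen because it is short; it is only one of several semi-supervised techniques, and the data and helper names are invented for the example.

```python
def centroids_of(samples):
    """Mean feature value per label, from (feature, label) pairs."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def nearest_label(centroids, x):
    """Label of the centroid closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

labeled = [(1.0, "small"), (9.0, "large")]      # tiny labeled set
unlabeled = [1.5, 2.0, 2.5, 7.5, 8.0, 8.5]      # the rest of the data

# Step 1: train on the small labeled set only.
model = centroids_of(labeled)
# Step 2: let the model guess labels for the unlabeled points.
pseudo = [(x, nearest_label(model, x)) for x in unlabeled]
# Step 3: retrain on the labeled data plus the guessed labels.
model = centroids_of(labeled + pseudo)
print(pseudo, model)
```

The guessed labels in `pseudo` are the transductive part (labels for the data we already hold), while the retrained `model` can also be applied to new points, which is the inductive part.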

Reinforcement learning is different from the other three approaches. The machine, usually called the agent, interacts with an environment over and over, trying to get a better outcome each time. It is given a reward each time a move leads to a good outcome, so we are reinforcing how the machine should behave. Q-learning is one popular RL algorithm; the Q stands for the quality of an action taken in a given state. It works with a set of states of the environment, each with some possible actions, and learns a value for every state and action pair that estimates how much reward that action is likely to bring.
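Here is a minimal sketch of tabular Q-learning to show the states, actions, and rewards in motion. It is not the deep Q-network from the title: the five-cell corridor environment, the reward of 1.0 at the goal, and the values of `ALPHA` and `GAMMA` are all assumptions made up for this example. The agent explores with purely random moves, which works because Q-learning can learn the best actions even while behaving randomly.

```python
import random

N_STATES = 5               # corridor cells 0..4; cell 4 is the goal (terminal)
LEFT, RIGHT = 0, 1         # the two possible actions
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (assumed values)

def step(state, action):
    """Move one cell; reward 1.0 only when the goal cell is reached."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice((LEFT, RIGHT))  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge q toward the reward plus the
            # discounted value of the best action in the next state.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q = train()
# Greedy policy: in each non-terminal cell, pick the action with the higher Q.
policy = ["left" if row[LEFT] > row[RIGHT] else "right" for row in q[:-1]]
print(policy)
```

After training, the table `q` holds the learned quality of each move, and reading off the best action per cell gives the agent's behavior. A deep Q-network replaces this table with a neural network so the same idea can scale to something like Pacman, where there are far too many states to list.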

(Incomplete)
