Python: Machine Learning

Workshop Resources

Pre-requisite: none

Difficulty: Beginner

Let’s learn some machine learning to evaluate players’ overall ratings in the FIFA video game

Machine learning is the study of algorithms and models that enable computers to recognize things, make decisions, and even predict results without explicit instructions. For example, when you talk to a phone assistant such as Siri or Cortana, machine learning helps translate your voice into text and then understand what you requested. Isn’t that amazing?

Today we are going to show you, step by step, how to teach a computer to evaluate the overall ratings of soccer players based on their attributes.

Let’s get to it!

A little background

Assume that EA Sports (the developer of FIFA 2019) uses a formula to calculate the “Overall” rating of each soccer player. With this formula, we could easily calculate the overall rating for any player, even one who is not in the game. The problem is, we don’t know exactly what the formula looks like.
We do know the input, which consists of player attributes, and the output, which is the Overall rating. So we can use an approach called “regression” to “estimate” the formula from these input/output pairs.

Today, we are going to use a simple model called linear regression. Let’s assume the formula that calculates the overall rating of a soccer player, \( y = f(x) \), is \[ f(x) = ax + b \] Linear regression aims to figure out \(a\) and \(b\). The formula \(f(x)\) is called a “model” in machine learning, and the process of solving/estimating the model is called “training” the model. Once we have trained the model, we can use it to predict the target \(y\) for new data.
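To make “training” concrete, here is a minimal sketch in plain Python. It assumes a tiny made-up dataset (not the FIFA data) generated from \( y = 2x + 1 \), and estimates \(a\) and \(b\) with the standard closed-form least-squares formulas. The names `xs`, `ys`, and `f` are just illustrative choices.

```python
# Made-up toy data (not from the FIFA dataset), generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]

# "Training": closed-form least-squares estimates of a (slope) and b (intercept).
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
a = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
     / sum((x - x_mean) ** 2 for x in xs))
b = y_mean - a * x_mean


def f(x):
    """The trained model: predict y for a new x."""
    return a * x + b


print(a, b)    # 2.0 1.0
print(f(6.0))  # 13.0
```

Because the toy data has no noise, the estimates recover the generating values exactly. With real data the fit would only approximate the underlying formula.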

Back to our story: if we only have one variable \(x\), estimating \(f(x)\) should be easy. Anyone should be able to solve it with a pen and a piece of paper. However, when \(x\) is a long list of soccer player attributes like speed, power, passing, and tackling, it becomes complicated. The formula should be rewritten as \[ f(x_1, x_2, \dots, x_n) = a_1 x_1 + a_2 x_2 + \dots + a_n x_n + b \] Then we have to feed the model a lot of high-quality data to bring it closer to the “real” formula. Let’s get started!
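Before diving in, here is a rough sketch of what fitting the multi-attribute formula involves. It is an illustration under stated assumptions, not the method used in the rest of the workshop: the players, their two attributes (speed, passing), and the “hidden” formula are all invented, and `fit_linear` is a hypothetical helper that solves the least-squares normal equations in plain Python. In practice a library such as scikit-learn would handle this step.

```python
def fit_linear(rows, targets):
    """Least-squares fit of y = a_1*x_1 + ... + a_n*x_n + b.

    Returns (coefficients, intercept).
    """
    # Append a constant 1 to each row so the intercept b is learned like a weight.
    X = [list(r) + [1.0] for r in rows]
    n = len(X[0])
    # Build the normal equations (X^T X) w = X^T y as an augmented matrix.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         + [sum(row[i] * t for row, t in zip(X, targets))]
         for i in range(n)]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    w = [A[i][n] / A[i][i] for i in range(n)]
    return w[:-1], w[-1]


# Made-up players as (speed, passing). The "hidden" formula below is invented
# for this demo; it is not EA Sports' real formula.
players = [(80, 70), (60, 90), (90, 50), (70, 75), (85, 85)]
ratings = [0.5 * speed + 0.3 * passing + 10 for speed, passing in players]

coefs, intercept = fit_linear(players, ratings)


def predict(attributes):
    """Apply the trained model to a new player's attributes."""
    return sum(a * x for a, x in zip(coefs, attributes)) + intercept


print("coefficients:", [round(a, 3) for a in coefs])
print("intercept:", round(intercept, 3))
print("predicted rating for (75, 80):", round(predict((75, 80)), 1))
```

With noise-free toy data the fit should recover coefficients close to 0.5 and 0.3 and an intercept close to 10. Real player data is noisier and has many more attributes, which is why the amount and quality of data matter.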

Table of Contents

Step 1: Get Dataset

Step 2: Start the project

Step 3: Load dataset

Step 4: Pre-process data

Step 5: Feature selection

Step 6: Train the model

Step 7: Try the model on testing data

Closing