Step 7: Try the model on testing data


Now we use the trained model to estimate the overall ratings of the players in test_data. As we did with train_data, we create x_test and y_test.

model.predict() returns the predicted ratings, one for each row of x_test.

# sort the test data by the target value ("Overall"), highest first
test_data = test_data.sort_values([target], ascending=False)

x_test = test_data[features]
y_test = test_data[target]

y_pred = model.predict(x_test)

Let’s compare the predictions with the actual overall ratings.

# add a new column of predicted overall to test_data
test_data['Predicted Overall'] = y_pred.copy()

# add a new column with the difference between predicted and actual overall, as a percentage of the actual value
difference = (y_pred - y_test) / y_test * 100
test_data['Difference (%)'] = difference

# print the results
test_data[["Name", "Nationality", "Club", "Overall", "Predicted Overall", "Difference (%)"]]

       Name                  Nationality          Club                            Overall  Predicted Overall  Difference (%)
    1  Cristiano Ronaldo     Portugal             Juventus                             94          91.973701       -2.155638
   10  R. Lewandowski        Poland               FC Bayern München                    90          88.135513       -2.071652
   23  S. Agüero             Argentina            Manchester City                      89          87.807637       -1.339733
   48  C. Immobile           Italy                Lazio                                87          85.933234       -1.226168
  159  Louri Beretta         Brazil               Atlético Mineiro                     83          81.583941       -1.706096
  193  Rodrigo               Spain                Valencia CF                          83          81.784946       -1.463921
  179  S. Gnabry             Germany              FC Bayern München                    83          79.978980       -3.639783
  315  David Villa           Spain                New York City FC                     82          81.259066       -0.903578
  362  Paco Alcácer          Spain                Borussia Dortmund                    81          81.836532        1.032756
  518  Alexandre Pato        Brazil               Tianjin Quanjian FC                  80          78.322831       -2.096461
  499  L. de Jong            Netherlands          PSV                                  80          79.993062       -0.008672
  523  K. Gameiro            France               Valencia CF                          80          79.130702       -1.086622
  721  B. Yılmaz             Turkey               Trabzonspor                          79          78.092396       -1.148866
  693  S. Jovetić            Montenegro           AS Monaco                            79          79.353044        0.446891
  591  L. Alario             Argentina            Bayer 04 Leverkusen                  79          79.066446        0.084109
  569  André Silva           Portugal             Sevilla FC                           79          79.925229        1.171175
  588  M. Philipp            Germany              Borussia Dortmund                    79          78.962674       -0.047248
  561  L. Martínez           Argentina            Inter                                79          79.411940        0.521443
  874  A. Dzyuba             Russia               NaN                                  78          76.855093       -1.467829
  825  S. García             Uruguay              Godoy Cruz                           78          77.375588       -0.800528
  909  V. Germain            France               Olympique de Marseille               77          77.509005        0.661045
 1095  N. Jørgensen          Denmark              Feyenoord                            77          76.745918       -0.329976
  992  J. Sand               Argentina            Deportivo Cali                       77          78.886169        2.449570
 1137  Rubén Castro          Spain                UD Las Palmas                        77          77.797984        1.036343
  895  M. Harnik             Austria              SV Werder Bremen                     77          76.926679       -0.095222
 1413  Alan Carvalho         Brazil               Guangzhou Evergrande Taobao FC       76          75.922866       -0.101492
 1327  K. Dolberg            Denmark              Ajax                                 76          76.060831        0.080041
 1496  F. Montero            Colombia             Sporting CP                          76          77.017187        1.338404
 1240  I. Popov              Bulgaria             Spartak Moscow                       76          75.734350       -0.349540
 1357  I. Slimani            Algeria              Fenerbahçe SK                        76          76.494507        0.650667
  ...  ...                   ...                  ...                                 ...                ...             ...
17484  J. Lankester          England              Ipswich Town                         54          56.121884        3.929415
17469  J. Gallagher          Republic of Ireland  Atlanta United                       54          54.692444        1.282304
17501  M. Saavedra           Chile                Audax Italiano                       54          54.137463        0.254561
17361  E. McKeown            England              Colchester United                    54          52.796085       -2.229473
17399  Mao Haoyu             China PR             Tianjin TEDA FC                      54          53.964477       -0.065783
17313  M. Howard             England              Preston North End                    54          53.339370       -1.223389
17355  V. Barbero            Argentina            Belgrano de Córdoba                  54          54.011344        0.021008
17422  Y. Ogaki              Japan                Nagoya Grampus                       54          54.041024        0.075970
17447  Xie Weijun            China PR             Tianjin TEDA FC                      54          53.452376       -1.014118
17367  T. Lauritsen          Norway               Odds BK                              54          54.944641        1.749336
17482  F. Al Birekan         Saudi Arabia         Al Nassr                             54          52.727175       -2.357084
17609  S. Jamieson           Scotland             St. Mirren                           53          53.509650        0.961604
17716  M. Knox               Scotland             Livingston FC                        53          52.826053       -0.328201
17578  Lei Wenjie            China PR             Shanghai SIPG FC                     53          52.770581       -0.432867
17665  J. Smylie             Australia            Central Coast Mariners               53          52.469974       -1.000049
17611  Felipe Ferreyra       Brazil               Curicó Unido                         53          52.861431       -0.261451
17765  A. Georgiou           Cyprus               Stevenage                            52          52.167786        0.322665
17757  L. Smyth              Northern Ireland     Stevenage                            52          51.999942       -0.000111
17923  A. Reghba             Republic of Ireland  Bohemian FC                          51          51.075501        0.148041
17956  C. Murphy             Republic of Ireland  Cork City                            51          51.731985        1.435265
17971  M. Najjar             Australia            Melbourne City FC                    51          51.035541        0.069688
18013  W. Møller             Denmark              Esbjerg fB                           51          50.796960       -0.398118
18062  Gao Dalun             China PR             Jiangsu Suning FC                    50          49.677371       -0.645259
18094  M. Al Dhafeeri        Saudi Arabia         Al Batin                             50          51.553964        3.107928
18063  R. Hackett-Fairchild  England              Charlton Athletic                    50          50.140762        0.281524
18028  D. Asonganyi          England              Milton Keynes Dons                   50          50.349896        0.699792
18140  K. Hawley             England              Morecambe                            49          49.787332        1.606799
18166  N. Ayéva              Sweden               Örebro SK                            48          48.802935        1.672781
18177  R. Roache             Republic of Ireland  Blackpool                            48          49.226015        2.554197
18200  J. Young              Scotland             Swindon Town                         47          48.019387        2.168908

538 rows × 6 columns
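
To see how the Difference (%) column should be read, the formula can be applied by hand to a few rows. The snippet below is a minimal sketch in plain Python. The helper name percent_difference is ours, not part of the workshop code, and the three rows are copied from the table above.

```python
def percent_difference(predicted, actual):
    """Signed difference between prediction and actual value, as a percentage of the actual value."""
    return (predicted - actual) / actual * 100


# (name, actual Overall, Predicted Overall), copied from the table above
rows = [
    ("Cristiano Ronaldo", 94, 91.973701),
    ("R. Lewandowski", 90, 88.135513),
    ("Paco Alcácer", 81, 81.836532),
]

for name, actual, predicted in rows:
    diff = percent_difference(predicted, actual)
    # a negative value means the model rated the player lower than the actual rating
    direction = "underestimated" if diff < 0 else "overestimated"
    print(f"{name}: {diff:+.2f}% ({direction})")
```

A negative value means the model underestimated the player and a positive value means it overestimated. The results agree with the table up to rounding of the displayed predictions.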

Not bad! In the rows shown above, every prediction is within 4% of the actual rating, which suggests the model can give a reasonable estimate of a player’s overall rating from the features we chose.
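
To put a number on how close the predictions are, we can also summarize the errors in rating points. The sketch below computes the mean absolute error (MAE) and the root mean squared error (RMSE) in plain Python. The function names and the five sample rows (the top of the table above) are chosen for illustration only. On the real data you would pass the full y_test and y_pred instead, and five top-rated players are not representative of all 538 rows.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the error, in rating points."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but large errors are penalized more heavily."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# illustrative sample: the first five rows of the table above
actual_sample = [94, 90, 89, 87, 83]
predicted_sample = [91.973701, 88.135513, 87.807637, 85.933234, 81.583941]

mae = mean_absolute_error(actual_sample, predicted_sample)
rmse = root_mean_squared_error(actual_sample, predicted_sample)
print(f"MAE:  {mae:.2f} rating points")
print(f"RMSE: {rmse:.2f} rating points")
```

If scikit-learn is available, sklearn.metrics.mean_absolute_error(y_test, y_pred) gives the same MAE without the hand-written helper.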

Now let’s do some plotting to visualize it.

# plot actual ratings as points and predicted ratings as a line
x_axis = range(len(y_test))
plt.scatter(x_axis, y_test, color='blue', label="Actual")
plt.plot(x_axis, y_pred, color='red', label="Predicted")

# hide the x-axis ticks, then add labels and a legend
plt.xticks(())
plt.xlabel("Players (Sorted by Actual Overall ratings)")
plt.ylabel("Overall ratings")
plt.legend(loc='upper right')
plt.show()

Final graph