Which country is the best? Dealing with ranking data

Who does not dream about living in the best country on the globe? But what counts as the best? There are various factors along which countries can be compared. Countries are ranked on the basis of their GDP, the well-being of their citizens, the level of their inhabitants’ freedom, and many other criteria. Our research aims to come up with a top list of countries that amalgamates several of these aspects. We achieve this goal with rank aggregation.

What is rank aggregation?

Ranking data is ubiquitous. We love university rankings and we tend to check sports league tables. Before buying a gadget, many of us check various test sites. But what if there is more than one ranking available and they are not identical? Rank aggregation helps us out: it combines several rankings of the same items into a single consensus ranking. In this post, we’ll examine the very basics of rank aggregation. Instead of gadgets, we will compare various country rankings. Keep in mind, though, that rank aggregation is used in many fields, from voting theory to economics, where it is often called preference ordering. Additionally, some years ago, it was used in meta-search to aggregate results from various search engines.
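
To make the idea concrete, here is a minimal sketch of one of the simplest aggregation schemes: averaging each item's position across all rankings (a Borda-style approach). This is only an illustration, not necessarily the method behind our results; the gadget names and the function `aggregate_by_mean_rank` are made up for the example.

```python
from statistics import mean


def aggregate_by_mean_rank(rankings):
    """Combine several rankings into one by averaging each item's positions.

    `rankings` is a list of dicts mapping item -> rank (1 = best).
    Every item is assumed to appear in every ranking.
    """
    items = list(rankings[0])
    mean_rank = {item: mean(r[item] for r in rankings) for item in items}
    # Sort by mean rank; ties are broken alphabetically to keep the result deterministic.
    return sorted(items, key=lambda item: (mean_rank[item], item))


# Hypothetical toy data: three test sites ranking three made-up gadgets.
site_a = {"Alpha": 1, "Beta": 2, "Gamma": 3}
site_b = {"Alpha": 2, "Beta": 1, "Gamma": 3}
site_c = {"Alpha": 1, "Beta": 3, "Gamma": 2}

consensus = aggregate_by_mean_rank([site_a, site_b, site_c])
print(consensus)  # ['Alpha', 'Beta', 'Gamma']
```

Alpha wins because its average position (about 1.33) is lower than Beta's (2) and Gamma's (about 2.67), even though one site prefers Beta.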

The data

Our initial aim was to find out which is the best country in the world. However, we soon realized that there is no single top list of countries; there are several, depending on what the countries are supposed to be best at. Hence, we decided to take several rankings into consideration to answer the question. We chose the following country rankings and indices:

First we converted each table into a ranking order, then we merged the eight tables listed above with the coco country converter Python package. As a result, we got a master table with 8 columns and 106 rows. The columns represent the rankings of the countries from different aspects, while the rows correspond to the countries. In short, we now have a master table in which every country has a value between 1 and 106 in every column. Below, you can have a look at the data along with our aggregations and the result of the clustering.
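
The two steps can be sketched roughly as follows. This is a simplified illustration under a few assumptions: country names are already harmonized to ISO3 codes (the job we left to the country converter package), only countries present in every table are kept, and each column is re-ranked afterwards so that it runs from 1 to N without gaps. The functions `to_ranks` and `merge_rankings` and the numbers are hypothetical.

```python
def to_ranks(scores, higher_is_better=True):
    """Turn {country: score} into {country: rank}, where rank 1 is the best.

    Tied scores keep their input order (no special tie handling in this sketch).
    """
    ordered = sorted(scores, key=lambda c: scores[c], reverse=higher_is_better)
    return {country: position for position, country in enumerate(ordered, start=1)}


def merge_rankings(tables):
    """Keep countries present in every table, then re-rank each column to 1..N."""
    common = set.intersection(*(set(ranks) for ranks in tables.values()))
    master = {country: {} for country in sorted(common)}
    for column, ranks in tables.items():
        # Re-rank within the common set so that no column has gaps.
        kept = sorted(common, key=lambda c: ranks[c])
        for position, country in enumerate(kept, start=1):
            master[country][column] = position
    return master


# Hypothetical toy indices keyed by ISO3 codes; not the real tables.
gdp = to_ranks({"SWE": 52.0, "DEU": 48.0, "JPN": 40.0, "USA": 65.0})
freedom = to_ranks({"SWE": 99, "DEU": 94, "JPN": 96})  # no USA entry in this toy table

master = merge_rankings({"gdp": gdp, "freedom": freedom})
for country, row in master.items():
    print(country, row)
```

In this toy run the USA drops out because it is missing from one table, and the remaining three countries get ranks 1 to 3 in both columns.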

And the winner is …

Let’s have a look at the aggregated ranking. The first ten countries are mainly European (Sweden, Germany, Norway and Finland, ranked first, second, third and fourth, and the UK, ranked tenth); there are also three Commonwealth countries (Singapore, Australia and New Zealand, ranked seventh, eighth and ninth) and Japan (ranked sixth). All of them rank consistently high in each ranking, except for Singapore, which is ranked 73rd on freedom. Although the United States does well in all eight rankings, its aggregated rank is 74, so it comes right after Kyrgyzstan. This can be explained by the fact that, as we will see later, no aggregated ranking is perfect.

Let’s explore the data

It’s great that we managed to convert eight rankings into one, but it’s not self-evident what we now have in the master table. Let’s analyze the data and try to find some patterns in this mass of numbers. The three-dimensional bar plot below is our first attempt to interpret the data.

The visualization above could have been made more pleasing to the eye, but that would not have made it more comprehensible. The problem is that it doesn’t help us reveal any pattern in the data.

What if we treat each country as a vector of eight integer values and apply dimensionality reduction? As a second attempt, let’s give it a try and build a self-organizing map. Here comes the more compact and interactive viz.
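
For readers who have not met self-organizing maps before, here is a bare-bones sketch of the idea in plain Python. It is not the code behind our viz: the grid size, learning rate, decay schedule and the country names and rank vectors are all assumptions chosen for illustration. Each grid cell holds an eight-dimensional weight vector; for every country we find the closest cell (the best matching unit) and pull that cell and its grid neighbours towards the country's vector. Since all eight columns share the same 1 to 106 scale, the sketch skips normalization.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid cell whose weight vector is closest to `vector`."""
    return min(weights, key=lambda cell: math.dist(weights[cell], vector))


def train_som(data, rows, cols, epochs=200, lr0=0.5, sigma0=None, seed=0):
    """Train a tiny self-organizing map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # Initialise every grid cell with a copy of a randomly chosen data vector.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink as training proceeds.
        decay = math.exp(-epoch / epochs)
        lr, sigma = lr0 * decay, sigma0 * decay
        for vector in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, vector)
            for cell, w in weights.items():
                grid_dist2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (vector[i] - w[i])
    return weights


# Hypothetical rank vectors (eight rankings each); not our real data.
countries = {
    "Topland": [1, 2, 3, 1, 2, 3, 1, 2],
    "Highmark": [2, 1, 1, 3, 3, 2, 2, 1],
    "Bestia": [3, 3, 2, 2, 1, 1, 3, 3],
    "Lowland": [104, 105, 106, 104, 105, 106, 104, 105],
    "Lastonia": [105, 104, 104, 106, 106, 105, 105, 104],
    "Bottomia": [106, 106, 105, 105, 104, 104, 106, 106],
}

som = train_som(list(countries.values()), rows=2, cols=2)
placement = {name: best_matching_unit(som, vec) for name, vec in countries.items()}
for name, cell in placement.items():
    print(name, "->", cell)
```

Countries with similar rank profiles should end up in the same or neighbouring cells, which is what makes the map readable as a two-dimensional picture of eight-dimensional data.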