I was recently watching a 2017 GDC talk on Ranking algorithms in games and implementations, during that the speaker briefly touched on building his own ranking algorithm for a very specific problem he had. That got me wondering about what it took to develop one, so I read up on two main algos Elo and TrueSkill. based on deductions made from that I tried to deduce my own mathematical basis for a ranking system. Through this blog i wanted to elaborate on the entire process for that
What is a ranking system
A ranking system mathematically models how individuals or entities are ordered based on a measurable performance or skill level, dynamically updated through interactions or comparisons.
After this formal establishment I looked at two important algorithms, Elo and TrueSkill and tried to extract from them important aspects of a ranking algorithm.
Elo Algorithm
Elo is based on a normal distrubutions, the idea is to rank all players of a game by assigning a metric to them. the metric to update player scores is as follows
RatingDiff = (Score - Expected) * K-factor
score is 0 when loss, 0.5 when draw and 1 when win
expected[A] = 1 / (1 + 10^(Rating[B-A]/400))
K-factor can be a constant usually K = 32
the points are updated based on the result. if the expected winner ends up loosing, it is a surprise for the system and subsequently a much more drastic shift in rating is observed.
TrueSkill Algorithm
originates from need for fair matchmaking for xbox live and improves on the elo premise. Each player has a skill distribution
trueskill considers two variables μ
average skill and σ
degree of certainity.
We compute expected outcome as follows
Ea=Φ(σA2+σB2μA−μB)
We update the ratings with the following
Δμ=K⋅(S−E)
What makes a good ranking system
While there are many factors that we can consider while designing a ranking system I chose the following.
- Credibility: The system must be fair and evaluate every match up properly
- Simplicity: Should be simple enough, so that players can compute results themselves
- New player handling: New players should be able to fit into the system relatively easily
- Time decay: The ratings should ideally not inflate/deflate too much over the course of time.
- Player experience: The system should enable a fun player experience.
- Fair matchmaking: The matchmaking enabled by the system should be fair but at the same time should provide enough variation so that the matches remain fun.
Now that we have defined all of this, lets also establish what exactly we need from our system.
Core tenets of my ranking system
To do that i chose the following as the cornerstones that the system is based on
- It should be a zero sum system
- Must provide slight delta of randomness in matchmaking so there is variation in player matchmaking
- The system must converge quickly for newer accounts
- As an additional challenge I chose a constraint that discourages grinding continously.
Key assumptions
- there a limited amount of rating points available, say 1000, and a set number of players compete for this rating points.
- We assume the following distribution more players with lower ratings, fewer players with high ratings, and the midpoint serving as an “average.” This shape of
y = 1 - x
will naturally emerge under this constraints. - Points are directly transferred between players after win or loss, i want to keep the aspect of elo where the more surprising the result is, the larger is the points exchange.
- when a player is on
x
no. of consecutive games i want this to negatively affect the points exchange so that grinding for wins is discouraged.
Mathematical Formalism
Based on the above tenets and assumptions we can formalise our system mathematically as follows.
Player representation
Each player is represented by the following tuple Pi=(Ri,Si,Li)
;
where:
- Ri is the rating of player
- Si is the current winning streak
- Li is the current league of the player
Match simulation
1. Expected score:
For match between players Pa and Pb, expected score Ea can be formalized as follows Ea = max(0, min(1, 0.5 + k.(Ra - Rb)))
here k
is some small scaling factor. where as Eb for Pb is characterized as Eb = 1 - Ea
Here we establish a basis for the points exchange between players based how surprising the result is to the system.
2. Outcome
Outcome Oa for Pa is Oa = 1 if Pa wins, else 0
, whereas outcome for Pb is Ob = 1 - Oa
.
3. Streak Penalty
Streak penalty Pi is defined as Pi = 1 / (1 + α.Si)
, and effective match penalty is Penalty(Pa).Penalty(Pb).
Addition of this will ensure that players are discouraged from grinding for wins.
4. K factor
K factor for the player is determined by the league and is formalized as
Ki = K-factor[Li]
where
K-factor["Bronze"] = 200
K-factor["Silver"] = 150
K-factor["Gold"] = 100
K-factor["Platinum"] = 50
Effective K-factor of the match up is K = (Ka + Kb)/2
A changing K factor will help players converge to their appropriate rating quicker
5. Rating update
Rating update for Pa is ΔRa and is characterized as
ΔRa = K . Penalty . (Oa - Ea)
, and simultaneously rating update for Pb is ΔRb which is ΔRb = - ΔRa
6. Rating adjustment
To maintain the zero sum nature we need to adjust ratings after games so that they dont inflate over time. this is done as following
Adjustment = (TOTAL_RATING_POOL - ∑Ri)/N
League Assignment
Each player's league Li is determined based on their rating Ri:
- bronze: 0 ≤ Ri < 100
- silver: 100 ≤ Ri < 200
- gold: 200 ≤ Ri < 300
- platinum: Ri ≥ 300
Match simulation
For M matches
- Sample two players Pa and Pb from the pool of players
- Compute Ea, Oa, Penalty, K, and update ratings.
- Adjust streaks
- adjust ratings
Simulation
Once this formalism is defined I wrote a script that simulated 10,000 players with the above rating pool.
At the end of the simulation we had a player distribution that looked like the following
- Bronze league: 7992 players
- Silver league: 1780 players
- Gold league: 304 players
- Platinum league: 2 players
the top 5 players in the system looked like the following
- id: 3736, rating:276, streak: 4, league: platinum
- id: 8998, rating:271, streak: 6, league: platinum
- id: 4858, rating:264, streak: 4, league: gold
- id: 54, rating:255, streak: 0, league: gold
- id: 1669, rating:254, streak: 1, league: gold
the bottom 5 on the other hand looked like the following
- id: 355, rating: -276, streak: 0, league: platinum
- id: 48, rating: -281, streak: 0, league: platinum
- id: 7874, rating: -284, streak: 2, league: gold
- id: 5472, rating: -290, streak: 0, league: gold
- id: 1822, rating:299, streak: 0, league: gold
hi pss you should consider subscribing to my blog. click here to do so.