The 'how do we work out handicaps for internal club tournaments' conversation reared its ugly head last week. Having played chess in my (mis-spent) youth, I was familiar with the Elo ranking system (see the article 'Elo rating' in Wikipedia for details). Essentially, the better your ranking relative to your opponent, the fewer points you get for winning and the more points you lose for losing. Does anyone have any experience with using the Elo rating system in badminton? If so, which C & K-factors did you use & how did you combine ratings for doubles?
How does a rating (Elo or otherwise) give you a handicap? Hint: it doesn't. So not only would you have to calculate ratings over some period of time, you would then have to devise a method of converting the players' ratings into a handicap.
I will answer in brief, because the detail of how it works is contained in the aforementioned article. The Elo rating system uses relative ratings (and a scaling 'C' factor) to generate an expected result (a number between 0 and 1). This is compared to the actual result (also between 0 and 1), and the difference (scaled by a 'K' factor) is then added to or subtracted from the player's rating. Over time, therefore, the ratings come to represent the relative strengths of the players.

The system I was thinking about uses the relative ratings to estimate the proportion of points that a player would win. For example, 0.6 would equate to a 21-14 win (21 out of 35 points equals 0.6). Then, depending upon the proportion of points actually won, the ratings would be adjusted. For example, if our player won 21-7 (21 out of 28 equals 0.75), their rating would rise by K * (0.75 - 0.60). Similarly, their opponent's rating would fall by the same amount.

Needless to say, past results would need to be recorded, but re-calculating the ratings can be done easily in a spreadsheet. So, other than recording who played whom and the score, plus 5 minutes of manual data entry, there's no massive outlay.

One point that does need to be cleared up is the method of combining the ratings of 2 players to make a single doubles rating. I was hoping that someone out there had some guidance on that.
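For anyone wanting to try this outside a spreadsheet, here's a minimal sketch of the points-proportion update described above. The C and K values are placeholders for illustration, not recommendations; real values would need tuning.

```python
# Sketch of the points-proportion Elo update: the rating moves by
# K * (actual share of points won - expected share of points won).
def expected_share(r_player, r_opponent, c=400.0):
    """Expected proportion of points won, from the rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-(r_player - r_opponent) / c))

def update_ratings(r_a, r_b, pts_a, pts_b, k=50.0, c=400.0):
    """Return the two adjusted ratings after a game scored pts_a-pts_b."""
    expected = expected_share(r_a, r_b, c)
    actual = pts_a / (pts_a + pts_b)
    delta = k * (actual - expected)
    return r_a + delta, r_b - delta

# Two equally rated players; the winner takes 21-14 (0.6 of the points),
# so their rating rises by K * (0.6 - 0.5), i.e. about 5 points here.
print(update_ratings(1500, 1500, 21, 14))
```

The same function handles any game length, since it only ever looks at the proportion of points won.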
This is certainly an interesting idea. If you really want to do an accurate Elo rating system, you probably need the correct badminton points distribution model. I think one of my previous threads should be an interesting read for you. http://www.badmintoncentral.com/for...adminton-Win-Probability-From-Points-to-Games I'll have to think about making such an Elo system, and I am interested in what you come up with.
Reading my last post, I realised that I haven't specifically addressed amleto's question with regard to handicapping. I'll do so here.

A benchmark is set (typically half-way between the highest and lowest ratings). The expected score of each participant (x) is then evaluated against the benchmark:

S_x = P * 10 ^ ( (R_x - R_bm) / C )

where P is the number of points in a game, C is the scaling parameter described in the last post, R_x is the participant's rating and R_bm is the rating of the benchmark.

For example, in a tournament up to 21 points with a scaling factor of 400, a participant whose rating is 50 higher than the benchmark would get S_x = 21 * 10 ^ (50/400) = 28.004. In other words, in the time it takes the benchmark to get to 21 points, our player would get 28 and a small bit. Therefore, a handicap of -7 would bring the player in line with the benchmark.

The general formula for a handicap is therefore:

H_x = P * ( 1 - 10 ^ ( (R_x - R_bm) / C ) )
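The handicap formula above is easy to check numerically. A small sketch, using the same illustrative numbers as the worked example (P = 21, C = 400, a player 50 points above the benchmark):

```python
# H_x = P * (1 - 10**((R_x - R_bm) / C)): negative for players above
# the benchmark, positive for players below it.
def handicap(r_player, r_benchmark, points_per_game=21, c=400.0):
    """Points to add to (or, if negative, subtract from) a starting score."""
    return points_per_game * (1.0 - 10.0 ** ((r_player - r_benchmark) / c))

# A player rated 50 above the benchmark starts roughly 7 points back.
print(round(handicap(1550, 1500)))  # -7
```

A player exactly at the benchmark gets a handicap of zero, as you'd expect.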
That's quite interesting. I think trying to get a handicap for pairs from singles ratings is only good enough for a first-order estimate, though, simply because some people play together much better than others. That is to say, four people of similar abilities do not necessarily make 6 pairs of similar abilities: 1-2, 1-3, 1-4, 2-3, 2-4, 3-4. Maybe this first estimate is 'good enough'; otherwise it looks to be necessary to track doubles games as well as singles.
Just some quick thoughts during a lunch break. Use the Elo ratings to set the point win probability between two players (this mapping is, of course, a little bit arbitrary):

Elo of player A = Elo_A = 1500
Elo of player B = Elo_B = 1000
Probability of player A winning any one point from B = Elo_A / (Elo_A + Elo_B) = 1500 / (1500 + 1000) = 0.6

Then we can use the point distribution to determine the expected game scores, to see whether the players performed above or below expectations.

Thinking about it, the Elo system probably doesn't really work with single-elimination tournaments, since players knocked out early will not have another chance to play.

For doubles, it's probably best to use a separate rating for doubles play, although the average of the pair's ratings is probably good as an initial rating. I'm in a bit of a hurry, so sorry for the bits of rambling.
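hhwoot's point-probability idea can be sanity-checked with a quick simulation. This is a rough sketch under assumptions: the 0.6 per-point probability is his example figure, and the scoring is simplified rally scoring (game to 21, win by 2, capped at 30).

```python
import random

def play_game(p_point, target=21, cap=30):
    """Simulate one rally-scoring game; return True if player A wins."""
    a = b = 0
    while True:
        if random.random() < p_point:
            a += 1
        else:
            b += 1
        leader, trailer = max(a, b), min(a, b)
        # Game ends at `target` with a 2-point lead, or at the `cap`.
        if (leader >= target and leader - trailer >= 2) or leader == cap:
            return a > b

random.seed(1)
trials = 20000
wins = sum(play_game(0.6) for _ in range(trials))
print(f"Game win probability at 60% per point: {wins / trials:.2f}")
```

With a 60% chance per rally, A wins the game far more often than 60% of the time, which is why a modest edge in point-winning probability translates into a large edge in game wins.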
I generally dislike these handicap systems in sports with so many variables. Oftentimes, with 3 players A, B and C, you get A>B>C but C>A, because of different play styles. If, in the past, C loses to B and B loses to A, then C supposedly gets a large handicap when C plays A. However, everyone knows that C can beat A even without a handicap. In which case, the handicap is not evening the odds but actually stacking the odds against the weaker player. If you wanna nerd out using handicaps, go bowling or play some golf.
Ratings for A, B and C would depend on how well they play against other people, not only against each other. It might turn out that on average they're all the same strength, so no handicap needed.
Glad hhwoot got in touch. His excellent thread (see #4) was one of the inspirations for this, though I'd forgotten where I'd seen it.

One thing I'd like to correct: the predicted score depends upon the difference in ratings. Therefore the predicted score for 1000 vs 800 would be the same as for 2000 vs 1800, etc. The average rating of the group is arbitrary, and Elo works fine with negative ratings. However, I'm not sure my ego could handle a negative rating!

I agree that individuals should have separate 'singles' and 'doubles' ratings. I was going to just add 2 players' doubles ratings together to make a pair's rating, but the average would be better. Presuming a player is equally 'good' at singles and doubles, if they are 400 points better in singles, they would roughly be only 200 points better in doubles. Taking the average makes the ratings more comparable.

With regard to A>B>C>A: Elo is a reflection of average performance, so there is always going to be some variation. You can't tell me that A's handicap should depend upon whether B and/or C is in their group in a tournament.

Reading the Wikipedia article, Elo's intention was that a 200-point difference in rating should equate to the stronger player having a 75% chance of winning. From hhwoot's thread, a point percentage of 55% leads to a 75% chance of winning a game to 21. Rearranging the equation for the predicted proportion, this leads to a 'C' factor of roughly 2200 (compared with 400 in chess). Further rearrangement yields:

A = P * 10 ^ ( -del_R / C )

where A is the score of the expected loser, P is the length of the game in points, del_R is the difference in ratings and C is the scaling parameter. Therefore a rating difference of 400 in a game to 21 would lead to a predicted score of 21 * 10 ^ (-400/2200) = 13.82.

Finally, one of the nice things about doing Elo against the proportion of points won is that it can be used for different game lengths and for matches that go to a deciding game.
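The rearrangement above can be checked numerically. A small sketch: the 55% point share for a 200-point gap is the calibration quoted from hhwoot's thread; the rest is just arithmetic.

```python
import math

# Calibration: a 200-point rating gap should give the stronger player
# 55% of the points. Solving 1 / (1 + 10**(-200/C)) = 0.55 for C:
point_share, rating_gap = 0.55, 200
c = rating_gap / math.log10(point_share / (1 - point_share))
print(f"C factor: {c:.0f}")  # about 2295, i.e. roughly the 2200 in the post

# Predicted score of the expected loser: A = P * 10**(-del_R / C).
def loser_score(del_r, points_per_game=21, c=2200.0):
    return points_per_game * 10.0 ** (-del_r / c)

print(f"Loser's predicted score at del_R = 400: {loser_score(400):.2f}")
```

So a 400-point gap in a game to 21 predicts roughly a 21-14 result, matching the 13.82 figure above.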
However, this can lead to the situation where the player/pair who wins the most points loses the game.
Listen big shot, don't make me come there and show you how abc works...you know on my greatest day and your worst I can clean the floor with you
Probably you want a weighted average: if two players differ in strength a lot, then opponents will tend to hit to the weaker player, so the rating of the pair will be more like 2/3 of the lower rating plus 1/3 of the higher rating.
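alexh's weighted average could be expressed as follows. Note the 2/3-1/3 split is his conjecture about how often opponents target the weaker player, not an established result.

```python
# Hypothetical pair rating: weight the weaker player more heavily,
# since opponents will tend to hit to them.
def pair_rating(r1, r2):
    weaker, stronger = min(r1, r2), max(r1, r2)
    return (2 * weaker + stronger) / 3

print(pair_rating(1200, 1800))  # 1400.0, vs 1500.0 for a plain average
```

A plain average would give this pair 1500; the weighted version drops them to 1400, reflecting the exploitable weaker half.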
The Elo rating system has its inherent flaws. It is inflationary with time. Actually, I think it depends on the total number of players holding Elo ratings. Let me illustrate. These were the ratings of the top ten players in the world in January 1980:

Anatoly Karpov 2725
Mikhail Tal 2705
Victor Korchnoi 2695
Lajos Portisch 2655
Lev Polugaevsky 2635
Boris Spassky 2615
Tigran Petrosian 2615
Zoltan Ribli 2610
Florin Gheorghiu 2605
Yuri Balashov 2600

30 years later, in January 2010:

Magnus Carlsen 2810
Veselin Topalov 2805
Viswanathan Anand 2790
Vladimir Kramnik 2788
Levon Aronian 2781
Boris Gelfand 2761
Vugar Gashimov 2759
Vassily Ivanchuk 2749
Wang Yue 2749
Peter Svidler 2744

So, either there has been a tremendous improvement in the playing skills of the world's best, or the ratings cannot be taken at their absolute value. Of course, with the advent of computer programs and other aids, the levels of the players are different from those of earlier generations. But that cannot account for the meteoric rise in the ratings. The Elo rating system, and other similar ones, are useful only for comparing the relative strengths of players at a particular instant, not of players who are years or decades apart. So we'll never really know if Garry Kasparov is actually better than Jose Capablanca or Mikhail Botvinnik.
phil-mm is correct to state that Elo doesn't give an estimate of absolute standard. However, I don't believe that we are (or should be) asking it to. Elo gives an estimate of the result based on the difference between 2 ratings (see the first paragraph of post #10). For the purposes of handicapping players within a relatively closed group (e.g. the players at a club), and/or ranking those players relative to one another, this is fine.

That said, Elo may struggle to deal with new or returning players. A returning player's rating may no longer be applicable to the changed group dynamic, in which case it may have to be artificially tweaked to be more representative. Newcomers would also have to have their initial ratings estimated.

Interesting idea from alexh (post #12). I'd suggest we'd need some experimental data to draw conclusions on the split between the higher and lower doubles ratings. Another question is what proportion of the rating points gained or lost should go to the higher-ranked player.
In fact, I think the Elo rating system may not even truly reflect relative strengths. For many years I had been pondering why the ratings are inflationary, without making much headway. It's only recently that I've had some inkling as to how it works.

Imagine a large pyramid, where the volume represents the total number of players and the height their ratings. Within this pyramid are many smaller ones situated at different heights; these represent the tournaments of different categories. The stronger players will cannibalise the ratings of the weaker ones and move upwards. The larger the pyramid, the higher the apex. This is what has happened over the last few decades: the total number of players has increased significantly. Elo should have modified his formula to account for this increase.

Also, imagine two separate communities of players, with differing populations, who do not interact with each other. Two players of the same standard, one playing in each community, will evolve to different ratings, due to the difference in population. As long as they stay within their own communities, their relative strengths within those communities are accurate. When they play each other, obviously their relative strength is wrong.

So, technically, the Elo system does not really work. It is just an approximation. How complicated Elo is, encompassing both Einsteinian stuff and Darwinism.
The pyramid analogy is a good one, but I believe that a normal distribution (upon which Elo is based) is an even better one. As the number of players increases, the top 10 go from being the top 1 in 1,000 to the top 1 in 10,000, etc. As the proportion gets smaller, the sigma (deviation from the mean) gets larger, hence inflation. I still maintain that absolute strength (and its inflation) isn't relevant to a single club of players.

However, 'separate communities' is a real issue, and could also be a contributory factor to top-10 inflation. Club players will very rarely (if ever) play internationals, so the difference in their ratings won't be accurate. However, national players tend to beat club players, and internationals tend to beat nationals, so the ratings gap may be unrealistically high. For the players of a single club, the 'separate community' issue is kept to a minimum, so long as most players regularly play with and against most of the other players. However, clubs that have multiple 'tiers' will suffer from this issue, and Elo isn't appropriate for the occasions when players from the 2 groups 'mix'.

As with any system, Elo isn't perfect. It is initially complicated (though once set up, maintenance is relatively easy), and it is vulnerable to separate communities and ratings inflation. However, if its limitations are observed and accounted for, a proper implementation of Elo does provide an unbiased estimator of relative ability.
Yes, but of course. I had wrongly been assuming that the weakest players are the most numerous. That is not true. The pyramid model is misconceived; it has to be a normal distribution. A larger curve will explain both inflation and deflation, which is also a feature of the Elo system. I just realised that both FIDE and USCF have mechanisms to combat this inflation and deflation.
Has anyone considered that tournaments handicapped off an Elo rating will be both a spectator's and a player's nightmare? Finals: 2 mediocre players who marginally exceeded their normal playing performance. And in any match-up where the best player meets the worst player, the best player will need to completely destroy the worst player in order to win. Competitive players might start head-hunting in these matches.
The Elo rating system is not a handicap system, although it could be used to help assign handicaps more objectively. The Elo system is a way of measuring how good a player is relative to other players. A tournament could specify a range of allowed Elo ratings; this would determine the playing standard of the tournament. Using Elo doesn't necessarily imply a handicap tournament.

Imagine an international tournament where some amateur players can win places by lottery. I win a ticket, and I get to play against Lin Dan. I have an Elo rating of (say) 1700, and Lin Dan has an Elo rating of 2800 (I'm just using rough numbers from chess here). The difference in Elo ratings indicates that Lin Dan is a much, much stronger player than I am, and will almost certainly win. Indeed, you can use the ratings to estimate the probability of him winning (nearly 100%, in this case). He still wins a match by winning 2 games of 21 points: there's no handicap applied here. Even if the score line is 21-19, 21-19 (in his favour), he still wins (obviously that's not going to happen; I'd be delighted with even one point).

The improvement in his Elo rating will be tiny, because I'm such an easy player for him to beat. In the extremely unlikely event that I beat Lin Dan, however, my rating will improve quite a lot, and his rating will suffer quite badly.
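That asymmetry can be shown with the standard chess-style Elo update. The ratings, the K-factor of 32 and C = 400 are all illustrative assumptions borrowed from chess, as in the post above.

```python
# Standard Elo update: rating change = K * (result - expected result),
# where the expected result comes from the rating difference.
def expected_result(r_player, r_opponent, c=400.0):
    return 1.0 / (1.0 + 10.0 ** (-(r_player - r_opponent) / c))

def rating_change(r_player, r_opponent, result, k=32.0, c=400.0):
    """result is 1 for a win, 0 for a loss."""
    return k * (result - expected_result(r_player, r_opponent, c))

amateur, lin_dan = 1700, 2800
# Expected win: Lin Dan gains almost nothing.
print(f"Lin Dan beats the amateur: {rating_change(lin_dan, amateur, 1):+.2f}")
# Huge upset: the amateur gains nearly the full K.
print(f"The amateur beats Lin Dan: {rating_change(amateur, lin_dan, 1):+.2f}")
```

With an 1100-point gap, the favourite's gain is a fraction of a point, while the upset is worth almost the whole K-factor, which is exactly the behaviour described above.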