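Elo rating

The Elo rating system (the name Elo is pronounced roughly like Jyutping ji1 lou4; Cantonese rendering: 伊勞) is a method for estimating how skilled the players of a game are. It applies to zero-sum games such as Chinese chess and chess^[1]. Beyond such board games, esports in the early twenty-first century also use Elo ratings to measure player skill^[2].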

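Concept

"The measurement of the rating of an individual might well be compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yardstick tied to a rope and which is swaying in the wind."^{[註 1]}^[3]

Arpad Elo^{[歐 1]}, creator of the Elo rating system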

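First, the Elo rating system assumes that the game is zero-sum^{[歐 2]}. Put simply, a zero-sum game is one in which only one player or team can win, or else the game is drawn; the two sides cannot both win^[4]. Chinese chess, chess, Go and most PvP video games (such as shooters in which players fire at one another) are zero-sum games.

Everyday observation already shows that players differ in skill^{[註 2]}: some players are very likely to win no matter whom they face, while others are very likely to lose no matter whom they face. Many people want an objective way to measure how skilled a player is; organisations that run chess tournaments, for example, routinely need to assess their players. Elo is one of the most successful systems for doing so, and in technical terms it measures skill on an interval scale^[5].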

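Under the Elo rating system every player carries an Elo value^{[歐 3]}, and the system repeatedly^[6]^:2.2:

predicts the result of a game from the two players' current Elo values;
updates each player's Elo value after the game, by an amount that depends on how surprising the result was;
after many such updates, brings each player to a final Elo value, which reflects (imperfectly) that player's actual skill level.

The mathematical details below assume that the reader has studied basic probability theory and statistics, and knows basic concepts such as probability and the mean.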

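Basic model

The concepts used by the most basic form of the Elo system can be formalised as follows.

Skill level

Every player's performance fluctuates to some degree: a strong player may have a bad day and lose several games in a row, while a weaker player may happen to be in great form and win several in a row. A player's performance can therefore be modelled, as a simplification, by a normal distribution^{[歐 4]}. In the accompanying figure, the X axis is a player's performance and the Y axis is the probability of each performance value, with one curve each for two players, A and B^{[註 3]}. In terms of mean performance ( $\mu$ ),

Player A's mean performance is 76 points; in everyday terms, A usually performs at or close to 76 points,
Player B's mean performance is 52 points; in everyday terms, B usually performs at or close to 52 points,

and both have a standard deviation ( $\sigma$ ) of 12 points. If A and B play each other, A will win most of the time (by performing better than B), but now and then A may have a bad day just as B has a very good one, so that B performs better on that day. This also means that a player's skill can be assessed accurately only after the player has played a reasonable number of games^[1]^:1.4.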
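As a rough illustration of the A and B example above, the following sketch computes how often A would outperform B. It assumes, beyond what the text states, that the two performances are independent normal variables; the function name `prob_a_outperforms_b` is hypothetical.

```python
import math


def prob_a_outperforms_b(mu_a, mu_b, sigma_a, sigma_b):
    """Probability that A's performance exceeds B's.

    Assumption (not stated in the text): the two performances are
    independent normal variables.
    """
    # The difference X_A - X_B is then normal with mean mu_a - mu_b
    # and standard deviation sqrt(sigma_a^2 + sigma_b^2).
    sd_diff = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    z = (mu_a - mu_b) / sd_diff
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p = prob_a_outperforms_b(76, 52, 12, 12)
print(round(p, 3))  # 0.921
```

Under these assumptions A performs better in roughly 92% of games, which leaves B a chance of about 8% of coming out ahead on any given day.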

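A single game

An analyst can use the players' estimated skill levels to predict the result of a game. Suppose two players, $i$ and $j$ , are to play a game of Go. Let^[7]^:2.1

$\theta _{i}$ be the performance, or estimated skill value, of $i$ ;
$\theta _{j}$ be the performance, or estimated skill value, of $j$ ;
$R_{ij}\in \{0,1\}$ be the possible result of their game, where 1 means " $i$ wins" and 0 means " $j$ wins";

The probability that $i$ wins ( $\Pr({\text{i wins}})$ ) should depend on $(\theta _{i}-\theta _{j})$ ^{[註 2]}. Plotted with $\Pr({\text{i wins}})$ on the Y axis and $(\theta _{i}-\theta _{j})$ on the X axis, the relationship gives a curve with the following properties: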

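When $(\theta _{i}-\theta _{j})=0$ (the two players perform about equally), $\Pr({\text{i wins}})=0.5$ ( $i$ has a 50% chance of winning);
When $(\theta _{i}-\theta _{j})$ is a large positive number ( $i$ performs far better), $\Pr({\text{i wins}})\to 1$ ( $i$ is very likely to win);
When $(\theta _{i}-\theta _{j})$ is a large negative number ( $j$ performs far better), $\Pr({\text{i wins}})\to 0$ ( $j$ is very likely to win).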

The relationship between these quantities can be expressed by the following formula^[6]^:(1):

$$E_{i}={\frac {1}{1+10^{(\theta _{j}-\theta _{i})/400}}}\qquad [1]$$

$E_{i}$ is the score that $i$ is expected to obtain, and reflects $\Pr({\text{i wins}})$ ^[8]. For example:

If $\theta _{j}=\theta _{i}$ , $E_{i}=0.5$ ;
If $\theta _{j}-\theta _{i}=200$ , $E_{i}\approx 0.24$ ;
If $\theta _{j}-\theta _{i}=-200$ , $E_{i}\approx 0.76$ ;

... and so on.
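The three example values can be checked with a minimal sketch of formula $[1]$ . The function name `expected_score` is chosen here for illustration and does not come from the cited sources.

```python
def expected_score(theta_i, theta_j):
    """Expected score E_i of player i against player j, as in formula [1]."""
    return 1.0 / (1.0 + 10.0 ** ((theta_j - theta_i) / 400.0))


# Equal skill values: an even game.
print(expected_score(1000, 1000))            # 0.5
# Opponent j is 200 points higher: i is the underdog.
print(round(expected_score(1000, 1200), 2))  # 0.24
# Opponent j is 200 points lower: i is the favourite.
print(round(expected_score(1200, 1000), 2))  # 0.76
```

Only the difference between the two values enters the formula, so the two players' expected scores always add up to 1.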

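Updating Elo

After a game, the system has to update both players' Elo values according to the result. Let ${\text{Elo}}_{i,t}$ be the Elo value of $i$ at time $t$ , and set $\theta _{i}={\text{Elo}}_{i,t}$ in $[1]$ . After each game, the two players' new Elo values can be computed with the following formula^[6]^:(2):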

$${\text{Elo}}_{i,t+1}={\text{Elo}}_{i,t}+k(S_{i}-E_{i})\qquad [2]$$

where

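${\text{Elo}}_{i,t+1}$ is the next Elo value of $i$ ;
$E_{i}$ is the predicted result, computed from the two players' previous Elo values;
$S_{i}$ is the actual result: $S_{i}=1$ if $i$ wins and $S_{i}=0$ if $i$ loses (a draw is conventionally scored as $S_{i}=0.5$ );
$k$ is an adjustable parameter that controls how quickly Elo values are updated: the larger $k$ is, the faster they change.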
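A minimal sketch of one update step under formula $[2]$ follows. The value `k=32` is only an illustrative default and is not taken from the text; the function names are likewise hypothetical.

```python
def expected_score(theta_i, theta_j):
    """Expected score E_i from formula [1]."""
    return 1.0 / (1.0 + 10.0 ** ((theta_j - theta_i) / 400.0))


def update(elo_i, elo_j, s_i, k=32):
    """Apply formula [2] to both players after one game.

    s_i is the actual score of player i (1 for a win, 0 for a loss).
    k=32 is an assumed, illustrative value.
    """
    e_i = expected_score(elo_i, elo_j)
    e_j = 1.0 - e_i      # the two expected scores sum to 1
    s_j = 1.0 - s_i      # zero-sum: j's score mirrors i's
    new_i = elo_i + k * (s_i - e_i)
    new_j = elo_j + k * (s_j - e_j)
    return new_i, new_j


# An even game: the winner gains k/2 points and the loser drops k/2.
print(update(1000, 1000, 1))  # (1016.0, 984.0)

# An upset: the lower-rated player wins and gains more than k/2.
new_i, new_j = update(1000, 1200, 1)
print(round(new_i - 1000, 1))
```

Because both players are updated with the same $k$ , the points gained by one player equal the points lost by the other.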

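Now take a large group of players and set all of their Elo values to 1,000. The following then happens:

If $i$ has an Elo value clearly below where it should be, $i$ will keep meeting opponents with a similar Elo value who are in fact weaker; $E_{i}$ is close to 0.5 and $S_{i}$ tends to be 1, so the change in the Elo value of $i$ tends to be clearly positive;
If $i$ has an Elo value clearly above where it should be, $i$ will keep meeting opponents with a similar Elo value who are in fact stronger; $E_{i}$ is close to 0.5 and $S_{i}$ tends to be 0, so the change in the Elo value of $i$ tends to be clearly negative;
If $i$ is very strong and has reached the appropriate Elo value, $i$ will keep meeting opponents whose Elo value and actual skill are both lower; $E_{i}$ is close to 1 and $S_{i}$ tends to be 1, so $(S_{i}-E_{i})$ tends to be small when $i$ beats them and the Elo value of $i$ barely moves;
If $i$ is very weak and has reached the appropriate Elo value, $i$ will keep meeting opponents whose Elo value and actual skill are both higher; $E_{i}$ is close to 0 and $S_{i}$ tends to be 0, so $(S_{i}-E_{i})$ tends to be small when $i$ loses to them and the Elo value of $i$ barely moves;

In other words, if all players' Elo values are set to 1,000 and the players then play many games, each of them gradually drifts towards the Elo value they should have. The final Elo values reflect their actual skill.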
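The convergence argument can be tried out with a small simulation. This is only a sketch under assumptions that the text does not make: each player has a hidden "true skill" on the same scale, results are drawn at random from formula $[1]$ applied to those hidden values, opponents are paired uniformly at random rather than by similar Elo value, and the numbers (`k=16`, 20,000 games, four players) are arbitrary.

```python
import random


def expected_score(theta_i, theta_j):
    """Expected score E_i from formula [1]."""
    return 1.0 / (1.0 + 10.0 ** ((theta_j - theta_i) / 400.0))


def simulate(true_skill, n_games=20000, k=16, seed=0):
    """Start everyone at 1,000 and apply formula [2] after each game."""
    rng = random.Random(seed)
    elo = [1000.0] * len(true_skill)
    for _ in range(n_games):
        # Assumption: opponents are paired uniformly at random.
        i, j = rng.sample(range(len(true_skill)), 2)
        # Assumption: the result is drawn from the hidden true skills.
        p_i = expected_score(true_skill[i], true_skill[j])
        s_i = 1.0 if rng.random() < p_i else 0.0
        # The system only sees the current Elo values.
        e_i = expected_score(elo[i], elo[j])
        elo[i] += k * (s_i - e_i)
        elo[j] += k * ((1.0 - s_i) - (1.0 - e_i))
    return elo


true_skill = [800, 1000, 1200, 1400]  # hypothetical hidden skills
final = simulate(true_skill)
print([round(x) for x in final])
```

Since every update moves points from one player to the other, the average Elo value stays at 1,000. The simulation can therefore recover only the differences between players, not the absolute level of the hidden skills.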

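Applications

Chess

In the early twenty-first century, Elo is the most commonly used method of rating players in chess. Chess organisations such as the International Chess Federation^{[歐 5]}, as well as online chess platforms, rely on the Elo system to assess how strong players are. Arpad Elo^{[歐 1]}, the creator of the system, was himself a keen chess player who reached master level^[9].

In general, a chess organisation has a scheme stating that players whose Elo value falls within a given range may be awarded a given title, though organisations differ somewhat in the details^[10]. Take the International Chess Federation as an example: under its 2023 regulations, a player's default Elo value is 1,000; as the player competes against others in chess tournaments, the official record adjusts the player's Elo value according to the opponents' Elo values and the results of the games. To earn the title of Grandmaster^{[歐 6]}, a player must reach an Elo value of at least 2,500, and the opponents beaten must themselves have reasonably high Elo values^[11].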

Esports

Advanced variants

Criticism

Brief history

See also

Notes (註釋)

↑ The original English sentence reads: "The measurement of the rating of an individual might well be compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yardstick tied to a rope and which is swaying in the wind."
↑ ^2.0 ^2.1 This assumes that the game is a game of skill.
↑ Set aside for now what the unit of measurement is.

Original-language terms (歐詞)

↑ ^1.0 ^1.1 Arpad Elo
↑ zero-sum
↑ Elo
↑ normal distribution
↑ Fédération Internationale des Échecs，FIDE
↑ Grandmaster，GM

Further reading

Berg, A. (2020). Statistical analysis of the elo rating system in chess (PDF). Chance, 33(3), 31-38.
Hu, W., & Barradas, D. (2023, July). Work in Progress: A Glance at Social Media Self-Censorship in North America. In 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) (pp. 609-618). IEEE. Elo ratings can also be used to measure subjective concepts: this study, for example, repeatedly asked participants which of two sentences was less suitable for posting on social media, and used the resulting "Elo values" of the sentences as a measure of "appropriateness".
Morrison, B. (2019). Comparing elo, glicko, irt, and bayesian irt statistical models for educational and gaming data (PDF). University of Arkansas.
Pelánek, R. (2016). Applications of the Elo rating system in adaptive educational systems. Computers & Education, 98, 169-179.

References

↑ ^1.0 ^1.1 Elo, A. E. (1978). The Rating of Chessplayers, Past and Present (PDF) 2nd Ed. New York: Arco Pub.
↑ Batchelder, W. H., & Bershad, N. J. (1979). The statistical analysis of a Thurstonian model for rating chess players. Journal of Mathematical Psychology, 19(1), 39-60.
↑ Elo Rating System. Chess Terms.
↑ Von Neumann, John; Oskar Morgenstern (2007). Theory of games and economic behavior (60th anniversary ed.). Princeton: Princeton University Press.
↑ Berg, A. (2020). Statistical analysis of the elo rating system in chess (PDF). Chance, 33(3), 31-38.
↑ ^6.0 ^6.1 ^6.2 Lehmann, R., & Wohlrabe, K. (2017). Who is the 'Journal Grand Master'? A new ranking based on the Elo rating system. Journal of Informetrics, 11(3), 800-809.
↑ Pelánek, R. (2016). Applications of the Elo rating system in adaptive educational systems. Computers & Education, 98, 169-179.
↑ Glickman, M. E., & Jones, A. C. (1999). Rating the chess rating system. Chance, 12(2), 21-28.
↑ The Rating of Chess Players, Past and Present (First Edition 1978, Second Edition 1986), Arco.
↑ Elo Rating System – Everything You Need to Know. Chess Klub.
↑ FIDE HANDBOOK. Section B. 01.
