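Behavioral reinforcement

Behavioral reinforcement (Cantonese: 行為強化; Jyutping: hang4 wai4 koeng4 faa3), usually just called reinforcement, is a concept that operant conditioning necessarily involves in psychology and ethology. At its most basic, a behavioral reinforcer is an outcome that "reinforces" the behavior that produced it.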

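To illustrate, imagine shutting a rat inside a Skinner box^[1]. Each time it presses the response lever, food drops out of the dispenser. Empirical research has shown that the food makes the rat more inclined to press the lever, so that it presses more often or harder, for example. Throughout this process, the outcome "getting food" is the reinforcer, and it reinforces the behavior "pressing the response lever". Put simply, a reinforcer can be thought of as a reward that raises an animal's motivation to perform a certain behavior^[2]^[3].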

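Psychological and neuroscientific research on reinforcement has broad practical value. Such research asks questions like "which brain regions control the reinforcement process?", and its findings are applied in many fields, including psychiatry (e.g., how can medication be used to manage impulsivity and other mental disorders that stem from dysfunction in reinforcement-related processes?)^[4], artificial intelligence (e.g., how can an AI program be taught to learn through reinforcement the way a real animal does?)^[5], and game design (e.g., how can a game be designed so that it most strongly reinforces the behavior "keep playing"?)^[6].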

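Basics

Operant conditioning is a concept in psychology and ethology. It refers broadly to an animal coming to associate a stimulus ( $s$ ) with a response ( $r$ ) because responding to $s$ with $r$ consistently leads to certain consequences. In other words, the stimulus-response relationship changes, and the usual result is that the link between $s$ and $r$ becomes stronger or weaker. By definition, a consequence that strengthens $r$ is called reinforcement, while a consequence that weakens $r$ is called punishment^[7].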

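"Reinforcing" a behavior can mean any of the following^[7]:

making the behavior occur more frequently (e.g., pressing a button more often);
making the behavior last longer (e.g., holding the button down longer on each press);
raising the intensity of the behavior (e.g., pressing the button harder);
shortening the latency of the behavior (e.g., pressing the button sooner) ... and so on.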

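As a simple example, imagine a researcher places a rat in a Skinner box that contains a lever ( $s$ ). Whenever the rat presses the lever ( $r$ ), the box automatically drops food for the rat to eat. Empirical research has shown that this leads the rat to learn that "pressing the lever" ( $r$ ) and "getting food" (the outcome) are connected, after which it presses the lever again and again ( $r$ is reinforced). "Getting food" is a reinforcer: it "reinforces" the behavior that the animal knows will bring food, raising both the probability that the animal performs the behavior and the intensity with which it does so (e.g., how hard it presses the lever). Behavioral reinforcement is readily observable in many animal species, including humans^[7].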
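The lever-pressing example can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model taken from the cited studies: the update rule (nudging the press probability toward 1 after every rewarded press) and the names `simulate_skinner_box`, `learning_rate`, and `initial_press_prob` are all hypothetical.

```python
import random


def simulate_skinner_box(trials=200, learning_rate=0.1,
                         initial_press_prob=0.05, seed=0):
    """Toy model: each rewarded lever press makes the next press more likely.

    Returns the press probability recorded after every trial.
    The update rule is an illustrative assumption, not an empirical law.
    """
    rng = random.Random(seed)
    press_prob = initial_press_prob
    history = []
    for _ in range(trials):
        pressed = rng.random() < press_prob
        if pressed:
            # Food drops after the press; the outcome reinforces pressing,
            # so the press probability moves a step closer to 1.
            press_prob += learning_rate * (1.0 - press_prob)
        history.append(press_prob)
    return history


history = simulate_skinner_box()
print(f"press probability: first trial {history[0]:.2f}, "
      f"last trial {history[-1]:.2f}")
```

Under these assumed settings the press probability can only stay flat or rise, which mirrors the qualitative claim that food makes pressing more likely. It says nothing about how fast real animals learn.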

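A bee that has found that "pushing open a stopper" yields sucrose learns to push the stopper every time it sees one, demonstrating operant conditioning.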

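Conceptually, behavioral reinforcement can be divided into two types, positive and negative^[8]^[9]:

Positive reinforcement is when a behavior brings the animal something it wants, raising the probability of that behavior. For example, imagine a person who takes a drug and then feels intense excitement, which makes them want to take the drug again. Here the "feeling of excitement" is a positive reinforcer that raises the probability of the behavior "taking drugs".
Negative reinforcement is when a behavior helps the animal avoid something it dislikes, raising the probability of that behavior. For example, imagine again a person who takes drugs and has become addicted: they need to use every day to feel comfortable, and going without brings unpleasant symptoms such as dizziness, insomnia, and diarrhea^{[Note 1]}. Using relieves these symptoms, which makes them want to use again. Here "relief from the unpleasant symptoms" is a negative reinforcer that raises the probability of the behavior "taking drugs"^[10]^[11].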
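The distinction between the two types can be written down as a small lookup. This is a minimal sketch of the definitions above, not an established classification scheme: the `Consequence` class, its two boolean fields, and `reinforcement_types` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Consequence:
    """What a behavior leads to, from the animal's point of view (hypothetical)."""
    description: str
    brings_wanted_thing: bool      # the behavior adds something the animal wants
    removes_disliked_thing: bool   # the behavior removes or avoids something disliked


def reinforcement_types(consequence: Consequence) -> list[str]:
    """Return which kinds of reinforcement the consequence provides."""
    types = []
    if consequence.brings_wanted_thing:
        types.append("positive")
    if consequence.removes_disliked_thing:
        types.append("negative")
    return types


# The two drug-use examples from the text.
excitement = Consequence("feeling of excitement after use", True, False)
relief = Consequence("unpleasant symptoms relieved by use", False, True)

print(reinforcement_types(excitement))
print(reinforcement_types(relief))
```

Returning a list rather than a single label is a design choice: one behavior, such as drug use in the examples, can be reinforced in both ways at once.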

Evolutionary origins

Animals know to want food without having to learn it.

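At the most basic level, behavioral reinforcers can be viewed from the perspective of evolutionary psychology and divided into two broad kinds, primary reinforcers and secondary reinforcers:

A primary reinforcer is something an animal wants without having to learn to want it (that is, it has a reinforcing effect on the animal without any learning), such as food and water. The usual view is that, over a species' evolutionary history, food and water were necessary for survival, so individuals that performed "behaviors that bring food and water" without needing to learn them survived better, and all extant animal species have evolved to perform behaviors that bring these reinforcers instinctively. However, the reinforcing effect ( $v$ ) of a primary reinforcer can be influenced by various factors. Individuals of the same species can differ in how they respond to a primary reinforcer (e.g., some people have bigger appetites than others, so the $v$ of food differs from person to person). In addition, $v$ can be affected by internal factors: a hungry animal is especially motivated to seek food, so the $v$ of food for that animal rises in the short term^[12]^[13].
A secondary reinforcer is a reinforcer whose value an animal comes to know only through learning. For example, take a dog, for which food is a primary reinforcer. If its owner says "good boy" every time before giving it a dog biscuit, the dog associates the phrase with the biscuit, and being praised as a "good boy" may itself become reinforcing for the dog. The phrase has then become a secondary reinforcer^[14].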
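The two kinds of reinforcer can be illustrated with a short numerical sketch. Everything here is an assumption made for illustration: scaling the value $v$ of food by a hunger level between 0 and 1 is a hypothetical choice, and the pairing step borrows a simple delta-rule update (in the style of the Rescorla-Wagner model), which the source does not name.

```python
def primary_value(base_value, hunger):
    """Effective value v of food: a base value scaled by hunger in [0, 1].

    The multiplicative form is an illustrative assumption.
    """
    return base_value * hunger


def pair_with_primary(secondary_value, primary_v, learning_rate=0.2):
    """One pairing of a cue (e.g. the phrase "good boy") with a primary reinforcer.

    Delta-rule update: the cue's learned value moves a fraction of the way
    toward the value of the reinforcer it predicts.
    """
    return secondary_value + learning_rate * (primary_v - secondary_value)


# A fairly hungry dog: the biscuit has a high value v.
food_v = primary_value(base_value=1.0, hunger=0.8)

# The phrase starts with no value and gains value over repeated pairings.
praise_value = 0.0
for _ in range(20):
    praise_value = pair_with_primary(praise_value, food_v)

print(f"value of food: {food_v:.2f}, learned value of praise: {praise_value:.2f}")
```

In this sketch the phrase's value starts at zero and approaches, but never exceeds, the value of the food it has been paired with.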

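Schedules of reinforcement

A schedule of reinforcement describes whether "when a reinforcer appears" and "whether it appears at all" stand in a reliable relationship to the corresponding behavior.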
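To make "reliable relationship" concrete, the sketch below compares three rules for deciding whether a response is followed by a reinforcer. The source names no specific schedules, so the three shown here (every response, every n-th response, and each response with a fixed probability) and all function names are illustrative assumptions.

```python
import random


def continuous_schedule():
    """Every response is reinforced: a fully reliable relationship."""
    def deliver(response_count):
        return True
    return deliver


def fixed_ratio_schedule(n):
    """Every n-th response is reinforced."""
    def deliver(response_count):
        return response_count % n == 0
    return deliver


def random_schedule(p, seed=0):
    """Each response is reinforced with probability p: an unreliable relationship."""
    rng = random.Random(seed)
    def deliver(response_count):
        return rng.random() < p
    return deliver


def count_rewards(schedule, responses=100):
    """Number of reinforcers delivered over a run of responses (numbered from 1)."""
    return sum(schedule(i) for i in range(1, responses + 1))


print("continuous:", count_rewards(continuous_schedule()))
print("fixed ratio 5:", count_rewards(fixed_ratio_schedule(5)))
print("random p=0.5:", count_rewards(random_schedule(0.5)))
```

Each schedule is a function from the running response count to a yes/no decision, so further rules, such as time-based ones, could be added without changing `count_rewards`.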

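Individual differences

Mathematical models

Applications in psychiatry

Addiction

Applications in artificial intelligence

Miscellaneous applications

Psychological manipulation

Animal training

Economic research

Nudge theory

Game design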

Notes

[Note 1] In the real world, many drugs do indeed have effects of this kind.

See also

Further reading

Mitchell, J. T., Kimbrel, N. A., Hundt, N. E., Cobb, A. R., Nelson‐Gray, R. O., & Lootens, C. M. (2007). An analysis of reinforcement sensitivity theory and the five‐factor model. European Journal of Personality, 21(7), 869-887.

References

[1] Pineño, O. (2014). ArduiPod Box: A low-cost and open-source Skinner box using an iPod Touch and an Arduino microcontroller. Behavior Research Methods, 46(1), 196-205.
[2] Glickman, S. E., & Schiff, B. B. (1967). A biological theory of reinforcement. Psychological Review, 74(2), 81.
[3] Wiegand, D. M., & Geller, E. S. (2005). Connecting positive psychology and organizational behavior management: Achievement motivation and the power of positive reinforcement. Journal of Organizational Behavior Management, 24(1-2), 3-25.
[4] Robbins, T. W., Gillan, C. M., Smith, D. G., de Wit, S., & Ersche, K. D. (2012). Neurocognitive endophenotypes of impulsivity and compulsivity: towards dimensional psychiatry. Trends in Cognitive Sciences, 16(1), 81-91.
[5] Wiering, M., & Van Otterlo, M. (2012). Reinforcement learning. Adaptation, Learning, and Optimization, 12(3).
[6] Kim, J. The Compulsion Loop Explained. Gamasutra. Archived at the Internet Archive on 16 January 2020.
[7] Schacter, D. L., Gilbert, D. T., & Wegner, D. M. (2011). "B. F. Skinner: The role of reinforcement and Punishment", subsection in: Psychology; Second Edition. New York: Worth, Incorporated, 278-288.
[8] Flora, S. (2004). The Power of Reinforcement. Albany: State University of New York Press. p. 253.
[9] Valenchon, M., Lévy, F., Moussu, C., & Lansade, L. (2017). Stress affects instrumental learning based on positive or negative reinforcement in interaction with personality in domestic horses. PloS One, 12(5), e0170783.
[10] Pantazis, C. B., Gonzalez, L. A., Tunstall, B. J., Carmack, S. A., Koob, G. F., & Vendruscolo, L. F. (2021). Cues conditioned to withdrawal and negative reinforcement: Neglected but key motivational elements driving opioid addiction. Science Advances, 7(15), eabf0364.
[11] Magoon, M. A., Critchfield, T. S., Merrill, D., Newland, M. C., & Schneider, W. J. (2017). Are positive and negative reinforcement "different"? Insights from a free‐operant differential outcomes effect. Journal of the Experimental Analysis of Behavior, 107(1), 39-64.
[12] Fleischman, D. S. (2016). An evolutionary behaviorist perspective on orgasm. Socioaffective Neuroscience & Psychology, 6(1), 32130.
[13] Buss, D. M. (2020). Evolutionary psychology is a scientific revolution. Evolutionary Behavioral Sciences.
[14] Bersh, P. J. (1951). The influence of two variables upon the establishment of a secondary reinforcer for operant responses. Journal of Experimental Psychology, 41(1), 62.
