Logging (computing)
In other words, a typical log looks something like the following[1]:
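By the early 21st century, logging had come into very wide use. When debugging, programmers often read the log files a program produces while it runs in order to track down where things went wrong. In addition, people working in information systems found that logs can be used to analyze how users behave when using a computing product, and these analyses gave rise to the field of process mining[2][3].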
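Because logging is so useful, many computer systems include it: mainstream operating systems have built-in logging facilities, individual programs can write their own logs to designated files[note 1], and servers typically also record which subsystem or program wrote each log line (a server log). Third-party utilities that provide logging are available as well[4][5].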
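As a minimal sketch of a program writing its own log to a designated file, the following uses Python's standard `logging` module. The file location and the logger name `demo.payments` are hypothetical, chosen only for illustration; the logger name plays the role of "which subsystem wrote this line".

```python
import logging
import os
import tempfile

# Hypothetical log location; a real program would choose its own path.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# Send records to a file, tagging each line with time, logger name and level.
handler = logging.FileHandler(log_path, encoding="utf-8")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)

logger = logging.getLogger("demo.payments")  # hypothetical subsystem name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("order accepted")
logger.warning("retrying connection")
handler.flush()

# Read the file back to see what was recorded.
with open(log_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

Each entry in `lines` carries a timestamp, the name of the logger that wrote it, a severity level and the message.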
Basic concepts
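Log files have no prescribed format. Logs from different systems or programs can look completely different. For example, the access log of nginx (a web server) can look like the following, recording the source IP address, the date and time, the file requested, the browser the client claims to be, and so on: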
10.11.12.13 - - [14/Mar/2023:22:07:20 +0000] "GET /xxx HTTP/1.1" 404 142 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:77.0; zzz) Gecko/20190101 Firefox/77.0"
10.11.12.13 - - [14/Mar/2023:22:07:20 +0000] "GET /yyy HTTP/1.1" 404 142 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:77.0; zzz) Gecko/20190101 Firefox/77.0"
...
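The fields in such a line can be pulled apart with a regular expression. The sketch below assumes the line follows the layout of the sample above (which matches nginx's "combined" format); the names `LINE_RE` and `parse_line` are illustrative, and a server configured with a custom format would need a different pattern.

```python
import re
from datetime import datetime

# Pattern for one access-log line in the layout of the sample above.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Convert the timestamp and status code into typed values.
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    return rec

sample = (
    '10.11.12.13 - - [14/Mar/2023:22:07:20 +0000] "GET /xxx HTTP/1.1" '
    '404 142 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:77.0; zzz) '
    'Gecko/20190101 Firefox/77.0"'
)
record = parse_line(sample)
```

Returning `None` for non-matching lines lets a caller skip or count malformed entries instead of crashing on them.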
By contrast, a LaTeX log file looks like the following. Its entries carry no timestamps at all; it simply records everything that was displayed:
This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex 2016.10.1) 22 DEC 2017 14:45
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
**a
(./a.tex
LaTeX2e <2016/02/01>
Babel <3.9q> and hyphenation patterns for 81 language(s) loaded.
...
Process mining
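Process mining (PM) is a family of data science methods for analyzing log files, with the aim of extracting from log data knowledge about what regularities people's behavior follows[2][6]. Many different algorithms can be used for PM, but they all come down to one of three tasks[note 2]: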
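- Discovery: build, from the data, a model that describes people's behavior;
- Conformance: quantify how far the behavior predicted by the existing model deviates from what the data actually show;
- Extension: work out how the existing model can be adjusted so that it predicts people's behavior more accurately.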
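To make the discovery task concrete, here is a minimal sketch that builds a directly-follows graph, one of the simplest behavioral models, from an event log. The event data, the tuple layout `(case, timestamp, activity)` and the function names are all hypothetical; real discovery algorithms are considerably more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, timestamp, activity).
events = [
    ("u1", 1, "A"), ("u1", 2, "B"), ("u1", 3, "D"),
    ("u2", 1, "A"), ("u2", 2, "C"), ("u2", 3, "D"),
    ("u3", 1, "A"), ("u3", 2, "B"), ("u3", 3, "D"),
]

def build_traces(events):
    """Group events by case and order each case's activities by time."""
    by_case = defaultdict(list)
    for case, _ts, activity in sorted(events, key=lambda e: (e[0], e[1])):
        by_case[case].append(activity)
    return dict(by_case)

def directly_follows(traces):
    """Count how often activity b comes immediately after activity a."""
    counts = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

traces = build_traces(events)
dfg = directly_follows(traces)
```

The resulting counts say, for instance, that A was followed by B twice and by C once, which is already a rough description of how users move through the process.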
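Analysts usually tidy up log data to some degree before using it. This includes removing records that are obviously erroneous, and may involve lightly transforming values in the data: for example, given the information "at which points in time during this period did the user press a button", one can easily derive "how many times did the user press a button during that period"[7]. The analyst can then use the data to build statistical models and use the models' equations to simulate user behavior.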
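The cleaning and transformation steps described above can be sketched as follows. The raw records, the 60-second session length and the rule for what counts as "obviously erroneous" (a missing timestamp or one outside the session) are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical raw click log: (user, seconds since session start).
raw = [
    ("alice", 1.5), ("alice", 4.0), ("bob", 2.2),
    ("alice", -3.0),  # negative time: clearly a recording error
    ("bob", None),    # missing timestamp
    ("bob", 7.9),
]

def clean(rows, session_length=60.0):
    """Drop rows whose timestamp is missing or outside the session."""
    return [
        (user, t) for user, t in rows
        if t is not None and 0 <= t <= session_length
    ]

def clicks_per_user(rows):
    """Turn per-click timestamps into a click count for each user."""
    return Counter(user for user, _t in rows)

cleaned = clean(raw)
counts = clicks_per_user(cleaned)
```

Here the timestamps ("when did the user click") are reduced to a simpler derived quantity ("how many times did the user click") that a statistical model can use directly.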
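For example, the figure below shows a typical process mining model, which pictures user behavior as "first do A, then reach a decision point and decide whether to go on to B, C or E...". The analyst can then check how far the behavioral patterns predicted by this model are from what the data actually show.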
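One very simple way to compare such a model with the data is to count how many observed traces the model can reproduce. The sketch below assumes a hypothetical model in which A is followed by a decision point leading to B, C or E, and a made-up list of observed traces; it is a deliberately crude stand-in for the conformance measures used in practice.

```python
# Hypothetical model: each activity maps to the activities allowed next.
MODEL = {"A": {"B", "C", "E"}, "B": set(), "C": set(), "E": set()}
START = "A"

def fits(trace, model=MODEL, start=START):
    """True if the trace starts at `start` and every step is allowed."""
    if not trace or trace[0] != start:
        return False
    return all(b in model.get(a, set()) for a, b in zip(trace, trace[1:]))

def fitness(traces):
    """Fraction of observed traces that the model can reproduce."""
    return sum(fits(t) for t in traces) / len(traces)

# Made-up observations; the last trace takes a branch the model lacks.
observed = [["A", "B"], ["A", "C"], ["A", "E"], ["A", "D"]]
score = fitness(observed)
```

A score below 1 signals behavior the model does not account for, such as the A-to-D step here, which is the kind of gap an extension step would try to close.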
See also
Notes
Further reading
- Guénégo, M., & Deneckère, R. (2022, December). Can Process Mining Detect Video Game Addiction Through Player's Character Class Behavior?. In 19th European Mediterranean & Middle Eastern Conference on Information Systems (EMCIS 2022).
- Macak, M., Daubner, L., Jamnicka, J., & Buhnova, B. (2022). Game Achievement Analysis: Process Mining Approach. In Advanced Data Mining and Applications: 17th International Conference, ADMA 2021, Cham: Springer International Publishing.
- Ramadan, S., Baqapuri, H. I., Roecher, E., & Mathiak, K. (2019, June). Process mining of logged gaming behavior. In 2019 International Conference on Process Mining (ICPM) (pp. 57-64). IEEE. This paper analyzes people's behavior while playing FPS games, using logging to record information of the form "the player did such-and-such at such-and-such a time".
- Rojas, E., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2016). Process mining in healthcare: A literature review. Journal of biomedical informatics, 61, 224-236.
- Van Der Aalst, W. (2012). Process mining: Overview and opportunities. ACM Transactions on Management Information Systems (TMIS), 3(2), 1-17.
References
- DeLaRosa, Alexander (February 8, 2018). "Log Monitoring: not the ugly sister". Pandora FMS. Archived from the original on February 14, 2018. Retrieved February 14, 2018. "A log file is a text file or XML file used to register the automatically produced and time-stamped documentation of events, behaviors and conditions relevant to a particular system."
- Van Der Aalst, W. (2012). Process mining: Overview and opportunities. ACM Transactions on Management Information Systems (TMIS), 3(2), 1-17.
- Peters, T. A. (1993). The history and development of transaction log analysis (PDF). Library hi tech.
- Rice, R. E., & Borgman, C. L. (1983). The use of computer‐monitored data in information science and communication research. Journal of the American Society for Information Science, 34(4), 247-256.
- "The Transaction Log (SQL Server) - SQL Server". learn.microsoft.com.
- Ma'arif, M. R. (2017, September). Revealing daily human activity pattern using process mining approach (PDF). In 2017 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI) (pp. 1-5). IEEE.
- Ramadan, S., Baqapuri, H. I., Roecher, E., & Mathiak, K. (2019, June). Process mining of logged gaming behavior. In 2019 International Conference on Process Mining (ICPM) (pp. 57-64). IEEE.