Tf-idf (in full, term frequency-inverse document frequency) is, in information retrieval, a general term for numerical statistics that reflect how important a word is to a document.
The term frequency of a word is the number of times the word occurs in a document divided by the total number of words in that document.
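
The term frequency definition above can be illustrated with a minimal Python sketch. The function name `term_frequency` is hypothetical, and splitting the sample sentence on whitespace is a simplifying assumption; real systems typically apply more careful tokenization and normalization.

```python
from collections import Counter


def term_frequency(term, document_tokens):
    """Occurrences of `term` divided by the total number of tokens.

    `document_tokens` is assumed to be a list of already-tokenized words.
    """
    if not document_tokens:
        # An empty document has no words, so define the frequency as 0.
        return 0.0
    counts = Counter(document_tokens)
    return counts[term] / len(document_tokens)


# Illustrative document, tokenized by whitespace (an assumption).
tokens = "the cat sat on the mat".split()
print(term_frequency("the", tokens))  # "the" occurs 2 times out of 6 tokens
```

Here "the" accounts for 2 of the 6 tokens, so its term frequency is 2/6. A word that does not occur in the document gets a term frequency of 0.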
- Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513-523.