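Text segmentation (Cantonese: 文字分割) is the process of dividing a piece of text to be processed into a number of units, each meaningful on its own, so that further analysis or other processing becomes easier. Common cases include splitting a passage into sentences or into individual words.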

For example:[1]

Input: San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America. According to 2015 mid-year estimates, the town has a population of about 16,444. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.

Output:
1. San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America.
2. According to 2015 mid-year estimates, the town has a population of about 16,444.
3. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.

(The text has been split into separate sentences.)
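
The kind of sentence split shown in the example above can be approximated with a simple punctuation rule. The following is a minimal sketch in Python, not the method of the cited paper: it assumes that a sentence ends at `.`, `!` or `?` followed by whitespace and an uppercase letter. The names `split_sentences` and `_BOUNDARY` are hypothetical and chosen only for illustration.

```python
import re
from typing import List

# Assumed rule: a boundary is whitespace that follows '.', '!' or '?'
# and precedes an uppercase ASCII letter. This is deliberately naive:
# abbreviations such as "Dr. Smith" would be split incorrectly.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text: str) -> List[str]:
    """Split `text` into sentences using the punctuation rule above."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


text = (
    "San Pedro is a town on the southern part of the island of Ambergris Caye "
    "in the Belize District of the nation of Belize, in Central America. "
    "According to 2015 mid-year estimates, the town has a population of about 16,444."
)

# Print each detected sentence on its own numbered line.
for number, sentence in enumerate(split_sentences(text), start=1):
    print(f"{number}. {sentence}")
```

A rule like this works on clean prose such as the sample text, but it is likely to fail on abbreviations, quotations, and sentences that start with a digit or a lowercase letter, which is why practical segmenters tend to rely on richer rules or trained models.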


  1. Freddy Y. Y. Choi (2000). "Advances in domain independent linear text segmentation". Proceedings of the 1st Meeting of the North American Chapter of the Association for Computational Linguistics (ANLP-NAACL-00). pp. 26–33.