Discourse segmentation refers to the task of breaking a given text into a sequence of elementary discourse units (EDUs). EDUs are clause-like units that serve as building blocks for discourse parsing in Rhetorical Structure Theory.
For example, the following sentence contains three EDUs (marked here with brackets): [Sheraton and Pan Am said] [they are assured under the Soviet joint-venture law] [that they can repatriate profits from their hotel venture.]
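One way to picture a segmenter's output is as a list of boundary positions over the tokens of the input. The sketch below is only an illustration and is not the interface of the released tool: the helper `split_into_edus`, the whitespace tokenization, and the boundary indices (chosen by hand for the example sentence) are all assumptions.

```python
from typing import List


def split_into_edus(tokens: List[str], boundaries: List[int]) -> List[str]:
    """Split tokens into segments.

    Each entry in `boundaries` is the index of the LAST token of a segment
    (hypothetical convention chosen for this illustration).
    """
    edus = []
    start = 0
    for end in boundaries:
        edus.append(" ".join(tokens[start:end + 1]))
        start = end + 1
    return edus


sentence = (
    "Sheraton and Pan Am said they are assured under the Soviet "
    "joint-venture law that they can repatriate profits from their hotel venture."
)
tokens = sentence.split()  # naive whitespace tokenization (assumption)

# Hand-picked boundaries for illustration: after "said", after "law",
# and at the final token of the sentence.
boundaries = [4, 12, len(tokens) - 1]

edus = split_into_edus(tokens, boundaries)
for i, edu in enumerate(edus, start=1):
    print(f"EDU {i}: {edu}")
```

A real segmenter would predict the boundary indices from the text instead of taking them as input; the reconstruction step from boundaries to units stays the same.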
We have released the pre-trained model as an off-the-shelf tool for downstream applications.
@inproceedings{DBLP:conf/ijcai/LiSJ18,
  author    = {Jing Li and Aixin Sun and Shafiq R. Joty},
  editor    = {J{\'{e}}r{\^{o}}me Lang},
  title     = {SegBot: {A} Generic Neural Text Segmentation Model with Pointer Network},
  booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, {IJCAI} 2018, July 13-19, 2018, Stockholm, Sweden},
  pages     = {4166--4172},
  publisher = {ijcai.org},
  year      = {2018},
  url       = {https://doi.org/10.24963/ijcai.2018/579},
  doi       = {10.24963/ijcai.2018/579},
  timestamp = {Tue, 20 Aug 2019 16:19:08 +0200},
  biburl    = {https://dblp.org/rec/conf/ijcai/LiSJ18.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}