Rhetorical Structure Theory

Rhetorical Structure Theory (RST) provides a hierarchical framework for analyzing discourse organization by decomposing texts into elementary discourse units connected through nucleus-satellite and multinuclear rhetorical relations.

T = RST-Tree(EDUs, Relations)

Rhetorical Structure Theory, developed by William Mann and Sandra Thompson in the 1980s, describes how natural language texts are organized and has become a foundational framework in computational discourse analysis. RST posits that coherent texts can be represented as hierarchical tree structures in which elementary discourse units (EDUs), typically clauses, are recursively combined through rhetorical relations. These relations describe the functional role each text span plays relative to its neighbors: one span may elaborate on another, provide evidence for a claim, or present a contrasting viewpoint.

Core Principles and Relations

RST Tree Structure

Elementary Discourse Units: EDU₁, EDU₂, …, EDUₙ

Nucleus-Satellite: R_ns(N, S) — nucleus is essential, satellite supports
Multinuclear: R_mn(N₁, N₂) — both spans equally important

Common relations: Elaboration, Cause, Contrast, Condition, Evidence, Concession, Background, Joint, Sequence

RST distinguishes between two types of rhetorical relations. In nucleus-satellite relations, one span (the nucleus) carries the primary information while the other (the satellite) plays a supporting role. For example, in an Elaboration relation, the satellite provides additional detail about the nucleus. In multinuclear relations such as List or Contrast, both spans are equally important. The original RST proposal identified approximately 25 relations, though subsequent work has both expanded and refined this inventory. A key structural constraint is that RST trees are projective: every node covers a contiguous stretch of the original text, so only adjacent spans can be joined by a relation.
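The structure described above can be sketched as a small data type. This is a minimal illustration, not a standard library or corpus format: the names `Node`, `leaf`, `combine`, and `is_multinuclear` are hypothetical, EDUs are referred to by index, and nuclearity is stored on each child as a role relative to its parent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A span in an RST tree: a leaf covers one EDU, an internal node covers several."""
    start: int                       # index of the first EDU covered (inclusive)
    end: int                         # index of the last EDU covered (inclusive)
    role: str = "N"                  # "N" (nucleus) or "S" (satellite), relative to the parent
    relation: Optional[str] = None   # relation joining the children; None for leaves
    children: Tuple["Node", ...] = ()


def leaf(index: int, role: str = "N") -> Node:
    """Create a leaf node for a single EDU."""
    return Node(index, index, role)


def combine(relation: str, children: List[Node], role: str = "N") -> Node:
    """Join adjacent spans under one relation, enforcing contiguity and nuclearity."""
    if len(children) < 2:
        raise ValueError("a relation joins at least two spans")
    for left, right in zip(children, children[1:]):
        if right.start != left.end + 1:
            raise ValueError("children must be adjacent, contiguous spans")
    if not any(child.role == "N" for child in children):
        raise ValueError("a relation needs at least one nucleus")
    return Node(children[0].start, children[-1].end, role, relation, tuple(children))


def is_multinuclear(node: Node) -> bool:
    """True if every child of an internal node is a nucleus."""
    return bool(node.children) and all(c.role == "N" for c in node.children)


# Three EDUs: EDU 1 is a satellite of EDU 0 (Cause); the result contrasts with EDU 2.
cause = combine("Cause", [leaf(0), leaf(1, role="S")])
tree = combine("Contrast", [cause, leaf(2)])
print(tree.start, tree.end, is_multinuclear(tree), is_multinuclear(cause))
```

Storing the role on the child instead of the parent keeps nucleus-satellite and multinuclear relations in one representation: a relation is multinuclear exactly when all of its children are nuclei.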

RST Parsing

Computational RST parsing involves three subtasks: segmenting text into EDUs, determining which adjacent spans should be connected (and which of them is the nucleus), and labeling the connecting relations. Early approaches used hand-crafted rules, while statistical parsers such as HILDA (Hernault et al., 2010) and the parser of Joty et al. (2013) applied discriminative models with rich linguistic features. More recent neural approaches use hierarchical attention networks or pointer networks to build RST trees, with reported F1 scores above 60% for full labeled parsing on the RST Discourse Treebank. EDU segmentation, by contrast, is considered largely solved, with modern systems exceeding 95% F1.
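To make the second and third subtasks concrete, here is a minimal sketch of greedy bottom-up tree building over already-segmented EDUs. It is an illustration only, not the algorithm of any parser named above: `greedy_parse`, `toy_score`, and `toy_label` are hypothetical names, nuclearity is omitted for brevity, and in a real system the scoring and labeling functions would be learned models.

```python
from typing import Callable, Tuple, Union

Tree = Union[int, tuple]        # an EDU index, or (relation, left_subtree, right_subtree)
Span = Tuple[int, int, Tree]    # (first EDU, last EDU, subtree covering them)


def greedy_parse(
    n_edus: int,
    score: Callable[[Span, Span], float],
    label: Callable[[Span, Span], str],
) -> Tree:
    """Repeatedly merge the highest-scoring pair of adjacent spans into one span."""
    if n_edus < 1:
        raise ValueError("need at least one EDU")
    spans = [(i, i, i) for i in range(n_edus)]  # start with one span per EDU
    while len(spans) > 1:
        # Only neighbours are candidates, which keeps every span contiguous.
        best = max(range(len(spans) - 1), key=lambda i: score(spans[i], spans[i + 1]))
        left, right = spans[best], spans[best + 1]
        merged = (left[0], right[1], (label(left, right), left[2], right[2]))
        spans[best:best + 2] = [merged]
    return spans[0][2]


def toy_score(left: Span, right: Span) -> float:
    """Stand-in scorer (assumption): prefers attaching EDU 1 to EDU 2 first."""
    return 1.0 if (left[0], right[1]) == (1, 2) else 0.0


def toy_label(left: Span, right: Span) -> str:
    """Stand-in labeler (assumption): always predicts Elaboration."""
    return "Elaboration"


print(greedy_parse(3, toy_score, toy_label))
```

Because only adjacent spans are ever merged, the output always satisfies the contiguity constraint. Greedy merging is the simplest decoding strategy; it cannot revise an early merge that later turns out to be wrong.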

RST Discourse Treebank

The RST Discourse Treebank (RST-DT), constructed by Carlson, Marcu, and Okurowski (2001), provides gold-standard RST annotations for 385 Wall Street Journal articles from the Penn Treebank. It defines 78 fine-grained and 18 coarse-grained relation types and has served as the primary benchmark for RST parsing research. Inter-annotator agreement on the corpus shows that humans agree on tree structure at roughly 83% and on relation labeling at roughly 66%, figures that are commonly treated as a practical ceiling for automatic systems.
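Agreement between two annotators and parser accuracy against a gold tree can both be measured as overlap between two sets of spans. The sketch below assumes each tree has been flattened into `(start, end, relation)` triples; `span_f1` and `unlabeled` are hypothetical names. Evaluation conventions differ in exactly which spans are counted, so this illustrates the idea and is not the scoring procedure used for RST-DT.

```python
from typing import Iterable, Set, Tuple

LabeledSpan = Tuple[int, int, str]   # (first EDU, last EDU, relation label)


def span_f1(gold: Iterable[tuple], pred: Iterable[tuple]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    gold_set, pred_set = set(gold), set(pred)
    matched = len(gold_set & pred_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def unlabeled(spans: Iterable[LabeledSpan]) -> Set[Tuple[int, int]]:
    """Drop relation labels so that only the tree structure is compared."""
    return {(start, end) for start, end, _ in spans}


# Two analyses of the same three EDUs: same bracketing, one differing label.
gold = {(0, 1, "Cause"), (0, 2, "Contrast")}
pred = {(0, 1, "Elaboration"), (0, 2, "Contrast")}

print(span_f1(unlabeled(gold), unlabeled(pred)))  # structure only
print(span_f1(gold, pred))                        # structure plus relation label
```

Comparing the two calls shows why structure scores are higher than relation scores: a labeled span counts as a match only when both its boundaries and its relation agree.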

Applications and Extensions

RST has been applied to a wide range of NLP tasks. In text summarization, nuclearity provides a natural importance ranking: nuclei tend to contain the most summary-worthy content. In essay scoring and argumentation mining, the distribution of rhetorical relations has been found to correlate with text quality. Sentiment analysis systems use RST structure to determine the scope and polarity of opinions. Cross-linguistic RST studies have examined whether the same relation inventory and tree structures apply across languages, with RST-annotated treebanks now available for Spanish, Portuguese, German, Chinese, and other languages.
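One way to turn nuclearity into an importance ranking is to "promote" each EDU upward for as long as it sits on a chain of nuclei, then rank EDUs by how close to the root they reach. The following is a simplified sketch of that idea, not a specific summarization system: `Node`, `promoted`, and `importance_depths` are hypothetical names, and relation labels are left out because only nuclearity is used.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class Node:
    role: str                          # "N" or "S", relative to the parent
    edu: int = -1                      # EDU index for leaves; -1 for internal nodes
    children: Tuple["Node", ...] = ()


def promoted(node: Node) -> Set[int]:
    """EDUs reachable from this node by following nucleus children only."""
    if not node.children:
        return {node.edu}
    result: Set[int] = set()
    for child in node.children:
        if child.role == "N":
            result |= promoted(child)
    return result


def importance_depths(root: Node) -> Dict[int, int]:
    """Map each EDU to the shallowest depth at which it is promoted (0 = root)."""
    depths: Dict[int, int] = {}

    def visit(node: Node, depth: int) -> None:
        for edu in promoted(node):
            depths.setdefault(edu, depth)  # keep the first, shallowest depth
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return depths


# EDU 0 is the nucleus of an inner span whose satellite is EDU 1;
# that inner span is in turn the nucleus of a relation with satellite EDU 2.
inner = Node("N", children=(Node("N", edu=0), Node("S", edu=1)))
root = Node("N", children=(inner, Node("S", edu=2)))

depths = importance_depths(root)
ranking = sorted(depths, key=lambda edu: (depths[edu], edu))
print(ranking)
```

A summary of length k can then be read off as the first k EDUs in the ranking, presented in their original text order. The sketch recomputes promotion sets at every node, which is fine for illustration but wasteful on long documents.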

Extensions of RST address limitations of the original framework. Some researchers have proposed graph-based representations to handle non-projective structures and multiple simultaneous relations. Others have integrated RST with genre theory, showing that different text types (news, scientific articles, editorials) exhibit characteristic distributions of rhetorical relations. The relationship between RST and other discourse frameworks, particularly PDTB and SDRT, continues to be a productive area of theoretical and empirical investigation.


References

  1. Mann, W. C., & Thompson, S. A. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3), 243–281. doi:10.1515/text.1.1988.8.3.243
  2. Carlson, L., Marcu, D., & Okurowski, M. E. (2001). Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory. Proceedings of the 2nd SIGdial Workshop on Discourse and Dialogue, 1–10. doi:10.3115/1118078.1118083
  3. Joty, S., Carenini, G., Ng, R. T., & Mehdad, Y. (2013). Combining intra- and multi-sentential rhetorical parsing for document-level discourse analysis. Proceedings of the 51st Annual Meeting of the ACL, 486–496.
