Context-sensitive languages (CSLs) form the third level of the Chomsky hierarchy, lying between context-free languages and recursively enumerable languages. They are generated by context-sensitive grammars (Type 1), in which production rules may replace a non-terminal only when it appears in a specific surrounding context. Equivalently, CSLs are exactly the languages recognized by linear bounded automata — nondeterministic Turing machines whose tape is limited to the length of the input. This class captures a range of phenomena that exceed context-free power, including some patterns observed in natural language.
Formal Definition
A context-sensitive grammar G = (V, Σ, R, S) has production rules of the form αAβ → αγβ, where A is a non-terminal, α and β are (possibly empty) strings of terminals and non-terminals providing the required context, and γ is a non-empty string of terminals and non-terminals. (The single exception is the rule S → ε, permitted when S does not appear on the right-hand side of any rule, so that languages containing the empty string can still be generated.) The non-emptiness requirement on γ ensures that no derivation step shrinks the string: context-sensitive grammars are non-contracting, meaning the sentential form never gets shorter during a derivation.
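The non-contracting property can be checked mechanically. The sketch below encodes the standard textbook grammar for {aⁿbⁿcⁿ : n ≥ 1} as (lhs, rhs) pairs and steps through one derivation by hand; the `apply_rule` helper and rule positions are illustrative choices, not part of any standard API.

```python
# The classic non-contracting grammar for {a^n b^n c^n : n >= 1},
# over terminals {a, b, c} and non-terminals {S, B}.
RULES = [
    ("S", "aSBc"),   # generate a matched a ... Bc pair
    ("S", "abc"),    # base case
    ("cB", "Bc"),    # move B leftward past c (rewriting in context)
    ("bB", "bb"),    # rewrite B to b when it sits next to a b
]

# Monotone (non-contracting) property: no rule shrinks the string.
assert all(len(lhs) <= len(rhs) for lhs, rhs in RULES)

def apply_rule(s, lhs, rhs, at):
    """Apply one derivation step: rewrite lhs -> rhs at position `at`."""
    assert s[at:at + len(lhs)] == lhs, "rule does not match here"
    return s[:at] + rhs + s[at + len(lhs):]

# One derivation of "aabbcc" (= a^2 b^2 c^2):
form = "S"
form = apply_rule(form, "S", "aSBc", 0)   # -> aSBc
form = apply_rule(form, "S", "abc", 1)    # -> aabcBc
form = apply_rule(form, "cB", "Bc", 3)    # -> aabBcc
form = apply_rule(form, "bB", "bb", 2)    # -> aabbcc
```

Note that the rule cB → Bc neither grows nor shrinks the string; non-contracting grammars only require that no step makes the sentential form shorter.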
The copy language {ww : w ∈ {a, b}*}, consisting of strings repeated twice in full, is context-sensitive but not context-free.
It models reduplication, a morphological process found in many natural languages.
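Although {ww} sits beyond context-free power, recognizing it directly is trivial: split the string in half and compare the halves. The function below is an illustrative sketch (the name `in_copy_language` is mine, not a standard API).

```python
def in_copy_language(s: str) -> bool:
    """Membership test for the copy language {ww : w in Sigma*}.

    Not context-free, yet recognizable in linear time by splitting
    the input at its midpoint and comparing the two halves.
    """
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]
```

The ease of this check is a reminder that the Chomsky hierarchy classifies generative mechanisms, not algorithmic difficulty: a language can be hard for grammars yet easy for direct computation.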
Linguistic Relevance
The linguistic significance of context-sensitive languages became apparent through the work of Stuart Shieber, who demonstrated in 1985 that the cross-serial dependency construction in Swiss German generates a pattern isomorphic to {aᵐbⁿcᵐdⁿ}, which is not context-free. Similarly, the copy language {ww} models total reduplication, found in languages such as Indonesian and many Austronesian languages. These phenomena suggest that a complete grammar of natural language requires at least some context-sensitive power.
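The crossed dependencies in Shieber's pattern {aᵐbⁿcᵐdⁿ : m, n ≥ 1} (each a matched by a c, each b by a d) can be checked by combining a regular-shape test with counting; the counting is exactly what a context-free grammar cannot supply. The function name below is an illustrative choice.

```python
import re

def is_cross_serial(s: str) -> bool:
    """Membership test for {a^m b^n c^m d^n : m, n >= 1}, the
    non-context-free pattern isomorphic to Swiss German
    cross-serial dependencies.

    The regex checks the regular shape a+ b+ c+ d+; the length
    comparisons enforce the two crossed dependencies (a-c, b-d).
    """
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    return bool(match) \
        and len(match.group(1)) == len(match.group(3)) \
        and len(match.group(2)) == len(match.group(4))
```

A nested pattern such as {aᵐbⁿcⁿdᵐ} (a-d, b-c) would be context-free; it is the crossing of the two dependencies that pushes the language beyond context-free power.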
Although natural language appears to require more than context-free power, there is strong evidence that it does not require the full power of context-sensitive grammars. The notion of mild context-sensitivity, proposed by Joshi, characterizes the sweet spot: languages that go slightly beyond context-free power, are parsable in polynomial time, and exhibit the constant growth property. Full context-sensitive parsing is PSPACE-complete, making it computationally intractable for practical NLP.
Computational Properties
The membership problem for context-sensitive languages — determining whether a given string belongs to a given CSL — is decidable but PSPACE-complete, meaning that while an algorithm exists, it may require exponential space in the worst case. This contrasts sharply with the polynomial-time parsing algorithms available for context-free languages. The emptiness problem (whether the language is empty) is undecidable for context-sensitive grammars, highlighting the significant jump in computational complexity from the context-free level.
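The decidability argument rests directly on the non-contracting property: since no derivation step shrinks the string, any derivation of a target string of length n visits only sentential forms of length at most n, a finite set. The brute-force search below makes this concrete; it is an illustrative sketch of the decision procedure, not an efficient parser, and its exhaustive search reflects why the problem is expensive in the worst case.

```python
from collections import deque

def csg_membership(rules, start, target):
    """Decide membership for a non-contracting grammar by breadth-first
    search over sentential forms.

    Because no rule shrinks a string, a derivation of `target` can only
    pass through forms of length <= len(target), so the search space is
    finite and the procedure always terminates. The `seen` set also
    guards against cycles through length-preserving rules.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            at = form.find(lhs)
            while at != -1:
                successor = form[:at] + rhs + form[at + len(lhs):]
                if len(successor) <= len(target) and successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                at = form.find(lhs, at + 1)
    return False

# Example with the grammar for {a^n b^n c^n : n >= 1}:
RULES = [("S", "aSBc"), ("S", "abc"), ("cB", "Bc"), ("bB", "bb")]
```

For instance, `csg_membership(RULES, "S", "aabbcc")` succeeds, while `csg_membership(RULES, "S", "aabcc")` exhausts the bounded search space and fails.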
Despite their theoretical importance, context-sensitive grammars are rarely used directly in computational linguistics due to their parsing complexity. Instead, mildly context-sensitive formalisms such as tree-adjoining grammars, linear indexed grammars, and multiple context-free grammars provide a practical middle ground — capturing the specific trans-context-free phenomena found in natural language while remaining efficiently parsable.