Inflection is the morphological process by which a word is modified to encode grammatical information — tense, aspect, mood, person, number, gender, case, definiteness, and other categories — while preserving the word's fundamental meaning and part of speech. Unlike derivation, which creates new lexemes (e.g., "happy" to "happiness"), inflection creates different forms of the same lexeme (e.g., "run," "runs," "ran," "running"). Computational modeling of inflection is essential for morphological generation, machine translation, grammatical error correction, and any system that must produce correctly inflected text.
Inflectional Paradigms
paradigm(L) = { inflect(L, f) | f ∈ F }

where L is a lexeme and F is the set of morphosyntactic feature bundles the language distinguishes.
Example (Spanish "hablar"):
inflect(hablar, {Pres, 1sg, Ind}) → hablo
inflect(hablar, {Pres, 2sg, Ind}) → hablas
inflect(hablar, {Pret, 3sg, Ind}) → habló
inflect(hablar, {Pres, 1pl, Subj}) → hablemos
An inflectional paradigm is the complete set of forms a lexeme can take. The size of paradigms varies enormously across languages: English verbs have at most five forms (e.g., speak, speaks, spoke, spoken, speaking; regular verbs like "walk" have only four distinct forms, since "walked" serves as both past tense and past participle), while a Finnish noun has over 2,000 forms when all case, number, and possessive combinations are considered. Paradigm structure exhibits regularities that computational models exploit: most forms follow predictable patterns, with irregularity concentrated in high-frequency items.
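The paradigm definition above can be sketched directly for regular Spanish -ar verbs. The suffix table below covers only the four cells illustrated earlier and is an illustrative assumption, not a full conjugation model.

```python
# Suffixes for a handful of paradigm cells (feature bundle -> ending).
# Illustrative subset of the Spanish -ar conjugation, not a full model.
SUFFIXES = {
    ("Pres", "1sg", "Ind"): "o",
    ("Pres", "2sg", "Ind"): "as",
    ("Pret", "3sg", "Ind"): "ó",
    ("Pres", "1pl", "Subj"): "emos",
}

def inflect(lemma: str, features: tuple) -> str:
    """Strip the -ar infinitive ending and attach the cell's suffix."""
    stem = lemma[:-2]  # "hablar" -> "habl"
    return stem + SUFFIXES[features]

def paradigm(lemma: str) -> dict:
    """paradigm(L) = { inflect(L, f) | f in F }: enumerate every cell."""
    return {f: inflect(lemma, f) for f in SUFFIXES}

print(inflect("hablar", ("Pret", "3sg", "Ind")))  # habló
```

Real systems replace the hard-coded suffix table with learned rules or a neural generator, but the paradigm-as-mapping structure stays the same.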
Neural Morphological Inflection
The SIGMORPHON shared tasks have established morphological inflection — generating the correct surface form given a lemma and a set of morphological features — as a benchmark task for neural sequence modeling. Encoder-decoder architectures with attention, operating at the character level, achieve high accuracy across typologically diverse languages. These models take as input the characters of the lemma concatenated with feature tags and produce the characters of the inflected form. Hard attention mechanisms and copy mechanisms improve performance by allowing the model to preserve stem characters while modifying affixes.
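The input format these encoder-decoder models consume can be sketched as follows: feature tags are serialized as atomic symbols and the lemma as individual characters, yielding one token sequence for the encoder. The tag delimiters and vocabulary scheme here are assumptions for illustration; exact conventions vary across systems.

```python
# Sketch of serializing (lemma, features) for a character-level
# inflection model: feature tags become atomic tokens, the lemma is
# split into characters. Tag syntax ("<Pres>") is an assumption.

def encode_input(lemma: str, features: list) -> list:
    """Concatenate feature tags and lemma characters into one sequence."""
    return [f"<{f}>" for f in features] + list(lemma)

tokens = encode_input("hablar", ["Pres", "1sg", "Ind"])
print(tokens)
# ['<Pres>', '<1sg>', '<Ind>', 'h', 'a', 'b', 'l', 'a', 'r']
```

The decoder then emits the characters of the inflected form ("h a b l o"), with copy or hard-attention mechanisms making it cheap to reproduce the shared stem characters.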
A fundamental question in inflectional morphology is how speakers (and models) generalize from observed forms to fill unobserved cells in a paradigm. Ackerman, Blevins, and Malouf (2009) formalized this as the "paradigm cell filling problem": given some forms of a lexeme, predict the rest. Information-theoretic analyses show that paradigms in natural languages tend to have low conditional entropy — knowing one form strongly constrains others — suggesting that languages are structured to facilitate this generalization. Neural models implicitly learn these inter-form dependencies during training.
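The low-conditional-entropy claim can be made concrete with a toy calculation: given the exponent pattern of one paradigm cell across a mini-lexicon, how uncertain is the pattern of another cell? The lexicon and pattern labels below are invented for illustration.

```python
import math
from collections import Counter

# Toy lexicon: for each verb, the pattern of its 3sg-present cell
# paired with the pattern of its past-tense cell. Invented data.
LEXICON = [
    ("-s", "-ed"), ("-s", "-ed"), ("-s", "-ed"), ("-s", "ablaut"),
    ("-es", "-ed"), ("-es", "-ed"),
]

def conditional_entropy(pairs):
    """H(B | A) in bits, estimated from co-occurrence counts."""
    joint = Counter(pairs)          # counts of (a, b) pairs
    marginal = Counter(a for a, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n                # joint probability
        p_b_given_a = c / marginal[a]
        h -= p_ab * math.log2(p_b_given_a)
    return h

print(round(conditional_entropy(LEXICON), 3))  # ≈ 0.541 bits
```

Values well below the unconditional entropy of the target cell indicate that one form strongly constrains the other, which is the structural property Ackerman, Blevins, and Malouf quantify over real paradigms.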
Syncretism and Irregularity
Inflectional systems exhibit syncretism (identical forms for different feature combinations) and irregularity (forms that deviate from regular patterns). Syncretism is linguistically systematic — for example, German adjective endings systematically collapse distinctions in certain environments — and can be modeled by mapping multiple feature bundles to the same exponent. Irregular inflection, as in English strong verbs (sing/sang/sung), requires memorization of specific forms. Neural inflection models handle irregularity by memorizing patterns from training data, but struggle with rare irregulars not seen during training.
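Both phenomena have a natural computational shape, sketched here for English past tense: irregularity as a memorized lookup table with a regular rule as fallback, and syncretism as distinct feature bundles mapped to the same exponent. The tiny tables are illustrative assumptions.

```python
# Memorized irregulars: specific forms that override the regular rule.
IRREGULAR_PAST = {"sing": "sang", "go": "went", "run": "ran"}

# Syncretism: two distinct feature bundles share one exponent — regular
# past tense and past participle both surface as "-ed".
EXPONENTS = {("Past",): "ed", ("PastPart",): "ed"}

def past_form(verb: str, features: tuple = ("Past",)) -> str:
    """Check the irregular table first; fall back to the regular rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]     # memorized irregular form
    return verb + EXPONENTS[features]   # regular suffixation

print(past_form("walk"))  # walked
print(past_form("sing"))  # sang
```

A neural model collapses this two-route structure into one network, which is why it handles irregulars seen in training but fails on rare irregulars it has never memorized.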
Cross-linguistic variation in inflection is enormous. Isolating languages like Mandarin Chinese have virtually no inflection, while polysynthetic languages like Mohawk encode in a single verb form what English requires an entire sentence to express. Computational models of inflection must be flexible enough to handle this typological diversity, and recent multilingual approaches use language embeddings and shared character representations to transfer inflectional knowledge across related languages.