Named entity recognition (NER) is the task of locating and classifying named entities in unstructured text into predefined semantic categories. The most common categories are person (PER), organization (ORG), location (LOC), and miscellaneous (MISC), though domain-specific NER systems may recognize entities like gene names, drug names, chemical compounds, or legal citations. NER is a foundational component of information extraction pipelines and serves as input to relation extraction, knowledge base population, and question answering.
Sequence Labeling Formulation

Like chunking, NER is typically formulated as a sequence labeling task using IOB encoding. Each token is assigned a tag indicating whether it begins (B), continues (I), or is outside (O) an entity of a given type. The richer IOBES scheme additionally marks the end (E) of a multi-token entity and single-token entities (S):

IOB encoding:
Barack/B-PER Obama/I-PER was/O born/O in/O Honolulu/B-LOC

IOBES encoding:
Barack/B-PER Obama/E-PER was/O born/O in/O Honolulu/S-LOC

This formulation works well for flat, non-overlapping entities but cannot handle nested entities, e.g., "New York" as a location inside "New York Times" as an organization: [ORG [LOC New York] Times]. Nested NER requires span-based, hypergraph, or sequence-to-sequence approaches.
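Downstream consumers of NER output usually want entity spans, not per-token tags, so IOB sequences are decoded back into (start, end, type) triples. A minimal sketch of such a decoder (the function name `iob_to_spans` is ours; one common convention, followed here, is to treat an I- tag without a matching preceding B- as outside):

```python
def iob_to_spans(tags):
    """Decode a list of IOB tags into (start, end, type) spans,
    where end is exclusive, e.g. (0, 2, "PER") covers tokens 0-1."""
    spans = []
    start, etype = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:          # close any open entity
                spans.append((start, i, etype))
            start, etype = i, tag[2:]      # open a new entity
        elif tag.startswith("I-") and etype == tag[2:]:
            continue                        # same entity continues
        else:
            if start is not None:          # O tag or type mismatch ends the entity
                spans.append((start, i, etype))
            start, etype = None, None
    if start is not None:                  # entity running to end of sentence
        spans.append((start, len(tags), etype))
    return spans

tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
print(iob_to_spans(tags))  # [(0, 2, 'PER'), (5, 6, 'LOC')]
```

The same logic extends straightforwardly to IOBES, where E- and S- tags make entity boundaries explicit and decoding less ambiguous.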
Methods
NER systems have evolved through three generations. Rule-based and gazetteer-based systems used handcrafted patterns and dictionaries. Statistical systems, particularly CRFs with hand-crafted features (orthographic patterns, word shape, gazetteers), dominated from the mid-2000s. The current state of the art uses neural architectures: BiLSTM-CRF models (Lample et al., 2016) with character-level embeddings, and more recently, fine-tuned pre-trained language models like BERT that achieve F1 scores above 93% on the CoNLL-2003 English benchmark.
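To make the first generation concrete, here is a minimal sketch of a gazetteer-based tagger (the function name, the toy gazetteer entries, and the greedy longest-match-first strategy are illustrative assumptions, not a description of any particular historical system):

```python
# Toy gazetteer: multi-token entries map to entity types (illustrative data).
GAZETTEER = {
    ("Barack", "Obama"): "PER",
    ("Honolulu",): "LOC",
    ("New", "York", "Times"): "ORG",
}

def gazetteer_tag(tokens, gazetteer=GAZETTEER, max_len=3):
    """Assign IOB tags by greedy longest-match lookup against a gazetteer."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in gazetteer:
                etype = gazetteer[key]
                tags[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + etype
                i += n
                break
        else:
            i += 1  # no match starting here; move on
    return tags

print(gazetteer_tag("Barack Obama was born in Honolulu".split()))
# ['B-PER', 'I-PER', 'O', 'O', 'O', 'B-LOC']
```

The limitations of this approach (no coverage of unseen entities, no use of context to disambiguate) are exactly what motivated the statistical and neural generations that followed.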
Evaluation and Challenges
NER evaluation uses entity-level F1: a predicted entity is correct only if both its boundaries and type match the gold standard exactly. The CoNLL-2003 shared task datasets (English and German) remain the most widely used benchmarks. Key challenges include recognizing entities in informal text (social media, conversational language), handling rare and emerging entities not seen in training, multilingual and cross-lingual NER, and resolving entity ambiguity (e.g., "Washington" as person, location, or organization).
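Entity-level F1 over exact-match spans can be computed directly from sets of (start, end, type) triples. A minimal sketch (the function name `entity_f1` is ours; this assumes each entity appears at most once per set, as in standard span-set evaluation):

```python
def entity_f1(gold, pred):
    """Entity-level precision/recall/F1: a predicted (start, end, type)
    triple counts as correct only if it matches a gold triple exactly."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)                      # exact boundary + type matches
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [(0, 2, "PER"), (5, 6, "LOC")]
pred = [(0, 2, "PER"), (5, 6, "ORG")]  # right span, wrong type -> not counted
print(entity_f1(gold, pred))  # 0.5
```

Note how strict this metric is: a prediction with perfect boundaries but the wrong type, or the right type with boundaries off by one token, earns no credit at all.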