
Percy Liang

Percy Liang (b. 1983) is a Stanford professor who has made influential contributions to semantic parsing, language model evaluation, and AI transparency through the HELM benchmark and the Stanford Center for Research on Foundation Models.

Semantic parsing: z* = argmax_z P(z | x), where x is a natural-language utterance and z a logical form

Percy Liang is a computer scientist at Stanford University who directs the Center for Research on Foundation Models (CRFM). His research spans semantic parsing, language grounding, and the systematic evaluation of large language models. He has been a leading voice in developing transparent, holistic evaluation methodologies for foundation models.

Early Life and Education

Born in 1983, Liang studied at MIT and earned his PhD from UC Berkeley in 2011. His doctoral work on semantic parsing and learning from natural language supervision demonstrated how NLP systems could map natural language utterances to executable logical forms. He joined the Stanford faculty and rapidly established a research program bridging theoretical machine learning with practical NLP systems.

1983: Born in the United States

2011: Completed PhD at UC Berkeley

2012: Joined Stanford University faculty

2016: Co-created the SQuAD reading comprehension benchmark

2021: Co-founded the Stanford Center for Research on Foundation Models

2022: Released HELM (Holistic Evaluation of Language Models)

Key Contributions

Liang's work on semantic parsing advanced methods for mapping natural language to formal representations. His approach to learning semantic parsers from question-answer pairs (rather than annotated logical forms) reduced the annotation burden and enabled broader application of semantic parsing to question answering over databases and knowledge bases.
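The core idea of learning from question-answer pairs can be sketched in miniature: candidate logical forms are scored by a learned model, executed against a database, and the weights are updated when the model's top-scoring candidate executes to the wrong answer. The candidate generator, feature function, and toy database below are all illustrative assumptions, not Liang's actual system (which used richer grammars and models):

```python
# Toy sketch of learning a semantic parser from question-answer pairs
# (denotations) rather than annotated logical forms. All names here
# (make_candidates, DB, the feature scheme) are hypothetical.

DB = {"France": "Paris", "Japan": "Tokyo", "Peru": "Lima"}

def make_candidates(entity):
    # Each candidate pairs a logical-form string with an executable denotation.
    return [
        ("identity(%s)" % entity,   lambda db, e=entity: e),
        ("capital_of(%s)" % entity, lambda db, e=entity: db.get(e)),
    ]

def features(utterance, lf_name):
    # Indicator features pairing utterance words with the logical form's predicate.
    predicate = lf_name.split("(")[0]
    return {(word, predicate) for word in utterance.lower().split()}

def score(weights, feats):
    return sum(weights.get(f, 0.0) for f in feats)

def train(pairs, epochs=5):
    # Perceptron-style updates: promote a logical form whose execution
    # matches the annotated answer, demote the model's incorrect argmax.
    weights = {}
    for _ in range(epochs):
        for utterance, entity, answer in pairs:
            cands = make_candidates(entity)
            pred_name, pred_fn = max(
                cands, key=lambda c: score(weights, features(utterance, c[0])))
            if pred_fn(DB) == answer:
                continue  # already correct, no update
            good = [(n, f) for n, f in cands if f(DB) == answer]
            if not good:
                continue  # no reachable correct logical form
            gold_name, _ = max(
                good, key=lambda c: score(weights, features(utterance, c[0])))
            for f in features(utterance, gold_name):
                weights[f] = weights.get(f, 0.0) + 1.0
            for f in features(utterance, pred_name):
                weights[f] = weights.get(f, 0.0) - 1.0
    return weights

pairs = [("what is the capital of France", "France", "Paris"),
         ("what is the capital of Japan", "Japan", "Tokyo")]
w = train(pairs)
```

After training on the two annotated pairs, the learned weights should prefer the `capital_of` logical form for an unseen entity, so the answer is obtained purely by execution against the database.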

He co-created SQuAD (Stanford Question Answering Dataset), one of the most widely used benchmarks for reading comprehension, which spurred rapid progress in neural question answering. His HELM (Holistic Evaluation of Language Models) framework provides comprehensive, multi-dimensional evaluation of large language models across accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency, addressing the need for more nuanced assessment than single-number benchmarks provide.
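The shift from single-number benchmarks to multi-dimensional reporting can be illustrated with a small sketch: per-scenario results are pivoted into a scenario-by-metric table, and each metric is macro-averaged separately rather than collapsed into one score. The scenarios, metrics, and numbers below are invented for illustration and do not come from HELM itself:

```python
# Hypothetical per-scenario evaluation records: (scenario, metric, value).
results = [
    ("qa",            "accuracy",    0.71),
    ("qa",            "calibration", 0.88),
    ("qa",            "robustness",  0.64),
    ("summarization", "accuracy",    0.55),
    ("summarization", "calibration", 0.80),
    ("summarization", "robustness",  0.59),
]

def metric_table(records):
    """Pivot (scenario, metric, value) triples into {scenario: {metric: value}}."""
    table = {}
    for scenario, metric, value in records:
        table.setdefault(scenario, {})[metric] = value
    return table

def mean_per_metric(table):
    """Macro-average each metric across scenarios, keeping metrics separate."""
    sums, counts = {}, {}
    for row in table.values():
        for metric, value in row.items():
            sums[metric] = sums.get(metric, 0.0) + value
            counts[metric] = counts.get(metric, 0) + 1
    return {m: sums[m] / counts[m] for m in sums}

table = metric_table(results)
summary = mean_per_metric(table)
```

The point of keeping metrics separate is that a model can rank first on accuracy while ranking poorly on robustness or calibration; averaging them into one number would hide exactly the trade-offs a holistic evaluation is meant to expose.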

"We need to evaluate language models not just on accuracy, but on a broad set of metrics that reflect the diverse ways these models impact people." — Percy Liang, on the motivation for HELM

Legacy

Liang's contributions to semantic parsing, benchmark development, and foundation model evaluation have shaped how the field measures progress and identifies limitations. SQuAD became a standard benchmark used by researchers worldwide. HELM and the CRFM have established new standards for transparency and comprehensiveness in language model evaluation. His work connects technical NLP research with broader questions of AI governance and accountability.



References

  1. Liang, P., Jordan, M. I., & Klein, D. (2011). Learning dependency-based compositional semantics. Proceedings of the 49th Annual Meeting of the ACL, 590–599.
  2. Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). SQuAD: 100,000+ questions for machine comprehension of text. Proceedings of EMNLP, 2383–2392. doi:10.18653/v1/D16-1264
  3. Liang, P., et al. (2022). Holistic evaluation of language models. arXiv preprint arXiv:2211.09110.
  4. Bommasani, R., Hudson, D. A., Adeli, E., ..., & Liang, P. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258.
