Computational Linguistics
About

John Cocke

John Cocke (1925–2002) was a computer scientist at IBM who made pioneering contributions to compiler design, RISC architecture, and the CYK parsing algorithm that bears his name, bridging hardware innovation with computational linguistics.

CYK: T[i,j,A] = ∃k,B,C: T[i,k,B] ∧ T[k+1,j,C] ∧ (A→BC) ∈ G

John Cocke was an American computer scientist at IBM whose wide-ranging contributions spanned computer architecture, compiler optimisation, and computational linguistics. He is best known in NLP for co-developing the CYK (Cocke-Younger-Kasami) parsing algorithm, a dynamic programming method for parsing context-free grammars that remains a fundamental algorithm in computational linguistics.

Early Life and Education

Born in Charlotte, North Carolina, in 1925, Cocke earned his PhD in mathematics from Duke University in 1956. He spent his entire career at IBM, where his extraordinary breadth of technical interests led to contributions across multiple fields of computer science.

1925

Born in Charlotte, North Carolina

1956

Completed PhD in mathematics at Duke University

1960s

Co-developed the CYK parsing algorithm

1970s

Pioneered RISC (Reduced Instruction Set Computer) architecture at IBM

1987

Received the ACM Turing Award

2002

Died in Valhalla, New York

Key Contributions

The CYK algorithm (independently developed by Cocke, Daniel Younger, and Tadao Kasami in the 1960s) is a dynamic programming algorithm for determining whether a string can be generated by a given context-free grammar and, if so, producing all possible parse trees. Operating on grammars in Chomsky normal form, CYK fills a triangular table bottom-up in O(n³|G|) time, where n is the string length and |G| is the grammar size. It remains the standard algorithm for teaching parsing and is the basis for probabilistic CYK used in statistical parsing.

Cocke's contributions extended far beyond parsing. His work on RISC architecture revolutionised computer hardware design, and his innovations in compiler optimisation influenced how programming languages are processed. At IBM, he also participated in the early speech recognition efforts, contributing his mathematical expertise to the statistical models being developed by Jelinek's group.

"The CYK algorithm demonstrated that parsing could be solved efficiently through dynamic programming, turning a potentially exponential search into a polynomial-time computation." — On the significance of the CYK algorithm

Legacy

Cocke received the ACM Turing Award in 1987, the National Medal of Technology in 1991, and the National Medal of Science in 1994 — an almost unprecedented combination of honours. The CYK algorithm he co-developed is taught in every computational linguistics and formal language theory course worldwide and remains the foundation for probabilistic parsing methods used in modern NLP systems.

Interactive Calculator

Enter a CSV of publications: year,title,citations_count. The calculator computes total citations, h-index, peak year, and a per-decade breakdown of scholarly output.

Click Calculate to see results, or Animate to watch the statistics update one record at a time.

Related Topics

References

  1. Younger, D. H. (1967). Recognition and parsing of context-free languages in time n³. Information and Control, 10(2), 189–208. doi:10.1016/S0019-9958(67)80007-X
  2. Kasami, T. (1966). An efficient recognition and syntax-analysis algorithm for context-free languages. Air Force Cambridge Research Lab Report AFCRL-65-758.
  3. Cocke, J., & Schwartz, J. T. (1970). Programming Languages and Their Compilers: Preliminary Notes. Courant Institute of Mathematical Sciences, NYU.
  4. ACM. (1987). John Cocke — ACM Turing Award Laureate. Association for Computing Machinery.

External Links