John Cocke was an American computer scientist at IBM whose wide-ranging contributions spanned computer architecture, compiler optimisation, and computational linguistics. He is best known in NLP for co-developing the CYK (Cocke-Younger-Kasami) parsing algorithm, a dynamic programming method for parsing context-free grammars that remains a fundamental algorithm in computational linguistics.
Early Life and Education
Born in Charlotte, North Carolina, in 1925, Cocke earned his PhD in mathematics from Duke University in 1956. He spent his entire career at IBM, where his extraordinary breadth of technical interests led to contributions across multiple fields of computer science.
Born in Charlotte, North Carolina
Completed PhD in mathematics at Duke University
Co-developed the CYK parsing algorithm
Pioneered RISC (Reduced Instruction Set Computer) architecture at IBM
Received the ACM Turing Award
Died in Valhalla, New York
Key Contributions
The CYK algorithm (independently developed by Cocke, Daniel Younger, and Tadao Kasami in the 1960s) is a dynamic programming algorithm for determining whether a string can be generated by a given context-free grammar and, if so, producing all possible parse trees. Operating on grammars in Chomsky normal form, CYK fills a triangular table bottom-up in O(n³|G|) time, where n is the string length and |G| is the grammar size. It remains the standard algorithm for teaching parsing and is the basis for probabilistic CYK used in statistical parsing.
Cocke's contributions extended far beyond parsing. His work on RISC architecture revolutionised computer hardware design, and his innovations in compiler optimisation influenced how programming languages are processed. At IBM, he also participated in the early speech recognition efforts, contributing his mathematical expertise to the statistical models being developed by Jelinek's group.
"The CYK algorithm demonstrated that parsing could be solved efficiently through dynamic programming, turning a potentially exponential search into a polynomial-time computation." — On the significance of the CYK algorithm
Legacy
Cocke received the ACM Turing Award in 1987, the National Medal of Technology in 1991, and the National Medal of Science in 1994 — an almost unprecedented combination of honours. The CYK algorithm he co-developed is taught in every computational linguistics and formal language theory course worldwide and remains the foundation for probabilistic parsing methods used in modern NLP systems.