Jakob Uszkoreit is a computer scientist who was one of the eight co-authors of "Attention Is All You Need." His work at Google Research contributed to the design and validation of the Transformer architecture. He later leveraged the insights from Transformer development to pioneer the application of attention-based models to biological sequences, founding a company at the intersection of deep learning and RNA therapeutics.
Early Life and Education
Born in Germany, Uszkoreit comes from a family with deep roots in computational linguistics — his father, Hans Uszkoreit, is a prominent computational linguist at the German Research Center for Artificial Intelligence (DFKI) and Saarland University. Jakob studied computer science and joined Google, where he worked on natural language understanding and neural architecture design.
Worked at Google on NLP and machine learning
Co-authored "Attention Is All You Need"
Co-founded Inceptive, applying Transformer models to RNA design
Key Contributions
Uszkoreit's central contribution, as recorded in the paper's author-contributions note, was proposing to replace recurrence with self-attention and starting the effort to evaluate that idea. The resulting architecture includes a sinusoidal positional encoding that injects sequence-order information: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). Because these functions are smooth and defined for any position, the encoding can generalise to sequence lengths not seen during training and provides a continuous representation of position.
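The formula above can be computed directly. The sketch below is a minimal illustration of the sinusoidal encoding as defined in the paper, not code from any Transformer implementation; the function name is ours.

```python
import math

def positional_encoding(pos, d_model):
    """Sinusoidal positional encoding for a single position.

    Returns a list of length d_model where even indices hold
    sin(pos / 10000^(2i/d_model)) and odd indices the matching cosine.
    Assumes d_model is even, as in the paper's formulation.
    """
    pe = [0.0] * d_model
    for i in range(d_model // 2):
        angle = pos / (10000 ** (2 * i / d_model))
        pe[2 * i] = math.sin(angle)      # PE(pos, 2i)
        pe[2 * i + 1] = math.cos(angle)  # PE(pos, 2i+1)
    return pe
```

At position 0 the encoding alternates 0 and 1 (sin 0 and cos 0), and each dimension pair oscillates at a different wavelength, which is what lets the model distinguish positions at many scales.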
His subsequent work demonstrated that the self-attention mechanisms developed for natural language could be applied to biological sequences — proteins and RNA — where the same fundamental challenge of modelling long-range dependencies in sequences arises. This cross-pollination between NLP and biology exemplifies how architectural innovations in computational linguistics can have far-reaching impact.
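The mechanism that transfers between domains is scaled dot-product self-attention: every token, whether a word or a nucleotide, attends to every other in one step, so long-range dependencies never have to be carried through a recurrence. A minimal sketch of that computation, written from the standard formulation rather than any particular codebase (and omitting the learned query/key/value projections of a real model):

```python
import math

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of vectors.

    X is a list of d-dimensional token embeddings; here queries, keys,
    and values are all X itself (no learned projections). Each output
    vector is a softmax-weighted average of all input vectors, so every
    position sees every other position directly.
    """
    d = len(X[0])
    # Attention scores: pairwise dot products, scaled by sqrt(d).
    scores = [[sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d)
               for k in X] for q in X]
    # Row-wise softmax turns scores into attention weights.
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    # Output: weighted sum of value vectors for each query position.
    return [[sum(w * v[c] for w, v in zip(row, X)) for c in range(d)]
            for row in weights]
```

Nothing in this computation refers to words: it only needs a sequence of vectors, which is why the same machinery applies to embeddings of RNA bases or amino-acid residues.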
"The principles of attention and representation learning that work for language turn out to be remarkably effective for biological sequences as well." — Jakob Uszkoreit, on applying Transformers to biology
Legacy
Uszkoreit's contributions to the Transformer paper helped enable the revolution in language modelling. His subsequent career demonstrates the broad applicability of NLP architectures beyond language, particularly in computational biology. His family connection to computational linguistics — bridging symbolic and neural approaches across generations — is a unique thread in the field's history.