Research Scientist, Foundational Data Science

Prior Labs

Berlin · On-site · Full-Time · 3d ago

Description

Who We Are

Foundation models transformed text and images. Structured data - the largest and most consequential data format in the world - stayed untouched. Tables run every clinical trial, every financial model, every scientific experiment, every business decision, and no one had built a foundation model that truly understood them.

Until now. What LLMs did for language, we're doing for tables. The next modality shift in AI is happening, and we're hiring the team that makes it.

Momentum. We pioneered tabular foundation models and are now the world-leading organization in structured-data ML. Our TabPFN v2 model was published as a Nature cover story and set a new state of the art for tabular machine learning. Since release we've scaled model capabilities 20x+, passed 3.5M+ downloads and 7,500+ GitHub stars, and are seeing accelerating adoption across research and industry - from detecting lung disease with Oxford Cancer Analytics to preventing train failures with Hitachi to improving clinical-trial decisions with BostonGene.

The hardest work is ahead. We're scaling tabular foundation models to millions of rows, thousands of features, real-time inference, and entirely new data modalities, while building the infrastructure to run them in production across some of the most demanding industries on earth. These are open problems no one else is working on at this level.

Our team. We're a small, highly selective team of 30+ engineers, researchers, and GTM specialists, with backgrounds spanning Google, Apple, Amazon, DeepMind, Meta, Microsoft Research, G-Research, Jane Street, Goldman Sachs, and CERN. We're led by Frank Hutter, Noah Hollmann, and Sauraj Gambhir, and advised by world-leading AI researchers including Bernhard Schölkopf and Turing Award winner Yann LeCun. We ship fast, do top-tier research, and hold each other to an extremely high bar.

What's next. In 2025 we raised €9m pre-seed led by Balderton Capital, backed by leaders from Hugging Face, DeepMind, and Black Forest Labs. The next phase of growth is here, which makes this an ideal time to join.

What You'll Do

This role is foundational data science: building the foundations of tabular foundation models so a single model can solve data-science problems across the board. Roughly half the work is inventing new frontier tools for TFMs, and half is building the dataset and benchmark bedrock they stand on.

  • Invent and build the frontier tools that extend TabPFN, including its thinking, scaling, and agentic capabilities, and the new methods that let one model generalize across the full landscape of data-science problems. This is the most open-ended part of the work and grows over time.
  • Set the research direction by deciding which model capabilities and benchmarks are worth pursuing, choosing what is worth solving rather than optimizing a score someone else set.
  • Bring in external research and real customer needs to shape new model and tooling directions, and publish frontier results that move the field forward.
  • Build trustworthy benchmarks from the structured data behind real, high-impact problems, so the team optimizes for real-world performance rather than one leaderboard.
  • Faithfully implement the baselines and competitor models that set the gold standard of applied data science, giving the team a read on where TabPFN leads and where there is room to improve.
  • Build an automated, agentic pipeline with a human in the loop, itself a genuinely new tool, so this data and benchmark foundation scales to far larger volumes without losing rigor.

What We're Looking For

  • You have solved data-science problems across many domains and datasets to a high standard, optimizing for strong performance across a whole suite of tasks rather than the single best score on one.
  • You work undogmatically across the ML toolbox, including getting strong results with gradie
