Agents are only as good as the team behind them. We're experienced researchers and engineers building vertical AI agents that run in production inside some of the most demanding environments there are: health insurers, banks, and the public sector, where decisions are audited and mistakes carry real consequences.

We're looking for a strong generalist who's comfortable moving across the stack and isn't fenced into one specialty: the kind of engineer who can follow a problem from the frontend down to the infrastructure and into the AI layer. Agents are a big part of how you cover that ground. The twist that makes this role unusual: you use agents to improve agents. You'll wield agentic coding tools to build, evaluate, and harden the production AI agents we ship to customers, and feed what you learn straight back into making the next ones better. You go wherever the hard problem is, and you understand what's happening underneath, in the systems you build and in the models and agents themselves.

What you'll do

Use agents to improve agents. Wield agentic coding tools (Claude, Codex, ...) to build, evaluate, and harden the production agents we ship, then fold what you learn back into faster, better tooling and stronger agents for the next use case. The better you get at directing agents, the better the agents you ship.
Work across the stack. Frontend, backend, infrastructure, data, and the AI layer. You go where the problem is instead of waiting for it to land in your lane.
Build and ship production AI agents. Take agents from prototype to production inside regulated environments: orchestration, tool use, messy real-world inputs, and the reliability and auditability real operations demand.
Own things end-to-end. From a vague problem to a deployed, reliable system: design it, build it, deploy it, and keep it running.
Go deep when it counts. Understand the systems beneath the surface: distributed systems behavior, infra, and how LLMs and agents actually work. That depth lets you debug what others can't and make good calls under uncertainty.
Stay ahead. The tools and models change every few weeks. You test new releases early and translate what works into how you and the team build.

What we're looking for

Must-haves

Generalist range: you're excellent in at least one area and genuinely comfortable working outside it. You don't need to be an expert in everything; you need to be the kind of person who picks up an unfamiliar part of the stack and gets productive fast, with agents helping you go further.
Fluency with coding agents as your default way of working. You can point to specific workflows, failure modes, and things you've actually built or shipped with agents, not just tried.
A real mental model of how LLMs and agents work: why they fail, how to evaluate them, and how to make non-deterministic systems dependable.
Strong engineering judgment: you read unfamiliar code critically, catch when an agent has gone off track, and know what good architecture looks like.
Enthusiasm about AI and its applications, in software development and beyond.
On-site collaboration 3 days/week in Berlin or Bremen. Travel to our Bremen HQ during onboarding.
Fluency in English (at least B2).
Valid EU work authorization.

Nice-to-haves

Depth in one area that complements the breadth (e.g. distributed systems, infra/DevOps, full-stack, or applied AI/LLMs).
Experience building agent systems: orchestration, tool use, evaluation, or agent frameworks.
Experience taking AI from prototype to production, not just demos.
Experience in regulated industries (insurance, banking, public sector) or other compliance-heavy domains.
German language skills.
Open-source contributions or public writing on agents, applied AI, or agentic workflows.

What matters most

We prioritize **demonstrated excellence in your pro

Member of Technical Staff - Agent Engineer
