Senior Data / AI Infrastructure Engineer (LLM Systems)
HRvizer
Description
HRvizer is supporting a fast-growing AI startup in Berlin in hiring a Senior Engineer for their core technical team.
The company operates in the AI automation space, building production-grade LLM-based systems that help reduce manual, document-heavy workflows in complex, knowledge-intensive environments.
They are well-funded, already live with paying customers, and currently scaling their engineering organization.
The Role
We are looking for a Senior Engineer with strong backend and data infrastructure experience to build the foundation for evaluation, observability, and performance optimization of LLM-powered systems.
This role sits at the intersection of:
- Backend engineering
- Data engineering
- AI/LLM infrastructure
- Observability & reliability systems
You will work directly on production systems that monitor, evaluate, and improve AI agents at scale.
This is a hands-on engineering role (not BI / analytics), with direct impact on system performance, cost, and reliability.
What you will do
- Build evaluation frameworks for LLM agents (offline + online testing, datasets, human feedback loops)
- Design automated quality gates for changes in models, prompts, and agent logic
- Analyse large-scale production traces to identify failures, regressions, and latency and cost issues
- Work with analytical databases (BigQuery, ClickHouse, or similar)
- Build data replay, retention, and debugging systems for production behaviour
- Develop observability tooling (logging, tracing, dashboards, monitoring)
- Contribute to backend and agent infrastructure where needed
Requirements
- Strong experience in Python and/or backend engineering
- Advanced SQL skills and experience with large datasets
- Experience working in cloud environments (GCP preferred)
- Experience building data pipelines, ETL/ELT, or event-driven systems
- Strong understanding of system design, reliability, and observability
- Ability to work in complex systems and ambiguous environments
- Strong engineering judgment and architectural thinking
Nice to have
- Experience with LLMs, agentic systems, or AI infrastructure
- Experience working with distributed system traces
- Experience building internal platforms or developer tools
- Familiarity with workflow orchestration tools (e.g. Temporal)
- Background in audit, finance, or compliance environments
- Experience in early-stage startups or scale-ups
What’s offered
- High-impact role in a fast-scaling AI company
- Strong ownership from day one
- Competitive salary + equity
- Learning & development budget
- Flexible work culture + team events