Your opportunity

Our software platform is built to solve engineering and procurement issues in the trillion-dollar electronics industry. In other words: Luminovo is redefining the software stack used to bring any type of electronics to life.

To understand what we do, check out our website and two of our blog posts.

Your role

Our data quality mission is product discovery applied to our part and component data. You take a fuzzy quality problem, figure out what it actually means for customers, measure it honestly, and hand a well-scoped, evidence-backed finding to the team that delivers the larger fix.
The hard part isn't running a query (our AI tooling helps with that). It's reframing "x% of parts have no pin count" into "y% of a customer's costings can't complete because of it," then giving other teams a result they can act on without having to re-check it.
You'll run this discovery loop at a junior level, and you'll need to be sharp, honest, and data-fluent. You'll follow threads the team doesn't have time to chase, turn them into decision-ready findings, and grow into more autonomy over the course of your internship. You work within a clear direction, and you can take a fix all the way into production when it's a data-level change you can script, like manufacturer merges or backfills. You won't need to be a Rust engineer or own large refactors. AI tooling does the heavy lifting on unfamiliar code and scripting. Your judgment and rigor are what matter most.
This role is an internship with a duration of three to six months.

Your performance objectives

Turn ambiguous data-quality questions into customer-relevant findings by reframing part-level observations into business/customer impact (e.g. tenant-aware "what actually blocks costing"), defining a sensible metric or proxy, and producing a measured, caveated answer to the question set by the product manager.
Independently size problems and test hypotheses against our data by writing read-only queries over the data warehouse (ClickHouse) and production Postgres, and producing numbers you can defend (knowing when a result is double-counted, misleading, or too good to be true).
Make the effect of fixes and experiments visible by extending our dashboards and building ad-hoc visualizations that show trends, baselines, and whether an intervention actually moved coverage/correctness.
Run small experiments to gather evidence by writing scripts (with AI assistance) against external sources such as SiliconExpert and DigiKey, e.g., to check whether a missing-data gap is fetchable, calibrate a finding, or do spot checks on interesting cases.
Verify assumptions in the product itself by navigating the epibator (Rust/TS) codebase with AI tooling to confirm how data is actually resolved/used, and occasionally adding light instrumentation we find we need, without owning large refactors.
Safely apply the fixes you've scoped by writing AI-assisted scripts that correct production customer data at scale, e.g. automating the research to decide whether two manufacturers are the same record and then executing thousands of merges. Make every change safe by construction: dry-run and validate against samples first, work in reversible/checkpointed batches, and put guardrails in place so we never introduce regressions or corrupt manufacturing/costing data.
Leave behind durable, trustworthy knowledge by following the mission's loop (brief, investigate, report, distill), citing evidence, dating facts, and writing findings other teams and stakeholders can act on without re-deriving them.
Be your own harshest critic by reconciling and sanity-checking your own results, clearly separating "what's proven" from "what's still a hypothesis," and flagging loudly when a finding overturns a prior assumption (including your own).

Data Quality Intern (d/f/m)
