Platform Reliability Engineer
BEACON RED
Description
About EDGE Group
At EDGE, bold ideas are engineered into technologies that protect, improve, and save lives.
Headquartered in the United Arab Emirates, EDGE is a leading advanced technology group working at the forefront of defence and emerging technologies. Spanning more than 35 specialised companies and multiple centres of excellence, we are purpose-built to move fast. Free from heavy legacy processes, we give our people the autonomy, accountability, and agility to bring breakthrough technologies from concept to reality.
Based in Abu Dhabi, a globally connected hub at the crossroads of Europe, Asia, Africa, and the Middle East, EDGE is home to a truly multicultural community where bold ideas thrive and the future is shaped.
Together, we are shaping the future.
About The Opportunity
We're building a highly resilient hybrid multicloud platform spanning Nutanix and public cloud, while delivering a major migration from VMware. We're looking for a Platform Reliability Engineer who is passionate about automation, resilience, recoverability, and operational excellence. This is not a traditional infrastructure operations role. We want someone who can demonstrate how they have improved platform reliability through engineering, automation, monitoring, disaster recovery testing, and continuous improvement.
What You'll Be Doing
- Ensuring the reliability, performance, and recoverability of critical platform services
- Managing and validating backup and recovery capabilities using enterprise tools such as Rubrik
- Planning and executing disaster recovery testing and recovery assurance activities
- Supporting platform and workload migrations from VMware to modern cloud platforms
- Developing automation to reduce operational effort and improve service reliability
- Improving observability, monitoring, alerting, and operational readiness across the platform
What We're Looking For
- Strong experience with enterprise infrastructure, cloud, or platform engineering
- Experience with Nutanix, VMware, public cloud platforms, and backup/recovery technologies
- Strong PowerShell, Python, or similar automation skills
- Experience improving platform resilience, recoverability, and operational efficiency through automation
- Proven ability to reduce operational toil and improve reliability at scale
For this role, you must be able to demonstrate:
- Reliability improvements you've delivered and how success was measured
- Disaster recovery programmes or recovery testing you've led
- Operational processes you've automated and the impact achieved
- How you validate recoverability, not just successful backups
- Major migration or transformation programmes you've supported
If you're passionate about building resilient platforms, proving recoverability, and eliminating manual operational effort through automation, we'd like to hear from you.
Why Choose EDGE?
Working at EDGE comes with a package that genuinely reflects how much we value our people. Salaries are highly competitive and tax-free and, depending on your role and seniority, benefits can include family visas, annual flight tickets, medical insurance for you and your dependants, and education allowances for your children.
But the real investment goes further than compensation. Through our own learning academy and digital learning platform, there are extensive opportunities to develop your skills and advance your career. Add to that the freedom to innovate, strong career guidance, and the chance to work alongside world-class talent from across the globe, and EDGE becomes a place where you can keep growing, keep learning, and keep turning bold ideas into something real.
Candidate Privacy & Equal Opportunity Statement
Any information you share as part of your EDGE candidate profile or job application will be handled in accordance with applicable data prote