Senior / Staff Site Reliability, Platform Engineering

Saviynt, Atlanta, GA

Overview

Schedule: Full-time
Career level: Senior-level
Remote: On-site
Benefits: Career Development

Job Description

About Saviynt

Saviynt is a leader in identity security, delivering an AI-powered platform that governs and secures access to applications, data, and business processes for global enterprises and government institutions. Built for the AI era, Saviynt helps organizations move faster, securely and compliantly.

Why This Role Matters

Saviynt's SaaS platform runs on complex, distributed, cloud-native systems. As a Staff Platform Engineer, you will play a critical role in ensuring these systems remain highly available, scalable, and secure as the company grows.

This is a hands-on engineering and technical leadership role. You will own reliability for major platform domains, design scalable solutions on Kubernetes and AWS, and drive automation and reliability improvements across multiple teams.

What You'll Do

In this role, you will design, build, and maintain the shared infrastructure services and platforms that our product and application teams depend on.

You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, so that other teams can focus on their core business logic and deliver features faster in a multi-cloud environment.

Design and build core platform components and shared infrastructure services that other development teams will integrate with and leverage to deploy and operate their applications

Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers

Develop robust, internal-facing tools and automation for infrastructure provisioning and management primarily using Go (Golang)

Architect and optimize foundational solutions within Cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams

Design and implement shared Event-Driven Architecture components and messaging platforms using technologies like Kafka or Google Pub/Sub that product teams can easily utilize

Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams

Design and build resilient Distributed Systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance

Manage and optimize our shared infrastructure across Multi-Region Cloud Environments, ensuring that platform services are globally available and performant for all consumers

Establish and enhance centralized Observability and Monitoring platforms and tools that provide self-service insights for consuming teams

Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients

Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services

Design, implement, and optimize highly available Relational Database services or shared data platforms for broad organizational use

Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support

Participate in on-call rotations to support the critical shared infrastructure you build

What We're Looking For

6+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers

Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)

Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation

Extensive hands-on experience with at least one major Cloud Provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them

Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services

Solid understanding and practical experience with CI/CD pipeline tools (especially GitLab CI) and experience establishing automated delivery processes for other teams

Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components

Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms

Proficiency in establishing and utilizing comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure

Strong experience with RESTful API design principles and building well-documented, consumable APIs

Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context

Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service

Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences

A strong customer-centric mindset, treating internal development teams as your primary customers

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, required

Why Join Saviynt

Work on a large-scale, cloud-native SaaS platform
Solve complex reliability challenges at scale
Influence platform architecture and engineering practices
Competitive compensation, benefits, and career growth

Security & Compliance

This role requires adherence to Saviynt's information security and privacy policies, including annual security training.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

FAQs About Senior / Staff Site Reliability, Platform Engineering Jobs at Saviynt

What is the work location for this position at Saviynt?

This job at Saviynt is located in Atlanta, GA, according to the details provided by the employer. Some roles may also include multiple work locations depending on the requirement.

What pay range can candidates expect for this role at Saviynt?

The employer has not shared pay details for this role.

What employment type applies to this position at Saviynt?

Saviynt lists this role as a Full-time position.

What experience level is required for this role at Saviynt?

Saviynt is looking for a candidate with senior-level experience.

What benefits are offered by Saviynt for this role?

Saviynt offers Career Development for this position. Actual benefits may vary depending on the employer's policies and employment terms.

What is the process to apply for this position at Saviynt?

You can apply for this role at Saviynt either through Sonara's automated application system, which helps you submit applications 10X faster with minimal effort, or by applying manually using the direct link on the job page.