Lead Cloud Engineer (P946)

Job Description
We are seeking a Lead Cloud Engineer, AI Enablement, to join our AI Foundation Models team. The AI Foundation Models team will enable the science, serving, and scaling behind our proprietary AI models. You will collaborate with cross-functional teams of data scientists, research scientists, machine learning engineers, and product leads to understand business requirements, identify opportunities for AI integration, and ensure our models and services enable development of scalable and robust AI systems. This team may also engage with third-party vendors to enable speed, scale, and efficiency.
General areas of focus are deployment and infrastructure, CI/CD pipelines, observability, SLAs, incident management, user onboarding, and other related operational components.
This role is 70% engineering leadership and 30% hands-on technical work.
LEADERSHIP RESPONSIBILITIES
- Execution Excellence: Ensure work is done the right way and on time
- Mentorship: Coach and advise team members to foster growth and development
- Strategic Planning: Lead the scoping of epics in conjunction with product, guide their decomposition into stories, and ensure all work aligns with strategic OKRs
- Technical Oversight: Champion engineering best practices, guide the team through release stages, and ensure adherence to the Definition of Done (DoD)
- Collaboration: Facilitate cross-functional communication and work with other squads to identify synergies and drive efficiency
- Technical Debt Management: Proactively track, prioritize, and escalate technical debt to maintain long-term system health
TECHNICAL RESPONSIBILITIES
- Manage and enhance deployment of infrastructure using Terraform
- Manage and enhance CI/CD pipelines using GitHub Actions
- Interact daily with Azure and Azure services
- Troubleshoot issues related to API deployments in AKS
- Ensure accurate observability of solutions deployed in AKS using Datadog or other related observability tools
- Define and implement SLAs and incident management procedures
- Independently define, prioritize and execute project tasks and plans to deliver cloud-related infrastructure and solutions
- Document work and solutions appropriately
- Engage with other technology and infrastructure teams as necessary to complete tasks
- Participate routinely in team on-call rotation during business hours
QUALIFICATIONS, SKILLS, AND EXPERIENCE
- 4-year degree in a technology-related discipline or equivalent work experience
- 5+ years of experience with public cloud technologies (Azure or GCP preferred), including a demonstrated focus on networking and security
- "GCP Associate Cloud Engineer" and/or "Microsoft Azure Administrator Associate" certifications preferred
- 5+ years of experience with container technologies (Docker, Kubernetes, Helm)
- 5+ years of experience with cloud automation tools (Terraform)
- 5+ years of experience with SDLC and working with Agile development teams
- Experience with AI-related concepts a plus (RAG, fine-tuning, agent frameworks, LLMs, etc.)
- Experience with data engineering and pipelines a plus
- Experience with front-end development a plus
- Ability to manage small- to medium-sized IT-related projects, solving related problems and working to tight deadlines under pressure
- Strong interpersonal and communication skills with demonstrated experience leveraging these skills with technical teams and non-technical business units
- Desire to learn new technology and grow across different areas of technology
- Demonstrated ability to prioritize own workload with multiple responsibilities and adaptability to changes in those priorities
