Data Center Site Manager

Nebius Group NV

New Jersey, NJ

$90,000 - $140,000 / year

Overview

Schedule: Full-time
Education: Engineering (PE), PMP
Career level: Director
Compensation: $90,000-$140,000/year
Benefits: Health Insurance, Dental Insurance, Vision Insurance

Job Description

Why work at Nebius

Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the field.

Where we work

Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team of over 800 employees includes more than 400 highly skilled engineers with deep expertise across hardware and software engineering, as well as an in-house AI R&D team.

The Role

The Data Center Site Manager owns end‑to‑end reliability, safety, capacity, and performance for one of our flagship U.S. sites. You'll lead a high‑performing, multi‑disciplinary operations team and partner tightly with Design, Build, Network, Security, Capacity Planning, and the DC orgs to deliver world‑class availability and cost efficiency.

Your responsibilities will include:

  • Own the site 24/7: deliver continuous availability across power, cooling, structured cabling, network, security, and DCIM, meeting or beating global SLAs.
  • Build and lead the team: hire, mentor, and develop managers/technicians; run staffing models, shift coverage, and on‑call rotations that scale.
  • Be the incident commander: lead major events end‑to‑end, including triage, communications, executive briefings, RCA, and durable corrective actions.
  • Drive reliability engineering: implement RCM, predictive maintenance, QA/QC, 5S, and Lean/continuous improvement to cut MTTR and raise MTBF.
  • Deliver capacity on time: plan and execute expansions/retrofits; commission MEP systems with Design/Construction; achieve flawless change control (MOP/SOP/EOP).
  • Scale tooling & automation: mature DCIM/BMS/EPMS, monitoring/alerting, work management (Jira/ServiceNow), knowledge base (Confluence), and light scripting/SQL for telemetry and workflow automation.
  • Run a metrics‑first operation: publish dashboards and KPIs (availability, PUE, MTBF/MTTR, work compliance, safety) and use them to drive decisions.
  • Partner across functions: work with Cloud/Compute, Network, Security, and Capacity Planning to optimize performance, cost, and resiliency across the fleet.
  • Manage vendors & colos: own contracts, SLAs, and execution for rack deliveries, PDUs, fiber/copper, and lifecycle PMs; validate colo topology and compliance.
  • Raise the safety bar: enforce a zero‑injury EHS culture; conduct drills/audits for life safety, physical security, and data protection.
  • Forecast and budget: build data‑backed plans for power, spares, headcount, and projects; track OpEx/CapEx with rigor.

We expect you to have:

  • Associate's degree or trade certification in Electrical/Mechanical/Industrial Engineering (or equivalent experience).
  • 10+ years in electrical/mechanical/HVAC/controls within industrial/commercial settings, 5+ years specifically in data center or mission‑critical facilities.
  • Team leadership experience in 24/7 sites (managing leads/techs, vendors, and on‑call operations).
  • Deep, hands‑on knowledge of UPS/generators/switchgear, chillers/CRAC/CRAH, fire detection/suppression, BMS/EPMS/DCIM, and structured cabling (copper & fiber).
  • Proven strength in incident management, RCA/Corrective Actions, change management, and vendor/contract oversight.
  • Data‑driven mindset with the ability to forecast resources and make analytics‑backed decisions (Excel; SQL/scripting a plus).
  • Excellent written/verbal communication with comfort presenting to executives and guiding field teams during live events.
  • Ability to travel up to ~25% and support after‑hours escalations when needed.

It would be an added bonus if you have:

  • Bachelor's degree in Electrical/Mechanical/Industrial Engineering, Engineering Management, or Reliability Engineering.
  • Hyperscale/colo experience with reliability‑centered maintenance, predictive analytics, and Lean/Six Sigma practices.
  • Familiarity with Linux fundamentals, network equipment installation/troubleshooting, and fiber optics testing.
  • Experience with Jira, Confluence, ServiceNow (or similar); strong SOP/MOP/EOP authorship.
  • Certifications such as CDCP, DCM, PMP, OSHA‑30, ITIL, or Uptime‑aligned credentials.

Key employee benefits:

  • Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families.
  • 401(k) plan: up to 4% company match with immediate vesting.
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
  • Remote work reimbursement: up to $85/month for mobile and internet.
  • Disability & life insurance: company-paid short-term disability, long-term disability, and life insurance coverage.

Compensation

We offer competitive salaries, ranging from $90k to $140k base, plus quarterly performance bonuses.

Join Nebius Today!

What we offer 

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within Nebius.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.

We're growing and expanding our products every day. If you're up for the challenge and as excited about AI and ML as we are, join us!
