Infrastructure Operations Engineer

Overview
Job Description
Our mission at Tensorwave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.
About the role
We are seeking an Infrastructure Operations Engineer to join our growing infrastructure team.
This role is ideal for someone who thrives in hardware-centric environments, enjoys hands-on datacenter and system administration work, and can build reliable automation around large-scale infrastructure.
You will be responsible for managing enterprise hardware, monitoring systems, network operations, infrastructure automation, and supporting our compute clusters across multiple data centers.
This role touches every layer of modern infrastructure - from bare metal provisioning, to OS and Kubernetes management, to monitoring and troubleshooting hardware.
If you are detail-oriented, resourceful, and comfortable working with both low-level hardware systems and higher-level DevOps tooling, we'd love to talk.
Responsibilities
Manage and maintain enterprise-grade server hardware including diagnostics and break/fix for CPUs, memory, disks, PSUs, and NICs
Operate out-of-band management systems for remote access and recovery - iLO, iDRAC, IPMI, Redfish
Design, build, and maintain infrastructure monitoring and alerting - Prometheus, Grafana, SNMP, or similar
Administer and troubleshoot Linux systems - OS install, boot issues, services, networking, filesystems, and access controls
Own bare-metal provisioning workflows - PXE/UEFI boot and automated node bring-up using MAAS, Foreman, or equivalents
Build and maintain infrastructure automation - shell scripting and CLI tooling to improve reliability and scale operations
Manage core networking - subnets, IP address management, VLANs, routing, NAT, and firewall configuration
Configure and support secure connectivity such as VPNs - IPsec, WireGuard, OpenVPN
Support Kubernetes clusters at the infrastructure layer - node lifecycle, access, basic troubleshooting, and scaling
Partner with internal teams to ensure compute clusters remain reliable, secure, and scalable across multiple data centers
Required Experience
Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
Proven experience managing enterprise-grade hardware at scale
Expertise with automation languages such as Python, Go, PHP, or Perl
Strong understanding of out-of-band management systems - IPMI, BMC, Redfish
Hands-on expertise with monitoring systems - Prometheus, Grafana, SNMP, Nagios, CheckMK, or similar
Solid knowledge of network administration - firewalls, routing, VPNs, NAT, and managed switches
Linux system administration experience - installation, configuration, troubleshooting
Experience with filesystems - RAID, partitioning, and general storage management
Familiarity with certificate management, key-based authentication, and cryptographic functions
Experience with bare-metal provisioning - MAAS, Foreman, or similar
Understanding of PXE/UEFI/HTTP boot systems
Ability to write functional, maintainable bash scripts for automation
Nice to Have
Experience with Kubernetes - operators, cluster scaling, CRDs
Experience with Helm chart customization
Exposure to high-availability or distributed compute environments
Knowledge of infrastructure security and hardening practices
What We Bring
Mission-driven company
Competitive Salary
Stock Options
100% paid Medical, Dental, and Vision insurance
Flexible PTO
Paid Holidays
401(k)
Parental Leave
Flexible Spending Account
Short Term Disability Insurance
Life and Voluntary Supplemental Insurance
Mental Health Benefits through Spring Health
We're looking for resilient, adaptable people to join our team: people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at exascale. Be prepared to be pushed daily, to learn a lot, and to literally build the future.
Tensorwave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.
