Senior Data Engineer

Hatchify, Inc. · New York City, NY

Job Description

HATCH

https://www.usehatchapp.com/

Senior Data Engineer

MUST BE BASED IN NYC (no relocation offered)

Visa sponsorship is not available

About Hatch

At Hatch, we're building AI that doesn't just assist behind the scenes; it converses with customers out in the wild. Backed by Y Combinator and top-tier investors like Bessemer and NextView, we're scaling fast, doubling revenue year over year, and looking for A-players to help us cement our place as the category leader in AI for customer engagement.

About the Role

We are looking for a skilled Data Engineer with proven software development experience to join our growing data team. You will build, optimize, and maintain data pipelines and platform services that power our analytics, reporting, and AI initiatives.

The critical requirement: You must have already built production APIs, SDKs, or backend services in previous roles. We need someone who brings software development expertise to data engineering, not someone looking to learn. If you haven't designed APIs, applied design patterns in production code, or shipped services that other engineers consume, this role is not the right fit.

This is not a business intelligence, analytics, or pure SQL/ETL role. Candidates whose experience is primarily dashboards, reports, or configuring low-code tools will not be successful here.

Key Responsibilities

  • Design and build scalable batch and real-time data pipelines using Kinesis, Pub/Sub, Flink, Spark, Airflow, and dbt.

  • Architect and implement multi-tier data lake architectures with raw/staging/curated layers, defining promotion criteria, data quality gates, and consumption patterns (a sketch of this layering follows this list).

  • Develop and maintain production-quality APIs, SDKs, and backend services that integrate with data infrastructure.

  • Apply software engineering best practices (modular design, design patterns, testing, CI/CD, observability, and code reviews) to all data platform work.

  • Model and optimize datasets in BigQuery and Aurora PostgreSQL with attention to performance, cost, and governance.

  • Collaborate with backend teams to define data contracts, streaming interfaces, and service boundaries.

  • Implement infrastructure-as-code (Terraform, Docker, Kubernetes/EKS) for deployment automation.

  • Establish and monitor SLOs for data quality, latency, and availability; troubleshoot production issues across distributed systems.
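
To make the layered-architecture responsibility concrete, here is a minimal, dependency-free Python sketch of a raw → staging → curated promotion with a quality gate at the layer boundary. The field names, the 99% pass-rate threshold, and the layer shapes are illustrative assumptions for this posting, not Hatch's actual pipeline.

    # Raw -> staging -> curated promotion with a data-quality gate.
    # All names and thresholds here are illustrative assumptions.
    from dataclasses import dataclass
    from datetime import datetime, timezone

    @dataclass
    class QualityReport:
        total: int
        passed: int

        @property
        def pass_rate(self) -> float:
            return self.passed / self.total if self.total else 0.0

    def quality_gate(records, required):
        """Keep only records whose required fields are present and non-null."""
        good = [r for r in records
                if required <= r.keys() and all(r[k] is not None for k in required)]
        return good, QualityReport(total=len(records), passed=len(good))

    def promote_to_staging(raw):
        staged, report = quality_gate(raw, required={"event_id", "occurred_at"})
        if report.pass_rate < 0.99:  # promotion criterion (assumed threshold)
            raise RuntimeError(f"staging gate failed at {report.pass_rate:.1%}")
        for r in staged:  # standardize: event times become UTC datetimes
            r["occurred_at"] = datetime.fromisoformat(r["occurred_at"]).astimezone(timezone.utc)
        return staged

    def promote_to_curated(staging):
        # The curated layer exposes only consumption-ready columns.
        return [{"event_id": r["event_id"], "occurred_at": r["occurred_at"]} for r in staging]

    raw_events = [
        {"event_id": "e1", "occurred_at": "2024-05-01T12:00:00+00:00", "debug": "x"},
        {"event_id": "e2", "occurred_at": "2024-05-01T12:05:00+00:00"},
    ]
    print(promote_to_curated(promote_to_staging(raw_events)))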

What We're Looking For

Must-have software development background (non-negotiable):

  • 3+ years building production APIs, SDKs, or backend services in Python, Go, or similar languages.

  • Demonstrated expertise with software design patterns (repository, factory, dependency injection, etc.) applied in real production systems, not theoretical knowledge (a sketch of two of these patterns follows this list).

  • Proven ability to write clean, tested, maintainable code with proper abstractions and error handling.

  • Experience with code reviews, CI/CD pipelines, and production deployments.

  • Strong computer science fundamentals: data structures, algorithms, concurrency, distributed systems.
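
As a reference point for the design-pattern requirement, the sketch below shows a repository injected into a service in plain Python. The UserRepository protocol, SignupService, and in-memory store are hypothetical examples, not part of Hatch's codebase; the point is that the service depends on an interface, so storage can be swapped or faked in tests.

    # Repository pattern plus constructor-based dependency injection.
    # Entities and store are hypothetical; a production implementation
    # would back the repository with a real database.
    from typing import Protocol

    class UserRepository(Protocol):
        def get(self, user_id: str) -> dict | None: ...
        def add(self, user: dict) -> None: ...

    class InMemoryUserRepository:
        """Test double; a production version might wrap PostgreSQL."""
        def __init__(self) -> None:
            self._users: dict[str, dict] = {}
        def get(self, user_id: str) -> dict | None:
            return self._users.get(user_id)
        def add(self, user: dict) -> None:
            self._users[user["id"]] = user

    class SignupService:
        # The repository is injected, so the service is testable in
        # isolation and unaware of storage details.
        def __init__(self, repo: UserRepository) -> None:
            self._repo = repo
        def signup(self, user_id: str, email: str) -> dict:
            if self._repo.get(user_id) is not None:
                raise ValueError(f"user {user_id} already exists")
            user = {"id": user_id, "email": email}
            self._repo.add(user)
            return user

    service = SignupService(repo=InMemoryUserRepository())
    print(service.signup("u1", "u1@example.com"))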

Must-have data engineering experience:

  • 5+ years total engineering experience, with 2+ years focused on data engineering.

  • Hands-on expertise with distributed data technologies: Kafka/Kinesis/PubSub, Spark/Flink, Airflow, dbt, BigQuery.

  • Experience with modern data lake table formats like Apache Iceberg, Delta Lake, or Apache Hudi for advanced schema management and data lake optimization.

  • Experience designing and implementing layered data architectures (raw/landing → refined/standardized → curated/consumption) with appropriate transformations and quality checks at each stage.

  • Strong SQL skills and experience with data modeling (dimensional, event-driven, domain patterns) and query optimization (a toy star schema follows this list).

  • Production experience building both batch and streaming data pipelines.
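
For the dimensional-modeling requirement, here is a toy star schema in SQLite (Python standard library): one fact table resolved through a dimension table, with a typical consumption query. Table and column names are invented for illustration only.

    # Toy star schema: a fact table keyed to a dimension table.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            customer_id  TEXT NOT NULL,  -- natural key from the source system
            segment      TEXT NOT NULL
        );
        CREATE TABLE fact_order (
            order_id     TEXT PRIMARY KEY,
            customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
            order_ts     TEXT NOT NULL,
            amount_usd   REAL NOT NULL
        );
    """)
    conn.execute("INSERT INTO dim_customer VALUES (1, 'c-42', 'smb')")
    conn.execute("INSERT INTO fact_order VALUES ('o-1', 1, '2024-05-01T12:00:00Z', 99.5)")

    # Consumption query: revenue by segment, resolved through the dimension.
    for row in conn.execute("""
        SELECT d.segment, SUM(f.amount_usd) AS revenue
        FROM fact_order f
        JOIN dim_customer d USING (customer_key)
        GROUP BY d.segment
    """):
        print(row)  # ('smb', 99.5)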

Must-have platform experience:

  • Working knowledge of AWS and GCP, including monitoring/troubleshooting (CloudWatch, Prometheus/Grafana).

  • Familiarity with containerization, Kubernetes/EKS, and infrastructure-as-code (Terraform).

  • Exposure to event-driven microservices and schema governance (Parquet/Protobuf/Avro); a sketch of a code-level data contract follows this list.

  • Excellent communication skills: can explain complex systems clearly and collaborate effectively with engineering teams.
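
To illustrate what a data contract can look like in code: the sketch below enforces a versioned event schema at the pipeline boundary using only the standard library. In practice schema governance would go through Avro or Protobuf with a schema registry; the event type and its fields here are assumptions for the example.

    # A producer/consumer data contract enforced in code. Every field
    # name below is an assumption made for this illustration.
    from dataclasses import dataclass, fields

    @dataclass(frozen=True)
    class MessageSentV1:
        """Contract for a 'message sent' event, version 1. Adding optional
        fields is backward compatible; removing or retyping fields is not."""
        event_id: str
        conversation_id: str
        sent_at: str  # ISO-8601 timestamp
        channel: str  # e.g. "sms" or "email"

    def decode(payload: dict) -> MessageSentV1:
        """Reject payloads that violate the contract before they enter the pipeline."""
        expected = {f.name for f in fields(MessageSentV1)}
        missing = expected - payload.keys()
        if missing:
            raise ValueError(f"contract violation, missing fields: {sorted(missing)}")
        # Extra, unknown keys are ignored rather than rejected.
        return MessageSentV1(**{k: payload[k] for k in expected})

    event = decode({
        "event_id": "e-1",
        "conversation_id": "c-9",
        "sent_at": "2024-05-01T12:00:00Z",
        "channel": "sms",
    })
    print(event)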

Nice to Have

  • Experience with ML/LLM pipelines in production (vector databases, feature stores, prompt orchestration).

  • Open-source contributions or work in fast-moving startup environments.

What We Offer

  • Competitive salary and equity

  • Remote (Eastern or Central time zone required) or hybrid work environment (3 days/week in our NYC office)

  • Medical, dental, and vision benefits

  • 401(k) plan

  • Flexible PTO

  • Opportunity to build from the ground floor at a high-growth, mission-driven company

  • Note: visa sponsorship is not offered

Why Hatch

  • Shape the future of AI-driven customer service

  • Build alongside founders and leaders who value speed, ownership, and ambition

  • Solve hard problems that impact real businesses and customers

  • Join a team of builders who care about great engineering, fast execution, and each other
