
IT Data Engineering JG3

Job Description
This is not a standard data engineer role.
We are looking for a deeply technical, hands-on individual contributor who can:
- Diagnose performance, latency, and cost issues in a large-scale cloud data platform
- Take a top-down, platform-level view across multiple projects
- Improve the platform's architecture, efficiency, and cost profile, not just write Spark code
- Act as a technical problem-solver and mentor, guiding other data engineers
This person is expected to make the platform better, not just execute tasks.
Current Platform & Architecture (Very Important)
Data Flow:
- On-premise systems → Cloud (Azure)
- Streaming ingestion → Azure Data Lake Storage (ADLS)
- Data processed into two separate containers (see the sketch after the technology list):
  - Crude trading
  - Product trading
Technologies in Use:
- Qlik Replicate (formerly Attunity)
  - Streaming data from on-prem to Azure
- Azure Data Lake Storage (ADLS)
- Databricks
- Delta Live Tables (DLT)
- Spark / PySpark
- Python
- SQL (complex queries and procedures)
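For orientation, the container split is typically expressed along these lines in a DLT pipeline (a minimal sketch that assumes the DLT runtime, where `spark` and `dlt` are provided; the landing path, table names, and the `commodity_type` column are illustrative, not our actual schema):

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw trades landed in ADLS by Qlik Replicate")
def raw_trades():
    # Auto Loader incrementally ingests files as Qlik Replicate lands them.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")  # assumed landing format
        .load("abfss://landing@<storage-account>.dfs.core.windows.net/trades/")
    )

@dlt.table(comment="Crude trading container")
def crude_trades():
    # commodity_type is a hypothetical discriminator column.
    return dlt.read_stream("raw_trades").where(col("commodity_type") == "CRUDE")

@dlt.table(comment="Product trading container")
def product_trades():
    return dlt.read_stream("raw_trades").where(col("commodity_type") == "PRODUCT")
```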
Key Challenges the Role Is Meant to Solve
1. Data Latency
   - High-volume streaming data
   - End-to-end latency issues that need root-cause analysis
2. Databricks / DLT Cost Spikes
   - DLT costs are far higher than expected
   - Known contributors:
     - Very high data volume (expected)
     - Inefficient lookup logic used to split data into the separate containers (see the sketch after this list)
   - The current solution works but is not optimal
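To make the lookup concern concrete: if the split is driven by joining each micro-batch against a reference table, the join strategy itself is often the cost driver. Below is a hedged sketch of one cheap fix to prototype, assuming an active SparkSession `spark`; every table and column name here is invented for illustration:

```python
from pyspark.sql.functions import broadcast, col

# Hypothetical small reference table mapping instruments to a trading book.
instrument_book = spark.read.table("ref.instrument_book")

trades = spark.readStream.table("raw_trades")

# A plain join shuffles the full stream against the lookup on every
# micro-batch; broadcasting the small side eliminates that shuffle.
routed = trades.join(broadcast(instrument_book), on="instrument_id", how="left")

crude = routed.where(col("book") == "CRUDE")
product = routed.where(col("book") == "PRODUCT")
```

Whether a broadcast join, a derived partition column, or something else wins is exactly the kind of POC we expect this person to design and run.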
This role exists because generic recommendations are not enough.
What We Do Not Want
- Someone who has:
  - Only written Spark notebooks
  - Only followed architectural guidance
  - Only worked at a surface level
- Someone who:
  - Needs strict 9–5 boundaries
  - Avoids ambiguity or deep technical investigation
- Someone whose resume was “AI-polished” but does not reflect real experience
What We Do Want
Technical Depth
- Deep understanding of:
  - Databricks internals
  - Spark engine behavior
  - Performance tuning and optimization
- Ability to:
  - Analyze pipelines end-to-end
  - Identify architectural inefficiencies
  - Propose and prove better approaches via POCs
- Comfortable challenging Databricks as a product:
  - Gather evidence (see the sketch below)
  - Support escalation discussions with Databricks engineers
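By “gather evidence” we mean measurements, not impressions. For example, Structured Streaming already exposes per-trigger timing that localizes latency (a minimal sketch using standard PySpark APIs; `query` is assumed to be an active StreamingQuery handle):

```python
# 'query' is assumed: the handle returned by writeStream.start().
progress = query.lastProgress  # metrics for the most recent micro-batch

# durationMs splits each trigger into phases; a slow 'addBatch' points at
# the sink or transform, a slow 'getBatch' at the source.
for phase, ms in progress["durationMs"].items():
    print(f"{phase}: {ms} ms")

print("processed rows/sec:", progress["processedRowsPerSecond"])
```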
Programming & Data Skills
- Strong Python (mandatory)
- PySpark (advanced, not basic)
- Advanced SQL:
  - Complex queries
  - Stored procedures
  - Analytical logic
Working Style
- Hands-on individual contributor
- Collaborative with data engineers
- Willing to:
  - Review others’ solutions
  - Build POCs independently
  - Demonstrate better outcomes (performance, cost, scalability)
Role Scope
- Will work across multiple projects
- Acts as a cross-platform technical expert
- Evaluates:
  - Architecture
  - Cost drivers
  - Scalability
  - Reusability for future programs