Lead Data Engineer
Health Catalyst
Join one of the nation’s leading and most impactful healthcare performance improvement companies. Over the years, Health Catalyst has achieved and documented clinical, operational, and financial improvements for many of the nation’s leading healthcare organizations. We are also increasingly serving international markets. Our mission is to be the catalyst for massive, measurable, data-informed healthcare improvement through:
Data: integrate data in a flexible, open & scalable platform to power healthcare’s digital transformation
Analytics: deliver analytic applications & services that generate insight on how to measurably improve
Expertise: provide clinical, financial & operational experts who enable & accelerate improvement
Engagement: attract, develop and retain world-class team members by being a best place to work
Role: Lead Data Engineer
Location: Hyderabad, IN
The healthcare industry is the next great frontier of opportunity for software development, and Health Catalyst is one of the most dynamic and influential companies in this space. We are working on solving national-level healthcare problems, and this is your chance to improve the lives of millions of people, including your family and friends. Health Catalyst is a fast-growing company that values smart, hardworking, and humble team members. Each product team is a small, mission-critical team focused on developing innovative tools to support Catalyst’s mission to improve healthcare performance, cost, and quality.
Health Catalyst is expanding and maintaining a large suite of Improvement Apps that contribute to healthcare analytics and process improvement solutions. This includes products that manage the care of health system populations, better serve patients at the point of care, reduce health system costs, and reduce clinician workload.
Job Summary:
As a Lead Data Engineer, you will work with a diverse Improvement Apps software engineering team to design, develop, and maintain various platforms that serve internal HCAT team members, clinicians, and patients. You will rely on Test-Driven Development to safely enhance and refactor our system, shipping production code multiple times per week. And you will go to bed each night with the comfort that your code is improving outcomes for patients.
If you love…
Help drive clarity and prototype individual features or problems
Knowledge of architecture patterns and the ability to design and complete features / tasks that are 50-60% well defined.
Can discern where gaps can be filled in without consulting a Product Manager or another programmer and can judge when a consultation is needed.
Work is reviewed with the occasional need for material direction or implementation changes
Seeks and provides guidance via PR reviews, pair-programming and other interactions with Engineers and Product Managers
It is second nature to develop high code quality standards balanced with the needs of real-world customer timelines.
Possesses a passion and drive to deliver exceptional products and follows established patterns and approaches within existing code bases with ease.
Takes ownership of learning and growth
Capitalizes on internal and external opportunities for learning.
Identifies gaps in knowledge/skills and seeks ways to close those gaps (self-guided learning, pairing, seeking guidance for yourself and developing guidance for less experienced members of the team)
Periodic On Call Rotation
Ability to communicate with Customer Success about customer issues that are escalated to Engineering and help quantify customer impact.
Can respond quickly to operational emergencies, find short-term resolutions, and plan long-term fixes to avoid similar issues in the future.
What you own in the role:
Design, develop, and optimize complex SQL queries, stored procedures, and data models to support large-scale analytics and reporting pipelines for patient engagement and clinical outcomes data.
Architect and implement scalable data ingestion, transformation, and processing workflows using PySpark on Databricks, ensuring high performance and reliability across batch and streaming pipelines.
Lead the design and implementation of enterprise-grade data platforms, including Delta Lake architecture on Databricks, enforcing data quality standards, partitioning strategies, and schema evolution best practices aligned with Health Catalyst's ML/AI services.
Build and maintain robust ETL/ELT pipelines to acquire data from primary and secondary sources — including relational databases, HL7/FHIR feeds, and flat files — integrating them into unified data products and analytics-ready datasets.
Develop and enforce data quality frameworks using Databricks Delta Live Tables and custom PySpark validation logic to proactively detect, flag, and resolve data integrity issues across pipelines.
Collaborate with data science, ML engineering, and product teams to translate business and analytical requirements into scalable data infrastructure, and drive prioritization of data platform improvements aligned with organizational goals.
Continuously evaluate and identify opportunities to refactor legacy SQL-based workflows into optimized PySpark pipelines, improving pipeline throughput, cost efficiency, and maintainability on the Databricks platform.
What you bring to this role:
Bachelor's degree or equivalent practical experience preferred.
Strong working knowledge of SQL
Technical expertise in data models, database design and development, data mining, and segmentation techniques
Strong knowledge of and experience with reporting software such as Power BI, BusinessObjects, Looker, Tableau, etc. (Looker experience preferred)
Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy, in a timely manner
Adept at constructing efficient queries, writing reports and presenting findings
Ability to manage multiple and simultaneous responsibilities and to prioritize scheduling of work
Strong verbal and written communication skills
An understanding of healthcare data is a plus, but not a requirement
You may also bring:
Experience with cloud infrastructure and architecture patterns, either Azure or AWS preferred.
Software development experience within healthcare IT and an understanding of key data models (clinical, claims, financial, etc.) and interoperability standards such as HL7v2, CDA, EMR, and FHIR
Knowledge of healthcare compliance and how it applies to Application Security
Agile/Scrum software development practices
Business Intelligence or Data warehousing experience
Preferred Experience and Education:
BS/BA or MS in computer science, information systems, or another technology/science field.
A minimum of 7 years of experience in building commercial software, SaaS, or digital platforms.
Please note: We currently have multiple available roles and are open to various levels of experience to fill those roles. We will consider junior, mid, and senior level experience on a case-by-case basis. If you feel this role is a match for your skills and experience, we encourage you to apply.
Information Security and Compliance Responsibilities:
- Maintain compliance with training directives required by the organization pertaining to Information Security, Acceptable Use Policy and HIPAA Privacy and Security.
- Adhere to and comply with the organization's Acceptable Use Policy.
- Safeguard information system assets by identifying and reporting potential and actual security events to the organization's Security and Compliance Officers.
The above statements describe the general nature and level of work being performed in this job function. They are not intended to be an exhaustive list of all duties, and indeed additional responsibilities may be assigned by Health Catalyst.
Studies show that candidates from underrepresented groups are less likely to apply for roles if they don’t have 100% of the qualifications shown in the job posting. While each of our roles has core requirements, please thoughtfully consider your skills and experience and decide if you are interested in the position. If you feel you may be a good fit for the role, even if you don’t meet all of the qualifications, we hope you will apply. If you feel you are lacking the core requirements for this position, we encourage you to continue exploring our careers page for other roles for which you may be a better fit.
At Health Catalyst, we appreciate the opportunity to benefit from the diverse backgrounds and experiences of others. Because of our deep commitment to respect every individual, Health Catalyst is an equal opportunity employer.