Salary: $125k–$150k plus a 15–20% bonus and benefits
We are looking for a remote Data Engineer to join our growing, dynamic Engineering team: someone with a technical background who enjoys solving complex problems and has professional experience owning ETL processes. The Data Engineer will be primarily responsible for building new pipeline components and maintaining existing ones across a complex technology stack that spans a variety of languages and frameworks.
The Engineering team is responsible for data health and quality in every step of the pipeline process, from initial ingestion to deployment and visualization. As a result, debugging can require a deep dive into several interfacing pieces of software, and on any given day a Data Engineer can expect to work on multiple components that perform very different functions.
You will be tasked with the following:
- Manage, modify, and maintain our proprietary software for storing and transforming data from a wide variety of sources and delivery methods
- Design and build new components that scale to efficiently ingest, normalize, and process data from a growing number of different sources
- Run distributed computing jobs using Databricks/Spark to prepare and transform terabytes of time-series and event data for modeling (see the sketch after this list)
- Integrate external APIs into our products and use their data to streamline and add value to existing offerings
- Assist DevOps with optimization of company infrastructure
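To give a flavor of the Spark work mentioned above, here is a minimal PySpark sketch of a time-series preparation job. The table and column names (raw.events, device_id, ts, value) are hypothetical illustrations, not part of our actual stack:

```python
# Minimal PySpark sketch of a time-series preparation job.
# Table and column names (raw.events, device_id, ts, value) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-prep").getOrCreate()

# Terabyte-scale raw event data registered as a table in the metastore.
events = spark.read.table("raw.events")

# Normalize timestamps and drop duplicate readings per device.
clean = (
    events
    .withColumn("ts", F.to_timestamp("ts"))
    .dropDuplicates(["device_id", "ts"])
)

# Roll events up into hourly per-device aggregates for downstream modeling.
hourly = (
    clean
    .groupBy("device_id", F.window("ts", "1 hour").alias("hour"))
    .agg(F.avg("value").alias("avg_value"), F.count("*").alias("n_events"))
)

hourly.write.mode("overwrite").saveAsTable("curated.events_hourly")
```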
Qualifications & Skills
- 2+ years of experience using Python 3 and its data science libraries, including Pandas and PySpark (Spark/Databricks)
- Strong in at least one language other than Python; experience with shell scripting, especially Bash
- Proficient with different flavors of SQL, especially PostgreSQL, including an understanding of under-the-hood concepts like indexing and query-plan analysis (see the EXPLAIN sketch after this list)
- Experience automating DevOps processes in a cloud environment
- Experience extracting data from, and pushing data to, a variety of sources including relational and non-relational databases, RESTful APIs, flat files, FTP servers, and distributed file systems
- Experience with Agile / Scrum development methodologies
- Experience with “XaaS” cloud services; we are an AWS shop but will consider candidates with similar experience on other cloud platforms
- Excellent communication skills, both written and oral, especially when explaining difficult technical concepts to people in non-technical roles
- Strong analytical skills, especially when working with multiple large datasets
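To make the query-plan expectation concrete, here is a minimal sketch of inspecting a PostgreSQL plan from Python, assuming psycopg2 and a hypothetical events table and connection string:

```python
# Sketch: inspecting a PostgreSQL query plan from Python with psycopg2.
# The DSN, table, and column names below are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=analytics user=etl")  # hypothetical DSN
with conn, conn.cursor() as cur:
    # EXPLAIN (ANALYZE, BUFFERS) executes the query and reports the actual
    # plan, showing whether the planner chose an index scan or a seq scan.
    cur.execute("""
        EXPLAIN (ANALYZE, BUFFERS)
        SELECT device_id, avg(value)
        FROM events
        WHERE ts >= now() - interval '1 day'
        GROUP BY device_id;
    """)
    for (line,) in cur.fetchall():
        print(line)
```

When reading output like this, the usual things to check are index versus sequential scans, the accuracy of row estimates, and buffer usage.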