AI/ML Jobs

Current openings in artificial intelligence, machine learning, and deep learning.

Backend Data Engineer at Brightfield Group (Chicago, IL)

Our tech team is looking for an experienced data engineer to join us in moving data through the cloud. We primarily use GCP, but we also run workloads on other cloud platforms.

The ideal candidate has experience building backend data pipelines and is comfortable navigating remote VMs and using containerization technology. We write most of our code in Python and deploy it to GCP. You will be expected to build scalable data ingress and egress pipelines across data storage products, deploy new ETL pipelines, and diagnose, troubleshoot, and improve existing data architecture.

We are working toward a modern data stack, and you will be a huge part of getting us there. If you know Docker and Kubernetes, great; if you don't, you will have a chance to learn them. We strive to keep all ongoing development on current-generation, best-of-breed technology, and we are open to trying new frameworks with strong value propositions. We need someone who can both bring new ideas to the group and execute quickly to help us build a solid data architecture.

On a typical day, you could expect to troubleshoot a bug in our existing ETL pipeline, make a PR with the change, and deploy the code soon after. Next, perhaps you're spending some time working on new processing flows to munge data into shape for BI reports, or deploying new machine learning models for data processing. You'll help build and audit the pipelines that move raw data into the staging region, and develop tests to ensure the data is clean enough for production.
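To give a flavor of this kind of work, here is a minimal, hypothetical sketch of an ETL transform step in Python. The schema, field names, and cleaning rules are illustrative assumptions, not part of Brightfield's actual codebase:

```python
from typing import Iterable

def transform(records: Iterable[dict]) -> list[dict]:
    """Normalize raw records before loading: drop rows missing an id,
    strip stray whitespace, and coerce the amount field to float."""
    cleaned = []
    for rec in records:
        if not rec.get("id"):
            continue  # skip rows with no primary key
        cleaned.append({
            "id": rec["id"],
            "name": rec.get("name", "").strip(),
            "amount": float(rec.get("amount") or 0),
        })
    return cleaned

# Hypothetical raw input with the kinds of defects a pipeline must handle.
raw = [
    {"id": 1, "name": "  Acme ", "amount": "19.99"},
    {"id": None, "name": "orphan row"},   # dropped: no id
    {"id": 2, "name": "Globex", "amount": None},
]
print(transform(raw))
```

Real pipelines would read from and write to cloud storage or a warehouse, but the shape of the work (validate, normalize, load) is the same.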

Key Responsibilities:

  • Create and deploy ETL pipelines on our cloud architecture
  • Troubleshoot and fix bugs; diagnose data issues
  • Work independently to solve issues
  • Write unit and integration tests
  • Provide training to team members on areas of expertise
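As a concrete illustration of the testing bullet above, a data-quality unit test might look like the following sketch. The `is_clean` check and its rules are hypothetical, invented here for illustration:

```python
def is_clean(row: dict) -> bool:
    """Hypothetical staging check: a row is promotable to production only
    if it has a non-empty id and a non-negative numeric amount."""
    return (
        bool(row.get("id"))
        and isinstance(row.get("amount"), (int, float))
        and row["amount"] >= 0
    )

# pytest-style unit tests for the check above.
def test_rejects_missing_id():
    assert not is_clean({"amount": 5.0})

def test_rejects_negative_amount():
    assert not is_clean({"id": "a1", "amount": -1})

def test_accepts_valid_row():
    assert is_clean({"id": "a1", "amount": 0})
```

Tests like these run in CI before a pipeline change ships, which is how "clean enough for production" gets enforced in practice.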

Skills:

  • Fluency in one or more programming languages such as Python, JavaScript, or Java
  • Experience writing ETL pipelines on a cloud infrastructure, especially in Python
  • Solid understanding of SQL and RDBMS 
  • Understanding of distributed data processing (Hadoop, map/reduce, etc.)
  • Familiarity with evaluating and deploying ML models
  • Having at least heard of (and hopefully read about) Docker and Kubernetes
  • Google Cloud or AWS experience is a nice-to-have, as is experience with remote VMs, serverless technology, and distributed data processing