Are you inspired by innovation, hard work and a passion for data?
If so, this may be the ideal opportunity to leverage your Software Engineering, Data Engineering or Data Analytics experience to design, develop and innovate big data solutions for a diverse set of clients.
At phData, our proven success has driven a sharp rise in demand for our services, resulting in quality growth and an expanded presence at our company headquarters, conveniently located in Downtown Minneapolis (Fueled Collective).
As the world’s largest pure-play Big Data, Machine Learning and Data Science services firm, our team includes Apache committers, Machine Learning experts and the most knowledgeable Scala development team in the industry. phData has earned the trust of customers by demonstrating our mastery of Big Data and Machine Learning services and our commitment to excellence.
In addition to a phenomenal growth and learning opportunity, we offer competitive compensation and excellent perks, including base salary, annual bonus, extensive training, paid Cloudera certifications, generous PTO and employee equity.
As a Machine Learning Engineer, your responsibilities include:
- Convert proofs of concept to production-grade solutions that can scale for hundreds of thousands of users
- Create and manage machine learning pipelines on a Hadoop cluster to support any kind of model deployment on streaming or batch data.
- Tackle challenging problems, such as developing web services and ETL pipeline components, to productize and evaluate machine learning models
- Write production code and collaborate with Solutions Architects and Data Scientists to implement algorithms in production
- Design, conduct, and analyze experiments to validate proposed ML modeling approaches as well as improvements to existing ML pipelines
Qualifications:
- Previous experience as a Software Engineer, Data Engineer or Data Scientist (with hands-on engineering experience)
- Solid programming experience in Python or a statically typed language such as Java or Scala
- Hands-on experience in one or more big data ecosystem products/languages such as Spark, Impala, Solr, or Kudu
- Experience working with Data Science/Machine Learning software and libraries such as h2o, TensorFlow, Keras, or scikit-learn
- Strong working knowledge of SQL and the ability to write, debug, and optimize distributed SQL queries
- Excellent communication skills; previous experience working with internal or external customers
- Strong analytical abilities; ability to translate business requirements and use cases into a Hadoop solution, including ingestion of many data sources, ETL processing, data access, and consumption, as well as custom analytics
- Four-year bachelor's degree in Computer Science or a related field, or equivalent years of professional working experience
Keywords: Hive, Apache Spark, Java, Apache Kafka, Big Data, Spark, Solution Architecture, Cloudera, Apache Pig, Hadoop, NoSQL, Cloudera Impala, Scala, Python, Data Engineering, Big Data Analytics, Large Scale Data Analysis, ETL, Linux, Kudu, Pandas, TensorFlow, h2o, R, Keras, PyTorch, scikit-learn, Machine Learning, Machine Learning Engineering, Data Science, PySpark, NLP