As a Machine Learning Engineer, your responsibilities include:
- Convert proofs of concept into production-grade solutions that scale to hundreds of thousands of users
- Create and manage machine learning pipelines on a Hadoop cluster to support any kind of model deployment on streaming or batch data
- Tackle challenging problems, such as developing web services and ETL pipeline components, to productize and evaluate machine learning models
- Write production code and collaborate with Solutions Architects and Data Scientists to implement algorithms in production
- Design, conduct, and analyze experiments to validate proposed ML modeling approaches as well as improvements to existing ML pipelines
Requirements:
- Previous experience as a Software Engineer, Data Engineer, or Data Scientist (with hands-on engineering experience)
- Solid programming experience in Python or in a statically typed language such as Java or Scala
- Hands-on experience with one or more big data ecosystem products such as Spark, Impala, Solr, or Kudu
- Experience working with data science and machine learning libraries such as h2o, TensorFlow, Keras, or scikit-learn
- Strong working knowledge of SQL and the ability to write, debug, and optimize distributed SQL queries
- Excellent communication skills; previous experience working with internal or external customers
- Strong analytical abilities; ability to translate business requirements and use cases into a Hadoop solution, including ingestion of many data sources, ETL processing, data access and consumption, and custom analytics
- Four-year bachelor's degree in Computer Science or a related field, or equivalent professional experience
Keywords: Hive, Apache Spark, Java, Apache Kafka, Big Data, Spark, Solution Architecture, Cloudera, Apache Pig, Hadoop, NoSQL, Cloudera Impala, Scala, Python, Data Engineering, Big Data Analytics, Large Scale Data Analysis, ETL, Linux, Kudu, Pandas, TensorFlow, h2o, R, Keras, PyTorch, scikit-learn, Machine Learning, Machine Learning Engineering, Data Science, PySpark, NLP