● Design and develop highly scalable, end-to-end pipelines for processing and analyzing large volumes of complex data.
● Assist the Data Scientist in deploying Machine Learning models.
● Ensure high data quality and integrity from data sources.
● Support the team with data or analytics requests.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Experience building and optimizing ‘big data’ pipelines, architectures, and data sets.
● An enormous sense of ownership
● Hands-on programming experience in Java, Python, or Scala
● Solid understanding of SQL and NoSQL databases and other data manipulation tools
● Experience with workflow scheduling and monitoring tools (Airflow, Oozie, or Azkaban)
● Experience with Hadoop is a huge plus
● Experience with stream-processing systems: Storm, Spark Streaming, etc.
● Familiarity with cloud-based infrastructure (AWS, GCP, etc.) and UNIX environments
© Copyright 2018 | GreyFinders Company.