Supporting our software developers and Data Scientists on data initiatives, you will ensure the consistency of the data delivery architecture throughout our ongoing projects.
- Owning our label acquisition processes, making sure the volume, quality and integrity of our labels are adequate to train and test models.
- Creating and maintaining large datasets derived from our labels, so that they are easy to use, scalable and consistently formatted.
- Supporting the running of our Machine Learning models in production by implementing various I/O components within our big data pipelines, and automating processes around continuous testing and debugging of models.
- Building and maintaining analytic tools to provide actionable insights that form part of our product features, directly bring value to our customers, and allow us to better understand key business performance metrics.
- Working with stakeholders across the Engineering, Data Science, and Customer Success teams to assist with data-related technical issues and support their data infrastructure needs.
If you are a fast learner who thrives in challenging environments and has a creative yet pragmatic approach to problem-solving, read on!
- A Master's degree in Computer Science or Software Engineering, or proven working experience with large volumes of data.
- Willingness to work across a diverse set of technologies, and ability to ramp up on new technologies quickly.
- Proficiency in the C# and Python programming languages, as well as Git.
- Deep knowledge of big data streaming patterns and technologies such as Kafka / EventHub, NiFi and Kubernetes.
- Good understanding of relational (SQL) databases and unstructured / NoSQL databases (e.g. MongoDB, InfluxDB).
- An eye for detail and the ability to acquire the domain knowledge necessary to spot incorrect data early and deliver with quality.
- Knowledge of monitoring technologies such as Grafana or Prometheus.
- Excellent written and verbal communication skills.
- Experience with testing approaches for complex, multi-stage data pipelines.
- Functional knowledge of a cloud computing platform (preferably Microsoft Azure) and serverless computing.