We are looking for a data engineer, preferably with some audio processing experience, to join our ML team. You will transform raw data into formats that machine learning models can consume. The role involves building infrastructure to capture the data streams that flow into our servers and to collate them. You'll also work with our data scientists to explore statistical methods. In all, you'll own the data collation platform.
To succeed in this data engineering position, you should have a strong ability to build platforms and automate services that collate and organize data from different sources. You are a strong programmer with attention to detail and good analytical skills. Knowledge of audio processing would be a huge bonus.
Viva Translate is led by a team of engineers from top companies and institutions, such as Google and Stanford, with a shared dream. We are creating a world where language and culture are no longer barriers to work and opportunity, and we are starting across Latin America.
Technology is at the core of our product. Viva is building a tool that helps people read, write, and speak better across English, Spanish, and Portuguese. We have an incredible dream for the future of borderless work. But we know that great dreams start with great people.
What's your story? If you too are an explorer, a dreamer, and a builder, we'd love to meet you.
What you'll be doing
- Analyze and organize raw audio
- Build data systems and pipelines
- Prepare data for predictive modeling
- Explore ways to enhance data quality, reliability, and security
- Develop analytical tools
- Collaborate with data scientists & architects, and human transcribers & translators
- Our ML tech stack includes Python/Django, AWS, CI/CD, Terraform, BERT, spaCy
Must-have skills 💪
- 1+ years of experience as a data engineer or in a similar role
- Knowledge of programming languages (e.g., Python)
- Hands-on experience with SQL database design (e.g. Postgres)
- Effective communication with team members of diverse technical backgrounds
Nice to haves 🍒
- Degree in Computer Science, IT, or similar fields
- Experience with project management tools (e.g., GitHub, Asana)
- Experience in handling audio streaming data
- Prior experience building data platforms
- Experience with cloud providers (e.g., AWS, GCP, Azure)
- Experience with distributed/streaming data-processing technologies and frameworks (e.g. Scala, Apache Spark, Databricks, Apache Kafka, Redpanda, CockroachDB)
- Fluency in Spanish and/or Portuguese
Our values 💛
- Always leveling up - we take pride in setting new standards and taking ownership of our work
- Science-based - we make decisions together based on data and logic
- Open integrity - we promote a low ego environment that treasures transparency, empathy, and feedback
- Playfulness - we champion diversity and creativity, and want everyone to be unafraid to fail & fail quickly
What we offer ✨
- Fully remote team 🌎
- 3+ in-person retreats annually (past locations include Mexico, Colombia & Ecuador)
- Join an early-stage startup (12 people & growing 🚀)
- Home office stipend
- Health & fitness benefits
- Learning stipend - we are here to support your personal & professional development journey