Member of Technical Staff, Data Acquisition
Cohere
This job is no longer accepting applications
Toronto, ON, Canada
Posted on Thursday, July 20, 2023
Who are we?
We’re a team of engineers, thinkers, and champions whose aim is to give technology language. Every day our team is breaking new ground, as we build transformational AI technology and products for enterprises and developers who wish to harness the power of Large Language Models.
We're driven by ambition, as we firmly believe that our technology has the potential to revolutionize the way industries engage with natural language. Our strong technical foundation speaks for itself, with our team composed of world-class experts who have collectively accumulated hundreds of thousands of citations in academia.
The Cohere team is a collective of college dropouts, PhDs, alumni of big tech and scrappy start-ups, new grads and career pivots, who believe a diverse team is the key to a safer, more responsible technology. At Cohere, work isn’t the opposite of play, as we build the future of language AI with team members on almost every continent in the world, working from high rises, cabins, tour buses, and dog-friendly offices.
There’s no better time to herald the next step with us as we shape the future of Generative AI.
Why this role?
At Cohere, we strive to continually improve our large language models. Academic research and real-world experience have demonstrated that high-quality, diverse datasets can contribute as much to the performance and capabilities of LLMs as the underlying model architecture and training regimen. We at Cohere believe data will play a central role in accelerating the advancement of our already world-class language models.
Data is therefore critical to our success. Our ability to acquire data that is accurate, relevant, and timely is key to our ability to improve the quality of our models. We strive to continuously improve our data acquisition processes and systems to ensure that we have the data we need to stay competitive and meet the needs of our customers. We run frequent experiments to learn more about the role of data in model quality, from data mixtures, to cleaning techniques, to quality control.
This role will be part of the Data Acquisition team, which broadly provides data for training models and is responsible for building and maintaining the infrastructure that acquires, cleans, and formats data for model training. We are looking for a technically skilled, resourceful problem-solver who is able to work in areas of ambiguity and find efficient and sometimes creative solutions. The main responsibility of this role is to improve our internal data acquisition infrastructure, which includes data crawlers, formatters, and integrations with data providers. This role would also work closely with different teams at Cohere to support their data acquisition needs, as well as engage in more experimental work to develop highly informative data signals.
Please Note: We have offices in Toronto, Palo Alto, and London but embrace being remote-first! There are no restrictions on where you can be located for this role.
As a Member of Technical Staff, Data Acquisition, you will:
- Develop data pipelines to acquire, prepare, and integrate high-quality datasets into model training and evaluation
- Collaborate with research and product teams to identify, prioritize, and secure new data sources
- Enhance and develop infrastructure for data management, pipeline orchestration, and MLOps, while avoiding premature optimization
- Run experiments using new data inputs, preprocessing techniques, and data mixtures
You may be a good fit if:
- You have more than 2 years of experience working on a software or machine learning engineering team
- You have proficiency in Python and have used distributed processing technologies like Spark, Dask, etc.
- You have experience building data pipelines and ETL processes for large-scale datasets
- You have experience working with unstructured and/or human-annotated data (e.g., collecting or assessing sample quality)
- You have experience with ML frameworks such as TensorFlow, TF-Serving, JAX, and XLA/MLIR
- You have strong communication and problem-solving skills, preferring the right tool for the job even if it’s outside your wheelhouse
- You feel comfortable actively reading academic literature and researching state-of-the-art NLP best practices
- You have a demonstrated passion for applied NLP models and products
If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you consider yourself a thoughtful worker, a lifelong learner, and a kind and playful team member, Cohere is the place for you.
We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.
Our Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Free daily lunch
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, with offices in Toronto, Palo Alto, San Francisco, and London, and a co-working stipend
✈️ 6 weeks of vacation
Note: This post is co-authored by both Cohere humans and Cohere technology.