Data Annotator/AI Data Trainer - Data Scientist (Contractor)
Software Engineering, IT, Data Science
Posted on Thursday, November 9, 2023
Who are we?
Cohere is focused on building large language model (LLM) AI and deploying it into enterprises in a safe and responsible way that drives human productivity, creates magical new ways to interact with technology, and delivers real business value. We’re a team of highly motivated and experienced engineers, innovators, and disruptors looking to change the face of technology.
Our goals are ambitious, but also concrete and practical. Cohere wants to fundamentally change how businesses operate, making everyone more productive and able to focus on doing better what they do best. Every day, our team breaks new ground as we build transformational AI technology and products that let enterprises and developers harness the power of LLMs.
Cohere was founded by three global leaders in AI development, including our CEO, Aidan Gomez, who co-created the Transformer, which makes LLMs possible. Collectively, we're driven by the belief that our technology has the potential to revolutionize the way enterprises, their employees, and customers engage with technology through language.
Cohere’s broader research team is world-renowned, having contributed to the development of sentence transformers for semantic search, dynamic adversarial data collection and red teaming, and retrieval augmented generation, often referred to as “RAG,” among other technological breakthroughs.
We have been deliberate in assembling a team of operational leaders with industry-leading experience, with backgrounds working at the most sophisticated, demanding, and respected enterprises in the world. Cohere’s operational leaders have built, scaled, and led multi-billion-dollar product lines and businesses at Google, Apple, Rakuten, YouTube, AWS, and Cisco.
The Cohere team is a collective from all walks of life, from people who left college to start businesses to some of the most experienced people from globally renowned companies. We believe a diverse team is the key to a safer, more responsible technology, and that different experiences and backgrounds enable us to tackle problems from all angles and avoid blind spots.
There’s no better time to play a role in defining the future of AI, and its impact on the world.
Why this role?
At Cohere, we’re obsessed with language and technology. We believe we need great writers and developers, and always will. We also believe that remarkable talent, enthusiasm, and creative thinking add up to great work. We’re looking for someone with superb Python and data science skills to join our team and help shape the future of language technology. The most successful candidate will be a quick learner who is excited to train our model by working on a wide variety of writing and code-based prompts.
We are on a mission to build machines that understand the world and make them safely accessible to all. Data quality is foundational to this process. Machines (or Large Language Models to be exact) learn in similar ways to humans - by way of feedback.
Our AI Data Trainers ensure that all samples fed to our AI model are well written, technically sound, and useful to the end user. By creating content that data scientists would find useful, such as working Python code and use cases, you will be an essential part of improving our Large Language Model’s performance for iterations to come, and will have a lasting impact on Cohere’s tech.
Please note: This role may require occasional on-site work at our Soho office in London, UK. We are looking for candidates who can commit a minimum of 12-24 hours a week to this project.
As an AI Data Trainer, you will:
- Spend the majority of your time writing or reading/proofreading code and natural language to create perfect samples to train our models
- Label, proofread, and improve machine-written and human-written code
- Raise the bar continually by writing new code of exceptional quality to solve a variety of tasks, with a particular focus on data analytics
- Adeptly vary the style and functionality of code examples
- Follow our style guide, and make recommendations on unique situations that fall outside of its scope
- Work with intense attention to detail while citing sources of information
You might be a good fit if you are:
- Someone with 2+ years of industry experience working on real-world data science problems and pipelines. You excel in data analysis and visualization
- A meticulous coder with an eye for readability, with experience in Python and industry-standard data science packages (NumPy, pandas, Matplotlib, SQLite, or others)
- Able to write SQL and work with SQL-based workflows
- Familiar with file and data formats such as Markdown, JSON, XML, YAML, and HTML
- A thoughtful and thorough code reviewer. You've spent time rewriting, proofreading, and giving feedback on others' code in a previous role. You've worked with a code style guide before and enjoyed it
Other things you’ll need:
- Located in the UK
- A fast, thorough reader with great comprehension skills
- Curiosity about ML, AI, or LLMs (bonus points if you have any experience with these)
- Expert web-based research skills that you've applied to your own coding work
- Ability to follow complex instructions, navigate ambiguity and work independently in a remote or hybrid environment
Interview process:
- 20-30 minute video interview
- Technical assessment
- 30-minute final interview
We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.
🤝 An open and inclusive culture and work environment
🧑‍💻 Work with cutting-edge AI technology
🪴 A vibrant & central location
🥨 A great selection of office snacks
🏆 Performance-based incentives