
LLMOps Engineer

Thrive Career Wellness Platform


Remote
CAD 140k-160k / year
Posted on Mar 24, 2026


Description


Location: Hybrid – Toronto, ON (317 Adelaide Street West)

Employment Type: Full-time
Salary Range: $140,000 – $160,000 CAD
Reports To: Head of AI (with close collaboration across Engineering & DevOps)


About the Role


Thrive is hiring its first LLMOps Engineer to own and scale the operational backbone of our AI-powered products. This is a newly created, high-impact role responsible for deploying, optimizing, and operating large language models in production.
You’ll work closely with our AI, Engineering, and DevOps teams to ensure our LLM-driven features are reliable, performant, secure, and cost-effective. If you enjoy building scalable AI infrastructure in fast-paced environments and want to shape how LLMs are used across a growing SaaS platform, this role is for you.

What You’ll Do


  • Lead LLM infrastructure efforts across multiple engineering teams, enabling scalable and secure AI-powered features
  • Design, build, and operate production-grade LLM systems, including:
      • Model and prompt versioning
      • A/B testing and evaluation workflows
      • Rollback and deployment strategies
  • Partner with the AI team to implement prompt management, prompt versioning, and token optimization strategies
  • Monitor and optimize:
      • Inference latency and throughput
      • Caching strategies
      • Multi-provider cost management (OpenAI, Anthropic, AWS Bedrock, etc.)
  • Build observability pipelines, including quality metrics, error monitoring, evaluation frameworks, and user feedback loops
  • Implement and maintain Retrieval-Augmented Generation (RAG) systems, embedding pipelines, and vector databases
  • Support fine-tuning workflows and manage model registries for proprietary and open-source models
  • Implement AI safety guardrails, content filtering, and compliance measures
  • Contribute ~10% of your time to general DevOps initiatives (CI/CD, cloud infrastructure improvements)
  • Maintain clear documentation of LLM infrastructure, workflows, and best practices

The Problem You’ll Solve


This role establishes the foundation of Thrive’s AI infrastructure. By building robust LLMOps systems and evaluation pipelines, you’ll directly enable faster product iteration, higher-quality AI outputs, and scalable delivery of AI-driven career solutions to our customers.

What We’re Looking For

Required Experience & Skills
  • 3+ years of experience in LLMOps, MLOps, or production-focused AI/ML roles
  • Strong Python programming skills and experience with LLM frameworks and tooling
  • Hands-on experience with LLM providers such as OpenAI, Anthropic, AWS Bedrock, Azure, Vertex, or Databricks
  • Experience working with vector databases (Pinecone, Weaviate, Qdrant, Chroma, or similar)
  • Familiarity with model serving tools (vLLM, TGI, Ray Serve)
  • Strong knowledge of Docker, Kubernetes, and cloud infrastructure (AWS preferred)
  • Experience with prompt engineering, token optimization, and LLM evaluation metrics
  • Familiarity with LLM observability and experimentation tools (LangSmith, Weights & Biases, Phoenix, MLflow)
  • Ability to troubleshoot LLM-specific challenges such as latency, hallucinations, and context window constraints
  • Strong communication skills and the ability to collaborate with both technical and non-technical stakeholders

Nice to Have


  • Experience with open-source LLMs (Llama, Mistral, etc.)
  • Knowledge of advanced RAG techniques (hybrid search, re-ranking)
  • Exposure to agent frameworks and real-time LLM applications
  • Background in traditional MLOps, data engineering, or multimodal models
  • Experience working in Ruby on Rails environments
  • Understanding of AI safety, governance, and alignment principles

Our Hiring Process


  1. Talent Acquisition Screen – 30 minutes
  2. Take-Home Technical Assignment – 3 days to complete
  3. Hiring Manager Interview (Ali) – 30–45 minutes
  4. Live PR / Pairing Session with Staff Engineer – 60 minutes
  5. Meet the Leadership Team

Life at Thrive


  • Fast-paced, high-trust environment with meaningful ownership
  • Opportunity to shape Thrive’s AI infrastructure from the ground up
  • Strong mentorship and long-term career growth

Total Rewards


  • 3 weeks paid vacation + 1-week holiday shutdown
  • Health insurance & wellness coverage
  • Annual Learning & Development allowance
  • Annual workspace allowance

Thrive is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Accommodation is available throughout the recruitment process upon request.

Candidates must be legally eligible to work in Canada.