Senior SRE

Viafoura

Toronto, ON, Canada · Remote

Posted on May 7, 2026

Description

Senior Site Reliability Engineer

About Viafoura

Viafoura is a leading audience engagement platform that powers real-time conversations and community experiences for digital publishers and brands worldwide.


The Role

We're seeking a Senior Site Reliability Engineer to join our infrastructure team. You'll be responsible for designing, implementing, and maintaining highly available, scalable systems that power our platform. This role requires deep expertise in Kubernetes orchestration, AWS cloud infrastructure, and infrastructure-as-code practices.


Required Qualifications

Kubernetes & Container Orchestration

  • Amazon EKS (Elastic Kubernetes Service): Demonstrated experience designing, deploying, and managing production EKS clusters from the ground up
  • Cluster Architecture: Hands-on experience with end-to-end cluster setup, including:
      • Designing and implementing cluster networking architecture
      • Configuring and managing Container Network Interface (CNI) plugins
      • Deploying and managing the AWS Load Balancer Controller for ingress management
      • Implementing ExternalDNS for automated DNS management
      • Setting up and maintaining service mesh solutions, particularly Istio, for advanced traffic management, observability, and security
  • Strong understanding of Kubernetes security best practices, RBAC, Pod Security Standards (the replacement for the removed PodSecurityPolicy API), and network policies
  • Experience with cluster upgrades, scaling strategies, and disaster recovery procedures


AWS Cloud Infrastructure

  • Networking: Deep expertise in AWS networking services including VPC design, subnets, security groups, NACLs, Transit Gateway, VPC peering, and PrivateLink
  • EC2: Extensive experience managing EC2 instances, AMI management, Auto Scaling Groups, and instance optimization
  • RDS: Production experience with Amazon RDS including database engine selection, Multi-AZ deployments, read replicas, backup strategies, and performance tuning
  • Strong understanding of AWS security best practices, IAM policies, and compliance frameworks
  • Experience with additional AWS services such as CloudWatch, CloudTrail, S3, and Route 53


Infrastructure as Code (IaC)

  • Terraform: Advanced proficiency in writing, testing, and maintaining Terraform modules and configurations
  • Terragrunt: Hands-on experience using Terragrunt for managing multiple environments, DRY configurations, and remote state management
  • Strong understanding of IaC best practices including state management, module design, version control, and testing strategies
  • Experience with infrastructure testing frameworks and validation tools


CI/CD & Automation

  • GitHub Actions: Proven experience designing and implementing CI/CD pipelines using GitHub Actions
  • Experience with workflow automation, deployment strategies (blue-green, canary), and rollback procedures
  • Knowledge of GitOps principles and tooling
  • Strong scripting skills (Bash, Python, or similar) for automation


Key Responsibilities

  • Design, deploy, and manage production Kubernetes clusters on Amazon EKS with a focus on reliability, security, and performance
  • Architect and maintain AWS infrastructure including networking, compute, and database layers
  • Develop and maintain infrastructure-as-code using Terraform and Terragrunt following best practices
  • Build and optimize CI/CD pipelines using GitHub Actions for automated testing and deployment
  • Implement comprehensive monitoring, logging, and alerting solutions
  • Participate in on-call rotation and incident response
  • Collaborate with development teams to improve application reliability and performance
  • Lead capacity planning and cost optimization initiatives
  • Mentor junior engineers and contribute to team knowledge sharing


Preferred Qualifications

  • Experience with observability tools (Prometheus, Grafana, Datadog)
  • Knowledge of HashiCorp Vault or similar secrets management solutions
  • Experience with disaster recovery planning and execution
  • Contributions to open-source projects


What We Offer

  • Competitive salary
  • Comprehensive health benefits
  • Professional development opportunities
  • Remote-friendly work environment
  • Collaborative and innovative team culture