We are a dynamic, well-funded, and innovative startup in the security industry that was recently acquired by the market leader in Data Security. Our cutting-edge product is set to make a significant impact in the market. We are aggressively pursuing our goals and are looking for a highly skilled and motivated individual to join our team as a Site Reliability Engineer.
Key Responsibilities:
- Design, build, and maintain scalable AWS infrastructure with a focus on high availability and fault tolerance.
- Design and configure ECS scaling strategies.
- Optimize, monitor, and automate Amazon RDS (PostgreSQL) performance, backups, and failover strategies.
- Implement disaster recovery plans, backup solutions, and system restoration procedures.
- Develop and maintain infrastructure-as-code (IaC) using Terraform or CloudFormation.
- Create monitoring and alerting systems using CloudWatch, Prometheus, Grafana, or Datadog.
- Enhance CI/CD pipelines to improve deployment automation and system resilience.
- Perform incident management, troubleshoot production issues, and conduct post-mortems.
- Collaborate with engineering teams to ensure best practices in application reliability and performance.
- Stay up to date with AWS services and industry best practices to drive continuous improvement.
Qualifications:
- 3+ years of experience in SRE, DevOps, or Cloud Engineering roles.
- Previous experience in a high-scale production environment.
- Strong expertise in AWS services, particularly EC2, ECS, RDS, S3, IAM, and VPC.
- Knowledge of event-driven architectures using AWS Lambda and SNS/SQS.
- Hands-on experience managing databases in production environments.
- Proficiency in Terraform, CloudFormation, or CDK for infrastructure automation.
- Experience with containerization (Docker, ECS, Kubernetes).
- Solid understanding of Linux systems, networking, and security best practices.
- Proficiency in scripting (Python or Bash) for automation.
- Strong troubleshooting and incident response skills.
- Experience with monitoring and logging tools like CloudWatch, Prometheus, Grafana, or Datadog.
- Experience working for a startup.
What We Offer:
- An exciting and challenging work environment where you can make a real impact.
- Highly competitive compensation and benefits package.
- Opportunity to make a huge impact on the industry, with proportionately great upside.
- The chance to work with a passionate and talented team on a groundbreaking product.
If you are a highly technical and hands-on professional with a passion for building secure and scalable SaaS solutions, we want to hear from you. Join us and help transform the AI journey.
Job Type: Full-time
Pay: $90,000.00-$115,000.00 per year
Benefits:
- Dental care
- Extended health care
- Paid time off
Ability to commute/relocate:
- Vancouver, BC V6B 1B3: reliably commute or plan to relocate before starting work (preferred)
Location:
- Vancouver, BC V6B 1B3 (preferred)
Work Location: In person