Site Reliability Engineer | Staff | NordVPN
Full Time
Hybrid
Infrastructure
Vilnius / Kaunas
The world’s most advanced VPN, and a whole lot more.
If you’re a curious problem-solver who carves their own path, join the team behind Threat Protection Pro, the NordLynx protocol, and the fastest VPN on the planet—tools that put privacy, security, and control back in people’s hands.
Your impact? Helping millions take back control of their online security, privacy, and data.
Are you excited by the challenge of managing large-scale systems, automating infrastructure, and ensuring seamless service reliability? We’re seeking a Staff Site Reliability Engineer (SRE) to play a key role in shaping the future of our global infrastructure.
Overseeing a global infrastructure of ~10,000 on-prem servers, you’ll tackle unique technical challenges, engineer scalable systems, and have a direct impact on the reliability and performance of our products.
Main Responsibilities
- Deliver projects on time: Plan, delegate, execute, and oversee key projects;
- Collaborate: Work closely with stakeholders and other teams. Mentor colleagues and lead knowledge transfer;
- Ensure quality and reduce technical debt: Deliver solutions with solid design and address blockers, toil, and debt to keep systems healthy;
- Drive engineering excellence: Aim for quality and choose the right solution for the problems we face;
- Protect solution quality: Ensure designs are implemented with proper quality and minimal tech debt;
- Data‑backed decisions: Help teams and stakeholders navigate data and act on insights;
- Build reliable infrastructure: Design and maintain highly available, scalable infrastructure with monitoring, alerting, and anomaly detection;
- Automate everything: Create and optimize automation to streamline deployments, improve speed, and cut manual work;
- Solve complex issues: Troubleshoot, debug, and resolve critical issues in complex systems;
- Use AI: Integrate AI into workflows and processes to speed up delivery and reduce toil.
Core Requirements
- Observability: Experience with monitoring tools and frameworks to ensure system observability (OpenSearch, VictoriaMetrics, Prometheus, Thanos, Mimir, OpenTelemetry, Nagios);
- Databases and storage systems: Experience operating highly available SQL and NoSQL databases and object stores at scale (MySQL, Percona, PostgreSQL, Cassandra, ClickHouse, Timescale, Druid, MinIO);
- Data visualization: Ability to build meaningful dashboards that show the right insights (Grafana, OpenSearch Dashboards);
- Alerting and anomaly detection: Ability to build anomaly detection and alerting pipelines;
- Programming: Proficiency in one or more programming languages for automation scripts and integrations (Python, Go, Rust, C);
- Linux: Strong knowledge of Linux systems, especially Debian‑based distributions;
- Workflow automation: Experience with workflow automation frameworks (Airflow, Prefect, n8n);
- Configuration management: Ability to design and develop configuration management codebases and deployment pipelines (SaltStack, Ansible, Rundeck);
- Networking: Strong understanding of networking protocols and concepts (overlay networks, VPN, proxies, DNS, HTTP, SSL/TLS, TCP, UDP);
- Security: Ability to design secure systems and working knowledge of security concepts and tools (Vault, PKI, mTLS).
Salary Range
- Gross salary: 5800-7600 EUR/month
Our values
Our values are rooted in the actions of our people. They describe how we solve problems, make decisions, and, ultimately, reach our goals as a team.