Senior Systems Engineer (Remote)
Job Description
Our client is seeking a highly experienced Senior Systems Engineer to join their globally distributed engineering team. This is a fully remote position, offering the opportunity to work from anywhere within the United States. You will play a crucial role in designing, developing, implementing, and maintaining the complex, large-scale systems infrastructure that supports their innovative software products. The ideal candidate has a deep understanding of distributed systems, cloud computing environments (AWS, Azure, or GCP), and robust automation practices. You will be instrumental in ensuring the scalability, reliability, and performance of these systems, and you will mentor junior engineers and contribute to architectural decisions.
Key Responsibilities:
- Design, deploy, and manage highly available and scalable cloud infrastructure (AWS/Azure/GCP).
- Develop and maintain robust automation frameworks for infrastructure provisioning, configuration management, and application deployment (e.g., Terraform, Ansible, Kubernetes).
- Implement monitoring for system performance, identifying bottlenecks and areas for optimization.
- Troubleshoot and resolve complex system issues across production, staging, and development environments.
- Define and enforce system reliability and security best practices.
- Collaborate with software development teams to ensure efficient deployment pipelines and system integration.
- Develop and maintain comprehensive system documentation and runbooks.
- Lead technical discussions and provide architectural guidance on system design and infrastructure.
- Mentor junior systems engineers and share knowledge across the team.
- Participate in an on-call rotation to support critical systems.
- Evaluate and integrate new technologies to improve system efficiency and capability.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field. A Master's degree is a plus.
- 7+ years of experience in systems engineering, DevOps, or Site Reliability Engineering (SRE).
- Extensive experience with at least one major cloud provider (AWS, Azure, or GCP).
- Proficiency in scripting and automation languages (e.g., Python, Bash, Go).
- Strong understanding of containerization technologies (Docker, Kubernetes).
- Experience with CI/CD tools and methodologies (e.g., Jenkins, GitLab CI).
- Deep knowledge of networking concepts (TCP/IP, DNS, HTTP/S, load balancing).
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Proven experience with infrastructure as code (IaC) principles and tools.
- Excellent problem-solving, analytical, and communication skills.
- Experience working in a remote-first or distributed team environment is highly desirable.
Original posting:
www.whatjobs.com