Site Reliability Engineer (Immediate Opening)
Responsibilities:
Ensure 24x7 availability of client's production application systems
Drive focused initiatives that improve operational efficiency and scalability of the platform and applications
Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization
Identify and drive opportunities to improve automation for the company; scope and create automation for deployment, management and visibility of our services
Understand modern software security and secure software systems with cloud-based infrastructure
Provide full-stack diagnostics and determine root cause of internal problems
Analyse operational performance to support improvements to critical system metrics and KPIs
Examine all areas of infrastructure and applications for improvement and suggest changes, rather than wait for direction
Safeguard application information against accidental or unauthorized damage, modification, or disclosure
Build and maintain redundant systems and procedures for high availability and disaster recovery
Develop integrated workflows for our support teams
Own the customer experience – think and act in ways that put our customers first, provide them a great digital experience, and make them promoters of our products and services
Respond to and help troubleshoot incidents
Participate in a 24x7 on-call rotation (within the job shift)
Key Skills and Competencies Needed:
Experience with a service mesh (Kong, if possible)
3+ years of extensive experience with Infrastructure as Code (IaC) and Desired State Configuration (DSC) tools such as Terraform and Chef
3+ years of experience packaging, deploying, and managing containerized workloads on common container platforms (e.g. Docker, Kubernetes)
3+ years of expertise managing AWS infrastructure at scale, including the following services: EC2, S3, Elastic Load Balancing, Lambda, Route 53, ECS, SQS, CloudWatch
Prior experience working in a DevOps or SRE environment
Highly experienced with automation and scripting in languages such as PowerShell, Ruby, Go, Python, and Bash
Large-scale monitoring and reporting experience using ELK stack, Dynatrace and/or New Relic (or other APM)
Experience with IIS management, troubleshooting, and performance monitoring
Experience managing web farms in a high-traffic SaaS environment
Strong analytical and problem-solving skills including robust troubleshooting skills with a focus on preventative and proactive actions
Extensive experience with .NET application architecture components (caching, content delivery, high availability, load balancing, etc.)
Understanding of the Software/Application Development Life Cycle process and experience implementing and maintaining CI/CD technologies such as TeamCity, Octopus Deploy, GitHub, Jenkins, and Codefresh
Knowledge of or experience with most of the following technologies:
Active Directory, SSL, FTP, F5 BIG-IP, T-SQL, MongoDB, MySQL, SQL Server, Nagios, Git, TeamCity, Octopus Deploy, Codefresh, Chef, Salt, Docker, Kubernetes, Kafka, Azure, Linux Server Administration, Bash, Apache
Personal Attributes:
Demonstrated experience as a leader and mentor to teammates and co-workers across the organization, with a team-first approach to project and support initiatives
Be an enthusiastic learner, user, and advocate of our technologies
Have a desire to win as a team: make big things happen by working together and being open and willing to try new ideas
Strong interpersonal and communication skills (written, verbal, and virtual) with the ability to work in a team-oriented, collaborative environment
Must have a high degree of personal integrity and the ability to maintain strict confidentiality
Must uphold, safeguard, and promote the organization’s values and philosophy relating particularly to corporate ethics, integrity, and priorities
Ability to work without supervision on short-term projects
Strongly driven, self-motivated, and logical, with keen attention to detail