Site Reliability Engineer - Lead

Job Description:

Synopsis of the role: 

Seeking creative, high-energy, diverse and driven software engineers with hands-on development skills to work on a variety of meaningful projects. Our software engineering positions provide you the opportunity to join a team of talented engineers working with leading-edge technology. You are ideal for this position if you are a forward-thinking, committed, and enthusiastic software engineer who is passionate about technology.

What you’ll do: 

·  Work with teams across the organization to ensure the reliability of core services, and keep an eye on capacity and performance.

·  Responsible for blameless postmortems and proactive identification of potential outages, both of which factor into iterative improvement.

·  Work closely with development and operations teams to build highly available, cost-effective systems with extremely high uptime metrics.

·  Hands-on experience configuring and administering SCM (Git, SVN), build (CMake, Makefiles, Maven), Nexus, CI (Jenkins), and CD automation tools.

·  Responsible for establishing end-to-end monitoring and alerting on all critical aspects to ensure SLAs are met and to get proactive notification of possible issues for all systems.

·  Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot.

·  Participate in 24x7x365 on-call support for multiple core platforms globally. Using a “Follow the Sun” model, we expect working patterns will include on-call duty and weekend and holiday-season cover.

·  Participate in release cycles of our offerings, deploying code to integration, staging and production environments, integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management

·  Build automation: work with Agile development teams to ensure smooth promotion of code, configuration, and Docker images to production

·  Oversee and adapt monitoring and alerting systems. Interact with automated monitoring and healing infrastructure to ensure healthy environments

·  Develop automation to auto-correct or completely prevent issues in our solutions

·  Perform software updates, peer code reviews, testing, and Common Vulnerabilities and Exposures (CVE) analysis; respond to security threats

·  Identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions

·  Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in our environment

·  Identify potential process improvements across the entire engineering organization

·  Define and drive architectural enhancements into the system to mitigate potential failure points

·  Provide impact assessment and mitigation plan for changes going into the production environment

·  Investigate the root cause of severe and systemic outages and identify corrective actions

·  As we transition to the public cloud (Google or AWS), create new build and deployment patterns.

What experience you need:  

·  A minimum of 10 years of experience as a Developer/Lead/Architect.

·  Bachelor's degree in Computer Science, Information Management, or another STEM major

·  Experience with configuring, customizing, and extending monitoring tools (AppDynamics, Apica, Sensu, Grafana, Prometheus, Graphite, Splunk, Zabbix, Nagios, etc.)

·  10+ years’ experience with all stages of an agile software development lifecycle (CI/CD) supporting Java/JavaScript UI applications (e.g., AngularJS) and SaaS applications.

·  5 years of experience building Java EE applications using tools like Maven/Ant, Subversion, JIRA, Jenkins, Bitbucket, and Chef

·  8+ years’ experience with continuous integration tools (Jenkins, SonarQube, JIRA, Nexus, Confluence, Git/Bitbucket, Maven, Gradle); Rundeck is a plus

·  3+ years’ experience with configuration management and automation (Ansible, Puppet, Chef, Salt)

·  3+ years’ experience deploying and managing infrastructure on public clouds (AWS, GCP, Azure, or Pivotal)

·  3+ years’ experience working with Kubernetes and related applications.

·  Experience working with Nginx, Tomcat, HAProxy, Redis, Elasticsearch, MongoDB, RabbitMQ, Kafka, and ZooKeeper.

·  3+ years’ experience in Linux environments (CentOS).

·  Knowledge of TCP/IP networking, load balancers, high-availability architecture, and zero-downtime production deployments. Comfortable with network troubleshooting (tcpdump, routing, proxies, firewalls, load balancers, etc.)

·  Demonstrated ability to script around repeatable tasks (Go, Ruby, Python, Bash)

·  Experience with large-scale cluster management systems (Mesos, Kubernetes)

·  Experience with Docker-based containers is a plus

·  Able to dive into any level of a modern internet service (schedulers, containers, Linux kernel, caching, object storage, distributed filesystems, RDBMS, NoSQL, etc.)

 
 
Needs to be local to St. Louis or Atlanta. Will go into the office 2 days a week to start and then transition to mostly remote.
 
 
Required Skills : GCP, Kubernetes, Jenkins, cloud.
Basic Qualification : Looking for a Lead SRE.
Additional Skills : Looking for a Lead SRE.
Background Check :Yes
Drug Screen :Yes
Candidate must be your W2 Employee :Yes
Exclusive to Client :No
Face to face interview required :No
Candidate must be local :Yes
Candidate must be authorized to work without sponsorship :No
Interview times set :No
Type of project :Development/Engineering
Master Job Title :Eng: Other
Branch Code :St. Louis

Indotronix is an Equal Opportunity Employer

Job Code : JPC - 147545
Posted Date : 2022-12-01 12:45:21
Experience : 6-10 years
Primary Skills : Jenkins, Kubernetes, GCP, cloud.
Salary : $63.43-$63.43
Contact Person : Arun Kumar MS
