Site Reliability Engineer

Apple · 3 days ago
Cork, IE

About this job

Job type: Full-time
Role: System Administrator
Industry: Consumer Electronics
Company size: 10k+ people
Company type: Public



Technologies

sysadmin, python, linux



Job description

The Engineering Productivity and Quality team is looking for Site Reliability Engineers to build and run the services that enable thousands of Apple engineers to develop the software products that delight millions of Apple customers. In this position you will have the opportunity to be a founding member of a worldwide DevOps team that fosters a culture of innovation and continuous improvement.

Responsibilities will include:

  • Manage the migration and setup of a major application in a new environment.
  • Build automation and tooling to make our systems more reliable, resilient and scalable.
  • Identify sources of instability in distributed systems and drive operational excellence.
  • Monitor and stress test systems to collect metrics for tuning and capacity planning.
  • Collaborate with engineering teams to release new features and become an authority on our services.

This job will provide you with:

  • A team of highly skilled coworkers ready to both mentor and learn from you.
  • Unique distributed computing problems with an open mind on how they can be solved.
  • The opportunity to collaborate with hardworking engineering teams across a wide range of technology disciplines.
  • The freedom to take ownership and drive meaningful improvements in the operational reliability of mission-critical services.

Skills & requirements

  • Passion for continually learning and exploring new technologies.
  • Proficient in Linux and macOS systems management.
  • Deep experience with Docker and container management solutions.
  • Deep experience with Kubernetes and cluster management.
  • Familiar with Cassandra and PostgreSQL databases.
  • Familiar with Kafka CQRS systems.
  • Familiar with application and service monitoring tools and techniques.
  • Scala or Java-based programming.
  • Python scripting.
  • Involvement with incident management and response.
  • Excellent collaborative skills, with strong written and verbal communication.


These are not hard requirements, but this position might be of interest if you have experience with or a desire to learn about:

  • Software-defined infrastructure management tools like Terraform or CloudFormation.
  • BLOB storage technologies.
  • Splunk, Grafana, Graphite or other monitoring tools.
  • Puppet or other configuration management tools.



Location

Cork, IE
