Site Reliability Engineer
Your day to day:
The Site Reliability Engineer, as part of the Product Development team, is responsible for ensuring the availability and performance of our application portfolio, toolset, and platform, for helping to drive improvements and enhancements at scale, and for automating the processes involved in building and deploying to Azure Cloud.
The SR Engineer will also take care of operations linked to the monitoring of SLA-critical production platforms, including addressing issues and performing manual interventions, all of this in close cooperation with the software development teams.
With these activities you will have a great impact on our business:
• Ensure environment stability, security, and performance through SLOs and CI/CD enforcement
• Create and improve delivery- and stability-focused tooling across a range of languages and environments
• Monitor and control the availability of the production services around the clock (24/7)
• Detect problems in the production environment and perform 1st- and 2nd-tier infrastructure and application troubleshooting
• Participate in system design consulting, platform management, and capacity planning
With these skills you are a great candidate:
• 3+ years’ experience in a similar role
• Hands-on experience with public cloud (Azure) and serverless architecture
• Experience using Infrastructure as Code (Terraform and/or CloudFormation)
• Experience with monitoring tools and vendors such as Prometheus, Grafana, ELK, New Relic, SignalFx, CloudWatch, Datadog, PagerDuty, etc.
#_VOIS