Site Reliability Engineer
We are looking for a Site Reliability Engineer who is excited about applying SRE principles to system infrastructure.
- Build and maintain an observability framework that provides metrics and availability data
- Manage resource utilization and billing, and suggest optimizations
- Analyze performance bottlenecks based on monitoring data
- Implement and manage security and compliance policies
- Build scalable and reliable distributed systems
- 3+ years of relevant professional experience
- Familiar with PostgreSQL, MySQL, and Kafka
- Fluent in at least one popular scripting language (Python or Go)
- Intermediate-level experience with infrastructure as code; Terraform, Packer, and Ansible are our tools of choice
- Intermediate knowledge of Linux operating systems and containerization
- Hands-on experience with Kubernetes, Helm, and Docker
- Experience supporting production data systems
- Experience with monitoring systems such as ELK and Prometheus
- Automation mindset
We believe our values make a difference:
- We value, support, and help each other grow
- We are committed to active inclusion and diversity
- We are transparent and believe the best idea wins
- We succeed when our customers succeed
- And we keep it fun!
Please contact me for more details using the information below: