SRE (Site Reliability Engineer)
eToro is the trading and investing platform that empowers users to invest, share and learn. We were founded in 2007 with the vision of a world where everyone can trade and invest in a simple and transparent way. We have created an investment platform that is built around collaboration and investor education. On our platform, users can view other investors’ portfolios and statistics, and interact with them to exchange ideas, discuss strategies and benefit from shared knowledge. We have over 38 million registered users from 75 countries and our platform is available in 20 languages. We are a fast-growing business with over 1,500 employees across 13 offices around the globe, strategically positioned to serve the needs of users. You can find out more about eToro here.
eToro is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will ensure that our infrastructure and applications are reliable, scalable, and performant. You will collaborate closely with cross-functional teams to design, build, and maintain resilient systems that meet the needs of our customers and business stakeholders.
Responsibilities:
- Collaborate with R&D engineers on the coordination, communication, and execution of production-related operations.
- Design, implement, and maintain scalable and reliable infrastructure solutions to support our applications and services.
- Develop and deploy monitoring, alerting, and logging systems to proactively identify and mitigate operational issues.
- Build an SRE dashboard with KPIs to measure the reliability of eToro’s applications.
- Conduct capacity planning and performance tuning to optimize system performance and resource utilization for improved user experience.
- Automate repetitive tasks and processes to streamline operations and improve efficiency.
- Participate in incident response and resolution, including root cause analysis and post-mortem reviews.
- Continuously evaluate and adopt new technologies and methodologies to enhance our infrastructure and operations.
- Create and maintain documentation, runbooks, and knowledge base articles covering system configurations, procedures, and best practices.
Requirements:
- 4+ years’ experience as a DevOps/SRE/Integration engineer, with a passion for technology and strong motivation to build highly reliable solutions.
- In-depth knowledge of observability tools (Prometheus, Splunk, Datadog, Grafana).
- Experience with Git, Jenkins, GitHub Actions (preferred), virtualization, containers, and Kubernetes.
- Cloud providers: AWS / Azure (preferred) / GCP.
- Excellent understanding of Linux operating systems and scripting languages (Python, Bash).
- Strong communication skills, both verbal and written, with the ability to adapt the messaging to different perspectives (technical, business) and levels of detail.
- Ability to grasp new technologies quickly, prioritize, and manage multiple responsibilities.
- Excellent problem-solving skills and the ability to work effectively in a fast-paced, dynamic environment.
- Experience with Ansible, Terraform - an advantage.