[J805] HIGHLY SKILLED TECHNICAL OPERATIONS SPECIALIST

Bebeetechnical


Job Title
Senior Service Reliability Engineer

About The Business Unit
We are constantly trying to evolve the way the world thinks about travel by providing flexible systems on which our clients can deliver unparalleled booking experiences. With more than 50 customers worldwide, including many of the world's most successful hybrid and low-cost airlines, we offer a unique opportunity to make a significant impact in the world of travel.

Key Responsibilities Summary
- Provide technical support related to production system availability, latency, performance, and efficiency issues.
- Support monitoring tools currently in production.
- Provide emergency response to production system incidents.
- Maintain the production ticketing system.
- Maintain the knowledge base solution platform.
- Create, delete, and maintain production automation solutions using tools.
- Automate day-to-day tasks.
- Resolve or remove false-positive alerts.
- Configure and update alert dashboards.
- Maintain tasks using the task scheduler.
- Participate in application release implementations.
- Analyze and interpret application logs to determine problem areas.
- Enhance current application and device monitoring systems.
- Help evaluate application performance statistics, including application and system response times.
- Excellent technical knowledge of DevOps for applications hosted in the cloud.
- Design and deliver highly efficient and scalable test systems.
- Bachelor's or graduate degree in engineering, or equivalent work experience.
- Working knowledge of Linux and Windows operating systems.
- Ideally, experience with SOP, SU, SLO, Automation, Capacity Management, Operational Improvement, and Operational Readiness Testing.
- Ability to troubleshoot cloud solutions by analyzing application, server, and operating system logs to identify root causes and resolve issues that affect system availability in production.
- Experience supporting monitoring, alerting, or pipeline analysis tools, including optimizing their current configuration and maintaining their availability.
- General networking knowledge.
- Knowledge of and practical exposure to IT and Cloud operations, ideally in mission-critical environments.
- Knowledge of standard automation tools and scripting: Terraform, FLUX.
- Experience implementing measurements and alerting in complex environments using standard tools like Splunk, Grafana, Prometheus, Argos, ServiceNow.
- Knowledge of Kubernetes, OpenShift, and Azure AKS is a plus.
- Good written and verbal communication skills.
- Knowledge of C#, .NET, or other common programming languages.

What We Can Offer You
- Competitive remuneration package.
- Hybrid work arrangement at our Bogota office.
- Professional development opportunities.
- Diverse and inclusive workplace culture.
- Chance to impact millions of travelers around the globe.
