Oracle

Service Reliability Operator (SRO), Cloud Operations

  • Reading (Berkshire)
  • IT development

Job description

Work with Oracle's world class technology to develop, implement, and support Oracle's global infrastructure.

As a member of the IT organization, assist with the analysis of existing complex programs and formulate logic for new complex internal systems. Prepare flowcharts, perform coding, and test/debug programs. Develop conversion and system implementation plans. Recommend changes to development, maintenance, and system standards.

Leading contributor individually and as a team member, providing direction and mentoring to others. Work is non-routine and very complex, involving the application of advanced technical/business skills in area of specialization. BS or equivalent experience in programming on enterprise or department servers or systems.

As part of Oracle's employment process candidates will be required to complete a pre-employment screening process, prior to an offer being made. This will involve identity and employment verification, salary verification, professional references, education verification and professional qualifications and memberships (if applicable).

Desired profile

Qualifications :

Location: Reading
A competitive salary is offered
Posting date: 16-Aug-2018
Please send your resume by 13-Sep-2018

Job Title : Service Reliability Operator (SRO), Cloud Operations
Grade : IC4
Description : We are looking for a strong Service Reliability Operator (SRO) who will help ensure the availability of our Cloud services 24x7x365. The SRO will be directly accountable for the troubleshooting and resolution of service issues while continuously working to improve telemetry and automation. Your goal is to reduce time to mitigate, ensure we are measuring the right things, and automate tasks that impact development velocity, availability or productivity.
Responsibilities:
·  Perform proactive service checks and monitor/triage incoming system/application alerts, E-mails and calls to ensure appropriate priority and response
·  Triage and troubleshoot service impacting events from multiple signals including service requests, service telemetry and alerting
·  Communicate with professionalism and precision to internal and external customers during high priority situations
·  Identify and implement opportunities for automation, signal noise reduction, prevention of recurring issues and other actions to reduce time to mitigate service impacting events and increase the productivity of cloud operations resources
·  Manage the coordination, documentation and tracking of critical incidents ensuring rapid and complete issue resolution and appropriate closed loop to customers and other stakeholders
·  Work upstream with Service Operations and Development to develop and maintain standard operator procedures and troubleshooting guides, and recommend modifications to the OFSC product
·  Participate in project delivery aimed at increasing capabilities around monitoring, notification, configuration and deployment of servers and applications within the Oracle Cloud Platform
·  Assist in the training and development of more junior team members
·  Main Focus: Providing dedicated support to the mission critical OFSC customers
Knowledge, Skills, Abilities, and Background:
·  Able to work as part of a shift in a 24x7x365 operations team. Must be willing to work non-standard shifts.
·  Committed to DevOps culture
·  College diploma or university degree in computer science or a related field, and/or equivalent work experience in an infrastructure, systems, engineering or development environment
·  Strong technical background with an ability to troubleshoot issues impacting large scale service architectures and application stacks
·  Thorough knowledge of monitoring principles and frameworks (OEM, Thruk, Nagios, ThousandEyes)
·  Solid experience in data analysis and visualization (Python, Dashing, ELK, Grafana)
·  Demonstrable experience in scripting/programming languages: bash/shell, Python. Web development: Django, JavaScript, AJAX
·  Proven achievements in internal tooling development and management of a dev team; full stack deployment and management experience; successfully completed projects on automating workflows
·  Good knowledge of customer-facing OFSC documentation, to be able to quickly find relevant information, verify implementation, troubleshoot the service, and identify possible code or documentation issues. Applicable areas are Configuring, Integrating with, Administering OFSC, etc.
·  Hands-on experience with APIs and integrations (REST, SOAP, ICS)
·  Familiarity with TCP/IP protocols, firewall management, database administration, the LNMP stack and the following technologies: Nginx, HAProxy, MySQL, Redis, OracleDB, PHP, PHP-FPM, Oracle Key Vault, Swift, OVM, F5 Big-IP
·  Experience installing, configuring and maintaining Oracle Linux, services and networking
·  Proven analytical and problem-solving abilities
·  Fluent Ukrainian/Russian is mandatory due to the need to participate in numerous internal meetings with Development and ServiceOps teams based in Ukraine
