General Electric

Sr. Staff Site Reliability Engineer

  • Alpharetta (Fulton County)
  • Design / Civil engineering / Industrial engineering

Job description

Job Number

2779683

Business

GE Digital

Business Segment

Digital Technology

About Us

GE is the world's Digital Industrial Company, transforming industry with software-defined machines and solutions that are connected, responsive and predictive. Through our people, leadership development, services, technology and scale, GE delivers better outcomes for global customers by speaking the language of industry.
GE offers a great work environment, professional development, challenging careers, and competitive compensation. GE is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law.

Posted Position Title

Sr. Staff Site Reliability Engineer

Career Level

Experienced

Function

Digital Technology

Function Segment

Digital Engineering

Location(s) Where Opening Is Available

United States

U.S. State, China or Canada Provinces

Georgia

City

Alpharetta, Atlanta

Postal Code

30005-4154

Relocation Assistance

No

Role Summary/Purpose

The Sr. Site Reliability Engineer will be responsible for the performance and availability of the compute and network infrastructure consumed by all business segments. The Site Reliability teams are composed of highly talented individuals obsessively focused on availability through operational excellence. The ideal individual is relentlessly technical, passionate about automating everything, and totally committed to delivering amazing customer experiences.

Essential Responsibilities

As a Sr. Site Reliability Engineer for GE Digital's Global Operations, you must have an excellent understanding of standard IT infrastructure equipment and systems, including their reliability and failure causes. You must also be able to quickly understand the key operational characteristics of new equipment and systems, interview domain experts for failure mode knowledge, and assess how possible failure modes will affect measured parameters and key performance indicators (KPIs).

Be available 24x7 to respond quickly to and resolve critical service outages that severely impact consumers.

·  Establish performance baselines and capacity thresholds, correlate events, and define monitoring/alerting criteria

·  Develop automated solutions to address potential problems before they result in a service interruption

·  Provide impact assessment and mitigation plan for changes going into the production environment

·  Investigate the root cause of severe and systemic outages, identify corrective actions, and apply them across the enterprise

·  Develop availability measures that align with consumer experience to accurately assess the usability of crucial services

·  Build capacity models to baseline transactional load compared to resource performance and leverage data to predict overall system capacity while automating load placement to avoid outages

·  Identify thresholds for all critical links in the data path to quickly isolate where imbalances may result in potential outages

·  Analyze failure points in services to model risk level and resolution steps if failure occurs

·  Assist in driving architecture enhancements into the system to mitigate potential failure points

·  Programmatically monitor for and remediate configuration drift of critical devices

·  Develop response plans to potential failure points and evaluate effectiveness during planned tests

·  Perform comprehensive operational health checks of entire services to identify areas of concern and track activities to drive improvements at all levels of the architecture

·  Provide technical coaching and direction to more junior teammates

Additional Eligibility Qualifications

GE will only employ those who are legally authorized to work in the United States for this opening. Any offer of employment is conditioned upon the successful completion of a background investigation and drug screen.

Desired Characteristics

Technical Expertise:

·  Excellent knowledge of common operating systems (Unix/Linux, Windows)

·  Demonstrated experience scripting or developing software and services for the cloud (Ruby, Python, Go, Java, Node.js, .NET, etc.)

·  Extensive knowledge of network protocols (TCP/IP, SNMP, FTP, syslog, TFTP, etc.)

·  Experience managing version control systems such as Git

·  Experience deploying and managing infrastructure on public clouds such as AWS or Azure

·  Experience using an automated configuration management system (Terraform, Chef, Puppet, Ansible, Salt, etc.)

·  Strong organizational and project management skills

·  Strong analytical and problem resolution skills

·  Excellent knowledge of Network Management (SNMP, MIB)

·  Experience with configuring, customizing, and extending monitoring tools (Datadog, Sensu, Grafana, Splunk, etc.)

·  Excellent knowledge of TCP/IP networking and inter-networking technologies (routing/switching, proxy, firewall, load balancing, etc.)

·  Knowledge of and experience using analytics software packages such as MATLAB, SAS, and JMP Pro. Programming experience with open-source scripting and data analysis languages such as Python or R is a plus.

Leadership:

·  Proactively engages with cross-functional teams to resolve issues and design solutions, applying critical thinking, analytical skills, and best practices while actively incorporating input from various sources

·  Strong analytical and problem-solving skills – effectively evaluates information and data to make decisions; anticipates obstacles and develops plans to resolve them

·  Continuous improvement oriented – actively generates process improvements; champions and drives change initiatives

·  Ability to deliver results in a rapidly changing, dynamic environment

Personal Attributes:

·  Emotional intelligence, the ability to influence up and out, and the ability to work independently

·  Must be a team player with a strong desire to win

·  Passionate about continuously learning and able to quickly adapt and pivot to win in a dynamic environment

·  Highly organized and efficient; able to balance competing priorities and execute accordingly

·  Strong oral and written communication skills.

Desired profile

Qualifications/Requirements

Basic Qualifications

·  Bachelor's degree in Computer Science, Information Management, or a similar STEM field or, in lieu of a degree, a high school diploma with equivalent years of experience

·  Minimum of 3 years of IT experience in enterprise-wide deployments

Eligibility Requirements

·  Legal authorization to work in the U.S. is required. We will not sponsor individuals for employment visas, now or in the future, for this job.
