Specialist Engineer
Leeds (West Yorkshire) · Bachelor's Degree · IT development
Job description
Role Title: Specialist Engineer
Business: HSBC Operations, Services and Technology (HOST)
New or Existing Role? Existing
Grade: GCB4
Role Purpose
· As a Big Data Engineer/Specialist, you'll be part of an Agile/DevOps team dedicated to breaking the traditional BI norms and pushing the limits of continuous improvement and innovation. You will participate in detailed technical design, development and implementation of applications using Big Data/Hadoop and the associated cloud and on-premises technology platforms. Working within an Agile environment, you will provide input into architectural design decisions, develop code to meet story acceptance criteria, and ensure that the data services we build are always available to our customers. You'll have the opportunity to mentor other engineers and develop your technical knowledge and skills to keep your mind and our business on the cutting edge of technology.
· At HSBC, we have seas of big data and streams of fast data that we want to bring into a Data Lake hosted on cloud platforms such as AWS/Google Cloud. To tame it we use tools like Spark, Scala, HBase, Hadoop, HDFS, Avro, AWS EC2, AWS S3, AWS Redshift, Google Cloud/Analytics, Ab Initio, SAS VA, Pentaho and Tableau – and we're on the look-out for anything else that will help us realize our vision.
· We are looking for Big Data Engineers and Specialists to work on collecting, storing, processing and analyzing huge sets of data and offering that as a service to the various business stakeholders. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining and monitoring them. You will also be responsible for integrating them with the architecture used across the bank.
Key Accountabilities
· Selecting and integrating Big Data tools and frameworks required to provide requested capabilities
· Implementing a metadata-driven data ingest process into HDFS in a performance-oriented manner, taking data lineage, tagging, security, scalability and generic patterns/frameworks into consideration (illustrated in the sketch after this list)
· Design and build of a configurable business transformation engine for the derivation, calculation and enrichment of data by business line through the appropriate tiers of Hadoop
· Building the semantic layer and associated metadata for ease of data consumption by business lines
· Building the reporting interfaces, dashboards, data extracts, data output feeds and management reporting using the chosen analytical tools
· Monitoring performance and advising on any necessary infrastructure changes
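To make the ingest accountability above more concrete, here is a minimal, illustrative Spark/Scala sketch of what a metadata-driven ingest into HDFS might look like. The FeedMetadata case class, feed names, paths and lineage columns are assumptions introduced purely for illustration; they do not describe HSBC's actual framework.

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{current_timestamp, lit}

// Hypothetical metadata record describing one ingest feed; in practice this
// would come from a metadata store rather than being hard-coded.
case class FeedMetadata(
  feedName: String,     // logical name used for lineage/tagging
  sourcePath: String,   // landing-zone location of the raw data
  sourceFormat: String, // e.g. "csv", "avro", "parquet"
  targetPath: String    // HDFS location in the raw tier of the lake
)

object MetadataDrivenIngest {

  // Ingest a single feed as described by its metadata, stamping simple
  // lineage columns (feed name and load timestamp) before landing in HDFS.
  def ingest(spark: SparkSession, meta: FeedMetadata): Unit = {
    val raw: DataFrame = spark.read
      .format(meta.sourceFormat)
      .option("header", "true") // only meaningful for csv; ignored otherwise
      .load(meta.sourcePath)

    val tagged = raw
      .withColumn("ingest_feed", lit(meta.feedName))
      .withColumn("ingest_ts", current_timestamp())

    tagged.write
      .mode("append")
      .partitionBy("ingest_feed")
      .parquet(meta.targetPath)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("metadata-driven-ingest")
      .getOrCreate()

    // Illustrative metadata only; paths and feed names are placeholders.
    val feeds = Seq(
      FeedMetadata("trades", "hdfs:///landing/trades", "csv", "hdfs:///lake/raw/trades"),
      FeedMetadata("customers", "hdfs:///landing/customers", "avro", "hdfs:///lake/raw/customers")
    )

    feeds.foreach(ingest(spark, _))
    spark.stop()
  }
}

In this shape, adding a new feed is a metadata change rather than a code change, which is the point of the pattern described above.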
Customers / Stakeholders
· Business SMEs: lead business SME sessions to gather and document business data requirements and data semantics
· Support the identification of stakeholder goals related to business information management and effectively manage their expectations, addressing any misalignment
Leadership & Teamwork
· Lead Business SMEs through the gathering of data requirements
· Lead BDA, DM, DG and DQ colleagues through the understanding and integration of specific domain models
· Support integration of the HBIM with other workstreams of Group Data Service and GB/GF CDOs
· Understand and manage the Business SME requirements and expectations to ensure their satisfaction
· Support production of programme deliverables as required
Operational Effectiveness & Control
· Support implementation of the business information models in the data dictionary and other related tools
· Support direct adoption of the HBIM through engagement of global and group wide transformation programs
· Support the group wide usage of HBIM providing support and training
Management of Risk
· Maintain awareness of operational risk and minimize the likelihood of it occurring including its identification, assessment, mitigation and control, loss identification and reporting in accordance with section B.1.2 of the Group Operations FIM
Desired profile
Qualifications:
· Proficiency with Hadoop and its ecosystem of tools
· Experience with Hortonworks/ Cloudera/ MapR Hadoop distributions
· Experience with building stream-processing systems, using solutions such as Kafka, Storm or Spark Streaming
· Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
· Experience with Spark
· Experience with integrating data from multiple data sources in a timely manner (Ab Initio/Pentaho/Podium Data/Talend)
· Experience with NoSQL databases, such as HBase/ Cassandra/ MongoDB
· Knowledge of various ETL techniques and frameworks, such as Flume
· Experience with various messaging systems, such as Kafka
· Experience with Big Data ML toolkits, such as Mahout, SparkML
· Experience with common data science toolkits, such as R, Python, Weka, NumPy and MATLAB
· Experience with data visualization tools, such as SAS VA, Google Analytics, Tableau and QlikView
· Proficiency in using query languages such as SQL, Hive & Pig
· Exposure to data management in cloud environments – AWS/Google Cloud preferred
· Exposure to and usage of the AWS platform and data management tools/services such as EC2, S3, Redshift and EBS
· Leadership capabilities
· Navigating – understanding and translating programme objectives and aligning directions accordingly
· Aspiring – being ambitious about providing the highest standards of delivery
· Driving – setting stretching goals for self and teams and delivering them with courage and tenacity
· Mobilising – authentically engaging with team, colleagues and business partners to deliver at pace
· Sustaining – making considered decisions that protect and enhance HSBC values, reputation and business
· Bachelor's Degree from a top-tier college or university
· 3 years of experience with Unix/Linux systems, with scripting experience in Shell, Perl or Python
· At least 5 years of experience building data pipelines
· At least 6 years of work experience in industries with enormous data volumes (Telco/Financial/Retail/Pharma, etc.)
· Experience with recognized industry patterns, methodologies, and techniques
· Familiarity with DevOps/Agile engineering practices
· At least 2 years of experience with Spark, Scala and/or Akka.
· At least 2 years of experience with Spark Streaming, Storm, Flink, or other stream-processing technologies (a minimal sketch follows this list)
· At least 2 years of experience with NoSQL implementations (MongoDB, Cassandra, etc. a plus)
· At least 2 years of experience developing Java-based software solutions
· Candidates with exposure to and project engineering experience with AWS/Google Cloud will be preferred
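As an illustration of the stream-processing experience listed above, below is a minimal sketch of a Spark Structured Streaming job consuming from Kafka and landing data in HDFS. The broker address, topic name and paths are placeholder assumptions, and the job assumes the standard spark-sql-kafka connector is on the classpath; it is a sketch, not a reference implementation.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Minimal Spark Structured Streaming job: read events from a Kafka topic and
// append them to a Parquet dataset on HDFS. All names below are placeholders.
object KafkaToHdfsStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-hdfs-stream")
      .getOrCreate()

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder broker
      .option("subscribe", "payments")                   // placeholder topic
      .load()
      .select(
        col("key").cast("string").as("key"),
        col("value").cast("string").as("value"),
        col("timestamp")
      )

    // The checkpoint directory lets the stream recover exactly where it left
    // off after a restart, which keeps the landing dataset consistent.
    val query = events.writeStream
      .format("parquet")
      .option("path", "hdfs:///lake/streaming/payments")
      .option("checkpointLocation", "hdfs:///checkpoints/payments")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}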
We are an equal opportunity employer and are committed to creating a diverse environment.