Watson Health - Data Scientist - Map, Extract, Transform, Load (METL)
Cleveland (Cuyahoga County) Personal services
Job description
The Data Scientist on Map, Extract, Transform, Load (METL) is responsible for the data brought into the platform and for its quality. The candidate determines the critical data elements to extract from a variety of systems: Electronic Health Records (EHR), claims and billing, flat files, and other databases. The candidate is also responsible for building the code to extract and load those data elements. The final step is to conduct Quality Assurance (QA) on the Extract, Transform, and Load (ETL) process and resolve any data issues identified internally or externally.
Essential Functions:
· Identifying clinical, financial, and operational data elements within different systems (Electronic Medical Records (EMRs), billing, etc.)
· Developing new methods of data extraction and data mining (such as utilizing Natural Language Processing (NLP) tools to obtain clinical information from unstructured text)
· Working on a day-to-day basis with Explorys' software engineering teams to ensure that our clinical, financial, and operational data are being accurately represented in our applications
· Formulating validation strategies and methods to ensure accurate and reliable data
· Supporting the rest of the Informatics Team in understanding and processing the data in our system
· Extracting data from traditional database architectures and flat files, transforming the extracted data using technologies such as Cloudera Impala, Apache Pig, Apache Hive, and Ruby, and loading the data into the Hadoop grid
Candidate should possess:
· Ability to work in a cross-functional environment
· Knowledge of classification systems, clinical vocabularies, and nomenclature, such as International Classification of Diseases (ICD)-9, Current Procedural Terminology (CPT), Healthcare Common Procedure Coding System (HCPCS), Logical Observation Identifiers Names and Codes (LOINC), Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT), National Drug Code (NDC), and RxNorm
· Self-motivation and the ability to multitask
· 1-3 years of SQL experience
· Programming experience in at least one development environment (Python, Ruby, etc.)
· Exposure to big data technologies (Pig, Hive, Impala, Hadoop Distributed File System (HDFS), HBase)
· Experience working with EHR systems
· Communication skills – Effective interpersonal and customer service skills
· Analytical skills – Ability to conduct descriptive statistics of data populated in a database
Auto req ID
2524BR
Required Education
None
Role (Job Role)
Software Developer
State / Province
OHIO
Primary job category
Software Development & Support
Contract type
Regular
Employment Type
Full-Time
ERBP
Yes
Is this role a commissionable/sales incentive based position?
No
Travel Required
No Travel
IBM Business Group
Watson Health
Preferred Education
Bachelor's Degree
City / Township / Village
CLEVELAND
EO Statement
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Required Technical and Professional Expertise
· At least 1 year of experience in a clinical and technical environment, or a degree in Computational Biology, Bioinformatics, Computer Science, RN, or a related field
· Basic knowledge of the principles of clinical, financial, and operational data
Skill-keywords
SQL, ETL, Big Data, Health Care
Country
United States
Preferred Technical and Professional Experience
· At least 3 years of experience in a clinical and technical environment, or a degree in Computational Biology, Bioinformatics, Computer Science, RN, or a related field
· Basic knowledge of classification systems, clinical vocabularies, and nomenclature, such as ICD-9, CPT, HCPCS, LOINC, SNOMED CT, NDC, and RxNorm
· Basic knowledge of healthcare financial data
Eligibility Requirements
none