Data Scientist / Research Developer – DME
Fixed-term contract (CDD), Paris
Job description
About
At WhiteLab Genomics, we believe that every patient deserves access to life-saving treatments. We won’t rest until that belief becomes a reality.
Driven by our commitment to transforming the future of healthcare, we’ve earned our place in leading innovation programs, including Y Combinator, French Tech 2030, and Future 40 by Station F. We’ve also been recognized by The Galien Foundation as a “Best Startup” in healthcare innovation and as Manufacturing Tech Disruptor of the Year 2024 at Advanced Therapy Awards: Phacilitate in the USA, among other accolades.
We're committed to revolutionizing genomic medicine, delivering single-dose cures to millions of people worldwide with cancer, neurodegenerative diseases, and rare diseases.
By integrating cutting-edge data science, computational biology, and structural biology to design genomic medicine vectors with unprecedented precision, we're advancing the frontiers of healthcare and addressing critical challenges with the highest stakes.
If you’re passionate about improving patient outcomes and about the intersection of biology and technology, we’d love to meet you!
Join us – together, we’ll bring life-saving treatments to the people who need them most.
Job Description
The Data Scientist / Research Developer will join the DME team, a cross-functional group working closely with Protein Vector Engineering and Computational Biology.
The role is primarily focused on data science and coding (Python), with a strong emphasis on:
· implementing and evaluating ML models,
· turning prototypes into clean, reusable code,
· helping to make data and models FAIR and systematic.
Biology knowledge is a strong plus but not strictly required; you will learn domain context on the job.
Core mission within the AI FAIR Lab:
· Work on data science tasks for internal projects related to gene and cell therapies (in collaboration with other data scientists and domain teams).
· Implement and maintain ML experiments in a reproducible way: data preprocessing, model training scripts, evaluation workflows.
· Help transform research prototypes (from notebooks) into clean Python modules, small internal libraries, or scripts that can be reused across projects.
· Contribute to the FAIR and systematisation work of the team by:
  · adding tests, logging and configuration to existing code,
  · refactoring models for reusability (e.g. moving from ad hoc code to parameterised functions/classes),
  · helping to standardise input/output formats.
· Participate in benchmarking and comparisons of different models/approaches:
  · setting up evaluation pipelines,
  · tracking metrics,
  · generating simple reports and visualisations.
· Collaborate with more senior team members (Senior Scientist, Head of DME team) to implement ideas from papers or internal proposals and to iterate quickly on model and data improvements.
Preferred Experience
MSc with 1–2+ years of experience (internships can count if substantial).
PhD is a plus but not required.
Fields (or equivalent experience): Data Science, Computer Science, Applied Mathematics, Statistics, Machine Learning, or related quantitative disciplines.
Preferred experience with some of:
· End-to-end data science projects:
  · data cleaning, feature engineering, model training, evaluation and reporting.
· ML (classical and basic Deep Learning):
  · regression, classification, tree-based models;
  · basic neural networks;
  · graph neural networks are a plus.
· Software development and scientific pipeline development: writing modular Python code, packaging utilities, working with Git and Docker.
· Working with real-world, messy data (any domain).
· Biology / omics / healthcare data (a plus, but not mandatory).
Recruitment Process
· Phone interview with Oscar (Head of DME)
· Video interview with the People & Culture team
· Take-home technical case study
· Onsite meeting (presentation of the technical case study and meeting with the team)
Additional Information
· Contract Type: Full-Time
· Location: Paris
· Education Level: Fourth-Year University Level
· Experience: > 1 year
· Remote Work: Partial remote possible