Internship - Trustworthy Deep Learning: VLM-based (CLIP) OoD Detection (M/F)
Internship Palaiseau (Essonne) IT development
Job description
Vacancy details
General information
Organisation
The French Alternative Energies and Atomic Energy Commission (CEA) is a key player in research, development and innovation in four main areas:
• defence and security,
• nuclear energy (fission and fusion),
• technological research for industry,
• fundamental research in the physical sciences and life sciences.
Drawing on its widely acknowledged expertise, and thanks to its 16,000 technicians, engineers, researchers and staff, the CEA actively participates in collaborative projects with a large number of academic and industrial partners.
The CEA is established in ten centers spread throughout France.
Reference
2024-34494
Position description
Category
Information system
Contract
Internship
Job title
Internship - Trustworthy Deep Learning: VLM-based (CLIP) OoD Detection (M/F)
Subject
CLIP-based OoD Detection with Post-hoc Methods
Contract duration (months)
6
Job description
Context
The LIST Institute at CEA Tech (CEA's technological research division) dedicates its activities to driving innovation in intelligent digital systems. Its specialized R&D programs aim to deliver technological developments of excellence in critical industry sectors by partnering with key industrial and academic actors.
Within the LIST Institute, at the heart of the Paris-Saclay Campus (Essonne), the Embedded and Autonomous Systems Design Laboratory (LSEA) works on methods and tools for the design & development of trustworthy autonomous systems that incorporate AI-based components. In particular, the LSEA’s Trustworthy Deep Learning (TDL) team conducts research on confidence (uncertainty) representation and monitoring in deep neural networks (DNNs) for computer vision tasks and automated robots.
Mission
The detection of out-of-distribution (OoD) samples is crucial for deploying machine learning (ML) models in real-world scenarios. OoD samples pose a challenge to ML models because they are not represented in the training data and can naturally appear during deployment (i.e., under a distribution shift), increasing the risk of wrong predictions. OoD detection is therefore especially important in safety-critical domains, such as healthcare or automated vehicles, where trustworthy models are required.
To address the OoD detection task, previous work has focused on post-hoc confidence scores for fully supervised settings with a single data modality (e.g., images and image classification tasks). However, the advent of vision-language models (VLMs), represented by CLIP, has accelerated the field of computer vision and enabled zero-shot and few-shot learning schemes for different tasks. In this regard, a new paradigm has emerged in which CLIP is used for OoD detection with confidence scores that leverage both visual features and textual concepts. This leaves existing post-hoc confidence scores applicable mainly to settings where CLIP is fine-tuned on additional data.
Interestingly, recent work has shown that fine-tuning CLIP tends to improve classification accuracy but does not necessarily improve OoD detection performance when post-hoc methods are used. A plausible hypothesis is that fine-tuning procedures may destroy CLIP's rich vision-language representations. In this internship, we therefore seek to explore strategies that increase CLIP's robustness under fine-tuning, so that existing and new post-hoc confidence measures can detect OoD samples without a loss in detection performance.
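To make the notion of a post-hoc confidence score concrete, here is a minimal sketch, not the team's method. It computes two commonly used scores from a classifier's logits (maximum softmax probability and an energy-style log-sum-exp score) and a similarity-based score in the spirit of zero-shot CLIP, where an image embedding is compared with one text embedding per class. All function names, the temperature default and the toy vectors are assumptions made for illustration; plain Python is used to keep the example dependency-free, whereas in practice these would be PyTorch tensor operations on real model outputs.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability; higher suggests in-distribution."""
    return max(softmax(logits))


def energy_score(logits, temperature=1.0):
    """T * logsumexp(logits / T); higher suggests in-distribution."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    return temperature * (m + math.log(sum(math.exp(v - m) for v in scaled)))


def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_score(image_emb, text_embs, temperature=1.0):
    """Max softmax over image-to-text cosine similarities (hypothetical helper)."""
    sims = [cosine(image_emb, t) for t in text_embs]
    return max(softmax(sims, temperature))


# Toy logits (made up): one peaked prediction, one nearly flat prediction.
peaked = [9.0, 1.0, 0.5]
flat = [2.1, 2.0, 1.9]
print(msp_score(peaked), msp_score(flat))
print(energy_score(peaked), energy_score(flat))

# Toy 2-D embeddings (made up): two class prompts and two images.
text_embs = [[0.9, 0.1], [0.0, 1.0]]
print(zero_shot_score([1.0, 0.0], text_embs))  # close to the first concept
print(zero_shot_score([1.0, 1.0], text_embs))  # ambiguous between concepts
```

In each case a sample is flagged as OoD when its score falls below a threshold chosen on held-out in-distribution data; the scores require no retraining, which is what makes them post-hoc.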
Internship Objectives
- Study state-of-the-art methods for fine-tuning CLIP and increasing its robustness.
- Evaluate the performance of post-hoc confidence scores for OoD detection on fine-tuned CLIP models that employ robustness-augmentation methods.
- Extract/inspect internal visual-language features of CLIP, and design a CLIP-based post-hoc confidence score for OoD detection.
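The evaluation objective above can be illustrated with a small sketch. One threshold-free metric commonly reported for OoD detection is AUROC, the probability that a randomly chosen in-distribution sample receives a higher confidence score than a randomly chosen OoD sample. The posting does not specify which metrics will be used, so this choice, the function name and the scores below are assumptions for illustration only.

```python
def auroc(id_scores, ood_scores):
    """Pairwise AUROC: fraction of (ID, OoD) pairs ranked correctly, ties count half."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Made-up confidence scores for four in-distribution and four OoD samples.
id_scores = [0.9, 0.8, 0.7, 0.4]
ood_scores = [0.6, 0.5, 0.3, 0.2]
print(auroc(id_scores, ood_scores))  # 14 of 16 pairs ranked correctly -> 0.875
```

A value of 1.0 means the score separates the two sets perfectly and 0.5 corresponds to chance. The quadratic pairwise loop is only suitable for small examples; a library routine such as scikit-learn's `roc_auc_score` would typically be used on real data.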
Methods / Means
Python, PyTorch, VLMs-CLIP
Applicant Profile
What do we expect from you?
- You are a second-year Master's student (M2 – France).
- You are proficient in Python and PyTorch.
- You have computer vision and deep learning skills, including VLMs (CLIP).
In line with CEA's commitment to integrating people with disabilities, this job is open to all.
Position location
Site
Other
Job location
France, Ile-de-France, Essonne (91)
Location
Candidate criteria
Languages
English (Fluent)
Prepared diploma
Bac+5 - Master 2
Recommended training
Computer Science
PhD opportunity
Yes
Requester
Position start date
29/11/2024