
Data Science Intern

  • Internship
  • Paris
  • IT development


About the Internship

Are you a passionate and curious engineering student looking for an opportunity to kickstart your career in AI? At Neuralk-AI, we're seeking a motivated intern to help us aggregate and structure training datasets by developing innovative web scraping pipelines. This is your chance to dive into the world of cutting-edge AI research while gaining hands-on experience in a dynamic startup environment!

About Neuralk-AI

We are a fast-growing deeptech startup, leading the way in AI innovation. Our mission is to build the technical tools that enable companies to create AI applications capable of interacting seamlessly with their structured data (tabular or graph databases). At the heart of our work is a modern AI embedding platform that transforms structured data into vector representations for applications in classification, regression, clustering, and more.

Backed by significant funding (several million), we combine state-of-the-art academic research with practical business applications to drive real impact. Our culture values simplicity, clear communication, and a constant drive for optimization.

At Neuralk, you’ll join a team of passionate individuals eager to learn, grow, and transform the AI industry. We believe in fostering a diverse, respectful, and inclusive environment and welcome candidates from all backgrounds to apply.

Co-founders: Alexandre Pasquiou (CSO) & Antoine Moissenot (CEO).

Job Description

Mission Highlights

As a Data Science Intern, you’ll play a key role in building high-quality training datasets that fuel our AI models. By developing web scraping pipelines and consolidating diverse data sources, you’ll help lay the foundation for groundbreaking advancements in AI for structured data.

You will report to the CSO of Neuralk and will be located in our Paris offices.

Role & Responsibilities

This position is the keystone of our company's core initiative: building a platform that automates expert AI workflows on structured datasets. You will lead architectural choices and software development in close collaboration with the other ML engineers on the team. You will be responsible for:

· Web scraping: Design, implement, and maintain efficient web scraping pipelines to collect high-quality data from diverse online sources (a minimal sketch of such a pipeline follows this list).
· Data cleaning and preprocessing: Ensure the scraped data is accurate, structured, and ready for use in training AI models.
· Dataset consolidation: Aggregate data from multiple sources, standardizing formats and ensuring compatibility with our AI platform.
· Collaborative work: Partner with our research and engineering teams (~5 people) to identify the most valuable data sources and contribute to our dataset strategy.
· Exploration: Experiment with innovative approaches to improve data quality and diversity, fueling better model performance.
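
To make the first three responsibilities concrete, here is a minimal, self-contained sketch of a scrape-clean-consolidate pipeline in Python, using the libraries named under Preferred Experience (BeautifulSoup, Pandas). The URL, CSS selectors, and column names are hypothetical placeholders for illustration; this is not Neuralk-AI's actual codebase or data sources.

    # Illustrative sketch only — the URL, CSS selectors, and column names
    # below are hypothetical placeholders, not Neuralk-AI's actual sources.
    import requests
    import pandas as pd
    from bs4 import BeautifulSoup

    def scrape_products(url: str) -> pd.DataFrame:
        """Fetch one page and extract rows from a hypothetical listing."""
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        rows = []
        for item in soup.select("div.product"):  # hypothetical selector
            name = item.select_one("h2")
            price = item.select_one("span.price")
            if name is None or price is None:
                continue  # skip malformed entries
            rows.append({"name": name.get_text(strip=True),
                         "price": price.get_text(strip=True)})
        return pd.DataFrame(rows)

    def clean(df: pd.DataFrame) -> pd.DataFrame:
        """Standardize scraped fields so different sources can be merged."""
        df = df.dropna().drop_duplicates()
        # Strip currency symbols and other non-numeric characters,
        # then cast prices to float for downstream use.
        df["price"] = df["price"].str.replace(r"[^\d.]", "", regex=True)
        df = df[df["price"] != ""]
        df["price"] = df["price"].astype(float)
        return df

    if __name__ == "__main__":
        sources = ["https://example.com/catalog"]  # hypothetical source list
        combined = pd.concat((clean(scrape_products(u)) for u in sources),
                             ignore_index=True)
        combined.to_csv("training_data.csv", index=False)

In a real pipeline each source would get its own extraction logic, with the cleaning step enforcing a shared schema so the consolidated dataset stays consistent.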

Why should you join us?

· Hands-on learning: Get practical experience in an exciting and rapidly evolving field.
· Mentorship: Work closely with experienced researchers and engineers who are eager to share their knowledge.
· Impactful work: Your contributions will directly support the development of cutting-edge AI models and platforms.
· Dynamic environment: Be part of a fast-growing startup where your ideas and efforts will make a tangible difference.
· Growth opportunities: Gain exposure to advanced AI concepts and methodologies, positioning yourself for a future career in machine learning.

Preferred Experience

· Currently pursuing a degree in Computer Science, Engineering, Data Science, or a related field (Bac+3/Bac+5 or equivalent).
· Programming skills: Proficiency in Python; experience with web scraping libraries like BeautifulSoup, Scrapy, or Selenium is a big plus.
· Data processing: Familiarity with data cleaning and preprocessing tools (e.g., Pandas, NumPy, Skrub).
· Strong interest in AI and machine learning; curiosity about how structured data can be transformed into actionable insights, e.g., with scikit-learn (a sketch follows this list).
· Self-starter with the ability to work autonomously and solve problems creatively.
· Good communication skills in English.
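
As a rough illustration of that last technical point — turning structured data into something a model can act on, with the tools listed above — here is a minimal sketch that vectorizes a small table with Skrub's TableVectorizer and clusters the result with scikit-learn. The columns and rows are invented sample data and do not describe Neuralk-AI's platform or datasets.

    # Illustrative sketch only — the table below is invented sample data,
    # not Neuralk-AI's platform or datasets.
    import pandas as pd
    from skrub import TableVectorizer       # encodes mixed-type tables
    from sklearn.cluster import KMeans

    df = pd.DataFrame({
        "category": ["shoes", "shoes", "books", "books"],
        "description": ["running shoe", "trail shoe",
                        "sci-fi novel", "cookbook"],
        "price": [89.9, 120.0, 14.5, 22.0],
    })

    # TableVectorizer chooses an encoder per column (numeric passthrough,
    # one-hot or text encoders for strings) and returns one numeric matrix.
    vectors = TableVectorizer().fit_transform(df)

    # Any scikit-learn estimator can consume the result, e.g. clustering.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)
    print(labels)  # cluster assignment per row

The same vectors could just as well feed a classifier or regressor, which is the pattern behind the embedding platform described above.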

Nice to Have

· Experience with large-scale data collection or analysis projects.
· Interest or experience in deep learning frameworks (e.g., PyTorch, TensorFlow).
· Familiarity with version control systems like Git.

Recruitment Process

Please submit your application using the following link to ensure it is reviewed by our team: https://tally.so/r/wodVQX.

Additional Information

· Contract Type: Internship (4 to 6 months)
· Location: Paris
· Occasional remote work allowed
