Position Details
About this role
This role involves developing and managing complex multimodal datasets for training and evaluating AI models, with a focus on synthetic data generation and data annotation.
Key Responsibilities
- Design data collection efforts
- Analyze large datasets
- Build data analysis tools
- Collaborate with scientists
- Ensure data quality
Technical Overview
The environment includes data collection, synthetic data generation, data analysis, and annotation systems, primarily using Python and related tools for multimodal and speech/text data.
Ideal Candidate
The ideal candidate is a mid-level AI or NLP professional with 2+ years of experience in language data processing, data collection, and annotation, who is proficient in Python and has a strong background in computational linguistics or AI data creation. They should be collaborative, detail-oriented, and capable of handling complex multimodal datasets.
Deal Breakers
- Lack of experience with language data annotation systems
- No proficiency in scripting languages like Python
- No relevant advanced degree
- Less than 2 years of experience in a relevant field