Position Details
About this role
As an AI Tutor for Rufus, you will evaluate and label data to improve the fluency and overall shopping experience of Amazon’s shopping AI model. You’ll apply high judgment to ambiguous tasks, follow labeling guidelines and KPIs, and help refine prompting strategies in partnership with product and engineering teams.
Key Responsibilities
- Deliver high-quality labeled data that meets KPIs
- Conduct high-judgment evaluations for LLM training
- Generate human insight data across text, image, video, and audio
- Develop prompting strategies to train and improve the Shopping LLM
- Analyze root causes and propose solutions to improve labeling quality and SOP/tooling
Technical Overview
This role supports training of Large Language Models using high-judgment evaluations and labeled datasets across multiple modalities (text, image, video, audio). You will develop prompting strategies and use root-cause analysis to identify error patterns and improve labeling quality and operational processes.
Ideal Candidate
The ideal candidate is an AI-focused evaluator and labeler with strong language skills and proven high-judgment performance in ambiguous, detail-heavy tasks. They have experience supporting the training of Large Language Models through data labeling, quality evaluation, and iterative prompting strategies in collaboration with Product, Science, and Engineering teams.
Deal Breakers
- Must demonstrate strong language skills and exceptional attention to detail
- Must be able to perform high-judgment evaluations and deliver high-quality labeled data meeting KPIs
- Must be able to work with ambiguous or incomplete information and maintain a high quality bar