Data Scientist, AWS Quick Data

at Amazon.com

📍 US, CA, Santa Clara | Hybrid | Posted April 01, 2026
Type Full-Time
Experience Mid-level
Exp. Years 2+ years
Education Master's degree in Science, Technology, Engineering, or Mathematics (STEM), or equivalent
Category AI & Machine Learning

The Data Scientist II will design and develop evaluation and benchmarking datasets for Quick Suite AI features, using LLMs to generate synthetic data and conducting human annotation audits. The role focuses on measuring model performance, creating ground-truth QA datasets, and contributing to Responsible AI initiatives.

  • Design and develop comprehensive evaluation and benchmarking datasets for Quick Suite AI-powered features
  • Use LLMs to generate synthetic data corpora, and evaluate data and assess its quality in LLM-as-a-judge settings
  • Create ground truth datasets with high-quality question-answer pairs across diverse domains and use cases
  • Lead human annotation initiatives and model evaluation audits to ensure data quality and relevance
  • Develop and refine annotation guidelines and quality frameworks for evaluation tasks

The role involves data-centric AI work with SQL/Python/R/SAS/MATLAB, building evaluation pipelines, and applying ML/statistical methods to benchmark model performance. Key areas include LLM-based synthetic data generation, RAG approaches, and quality frameworks for evaluation tasks.

The ideal candidate is a mid-level data scientist with 2+ years of data science experience, strong SQL/Python/R skills, and a background in building and evaluating AI datasets. They should be adept at working with LLMs for synthetic data, leading annotation initiatives, and applying quantitative analysis to improve model performance while adhering to Responsible AI practices.

  • 2+ years of data scientist experience
  • 3+ years of experience with data querying languages (e.g. SQL), scripting languages (e.g. Python), or statistical/mathematical software (e.g. R, SAS, MATLAB)
  • 3+ years of experience with machine learning/statistical modeling data analysis tools and techniques, and the parameters that affect their performance
  • 1+ years of experience working with or evaluating AI systems
  • 1+ years of experience creating or contributing to mathematical textbooks, research papers, or educational content
  • Master's degree in Science, Technology, Engineering, or Mathematics (STEM), or experience working in Science, Technology, Engineering, or Mathematics (STEM)
  • Experience applying theoretical models in an applied environment

  • Ph.D. in Science, Technology, Engineering, or Mathematics (STEM)
  • Knowledge of machine learning concepts and their application to reasoning and problem-solving
  • Experience in an ML or data scientist role with a large technology company
  • Experience in defining and creating benchmarks for assessing GenAI model performance
  • Experience working on multi-team, cross-disciplinary projects
  • Experience applying quantitative analysis to solve business problems and making data-driven business decisions
  • Experience effectively communicating complex concepts through written and verbal communication
SQL, Structured Query Language, Python, R, SAS, MATLAB, LLM, Retrieval-Augmented Generation
Data Scientist, SQL, Structured Query Language, Python, R, SAS, MATLAB, Machine Learning, Statistical Modeling, Data Analysis, Ground Truth Datasets, Annotation Guidelines, Human Annotation, Data Pipelines, Evaluation Datasets, Responsible AI, Retrieval-Augmented Generation, LLM, Synthetic Data, Question-Answer pairs, LLM-as-a-judge
Collaboration, Communication, Analytical thinking, Problem-solving, Attention to detail, Teamwork

Preferred

AI/ML degree or certifications
Industry SaaS
Job Function Develop and evaluate data and datasets for AI/ML features to enable enterprise-ready generative AI.
Role Subtype Data Scientist II
Tech Domains Python, SQL / PostgreSQL, R, MATLAB, SAS, Machine Learning, Ground Truth Datasets, NLP, LLM
Data Scientist II, Data Scientist, Generative AI, LLM, LLM-as-a-judge, Retrieval-Augmented Generation, AWS Quick Suite, Hybrid, Data Pipelines, Python, SQL, Structured Query Language, R, SAS, MATLAB, Ground Truth Datasets, Annotation Guidelines, Human Annotation, Responsible AI, Evaluation Datasets, Question-Answer pairs, Synthetic Data

  • 2+ years data scientist experience not met
  • 3+ years data querying/scripting/statistical software experience not met
  • No experience with AI systems or evaluation of AI models
  • No Master's degree in STEM or equivalent experience
  • Lack of experience applying theoretical models in an applied environment
