About this role
AWS is hiring multiple Research Scientist II candidates to perform and support psychometric research and operational work for certification exam development. The role covers item and test analyses, test form assembly, item bank health monitoring, standard setting, and statistical/psychometric modeling to ensure exam quality and validity.
Key Responsibilities
- Perform operational psychometric analysis (item and test analyses, item bank health, standard setting, job task analysis (JTA))
- Build optimal test forms and pools using optimization techniques
- Develop and apply statistical and psychometric models to evaluate exam validity, reliability, applicability, efficiency, and accuracy
- Participate in research projects using ML, NLP, and GenAI to improve operational processes
- Develop automation code in R or Python for psychometric workflow pipelines and report results
Technical Overview
The technical scope includes operational psychometric tasks such as item response theory, classical test theory, equating and scaling, and form/pool assembly, along with complex test designs such as linear-on-the-fly testing (LOFT) and computerized adaptive testing (CAT). The scientist will also develop automation code in R or Python and apply machine learning (ML) and/or natural language processing (NLP) techniques, while following ISO/IEC 17024:2012 and NCCA accreditation standards.
Ideal Candidate
The ideal candidate is a Research Scientist II with a PhD in psychometrics, educational measurement, statistics, or a closely related quantitative field. They have at least one year of experience performing operational psychometric work for large-scale education, licensure, or certification programs, including item response theory, equating and scaling, standard setting, and job task analysis, plus programming skills in R and/or Python and exposure to ML or NLP.
Must-Have Skills
- PhD or foreign equivalent degree in Statistics, Psychometrics, Educational Measurement, Quantitative Psychology, Data Science, Industrial-Organizational (I/O) Psychology, or a related field
- One year of research or work experience in the job offered or as a Research Scientist, Research Assistant, Software Engineer, or a related occupation
- 1 year of experience in large-scale education, licensure, or certification assessment programs
- Operational psychometric tasks on large-scale education, licensure, or certification assessment programs, including item analysis, equating and scaling, item response theory, classical test theory, form and pool assembly, item bank health analysis, standard setting, and job task analysis
- At least one of the complex test designs, such as linear-on-the-fly testing (LOFT) or computerized adaptive testing (CAT)
- At least one of the following areas: machine learning (ML) or natural language processing (NLP)
- Programming skills in at least one script-based programming language (R, Python)
- Follow accreditation standards set by ISO/IEC 17024:2012 and the National Council for Certifying Agencies (NCCA)
Nice-to-Have Skills
- Generative Artificial Intelligence (GenAI)
- Natural Language Processing (NLP)
- Machine Learning (ML)
Required Skills
psychometric analysis, automated test assembly, item analysis, optimal item bank design, job task analysis, standard setting, quality assurance, project planning, item response theory, classical test theory, equating and scaling, form and pool assembly, item bank health analysis, linear-on-the-fly testing (LOFT), computerized adaptive testing (CAT), Machine Learning (ML), natural language processing (NLP), Generative Artificial Intelligence (GenAI), R, Python, statistical modeling, psychometric modeling, ISO/IEC 17024:2012, National Council for Certifying Agencies (NCCA)
Hard Skills
psychometric aspects of exam development and operations; automated test assembly; item analyses; optimal item bank design; job task analysis; standard setting; quality assurance; project planning; item response theory; equating and scaling; classical test theory; form and pool assembly; item bank health analysis; operational psychometric tasks on large-scale education, licensure, or certification assessment programs; large-scale education, licensure, or certification assessment programs; statistical modeling; psychometric modeling; evaluate and ensure exam validity; evaluate and ensure exam reliability; evaluate and ensure exam applicability; evaluate and ensure exam efficiency; evaluate and ensure exam accuracy; machine learning (ML); natural language processing (NLP); generative artificial intelligence (GenAI); R; Python; automation code development for psychometric workflow pipeline; present, interpret, and communicate analysis results through written and oral reports; follow ISO/IEC 17024:2012 accreditation standards; National Council for Certifying Agencies (NCCA) standards for valid psychometric practices; building optimal test forms and pools via optimization techniques; analyzing and monitoring item bank health; developing and applying statistical and psychometric modeling; refresh test blueprints; standard setting studies; job task analysis (JTA)
Soft Skills
presenting results, interpreting analysis outcomes, communicating to stakeholders through written reports, communicating to stakeholders through oral reports, stakeholder management, participating in research projects, engagement with professional community, collaboration
Keywords for Your Resume
Research Scientist II, AMZ9698004, psychometric analysis, exam development, exam operations, automated test assembly, item analysis, optimal item bank design, job task analysis, JTA, standard setting, quality assurance, project planning, item response theory, classical test theory, equating and scaling, form and pool assembly, item bank health analysis, standard setting studies, linear-on-the-fly testing (LOFT), LOFT, computerized adaptive testing (CAT), CAT, Machine Learning (ML), natural language processing (NLP), GenAI, Generative Artificial Intelligence (GenAI), R, Python, statistical modeling, psychometric modeling, validity, reliability, applicability, efficiency, accuracy, ISO/IEC 17024:2012, International Organization for Standardization/International Electrotechnical Commission 17024, National Council for Certifying Agencies (NCCA)
Deal Breakers
- PhD (or foreign equivalent) in an approved quantitative/psychometrics-related field
- Must have 1 year of experience in operational psychometric tasks for large-scale education, licensure, or certification assessment programs
- Must have experience with at least one complex test design (LOFT or CAT)
- Must have programming skills in at least one of R or Python