Position Details

Type Full-Time

Experience Senior

Exp. Years Not specified

Education Not specified

Category QA & Testing

About this role

This role owns the quality validation of AI-enabled applications, agent workflows, and non-deterministic outputs in production-style enterprise environments. The engineer builds and executes quality strategies that measure hallucination risk, bias, safety, reliability, and policy adherence, so that AI capabilities meet quality and operational standards before release.

Key Responsibilities

Design and execute quality strategies for AI-powered applications, agents, prompts, and workflow automations
Validate AI outputs for accuracy, completeness, consistency, policy adherence, and business usefulness
Define test approaches for AI workflows and build automated test assets
Run scenario-based and adversarial checks for prompts and outputs
Partner with engineering to remediate defects and ensure release readiness for agentic systems

Technical Overview

The technical scope includes functional testing, regression testing, API validation, UI verification, output scoring, hallucination detection, evaluation dataset design, and drift monitoring. Work includes scenario-based and adversarial checks, automated test asset creation, and deployment gate readiness for agentic systems within a governance-aware Forge model.

Ideal Candidate

The ideal candidate is a senior quality engineering professional focused on AI systems, with experience validating AI-enabled applications, agent workflows, and non-deterministic outputs in production-style enterprise environments. They can design quality strategies across functional, regression, API, and UI testing, and explicitly test hallucination risk, bias, safety, reliability, and drift using evaluation datasets and automated test assets.

Must-Have Skills

English (Required)
Own the quality validation of AI-enabled applications, agent workflows, and non-deterministic outputs in production-style enterprise environments
Design and execute quality strategies for AI-powered applications, agents, prompts, and workflow automations
Validate AI outputs for accuracy, completeness, consistency, policy adherence, and business usefulness

Nice-to-Have Skills

hallucination detection, evaluation dataset design, drift monitoring, deployment gate readiness for agentic systems, scenario-based and adversarial checks, output scoring using defined scoring rubrics

Tools & Platforms

Forge model

Required Skills

Hard Skills

quality validation, AI-enabled applications, agent workflows, non-deterministic outputs, functional testing, regression testing, API validation, UI verification, output scoring, hallucination detection, evaluation dataset design, drift monitoring, deployment gate readiness, scenario-based checks, adversarial checks, prompt-driven solutions, enterprise automations, bias, safety, reliability, correctness, completeness, hallucination risk, policy adherence, accuracy validation, enterprise environments, production readiness, security standards, operational standards, test approach design, automated test assets, validating prompts and outputs, remediating defects with engineering, governance-aware and evidence-driven testing

Soft Skills

stakeholder collaboration, partnering with engineering, analytical judgment, communication, risk management mindset

Industry & Role

Industry Banking

Job Function Validate and test AI systems and agent workflows to ensure accuracy, safety, reliability, and production readiness

Role Subtype QA Lead

Tech Domains AI & Machine Learning, QA & Testing, Cybersecurity, Python

Clearance & Visa

Visa Sponsorship No

Keywords for Your Resume

Senior AI QA Engineer, AI QA, quality validation, AI-enabled applications, agent workflows, non-deterministic outputs, functional testing, regression testing, API validation, UI verification, output scoring, hallucination detection, evaluation dataset design, drift monitoring, deployment gate readiness, scenario-based and adversarial checks, prompt-driven solutions, enterprise automations, bias, safety, reliability, policy adherence, production readiness, Forge model

Deal Breakers

English fluency required
Must be able to validate AI outputs for accuracy, completeness, consistency, policy adherence, and business usefulness
Must be able to design and execute quality strategies including functional testing, regression testing, API validation, and UI verification
