AI Quality Assurance Engineer (f/m/d) - Allianz - Frankfurt

Job description

As an AI Quality Assurance Engineer (f/m/d), you will be part of the AI Center of Enablement (AI CoE), which centrally coordinates the responsible adoption and scalable implementation of AI across AllianzGI by providing shared capabilities, governance frameworks, and reusable technical standards. As part of this mandate, you will operate the GenAI Hub as the central enablement platform for productivity and business use cases powered by generative and agentic AI capabilities, enabling decentralized teams to develop AI solutions within governance‑compliant environments.

In this context, you are responsible for the evaluation and validation of both centrally provided AI Hub capabilities and decentralized AI use cases. The role focuses on defining and operationalising evaluation approaches for Generative and Agentic AI systems, supporting delivery teams in translating functional AI requirements into technically testable criteria, and deriving reusable validation methods and best practices that can be adopted by other delivery teams and integrated into enterprise AI governance frameworks and standards. Acting as a central AI quality assurance and guardrails engineer within the AI CoE, the role promotes AI Quality Assurance methods that will enable the scalable and regulatorily compliant deployment of AI systems across the company.

This position will be based in Frankfurt or Munich.

What you will do

Perform quality assurance for the GenAI Hub (portal) and platform, and evaluate centrally provided AI capabilities prior to enterprise‑wide adoption
Define, implement, and prepare the rollout of reusable AI QA methods, AI evaluation frameworks, and implementation patterns
Contribute to AI testing standards, best practices, and governance frameworks to support regulatorily compliant deployment of AI solutions
Place a strong focus on automated testing of AI evals, complemented by manual testing and red teaming where appropriate
Assess AI system behaviour regarding reliability, performance, explainability, robustness, and governance alignment
Define and implement CI/CD regression gates (“quality gates”): maintaining a golden eval set, running evals in CI, and blocking rollout on regressions
Apply best practices and principles in prompt, context, and harness engineering to ensure quality practices are systematically embedded within AI solution design & delivery pipelines
Incorporate progressive disclosure principles and proactively mitigate risks such as context rot to maintain performance and reliability over time
Utilize spec-driven development approaches, using frameworks such as OpenSpec, Tessl, and GitHub Spec Kit, to codify clear and effective quality instructions
Apply a strong understanding of feature flagging and A/B testing methodologies to support experimentation and continuous improvement of AI solutions

What you bring

Strong conceptual and practical understanding of prompt, context, and harness engineering, intent-based testing, spec-driven development, behaviour-driven development and codified acceptance criteria, feature flagging, and A/B testing
Min. 3 years of experience as an engineer, with strong advocacy for and knowledge of Software Quality Assurance and System Testing in the Agentic AI space
Min. 2 years of experience with evaluation frameworks, behavioural/performance testing, and observability platforms (e.g., LangSmith, DeepEval, Ragas, Langfuse, Braintrust, Datadog) to support robust monitoring and analysis of AI systems
Experience in defining test strategies, test plans, acceptance criteria, or validation approaches
Deep understanding of Generative AI systems (e.g., LLM-based applications) and agentic AI workflows
Familiarity with evaluation of GenAI outputs (e.g. hallucination risk, consistency, explainability, robustness)
Understanding of prompt-based system behaviour, orchestration patterns, or AI agents
Familiarity with data pipelines, APIs, and system integration patterns
Ability to translate functional AI requirements into technically testable specifications (BA-type tasks as required)
Experience working in Agile delivery environments and cross-functional teams
Strong analytical and problem-solving skills
Experience in regulated financial services environments is beneficial

What we offer

We empower our employees by ensuring flexible work arrangements that maintain a balance between performance, productivity, career development and personal priorities (e.g., hybrid model, flexible working hours)
Securing your future: Access to company pension/savings plans
Family support (relocation, childcare facilities)
Company share purchasing plan
Mental health and wellbeing programs
Mobility solutions (Jobrad bike leasing, subsidised Jobticket)
Career opportunities within the entire Allianz Group
Self-guided learning & development
Volunteering time
… and so much more!
