AI Quality Assurance Engineer (f/m/d)
Frankfurt, GERMANY
Job description
As an AI Quality Assurance Engineer (f/m/d), you will be part of the AI Center of Enablement (AI CoE), which centrally coordinates the responsible adoption and scalable implementation of AI across AllianzGI by providing shared capabilities, governance frameworks, and reusable technical standards. As part of this mandate, you will operate the GenAI Hub as the central enablement platform for productivity and business use cases powered by generative and agentic AI capabilities, enabling decentralized teams to develop AI solutions within governance‑compliant environments.
In this context, you are responsible for the evaluation and validation of both centrally provided AI Hub capabilities and decentralized AI use cases. The role focuses on defining and operationalising evaluation approaches for Generative and Agentic AI systems, supporting delivery teams in translating functional AI requirements into technically testable criteria, and deriving reusable validation methods and best practices that can be adopted by other delivery teams and integrated into enterprise AI governance frameworks and standards. Acting as a central AI quality assurance and guardrails engineer within the AI CoE, the role promotes AI Quality Assurance methods that will enable the scalable and regulatorily compliant deployment of AI systems across the company.
This position will be based in Frankfurt or Munich.
What you will do
- Perform quality assurance of the GenAI Hub (portal) and platform, and evaluate centrally provided AI capabilities prior to enterprise‑wide adoption
- Define, implement, and prepare the rollout of reusable AI QA methods, AI evaluation frameworks, and implementation patterns
- Contribute to AI testing standards, best practices, and governance frameworks to support regulatorily compliant deployment of AI solutions
- Focus strongly on automated AI evals, complemented by manual testing and red teaming where appropriate
- Assess AI system behaviour regarding reliability, performance, explainability, robustness, and governance alignment
- Define and implement CI/CD regression gates (“quality gates”): maintain a golden eval set, run evals in CI, and block rollout on regressions
- Apply best practices and principles in prompt, context, and harness engineering to ensure quality practices are systematically embedded within AI solution design & delivery pipelines
- Incorporate progressive disclosure principles and proactively mitigate risks such as context rot to maintain performance and reliability over time
- Utilize spec-driven development approaches, using frameworks such as OpenSpec, Tessl, and GitHub Spec Kit, to codify clear and effective quality instructions
- Apply a strong understanding of feature flagging and A/B testing methodologies to support experimentation and continuous improvement of AI solutions
What you bring
- Strong conceptual and practical understanding of prompt / context / harness engineering, intent-based testing, spec-driven development, behaviour-driven development and codified acceptance criteria, feature flagging, and A/B testing
- Min. 3 years of experience as an engineer, with strong advocacy for and knowledge of software quality assurance and system testing in the agentic AI space
- Min. 2 years of experience with evaluation frameworks, behavioural/performance testing, and observability platforms (e.g., LangSmith, DeepEval, Ragas, Langfuse, Braintrust, Datadog) to support robust monitoring and analysis of AI systems
- Experience in defining test strategies, test plans, acceptance criteria, or validation approaches
- Deep understanding of Generative AI systems (e.g. LLM-based applications) and agentic AI workflows
- Familiarity with evaluation of GenAI outputs (e.g. hallucination risk, consistency, explainability, robustness)
- Understanding of prompt-based system behaviour, orchestration patterns, or AI agents
- Familiarity with data pipelines, APIs, and system integration patterns
- Ability to translate functional AI requirements into technically testable specifications (business-analyst-type tasks as required)
- Experience working in Agile delivery environments and cross-functional teams
- Strong analytical and problem-solving skills
- Experience in regulated financial services environments is beneficial
What we offer
- Flexible work arrangements that balance performance, productivity, career development, and personal priorities (e.g., hybrid model / flexible working hours)
- Securing your future: Access to company pension/savings plans
- Family support (relocation / childcare facilities)
- Company share purchasing plan
- Mental health and wellbeing programs
- Mobility solutions (Jobrad bike leasing, subsidized Jobticket)
- Career opportunities within the entire Allianz Group
- Self-guided learning & development
- Volunteering time
- … and so much more!