On May 22, we presented a poster summarizing the current outcomes of our collaboration with Paula Gürtler at the 29th International Student Scientific Conference (POSTER) in Prague. The work, titled "Dimensions of Explainability in AI Alignment", stood out during the poster session and the subsequent elevator pitch as an original, interdisciplinary proposal. It sparked interest among the conference committee, and the positive feedback we received is an important validation of our efforts. Read the full paper or the abstract below.

Authors of this article presenting at POSTER

Human-AI alignment is challenging due to limitations in both technical solutions and governance frameworks. Given the infeasibility of properly anticipating all potential misalignment risks, we see explainability as essential for continuous oversight, bridging the gap between AI systems, governance, and human intervention. Recognizing the multi-faceted character of the problem, we argue for a structured framework for evaluating explainability methods, moving beyond narrow technical metrics, to enhance future developments in AI accountability and alignment.

RAI at POSTER: Explainability for AI Alignment

Written by:

Martin Krutský

Jiří Němeček

Jakub Peleška