generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation
ACM Transactions on Interactive Intelligent Systems (IF 3.4), Pub Date: 2024-03-14, DOI: 10.1145/3652028
Thilo Spinner 1, Rebecca Kehlbeck 2, Rita Sevastjanova 1, Tobias Stähle 2, Daniel A. Keim 2, Oliver Deussen 2, Mennatallah El-Assady 1

Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.
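To make the central data structure concrete, the following is a minimal sketch of a beam search that keeps its candidates as a tree, so that each retained path can later be inspected or compared. It is not the generAItor implementation: the names `Node`, `beam_search_tree`, and `toy_model`, as well as the hand-written probability table, are assumptions made purely for illustration, and a real system would query a language model for next-token probabilities instead.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    """One token in the beam search tree (hypothetical structure)."""
    token: str
    log_prob: float  # cumulative log-probability of the path ending here
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def path(self) -> List[str]:
        """Tokens from just below the root down to this node."""
        node, tokens = self, []
        while node.parent is not None:
            tokens.append(node.token)
            node = node.parent
        return tokens[::-1]


def beam_search_tree(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    beam_width: int,
    depth: int,
) -> Tuple[Node, List[Node]]:
    """Run beam search and return (root of the tree, final beams)."""
    root = Node(token="<root>", log_prob=0.0)
    beams = [root]
    for _ in range(depth):
        candidates = []
        for node in beams:
            # Expand every current beam by each possible next token.
            for token, p in next_token_probs(node.path()).items():
                if p > 0:
                    candidates.append(
                        Node(token, node.log_prob + math.log(p), parent=node)
                    )
        if not candidates:
            break
        # Keep only the beam_width most probable continuations.
        candidates.sort(key=lambda n: n.log_prob, reverse=True)
        beams = candidates[:beam_width]
        for node in beams:
            node.parent.children.append(node)  # record survivors in the tree
    return root, beams


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a language model: a fixed, made-up probability table."""
    table = {
        (): {"The": 0.6, "A": 0.4},
        ("The",): {"doctor": 0.7, "nurse": 0.3},
        ("A",): {"doctor": 0.9, "nurse": 0.1},
    }
    return table.get(tuple(prefix), {})


root, beams = beam_search_tree(toy_model, beam_width=2, depth=2)
for beam in beams:
    print(" ".join(beam.path()), round(math.exp(beam.log_prob), 2))
```

In this sketch only the surviving beams are attached to the tree. A tool that also wants to show pruned alternatives would attach every candidate and mark the retained ones instead.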




Updated: 2024-03-14