Chart What I Say: Exploring Cross-Modality Prompt Alignment in AI-Assisted Chart Authoring
arXiv - CS - Human-Computer Interaction Pub Date : 2024-04-07 , DOI: arxiv-2404.05103
Nazar Ponochevnyi, Anastasia Kuzminykh

Recent chart-authoring systems, such as Amazon Q in QuickSight and Copilot for Power BI, demonstrate an emergent focus on supporting natural language input to share meaningful insights from data through chart creation. Currently, chart-authoring systems tend to integrate voice input capabilities by relying on speech-to-text transcription, processing spoken and typed input similarly. However, cross-modality input comparisons in other interaction domains suggest that the structure of spoken and typed-in interactions could notably differ, reflecting variations in user expectations based on interface affordances. Thus, in this work, we compare spoken and typed instructions for chart creation. Findings suggest that while both text and voice instructions cover chart elements and element organization, voice descriptions have a variety of command formats, element characteristics, and complex linguistic features. Based on these findings, we developed guidelines for designing voice-based authoring-oriented systems and additional features that can be incorporated into existing text-based systems to support speech modality.

Updated: 2024-04-09