MiX Knowledge

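Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation

Categories: Computation and Language, Artificial Intelligence, Human-Computer Interaction

Authors: Rohan Chaudhury, Mihir Godbole, Aakash Garg, Jinsil Hwaryoung Seo

Published: 2024-03-31

Link: http://arxiv.org/abs/2404.01339v1

Abstract: Contemporary conversational systems often have a notable limitation: their responses lack the emotional depth and disfluency characteristic of human interaction. This absence becomes especially apparent when users seek more personalized and empathetic interactions, and it makes such systems seem mechanical and less relatable to human users. Recognizing this gap, we set out to humanize machine communication so that AI systems not only comprehend but also resonate. To address this shortcoming, we designed a speech synthesis pipeline. Within this framework, a state-of-the-art language model introduces human-like emotion and disfluencies in a zero-shot setting. The language model integrates these elements into the generated text during text generation, allowing the system to better mirror human speech patterns and promoting more intuitive and natural user interaction. Then, in the text-to-speech stage, a rule-based approach converts these generated elements into the corresponding speech patterns and emotive sounds. In our experiments, the synthesized speech produced by the system was almost indistinguishable from real human communication, making each interaction feel more personal and authentic.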

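The Emotional Impact of Game Duration: A Framework for Understanding Player Emotions in Extended Gameplay Sessions

Categories: Human-Computer Interaction, Artificial Intelligence

Authors: Anoop Kumar, Suresh Dodda, Navin Kamuni, Venkata Sai Mahesh Vuppalapati

Published: 2024-03-31

Link: http://arxiv.org/abs/2404.00526v1

Abstract: Since their development in the 1970s, video games have played a vital role in entertainment, and they became even more prominent during lockdowns, when people were looking for ways to entertain themselves. At the time, however, players were unaware that playing time could have a significant effect on how they feel. This poses a challenge for designers and developers creating new games, since they must manage the emotional impact these games have on players. The aim of this study is therefore to understand how game duration affects players' emotions. To this end, an emotion detection framework was created. According to the experimental results, the volunteers' general ability to express emotions increased from 20 to 60 minutes. Compared with shorter play times, the experiment found that extended play time does significantly affect players' emotions. Based on these findings, the study recommends that, to mitigate the potential emotional impact of playing computer and video games in the future, game producers consider making shorter, more entertaining games.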

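"My agent understands me better": Integrating Dynamic Human-like Memory Recall and Consolidation in LLM-Based Agents

Categories: Human-Computer Interaction, I.2.4; H.3.3

Authors: Yuki Hou, Haruki Tamoto, Homei Miyashita

Published: 2024-03-31

Link: http://arxiv.org/abs/2404.00573v1

Abstract: In this study, we propose a novel human-like memory architecture designed to enhance the cognitive abilities of dialogue agents based on large language models (LLMs). The proposed architecture enables agents to autonomously recall the memories needed to generate a response, effectively addressing a limitation in the temporal cognition of LLMs. We adopt human memory cue recall as the trigger for accurate and efficient memory recall. In addition, we develop a mathematical model that dynamically quantifies memory consolidation, taking into account factors such as contextual relevance, elapsed time, and recall frequency. The agent stores memories retrieved from the user's interaction history in a database that encapsulates the content and temporal context of each memory. This strategic storage allows the agent to recall specific memories and understand their significance to the user in a temporal context, similar to how humans recognize and recall past experiences.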
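
The abstract names three factors for memory consolidation (contextual relevance, elapsed time, recall frequency) but does not state the model itself. As a purely illustrative sketch, and not the authors' formula, one could score a memory with an exponential decay whose rate slows with every recall. The function name `consolidation_score`, the decay form, and the constant `base_decay` are all assumptions made for this example.

```python
import math

def consolidation_score(relevance, elapsed_hours, recall_count, base_decay=0.1):
    """Hypothetical memory-strength score in [0, 1] (not the paper's model).

    relevance      -- contextual relevance to the current query, in [0, 1]
    elapsed_hours  -- hours since the memory was last recalled
    recall_count   -- how many times the memory has been recalled so far
    base_decay     -- assumed decay rate per hour for a never-recalled memory
    """
    # Assumption: each recall slows forgetting, so the decay rate shrinks.
    decay_rate = base_decay / (1 + recall_count)
    retention = math.exp(-decay_rate * elapsed_hours)
    return relevance * retention

# Same relevance, different age and rehearsal history.
fresh = consolidation_score(relevance=0.8, elapsed_hours=0, recall_count=0)
stale = consolidation_score(relevance=0.8, elapsed_hours=24, recall_count=0)
rehearsed = consolidation_score(relevance=0.8, elapsed_hours=24, recall_count=5)
```

Under these assumptions a day-old memory that has been recalled several times scores higher than an equally old memory that was never recalled, which is the qualitative behaviour the abstract describes.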

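Designing Human-AI Systems: Anthropomorphism and Framing Bias on Human-AI Collaboration

Categories: Human-Computer Interaction

Authors: Samuel Aleksander Sánchez Olszewski

Published: 2024-03-31

Link: http://arxiv.org/abs/2404.00634v1

Abstract: AI is redefining how humans interact with technology, enabling synergistic collaboration between the two. However, the effect of human cognition on this collaboration remains unclear. This study investigates the impact of two cognitive biases, anthropomorphism and the framing effect, on human-AI collaboration in a hiring setting. Subjects were asked to select job candidates with the help of an AI-powered recommendation tool. The tool was manipulated to have either human-like or robot-like characteristics and presented its recommendations in either a positive or a negative frame. The results show that the framing of the AI's recommendations had no significant effect on subjects' decisions. By contrast, anthropomorphism significantly affected how often subjects agreed with the AI's recommendations. Contrary to expectations, subjects were less likely to agree with the AI when it had human-like characteristics. These findings show that cognitive biases can affect human-AI collaboration and highlight the need for tailored approaches to AI product design rather than a single one-size-fits-all solution.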

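Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World

Categories: Computation and Language, Artificial Intelligence, Human-Computer Interaction

Authors: Guande Wu, Chen Zhao, Claudio Silva, He He

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00246v1

Abstract: Language agents that interact with the world on their own have great potential for automating digital tasks. While large language model (LLM) agents have made progress in understanding and executing tasks such as text games and webpage control, many real-world tasks also require collaboration with humans or other LLMs in equal roles, which involves intent understanding, task coordination, and communication. To test LLMs' ability to collaborate, we design a blocks-world environment in which two agents, each with unique goals and skills, build a target structure together. To complete the goals, they can act in the world and communicate in natural language. In this environment, we design increasingly challenging settings to evaluate different aspects of collaboration, from independent tasks to more complex dependent tasks. We further adopt chain-of-thought prompts that include intermediate reasoning steps to model the partner's state and to identify and correct execution errors. Both human-machine and machine-machine experiments show that LLM agents have strong grounding capacities, and our approach significantly improves the evaluation metrics.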

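Enhancing Empathy in Virtual Reality: An Embodied Approach to Mindset Modulation

Categories: Human-Computer Interaction

Authors: Seoyeon Bae, Yoon Kyung Lee, Jungcheol Lee, Jaeheon Kim, Haeseong Jeon, Seung-Hwan Lim, Byung-Cheol Kim, Sowon Hahn

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00300v1

Abstract: A growth mindset has shown promising results for improving empathic ability. However, stimulating a growth mindset in VR-based empathy interventions remains underexplored. In this study, we implemented the prosocial VR content "Our Neighbor Hero", which focuses on embodying a virtual character to modulate players' mindsets. The virtual body serves as a stepping stone, enabling players to identify with the character and cultivate a growth mindset as they follow mission instructions. We considered several implementation factors to help players orient themselves within the VR experience, including positive feedback, content difficulty, background lighting, and multimodal feedback. We conducted an experiment to investigate the effectiveness of the intervention in increasing empathy. Our findings show that the VR content and mindset training encouraged participants to improve their growth mindset and empathic motivation. The VR content was developed for college students to enhance their empathy and teamwork skills. It has the potential to improve collaboration in organizational and community settings.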

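On Task and in Sync: Examining the Relationship between Gaze Synchrony and Self-Reported Attention During Video Lecture Learning

Categories: Human-Computer Interaction

Authors: Babette Bühler, Efe Bozkir, Hannah Deininger, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00333v1

Abstract: Successful learning depends on learners' ability to sustain attention, which is particularly challenging in online education because of limited teacher interaction. One potential indicator of attention is gaze synchrony, which has demonstrated predictive power for learning achievement in video-based learning in controlled experiments focused on manipulating attention. This study (N=84) uses experience sampling to examine the relationship between gaze synchrony and learners' self-reported attention in a realistic online video learning setting. Gaze synchrony was assessed through the Kullback-Leibler divergence of gaze density maps and through scanpath comparisons using the MultiMatch algorithm. The results show significantly higher gaze synchrony among attentive participants on both measures, and self-reported attention significantly predicted post-test scores. By contrast, the synchrony measures did not correlate with learning outcomes. While the findings support the hypothesis that attentive learners exhibit similar eye movements, using synchrony directly as an attention indicator poses challenges and calls for further research on the interplay of attention, gaze synchrony, and video content type.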
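
One of the two synchrony measures named above is the Kullback-Leibler divergence between gaze density maps. A minimal sketch of that computation follows, assuming each map is a small grid of fixation counts. The grid size, the smoothing constant `eps`, and the example viewers are illustrative assumptions; the abstract does not say how the study builds or aggregates its maps.

```python
import math

def normalize(density_map, eps=1e-9):
    """Flatten a 2D grid of fixation counts into a probability distribution.

    eps is an assumed smoothing constant added to every cell so that
    empty cells never produce log(0) or a division by zero.
    """
    flat = [value + eps for row in density_map for value in row]
    total = sum(flat)
    return [value / total for value in flat]

def kl_divergence(p_map, q_map):
    """KL(P || Q) between two gaze density maps of identical shape."""
    p = normalize(p_map)
    q = normalize(q_map)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical 2x2 maps: viewers A and B look at the top row, C at the bottom.
viewer_a = [[4, 1], [0, 0]]
viewer_b = [[3, 2], [0, 0]]
viewer_c = [[0, 0], [1, 4]]

same = kl_divergence(viewer_a, viewer_a)
close = kl_divergence(viewer_a, viewer_b)
far = kl_divergence(viewer_a, viewer_c)
```

A lower divergence means the two viewers distributed their gaze more similarly, so it corresponds to higher synchrony. KL divergence is asymmetric, so a real analysis has to fix which map serves as the reference.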

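Designing a User-centric Framework for Information Quality Ranking of Large-scale Street View Images

Categories: Human-Computer Interaction

Authors: Tahiya Chowdhury, Ilan Mandel, Jorge Ortiz, Wendy Ju

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00392v1

Abstract: Street view imagery (SVI), captured mainly by instrumented fleets or by dash cameras mounted in consumer vehicles, is a rapidly growing source of geospatial data used in urban sensing and development. These datasets are typically collected at random, are massive in size, and vary in quality, which limits the scope and extent of their use in urban planning. So far, little work has been done to identify the obstacles that users of such datasets encounter and the tools they need. This severely limits opportunities to use emerging street view imagery to support novel research questions that could improve the quality of urban life. This work includes a formative interview study with 5 expert users of large-scale street view datasets from academia, urban planning, and related professions, which identifies new use cases, challenges, and opportunities for increasing the utility of these datasets. Based on the user findings, we present a framework for evaluating the information quality of street imagery along three attributes (spatial, temporal, and content), which stakeholders can use to estimate the value of a dataset and to improve it over time for their respective use cases. We then present a case study using novel street view images, in which we evaluate our framework and demonstrate practical use cases to users. We discuss implications for designing future systems that support the collection and use of street view data to assist in sensing and planning the urban environment.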

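A Taxonomy for Human-LLM Interaction Modes: An Initial Exploration

Categories: Human-Computer Interaction

Authors: Jie Gao, Simret Araya Gebreegziabher, Kenny Tsu Wei Choo, Toby Jia-Jun Li, Simon Tangi Perrault, Thomas W. Malone

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00405v1

Abstract: With the release of ChatGPT, conversational prompting has become the most popular form of human-LLM interaction. However, its effectiveness is limited for more complex tasks involving reasoning, creativity, and iteration. Through a systematic analysis of HCI papers published since 2021, we identified four key phases in the human-LLM interaction flow (planning, facilitating, iterating, and testing) to precisely understand the dynamics of this process. In addition, we developed a taxonomy of four primary interaction modes: Mode 1: Standard Prompting, Mode 2: User Interface, Mode 3: Context-based, and Mode 4: Agent Facilitator. The taxonomy is further enriched using the "5W1H" guideline method, which involves a detailed examination of the definition, participant roles (Who), the phases in which it occurs (When), human goals and LLM capabilities (What), and the mechanics of each interaction mode (How). We anticipate that this taxonomy will contribute to the future design and evaluation of human-LLM interaction.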

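Visualizing Routes with AI-Discovered Street-View Patterns

Categories: Human-Computer Interaction, Machine Learning

Authors: Tsung Heng Wu, Md Amiruzzaman, Ye Zhao, Deepshikha Bhati, Jing Yang

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00431v1

Abstract: The visual appearance of streets plays an important role in studying social systems, such as understanding the built environment, driving routes, and associated social and economic factors. It has not yet been integrated into the typical geographic visualization interfaces (e.g., map services) used for planning driving routes. In this paper, we study this new visualization task and make several new contributions. First, we experiment with a set of AI techniques and propose a solution that uses semantic latent vectors to quantify visual appearance features. Second, we compute image similarities among a large set of street-view images and then discover spatial image patterns. Third, we integrate these discovered patterns into driving route planners using new visualization techniques. Finally, we present VivaRoutes, an interactive visualization prototype, to show how visualizations that use these discovered patterns can help users explore multiple routes effectively and interactively. We also conducted a user study to assess the usefulness and utility of VivaRoutes.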

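Interactive Multi-Robot Flocking with Gesture Responsiveness and Musical Accompaniment

Categories: Robotics, Artificial Intelligence, Human-Computer Interaction

Authors: Catie Cuan, Kyle Jeffrey, Kim Kleiven, Adrian Li, Emre Fisher, Matt Harrison, Benjie Holson, Allison Okamura, Matt Bennice

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00442v1

Abstract: For decades, robotics researchers have pursued a variety of tasks for multi-robot systems, from cooperative manipulation to search and rescue. These tasks are multi-robot extensions of classical robotic tasks and are often optimized along dimensions such as speed or efficiency. As robots transition from commercial and research settings into everyday environments, social task aims such as engagement or entertainment become increasingly relevant. This work presents a compelling multi-robot task whose main aim is to enthrall and interest. In this task, the goal is to entice a human to move with and participate in a dynamic, expressive robot flock. Toward this aim, the research team created algorithms for robot movement and for interaction modalities such as gestures and sound. The contributions are as follows: (1) a novel group navigation algorithm involving human and robot agents, (2) a gesture-responsive algorithm for real-time human-robot flocking interaction, (3) a weight mode characterization system for modifying flocking behavior, and (4) a method of encoding a choreographer's preferences inside a dynamic, adaptive, learned system. An experiment was performed to understand how humans behave when interacting with the flock under three conditions: weight modes selected by a human choreographer, by a learned model, or from a subset list. The experimental results show that perception of the experience is not influenced by the weight mode selection. This work elucidates how differing task aims, such as engagement, manifest in multi-robot system design and execution, and broadens the domain of multi-robot tasks.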

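Contextual AI Journaling: Integrating LLM and Time Series Behavioral Sensing Technology to Promote Self-Reflection and Well-being using the MindScape App

Categories: Human-Computer Interaction, Artificial Intelligence, H.5.0; H.5.3; H.5.m; J.0

Authors: Subigya Nepal, Arvind Pillai, William Campbell, Talie Massachi, Eunsol Soul Choi, Orson Xu, Joanna Kuc, Jeremy Huckins, Jason Holden, Colin Depp, Nicholas Jacobson, Mary Czerwinski, Eric Granholm, Andrew T. Campbell

Published: 2024-03-30

Link: http://arxiv.org/abs/2404.00487v1

Abstract: MindScape aims to study the benefits of integrating time series behavioral patterns (e.g., conversational engagement, sleep, location) with large language models (LLMs) to create a new form of contextual AI journaling that promotes self-reflection and well-being. We argue that integrating behavioral sensing into LLMs may open a new frontier for AI. In this late-breaking work paper, we discuss the design of the MindScape contextual journaling app, which uses LLMs and behavioral sensing to generate contextual and personalized journaling prompts intended to encourage self-reflection and emotional development. We also discuss the MindScape study of college students, based on a preliminary user study, and our upcoming study to assess the effectiveness of contextual AI journaling in promoting better well-being on college campuses. MindScape represents a new application class that embeds behavioral intelligence in AI.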

MindArm: Mechanized Intelligent Non-Invasive Neuro-Driven Prosthetic Arm System

Categories: Artificial Intelligence, Human-Computer Interaction, Robotics, I.2.9

Authors: Maha Nawaz, Abdul Basit, Muhammad Shafique

Published: 2024-03-29

Link: http://arxiv.org/abs/2403.19992v1

Abstract: Currently, people with disabilities or difficulty moving their arms (referred to as "patients") have very limited technological solutions to efficiently address their physiological limitations. This is mainly due to two reasons: (1) non-invasive solutions like mind-controlled prosthetic devices are typically very costly and require expensive maintenance; and (2) other solutions require costly invasive brain surgery, which is high-risk to perform, expensive, and difficult to maintain. Therefore, current technological solutions are not accessible to all patients with different financial backgrounds. Toward this, we propose a low-cost technological solution called MindArm, a mechanized intelligent non-invasive neuro-driven prosthetic arm system. Our MindArm system employs a deep neural network (DNN) engine to translate brain signals into the intended prosthetic arm motion, thereby helping patients to perform many activities despite their physiological limitations. Here, our MindArm system utilizes widely accessible and low-cost surface electroencephalogram (EEG) electrodes coupled with an Open Brain Computer Interface and UDP networking for acquiring brain signals and transmitting them to the compute module for signal processing. In the compute module, we run a trained DNN model to interpret the normalized micro-voltage of the brain signals and then translate it into a prosthetic arm action via serial communication. The experimental results on a fully working prototype demonstrate that, across the three defined actions, our MindArm system achieves positive success rates, i.e., 91% for idle/stationary, 85% for shake hand, and 84% for pick-up cup. This demonstrates that our MindArm provides a novel approach to an alternative low-cost mind-controlled prosthetic device for all patients.

ITCMA: A Generative Agent Based on a Computational Consciousness Structure

Categories: Artificial Intelligence, Human-Computer Interaction, Neurons and Cognition, I.2; J.4

Authors: Hanzhong Zhang, Jibin Yin, Haoyang Wang, Ziwei Xiang

Published: 2024-03-29

Link: http://arxiv.org/abs/2403.20097v1

Abstract: Large Language Models (LLMs) still face challenges in tasks requiring understanding implicit instructions and applying common-sense knowledge. In such scenarios, LLMs may require multiple attempts to achieve human-level performance, potentially leading to inaccurate responses or inferences in practical environments, affecting their long-term consistency and behavior. This paper introduces the Internal Time-Consciousness Machine (ITCM), a computational consciousness structure. We further propose the ITCM-based Agent (ITCMA), which supports behavior generation and reasoning in open-world settings. ITCMA enhances LLMs' ability to understand implicit instructions and apply common-sense knowledge by considering agents' interaction and reasoning with the environment. Evaluations in the Alfworld environment show that trained ITCMA outperforms the state-of-the-art (SOTA) by 9% on the seen set. Even untrained ITCMA achieves a 96% task completion rate on the seen set, 5% higher than SOTA, indicating its superiority over traditional intelligent agents in utility and generalization. In real-world tasks with quadruped robots, the untrained ITCMA achieves an 85% task completion rate, which is close to its performance in the unseen set, demonstrating its comparable utility in real-world settings.

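Entertainment Chatbot for the Digital Inclusion of Elderly People without Abstraction Capabilities

Categories: Computation and Language, Artificial Intelligence, Human-Computer Interaction, Machine Learning

Authors: Silvia García-Méndez, Francisco de Arriba-Pérez, Francisco J. González-Castaño, José A. Regueiro-Janeiro, Felipe Gil-Castiñeira

Published: 2024-03-29

Link: http://arxiv.org/abs/2404.01327v1

Abstract: Current language processing technologies allow the creation of conversational chatbot platforms. Although artificial intelligence is not yet mature enough to provide a satisfactory user experience in many mass-market domains, conversational interfaces have found their way into ad hoc applications such as call centres and online shopping assistants. So far, however, they have not been applied to the social inclusion of elderly people, who are particularly vulnerable to the digital divide. Many of them relieve their loneliness with traditional media such as TV and radio, which are known to create a feeling of companionship. In this paper, we present the EBER chatbot, designed to reduce the digital gap for the elderly. EBER reads news in the background and adapts its responses to the user's mood. Its novelty lies in the concept of "intelligent radio": instead of simplifying a digital information system to make it accessible to the elderly, a traditional channel they find familiar (background news) is augmented with voice dialogue. We make this possible by combining Artificial Intelligence Modelling Language, automatic natural language generation, and sentiment analysis. The system allows access to digital content of interest by combining words extracted from the user's answers to chatbot questions with keywords extracted from the news items. This approach makes it possible to define a measure of the user's abstraction capability based on the spatial representation of the word space. To demonstrate the suitability of the proposed solution, we present results of real experiments conducted with elderly people, which provided valuable insights. Our approach was considered satisfactory during the tests and improved the participants' information search capabilities.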

Enhancing Dimension-Reduced Scatter Plots with Class and Feature Centroids

Categories: Machine Learning, Human-Computer Interaction, J.3

Authors: Daniel B. Hier, Tayo Obafemi-Ajayi, Gayla R. Olbricht, Devin M. Burns, Sasha Petrenko, Donald C. Wunsch II

Published: 2024-03-29

Link: http://arxiv.org/abs/2403.20246v1

Abstract: Dimension reduction is increasingly applied to high-dimensional biomedical data to improve its interpretability. When datasets are reduced to two dimensions, each observation is assigned x and y coordinates and is represented as a point on a scatter plot. A significant challenge lies in interpreting the meaning of the x and y axes due to the complexities inherent in dimension reduction. This study addresses this challenge by using the x and y coordinates derived from dimension reduction to calculate class and feature centroids, which can be overlaid onto the scatter plots. This method connects the low-dimension space to the original high-dimensional space. We illustrate the utility of this approach with data derived from the phenotypes of three neurogenetic diseases and demonstrate how the addition of class and feature centroids increases the interpretability of scatter plots.
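
The centroid idea can be sketched in a few lines. The example below assumes that a class centroid is the mean reduced coordinate of the observations in that class and that a feature centroid is the mean coordinate weighted by the feature's value in each observation. The weighting scheme, the function names, and the toy data are assumptions for illustration; the abstract does not give the exact definitions.

```python
from collections import defaultdict

def class_centroids(points, labels):
    """Mean (x, y) of the reduced coordinates for each class label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), label in zip(points, labels):
        acc = sums[label]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def feature_centroid(points, feature_values):
    """Mean (x, y) weighted by a feature's value in each observation.

    Assumption: for a binary feature this is the mean position of the
    observations that have the feature.
    """
    total = sum(feature_values)
    cx = sum(w * x for w, (x, _) in zip(feature_values, points)) / total
    cy = sum(w * y for w, (_, y) in zip(feature_values, points)) / total
    return (cx, cy)

# Hypothetical 2D coordinates produced by some dimension-reduction method.
points = [(0.0, 0.0), (2.0, 0.0), (4.0, 4.0), (6.0, 4.0)]
labels = ["disease_A", "disease_A", "disease_B", "disease_B"]
has_feature = [1, 0, 1, 0]  # a binary phenotype feature per observation

centroids = class_centroids(points, labels)
feature_xy = feature_centroid(points, has_feature)
```

Plotting `centroids` and `feature_xy` on top of the scatter plot shows which classes and features each region of the reduced space is associated with.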

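Give Text A Chance: Advocating for Equal Consideration for Language and Visualization

Categories: Human-Computer Interaction

Authors: Chase Stokes, Marti A. Hearst

Published: 2024-03-29

Link: http://arxiv.org/abs/2404.00131v1

Abstract: Visualization research tends to de-emphasize consideration of the textual context in which its images are placed. We argue that visualization research should treat textual representations as a primary alternative to visual options when assessing designs, and that the construction of the language deserves the same attention as the visualization. We also call for consideration of readability when integrating visualizations with written text. Emphasizing these points will improve the validity of visualization research and allow it to fully account for the needs and responses of the audience.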

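Come Back Next Week: The Impact of Meeting-Free Weeks on Distributed Workers' Unstructured Time and Attention Negotiation

Categories: Human-Computer Interaction

Authors: Sharon Ferguson, Michael Massimi

Published: 2024-03-29

Link: http://arxiv.org/abs/2404.00161v1

Abstract: While distributed workers rely on scheduled meetings for coordination and collaboration, these meetings can also challenge their ability to focus. Protecting worker focus has been addressed from a technical perspective, but companies are now experimenting with organizational interventions such as meeting-free weeks. Recognizing that distributed collaboration is a sociotechnical challenge, we first conducted an interview study with distributed workers who took part in meeting-free weeks at an enterprise software company. We identified three orientations that workers exhibit during these weeks: Focus, Collaborative, and Time-Bound, each with a different level and use of unstructured time. These different orientations lead to challenges in attention negotiation, which may be suited to technical interventions. This motivated a follow-up study investigating attention negotiation and the compensating mechanisms workers developed during meeting-free weeks. Our framework identifies tensions between attention-getting strategies and attention-delegation strategies. We extend past work to show how workers adapt their virtual collaboration mechanisms in response to organizational interventions.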

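No Risk, No Reward: Towards an Automated Measure of Psychological Safety from Online Communication

Categories: Human-Computer Interaction

Authors: Sharon Ferguson, Georgia Van de Zande, Alison Olechowski

Published: 2024-03-29

Link: http://arxiv.org/abs/2404.00171v1

Abstract: The data created by virtual communication platforms offers an opportunity to explore automated measures for monitoring team performance. In this work, we explore one important characteristic of successful teams: psychological safety, or the belief that a team is safe for interpersonal risk-taking. To move towards an automated measure of this phenomenon, we derive from the literature virtual communication characteristics and message keywords related to elements of psychological safety. Using a mixed methods approach, we investigate whether these characteristics are present in the Slack messages of two design teams, one with high psychological safety and one with low psychological safety. We find that some usage characteristics, such as replies, reactions, and user mentions, may be promising indicators of higher levels of psychological safety, while simple keyword searches may not be nuanced enough. We present a first step toward automatically detecting this important yet complex team characteristic.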

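Tools and Tasks in Sensemaking: A Visual Accessibility Perspective

Categories: Human-Computer Interaction

Authors: Yichun Zhao, Miguel A. Nacenta

Published: 2024-03-29

Link: http://arxiv.org/abs/2404.00192v1

Abstract: Our previous interview study explored the needs and uses of diagrammatic information within the Blind and Low-Vision (BLV) community, resulting in a framework called the Ladder of Diagram Access. The framework outlines five levels of information access when interacting with diagrams. In this paper, we connect this framework to the global activity of sensemaking and discuss its accessibility for BLV people. We also discuss integrating the framework into the sensemaking process and explore the sensemaking practices and strategies currently employed by the BLV community, the challenges they face at different levels of the ladder, and potential solutions for making the data-driven workforce more inclusive.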

Real-time accident detection and physiological signal monitoring to enhance motorbike safety and emergency response

Categories: Systems and Control, Computers and Society, Human-Computer Interaction

Authors: S. M. Kayser Mehbub Siam, Khadiza Islam Sumaiya, Md Rakib Al-Amin, Tamim Hasan Turjo, Ahsanul Islam, A. H. M. A. Rahim, Md Rakibul Hasan

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19085v1

Abstract: Rapid urbanization and improved living standards have led to a substantial increase in the number of vehicles on the road, consequently resulting in a rise in the frequency of accidents. Among these accidents, motorbike accidents pose a particularly high risk, often resulting in serious injuries or deaths. A significant number of these fatalities occur due to delayed or inadequate medical attention. To this end, we propose a novel automatic detection and notification system specifically designed for motorbike accidents. The proposed system comprises two key components: a detection system and a physiological signal monitoring system. The detection system is integrated into the helmet and consists of a microcontroller, accelerometer, GPS, GSM, and Wi-Fi modules. The physio-monitoring system incorporates a sensor for monitoring pulse rate and SpO$_2$ saturation. All collected data are presented on an LCD display and wirelessly transmitted to the detection system through the microcontroller of the physiological signal monitoring system. If the accelerometer readings consistently deviate from the specified threshold decided through extensive experimentation, the system identifies the event as an accident and transmits the victim's information -- including the GPS location, pulse rate, and SpO$_2$ saturation rate -- to the designated emergency contacts. Preliminary results demonstrate the efficacy of the proposed system in accurately detecting motorbike accidents and promptly alerting emergency contacts. We firmly believe that the proposed system has the potential to significantly mitigate the risks associated with motorbike accidents and save lives.
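
The detection rule described above (readings that consistently deviate from a threshold) could be sketched as a run-length check. This is an illustrative reading of the abstract, not the authors' firmware. The threshold value, the interpretation of "deviate" as "exceed", the unit (g), and the required run length `min_consecutive` are all assumptions.

```python
def detect_accident(magnitudes, threshold, min_consecutive=3):
    """Return the index at which an accident is flagged, or None.

    magnitudes      -- acceleration magnitudes, one per sample (assumed in g)
    threshold       -- assumed limit above which a sample counts as abnormal
    min_consecutive -- assumed number of abnormal samples in a row required,
                       so that a single spike (e.g. a pothole) is ignored
    """
    run = 0
    for index, magnitude in enumerate(magnitudes):
        run = run + 1 if magnitude > threshold else 0
        if run >= min_consecutive:
            return index
    return None

# Hypothetical trace: one isolated spike, then a sustained deviation.
readings = [1.0, 1.1, 4.2, 1.0, 4.5, 4.8, 5.1, 1.2]
flagged_at = detect_accident(readings, threshold=3.0)
single_spike = detect_accident([1.0, 4.2, 1.0], threshold=3.0)
```

In the described system, a positive detection would then trigger the notification step, sending the GPS location, pulse rate, and SpO$_2$ reading to the emergency contacts.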

Exploring Holistic HMI Design for Automated Vehicles: Insights from a Participatory Workshop to Bridge In-Vehicle and External Communication

Categories: Human-Computer Interaction

Authors: Haoyu Dong, Tram Thi Minh Tran, Rutger Verstegen, Silvia Cazacu, Ruolin Gao, Marius Hoggenmüller, Debargha Dey, Mervyn Franssen, Markus Sasalovici, Pavlo Bazilinskyy, Marieke Martens

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19153v1

Abstract: Human-Machine Interfaces (HMIs) for automated vehicles (AVs) are typically divided into two categories: internal HMIs for interactions within the vehicle, and external HMIs for communication with other road users. In this work, we examine the prospects of bridging these two seemingly distinct domains. Through a participatory workshop with automotive user interface researchers and practitioners, we facilitated a critical exploration of holistic HMI design by having workshop participants collaboratively develop interaction scenarios involving AVs, in-vehicle users, and external road users. The discussion offers insights into the escalation of interface elements as an HMI design strategy, the direct interactions between different users, and an expanded understanding of holistic HMI design. This work reflects a collaborative effort to understand the practical aspects of this holistic design approach, offering new perspectives and encouraging further investigation into this underexplored aspect of automotive user interfaces.

Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration

Categories: Human-Computer Interaction, Computer Vision and Pattern Recognition

Authors: Louie Søs Meyer, Johanne Engel Aaen, Anitamalina Regitse Tranberg, Peter Kun, Matthias Freiberger, Sebastian Risi, Anders Sundnes Løvlie

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19174v1

Abstract: This Research through Design paper explores how object detection may be applied to a large digital art museum collection to facilitate new ways of encountering and experiencing art. We present the design and evaluation of an interactive application called SMKExplore, which allows users to explore a museum's digital collection of paintings by browsing through objects detected in the images, as a novel form of open-ended exploration. We provide three contributions. First, we show how an object detection pipeline can be integrated into a design process for visual exploration. Second, we present the design and development of an app that enables exploration of an art museum's collection. Third, we offer reflections on future possibilities for museums and HCI researchers to incorporate object detection techniques into the digitalization of museums.

CogniDot: Vasoactivity-based Cognitive Load Monitoring with a Miniature On-skin Sensor

Categories: Human-Computer Interaction

Authors: Hongbo Lan, Yanrong Li, Shixuan Li, Xin Yi, Tengxiang Zhang

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19206v1

Abstract: Vascular activities offer valuable signatures for psychological monitoring applications. We present CogniDot, an affordable, miniature skin sensor placed on the temporal area of the head that senses cognitive load with a single-pixel color sensor. With its energy-efficient design, bio-compatible adhesive, and compact size (22mm diameter, 8.5mm thickness), it is ideal for long-term monitoring of mind status. We describe the hardware design of the sensor in detail. The results of a user study with 12 participants show that CogniDot can accurately differentiate between three levels of cognitive load with a within-user accuracy of 97%. We also discuss its potential for broader applications.

An Interactive Human-Machine Learning Interface for Collecting and Learning from Complex Annotations

Categories: Machine Learning, Human-Computer Interaction

Authors: Jonathan Erskine, Matt Clifford, Alexander Hepburn, Raúl Santos-Rodríguez

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19339v1

Abstract: Human-Computer Interaction has been shown to lead to improvements in machine learning systems by boosting model performance, accelerating learning and building user confidence. In this work, we aim to alleviate the expectation that human annotators adapt to the constraints imposed by traditional labels by allowing for extra flexibility in the form in which supervision information is collected. For this, we propose a human-machine learning interface for binary classification tasks which enables human annotators to utilise counterfactual examples to complement standard binary labels as annotations for a dataset. Finally, we discuss the challenges in future extensions of this work.

"At the end of the day, I am accountable": Gig Workers' Self-Tracking for Multi-Dimensional Accountability Management

Categories: Human-Computer Interaction

Authors: Rie Helene Hernandez, Qiurong Song, Yubo Kou, Xinning Gui

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19436v1

Abstract: Tracking is inherent in and central to the gig economy. Platforms track gig workers' performance through metrics such as acceptance rate and punctuality, while gig workers themselves engage in self-tracking. Although prior research has extensively examined how gig platforms track workers through metrics -- with some studies briefly acknowledging the phenomenon of self-tracking among workers -- there is a dearth of studies that explore how and why gig workers track themselves. To address this, we conducted 25 semi-structured interviews, revealing how gig workers self-track to manage accountabilities to themselves and external entities across three identities: the holistic self, the entrepreneurial self, and the platformized self. We connect our findings to neoliberalism, through which we contextualize gig workers' self-accountability and the invisible labor of self-tracking. We further discuss how self-tracking mitigates information and power asymmetries in gig work and offer design implications to support gig workers' multi-dimensional self-tracking.

A theoretical framework for the design and analysis of computational thinking problems in education

Categories: Human-Computer Interaction

Authors: Giorgia Adorni, Alberto Piatti, Engin Bumbacher, Lucio Negrini, Francesco Mondada, Dorit Assaf, Francesca Mangili, Luca Gambardella

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19475v1

Abstract: The field of computational thinking education has grown in recent years as researchers and educators have sought to develop and assess students' computational thinking abilities. While much of the research in this area has focused on defining computational thinking, the competencies it involves and how to assess them in teaching and learning contexts, this work takes a different approach. We provide a more situated perspective on computational thinking, focusing on the types of problems that require computational thinking skills to be solved and the features that support these processes. We develop a framework for analysing existing computational thinking problems in an educational context. We conduct a comprehensive literature review to identify prototypical activities from areas where computational thinking is typically pursued in education. We identify the main components and characteristics of these activities, along with their influence on activating computational thinking competencies. The framework provides a catalogue of computational thinking skills that can be used to understand the relationship between problem features and competencies activated. This study contributes to the field of computational thinking education by offering a tool for evaluating and revising existing problems to activate specific skills and for assisting in designing new problems that target the development of particular competencies. The results of this study may be of interest to researchers and educators working in computational thinking education.

LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae

Categories: Human-Computer Interaction

Authors: Celia Chen, Alex Leitch

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19506v1

Abstract: This position paper argues that large language models (LLMs) constitute promising yet underutilized academic reading companions capable of enhancing learning. We detail an exploratory study examining Claude.ai from Anthropic, an LLM-based interactive assistant that helps students comprehend complex qualitative literature content. The study compares quantitative survey data and qualitative interviews assessing outcomes between a control group and an experimental group leveraging Claude.ai over a semester across two graduate courses. Initial findings demonstrate tangible improvements in reading comprehension and engagement among participants using the AI agent versus unsupported independent study. However, there is potential for overreliance and ethical considerations that warrant continued investigation. By documenting an early integration of an LLM reading companion into an educational context, this work contributes pragmatic insights to guide development of synthetic personae supporting learning. Broader impacts compel policy and industry actions to uphold responsible design in order to maximize benefits of AI integration while prioritizing student wellbeing.

Exploring Communication Dynamics: Eye-tracking Analysis in Pair Programming of Computer Science Education

Categories: Human-Computer Interaction

Authors: Wunmin Jang, Hong Gao, Tilman Michaeli, Enkelejda Kasneci

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19560v1

Abstract: Pair programming is widely recognized as an effective educational tool in computer science that promotes collaborative learning and mirrors real-world work dynamics. However, communication breakdowns within pairs significantly challenge this learning process. In this study, we use eye-tracking data recorded during pair programming sessions to study communication dynamics between various pair programming roles across different student, expert, and mixed group cohorts containing 19 participants. By combining eye-tracking data analysis with focus group interviews and questionnaires, we provide insights into communication's multifaceted nature in pair programming. Our findings highlight distinct eye-tracking patterns indicating changes in communication skills across group compositions, with participants prioritizing code exploration over communication, especially during challenging tasks. Further, students showed a preference for pairing with experts, emphasizing the importance of understanding group formation in pair programming scenarios. These insights emphasize the importance of understanding group dynamics and enhancing communication skills through pair programming for successful outcomes in computer science education.

Collaborative Interactive Evolution of Art in the Latent Space of Deep Generative Models

Categories: Neural and Evolutionary Computing, Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction, Machine Learning

Authors: Ole Hall, Anil Yaman

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19620v1

Abstract: Generative Adversarial Networks (GANs) have shown great success in generating high quality images and are thus used as one of the main approaches to generate art images. However, the image generation process usually involves sampling from the latent space of the learned art representations, allowing little control over the output. In this work, we first employ GANs that are trained to produce creative images using an architecture known as Creative Adversarial Networks (CANs); then we employ an evolutionary approach to navigate within the latent space of the models to discover images. We use automatic aesthetic and collaborative interactive human evaluation metrics to assess the generated images. In the human interactive evaluation case, we propose a collaborative evaluation based on the assessments of several participants. Furthermore, we also experiment with an intelligent mutation operator that aims to improve the quality of the images through local search based on an aesthetic measure. We evaluate the effectiveness of this approach by comparing the results produced by the automatic and collaborative interactive evolution. The results show that the proposed approach can generate highly attractive art images when the evolution is guided by collaborative human feedback.
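
The evolutionary navigation of a latent space can be illustrated with a small elitist loop over latent vectors. This sketch is a generic evolutionary search, not the authors' operator set. The placeholder fitness function stands in for an aesthetic score or aggregated human ratings of the image a generator would produce from each vector, and the names `evolve`, `placeholder_fitness`, and every hyperparameter are assumptions.

```python
import random

def evolve(fitness, dim=8, pop_size=20, generations=30, sigma=0.3, seed=0):
    """Elitist evolutionary search over latent vectors; returns the best one."""
    rng = random.Random(seed)
    # Start from latent vectors sampled from a standard normal prior.
    population = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 4]  # keep the top quarter unchanged
        # Children are Gaussian-mutated copies of randomly chosen parents.
        children = [
            [gene + rng.gauss(0, sigma) for gene in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

def placeholder_fitness(latent):
    """Stand-in score: prefers vectors near the origin (higher is better)."""
    return -sum(gene * gene for gene in latent)

initial_best = evolve(placeholder_fitness, generations=0)
evolved_best = evolve(placeholder_fitness, generations=30)
```

Because the parents survive each generation unchanged, the best score never decreases. In the setting the abstract describes, the fitness call would be replaced by decoding the vector with the trained generator and scoring the resulting image.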

Leveraging Counterfactual Paths for Contrastive Explanations of POMDP Policies

Categories: Artificial Intelligence, Human-Computer Interaction

Authors: Benjamin Kraske, Zakariya Laouar, Zachary Sunberg

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19760v1

Abstract: As humans come to rely on autonomous systems more, ensuring the transparency of such systems is important to their continued adoption. Explainable Artificial Intelligence (XAI) aims to reduce confusion and foster trust in systems by providing explanations of agent behavior. Partially observable Markov decision processes (POMDPs) provide a flexible framework capable of reasoning over transition and state uncertainty, while also being amenable to explanation. This work investigates the use of user-provided counterfactuals to generate contrastive explanations of POMDP policies. Feature expectations are used as a means of contrasting the performance of these policies. We demonstrate our approach in a Search and Rescue (SAR) setting. We analyze and discuss the associated challenges through two case studies.

Creating Aesthetic Sonifications on the Web with SIREN

Categories: Sound, Human-Computer Interaction, Multimedia, Audio and Speech Processing

Authors: Tristan Peng, Hongchan Choi, Jonathan Berger

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19763v1

Abstract: SIREN is a flexible, extensible, and customizable web-based general-purpose interface for auditory data display (sonification). It is designed as a digital audio workstation for sonification, and its synthesizers, written in JavaScript using the Web Audio API, facilitate intuitive mapping of data to auditory parameters for a wide range of purposes. This paper explores the breadth of sound synthesis techniques supported by SIREN, and details the structure and definition of a SIREN synthesizer module. The paper proposes further development that will increase SIREN's utility.

"I'm categorizing LLM as a productivity tool": Examining ethics of LLM use in HCI research practices

Categories: Human-Computer Interaction

Authors: Shivani Kapania, Ruiyi Wang, Toby Jia-Jun Li, Tianshi Li, Hong Shen

Published: 2024-03-28

Link: http://arxiv.org/abs/2403.19876v1

Abstract: Large language models are increasingly applied in real-world scenarios, including research and education. These models, however, come with well-known ethical issues, which may manifest in unexpected ways in human-computer interaction research due to the extensive engagement with human subjects. This paper reports on research practices related to LLM use, drawing on 16 semi-structured interviews and a survey conducted with 50 HCI researchers. We discuss the ways in which LLMs are already being utilized throughout the entire HCI research pipeline, from ideation to system development and paper writing. While researchers described nuanced understandings of ethical issues, they were rarely or only partially able to identify and address those ethical concerns in their own projects. This lack of action and reliance on workarounds was explained through the perceived lack of control and distributed responsibility in the LLM supply chain, the conditional nature of engaging with ethics, and competing priorities. Finally, we reflect on the implications of our findings and present opportunities to shape emerging norms of engaging with large language models in HCI research.

Aiming for Relevance

Categories: Machine Learning, Artificial Intelligence, Human-Computer Interaction

Authors: Bar Eini Porat, Danny Eytan, Uri Shalit

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.18668v1

Abstract: Vital signs are crucial in intensive care units (ICUs). They are used to track the patient's state and to identify clinically significant changes. Predicting vital sign trajectories is valuable for early detection of adverse events. However, conventional machine learning metrics like RMSE often fail to capture the true clinical relevance of such predictions. We introduce novel vital sign prediction performance metrics that align with clinical contexts, focusing on deviations from clinical norms, overall trends, and trend deviations. These metrics are derived from empirical utility curves obtained in a previous study through interviews with ICU clinicians. We validate the metrics' usefulness using simulated and real clinical datasets (MIMIC and eICU). Furthermore, we employ these metrics as loss functions for neural networks, resulting in models that excel in predicting clinically significant events. This research paves the way for clinically relevant machine learning model evaluation and optimization, promising to improve ICU patient care.

An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project

Categories: Software Engineering, Human-Computer Interaction

Authors: Ben Arie Tanay, Lexy Arinze, Siddhant S. Joshi, Kirsten A. Davis, James C. Davis

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.18679v1

Abstract: Background: Large Language Models (LLMs) such as ChatGPT and Copilot are influencing software engineering practice. Software engineering educators must teach future software engineers how to use such tools well. As of yet, there have been few studies that report on the use of LLMs in the classroom. It is, therefore, important to evaluate students' perception of LLMs and possible ways of adapting the computing curriculum to these shifting paradigms. Purpose: The purpose of this study is to explore computing students' experiences and approaches to using LLMs during a semester-long software engineering project. Design/Method: We collected data from a senior-level software engineering course at Purdue University. This course uses a project-based learning (PBL) design. The students used LLMs such as ChatGPT and Copilot in their projects. A sample of these student teams was interviewed to understand (1) how they used LLMs in their projects; and (2) whether and how their perspectives on LLMs changed over the course of the semester. We analyzed the data to identify themes related to students' usage patterns and learning outcomes. Results/Discussion: When computing students utilize LLMs within a project, their use cases cover both technical and professional applications. In addition, these students perceive LLMs to be efficient tools for obtaining information and completing tasks. However, there were concerns about using LLMs responsibly without harming their own learning outcomes. Based on our findings, we recommend future research to investigate the usage of LLMs in lower-level computer engineering courses to understand whether and how LLMs can be integrated as a learning aid without hurting the learning outcomes.

Teaching Introductory HRI: UChicago Course "Human-Robot Interaction: Research and Practice"

Categories: Robotics, Human-Computer Interaction

Authors: Sarah Sebo

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.18692v1

Abstract: In 2020, I designed the course CMSC 20630/30630 Human-Robot Interaction: Research and Practice as a hands-on introduction to human-robot interaction (HRI) research for both undergraduate and graduate students at the University of Chicago. Since 2020, I have taught and refined this course each academic year. Human-Robot Interaction: Research and Practice focuses on the core concepts and cutting-edge research in the field of human-robot interaction (HRI), covering topics that include: nonverbal robot behavior, verbal robot behavior, social dynamics, norms & ethics, collaboration & learning, group interactions, applications, and future challenges of HRI. Course meetings involve students in the class leading discussions about cutting-edge, peer-reviewed HRI research publications. Students also participate in a quarter-long collaborative research project, where they pursue an HRI research question that often involves conducting their own human-subjects research study in which they recruit human subjects to interact with a robot. In this paper, I detail the structure of the course and its learning goals as well as my reflections and student feedback on the course.

SolderlessPCB: Reusing Electronic Components in PCB Prototyping through Detachable 3D Printed Housings

Categories: Human-Computer Interaction

Authors: Zeyu Yan, Jiasheng Li, Zining Zhang, Huaishu Peng

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.18797v1

Abstract: The iterative prototyping process for printed circuit boards (PCBs) frequently employs surface-mounted device (SMD) components, which are often discarded rather than reused due to the challenges associated with desoldering, leading to unnecessary electronic waste. This paper introduces SolderlessPCB, a collection of techniques for solder-free PCB prototyping, specifically designed to promote the recycling and reuse of electronic components. Central to this approach are custom 3D-printable housings that allow SMD components to be mounted onto PCBs without soldering. We detail the design of SolderlessPCB and the experiments conducted to evaluate its design parameters, electrical performance, and durability. To illustrate the potential for reusing SMD components with SolderlessPCB, we discuss two scenarios: the reuse of components from earlier design iterations and from obsolete prototypes. We also provide examples demonstrating that SolderlessPCB can handle high-current applications and is suitable for high-speed data transmission. The paper concludes by discussing the limitations of our approach and suggesting future directions to overcome these challenges.

Thelxinoë: Recognizing Human Emotions Using Pupillometry and Machine Learning

Categories: Machine Learning, Human-Computer Interaction

Authors: Darlene Barker, Haim Levkowitz

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19014v1

Abstract: In this study, we present a method for emotion recognition in Virtual Reality (VR) using pupillometry. We analyze pupil diameter responses to both visual and auditory stimuli via a VR headset and focus on extracting key features in the time domain, frequency domain, and time-frequency domain from VR-generated data. Our approach utilizes feature selection to identify the most impactful features using Maximum Relevance Minimum Redundancy (mRMR). By applying a Gradient Boosting model, an ensemble learning technique using stacked decision trees, we achieve an accuracy of 98.8% with feature engineering, compared to 84.9% without it. This research contributes significantly to the Thelxinoë framework, aiming to enhance VR experiences by integrating multiple sensor data for realistic and emotionally resonant touch interactions. Our findings open new avenues for developing more immersive and interactive VR environments, paving the way for future advancements in virtual touch technology.

The Correlations of Scene Complexity, Workload, Presence, and Cybersickness in a Task-Based VR Game

Categories: Human-Computer Interaction

Authors: Mohammadamin Sanaei, Stephen B. Gilbert, Nikoo Javadpour, Hila Sabouni, Michael C. Dorneich, Jonathan W. Kelly

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19019v1

Abstract: This investigation examined the relationships among scene complexity, workload, presence, and cybersickness in virtual reality (VR) environments. Numerous factors can influence the overall VR experience, and existing research on this matter is not yet conclusive, warranting further investigation. In this between-subjects experimental setup, 44 participants engaged in the Pendulum Chair game, with half exposed to a simple scene with lower optic flow and lower familiarity, and the remaining half to a complex scene characterized by higher optic flow and greater familiarity. The study measured the dependent variables workload, presence, and cybersickness and analyzed their correlations. Equivalence testing was also used to compare the simple and complex environments. Results revealed that despite the visible differences between the environments, within the 10% boundaries of the maximum possible value for workload and presence, and 13.6% of the maximum SSQ value, a statistically significant equivalence was observed between the simple and complex scenes. Additionally, a moderate, negative correlation emerged between workload and SSQ scores. The findings suggest two key points: (1) the nature of the task can mitigate the impact of scene complexity factors such as optic flow and familiarity, and (2) the correlation between workload and cybersickness may vary, showing either a positive or negative relationship.
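
The abstract does not say which equivalence procedure was used. One common choice is the two one-sided tests (TOST) procedure, sketched below under stated assumptions: the function name `tost_equivalence`, the sample scores, and the 0 to 100 scale are hypothetical, and a normal approximation replaces the t distribution that a real analysis with about 22 participants per group would normally use.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(a, b, bound, alpha=0.05):
    """TOST with a normal approximation (an assumption, not the paper's method).

    Declares the two groups equivalent when the difference in means is
    significantly greater than -bound AND significantly less than +bound.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference, unequal variances allowed.
    # Note: this is zero (and the division fails) if both groups are constant.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + bound) / se)  # H0: diff <= -bound
    p_upper = z.cdf((diff - bound) / se)        # H0: diff >= +bound
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical workload scores on a 0-100 scale; bound = 10% of the maximum.
simple_scene = [50, 52, 48, 51, 49, 50]
complex_scene = [51, 49, 50, 52, 48, 50]
p_value, equivalent = tost_equivalence(simple_scene, complex_scene, bound=10.0)
```

The equivalence bound plays the role of the "10% of the maximum possible value" margin mentioned above: a tighter bound makes equivalence harder to establish with the same data.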

Should I Help a Delivery Robot? Cultivating Prosocial Norms through Observations

Categories: Robotics, Human-Computer Interaction

Authors: Vivienne Bihe Chi, Shashank Mehrotra, Teruhisa Misu, Kumar Akash

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19027v1

Abstract: We propose leveraging prosocial observations to cultivate new social norms to encourage prosocial behaviors toward delivery robots. With an online experiment, we quantitatively assess updates in norm beliefs regarding human-robot prosocial behaviors through observational learning. Results demonstrate the initially perceived normativity of helping robots is influenced by familiarity with delivery robots and perceptions of robots' social intelligence. Observing human-robot prosocial interactions notably shifts people's normative beliefs about prosocial actions, thereby changing their perceived obligations to offer help to delivery robots. Additionally, we found that observing robots offering help to humans, rather than receiving help, more significantly increased participants' feelings of obligation to help robots. Our findings provide insights into prosocial design for future mobility systems. Improved familiarity with robot capabilities and portraying them as desirable social partners can help foster wider acceptance. Furthermore, robots need to be designed to exhibit higher levels of interactivity and reciprocal capabilities for prosocial behavior.

Women are less comfortable expressing opinions online than men and report heightened fears for safety: Surveying gender differences in experiences of online harms

Categories: Computers and Society, Human-Computer Interaction

Authors: Francesca Stevens, Florence E. Enock, Tvesha Sippy, Jonathan Bright, Miranda Cross, Pica Johansson, Judy Wajcman, Helen Z. Margetts

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19037v1

Abstract: Online harms, such as hate speech, trolling and self-harm promotion, continue to be widespread. While some work suggests women are disproportionately affected, other studies find mixed evidence for gender differences in experiences with content of this kind. Using a nationally representative survey of UK adults (N=1992), we examine exposure to a variety of harms, fears surrounding being targeted, the psychological impact of online experiences, the use of safety tools to protect against harm, and comfort with various forms of online participation across men and women. We find that while men and women see harmful content online to a roughly similar extent, women are more at risk than men of being targeted by harms including online misogyny, cyberstalking and cyberflashing. Women are significantly more fearful of being targeted by harms overall, and report greater negative psychological impact as a result of particular experiences. Perhaps in an attempt to mitigate risk, women report higher use of a range of safety tools and less comfort with several forms of online participation, with just 23% of women comfortable expressing political views online compared to 40% of men. We also find direct associations between fears surrounding harms and comfort with online behaviours. For example, fear of being trolled significantly decreases comfort expressing opinions, and fear of being targeted by misogyny significantly decreases comfort sharing photos. Our results are important because with much public discourse happening online, we must ensure all members of society feel safe and able to participate in online spaces.

Visualizing High-Dimensional Temporal Data Using Direction-Aware t-SNE

Categories: Machine Learning, Human-Computer Interaction

Authors: Pavlin G. Poličar, Blaž Zupan

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19040v1

Abstract: Many real-world data sets contain a temporal component or involve transitions from state to state. For exploratory data analysis, we can represent these high-dimensional data sets in two-dimensional maps, using embeddings of the data objects under exploration and representing their temporal relationships with directed edges. Most existing dimensionality reduction techniques, such as t-SNE and UMAP, do not take into account the temporal or relational nature of the data when constructing the embeddings, resulting in temporally cluttered visualizations that obscure potentially interesting patterns. To address this problem, we propose two complementary, direction-aware loss terms in the optimization function of t-SNE that emphasize the temporal aspects of the data, guiding the optimization and the resulting embedding to reveal temporal patterns that might otherwise go unnoticed. The Directional Coherence Loss (DCL) encourages nearby arrows connecting two adjacent time series points to point in the same direction, while the Edge Length Loss (ELL) penalizes arrows, which effectively represent time gaps in the visualized embedding, based on their length. Both loss terms are differentiable and can be easily incorporated into existing dimensionality reduction techniques. By promoting local directionality of the directed edges, our procedure produces more temporally meaningful and less cluttered visualizations. We demonstrate the effectiveness of our approach on a toy dataset and two real-world datasets.
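
The abstract describes the two loss terms only in words, so the exact formulas are not known here. The sketch below shows one plausible reading under assumed forms: ELL as the mean squared arrow length, and DCL as a proximity-weighted average of one minus the cosine similarity between pairs of arrows. The function names, the Gaussian weighting on arrow midpoints, and the `bandwidth` parameter are all hypothetical choices for illustration, not the authors' definitions.

```python
import math


def edge_length_loss(points, edges):
    """Assumed ELL form: mean squared length of the directed edges (i, j)."""
    total = 0.0
    for i, j in edges:
        dx = points[j][0] - points[i][0]
        dy = points[j][1] - points[i][1]
        total += dx * dx + dy * dy
    return total / len(edges)


def directional_coherence_loss(points, edges, bandwidth=1.0):
    """Assumed DCL form: proximity-weighted mean of (1 - cosine similarity)."""
    arrows = []  # (midpoint x, midpoint y, unit direction x, unit direction y)
    for i, j in edges:
        (x0, y0), (x1, y1) = points[i], points[j]
        length = math.hypot(x1 - x0, y1 - y0) or 1e-12  # guard zero-length arrows
        arrows.append(((x0 + x1) / 2, (y0 + y1) / 2,
                       (x1 - x0) / length, (y1 - y0) / length))
    weighted, weights = 0.0, 0.0
    for a in range(len(arrows)):
        for b in range(a + 1, len(arrows)):
            mxa, mya, uxa, uya = arrows[a]
            mxb, myb, uxb, uyb = arrows[b]
            sq_dist = (mxa - mxb) ** 2 + (mya - myb) ** 2
            # Nearby arrows get more weight (Gaussian kernel on midpoints).
            w = math.exp(-sq_dist / (2 * bandwidth ** 2))
            weighted += w * (1.0 - (uxa * uxb + uya * uyb))
            weights += w
    return weighted / weights if weights > 0 else 0.0


# Two toy 2-D embeddings of a four-step time series.
edges = [(0, 1), (1, 2), (2, 3)]
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]
```

With these assumed forms, the straight path incurs no directional penalty because every arrow points the same way, while the zigzag path is penalized by both terms. In an actual t-SNE variant, such terms would be added to the KL-divergence objective with tunable weights and optimized by gradient descent.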

Towards Human-Centered Construction Robotics: An RL-Driven Companion Robot For Contextually Assisting Carpentry Workers

Categories: Robotics, Artificial Intelligence, Human-Computer Interaction, Machine Learning

Authors: Yuning Wu, Jiaying Wei, Jean Oh, Daniel Cardoso Llach

Published: 2024-03-27

Link: http://arxiv.org/abs/2403.19060v1

Abstract: In the dynamic construction industry, traditional robotic integration has primarily focused on automating specific tasks, often overlooking the complexity and variability of human aspects in construction workflows. This paper introduces a human-centered approach with a "work companion rover" designed to assist construction workers within their existing practices, aiming to enhance safety and workflow fluency while respecting construction labor's skilled nature. We conduct an in-depth study on deploying a robotic system in carpentry formwork, showcasing a prototype that emphasizes mobility, safety, and comfortable worker-robot collaboration in dynamic environments through a contextual Reinforcement Learning (RL)-driven modular framework. Our research advances robotic applications in construction, advocating for collaborative models where adaptive robots support rather than replace humans, underscoring the potential for an interactive and collaborative human-robot workforce.