Content provided by Kai Kunze. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Kai Kunze or their podcast platform partner. If you believe someone is using your copyrighted work without permission, you can follow the process described here: https://es.player.fm/legal

ISWC 2024 Honorable Mention: EchoGuide: Active Acoustic Guidance for LLM-Based Eating Event Analysis from Egocentric Videos

Duration: 12:48
 

Today we take a deep dive into an ISWC 2024 Honorable Mention paper.

Self-recording eating behaviors is a step towards a healthy lifestyle recommended by many health professionals. However, the current practice of manually recording eating activities using paper records or smartphone apps is often unsustainable and inaccurate. Smart glasses have emerged as a promising wearable form factor for tracking eating behaviors, but existing systems primarily identify when eating occurs without capturing details of the eating activities (e.g., what is being eaten). In this paper, we present EchoGuide, an application and system pipeline that leverages low-power active acoustic sensing to guide head-mounted cameras to capture egocentric videos, enabling efficient and detailed analysis of eating activities. By combining active acoustic sensing for eating detection with video captioning models and large-scale language models for retrieval augmentation, EchoGuide intelligently clips and analyzes videos to create concise, relevant activity records on eating. We evaluated EchoGuide with 9 participants in naturalistic settings involving eating activities, demonstrating high-quality summarization and significant reductions in video data needed, paving the way for practical, scalable eating activity tracking.

https://dl.acm.org/doi/10.1145/3675095.3676611
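
The abstract describes a gated two-stage pipeline: a low-power acoustic channel decides when eating is happening, and only the matching video segments are passed to the expensive captioning and LLM-summarization stage, which is where the reduction in video data comes from. Below is a minimal Python sketch of that gating idea; the per-second chewing scores, the 0.6 threshold, and the caption_clip/summarize_with_llm stubs are illustrative assumptions, not the authors' implementation.

    # Sketch of an EchoGuide-style gated pipeline (illustrative assumptions only).
    # Stage 1: cheap acoustic eating detection picks which video spans to keep.
    # Stage 2: captioning + LLM summarization run only on those spans.
    from dataclasses import dataclass

    @dataclass
    class Clip:
        start_s: float
        end_s: float

    def detect_eating_spans(chew_scores, frame_s=1.0, threshold=0.6):
        """Return spans whose per-frame chewing score (assumed in [0, 1])
        stays above the threshold."""
        spans, start = [], None
        for i, score in enumerate(chew_scores):
            if score >= threshold and start is None:
                start = i * frame_s
            elif score < threshold and start is not None:
                spans.append(Clip(start, i * frame_s))
                start = None
        if start is not None:
            spans.append(Clip(start, len(chew_scores) * frame_s))
        return spans

    def caption_clip(clip):
        # Placeholder for a video-captioning model run on the egocentric clip.
        return f"clip {clip.start_s:.0f}-{clip.end_s:.0f}s: person eats something"

    def summarize_with_llm(captions):
        # Placeholder for retrieval-augmented LLM summarization of the captions.
        return "Eating record: " + "; ".join(captions)

    scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.9, 0.8, 0.2]
    print(summarize_with_llm([caption_clip(c) for c in detect_eating_spans(scores)]))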


35 episodes


All episodes

Uğur Genç and Himanshu Verma. 2024. Situating Empathy in HCI/CSCW: A Scoping Review. Proc. ACM Hum.-Comput. Interact. 8, CSCW2, Article 513 (November 2024), 37 pages. https://doi.org/10.1145/3687052 Empathy is considered a crucial construct within HCI and CSCW, yet our understanding of this complex concept remains fragmented and lacks consensus in existing research. In this scoping review of 121 articles from the ACM Digital Library, we synthesize the diverse perspectives on empathy and scrutinize its current conceptualization and operationalization. In particular, we examine the various interpretations and definitions of empathy, its applications, and the methodologies, findings, and trends in the field. Our analysis reveals a lack of consensus on the definitions and theoretical underpinnings of empathy, with interpretations ranging from understanding the experiences of others to an affective response to the other's situation. We observed that despite the variety of methods used to gauge empathy, the predominant approach remains self-assessed instruments, highlighting the lack of novel and rigorously established and validated measures and methods to capture the multifaceted manifestations of empathy. Furthermore, our analysis shows that previous studies have used a variety of approaches to elicit empathy, such as experiential methods and situational awareness. These approaches have demonstrated that shared stressful experiences promote community support and relief, while situational awareness promotes empathy through increased helping behavior. Finally, we discuss a) the potential and drawbacks of leveraging empathy to shape interactions and guide design practices, b) the need to find a balance between the collective focus of empathy and the (existing and dominant) focus on the individual, and c) the careful testing of empathic designs and technologies with real-world applications. https://dl.acm.org/doi/10.1145/3687052…
 
Isna Alfi Bustoni, Mark McGill, and Stephen Anthony Brewster. 2024. Exploring the Alteration and Masking of Everyday Noise Sounds using Auditory Augmented Reality. In Proceedings of the 26th International Conference on Multimodal Interaction (ICMI '24). Association for Computing Machinery, New York, NY, USA, 154–163. https://doi.org/10.1145/3678957.3685750 While noise-cancelling headphones can block out or mask environmental noise with digital sound, this costs the user situational awareness and information. With the advancement of acoustically transparent personal audio devices (e.g. headphones, open-ear audio frames), Auditory Augmented Reality (AAR), and real-time audio processing, it is feasible to preserve user situational awareness and relevant information whilst diminishing the perception of the noise. Through an online survey (n=124), this research explored users’ attitudes and preferred AAR strategy (keep the noise, make the noise more pleasant, obscure the noise, reduce the noise, remove the noise, and replace the noise) toward different types of noises from a range of categories (living beings, mechanical, and environmental) and varying degrees of relevance. It was discovered that respondents’ degrees of annoyance varied according to the kind of noise and its relevance to them. Additionally, respondents had a strong tendency to reduce irrelevant noise and retain more relevant noise. Based on our findings, we discuss how AAR can assist users in coping with noise whilst retaining relevant information through selectively suppressing or altering the noise, as appropriate. https://dl.acm.org/doi/10.1145/3678957.3685750…
 
Pratheep Kumar Chelladurai, Ziming Li, Maximilian Weber, Tae Oh, and Roshan L Peiris. 2024. SoundHapticVR: Head-Based Spatial Haptic Feedback for Accessible Sounds in Virtual Reality for Deaf and Hard of Hearing Users. In Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '24). Association for Computing Machinery, New York, NY, USA, Article 31, 1–17. https://doi.org/10.1145/3663548.3675639 Virtual Reality (VR) systems use immersive spatial audio to convey critical information, but these audio cues are often inaccessible to Deaf or Hard-of-Hearing (DHH) individuals. To address this, we developed SoundHapticVR, a head-based haptic system that converts audio signals into haptic feedback using multi-channel acoustic haptic actuators. We evaluated SoundHapticVR through three studies: determining the maximum tactile frequency threshold on different head regions for DHH users, identifying the ideal number and arrangement of transducers for sound localization, and assessing participants’ ability to differentiate sound sources with haptic patterns. Findings indicate that tactile perception thresholds vary across head regions, necessitating consistent frequency equalization. Adding a front transducer significantly improved sound localization, and participants could correlate distinct haptic patterns with specific objects. Overall, this system has the potential to make VR applications more accessible to DHH users. https://dl.acm.org/doi/10.1145/3663548.3675639…
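
One implementation question behind this work is how a sound source's direction gets distributed over several head-mounted actuators, and the study finds that adding a front transducer helps localization. As a rough illustration (not the paper's design), here is a plain cosine-panning sketch in Python; the four-actuator layout, the azimuth values, and the panning law are all assumed for the example.

    # Illustrative direction-to-intensity mapping for head-mounted actuators.
    # Layout and panning law are assumptions, not SoundHapticVR's design.
    import math

    ACTUATORS = {"front": 0, "right": 90, "back": 180, "left": 270}  # azimuth, deg

    def actuator_gains(source_azimuth_deg, spread_deg=90.0):
        """Per-actuator gain in [0, 1] for a sound source at a given azimuth."""
        gains = {}
        for name, az in ACTUATORS.items():
            diff = (source_azimuth_deg - az + 180) % 360 - 180  # wrap to [-180, 180)
            if abs(diff) >= spread_deg:
                gains[name] = 0.0  # source is outside this actuator's field
            else:
                # Gain falls off as a cosine, reaching 0 at the spread angle.
                gains[name] = round(math.cos(math.radians(diff * 90.0 / spread_deg)), 2)
        return gains

    print(actuator_gains(45))  # a front-right source drives the front + right units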
 
Giulia Barbareschi, Ando Ryoichi, Midori Kawaguchi, Minato Takeda, and Kouta Minamizawa. 2024. SeaHare: An omidirectional electric wheelchair integrating independent, remote and shared control modalities. In Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '24). Association for Computing Machinery, New York, NY, USA, Article 9, 1–16. https://doi.org/10.1145/3663548.3675657 Depending on one’s needs electric wheelchairs can feature different interfaces and driving paradigms with control handed to the user, a remote pilot, or shared. However, these systems have generally been implemented on separate wheelchairs, making comparison difficult. We present the design of an omnidirectional electric wheelchair that can be controlled using two sensing seats detecting changes in the centre of gravity. One of the sensing seats is used by the person on the wheelchair, whereas the other is used as a remote control by a second person. We explore the use of the wheelchair using different control paradigms (independent, remote, and shared) from both the wheelchair and the remote control seat with 5 dyads and 1 triad of participants, including wheelchair users and non. Results highlight key advantages and disadvantages of the SeaHare in different paradigms, with participants’ perceptions affected by their skills and lived experiences, and reflections on how different control modes might suit different scenarios. https://dl.acm.org/doi/10.1145/3663548.3675657…
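
The control idea here is that a sensing seat turns shifts in the sitter's centre of gravity into omnidirectional motion, and the shared mode blends rider and remote-pilot input. A hedged sketch follows; the deadzone, gain, and 50/50 blend are invented values for illustration, not SeaHare's actual control law.

    # Illustrative centre-of-gravity-to-velocity mapping for an omnidirectional
    # wheelchair, plus a naive shared-control blend (invented parameters).
    def cog_to_velocity(dx, dy, deadzone=0.05, max_speed=1.0):
        """dx, dy: CoG offset from neutral, normalised to [-1, 1].
        Returns (vx, vy); leaning further commands a faster move."""
        mag = (dx ** 2 + dy ** 2) ** 0.5
        if mag < deadzone:  # ignore small posture shifts
            return 0.0, 0.0
        scale = min(mag, 1.0) * max_speed / mag
        return dx * scale, dy * scale

    def shared_control(v_rider, v_remote, w=0.5):
        # Blend the rider's and the remote pilot's velocity commands.
        return tuple(w * a + (1 - w) * b for a, b in zip(v_rider, v_remote))

    print(shared_control(cog_to_velocity(0.4, 0.0), cog_to_velocity(0.0, 0.4)))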
 
Giulia Barbareschi, Songchen Zhou, Ando Ryoichi, Midori Kawaguchi, Mark Armstrong, Mikito Ogino, Shunsuke Aoiki, Eisaku Ohta, Harunobu Taguchi, Youichi Kamiyama, Masatane Muto, Kentaro Yoshifuji, and Kouta Minamizawa. 2024. Brain Body Jockey project: Transcending Bodily Limitations in Live Performance via Human Augmentation. In Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '24). Association for Computing Machinery, New York, NY, USA, Article 18, 1–14. https://doi.org/10.1145/3663548.3675621 Musicians with significant mobility limitations face unique challenges in being able to use their bodies to interact with fans during live performances. In this paper we present the results of a collaboration between a professional DJ with advanced Amyotrophic Lateral Sclerosis and a group of technologists and researchers, culminating in two public live performances leveraging human augmentation technologies to enhance the artist’s stage presence. Our system combines a brain-machine interface and an accelerometer-based trigger to select pre-programmed moves performed by robotic arms during a live event, as well as to facilitate direct physical interaction during a “Meet the DJ” event. Our evaluation includes ethnographic observations and interviews with the artist and members of the audience. Results show that the system allowed the artist and audience to feel a sense of unity, expanded the imagination of creative possibilities, and challenged conventional perceptions of disability in the arts and beyond. https://dl.acm.org/doi/10.1145/3663548.3675621…
 
F. Chiossi, I. Trautmannsheimer, C. Ou, U. Gruenefeld and S. Mayer, "Searching Across Realities: Investigating ERPs and Eye-Tracking Correlates of Visual Search in Mixed Reality," in IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 11, pp. 6997-7007, Nov. 2024, doi: 10.1109/TVCG.2024.3456172. Mixed Reality allows us to integrate virtual and physical content into users' environments seamlessly. Yet, how this fusion affects perceptual and cognitive resources and our ability to find virtual or physical objects remains uncertain. Displaying virtual and physical information simultaneously might lead to divided attention and increased visual complexity, impacting users' visual processing, performance, and workload. In a visual search task, we asked participants to locate virtual and physical objects in Augmented Reality and Augmented Virtuality to understand the effects on performance. We evaluated search efficiency and attention allocation for virtual and physical objects using event-related potentials, fixation and saccade metrics, and behavioral measures. We found that users were more efficient in identifying objects in Augmented Virtuality, while virtual objects gained saliency in Augmented Virtuality. This suggests that visual fidelity might increase the perceptual load of the scene. Reduced amplitude in distractor positivity ERP, and fixation patterns supported improved distractor suppression and search efficiency in Augmented Virtuality. We discuss design implications for mixed reality adaptive systems based on physiological inputs for interaction. https://ieeexplore.ieee.org/document/10679197…
 
S. Cheng, Y. Liu, Y. Gao and Z. Dong, "“As if it were my own hand”: inducing the rubber hand illusion through virtual reality for motor imagery enhancement," in IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 11, pp. 7086-7096, Nov. 2024, doi: 10.1109/TVCG.2024.3456147. Brain-computer interfaces (BCI) are widely used in the field of disability assistance and rehabilitation, and virtual reality (VR) is increasingly used for visual guidance of BCI-MI (motor imagery). Therefore, how to improve the quality of electroencephalogram (EEG) signals for MI in VR has emerged as a critical issue. People can perform MI more easily when they visualize the hand used for visual guidance as their own, and the Rubber Hand Illusion (RHI) can increase people's ownership of the prosthetic hand. We proposed to induce RHI in VR to enhance participants' MI ability and designed five methods of inducing RHI, namely active movement, haptic stimulation, passive movement, active movement mixed with haptic stimulation, and passive movement mixed with haptic stimulation. We constructed a first-person training scenario to train participants' MI ability through the five induction methods. The experimental results showed that through the training, the participants' feeling of ownership of the virtual hand in VR was enhanced, and the MI ability was improved. Among them, the method of mixing active movement and tactile stimulation proved to have a good effect on enhancing MI. Finally, we developed a BCI system in VR utilizing the above training method, and the performance of the participants improved after the training. This also suggests that our proposed method is promising for future application in BCI rehabilitation systems. https://ieeexplore.ieee.org/document/10669780…
 
Pavel Manakhov, Ludwig Sidenmark, Ken Pfeuffer, and Hans Gellersen. 2024. Filtering on the Go: Effect of Filters on Gaze Pointing Accuracy During Physical Locomotion in Extended Reality. IEEE Transactions on Visualization and Computer Graphics 30, 11 (Nov. 2024), 7234–7244. https://doi.org/10.1109/TVCG.2024.3456153 Eye tracking filters have been shown to improve accuracy of gaze estimation and input for stationary settings. However, their effectiveness during physical movement remains underexplored. In this work, we compare common online filters in the context of physical locomotion in extended reality and propose alterations to improve them for on-the-go settings. We conducted a computational experiment where we simulate performance of the online filters using data on participants attending visual targets located in world-, path-, and two head-based reference frames while standing, walking, and jogging. Our results provide insights into the filters' effectiveness and factors that affect it, such as the amount of noise caused by locomotion and differences in compensatory eye movements, and demonstrate that filters with saccade detection prove most useful for on-the-go settings. We discuss the implications of our findings and conclude with guidance on gaze data filtering for interaction in extended reality. https://ieeexplore.ieee.org/document/10672561…
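
The takeaway that filters with saccade detection work best on the go suggests a simple structure: smooth fixation samples, but pass detected saccades through unfiltered so the gaze estimate does not lag large, fast shifts. The sketch below shows one minimal filter of that kind; the EMA smoothing factor and the 120 deg/s velocity threshold are assumed values, not the filters the authors evaluated.

    # Minimal saccade-aware gaze filter (assumed parameters, for illustration).
    class SaccadeAwareFilter:
        def __init__(self, alpha=0.2, saccade_deg_per_s=120.0):
            self.alpha = alpha                  # EMA smoothing factor
            self.saccade_v = saccade_deg_per_s  # velocity threshold for saccades
            self.prev = None                    # (t, x, y) of last output

        def update(self, t, x, y):
            """t in seconds; x, y gaze angles in degrees."""
            if self.prev is None:
                self.prev = (t, x, y)
                return x, y
            pt, px, py = self.prev
            dt = max(t - pt, 1e-6)
            speed = ((x - px) ** 2 + (y - py) ** 2) ** 0.5 / dt
            if speed > self.saccade_v:
                fx, fy = x, y  # saccade: jump instead of smoothing, to avoid lag
            else:
                fx = px + self.alpha * (x - px)  # fixation: smooth out noise
                fy = py + self.alpha * (y - py)
            self.prev = (t, fx, fy)
            return fx, fy

    f = SaccadeAwareFilter()
    f.update(0.00, 10.0, 5.0)
    print(f.update(0.02, 11.0, 5.0))  # slow drift -> smoothed
    print(f.update(0.04, 30.0, 5.0))  # fast jump -> passed through as a saccade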
 
Nicholas Jennings, Han Wang, Isabel Li, James Smith, and Bjoern Hartmann. 2024. What's the Game, then? Opportunities and Challenges for Runtime Behavior Generation. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 106, 1–13. https://doi.org/10.1145/3654777.3676358 Procedural content generation (PCG), the process of algorithmically creating game components instead of manually, has been a common tool of game development for decades. Recent advances in large language models (LLMs) enable the generation of game behaviors based on player input at runtime. Such code generation brings with it the possibility of entirely new gameplay interactions that may be difficult to integrate with typical game development workflows. We explore these implications through GROMIT, a novel LLM-based runtime behavior generation system for Unity. When triggered by a player action, GROMIT generates a relevant behavior which is compiled without developer intervention and incorporated into the game. We create three demonstration scenarios with GROMIT to investigate how such a technology might be used in game development. In a system evaluation we find that our implementation is able to produce behaviors that result in significant downstream impacts to gameplay. We then conduct an interview study with n=13 game developers using GROMIT as a probe to elicit their current opinion on runtime behavior generation tools, and enumerate the specific themes curtailing the wider use of such tools. We find that the main themes of concern are quality considerations, community expectations, and fit with developer workflows, and that several of the subthemes are unique to runtime behavior generation specifically. We outline a future work agenda to address these concerns, including the need for additional guardrail systems for behavior generation. https://dl.acm.org/doi/10.1145/3654777.3676358…
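
The loop GROMIT implements is: a player action triggers an LLM to write behavior source code, which is compiled without developer intervention and attached to the running game. GROMIT itself targets Unity/C#; the Python sketch below only mimics that control flow, with a canned fake_llm stub standing in for the model and exec providing the runtime "compile" step.

    # Toy runtime-behavior-generation loop (illustration only, not GROMIT itself).
    def fake_llm(prompt: str) -> str:
        # Stand-in for an LLM call; returns source code for a behavior function.
        return (
            "def behavior(entity):\n"
            "    entity['speed'] = entity.get('speed', 1.0) * 2\n"
            "    return f\"{entity['name']} sped up to {entity['speed']}\"\n"
        )

    def generate_behavior(prompt: str):
        source = fake_llm(prompt)
        namespace = {}
        # Compile and load the generated code at runtime, no developer in the loop.
        exec(compile(source, "<generated>", "exec"), namespace)
        return namespace["behavior"]

    player_action = "player touched the red mushroom"
    behavior = generate_behavior(f"Write a game behavior for: {player_action}")
    print(behavior({"name": "hero"}))  # hero sped up to 2.0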
 
Akifumi Takahashi, Yudai Tanaka, Archit Tamhane, Alan Shen, Shan-Yuan Teng, and Pedro Lopes. 2024. Can a Smartwatch Move Your Fingers? Compact and Practical Electrical Muscle Stimulation in a Smartwatch. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 2, 1–15. https://doi.org/10.1145/3654777.3676373 Smartwatches gained popularity in the mainstream, making them into today’s de-facto wearables. Despite advancements in sensing, haptics on smartwatches is still restricted to tactile feedback (e.g., vibration). Most smartwatch-sized actuators cannot render strong force-feedback. Simultaneously, electrical muscle stimulation (EMS) promises compact force-feedback, but actuating fingers requires users to wear many electrodes on their forearms. While forearm electrodes provide good accuracy, they keep EMS from being a practical force-feedback interface. To address this, we propose moving the electrodes to the wrist, conveniently packing them in the backside of a smartwatch. In our first study, we found that by cross-sectionally stimulating the wrist in 1,728 trials, we can actuate thumb extension, index extension & flexion, middle flexion, pinky flexion, and wrist flexion. Following this, we engineered a compact EMS that integrates directly into a smartwatch’s wristband (with a custom stimulator, electrodes, demultiplexers, and communication). In our second study, we found that participants could calibrate our device by themselves faster than with conventional EMS. Furthermore, all participants preferred the experience of this device, especially for its social acceptability & practicality. We believe that our approach opens new applications for smartwatch-based interactions, such as haptic assistance during everyday tasks. https://dl.acm.org/doi/10.1145/3654777.3676373…
 
Md Touhidul Islam, Noushad Sojib, Imran Kabir, Ashiqur Rahman Amit, Mohammad Ruhul Amin, and Syed Masum Billah. 2024. Wheeler: A Three-Wheeled Input Device for Usable, Efficient, and Versatile Non-Visual Interaction. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 31, 1–20. https://doi.org/10.1145/3654777.3676396 Blind users rely on keyboards and assistive technologies like screen readers to interact with user interface (UI) elements. In modern applications with complex UI hierarchies, navigating to different UI elements poses a significant accessibility challenge. Users must listen to screen reader audio descriptions and press relevant keyboard keys one at a time. This paper introduces Wheeler, a novel three-wheeled, mouse-shaped stationary input device, to address this issue. Informed by participatory sessions, Wheeler enables blind users to navigate up to three hierarchical levels in an app independently using three wheels instead of navigating just one level at a time using a keyboard. The three wheels also offer versatility, allowing users to repurpose them for other tasks, such as 2D cursor manipulation. A study with 12 blind users indicates a significant reduction (40%) in navigation time compared to using a keyboard. Further, a diary study with our blind co-author highlights Wheeler’s additional benefits, such as accessing UI elements with partial metadata and facilitating mixed-ability collaboration. https://dl.acm.org/doi/10.1145/3654777.3676396…
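
The device's key mapping gives each of the three wheels one level of the UI hierarchy, so users can reach an element without stepping through one level at a time. Below is a toy sketch of such a mapping; the sample UI tree, the sibling-cycling behavior, and the resetting of deeper selections are assumptions about how such a cursor could work, not the paper's implementation.

    # Toy three-wheel cursor over a UI tree: wheel k cycles among siblings at
    # depth k (assumed mapping, for illustration only).
    UI_TREE = {
        "Menu": {"File": {"Open": {}, "Save": {}},
                 "Edit": {"Copy": {}, "Paste": {}}},
        "Toolbar": {"Bold": {}, "Italic": {}},
    }

    class WheelerCursor:
        def __init__(self, tree):
            self.tree = tree
            self.path = [0, 0, 0]  # selected sibling index at depths 0..2

        def _siblings(self, depth):
            node = self.tree
            for d in range(depth):
                keys = list(node.keys())
                if not keys:
                    return []
                node = node[keys[self.path[d] % len(keys)]]
            return list(node.keys())

        def turn(self, wheel, clicks):
            """Rotate wheel 0, 1, or 2 by `clicks` detents."""
            self.path[wheel] += clicks
            for d in range(wheel + 1, 3):
                self.path[d] = 0  # deeper selections reset after a turn

        def focused(self):
            labels = []
            for d in range(3):
                sibs = self._siblings(d)
                if not sibs:
                    break
                labels.append(sibs[self.path[d] % len(sibs)])
            return " > ".join(labels)

    c = WheelerCursor(UI_TREE)
    c.turn(1, 1)        # middle wheel: next sibling one level down
    print(c.focused())  # Menu > Edit > Copy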
 
Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, and Andrew D Wilson. 2024. BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 40, 1–19. https://doi.org/10.1145/3654777.3676326 Today’s video-conferencing tools support a rich range of professional and social activities, but their generic meeting environments cannot be dynamically adapted to align with distributed collaborators’ needs. To enable end-user customization, we developed BlendScape, a rendering and composition system for video-conferencing participants to tailor environments to their meeting context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users’ physical or digital backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an exploratory study with 15 end-users, we investigated whether and how they would find value in using generative AI to customize video-conferencing environments. Participants envisioned using a system like BlendScape to facilitate collaborative activities in the future, but required further controls to mitigate distracting or unrealistic visual elements. We implemented scenarios to demonstrate BlendScape’s expressiveness for supporting environment design strategies from prior work and propose composition techniques to improve the quality of environments. https://dl.acm.org/doi/10.1145/3654777.3676326…
 
Ximing Shen, Youichi Kamiyama, Kouta Minamizawa, and Jun Nishida. 2024. DexteriSync: A Hand Thermal I/O Exoskeleton for Morphing Finger Dexterity Experience. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 102, 1–12. https://doi.org/10.1145/3654777.3676422 Skin temperature is an important physiological factor for human hand dexterity. Leveraging this feature, we engineered an exoskeleton, called DexteriSync, that can dynamically adjust the user’s finger dexterity and induce different thermal perceptions by modulating finger skin temperature. This exoskeleton comprises flexible silicone-copper tube segments, 3D-printed finger sockets, a 3D-printed palm base, a pump system, and a water temperature control with a storage unit. By realising an embodied experience of compromised dexterity, DexteriSync can help product designers understand the lived experience of compromised hand dexterity, such as that of the elderly and/or neurodivergent users, when designing daily necessities for them. We validated DexteriSync via a technical evaluation and two user studies, demonstrating that it can change skin temperature, dexterity, and thermal perception. An exploratory session with design students and an autistic individual with compromised dexterity demonstrated that the exoskeleton provided a more realistic experience compared to video education and allowed them to gain higher confidence in their designs. The results advocated for the efficacy of experiencing embodied compromised finger dexterity, which can promote an understanding of the related physical challenges and lead to a more persuasive design for assistive tools. https://dl.acm.org/doi/10.1145/3654777.3676422…
 
Andreia Valente, Dajin Lee, Seungmoon Choi, Mark Billinghurst, and Augusto Esteves. 2024. Modulating Heart Activity and Task Performance using Haptic Heartbeat Feedback: A Study Across Four Body Placements. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 25, 1–13. https://doi.org/10.1145/3654777.3676435 This paper explores the impact of vibrotactile haptic feedback on heart activity when the feedback is provided at four different body locations (chest, wrist, neck, and ankle) and with two feedback rates (50 bpm and 110 bpm). A user study found that the neck placement resulted in higher heart rates and lower heart rate variability, and higher frequencies correlated with increased heart rates and decreased heart rate variability. The chest was preferred in self-reported metrics, and neck placement was perceived as less satisfying, harmonious, and immersive. This research contributes to understanding the interplay between psychological experiences and physiological responses when using haptic biofeedback resembling real body signals. https://dl.acm.org/doi/10.1145/3654777.3676435…
 
Junlei Hong, Tobias Langlotz, Jonathan Sutton, and Holger Regenbrecht. 2024. Visual Noise Cancellation: Exploring Visual Discomfort and Opportunities for Vision Augmentations. ACM Trans. Comput.-Hum. Interact. 31, 2, Article 22 (April 2024), 26 pages. https://doi.org/10.1145/3634699 Acoustic noise control or cancellation (ANC) is a commonplace component of modern audio headphones. ANC aims to actively mitigate disturbing environmental noise for a quieter and improved listening experience, working by digitally controlling the frequency and amplitude characteristics of sound. Much less explored are visual noise and active visual noise control, which we address here. We first explore visual noise and scenarios in which visual noise arises based on findings from four workshops we conducted. We then introduce the concept of visual noise cancellation (VNC) and how it can be used to reduce identified effects of visual noise. In addition, we developed head-worn demonstration prototypes to practically explore the concept of active VNC with selected scenarios in a user study. Finally, we discuss the application of VNC, including vision augmentations that moderate the user’s view of the environment to address perceptual needs and to provide augmented reality content. https://dl.acm.org/doi/10.1145/3634699…
 