Conversational Health Agents (CHAs) are interactive systems designed to enhance personal healthcare services by engaging in empathetic conversations and processing multimodal data. Current CHAs, especially those built on Large Language Models (LLMs), focus primarily on conversation and often lack comprehensive agent capabilities. The missing capabilities include accessing personal user health data from wearables, ubiquitous data collection sources, and electronic health records; integrating the latest published health insights; and connecting with established multimodal data analysis tools. In this paper, we propose an LLM-powered framework that empowers CHAs to generate personalized responses to users’ healthcare queries.
We design an LLM-powered framework leveraging a service-based architecture with a central agent that perceives and analyzes user queries, provides an appropriate response, and manages access to external resources through Application Program Interfaces (APIs). The user-framework interaction is bidirectional, ensuring a conversational tone for ongoing and follow-up conversations.
The Interface acts as a bridge between the users and agents, including interactive tools accessible through mobile, desktop, or web applications. It integrates multimodal communication channels, such as text and audio. The Interface receives users’ queries and transmits them to the Orchestrator. Queries can be presented in various modes of human communication, such as text, speech, or gestures.
Users can provide metadata (alongside their queries) within this framework, including images, audio, gestures, and more. For instance, a user could capture an image of their meal and inquire about its nutritional values or calorie content, with the image serving as metadata. In this open-source version, we use Gradio as our Interface to make it easier for other contributors to get started.
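The hand-off from the Interface to the Orchestrator can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: `Attachment`, `UserQuery`, `Interface.submit`, and `echo_orchestrator` are hypothetical names, and the sketch uses only the Python standard library rather than Gradio.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Attachment:
    """One piece of metadata sent alongside a query (hypothetical type)."""
    kind: str  # e.g. "image", "audio", "gesture"
    path: str  # location of the raw data; purely illustrative


@dataclass
class UserQuery:
    """A query plus its metadata, as an Interface might package it."""
    text: str
    attachments: List[Attachment] = field(default_factory=list)


class Interface:
    """Receives user input and forwards it to an Orchestrator callable."""

    def __init__(self, orchestrator: Callable[[UserQuery], str]) -> None:
        self._orchestrator = orchestrator

    def submit(self, text: str, attachments: Sequence[Attachment] = ()) -> str:
        query = UserQuery(text=text, attachments=list(attachments))
        return self._orchestrator(query)


def echo_orchestrator(query: UserQuery) -> str:
    """Stand-in Orchestrator that only reports what it received."""
    kinds = ", ".join(a.kind for a in query.attachments) or "none"
    return f"query={query.text!r}; metadata={kinds}"


if __name__ == "__main__":
    ui = Interface(echo_orchestrator)
    meal_photo = Attachment(kind="image", path="meal.jpg")
    print(ui.submit("How many calories are in this meal?", [meal_photo]))
```

Keeping the Interface behind a single callable means a web front end, a mobile client, or a test harness could all feed the same Orchestrator without changes on either side.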
The Orchestrator is responsible for problem solving and decision making to provide an appropriate response based on the user query. It incorporates the concept of the Perceptual Cycle Model in CHAs, allowing it to perceive, transform, and analyze the world (i.e., input query and metadata) to generate appropriate responses. To this end, the input data are aggregated, transformed into structured data, and then analyzed to plan and execute actions. Through this process, the Orchestrator interacts with external sources to acquire the required information, perform data integration and analysis, and extract insights, among other functions.
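The perceive, transform, analyze, and act sequence described above might look like the following toy pipeline. All function and class names here are assumptions made for illustration, and the rule-based `plan` function stands in for what would be an LLM-driven planner in a real CHA.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RawInput:
    """Aggregated query and metadata, before any structuring."""
    query: str
    metadata: Dict[str, str]  # modality name -> reference to the data


@dataclass
class StructuredInput:
    """Structured view of the input that the planner works on."""
    text: str
    modalities: List[str]


def perceive(query: str, metadata: Dict[str, str]) -> RawInput:
    """Aggregate the raw query and its metadata into one record."""
    return RawInput(query=query.strip(), metadata=dict(metadata))


def transform(raw: RawInput) -> StructuredInput:
    """Turn the aggregated input into structured data."""
    return StructuredInput(text=raw.query, modalities=sorted(raw.metadata))


def plan(data: StructuredInput) -> List[str]:
    """Toy planner: one analysis step per modality, then a response step."""
    steps = [f"analyze_{modality}" for modality in data.modalities]
    steps.append("generate_response")
    return steps


def execute(steps: List[str]) -> List[str]:
    """Pretend to run each planned step and record that it completed."""
    return [f"done:{step}" for step in steps]


if __name__ == "__main__":
    raw = perceive(" What is in this meal? ", {"image": "meal.jpg"})
    print(execute(plan(transform(raw))))
```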
External Sources play a pivotal role in obtaining essential information from the broader world. Typically, these External Sources furnish application program interfaces (APIs) that the Orchestrator can use to retrieve required data, process them, and extract meaningful health information. The Task Executor, which is the Orchestrator’s actuation component, is the part that calls these APIs. Our framework integrates with four primary external sources, which we found critical for conversational health agents.
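One way to picture the Task Executor's role is as a registry that maps source names to callables wrapping each external API. The sketch below assumes this design for illustration only; `TaskExecutor.register`, `TaskExecutor.run`, and `fake_wearable_api` are hypothetical, and the stub returns placeholder values rather than real health data.

```python
from typing import Any, Callable, Dict


class TaskExecutor:
    """Hypothetical actuation component that dispatches calls to sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, call: Callable[..., Any]) -> None:
        """Make an external source available under a short name."""
        self._sources[name] = call

    def run(self, name: str, **params: Any) -> Any:
        """Call a registered source with keyword parameters."""
        if name not in self._sources:
            raise KeyError(f"unknown external source: {name}")
        return self._sources[name](**params)


def fake_wearable_api(user_id: str, days: int) -> Dict[str, Any]:
    """Stub for a wearable-data API; the values are placeholders."""
    return {"user_id": user_id, "days": days, "avg_heart_rate": None}


if __name__ == "__main__":
    executor = TaskExecutor()
    executor.register("wearable", fake_wearable_api)
    print(executor.run("wearable", user_id="u1", days=7))
```

Because each source is just a callable, a new data provider or analysis tool could be added by registering one more function, without touching the planning logic.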