"Understanding Emotions And Transforming Learning" Innovative Educational Experience "illusionX" Integrated Emotional Computing And Mixed Reality System

Large Language Models 08/04/2024

3 main points
✔️ Educational Applications of Emotional Computing: a new system integrating emotional computing, large-scale language modeling, and mixed reality technology is introduced to the field of education.
✔️ System Capabilities and Requirements: the system is required to provide information retrieval, teaching, task support, and a conversational interface, with a focus on improving user experience and performance.
✔️ System Design and Components: presents the design and configuration of a system that integrates elements such as large-scale language models, APIs, mobile apps, smart glasses and smart watches.

IllusionX: An LLM-powered mixed reality personal companion
written by Ramez Yousri, Zeyad Essam, Yehia Kareem, Youstina Sherief, Sherry Gamil, Soha Safwat
(Submitted on 4 Feb 2024)
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)

The images used in this article are from the paper, the introductory slides, or were created based on them.

Summary

Emotional computing, more commonly called affective computing, has been gaining attention in recent years as a way to enrich human interaction with computers and machines. The field seeks to make human-machine interactions more natural and intuitive by understanding and responding to users' emotions and psychological states. Through emotion recognition, facial expression analysis, and detection of user engagement, this technology plays an important role in human-computer interaction (HCI) and human-machine interaction (HMI).

Affective computing systems accomplish these goals by capturing verbal and nonverbal communication signals, such as conversations and gestures. Machine learning algorithms then analyze the user's emotions and derive appropriate responses, which makes more human-like interactions possible.

Education is also a crucial area of human life, and cutting-edge technologies such as artificial intelligence (AI) and affective computing are enabling the delivery of personalized learning experiences. Artificial intelligence has been particularly influential in improving the quality of education and promoting personalized learning, and research on strategies to incorporate tools such as ChatGPT into education has been active. In addition, mixed reality (MR), including virtual reality (VR) and augmented reality (AR), has also emerged as a promising technology to enhance the learning experience, especially in the areas of online learning and e-learning since the COVID-19 pandemic.

The paper proposes a new system, illusionX, that blends mixed reality, artificial intelligence, and affective computing to provide personalized support for learners and help educators prepare more interactive lessons and curriculum.

The paper begins with recent concepts in large-scale language models and mixed reality systems, and examines how they can help education and what challenges they may face. It then covers the system's functional and non-functional requirements, its design components, and the results of the surveys and tests conducted, to show how the system can be useful in education. Finally, it summarizes the system and future prospects in this field, and the authors emphasize that this is the first system to combine large-scale language models, mixed reality, and affective computing specifically for the education sector.

System requirements

illusionX aims to provide a more personalized and higher-quality user experience for educational purposes. The system has two main parts: software (AI, backend, and mobile app) and hardware (smart glasses and a smart watch). The capabilities and requirements of the system are described below.

The main functions of the system are as follows:

Information Search: Information on a wide range of topics is available upon user request.
Teaching ability: able to teach and explain complex topics to various levels of understanding.
Task support: supports tasks related to learning, e.g., organizing notes, summarizing texts, etc.
Conversational and immersive interface: users can interact with the system through casual conversation, as if they were friends.

Functional requirements are as follows:

RE1: Ability to provide accurate information on a variety of topics based on user requests.
RE2: Provide conversational and immersive experiences for users.
RE3: Ability for users to create custom chatbots to suit their needs.
RE4: Provides multiple methods for user authentication.
RE5: Remains accessible to users more than 95% of the time.
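To make RE5 concrete, the short sketch below converts an availability target into a downtime budget. The 30-day measurement window is an assumption chosen for illustration; the article only states the 95% figure and does not say how availability is measured.

```python
def downtime_budget_hours(availability: float, period_hours: float) -> float:
    """Return the maximum downtime (in hours) allowed within a period.

    availability is a fraction between 0 and 1 (0.95 for the RE5 target).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if period_hours < 0:
        raise ValueError("period_hours must be non-negative")
    return (1.0 - availability) * period_hours


# Assumed 30-day window; the window length is not given in the article.
hours_in_window = 24 * 30
budget = downtime_budget_hours(0.95, hours_in_window)
print(f"Allowed downtime: about {budget:.1f} hours per 30 days")
```

Under this assumed window, staying above 95% means keeping total downtime under roughly a day and a half per month.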

Non-functional requirements are as follows:

N-RE1: Have an intuitive and user-friendly interface.
N-RE2: Support a large number of simultaneous users while maintaining response times and ensuring scalability.
N-RE3: Apply robust security measures to protect data and safeguard privacy.

In addition, the use of large-scale language models may raise ethical issues related to the accuracy of information and educational content. In particular, "hallucination" refers to large-scale language models generating information that is not based on facts, which is an especially serious problem in educational contexts. The authors seek to minimize hallucination through multiple approaches, including parameter adaptation, use of external knowledge, and evaluative feedback.
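As one illustration of what an evaluative feedback step could look like, the sketch below flags answer sentences that share few words with the supplied source material. This is a simple word-overlap heuristic assumed for illustration, not the method described in the paper; the function names and the 0.5 threshold are hypothetical.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(sentence: str, sources: list[str]) -> float:
    """Fraction of the sentence's tokens found in the best-matching source."""
    sent = tokens(sentence)
    if not sent:
        return 1.0
    return max((len(sent & tokens(src)) / len(sent) for src in sources), default=0.0)


def flag_unsupported(answer: str, sources: list[str], threshold: float = 0.5) -> list[str]:
    """Return the sentences of an answer whose support score is below threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if support_score(s, sources) < threshold]


# Hypothetical source passage and model answer.
sources = ["VLSI design integrates millions of transistors on a single chip."]
answer = "VLSI design integrates transistors on a chip. The moon is made of cheese."
print(flag_unsupported(answer, sources))
```

Flagged sentences could be shown to the user with a warning or sent back to the model for revision. Word overlap is a crude signal, so a real system would likely need a stronger check.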

The system is an innovative step in shaping the future of educational technology, with the goal of providing a valuable learning experience for users.

System Design and Components

The main components are presented here.

The first is a large-scale language model. At the core of the system is a pre-trained large-scale language model accessed via an API. The options considered were ChatGPT, PaLM2, and Google Gemini; PaLM2 was chosen for its ease of use, cost, and availability. PaLM2 is accessible from Python, which is also the language used to develop the back end and API.

The second is the API. Another key element of the system is the IllusionX API, developed using FastAPI and PostgreSQL. FastAPI was chosen as the foundation for the API because of its speed and simplicity, which provide the rapid response and scalability the system requires. PostgreSQL was chosen as the database because of its superior performance and applicability in business scenarios. Alembic is used as the database migration tool and Pydantic for schema validation.
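The role of schema validation in such an API can be illustrated with a small standard-library sketch. It uses a dataclass as a stand-in for a Pydantic model so that it runs without third-party packages; the `AgentCreate` fields and the `parse_agent` helper are hypothetical and are not taken from the IllusionX API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCreate:
    """Hypothetical request body for creating a custom chatbot (agent)."""

    name: str
    specialty: str

    def __post_init__(self) -> None:
        # Reject wrong types and blank strings before anything reaches the database.
        for field_name in ("name", "specialty"):
            value = getattr(self, field_name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{field_name} must be a non-empty string")


def parse_agent(payload: dict) -> AgentCreate:
    """Validate a decoded JSON payload, rejecting unknown or missing fields."""
    unknown = set(payload) - {"name", "specialty"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    try:
        return AgentCreate(**payload)
    except TypeError as exc:  # raised when a required field is missing
        raise ValueError(str(exc)) from exc


agent = parse_agent({"name": "Circuit Tutor", "specialty": "VLSI design"})
print(agent)
```

In a FastAPI service, a Pydantic model would typically play this role, with invalid payloads rejected before the endpoint logic runs.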

Third, the mobile app: the system is even more accessible through a cross-platform mobile app developed in Flutter. It allows for login, sign-up, chat functionality, and management of chatbots (agents) in a variety of specialties. The user-friendly interface facilitates adoption and use by the targeted user base.

The fourth is smart glasses and smart watches. The hardware portion of the system includes smart watches and smart glasses. Smart glasses use an AR display to show digital information on the lens, while smart watches contain a custom-designed System-on-Chip (SoC) that generates audio and visual responses based on user requests. Used together, the two devices immerse users in a virtual environment that blends with their surroundings for an interactive MR experience.

Tests and Results

A survey was conducted among representatives of the target users to assess adoption of the system. Approximately 87.5% of respondents indicated that they would be interested in a personal companion to assist them with learning and daily tasks, and 67% would be interested in a system that includes both text and voice commands. 62.5% of respondents said they would use the system to search for information, making this the most requested feature. Other features selected by users and their respective percentages are listed in the table below. Note that users can select more than one feature per response.

The system was tested on key learning tasks (course description and outline generation, lesson generation, and Q&A) and compared against vanilla PaLM. Because illusionX uses PaLM as its underlying model, the tests examine whether the knowledge embedding module, which embeds knowledge by providing PDF documents to the model, improves results over the vanilla model.
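The article does not detail how the knowledge embedding module works internally. One common way to supply document knowledge to a model is to select the passages most relevant to a question and place them in the prompt; the sketch below shows that idea with a plain bag-of-words cosine similarity. All function names, the prompt wording, and the sample passages are assumptions for illustration, and the text is assumed to have already been extracted from the PDF.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Word counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, chunks: list, k: int = 2) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {c}" for c in top_chunks(question, chunks, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical passages standing in for text extracted from a course PDF.
chunks = [
    "CMOS inverters use one NMOS and one PMOS transistor.",
    "The course runs for 12 weeks.",
    "Photolithography patterns the wafer during fabrication.",
]
print(build_prompt("How does a CMOS inverter use transistors?", chunks, k=1))
```

This also illustrates why grounded output can be limited by the supplied documents: only what is retrieved reaches the model.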

First is the course description. For course description and outline generation, three different prompts were tested with vanilla PaLM in two areas: artificial intelligence (AI) and nanoelectronics. Vanilla PaLM was able to generate a consistent curriculum, but its output tended to be too broad or too non-technical for students. illusionX, in contrast, generated more detailed course descriptions, although it was limited by the knowledge embedded in the given documents and prompts.

In lesson generation, illusionX slightly reduced hallucinations. In Q&A, illusionX answered more technical and detailed questions, while vanilla PaLM sometimes hallucinated or explained a concept other than the one the user asked about.

The advantages and disadvantages of this system are summarized as follows

The authors also tested guidelines for effective prompts. The results indicate that the following guidelines help generate more effective responses:

Give the model a role. For example, "act as a college professor" or "you are a college professor".
Provide as much detail as possible about the lesson or course without the need to provide the technical aspects of it.
Make sure the documentation you provide is relevant to the topic and clearly organized.

(Exemplary Prompt) Act as a college professor and generate a detailed course description and outline for an introductory course in VLSI design. This course should be targeted at junior level engineering students. The course should cover the fundamentals of VLSI and the design process and manufacturing process of VLSI systems. The course will span 12 weeks.
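The guidelines and the exemplary prompt above can be captured in a small template function, sketched below. The function name and its parameters are hypothetical; the wording follows the exemplary prompt.

```python
def build_course_prompt(role: str, course: str, audience: str, topics: list, weeks: int) -> str:
    """Build a course-outline prompt: assign a role, then give course details."""
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    if not topics:
        raise ValueError("at least one topic is required")
    topic_text = " and ".join(topics)
    return (
        f"Act as a {role} and generate a detailed course description and outline "
        f"for {course}. This course should be targeted at {audience}. "
        f"The course should cover {topic_text}. "
        f"The course will span {weeks} weeks."
    )


prompt = build_course_prompt(
    role="college professor",
    course="an introductory course in VLSI design",
    audience="junior level engineering students",
    topics=[
        "the fundamentals of VLSI",
        "the design process and manufacturing process of VLSI systems",
    ],
    weeks=12,
)
print(prompt)
```

Keeping the role, audience, and scope as separate parameters makes it easy to reuse the same structure for other courses.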

Summary

This paper presents illusionX, a new mixed reality system driven by large-scale language models and aimed at the field of education. The system has demonstrated modest but tangible improvements in achieving learning goals and supporting instructional tasks. Room for further evolution includes the addition of more precise custom-designed components and a fundamental restructuring of the system to provide more practical and accurate information.

Of particular note is the incorporation of functions that take into account users with special needs, as well as multilingual and multimodal support to enhance the user experience. Significant advances are also expected in hardware design and technology.

Based on the results of tests conducted to evaluate adoption and performance, the paper identifies both the potential and the real-world challenges of the technology in education. It also explores directions for future improvement and discusses the ethical considerations involved in the system.

This study offers a new perspective as a step toward the future of educational technology. It is hoped that this system will enhance the quality of education and provide a richer experience for learners in the future.

Takumu: I have worked as a Project Manager/Product Manager and Researcher at internet advertising companies (DSP, DMP, etc.) and machine learning startups. Currently, I am a Product Manager for new business at an IT company. I also plan services utilizing data and machine learning, and conduct seminars related to machine learning and mathematics.
