INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue VI, June 2025
www.ijltemas.in Page 1105
An Integrated Model for A Virtual Voice Assistant Using Modern
Artificial Intelligence Technologies
Ahmed Salam AL Amour, Dr. G. Sandhya Devi
Computer Science & Systems Engineering Andhra University.
DOI: https://doi.org/10.51583/IJLTEMAS.2025.1406000123
Abstract: This paper presents the comprehensive development of a desktop-based virtual assistant application that leverages
advanced speech recognition and natural language processing (NLP) technologies. The system is developed using Python
programming language and is seamlessly integrated with Google Gemini to enhance performance and understanding of user intent.
The assistant provides a user-friendly conversational interface for executing a variety of system-level tasks, such as launching
desktop applications, recording the screen, retrieving general information, and automating repetitive commands. The primary
objective of this project is to significantly enhance user productivity and improve overall system accessibility, especially through
intuitive voice-based interaction. In today's world, many visually impaired, physically disabled, and elderly individuals suffer from
social isolation and a diminished sense of independence in their everyday lives. Voice assistant systems provide a highly promising
solution to this challenge by enabling hands-free, natural, and intuitive interaction with digital technology. This approach allows
users to carry out essential tasks, access necessary information, and communicate with others without the need for visual cues or
manual input. The proposed project introduces a Python-based voice assistant system specifically designed with the needs of blind,
elderly, and physically challenged individuals in mind. The system is tailored to improve their quality of life by promoting
continuous engagement, enhancing digital accessibility, and encouraging greater independence. By integrating both speech
recognition and text-to-speech capabilities, the assistant can understand verbal commands, respond with synthesized speech, and
perform vital functions such as setting alarms or reminders, reading incoming messages, providing weather or news updates, and
accessing online content. This work demonstrates the transformative potential of AI-powered voice technologies in fostering
inclusivity, supporting vulnerable populations, and empowering individuals with special needs through smart, accessible digital
interaction.
Keywords: Virtual Assistant, Speech Recognition, Natural Language Processing (NLP), Text-to-Speech
I. Introduction
In today's era, nearly all tasks have become digitalized. With smartphones in hand, it feels as though we have the world at our
fingertips. Increasingly, we no longer even need to use our fingers; we simply speak a command, and it is executed. There are
systems available where one can say, "Text Dad, 'I'll be late today,'" and the message is sent automatically. This is the role of a
Virtual Assistant. These systems also support specialized tasks such as booking a flight or finding the cheapest book online across
various e-commerce platforms and then providing an interface to place the order. They help automate search, discovery, and online
ordering processes. A growing share of the European population is currently 65 years or older, and this share is predicted to increase to 23.8% by 2030. In addition, most older adults wish to stay in their homes and to age in place. These facts have led more and more researchers to aim at improving the ageing process in the older adult's home and easing access to Information and Communications Technologies (ICT) for this population (Valera Román A, 2021 Apr 20). In this new age of
technology and innovation, the use of artificial intelligence and machine learning has made our life much easier. Virtual Assistants
process audio signals, convert them to text, and perform tasks using Speech-To-Text modules, parsers, dialog managers, answer
generators, and speech synthesizers (Patil J, 2021 May). Technological advancements have simplified daily tasks through oral
communication with computers via conversational interfaces (CIs) (Abougarair AJ, 2022). AI technology interacts with humans
both directly and indirectly. A clear example of human-AI interaction is the use of chatbots, which enhance service quality and
customer engagement by enabling personalized communication without the need for human agents. Similarly, voice assistants have
brought significant benefits and transformative impacts to daily life, allowing users to issue voice commands and contributing to
improved customer satisfaction. AI-powered recommendation systems also play a crucial role in guiding user choices by analyzing
preferences and behaviors, thereby offering tailored suggestions that enhance user experience and drive engagement. Virtual
assistants' usability and efficacy are greatly influenced by natural language processing (NLP). Thanks to NLP, these AI-powered assistants can understand, interpret, and reply to human language, revolutionizing the manner in which people engage with technology (Mei Y, 2023 Dec 7).
Google Gemini is integrated into the virtual assistant system through its API to provide intelligent and advanced natural language
understanding. After the user's speech is converted into text using speech recognition tools, this text is sent to the Gemini model
via the google-generativeai Python library, using an API key obtained from Google Cloud. The model analyzes the text, understands
the user's intent, and then sends a response that clarifies the intended command. The assistant then uses this response to perform
the desired action, such as opening an app or providing an answer. This integration enables the system to understand complex or
ambiguous commands accurately and smoothly.
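The integration described above can be sketched as follows. This is a minimal illustration, not the authors' exact code: the prompt wording and the "gemini-pro" model name are assumptions, while the `configure`, `GenerativeModel`, and `generate_content` calls follow the google-generativeai library's documented usage.

```python
def build_prompt(transcribed_text: str) -> str:
    """Wrap the recognized speech in an intent-extraction instruction."""
    return ("You are a desktop voice assistant. Identify the user's intent "
            "in this command and restate it clearly: " + transcribed_text)

def interpret_command(transcribed_text: str, api_key: str) -> str:
    """Send recognized speech to Gemini and return the model's textual reply."""
    import google.generativeai as genai  # requires the google-generativeai package
    genai.configure(api_key=api_key)     # API key obtained from Google Cloud
    model = genai.GenerativeModel("gemini-pro")  # model name is an assumption
    response = model.generate_content(build_prompt(transcribed_text))
    return response.text
```

The assistant then maps the returned text to a concrete action, such as opening an application.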
This paper aims to provide an overview of the methodologies and steps involved in building a Virtual Personal Assistant, considering different research results and limitations.
Motivation
With the accelerating pace of digital transformation, efficiency and ease of use have become critical factors in the design of everyday
computing tools, especially with the complexity of tasks and the multitude of applications required by the average user. Therefore,
this virtual assistant was developed using a modular approach based on the Python programming language. Several specialized
speech recognition libraries, such as Whisper and SpeechRecognition, were integrated to convert the user's voice commands into
text, enabling natural interaction with the system without the need for manual input. Understanding the user's intent and parsing
their commands was accomplished using the Google Gemini language model, which is capable of extracting intents (intent detection) and accurately identifying the key elements in commands (slot filling).
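The listening step can be sketched as below, assuming the SpeechRecognition package and a working microphone; `recognize_google` is one of several recognizer backends the stack could use, and the normalization helper is an illustrative addition.

```python
def normalize_command(text: str) -> str:
    """Lowercase and collapse whitespace so recognized commands match reliably."""
    return " ".join(text.lower().split())

def listen_once(timeout: float = 5.0) -> str:
    """Capture one utterance from the default microphone and return it as text."""
    import speech_recognition as sr  # requires the SpeechRecognition package
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for background noise
        audio = recognizer.listen(source, timeout=timeout)
    try:
        return normalize_command(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        return ""  # speech could not be understood
```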
To facilitate the execution of commands, the assistant was linked to application interfaces at the operating system level using
libraries such as os, pyautogui, and pywinauto. This enabled it to open common applications such as Notepad, Calculator, Internet Explorer, and Microsoft Office, as well as perform actions such as taking screenshots and recording videos. An interactive graphical user interface (GUI) was also designed using PyQt or Tkinter, allowing users to visually monitor commands and alerts in real time, improving the
user experience and reducing reliance on traditional manual interaction. These technical solutions respond to the challenges faced
by users in the modern computing environment, where many daily tasks require cumbersome repetitions of clicks and commands,
leading to decreased productivity and distraction. This difficulty is exacerbated for non-expert users or those lacking sufficient technical expertise, making the need for natural and optimized interactive interfaces even more crucial.
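The OS-level dispatch could look like the following sketch. The executable names are standard Windows program names; the intent-to-executable map is an assumed illustration, not the authors' actual table.

```python
import subprocess

# Illustrative intent-to-executable map; extend with further applications as needed.
APP_COMMANDS = {
    "notepad": "notepad.exe",
    "calculator": "calc.exe",
    "word": "winword.exe",
}

def resolve_app(intent: str):
    """Map a recognized intent to an executable name, or None if unknown."""
    return APP_COMMANDS.get(intent.lower().strip())

def execute_intent(intent: str) -> bool:
    """Launch the application matching the intent; report whether it matched."""
    exe = resolve_app(intent)
    if exe is None:
        return False
    subprocess.Popen(exe)  # non-blocking launch via an OS-level call
    return True
```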
From a research perspective, additional challenges arise in measuring the effectiveness of these systems in understanding natural
language. Current language models struggle to capture the nuances of speech context or handle ambiguity in commands.
Furthermore, traditional evaluation tools do not always reflect the ability of intelligent assistants to handle complex real-world
scenarios. Therefore, the need to develop compact (distilled) language models that maintain the interpretability of large models, but
with fewer resources, has emerged, making them suitable for mobile devices and limited systems. This requires advanced
knowledge distillation techniques, along with new assessment methodologies that take into account contextual and functional
accuracy in language comprehension. (Abougarair AJ, 2022)
II. Objectives of The Proposed System
The proposed voice-activated desktop assistant is designed to revolutionize the way users, particularly those who are visually impaired, elderly, or physically challenged, interact with their computers (Shashikala KS, 2025; Masina F, 2020 Sep 25). By
enabling voice-controlled execution of routine tasks such as opening commonly used applications (e.g., Notepad, Calculator, Word,
Excel, PowerPoint), taking screenshots, and recording screens, the system significantly enhances productivity and reduces
dependence on traditional input devices like the keyboard and mouse. Through a natural, conversational interface powered by
speech recognition and natural language processing, the system creates a user-friendly environment that lowers the technological
barrier for individuals who may lack technical proficiency. One of the core features is a real-time transcription interface that displays
voice interactions, allowing users to monitor and verify commands as they are processed, ensuring transparency and reinforcing
trust in the system. Moreover, the assistant is designed to learn from user behavior over time, gradually adapting to individual
preferences and offering personalized responses and suggestions that align with the user’s habits and workflow. Seamless
integration with the desktop operating system ensures fast, efficient task execution, making the assistant a reliable companion for
daily computing needs. Beyond functional support, the system also facilitates casual, human-like conversations, enhancing
emotional engagement and reducing the sense of isolation often felt by users with physical or visual limitations. By automating
repetitive actions and minimizing the cognitive effort required to perform standard tasks, the assistant enables users to focus on
more meaningful activities. Additionally, the system promotes a fully hands-free computing experience, especially useful in
contexts where manual interaction is difficult or impractical. Importantly, strong emphasis is placed on data privacy and security,
ensuring that all interactions, especially those involving sensitive information, are handled responsibly and securely. In sum, this
system represents a comprehensive, intelligent solution aimed at fostering independence, inclusivity, and digital empowerment
through the power of voice technology.
III. Related Work
Virtual assistants such as Siri (Apple), Alexa (Amazon), and Cortana (Microsoft) have become prominent tools for voice-based
interaction, offering users the ability to perform simple tasks such as setting reminders, playing music, checking the weather, or
controlling smart home devices (Hoy MB, 2018 Jan 2). These systems rely heavily on cloud-based natural language
processing and are designed primarily for mobile and smart home environments. While convenient for casual tasks, their usefulness
in complex desktop workflows or professional environments remains limited. One major shortcoming of these systems is their lack
of deep integration with desktop operating systems. Most assistants are not capable of navigating the file system, controlling
multiple applications simultaneously, or managing context-rich tasks like editing documents, scheduling across different platforms,
or providing real-time feedback based on user behavior within software environments. When desktop integration does exist, it is
often restricted to a few predefined commands, making the interaction rigid and task-specific rather than adaptive or conversational.
Furthermore, real-time GUI-based transcription is either non-existent or highly limited in current systems. Users have minimal
visual feedback during interactions, which can lead to misunderstandings or errors, especially when issuing complex or ambiguous
commands. Without an intuitive graphical interface, these assistants fail to offer the transparency and control users expect from
desktop tools. Additionally, conversational depth remains a challenge. Existing assistants struggle with maintaining context over
extended dialogues and often require users to repeat or rephrase commands (Wired, 2023 Oct 5). This lack of continuity
interrupts the flow of interaction and prevents the formation of a more natural, human-like communication experience. Academic
research has attempted to address some of these challenges by proposing multimodal systems that combine voice, gesture, and
visual feedback. However, such approaches often remain at the prototype stage and are not widely adopted in consumer or enterprise
environments. This project builds upon these limitations by developing an intelligent assistant that emphasizes desktop integration,
real-time GUI-based transcription, and extended conversational capabilities. The proposed system is designed to support a wider
range of tasks, provide visual context, and sustain dynamic interactions with users in a more natural and productive manner.
Proposed System
The desktop virtual assistant represents a significant step forward in intelligent personal productivity tools, combining the latest
advancements in artificial intelligence, natural language processing (NLP), and speech technologies. Developed primarily in
Python, the assistant is engineered to integrate deeply with the Windows operating system, ensuring reliable interaction with key
applications such as Notepad, Calculator, Microsoft Word, Excel, PowerPoint, and multiple web browsers. At its foundation, the
system uses advanced natural language understanding (NLU) models to parse user inputs, accurately detect intents, and extract slot
values, thereby allowing users to communicate with the assistant using natural, conversational language. Voice commands are
handled through a sophisticated speech recognition engine that converts spoken input into text with high accuracy, enabling hands-
free operation and expanding accessibility for users with physical limitations. In parallel, a responsive text-to-speech system
generates clear verbal responses, creating a smooth two-way conversational experience. To further enhance transparency and
control, the assistant features a modern, interactive graphical user interface (GUI) developed using frameworks like PyQt or Tkinter.
This interface includes a real-time transcription panel that displays both user inputs and assistant responses, helping users monitor
system behavior, catch misinterpretations, and interact more confidently. The assistant is not only reactive but context-aware,
designed to handle multi-step interactions by maintaining short-term conversational memory and offering follow-up suggestions.
Built with modularity in mind, the architecture supports easy extension, allowing developers to incorporate additional services such
as email automation, calendar scheduling, summarization of documents, file system navigation, and integration with cloud-based
productivity platforms. Looking ahead, the platform can be augmented with personalization features driven by machine learning,
enabling the assistant to adapt to individual usage patterns, preferences, and task histories over time. By uniting intelligent language
processing with a real-time, user-friendly interface, this virtual assistant not only streamlines daily computing tasks but also sets a
foundation for more natural and effective human-computer interaction on the desktop.
Methodology
The virtual assistant was developed using a modular approach with Python as the primary programming language, leveraging
various libraries for speech recognition, natural language understanding (NLU), and graphical user interface (GUI) development.
For speech recognition, Python libraries such as SpeechRecognition or Whisper were employed to convert voice inputs into text,
enabling voice-based interaction. Natural language understanding was powered by Google Gemini, a large language model, which
processes and interprets user commands, extracting intent and identifying key slots to understand complex requests. To facilitate
seamless interaction with the system, the assistant uses OS-level APIs to interface with native desktop applications such as Notepad,
Calculator, and Microsoft Office tools. These APIs, accessed through Python’s os module, pyautogui, and pywinauto, allow the
assistant to execute commands, open files, or control other applications based on the interpreted user input. Additionally, the
assistant features a modern GUI, built with frameworks like PyQt or Tkinter, which provides real-time feedback and transcription,
enabling users to track their interactions and commands visually. The system's architecture and interaction flow were modeled using
use case diagrams, data flow diagrams (DFDs), and system models, which provided clear visual representations of user-system
interactions, data movement, and module relationships. This structured approach guided the development process, ensuring the
assistant was both efficient and scalable, while allowing for future enhancements and integrations.
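The real-time transcription interface described above can be sketched with Tkinter. The widget layout and helper names are illustrative assumptions, not the authors' exact UI.

```python
from datetime import datetime

def format_entry(speaker: str, text: str, now: datetime = None) -> str:
    """One transcript line: timestamp, speaker label, and utterance."""
    ts = (now or datetime.now()).strftime("%H:%M:%S")
    return f"[{ts}] {speaker}: {text}"

def build_window():
    """Create the main window with an auto-scrolling transcript panel."""
    import tkinter as tk  # bundled with most CPython installs
    root = tk.Tk()
    root.title("Voice Assistant")
    transcript = tk.Text(root, state="disabled", width=60, height=20)
    transcript.pack(fill="both", expand=True)

    def append(speaker: str, text: str):
        transcript.configure(state="normal")
        transcript.insert("end", format_entry(speaker, text) + "\n")
        transcript.see("end")  # keep the latest line visible
        transcript.configure(state="disabled")

    return root, append
```

Each recognized command and each assistant reply would be passed to `append`, keeping the visual log in sync with the spoken dialogue.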
Finalized Architecture of The Model
The user provides a voice input, which is captured and processed by the Speech Recognition Module. This module converts the
spoken language into text form. The recognized text is then sent to the Python backend, which acts as the system's brain. The
backend analyzes the input and determines what action needs to be taken. Depending on the user's request, the system can either
make an API call to fetch external information, perform Content Extraction to analyze or retrieve specific data from the text, or
issue a System Call to perform an action directly on the computer (like opening an app). After processing, the result is forwarded
to the Text to Speech Module, which converts the response into audio form. Finally, the system produces an output voice response
that the user can hear, completing the interaction in a natural and human-like way.
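The flow above can be sketched as a pipeline with pluggable stages. The stage functions here are toy stand-ins for the real speech, NLU, action, and TTS modules, included only to make the data flow concrete.

```python
def run_pipeline(audio_to_text, decide_action, act, text_to_speech, audio):
    """One pass through the architecture: voice in, voice out."""
    text = audio_to_text(audio)            # Speech Recognition Module
    action, payload = decide_action(text)  # Python backend decides what to do
    result = act(action, payload)          # API call / content extraction / system call
    return text_to_speech(result)          # Text-to-Speech Module

# Toy stand-ins illustrating the data flow:
def fake_stt(audio): return audio.strip().lower()
def fake_nlu(text): return ("echo", text)
def fake_act(action, payload): return f"Executed {action}: {payload}"
def fake_tts(text): return f"<spoken>{text}</spoken>"
```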
Figure 1. Architecture of the model
Source: Created by authors
System Design and Block Diagram
Figure 2. ER Diagram
Source: Created by authors
The above diagram shows the entities and relationships for the virtual assistant system. A user of the system can have key-value pairs used to store any information about the user; for the key "name", for example, the value can be "Jim". The user may wish to keep some keys secure; for these, a lock can be enabled and a password (a voice clip) set. A single user can ask multiple questions, and each question is given an ID so it can be identified along with the query and its corresponding answer. A user can also have any number of tasks, each with its own unique ID and a status recording its current state. A task also has a priority value and a category indicating whether it is a parent task or a child of an older task. The Entity-Relationship Diagram (ERD) is a widely adopted
tool in structured analysis and conceptual data modeling. It provides an intuitive and effective means of representing real-world
systems by modeling business entities, the relationships among them, and the attributes that describe their properties. The ER
approach is valued for its clarity, expressiveness, and its ability to be easily transformed into a relational database schema. Common
semantic components of ER modeling include cardinality constraints, participation (optional or mandatory involvement in
relationships), and generalization/specialization hierarchies, which further enhance its descriptive power (Song IY, 1995). Additionally, the main components of the ER model are the entity set, relationship set, and integrity constraints. The entity set represents objects in the real world that are distinct from other objects (Mohammed MA, 2015 Oct).
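The entities above can be rendered directly as Python data structures. Field names are illustrative, chosen to mirror the diagram: locked key-value attributes guarded by a voice clip, and tasks with an ID, status, priority, and parent/child link.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Attribute:
    key: str
    value: str
    locked: bool = False                   # if True, access requires the voice clip
    password_clip: Optional[bytes] = None  # stored voice-clip password

@dataclass
class Task:
    task_id: int
    description: str
    status: str = "pending"            # the task's current state
    priority: int = 0
    parent_id: Optional[int] = None    # child tasks reference an older task

@dataclass
class User:
    name: str
    attributes: dict = field(default_factory=dict)
    tasks: list = field(default_factory=list)

    def set_attribute(self, key: str, value: str, locked: bool = False):
        self.attributes[key] = Attribute(key, value, locked)
```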
Activity Diagram
Figure 3. Activity diagram
Source: Created by authors
Initially, the system is in idle mode. When it receives a wake-up call, it begins execution. The received command is classified as either a question or a task to be performed, and the appropriate action is taken accordingly. After the question is answered or the task is performed, the system waits for another command. This loop continues until it receives a quit command, at which point it goes back to sleep.
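The idle/wake/execute loop can be sketched with string commands standing in for audio. The wake and quit phrases, and the question-versus-task heuristic, are assumptions made for illustration.

```python
def classify(command: str) -> str:
    """Decide whether a command is a question or a task (toy heuristic)."""
    return "question" if command.rstrip().endswith("?") else "task"

def session(commands, wake_word="hey assistant", quit_word="quit"):
    """Process commands after the wake word until the quit command."""
    log, awake = [], False
    for cmd in commands:
        cmd = cmd.strip().lower()
        if not awake:
            awake = (cmd == wake_word)  # stay idle until the wake-up call
            continue
        if cmd == quit_word:
            break                        # go back to sleep
        log.append((cmd, classify(cmd)))
    return log
```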
Sequence Diagram
Figure 4. Sequence diagram for Query-Response
Source: Created by authors
The above sequence diagram shows how the answer to a user's question is fetched from the internet. The audio query is interpreted and sent to the web scraper, which searches for and finds the answer. The answer is then sent back to the speaker module, which speaks it to the user. This is followed by analyzing classes with the intent of encapsulation (bundling data and methods) while still keeping data and operations separate. The analysis then moves to specifying operations, which define the behavior of objects, including the communication that occurs between objects by passing messages to one another (Al-Fedaghi S, 2021 May 31). An interaction is a sequence of messages passed between objects to accomplish a particular task (Swain SK, 2010 Jul).
Figure 5. Sequence diagram for Task Execution
Source: Created by authors
The user sends a command to the virtual assistant in audio form. The command is passed to the interpreter, which identifies what the user has asked and directs it to the task executor. If the task is missing some information, the virtual assistant asks the user for it; the received information is passed to the task, which is then accomplished. After execution, feedback is sent back to the user.
Figure 6. Testing the system
The system's performance was evaluated under various acoustic conditions, as shown in Figure 6. The evaluation included
measuring response time, speech recognition accuracy, and word error rate in both quiet environments (such as closed rooms) and
noisy environments (such as streets or public spaces). The results showed that the system performed significantly better in low-
noise environments, with higher accuracy and faster responses. In contrast, high noise led to more speech recognition errors and longer response times. These results underscore the importance of incorporating advanced noise filtering techniques to ensure
reliable performance in various real-world scenarios.
User Study and Feedback
To evaluate the ease of use and effectiveness of the voice assistant in real-world scenarios, a mini-usage study was conducted with
five participants from the target audience, including the blind and the elderly. Each participant was asked to interact with the
assistant by performing a set of predefined tasks, such as opening apps, obtaining information, or giving general voice commands.
After completing the experiment, participants were asked a series of questions to evaluate their experience, focusing on three main
areas: (1) ease of use, (2) speech recognition accuracy, and (3) their preference for using voice commands over traditional typing.
The results showed that most users found the system easy to use, with four out of five participants indicating that the assistant understood their commands well. Blind users expressed high satisfaction with the hands-free voice interaction feature, while the elderly expressed a significant preference for using voice over typing. However, some aspects requiring improvement were noted, such as response speed and difficulty understanding unclear speech. This study provided important insights into the
real-world usability of voice assistants and emphasized the importance of inclusive design and testing with diverse audiences to
ensure accessibility and effectiveness for all users.
IV. Evaluation and Results
The virtual assistant system was evaluated in a controlled desktop environment running a standard operating system configuration.
The goal of the testing phase was to determine the system's responsiveness, reliability, and overall functionality in handling user
commands and managing tasks. The assistant was able to successfully recognize and execute voice commands related to common
operations such as launching desktop applications, retrieving stored user information, recording notes or activities, and managing
scheduled tasks. Throughout the testing phase, the system maintained stable performance and exhibited accurate voice recognition,
even in the presence of minor background noise. This was made possible by integrating a pre-trained speech-to-text engine, which
effectively translated voice input into executable instructions. Additionally, the voice-based password system used for securing
user-defined attributes functioned as intended, requiring a matching voice clip for access to sensitive data. This feature added a
layer of personalized security without compromising usability. The assistant also demonstrated the ability to handle multiple
concurrent tasks without freezing or crashing, indicating efficient resource management. For example, while a voice command was
being processed, the GUI remained responsive, allowing users to interact with other elements such as task lists or query logs. The
system's task management feature properly categorized tasks as parent or child, tracked their status changes, and prioritized
execution based on user-defined importance levels. To further document these outcomes, screenshots were taken at various stages,
capturing the assistant's interface while processing input and delivering feedback. These visuals confirm that the system adheres to
expected design standards and maintains a user-friendly interface throughout usage.
In summary, the virtual assistant performed reliably under standard conditions, meeting its functional requirements and
demonstrating a high degree of usability and efficiency. These initial results suggest that the system is well-suited for deployment
in a personal or professional desktop setting, with potential for further expansion to other platforms.
A practical evaluation of the voice assistant's performance was conducted to measure its efficiency and responsiveness under different acoustic conditions. The system was tested in two main environments: the first in a closed, quiet room, and the second in a noisy, crowded environment such as a street or public space. During each experiment, several performance
indicators were recorded, including response time, speech recognition accuracy, and word error rate. In a quiet environment, the
system demonstrated high performance, with a response time of approximately 2 seconds and a speech recognition accuracy rate
exceeding 95%, while the word error rate was very low (approximately 1 to 2 words out of 100). In a noisy environment, performance deteriorated due to acoustic interference: the response time was approximately 5 seconds, the accuracy rate dropped to approximately 80%, and the word error rate rose to approximately 10 words out of 100. These results demonstrate the importance of
testing the voice assistant under various conditions to ensure its reliability in real-world use. There is also a need to improve noise
filtering algorithms and develop models that are more capable of handling audio inputs in noisy environments.
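The word error rate reported above (errors per 100 reference words) is conventionally computed with a word-level Levenshtein alignment, which the following sketch implements.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # dp[i][j]: edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, one substituted word in a four-word reference yields a WER of 0.25, matching the "words out of 100" figures quoted above when scaled.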
V. Conclusion and Future Work
As part of our project's development, we aim to expand the capabilities of the virtual voice assistant in the future by incorporating
advanced and intelligent features that enhance its practical value to users. Among these features, we are working on adding the
ability to automatically generate code in various programming languages, summarize videos for quick understanding of their
content, and improve the text-to-speech (TTS) technology to be more natural and interactive. We also plan to expand the system's
capabilities by developing a large language model (LLM) specifically for the project, supporting multiple languages, including
Arabic, French, and Kannada. This will enable broader interaction with users from diverse cultures and linguistic backgrounds, and
improve the system's ability to understand and respond to commands in multiple contexts. One of the applications we are
particularly interested in is customizing this assistant to serve students, specifically Andhra Pradesh University students, by
designing an intelligent assistant system that guides them through the steps of preparing and writing academic papers in a
professional academic manner. The assistant will focus on guiding students, especially those who have never published a research
paper, step by step in writing a successful research paper.
This will include:
Collecting and analyzing similar research papers from Google Scholar based on the student's field.
Deriving the optimal structure for the paper (introduction, methodology, results, discussion, etc.).
Providing detailed instructions on academic writing for each part of the paper.
Assisting in drafting titles and abstracts in appropriate and professional language.
Academic proofreading and improving the quality of written language using specialized artificial intelligence models.
Plagiarism detection and guidance on proper rewording to reduce similarity and achieve originality.
Suggesting appropriate journals or conferences for publication based on the paper's field.
In this way, the assistant is not just a technical tool, but an academic partner that helps students enter the world of scientific
publishing with confidence and mitigates the challenges they face at the beginning of their research career.
References
1. Valera Román A, Pato Martínez D, Lozano Murciego Á, Jiménez-Bravo DM, de Paz JF. Voice assistant application for
avoiding sedentarism in elderly people based on IoT technologies. Electronics. 2021 Apr 20;10(8):980.
2. Patil J, Shewale A, Bhushan E, Fernandes A, Khartadkar R. A voice-based assistant using Google dialogflow and machine
learning. International Journal of Scientific Research in Science and Technology. 2021 May;8(3):6-17.
3. Abougarair AJ, Aburakhis MK, Zaroug M. Design and implementation of smart voice assistant and recognizing academic
words. International Robotics & Automation Journal. 2022;8(1):27-32.
4. Mei Y. AI & Entertainment: The Revolution of Customer Experience. Lecture Notes in Education Psychology and Public
Media. 2023 Dec 7;30:274-9.
5. Shashikala KS, Vadlamudi S, Gurupriya M, Teja KD, Reddy MP, Reddy JK. Smart helper: A voice guided assistance for
visually impaired. InChallenges in Information, Communication and Computing Technology 2025 (pp. 597-600). CRC
Press.
6. Masina F, Orso V, Pluchino P, Dainese G, Volpato S, Nelini C, Mapelli D, Spagnolli A, Gamberini L. Investigating the
accessibility of voice assistants with impaired users: mixed methods study. Journal of medical Internet research. 2020 Sep
25;22(9):e18431.
7. Hoy MB. Alexa, Siri, Cortana, and more: an introduction to voice assistants. Medical reference services quarterly. 2018
Jan 2;37(1):81-8.
8. Wired. (2023, October 5). Why voice assistants still can't hold a conversation. Wired. https://www.wired.com/story/voice-
assistants-ambient-computing/
9. Song IY, Evans M, Park EK. A comparative analysis of entity-relationship diagrams. Journal of Computer and Software
Engineering. 1995;3(4):427-59.
10. Mohammed MA, Muhammed DA, Abdullah JM. Practical Approaches of Transforming ER Diagram into Tables.
International Journal of Multidisciplinary and Scientific Emerging Research. 2015 Oct;4(22):2349-6037.
11. Al-Fedaghi S. UML sequence diagram: an alternative model. arXiv preprint arXiv:2105.15152. 2021 May 31.
12. Swain SK, Mohapatra DP, Mall R. Test case generation based on use case and sequence diagram. International Journal of
Software Engineering. 2010 Jul;3(2):21-52.