INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue X, October 2025
Advancements in Artificial Intelligence for Real-World Problem
Solving: Foundations, Methods, and Applications
Meera DC
Abstract: This paper surveys recent advancements in artificial intelligence (AI) that have directly improved the capability of
systems to solve real-world problems. We review progress in foundation models and multimodal systems, generative models
(diffusion and transformer families), human-in-the-loop alignment (RLHF), and privacy/resilience techniques for deploying AI at
the edge (federated/TinyML). Building on the literature, we identify important gaps in robustness, evaluation, and societal
alignment, then propose a methodology combining multimodal pretraining, task-specific fine-tuning with human feedback, and
privacy-preserving edge deployments to address practical tasks in healthcare triage, environmental monitoring, and robotics.
Experimental designs, datasets, metrics, and ethical safeguards are provided to enable reproducible, responsible research.
I. Introduction
AI capability has advanced rapidly due to architectural innovations (transformers), scaling laws, and large-scale pretraining,
producing models that generalize across tasks and modalities. These advances enable practical solutions in image/video analysis,
natural language understanding, and embodied systems (robotics/autonomous agents). However, real-world deployment raises
challenges in robustness, data privacy, alignment to human values, and resource constraints at the edge. This paper synthesizes the
state of the art and proposes research directions to close gaps between lab progress and reliable, practical systems.
II. Literature Review — Key Recent Advances
Multimodal & Foundation Models
Large language models extended to handle multiple modalities (images, video, sensor data) have shown emergent capabilities,
performing tasks that require cross-modal reasoning and enabling richer human–AI interaction. Surveys identify multimodal large
language models (MLLMs) as an active, high-impact, and rapidly developing research area.
Generative Models (Diffusion, Transformers, GANs)
Diffusion models and scaled transformer variants now dominate high-fidelity generation (images, audio). These generative
advances are crucial for data augmentation, simulation, and synthetic-data generation for low-resource domains. Comprehensive
reviews summarize theoretical progress and applications.
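As an illustrative sketch (not drawn from any specific reviewed system), the closed-form forward-noising step at the core of diffusion models can be written in a few lines: given a variance schedule of betas, a data point x0 is corrupted toward Gaussian noise via x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise. The function and variable names below are illustrative only.

```python
import math

def forward_diffuse(x0, t, betas, noise):
    """Closed-form forward diffusion step for a scalar data point.

    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise,
    where alpha_bar_t is the cumulative product of (1 - beta_s) for s <= t.
    """
    alpha_bar = 1.0
    for b in betas[:t]:
        alpha_bar *= 1.0 - b
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * noise

# With zero noise, only the signal-attenuation term remains:
x_t = forward_diffuse(1.0, 2, [0.1, 0.1], noise=0.0)  # ≈ 0.9
```

A generative model is then trained to invert this corruption process step by step, which is what makes diffusion useful for data augmentation and synthetic-data generation.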
Human-Centered Alignment (RLHF and Related Methods)
In tasks where objective reward functions are hard to specify, reinforcement learning from human feedback (RLHF) and instruction
tuning have become central for aligning models with human preferences and safety constraints; surveys and practical guides outline
the technique and open problems.
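To make the central mechanism concrete, the pairwise preference loss (a Bradley–Terry objective) that underlies most RLHF reward-model training can be sketched as follows. Given reward scores for a human-preferred response and a rejected one, the model is trained to maximize the probability that the preferred response scores higher. The function name is illustrative, not from a specific library.

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Bradley–Terry negative log-likelihood for one preference pair:
    -log sigmoid(r_chosen - r_rejected).
    Smaller loss means the reward model agrees more strongly with the human label.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

With equal scores the loss is log 2 (the model is indifferent); it falls as the margin in favor of the chosen response grows, which is the gradient signal that fits the reward model to human preferences before policy optimization.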
Edge, Privacy, and Federated Learning
Deploying AI in privacy-sensitive, resource-constrained environments (IoT, mobile health) motivates federated learning, TinyML,
and compressed models that run on-device while preserving user privacy. Recent reviews examine integrating federated techniques
with TinyML for practical edge deployments.
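As a minimal sketch of the aggregation step at the heart of federated learning, the FedAvg rule combines client model weights in proportion to local dataset size, so no raw data leaves the device. The names and the flat-list weight representation below are illustrative simplifications.

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weighted average of client model parameters.

    client_weights: list of per-client parameter vectors (lists of floats).
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    aggregated = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += (size / total) * w
    return aggregated

# Two clients: one trained on 100 samples, one on 300.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])  # → [2.5, 3.5]
```

In a full system this averaging runs on the server each communication round, while the (often compressed, TinyML-scale) model is trained locally on each device.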
Embodied & Robotics Systems
Embodied multimodal models (e.g., models combining language and sensory inputs) demonstrate transfer to robotics tasks,
suggesting end-to-end pretraining can accelerate robotic perception and planning.
Research Gaps & Challenges
1. Robustness & Distribution Shift: Models often fail when domain shifts occur; generalization guarantees are limited.
2. Evaluation: Benchmarks can be narrow; real-world performance needs richer, task-specific metrics.
3. Alignment & Safety: RLHF reduces some failure modes, but is expensive and can encode biases.
4. Privacy & Resource Constraints: Large models are costly to run and raise privacy concerns for sensitive data.
5. Reproducibility & Sim-to-Real Transfer: Simulated training often does not transfer cleanly to physical systems (robots,
sensors).