Scaling AI Applications on the Cloud toward Optimized Cloud-Native Architectures, Model Efficiency, and Workload Distribution

Aravind Nuthalapati

Abstract: The rapid growth of Artificial Intelligence (AI) has increased the demand for scalable, efficient, and cost-effective computational infrastructure. Traditional on-premise systems face limitations in scalability, resource allocation, and cost efficiency, making cloud computing a preferred solution. This paper examines cloud-native architectures, including containerization, Kubernetes orchestration, serverless computing, and microservices, as key enablers of AI scalability. It reviews model optimization techniques, namely quantization, pruning, and knowledge distillation, which make models more efficient without sacrificing accuracy. The paper also investigates workload distribution methods, such as federated learning, distributed training, and adaptive AI scaling, for improving resource efficiency and lowering response times. Implementation still faces challenges in cost control, latency reduction, efficient resource scheduling, and security. To overcome these limitations, the paper presents three possible solutions: automated AI scaling, edge-cloud integration, and provisioning with cost-intelligent management systems. It also surveys current trends for enhancing AI scaling capabilities, including AI-native cloud orchestration, AutoML-based optimization, and quantum computing applications. The paper offers comprehensive insights into cloud-based AI scalability that help researchers and practitioners deploy and optimize high-performance AI systems.

Nuthalapati, A. (2025). Scaling AI Applications on the Cloud toward Optimized Cloud-Native Architectures, Model Efficiency, and Workload Distribution. International Journal of Latest Technology in Engineering Management & Applied Science, 14(2), 200-206. https://doi.org/10.51583/IJLTEMAS.2025.14020022
