AI-Powered Document Generation: Using NLP for Intelligent Data-To-Template Mapping

Khushi Singh
Agrim Yadav
Tanya Chandervanshi

Abstract: Augmenting Automated Document Generation. This paper introduces the Sandbox: Document Generating Engine, a novel, secure, and modular web application built with Python and Streamlit (Achachlouei, A., Patil, M. A., Joshi, Q., Vair, T. & N. 2021). The primary research objective is to validate the feasibility and efficacy of augmenting Intelligent Document Processing (IDP) workflows with contemporary Large Language Models (LLMs) for semantic data-to-template mapping. To address manual, time-consuming, and error-prone document creation, the system uses Natural Language Processing (NLP) to analyze data uploaded in diverse formats (e.g., .csv, .xlsx, .txt) and automatically populate predefined document templates (Adhikari, P. R. 2018). A secure authentication module uses bcrypt for password hashing and PostgreSQL for credential management. Initial technical findings show high reliability, with extraction accuracy consistently above 95% across test documents. The system also sharply reduced the time required for complex document creation, indicating that LLM-enhanced IDP can deliver substantial gains in efficiency and productivity over simple rule-based methods (Bitzenbauer, P. 2023).
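The data-to-template mapping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper relies on an LLM for semantic matching, whereas the sketch below substitutes plain fuzzy string matching from the Python standard library so that it runs without external services. The function names (normalize, map_columns_to_placeholders, render_rows), the ${...} placeholder syntax, and the sample data are all assumptions made for illustration.

```python
import csv
import difflib
import io
import re
from string import Template


def normalize(name: str) -> str:
    """Lower-case a column or placeholder name and drop non-alphanumerics."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_columns_to_placeholders(columns, placeholders, cutoff=0.6):
    """Return {placeholder: column} for the closest-matching column names.

    Hypothetical stand-in for the LLM-based semantic mapping in the paper:
    similarity here is purely lexical (difflib), not semantic.
    """
    by_normalized = {normalize(col): col for col in columns}
    mapping = {}
    for placeholder in placeholders:
        matches = difflib.get_close_matches(
            normalize(placeholder), list(by_normalized), n=1, cutoff=cutoff
        )
        if matches:
            mapping[placeholder] = by_normalized[matches[0]]
    return mapping


def render_rows(csv_text: str, template_text: str):
    """Fill the template once per CSV row and return the rendered documents."""
    # Assumed placeholder syntax: ${name}, as used by string.Template.
    placeholders = re.findall(r"\$\{(\w+)\}", template_text)
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    mapping = map_columns_to_placeholders(reader.fieldnames or [], placeholders)
    template = Template(template_text)
    # safe_substitute leaves unmapped placeholders in place instead of raising.
    return [
        template.safe_substitute({ph: row[col] for ph, col in mapping.items()})
        for row in rows
    ]


if __name__ == "__main__":
    sample_csv = "Customer Name,Invoice Total\nAda Lovelace,120.50\n"
    sample_template = "Dear ${customer_name}, your total is ${invoice_total}."
    for document in render_rows(sample_csv, sample_template):
        print(document)
```

Placeholders with no sufficiently similar column are left unfilled, which makes mapping failures visible in the output. A semantic matcher such as the LLM described in the paper would be expected to handle synonyms that this lexical stand-in cannot.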

AI-Powered Document Generation: Using NLP for Intelligent Data-To-Template Mapping. (2025). International Journal of Latest Technology in Engineering Management & Applied Science, 14(10), 221-229. https://doi.org/10.51583/IJLTEMAS.2025.1410000030

References

Achachlouei, A., Patil, M. A., Joshi, Q., Vair, T. & N. (2021). Document Automation Architectures and Technologies: A Survey. arXiv. https://arxiv.org/abs/2109.02605

Adhikari, P. R. (2018). Understanding of Plagiarism through Information Literacy: A Study among the Students of Higher Education of Nepal. Journal of Business and Social Sciences Research, 3(2), 165–181. https://doi.org/10.3126/jbssr.v3i2.28132

AlAli, R., & Wardat, Y. (2024). Opportunities and Challenges of Integrating Generative Artificial Intelligence in Education. International Journal of Religion, 5(7), 784–793. https://doi.org/10.61707/8y29gv34

Aldosari, S. A. M. (2020). The Future of Higher Education in the Light of Artificial Intelligence Transformations. International Journal of Higher Education, 9(3), 145. https://doi.org/10.5430/ijhe.v9n3p145

Almahasees, Z., Khalil, M., & Aminzadeh, S. (2024). Students’ Perceptions of the Benefits and Challenges of Integrating ChatGPT in Higher Education. Pakistan Journal of Life and Social Sciences (PJLSS), 22(2), 3479–3494. https://doi.org/10.57239/PJLSS-2024-22.2.00256

Archila, P. A., Ortiz, B. T., Truscott de Mejía, A.-M., & Molina, J. (2024). Thinking critically about scientific information generated by ChatGPT. Information and Learning Science. https://doi.org/10.1108/ILS-04-2024-0040

Arora, S., Yang, S., Eyuboglu, B., Narayan, S., Hojel, A., Trummer, A., & E., I. R. (2023). Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes. Proc. VLDB Endow., 17(2), 92–104. https://doi.org/10.14778/3620359.3620366

Athaluri, A. S., Manthena, S. V., K., M. V. S. R., Kesapragada, V., Yarlagadda, T., Dave, & Dudumpudi, R. T. S. (2023). Exploring the Boundaries of Reality: Investigating the Phenomenon of Artificial Intelligence Hallucination in Scientific Writing Through ChatGPT References. Cureus, 15(12). https://doi.org/10.7759/cureus.49964

Bakiri, H., Mbembati, H., & Tinabo, R. (2023). Artificial Intelligence Services at Academic Libraries in Tanzania: Awareness, Adoption and Prospects. University of Dar Es Salaam Library Journal, 18(2). https://doi.org/10.4314/udslj.v18i2.3

Bearman, M., Tai, J., Dawson, P., Boud, D., & Ajjawi, R. (2024). Developing evaluative judgement for a time of generative artificial intelligence. Assessment & Evaluation in Higher Education, 49(6), 893–905. https://doi.org/10.1080/02602938.2024.2335321

Biswas, S., Jain, S., Morariu, R., Gu, V. L., Mathur, J., Wigington, P., Sun, C., & Uehida, T. (2024). DocSynthV2: A Practical Autoregressive Modelling for Document Generation. arXiv. https://arxiv.org/abs/2406.02492

Bitzenbauer, P. (2023). ChatGPT in physics education: A pilot study on easy-to-implement activities. Contemporary Educational Technology, 15(3), ep430. https://doi.org/10.30935/cedtech/13176

Borkovska, I., Kolosova, H., Kozubska, I., & Antonenko, I. (2024). Integration of AI into the Distance Learning Environment: Enhancing Soft Skills. Arab World English Journal, 1(1), 56–72. https://doi.org/10.24093/awej/ChatGPT.3

Bozkurt, A. (2024). Tell Me Your Prompts and I Will Make Them True: The Alchemy of Prompt Engineering and Generative AI. Open Praxis, 16(2), 111–118. https://doi.org/10.55982/openpraxis.16.2.661

Bradley, C. (2013). Information Literacy Articles in Science Pedagogy Journals. Evidence Based Library and Information Practice, 8(4), 78–92. https://doi.org/10.18438/B8JG76

Cain, W. (2024). Prompting Change: Exploring Prompt Engineering in Large Language Model AI and Its Potential to Transform Education. TechTrends, 68(1), 47–57. https://doi.org/10.1007/s11528-023-00896-0

Carroll, A. J., & Borycz, J. (2024). Integrating large language models and generative artificial intelligence tools into information literacy instruction. The Journal of Academic Librarianship, 50(4), 102899. https://doi.org/10.1016/j.acalib.2024.102899

ÇAYIR, A. (2023). A Literature Review on the Effect of Artificial Intelligence on Education. İnsan ve Sosyal Bilimler Dergisi, 6(2), 276–288. https://doi.org/10.53048/johass.1375684

Lin, C.-H., & Cheng, C. P. (2024). Legal Documents Drafting with Fine-Tuned Pre-trained Large Language Model. arXiv. https://arxiv.org/abs/2406.08860

Mohammadi, B., et al. (2024). Creativity Has Left the Chat: The Price of Debiasing Language Models. arXiv. https://arxiv.org/abs/2403.04595

Mridul, M. A., Sloyan, I., Gupta, A., & Seneviratne, O. (2025). AI4Contracts: LLM & RAG-Powered Encoding of Financial Derivative Contracts. arXiv. https://arxiv.org/abs/2506.09633

Nigam, S. K., Patnaik, B. D., Thomas, A. V., Shallum, N., Ghosh, K., & Bhattacharya, A. (2025). Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhiDastavej. International Journal of Law, Technology, and Management. https://doi.org/10.48550/arXiv.2506.09540

Zhao, H., & Li, D. (2024). A Large Language Model-based Framework for Semi-Structured Tender Document Retrieval–Augmented Generation. arXiv. https://arxiv.org/abs/2403.18560

Zhang, Q., Huang, B., Jiang, V., Wang, J., Jiang, Z., He, L., & Zhang, C. (2024). Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction. ResearchGate. https://arxiv.org/abs/2403.11186
