Unlearning in AI: Techniques and Frameworks for Data Deletion in Pretrained Models Under Legal and Ethical Constraints


Motunrayo Adebayo

Abstract: The rapid expansion of AI has been propelled by large-scale pretrained models, which have enabled significant advances across computer vision, multimodal applications, and natural language processing. This swift progress has simultaneously heightened concerns about data privacy and protection, particularly under stringent legislative measures such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). To address these challenges, the concept of "unlearning" is crucial. Unlearning refers to the technological process of removing specific data, or its influence, from a trained model, typically when required by data deletion rights or ethical considerations. Unlike simply deleting entries from a database, the complex and interconnected nature of learned representations in deep neural networks makes unlearning in AI systems considerably more difficult. This study thoroughly investigates unlearning methods and frameworks for data erasure in pretrained models, operating within established legal and ethical boundaries.


The inquiry begins by discussing the legal and moral justifications for machine unlearning, emphasizing factors such as model functionality, data traceability, and the completeness of the deletion process. Next, I present a classification of existing unlearning techniques, ranging from those less suitable for large-scale pretrained models and diverse data types to those better adapted to real-world applications; these include retraining, model modification, knowledge distillation, approximate unlearning, and certified removal. Following an assessment of unlearning approaches for large pretrained models and varied data modalities, the discussion expands into a detailed examination of their benefits, drawbacks, computational costs, and trade-offs, with particular attention to 'influence' (the impact a data point has on the model) and 'deletion' (its verifiable removal).
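To make the retraining end of this taxonomy concrete, the following is a minimal, hypothetical sketch (not code from the article) of exact unlearning via sharded retraining in the style of SISA: the training set is split into shards, one small model is trained per shard, and deleting a point only requires retraining the shard that contained it. The toy dataset and nearest-centroid per-shard model are illustrative stand-ins for real data and real learners.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: two Gaussian blobs (hypothetical stand-in for real training data).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

N_SHARDS = 5
# Strided split so every shard sees both classes.
shards = [list(range(i, len(X), N_SHARDS)) for i in range(N_SHARDS)]

def train_shard(idx):
    """Nearest-centroid model trained only on this shard's points."""
    Xs, ys = X[idx], y[idx]
    return {c: Xs[ys == c].mean(axis=0) for c in np.unique(ys)}

models = [train_shard(idx) for idx in shards]

def predict(x):
    # Majority vote over the per-shard models (the aggregation step).
    votes = [min(m, key=lambda c: np.linalg.norm(x - m[c])) for m in models]
    return max(set(votes), key=votes.count)

# "Unlearn" training point 7: retrain only the shard that contained it.
target = 7
s = next(i for i, idx in enumerate(shards) if target in idx)
shards[s] = [i for i in shards[s] if i != target]
models[s] = train_shard(shards[s])  # cost: one shard, not the whole ensemble
```

The point of the sketch is the cost asymmetry: deletion here is exact (the point provably contributes to no surviving model) yet touches only one shard, which is the trade-off the retraining-based family of techniques exploits.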


I formalize machine unlearning and establish its theoretical foundation. In my experience, unlearning can be effectively implemented in various contexts, particularly with pretrained models, to minimize accuracy loss while providing robust privacy guarantees; this capability is enabled by specific methodological frameworks and algorithms. My experimental assessment compares unlearning methods across a range of datasets and tasks, paying particular attention to a 'remembering' metric, model utility preservation, computational cost, and resilience to data reconstruction attacks. Furthermore, the study bridges the technical and regulatory domains by connecting legal requirements to quantifiable machine learning objectives and by illuminating moral dilemmas that arise when balancing privacy against transparency and fairness. I highlight significant inconsistencies between current legal requirements and the actual technical capabilities of unlearning, offering theoretical and technological guidance through multidisciplinary approaches. Despite these achievements, I found that scalable and verifiable unlearning in large pretrained models remains a nascent yet crucial field of study. To ensure adherence to privacy regulations and uphold ethical standards in AI applications, this study lays the groundwork for future research into unified standards, rigorous evaluation processes, and practical deployment of unlearning technology. The overarching goal is to foster the sustained development of trustworthy AI systems that uphold personal data rights while delivering genuine value to society.
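For readers who want the formal target behind this discussion, the standard indistinguishability-style definition used in the certified-removal literature (not necessarily the article's exact formulation) can be stated as follows. Given a learning algorithm $A$, a dataset $D$, and a removal request $z \in D$, an unlearning mechanism $U$ achieves $(\varepsilon, \delta)$-removal if, for every measurable set of models $\mathcal{T}$,

$$\Pr\bigl[\,U(A(D), D, z) \in \mathcal{T}\,\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[\,A(D \setminus \{z\}) \in \mathcal{T}\,\bigr] + \delta,$$

and the symmetric bound holds with the two probabilities exchanged. Intuitively, the unlearned model must be statistically indistinguishable from a model retrained from scratch without $z$; $\varepsilon = \delta = 0$ recovers exact unlearning, and the 'remembering' and reconstruction-attack metrics mentioned above can be read as empirical proxies for how far a given method is from this bound.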

Unlearning in AI: Techniques and Frameworks for Data Deletion in Pretrained Models Under Legal and Ethical Constraints. (2025). International Journal of Latest Technology in Engineering Management & Applied Science, 14(8), 841-863. https://doi.org/10.51583/IJLTEMAS.2025.1408000109


