Advancing Predictive Analytics: Integrating Machine Learning and Data Modelling for Enhanced Decision-Making
Abstract: In the era of big data, the synergy between machine learning (ML) and data modeling has emerged as a cornerstone of predictive analytics. This article explores how machine learning techniques can be integrated with traditional data modeling approaches to improve decision-making across domains. By combining the strengths of both methodologies, organizations can gain deeper insights, improve predictive accuracy, and drive innovation. The article discusses key concepts, challenges, and applications, and offers a roadmap for researchers and practitioners seeking to apply these technologies.

This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in our journal are licensed under CC-BY 4.0, which permits authors to retain copyright of their work. This license allows for unrestricted use, sharing, and reproduction of the articles, provided that proper credit is given to the original authors and the source.