Logistic Regression with Min-Max Scaling and TF-IDF for App Classification and Recommendation on Google Play Store
DOI:
https://doi.org/10.54082/jiki.288
Keywords:
App Recommendation, Google Play Store, Logistic Regression, Min-Max Scaling, TF-IDF Vectorizer, User Experience
Abstract
In the rapidly evolving mobile application ecosystem, enhancing user experience on the Google Play Store has become a critical challenge due to the vast number of available applications. This study proposes an integrated approach combining Logistic Regression, Min-Max Scaling, and the Term Frequency–Inverse Document Frequency (TF-IDF) Vectorizer to classify applications and generate personalized recommendations. The dataset, obtained from the Google Play Store, includes numerical features such as ratings, size, and installs, as well as textual data from user reviews. Min-Max Scaling was applied to normalize numerical attributes, ensuring balanced feature contributions during model training. TF-IDF was employed to convert textual reviews into meaningful numerical representations, enabling the model to capture the semantic importance of terms. The classification and recommendation system was evaluated using accuracy, precision, and recall as performance metrics. Experimental results demonstrated a substantial improvement compared to the baseline model, with accuracy, precision, and recall reaching 99.8%, compared to the previous 22.8% baseline performance. The system effectively recommended relevant applications based on user preferences, as measured through cosine similarity in feature space. These results indicate that the proposed method not only improves classification accuracy but also enhances the quality of app recommendations, thereby significantly improving user experience. The findings contribute to the field of computer science by demonstrating an effective integration of feature scaling and text vectorization into a classical machine learning model, offering a scalable and interpretable solution for large-scale recommendation systems in digital marketplaces. This approach can be further adapted to other domains requiring hybrid processing of numerical and textual data for predictive analytics.
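The pipeline described in the abstract can be illustrated with a minimal sketch, written here in plain Python so that each step is visible. It is not the authors' implementation: the four app records, the binary label (a hypothetical "well received" flag), the whitespace tokenizer, the TF-IDF weighting variant, and the training hyperparameters are all assumptions made for illustration. The sketch rescales the numerical columns to [0, 1], converts review text to TF-IDF vectors, concatenates both into one feature vector per app, fits a logistic regression by stochastic gradient descent, and ranks other apps by cosine similarity in the combined feature space.

```python
import math
from collections import Counter

# Hypothetical toy records: (name, rating, size in MB, installs, review, label).
# Names, values, and labels are illustrative and not taken from the study.
apps = [
    ("PhotoFix", 4.5, 30.0, 1_000_000, "great photo editor with filters", 1),
    ("SnapEdit", 4.2, 25.0, 500_000, "nice photo filters and editor", 1),
    ("BudgetPal", 3.1, 12.0, 10_000, "crashes often bad budget tracker", 0),
    ("CashLog", 2.8, 8.0, 5_000, "bad budget app slow and crashes", 0),
]

def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) weighted as tf * (ln(N / df) + 1)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = max(len(toks), 1)  # guard against empty documents
        vectors.append([
            (counts[t] / total) * (math.log(len(docs) / df[t]) + 1.0)
            for t in vocab
        ])
    return vocab, vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by per-sample gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0

# Scale each numerical column, vectorize the reviews, then concatenate per app.
numeric_columns = [min_max_scale([app[i] for app in apps]) for i in (1, 2, 3)]
_, text_vectors = tfidf_vectors([app[4] for app in apps])
features = [
    [col[row] for col in numeric_columns] + text_vectors[row]
    for row in range(len(apps))
]
labels = [app[5] for app in apps]

# Predictions here are on the training data, so they say nothing about generalization.
w, b = train_logistic_regression(features, labels)
predictions = [predict(w, b, x) for x in features]

def recommend(query_index, top_k=2):
    """Rank the other apps by cosine similarity to the query app."""
    scores = [
        (cosine_similarity(features[query_index], features[j]), apps[j][0])
        for j in range(len(apps)) if j != query_index
    ]
    return [name for _, name in sorted(scores, reverse=True)[:top_k]]

recommended = recommend(0)  # apps most similar to "PhotoFix"
```

Without scaling, the install counts (up to one million in this toy data) would dominate both the gradient updates and the cosine similarity, which is the balancing role the abstract attributes to Min-Max Scaling. In practice a library such as scikit-learn would typically replace these hand-written helpers, and evaluation would use a held-out test set rather than the training records.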
License
Copyright (c) 2025 Calista Anindita, Wike Laelatunuji, Rusmini Rusmini

This work is licensed under a Creative Commons Attribution 4.0 International License.