Artificial Intelligence and NLP on Reddit: Unsupervised Detection of Food Trends and Healthy Eating Patterns

Authors

Rocío del Campo-Pedrosa 1, Bettina Merlin 2, Diego del Campo-Pedrosa 1, Ana González-Marcos 1

1 Department of Mechanical Engineering, Universidad de La Rioja, Logroño, La Rioja, Spain
2 Fakultät International Business, Hochschule Heilbronn, Heilbronn, Germany

Abstract

Traditional sensory analysis in food innovation provides limited insight into consumer behavior, whereas social platforms such as Reddit offer large-scale, real-time textual data on food-related practices and perceptions. This study evaluates Reddit as a scalable source for detecting food trends and healthy eating patterns in Spanish-language discussions using artificial intelligence (AI) and natural language processing (NLP). An end-to-end pipeline was implemented, including targeted data scraping across seven food-related domains, Spanish-language filtering (≥70% confidence), customized preprocessing, and unsupervised topic discovery via k-means clustering. The system processed 17,774 Spanish-language posts from an initial corpus of 92,949 entries. Despite linguistic challenges such as polysemy and lemmatization errors, the method produced coherent and representative themes, including barriers to home cooking, weight management concerns, economic factors, food categories, and nutrition-related consultations. These results demonstrate the effectiveness of unsupervised NLP techniques for large-scale monitoring of food-related discourse on social media.
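The filtering and clustering stages described in the abstract can be illustrated with a minimal sketch. The abstract does not name the tools, the feature representation, or the number of clusters used in the study, so everything below is an assumption: the toy posts, their language labels and confidence scores (standing in for the output of a language-identification step), the tiny stopword list, the TF-IDF weighting, the deterministic farthest-point seeding, and k = 2. Only the 70% confidence threshold and the use of k-means come from the text.

```python
import math
import re
from collections import Counter

# Hypothetical input: (text, detected language, detector confidence in [0, 1]).
# These values are invented for illustration, not taken from the study.
posts = [
    ("receta de pollo al horno con verduras", "es", 0.98),
    ("pollo al horno receta fácil", "es", 0.95),
    ("quiero bajar de peso con dieta", "es", 0.91),
    ("dieta para bajar de peso rápido", "es", 0.88),
    ("best chicken recipe ever", "en", 0.99),
    ("jajaja", "es", 0.40),
]

MIN_CONFIDENCE = 0.70  # threshold stated in the abstract
STOPWORDS = {"de", "al", "con", "para", "la", "el", "en", "y"}  # toy list


def tokenize(text):
    """Lowercase, keep runs of Spanish letters, drop stopwords."""
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return the vocabulary and one L2-normalised TF-IDF vector per document."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        # Smoothed inverse document frequency.
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each vector.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# 1. Keep only posts detected as Spanish with sufficient confidence.
kept = [text for text, lang, conf in posts if lang == "es" and conf >= MIN_CONFIDENCE]
# 2. Preprocess, 3. vectorise, 4. cluster.
docs = [tokenize(text) for text in kept]
vocab, vectors = tfidf(docs)
labels, centroids = kmeans(vectors, k=2)
# 5. Describe each cluster by its three highest-weighted terms.
top_terms = [
    [term for _, term in sorted(zip(c, vocab), reverse=True)[:3]] for c in centroids
]
for i, terms in enumerate(top_terms):
    print(f"cluster {i}: {', '.join(terms)}")
```

On real data, a lemmatizer and a full Spanish stopword list would replace the toy tokenizer, and the number of clusters would need to be chosen empirically; the sketch only shows how the confidence filter feeds the clustering step.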

Keywords

Natural Language Processing, Unsupervised Learning, Social Media Mining, Artificial Intelligence