When purchasing foods, many consumers give food labels only a cursory scan, taking in information such as calorie levels or sodium content. Why is streamlining the analysis of that label information crucial from a public and policy perspective?
Creating and maintaining the databases needed by researchers and others to establish food policies and monitor the food supply is a significant task. This involves classifying and analyzing hundreds of thousands of foods, a process that is typically done manually and infrequently.
Guanlan Hu, Postdoctoral Fellow in the Department of Nutritional Sciences (Temerty Faculty of Medicine, U of T), is on a mission to simplify this complex process. Her research explores the use of pre-trained language models and supervised machine learning to analyze unstructured food label text, thereby streamlining food categorization and other important classification tasks. Among her primary goals is to revolutionize the understanding and categorization of ultra-processed foods (UPFs) for the benefit of the public and policy makers, and ultimately to improve public health.
Hu is supervised by Professor Emerita Mary R. L’Abbé (Temerty Faculty of Medicine, U of T). Her presentation at the DSI Research Day, co-authored by Postdoctoral Fellow Mavra Ahmed and PhD student Nadia Flexner, signals a shift in the landscape of food classification and health policy.
“Using cutting-edge language models and machine learning, we’ve automated food categorization, nutrition quality scoring and food processing level classification,” says Hu. “This streamlines food analysis and holds promise for swift, scalable monitoring of the global food supply, particularly in identifying ultra-processed foods.”
Leveraging pre-trained language models and the XGBoost multi-class classification algorithm, Hu’s methodology achieved an impressive accuracy score of 0.98 in predicting both major and sub-category classification of foods, outperforming traditional bag-of-words methods and presenting a powerful tool for efficiently determining food categories and food processing levels.
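For readers curious what the "traditional bag-of-words" baseline mentioned above looks like in practice, here is a minimal sketch in plain Python. It is not Hu's pipeline, which pairs pre-trained language models with XGBoost; it only illustrates the simpler idea of classifying a food from word counts in its label text. The ingredient strings, category names and the nearest-centroid classifier are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
import math
import re


def bag_of_words(text):
    """Turn label text into word counts, ignoring order and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples):
    """Sum the word counts of every training label, per category."""
    centroids = defaultdict(Counter)
    for text, label in examples:
        centroids[label].update(bag_of_words(text))
    return dict(centroids)


def predict(centroids, text):
    """Pick the category whose summed word counts are most similar."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def accuracy(centroids, examples):
    """Share of examples whose predicted category matches the true one."""
    correct = sum(predict(centroids, text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical toy data: (ingredient text, food category).
TRAIN = [
    ("whole milk, vitamin d", "dairy"),
    ("cheddar cheese: milk, salt, bacterial culture", "dairy"),
    ("wheat flour, water, yeast, salt", "bakery"),
    ("enriched wheat flour, sugar, yeast, canola oil", "bakery"),
    ("carbonated water, sugar, caramel colour, caffeine", "beverages"),
    ("water, apple juice concentrate, natural flavour", "beverages"),
]
TEST = [
    ("skim milk, vitamin a, vitamin d", "dairy"),
    ("whole wheat flour, yeast, honey", "bakery"),
    ("carbonated water, natural flavour, citric acid", "beverages"),
]

if __name__ == "__main__":
    model = train_centroids(TRAIN)
    print("accuracy on toy test set:", accuracy(model, TEST))
```

A word-count approach like this has no notion that "skim milk" and "milk solids" are related unless the exact words overlap, which is one plausible reason methods built on pre-trained language models can do better on messy label text. The toy accuracy here says nothing about real-world performance.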
“The research holds the potential to expedite the monitoring and regulation of ultra-processed foods in the global food supply, offering a transformative impact on public health and regulatory practices,” says Professor L’Abbé.
This research is part of a DSI Catalyst Grant project, “Using deep learning and image recognition to develop AI technology to measure child-directed marketing on food and beverage packaging and investigate the relationship between marketing, nutritional quality and price,” awarded to L’Abbé and Professors David Soberman (Joseph L. Rotman School of Management), Laura Rosella (Dalla Lana School of Public Health), and Steve Mann (Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Faculty of Applied Science & Engineering). The Collaborative Research Team includes trainees such as Hu.
By refining food analysis and offering a better method for policymakers to monitor and regulate UPFs, Hu especially hopes to improve public health and dietary understanding in countries where highly processed foods contribute significantly to daily energy intake, such as Canada, the United States and Argentina, where Hu has applied her work.
Her just-completed research, though, is simply a first step. “Much like the continual evolution of technology,” says Hu, “our work demands continuous development and evolution in this pioneering field.”
In the meantime, Hu’s work underscores the potential of machine learning and natural language processing in nutrition sciences and the interdisciplinary nature of such breakthroughs, reflecting the importance of Data Sciences Institute grants in fostering collaborative research.
As a collaborative community, the DSI promotes innovation and facilitates the exchange of ideas, connecting diverse groups of researchers and trainees spanning various disciplines. One of the many ways that trainees can get involved is through the DSI’s Postdoctoral Fellowship, designed to support multi- and interdisciplinary training and collaborative research in data sciences.
This news story was originally posted on the Data Sciences Institute website.