A University of Washington research project testing whether food product reviews on Amazon.com can help predict recalls is showing promise, and the team leader hopes it will have practical applications for illness outbreak investigations. Project lead Elaine Nsoesie, Ph.D., assistant professor of global health at the university’s Institute for Health Metrics and Evaluation, said that while the Unsafe Food Project is still in the “mission learning stage,” initial results are encouraging now that a summer fellowship program has ended.

This team from the University of Washington’s Data Science for Social Good program worked on the Unsafe Food Project. They are, from left, Lead Data Scientist Valentina Staneva, Data Science Fellows Michael Munsell, Cynthia Vint, Kara Woo, and Kiren Verma, and Project Lead Elaine Nsoesie. (Photo courtesy of eScience Institute)
“What we did this summer is get all the data we needed together from the FDA and try and match that data to the Amazon product reviews we were getting. It was more challenging than we expected,” she said. “Then we created a database with all that data. The internship was only 10 weeks, so we didn’t have that much time. We tried to automate the process in the time we had.”

According to the program description, “The goal of this project is to use product reviews from Amazon.com to identify potentially unsafe food products. Foods that are mislabeled, contaminated, or spoiled get recalled through a time-consuming process that can leave consumers at risk of allergic reactions, injury, and illness for months. Our goal is to use reviews that consumers post online to predict whether a product will be recalled.”

Concern is growing over the lag time between when complaints about a food item surface and when a recall is announced. In some cases, it may take a year because illnesses must be reported and confirmed, food testing done, and a government investigation completed.

The project team mined the product reviews for specific terms such as “sick,” “mold” and “vomit” and then tried to figure out which ones might help to predict subsequent food recalls. Nsoesie said that the Amazon reviews turned out to be mostly positive, although a closer look occasionally indicated that something else might be going on.

“Some people are really sarcastic. They have a product that they really hate, but they will give it a five-star review, so when you read it, you see that it’s really a bad review. They might think they will get more attention with a good review than a bad one,” she said.

Even so, there were correlations that could be made, and the team had some success in linking subsequent recalls with the substance of the food product reviews.

“For some of them, we can possibly be able to predict a recall. For sensitivity of a model, it’s about 35 percent, so we need to work and improve on that,” Nsoesie said.

The project team also created an online tool for viewing recalled product reviews that shows reviews and ratings for a recalled product over time. The team noted that the reviews in the tool “provide some support for the idea that product reviews can be a fruitful data source for identifying unsafe foods.”

The tool has a pull-down menu with a variety of food products ranging from trail mixes and granola bars to chili, molasses, peanut butter, tea, spices and several others. After choosing one, it’s possible to see where its reviews fall along time and rating axes. For a brand of Hungarian paprika, for example, the reviews posted from 2009 to 2014 ranged from a high score of 5, with comments such as “Perfect!” and “I love it!”, to a low score of 1, with comments such as “Disgusting!” and “Bugs in this product!”, with a number of other comments landing in between.

Nsoesie said the next step for the project will be trying to get real information in real time. The team will seek cooperation from Amazon, Twitter, Yelp and perhaps others to get that accomplished, since adequate data-gathering with better and faster technology is key.

“If someone has information about a tomato, you might not know what tomato they’re talking about. You need the product identification number (UPC) to track it back to the company and maybe back to where they got it,” she explained.

The Unsafe Food Project was one of four from the 2016 Data Science for Social Good summer fellowship at the UW’s eScience Institute in Seattle. In addition to Nsoesie, the team included Lead Data Scientist Valentina Staneva and Data Science Fellows Michael Munsell, Cynthia Vint, Kara Woo, and Kiren Verma.
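
For readers curious about what the term-mining step and the sensitivity figure mean in practice, the following is a minimal Python sketch, not the team’s actual code. It flags reviews that mention the three terms named in the article and computes sensitivity, the share of truly recalled products that a model catches. The function names, the sample reviews and the recall labels are all made up for illustration.

```python
import re

# Watch terms named in the article; the team's full list is not given.
TERMS = ("sick", "mold", "vomit")
# Match each term at the start of a word, plus any suffix ("moldy", "vomiting").
TERM_PATTERN = re.compile(r"\b(" + "|".join(TERMS) + r")\w*", re.IGNORECASE)


def flag_review(text):
    """Return True if the review mentions any watch term (hypothetical helper)."""
    return TERM_PATTERN.search(text) is not None


def sensitivity(predicted, actual):
    """True positives divided by all truly recalled items: TP / (TP + FN)."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    recalled = true_pos + false_neg
    return true_pos / recalled if recalled else 0.0


# Made-up (review text, was the product later recalled?) pairs.
reviews = [
    ("Perfect! I love it!", False),
    ("Found mold inside the jar.", True),
    ("Made my whole family sick.", True),
    ("Tasted stale and odd.", True),
    ("Five stars, I was vomiting all night. Great product.", False),
]

predicted = [flag_review(text) for text, _ in reviews]
actual = [recalled for _, recalled in reviews]
print(f"Sensitivity: {sensitivity(predicted, actual):.2f}")
```

In this toy data, the keyword rule misses the “stale and odd” review of a recalled product and flags a sarcastic five-star review of a product that was not recalled, which mirrors the two difficulties Nsoesie describes: missed recalls lower sensitivity, and star ratings alone can mislead.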