If you’ve ever tried to find a recent inspection report for a particular restaurant, you’ll know that the ease of accessing and understanding that information can depend on what city or county you live in.

University of Maryland Economics Professor Ginger Jin and UCLA Associate Professor of Business Management Philip Leslie have been researching food safety inspections since the 1990s. Out of that work grew the idea of collecting and normalizing food safety inspection data from local government websites. Along with UMD Computer Science Professor Ben Bederson and graduate students Alexander Quinn and Ben Zou, they created a huge database of food service inspections from about half of all counties in the U.S.

The team started with the largest jurisdictions (nearly 20 states that publish data statewide, plus large metropolitan areas such as New York City), which means their database includes almost 900,000 establishments, almost 7 million inspections, and more than 18 million violations. The database doesn’t cover every single restaurant in the country because not all local governments publish their data, but Jin estimates that more than half of the restaurants in the country are represented.

“If you need one particular piece of data, it might not be there, but if you use it for the broader kinds of analyses that we think provide the most value, then you’re well-covered,” Bederson says.

The team scans the sites continuously, so the database is fairly current and includes information dating back to when scanning began a few years ago. Some sites also published historical data going back as far as the mid-1990s, and that was included in the database as well.

The technology behind the database had two major hurdles to overcome. The first is that although the Food and Drug Administration provides guidance on best practices for running and reporting food safety inspections, each municipality does things differently.
For example, some jurisdictions might have broad categories for problems encountered during inspections while others go into granular detail. It’s the difference between “temperature out of range” and “the temperature in the corner of the refrigerator was between this and that temperature.” The second hurdle is that inspection results are reported online in many different ways. There could be a human-friendly PDF file or webpage that is searchable by restaurant but terrible for a computer to mine, or a computer-friendly database that might be harder for the average consumer to interpret.

The database was built with a few different customers in mind. One was chain restaurants that don’t want to manually collect data across hundreds of locations. Another was local health departments wanting to understand whether emulating the way a nearby jurisdiction operates could make their own inspections more efficient. The third main target audience was national policymakers who want a big picture of food safety and/or fiscal effectiveness.

Going forward, the team is working on commercializing the database for companies and organizations. They’re also developing different tools for different potential users. Otherwise, it’s about making the database bigger and better.

Jin notes that the team welcomes food safety inspection data from any health department that hasn’t posted its data on the web. “Our engineers can deal with the technical problem and save them the cost of setting up their own website and tracking data over time,” she says. “If they are not ready to disclose the data to the public yet, we can incorporate their data into our framework and limit the access to their specification.” They’re also happy to work with local and federal governments on exploring ways to improve inspections and compliance with food safety regulations.
“The business world has been into ‘big data’ for quite some time and, in my view, the government — especially local governments — have lagged behind in this movement,” Jin says. “I think potentially that big data could be even more useful for local governments themselves because they have collected many of those data.” For non-commercial use, the database is available free to the public at HazelAnalytics.com.