If you’ve ever tried to find a recent inspection report for a particular restaurant, you’ll know that the ease of accessing and understanding that information can depend on what city or county you live in. University of Maryland Economics Professor Ginger Jin and UCLA Associate Professor of Business Management Philip Leslie have been researching food safety inspections since the 1990s. Out of that work grew the idea to collect and normalize food safety inspection data from local government websites.

Along with UMD Computer Science Professor Ben Bederson and graduate students Alexander Quinn and Ben Zou, they created a huge database of food service inspections covering about half of all counties in the U.S. The team started with the largest jurisdictions (nearly 20 states that publish data statewide, plus large metropolitan areas such as New York City), so the database includes almost 900,000 establishments, almost 7 million inspections, and more than 18 million violations. The database doesn’t cover every single restaurant in the country because not all local governments publish their data, but Jin estimates that more than half of the restaurants in the country are represented.

“If you need one particular piece of data, it might not be there, but if you use it for the broader kinds of analyses that we think provide the most value, then you’re well-covered,” Bederson says.

The scan is continuous, so the database stays current, and it includes information dating back to when the team started scanning sites a few years ago. In addition, some sites had historical data going back as far as the mid-1990s, which was also included in the database.

The technology behind the database had two major hurdles to overcome. The first is that although the Food and Drug Administration provides guidance on best practices for running and reporting food safety inspections, each municipality does things differently.
For example, some jurisdictions might have broad categories for problems encountered during inspections while others go into granular detail. It’s the difference between “temperature out of range” and “the temperature in the corner of the refrigerator was between this and that temperature.”

The second is that inspection results are reported online in many different ways. There could be a human-friendly PDF file or webpage that is searchable by restaurant but terrible for a computer to mine, or a computer-friendly database that might be harder for the average consumer to interpret.

The database was built with a few different customers in mind. One was chain restaurants that don’t want to manually collect data across hundreds of locations. Another was local health departments wanting to understand whether emulating the way a nearby jurisdiction operates could make their own inspections more efficient. The third main target audience was national policymakers who want a big picture of food safety and/or fiscal effectiveness.

Going forward, the team is working on commercializing the database for companies and organizations. They’re also working on developing different tools for different potential users. Otherwise, it’s about making the database bigger and better.

Jin notes that the team welcomes food safety inspection data from any health department that hasn’t posted its data on the web. “Our engineers can deal with the technical problem and save them the cost of setting up their own website and tracking data over time,” she says. “If they are not ready to disclose the data to the public yet, we can incorporate their data into our framework and limit the access to their specification.”

They’re also happy to work with local and federal governments on exploring ways that might improve inspections and compliance with food safety regulations.
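The normalization problem described above can be illustrated with a small sketch: each jurisdiction reports violation severity in its own vocabulary, and a shared mapping translates records into one schema. The jurisdiction names, labels, and field names below are hypothetical assumptions for illustration, not the team’s actual schema.

```python
# Hedged sketch: mapping jurisdiction-specific severity labels onto a
# shared schema. All names here are illustrative, not the real database.

# Each jurisdiction describes severity with its own vocabulary.
SEVERITY_MAPS = {
    "maricopa_az": {"priority item": "critical", "core item": "minor"},
    "nyc": {"critical": "critical", "general": "minor"},
}

def normalize(jurisdiction: str, record: dict) -> dict:
    """Translate one raw inspection record into the shared schema."""
    severity = SEVERITY_MAPS[jurisdiction].get(
        record["severity"].lower(), "unknown"
    )
    return {
        "establishment": record["name"],
        "violation": record["description"],
        "severity": severity,
    }

raw = {"name": "Example Cafe", "severity": "Priority Item",
       "description": "Food not held at proper temperature"}
print(normalize("maricopa_az", raw)["severity"])  # critical
```

A lookup table per jurisdiction like this is one simple way to reconcile broad categories in one place with granular ones in another; labels with no known mapping fall through to “unknown” rather than being silently dropped.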
“The business world has been into ‘big data’ for quite some time and, in my view, the government — especially local governments — have lagged behind in this movement,” Jin says. “I think potentially that big data could be even more useful for local governments themselves because they have collected many of those data.”

For non-commercial use, the database is available free to the public at HazelAnalytics.com.

  • LogicPolice

    I have looked at the website. I am familiar with both the code and the violation item and number as it appears on the inspection sheet. It is reflective of the 2009 FDA Food Code. What is odd is there are critical violations, i.e. “toxic substances properly used, identified” and “insects, rodents & other pests”, that are shown as “minor” non-critical violations. This seems like an aberration of the data collection and reporting method. Seeking elaboration.

    • I work with InspectionRepo.com. In general, our policy is to record the data—including violation severity levels—exactly as we found it at the jurisdiction’s web site.

      I believe you may have been referring to inspections from Maricopa County, AZ. Is that right? Violations labelled “toxic substances properly used, identified” are marked as “risk factor”, “priority item”, and “critical”. Here is one example. Does that help?

      Please let us know if you have any questions or if you ever notice anything that needs correcting.

      • LogicPolice

        Hi Alex

        Sorry…took a few days to get back to you. The examples I looked at are in Oklahoma, as I am more familiar with the code adopted by the State HD there.

        Here is an example http://www.inspectionrepo.com/r/1393995/
        of what I was talking about. If you took the data exactly as it appears, then someone is not doing their job correctly…which is totally possible. I have seen food out of temp and improper sanitizing both reported as non-critical.
        In the example link you sent me, the inspection reported a live roach observed during an inspection. That also should only be reported as a critical violation. Not because of my opinion…it’s how the inspection sheet is constructed. Core violations=critical.
        I am grateful for your reply.

        • For the example you pointed to, our site faithfully reproduces the information as it was found. Oklahoma’s site indicates critical violations by highlighting them in orange. Therefore, InspectionRepo marks a violation as critical if and only if it was highlighted in orange on Oklahoma’s site. Those represent about 10% of all Oklahoma violations.
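          As a rough sketch, that rule can be expressed in a few lines. The markup and function below are hypothetical for illustration; they are not our actual scraper, and Oklahoma’s real HTML differs.

```python
# Minimal sketch of the "critical iff highlighted orange" rule.
# The table-cell markup shown here is hypothetical example input.
import re

def is_critical(violation_html: str) -> bool:
    """Record a violation as critical if and only if the source page
    highlighted it in orange (matched via an inline style here)."""
    return bool(re.search(r"background(?:-color)?\s*:\s*orange",
                          violation_html, re.IGNORECASE))

rows = [
    '<td style="background-color: orange">Improper holding temperature</td>',
    '<td>Floors not clean</td>',
]
print([is_critical(r) for r in rows])  # [True, False]
```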

          To see an example of critical violations, try searching Oklahoma’s site (https://www.ok.gov/health/pub/wrapper/foodservice.html) for “Downtown Diner” in Woodward County. The violations that are marked as critical there are also marked as critical on our site (http://www.inspectionrepo.com/r/1821756/).

          On the other hand, if you search for “Okie Joe’s BBQ” in Adair County (the example you provided), you can see that none of the violations are marked as critical in the data provided by Oklahoma. That is reflected on our site as well (http://www.inspectionrepo.com/r/1393995/).

          Of course we both know that temperature violations and improper sanitizing would normally be classified as critical. However, our policy is to reproduce the information exactly as the inspectors and their department provided it.

          I hope that helps. Feel free to get in touch with any other questions.

          • LogicPolice

            Thanks a ton for the response Alex. Completely clarified. Take care sir