Fresh produce has been responsible for a significant number of foodborne illnesses. Over the past 20 years, numerous studies have sought to improve the microbial safety of fresh produce. However, fresh produce remains the leading cause of foodborne-illness outbreaks, surpassing other infectious-disease carriers. Currently, the official Centers for Disease Control and Prevention (CDC) outbreak-detection method includes seven steps: in brief, detect, find, generate, test, solve, control, and decide (CDC, 2018b). The public is not officially informed of an outbreak until the CDC process reaches step 6 (Control the Outbreak), when the CDC or public health offices issue a recall or advice to consumers. Consequently, under the current system there is a significant delay between the first infections and the point at which the public is informed of an outbreak.

Because foodborne illnesses directly affect consumers, understanding how consumers respond when interacting with foods, and extracting information from social media posts to detect early symptoms of foodborne illness, may provide a new means to reduce risk and curtail outbreaks. The readily available and rapidly disseminated digital data in our increasingly connected world (e.g., news, social media), coupled with open-source big-data technologies ranging from machine learning to cyber-informatics, could provide an unprecedented opportunity to identify emerging food-safety issues at an earlier stage and trace them to their source, enabling effective early prevention.

Therefore, our study aims to develop an innovative big-data analytics infrastructure for modeling fresh-produce safety risks and providing early warning of fresh-produce outbreaks. The resulting infrastructure applies state-of-the-art cyber-informatics technologies that leverage multi-source big data, including social media, news media, and government reports, to reduce the incidence of foodborne diseases associated with the consumption of fresh produce. To detect instances related to foodborne illness and identify emerging outbreak trends, deep learning, natural language processing techniques such as BERT, and other machine learning techniques are applied to analyze unstructured data. In addition, visual analytics techniques allow users to understand the spread and severity of a predicted outbreak.
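
As a rough illustration of the detect-then-trend idea described above, the following Python sketch filters posts that look like produce-related illness reports and flags days whose report count rises well above the recent baseline. It is a minimal sketch under stated assumptions, not the project's implementation: the keyword lists, the function names, and the rolling-threshold rule are all hypothetical, and the simple keyword match stands in for the trained text classifier (for example, a fine-tuned BERT model) that the study refers to.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, stdev

# Hypothetical vocabularies (assumptions for illustration only).
# A trained classifier would replace this keyword matching.
SYMPTOM_TERMS = {"food poisoning", "vomiting", "diarrhea", "stomach cramps", "nausea"}
PRODUCE_TERMS = {"lettuce", "spinach", "sprouts", "cantaloupe", "salad"}


def is_illness_report(text):
    """Return True if the text mentions both a symptom and a produce item."""
    lowered = text.lower()
    has_symptom = any(term in lowered for term in SYMPTOM_TERMS)
    has_produce = any(term in lowered for term in PRODUCE_TERMS)
    return has_symptom and has_produce


def daily_counts(posts):
    """Count illness-like posts per day; posts is an iterable of (date, text)."""
    counts = Counter()
    for day, text in posts:
        if is_illness_report(text):
            counts[day] += 1
    return counts


def date_range(start, n_days):
    """Return n_days consecutive dates beginning at start."""
    return [start + timedelta(days=i) for i in range(n_days)]


def flag_spikes(counts, days, window=7, k=3.0, min_std=1.0):
    """Flag days whose count exceeds mean + k * std of the preceding window.

    min_std keeps the threshold from collapsing when the baseline is flat.
    window must be at least 2 so that a standard deviation can be computed.
    """
    series = [counts.get(day, 0) for day in days]
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        spread = max(stdev(baseline), min_std)
        if series[i] > mean(baseline) + k * spread:
            flagged.append(days[i])
    return flagged
```

In practice the per-day series could be split by region or by produce item before thresholding, so that a local cluster of reports is not diluted by national volume; that choice, like the window length and the multiplier k, would need to be tuned against historical outbreak data.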

Project Goals: