The Coronavirus Disease 2019 (COVID-19), caused by the virus SARS-CoV-2, began to spread in mid-December 2019 from Wuhan, widely regarded as the epicentre of the epidemic, to almost all provinces of China and more than 80 other countries. The outbreak emerged and escalated during a special holiday period in China (about 20 days surrounding the Lunar New Year), during which a huge volume of intercity travel took place, resulting in outbreaks in multiple regions connected by an active transportation network. Thus, to understand the SARS-CoV-2 spreading process in China, it is essential to examine the human migration dynamics, especially between the epicentre Wuhan and other Chinese cities.

Baidu Migration Data of Inflow and Outflow Migration Strengths of Wuhan

Model and Prediction for China (February 18, 2020)
We utilise human migration data collected from Baidu Migration, a mobile-app-based human migration tracking system that provides historical indicative daily volumes of travellers to, from and between 367 cities in China. In this study, we combine the intercity travel data from Baidu Migration with the Susceptible-Exposed-Infected-Removed (SEIR) model to build a new dynamic model for the spreading of COVID-19 in China. Using official historical data of infected, recovered and death cases in 367 cities, we fit the model to the data to estimate the best set of model parameters, which are then used to estimate the number of individuals exposed to the virus in each city and to predict the extent of spreading in the coming months. Since January 24, 2020, very strict migration control has been imposed in various provinces and cities to restrict travel and hence to curb the spreading of the virus. Our study shows that, provided such migration control and other stringent measures continue to be in place, the number of infected cases in various Chinese cities will peak between mid-February and early March 2020, with about 0.8%, less than 0.1% and less than 0.01% of the population eventually infected in Wuhan, Hubei Province, and the rest of China, respectively, and no new cases expected from mid-March. Moreover, for most cities in Hubei Province (except Wuhan) and for most cities outside Hubei Province, the total number of infected individuals will be less than 4000 and 300, respectively. The data used in this study for identification of the model parameters correspond to a low intercity migration period in China. Thus, the prediction of the epidemic propagation based on the model is valid only if the same level of migration control continues to be in place.
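To make the idea of coupling intercity travel with an SEIR model concrete, the following is a minimal sketch, not the model used in the study. It assumes a simple daily Euler update, a hypothetical migration matrix whose entry migration[i][j] is the fraction of city i's travelling population that moves to city j per day, and the simplification that only susceptible and exposed individuals travel. The parameter names beta, sigma and gamma (transmission, incubation and removal rates) are illustrative choices, not fitted values.

```python
COMPARTMENTS = ("S", "E", "I", "R")
MOBILE = ("S", "E")  # assumption: only susceptible and exposed people travel


def seir_step(cities, migration, beta, sigma, gamma, dt=1.0):
    """Advance every city by one Euler step of length dt (in days).

    cities:    list of dicts with keys S, E, I, R (numbers of people)
    migration: migration[i][j] is the daily fraction of city i's mobile
               population moving to city j (hypothetical input format)
    """
    n = len(cities)
    updated = []
    for i, city in enumerate(cities):
        total = sum(city[k] for k in COMPARTMENTS)
        new_exposed = beta * city["S"] * city["I"] / total if total > 0 else 0.0
        delta = {
            "S": -new_exposed,
            "E": new_exposed - sigma * city["E"],
            "I": sigma * city["E"] - gamma * city["I"],
            "R": gamma * city["I"],
        }
        # Add the net flow of travellers for the mobile compartments.
        for k in MOBILE:
            outflow = sum(migration[i][j] for j in range(n) if j != i) * city[k]
            inflow = sum(migration[j][i] * cities[j][k] for j in range(n) if j != i)
            delta[k] += inflow - outflow
        updated.append({k: city[k] + dt * delta[k] for k in COMPARTMENTS})
    return updated


def simulate(cities, migration, beta, sigma, gamma, days):
    """Return the list of daily snapshots, starting with the initial state."""
    history = [cities]
    for _ in range(days):
        history.append(seir_step(history[-1], migration, beta, sigma, gamma))
    return history


if __name__ == "__main__":
    # Two hypothetical cities; the outbreak starts in the first one.
    start = [
        {"S": 9990.0, "E": 10.0, "I": 0.0, "R": 0.0},
        {"S": 5000.0, "E": 0.0, "I": 0.0, "R": 0.0},
    ]
    flows = [[0.0, 0.01], [0.02, 0.0]]
    final = simulate(start, flows, beta=0.5, sigma=0.2, gamma=0.1, days=30)[-1]
    print(final)
```

In this sketch, lowering the off-diagonal entries of the migration matrix plays the role of migration control: exposed individuals reach the second city more slowly, which delays its outbreak.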

Predictions for Other Countries using Data Coding (March 8, 2020)
As an extension, we apply a data-driven coding method for prediction of the COVID-19 spreading profile in any given population that shows an initial phase of epidemic progression. Based on the historical data collected for COVID-19 spreading in 367 cities in China and the set of parameters of the augmented Susceptible-Exposed-Infected-Removed (SEIR) model obtained for each city, a set of profile codes representing a variety of transmission mechanisms and contact topologies is formed. By comparing the early outbreak data of a given population with the complete set of historical profiles, the best-fit profiles are selected and the corresponding sets of profile codes are used to predict the future progression of the epidemic in that population. Application of the method to the data collected for South Korea, Italy and Iran shows that peaks of infected cases are expected to occur before the end of March 2020, and that the percentage of population infected in each city will be less than 0.01%, 0.05% and 0.02%, for South Korea, Italy and Iran, respectively.
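The profile-matching step can be illustrated with a short sketch. It is an assumption-laden example rather than the method used in the study: the similarity measure (sum of squared differences over the early window), the function name best_fit_profiles, and the dictionary layout pairing each historical profile with its code are all hypothetical.

```python
def best_fit_profiles(early, profiles, k=3):
    """Return the k historical profiles closest to the early outbreak data.

    early:    list of daily infected counts (or fractions) observed so far
    profiles: dict mapping a profile name to a (code, series) pair, where
              series is the full historical curve for that profile
    Result:   list of (name, code) pairs, best fit first
    """
    def distance(series):
        # Assumed metric: squared error over the days covered by `early`.
        return sum((a - b) ** 2 for a, b in zip(early, series))

    candidates = [
        (distance(series), name, code)
        for name, (code, series) in profiles.items()
        if len(series) >= len(early)  # skip profiles shorter than the window
    ]
    candidates.sort(key=lambda item: item[0])
    return [(name, code) for _, name, code in candidates[:k]]


if __name__ == "__main__":
    # Hypothetical historical profiles with made-up parameter codes.
    history = {
        "city_a": ({"beta": 0.6}, [1, 2, 4, 8, 16]),
        "city_b": ({"beta": 0.9}, [1, 3, 9, 27]),
        "city_c": ({"beta": 0.1}, [2, 2, 2, 2]),
    }
    print(best_fit_profiles([1, 2, 4], history, k=2))
```

The codes of the selected profiles would then serve as candidate parameter sets for projecting the remainder of the epidemic curve in the new population.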

Predictions for Japan and USA using a New Model That Accounts for Unreported Cases (March 26, 2020)
A new Susceptible-Exposed-Infected-Confirmed-Removed (SEICR) model with consideration of intercity travel and active intervention is proposed for predicting the spreading progression of the Coronavirus Disease 2019 (COVID-19). The model accounts for the fact that, owing to insufficient testing, the reported number of infected cases is lower than the actual number of infected individuals. It integrates intercity travel data to track the movement of exposed and infected individuals among cities, and allows different levels of active intervention to be considered so that realistic prediction of the number of infected individuals can be performed.
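The distinction between infected and confirmed individuals can be sketched for a single city as follows. This is an illustrative guess at the compartment structure, not the published model: it assumes infected individuals are confirmed at a hypothetical testing rate kappa, that confirmed cases are isolated and no longer transmit, that both groups are removed at rate gamma, and that intervention simply scales the transmission rate by (1 - intervention). Intercity travel is omitted to keep the example short.

```python
def seicr_step(state, beta, sigma, kappa, gamma, intervention=0.0, dt=1.0):
    """One Euler step for a single city; state is a tuple (S, E, I, C, R)."""
    S, E, I, C, R = state
    total = S + E + I + C + R
    # Assumption: intervention in [0, 1] reduces transmission proportionally.
    effective_beta = beta * (1.0 - intervention)
    new_exposed = effective_beta * S * I / total if total > 0 else 0.0
    new_infected = sigma * E          # end of incubation
    new_confirmed = kappa * I         # infected people found by testing
    removed_unconfirmed = gamma * I   # removed without ever being reported
    removed_confirmed = gamma * C
    return (
        S - dt * new_exposed,
        E + dt * (new_exposed - new_infected),
        I + dt * (new_infected - new_confirmed - removed_unconfirmed),
        C + dt * (new_confirmed - removed_confirmed),
        R + dt * (removed_unconfirmed + removed_confirmed),
    )


def run_seicr(state, days, **params):
    """Return the daily states, starting with the initial one."""
    states = [state]
    for _ in range(days):
        states.append(seicr_step(states[-1], **params))
    return states


if __name__ == "__main__":
    start = (9990.0, 10.0, 0.0, 0.0, 0.0)
    rates = dict(beta=0.5, sigma=0.2, kappa=0.1, gamma=0.1)
    for level in (0.0, 0.8):
        S, E, I, C, R = run_seicr(start, days=60, intervention=level, **rates)[-1]
        print(f"intervention={level}: infected={I:.1f}, confirmed={C:.1f}")
```

In this sketch a low kappa corresponds to insufficient testing: the confirmed compartment C then captures only a small share of the individuals who pass through I.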

© Michael Tse 2020