ABSTRACT
The novel coronavirus (COVID-19 or 2019-nCoV) pandemic has neither a clinically proven vaccine nor clinically proven drugs; however, patients are recovering with the aid of antibiotic medications, anti-viral drugs, chloroquine, and vitamin C supplementation. The world evidently needs a faster way to contain further spread of COVID-19 using non-clinical approaches such as data mining, augmented intelligence, and other artificial intelligence techniques, so as to reduce the heavy burden on the healthcare system while supporting effective diagnosis and prognosis of patients. In this project, data mining models were developed to predict the recovery of COVID-19 infected patients and the prevalence of the disease using an epidemiological dataset of COVID-19. Machine learning algorithms such as Linear Regression, Support Vector Machine Regression, Polynomial Regression, ARIMA, and FB Prophet were applied directly to the dataset using the Python programming language to develop the models. The models forecast the number of Confirmed, Recovered, and Death cases for the following week. Predicting the prevalence and incidence of this disease throughout the world is crucial to helping health professionals make key decisions about it.
Keywords – COVID-19, Linear Regression, Support Vector Regression, Polynomial regression, ARIMA, Prophet, RMSE, MAE, MAPE.
INTRODUCTION
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the causative agent of the novel coronavirus disease (COVID-19 or 2019-nCoV), emerged in late 2019 and is believed to have originated in Wuhan, Hubei Province, China. COVID-19 is spreading rapidly among humans. Its major symptoms include fever, cough, and shortness of breath, which in many instances appear similar to those of the flu. COVID-19 has since reached pandemic scale and claimed the lives of many people across the world, and human-to-human transmission from infected individuals with mild symptoms has been reported. However, there is no drug or vaccine clinically proven to treat COVID-19; therefore, non-clinical techniques such as data mining, machine learning, expert systems, and other artificial intelligence techniques are urgently needed to contain the pandemic and prevent further outbreaks.
Data mining (DM) is an advanced artificial intelligence (AI) technique used for discovering novel, useful, and valid hidden patterns or knowledge in data. The technique reveals relationships and patterns within a single dataset or across several datasets. It has also been widely used for the prognosis and diagnosis of many diseases, including Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV), which were discovered in 2003 and 2012, respectively. The huge volume of data generated around the world every day about the 2019-nCoV pandemic is a valuable resource to be mined and analysed for useful, valid, and novel knowledge or patterns that support better decision-making to contain the outbreak. In the healthcare sector, data mining has been applied in many different areas, such as predicting patient outcomes, modeling health outcomes, ranking hospitals, and evaluating treatment effectiveness, infection control, stability, and recovery.
In this study, we develop several data mining models for predicting the recovery of 2019-nCoV infected patients. The models predict when COVID-19 infected patients are likely to recover and be released from isolation centers, as well as which patients are unlikely to recover and may lose their lives to COVID-19. The models help health workers assess the recovery and stability of newly infected persons. Data mining algorithms, including linear regression and support vector machine regression, were applied directly to the dataset using the Python programming language to develop the models.
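As a minimal sketch of the regression-based forecasting described above, the following Python code fits a linear and a polynomial trend to cumulative case counts, scores both on a held-out final week using RMSE, MAE, and MAPE, and then forecasts the next seven days. It is an illustration under stated assumptions, not the project's actual implementation: the case counts are synthetic and noise-free, the function names (`fit_trend`, `forecast`, `rmse`, `mae`, `mape`) are hypothetical, and NumPy's least-squares polynomial fit stands in for whichever regression library the project used.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_trend(days, cases, degree):
    """Least-squares polynomial fit of case counts against day index.

    degree=1 is ordinary linear regression; degree>1 is polynomial regression.
    """
    return Polynomial.fit(days, cases, deg=degree)


def forecast(model, last_day, horizon=7):
    """Predict case counts for the `horizon` days following `last_day`."""
    future_days = np.arange(last_day + 1, last_day + 1 + horizon)
    return future_days, model(future_days)


def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(actual - predicted)))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes `actual` has no zeros."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


# SYNTHETIC cumulative confirmed counts (an assumption for illustration only,
# not real COVID-19 data): a noise-free quadratic growth curve over 30 days.
days = np.arange(30)
confirmed = 50 + 12 * days + 1.5 * days ** 2

# Hold out the final week to compare the models before forecasting.
train_days, test_days = days[:-7], days[-7:]
train_y, test_y = confirmed[:-7], confirmed[-7:]

scores = {}
for name, degree in [("linear", 1), ("quadratic", 2)]:
    model = fit_trend(train_days, train_y, degree)
    predicted = model(test_days)
    scores[name] = {
        "rmse": rmse(test_y, predicted),
        "mae": mae(test_y, predicted),
        "mape": mape(test_y, predicted),
    }
    print(name, scores[name])

# Refit the better-scoring model on all 30 days and forecast the next week.
final_model = fit_trend(days, confirmed, degree=2)
future_days, next_week = forecast(final_model, last_day=int(days[-1]))
print(dict(zip(future_days.tolist(), np.round(next_week, 1).tolist())))
```

Because the synthetic series is exactly quadratic, the degree-2 fit reproduces the held-out week almost perfectly while the linear fit underestimates it; on real epidemiological data the gap would be smaller and the polynomial degree would need to be chosen with care to avoid overfitting.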