ABSTRACT
This paper argues that predicting COVID-19 outbreaks benefits healthcare centres, because it reduces their burden by helping them plan the best means of diagnosis and medication for COVID-19 patients. The project focuses on an epidemiological dataset of COVID-19 patients. By analysing that data, we develop data mining models that predict the outbreak of COVID-19 in the upcoming days and the death rate of COVID-19 infected patients. The project is intended to help the healthcare sector fight COVID-19 and to help the government estimate the medical needs (such as equipment, medicines and accommodation) that must be provided to COVID-19 patients in the upcoming days. Data on the number of cases and deaths to date is downloaded dynamically from the website. The system then examines the historical data and generates predictions of cases and deaths for the upcoming days. The system uses Neural Network and Polynomial Regression methodologies. It is useful for organizations trying to help patients who suffer from a lack of medical facilities, and it can help decrease the death rate by indicating the quantity of medical supplies that people will need.
INTRODUCTION
COVID-19 is a disease caused by a new virus, which emerged in Wuhan. People can catch COVID-19 from others who have the virus. The virus spreads through small droplets from the nose or mouth that are expelled by coughing or exhaling. It is worth noting that the virus stops spreading when it can no longer find new hosts to infect. It is therefore important to investigate the growth of transmission and to predict future transmission. Mathematical models based on machine learning are chosen to predict the outbreak of the virus. Machine learning and Neural Network techniques are implemented using Python libraries to predict the total number of confirmed cases and deaths. This prediction supports specific decisions based on transmission growth, such as extending the lockdown phase, carrying out sanitation, and meeting hospital needs and providing daily support. To date, no definitive solution to COVID-19 has been developed. The vast majority of measures taken, both at the country level and individually, aim to prevent the transmission of the virus to more people. Because of the uncertainty in the transmission dynamics of SARS-CoV-2 and the high certainty in its virulence, it is understandable that early responses have relied on blunt interventions, such as movement bans and closures, to save lives. Given the increasing caseload, there is an urgent need to augment medical and economic capabilities to face this critical illness. Hence, the scientific challenge now is to identify, through inference and simulation, measures that could provide as-good or better protection with less social cost. The growing emphasis on machine learning techniques in medical fields can provide the right environment for change and improvement.
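To make the polynomial regression idea mentioned above concrete, the following is a minimal sketch that uses only the Python standard library. It is not the project's actual implementation: the function names `fit_polynomial` and `predict`, the choice of degree 2, and the case counts are all hypothetical and chosen purely for illustration. In practice a library routine such as `numpy.polyfit` would normally replace the hand-written solver.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V[i][j] = xs[i] ** j.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - known) / a[r][r]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical (synthetic) cumulative case counts for days 0..9.
days = list(range(10))
cases = [2 * d * d + 3 * d + 5 for d in days]

coeffs = fit_polynomial(days, cases, degree=2)
print(round(predict(coeffs, 10)))  # forecast for day 10
```

Because the synthetic counts follow an exact quadratic, the fit recovers the curve and extrapolates it one day ahead. Real case data is noisy, and a polynomial extrapolates poorly far beyond the observed range, so the degree and the forecast horizon would both need to be validated against held-out days.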
To address this novel global pandemic, the WHO, scientists and clinicians in the medical industry are searching for new technology to screen infected patients at various stages, find the best clinical trials, control the spread of the virus, develop a vaccine against the disease, and trace contacts. The role of data science in this scenario is to help speed up the process. Machine learning has proven invaluable in predicting risks in many spheres, and since the virus began to spread, its application has been helping us fight the pandemic. Like never before, people all around the world are collecting and sharing what they learn about the virus. Starting from this, the main goal of this work is to shine a light on their efforts, highlighting the importance of machine learning in tackling SARS-CoV-2 (Figure 1.1).
2 DATA MINING:
Data mining is the process of extracting and discovering patterns in large data sets, using methods at the intersection of machine learning, statistics and database systems. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information (with intelligent methods) from a data set and transform it into a comprehensible structure for further use. Data mining is the analysis step of the “knowledge discovery in databases” process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The term “data mining” is a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself. It is also a buzzword that is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics), as well as to any application of computer decision support systems, including artificial intelligence (e.g., machine learning) and business intelligence. The book Data Mining: Practical Machine Learning Tools and Techniques with Java (which covers mostly machine learning material) was originally to be named just Practical Machine Learning, and the term data mining was added only for marketing reasons. Often the more general terms (large-scale) data analysis and analytics are more appropriate or, when referring to actual methods, artificial intelligence and machine learning.
The actual data mining task is the semi-automatic or automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data and may be used in further analysis or, for example, in machine learning and predictive analytics. For instance, the data mining step might identify multiple groups in the data, which a decision support system can then use to obtain more accurate predictions. Data collection, data preparation, and result interpretation and reporting are not part of the data mining step, but they do belong to the overall KDD process as additional steps. The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset (e.g., analyzing the effectiveness of a marketing campaign) regardless of the amount of data; in contrast, data mining uses machine learning and statistical models to uncover hidden patterns in a large volume of data. The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods on samples of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used to create new hypotheses to test against the larger data population.
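As a small illustration of one of the pattern types named above (anomaly detection), the sketch below flags unusual records with a simple z-score rule. This is only one of many possible techniques and is not taken from the project itself: the function name `find_anomalies`, the threshold of 2.0, and the daily counts are hypothetical.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the indices of values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily counts with one unusual record at index 5.
daily_counts = [10, 12, 11, 13, 12, 95, 11, 12]
print(find_anomalies(daily_counts))  # prints [5]
```

A z-score rule assumes the bulk of the data is roughly symmetric around its mean. For strongly trending series such as epidemic curves, it would more plausibly be applied to residuals or day-to-day changes than to the raw counts.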