ABSTRACT
Road transport infrastructure is currently failing to cope with the exponential
increase in the vehicle population. Computing the fastest driving routes and
forecasting accidents under varying traffic conditions is therefore an essential
problem in modern navigation systems. To address it, this project investigates a
transport department dataset with ensemble learning methods in order to select the
best road by forecasting accidents, using the most accurate model found by comparing
supervised machine learning algorithms. In statistics and machine learning, ensemble
methods combine multiple learning algorithms to obtain better predictive performance
than a single algorithm could achieve alone. The dataset is analyzed with supervised
machine learning techniques (SMLT) to capture several kinds of information: variable
identification, univariate, bivariate and multivariate analysis, and missing-value
treatment. Data validation, data cleaning and preparation, and data visualization are
carried out on the entire dataset. In addition, the performance of the various machine
learning algorithms on the transport department dataset is compared and discussed,
and a GUI-based road accident prediction built on the given attributes is evaluated.
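
To make the idea of combining several learning algorithms concrete, the following is a minimal sketch of max (majority) voting, one of the ensemble techniques listed in the contents. It is not the project's implementation: the function name `max_vote` and the three prediction lists are hypothetical, and the label values are made up purely for illustration.

```python
from collections import Counter


def max_vote(predictions_per_model):
    """Combine class labels from several models by majority (max) voting.

    predictions_per_model: list of equal-length label lists, one per model.
    Returns one label per sample: the label most models voted for.
    """
    combined = []
    # zip(*...) groups the votes of all models for the same sample
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three classifiers on five road segments
# (1 = accident predicted, 0 = no accident predicted); illustrative values only.
logistic_regression = [1, 0, 1, 1, 0]
decision_tree = [1, 1, 0, 1, 0]
random_forest = [0, 0, 1, 1, 0]

ensemble = max_vote([logistic_regression, decision_tree, random_forest])
print(ensemble)
```

With an odd number of binary classifiers, as assumed here, no ties can occur; a real system would need an explicit tie-breaking rule.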
TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
    1.1 EXISTING SYSTEM
    1.2 PROPOSED SYSTEM
        1.2.1 Ensemble learning
        1.2.2 Max Voting
        1.2.3 Averaging
        1.2.4 Weighted Average
        1.2.5 Voting based Ensemble learning
    1.3 AIM
    1.4 SCOPE
    1.5 OBJECTIVE

2 LITERATURE SURVEY
    2.1 LITERATURE SURVEY

3 METHODOLOGY
    3.1 METHODOLOGIES
        3.1.1 Sequential Ensemble learning (Boosting)
        3.1.2 Parallel Ensemble Learning (Bagging)
        3.1.3 Stacking & Blending
    3.2 FEASIBILITY STUDY
        3.2.1 Data Wrangling
        3.2.2 Data collection
        3.2.3 Preprocessing
    3.3 CONSTRUCTION OF A PREDICTIVE MODEL
        3.3.1 Dataflow diagram for machine learning
        3.3.2 Work flow diagram
        3.3.3 UML Diagram
            3.3.3.1 Use Case Diagram
            3.3.3.2 Activity Diagram
        3.3.4 Sequence Diagram
    3.4 PROJECT REQUIREMENTS
    3.5 ENVIRONMENTAL REQUIREMENTS
        3.5.1 Software Description
        3.5.2 Anaconda Navigator
        3.5.3 Conda
        3.5.4 The Jupyter Notebook
        3.5.5 Notebook document
        3.5.6 Jupyter Notebook App
        3.5.7 Kernel
        3.5.8 Notebook Dashboard
    3.6 MODULE-01
        3.6.1 Data validation process
        3.6.2 Data Validation
        3.6.3 Data Pre-processing
    3.7 MODULE-02
        3.7.1 Exploration data analysis of visualization
        3.7.2 Training the Dataset
        3.7.3 Testing the Dataset
    3.8 MODULE-03
        3.8.1 Logistic Regression
        3.8.2 Decision Tree
    3.9 MODULE-04
        3.9.1 Support Vector Machines (SVM)
        3.9.2 Random Forest
    3.10 MODULE-05
        3.10.1 K-Nearest Neighbor (KNN)