ABSTRACT
This paper applies machine learning to the detection and classification of baby cries. Crying is a form of communication through which infants express their feelings. A baby’s cry can be characterized by its natural periodic tone and by changes in the voice. With automatic cry detection, parents can monitor their baby remotely and be alerted only when it matters. Detecting a baby’s cry in audio signals is a crucial step in applications such as remote baby monitoring, and it is also important for researchers who study the relation between cry signal patterns and other developmental parameters. Sound recognition in this study involves feature extraction followed by classification of the sound pattern. We use Mel-frequency cepstral coefficients (MFCC) for feature extraction and K-Nearest Neighbor (KNN) for classification. KNN is a classification method that is often used for audio data. The KNN classifier is reported to yield considerably better results than other classifiers.
Key Words: Signal patterns, developmental parameters, sound recognition, Feature extraction, MFCC, K-Nearest Neighbors (KNN), speech signal processing
INTRODUCTION
Cry signals and cry patterns have been studied for many years. Researchers have found that cry signals can give a detailed picture of the physical and mental state of a newborn. According to WHO research, nearly 40% of infant deaths each year occur in the first 30-50 days of life, 72% of infant deaths happen in the first week after birth, and up to two-thirds of infant lives can be saved if the cause is known much earlier. Techniques that identify early signs of problems with infant health can therefore help reduce infant mortality. The overarching goal of this work is to develop a reliable system that can identify diseases from cry sound analysis alone. Developing such a system first requires solving the problem of finding reliable cry components or patterns in an input waveform. The NCDS system is likely to be confused if the input signal contains noise in addition to the cry itself.
Hence, the greatest challenge in designing a diagnostic system is to build an automatic detector that precisely locates the inspiratory and expiratory parts of a crying pattern. Research on the relation between diseases and cry signals has produced useful results for the automatic audio segmentation of the expiratory and inspiratory parts of infant cries. If we can segment cry signals and examine the vital parts of a pre-recorded sound signal, it becomes much simpler to develop a completely automatic system that helps in understanding diseases. Such a system can support our decisions when interpreting infant cries, so that symptoms are recognized earlier and the necessary steps are taken efficiently and at low cost.
Recent studies have shown that infants cry for several reasons, including hunger, fatigue, discomfort, and pain. Experienced observers such as paediatricians and health workers can distinguish between types of infant cries and thus anticipate the infant’s needs from the cry sounds, gestures, and other behaviour. For parents who lack this experience, however, interpreting a cry is a real practical problem. This project provides an automatic method for infant cry classification, trained on a data set of five different types of infant cry.
Hence, the main objective is to extract useful features from the infant cry audio signal, test an unknown cry signal against the trained classifier, and determine the meaning of the cry, so that the infant can be cared for accordingly.
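As a rough illustration of the feature-extraction step, the sketch below computes MFCCs for a single audio frame using only the Python standard library. The frame length, sample rate, number of mel filters, and number of coefficients are assumptions chosen for the example and are not taken from this project, and all function names are hypothetical. In practice a library such as LibROSA would normally compute these features over every frame of a recording.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window, then return the one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (n_filters + 1)
                  for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for b in range(left, centre):       # rising edge of the triangle
            row[b] = (b - left) / (centre - left)
        for b in range(centre, right):      # falling edge of the triangle
            row[b] = (right - b) / (right - centre)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    log_energies = [math.log(e + 1e-10) for e in energies]  # epsilon avoids log(0)
    return dct2(log_energies, n_coeffs)


# Assumed values for the demo: 8 kHz audio, 256-sample frame, synthetic 440 Hz tone.
SAMPLE_RATE = 8000
FRAME_LEN = 256
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
features = mfcc_frame(tone, SAMPLE_RATE)
print(len(features))  # one 13-dimensional feature vector for this frame
```

A real recording would be split into overlapping frames, and the per-frame vectors (or summary statistics of them) would form the input to the classifier.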
K-Nearest Neighbor Algorithm
K-Nearest Neighbor (K-NN) is one of the simplest machine learning algorithms and is based on supervised learning. It assumes that a new case is similar to the available cases and assigns it to the category whose stored examples it most resembles. The algorithm stores all the available data and classifies a new data point by similarity, so new data can be assigned to a well-suited category as soon as it appears. K-NN can be used for regression as well as classification, but it is mostly used for classification problems. It is a simple non-parametric method, meaning that it makes no assumptions about the underlying data. It is also called a lazy learner because it does not learn from the training set immediately: the training phase just stores the dataset, and the work is done at classification time, when a sample is assigned to the majority class among its k nearest neighbours in the feature space. For a brute-force implementation, the time complexity is O(1) in training and O(nd + kn) in testing, where n, d, and k are the number of instances, the data dimension, and the number of neighbours, respectively.
K-NN is one of the most basic yet essential classification algorithms in machine learning and finds extensive application in pattern recognition, data mining, and intrusion detection. It is widely applicable in real-life scenarios because it is non-parametric (as opposed to algorithms such as GMM, which assume a Gaussian distribution of the given data).
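The classification rule described above can be sketched in a few lines of Python. This is a minimal brute-force version using Euclidean distance and majority voting; the two-dimensional feature vectors and the labels are made-up toy data, and the function names are hypothetical rather than taken from this project.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points closest to query.

    There is no separate training step: the stored data is only consulted
    here, at prediction time, which is why K-NN is called a lazy learner.
    """
    by_distance = sorted(
        (euclidean(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data (an assumption for illustration): two well-separated clusters.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["hunger", "hunger", "hunger", "pain", "pain", "pain"]

print(knn_predict(train_x, train_y, (1.1, 1.0), k=3))  # hunger
print(knn_predict(train_x, train_y, (3.9, 4.1), k=3))  # pain
```

Sorting all distances keeps the sketch short; for large datasets, a partial selection of the k smallest distances or a spatial index would be the usual choice.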
Advantages and Disadvantages of KNN:
Advantages:
● It is a very simple algorithm to understand and interpret.
● It is very useful for nonlinear data because it makes no assumptions about the data.
● It is versatile: it can be used for classification as well as regression.
● It achieves relatively high accuracy, although some supervised learning models perform much better than KNN.
Disadvantages:
● It is computationally somewhat expensive because every prediction compares the query against all the stored training data.
● It requires more memory than other supervised learning algorithms.
● Prediction is slow when N, the number of training samples, is large.
● It is very sensitive to the scale of the data and to irrelevant features.
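The sensitivity to feature scale noted in the last point can be seen in a small example. The sketch below standardizes each feature to zero mean and unit variance before measuring distances; the two features (a pitch in Hz and an energy ratio between 0 and 1) and their values are invented for illustration, and the helper names are hypothetical.

```python
import math


def fit_standardizer(rows):
    """Compute the per-feature mean and standard deviation of the training rows."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        variance = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(variance) or 1.0)  # guard against constant features
    return means, stds


def standardize(row, means, stds):
    """Rescale one row using statistics learned from the training data."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def nearest_index(rows, query):
    """Index of the row with the smallest Euclidean distance to query."""
    def dist(row):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, query)))
    return min(range(len(rows)), key=lambda i: dist(rows[i]))


# Invented samples: (pitch in Hz, energy ratio in [0, 1]).
train = [(300.0, 0.10), (320.0, 0.90), (500.0, 0.12)]
query = (330.0, 0.11)

means, stds = fit_standardizer(train)
scaled_train = [standardize(r, means, stds) for r in train]
scaled_query = standardize(query, means, stds)

print(nearest_index(train, query))                # 1: pitch dominates the raw distance
print(nearest_index(scaled_train, scaled_query))  # 0: both features now count
```

On the raw values the nearest neighbour is chosen almost entirely by pitch, because its numeric range is hundreds of times larger; after standardization the sample with a similar energy ratio wins instead.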
Abstract:
The “Baby Cry Classification” project aims to develop an intelligent system that can analyze and classify different types of baby cries to assist parents and caregivers in understanding their baby’s needs. The project leverages machine learning techniques for audio signal processing and utilizes web technology to create a user-friendly interface for easy accessibility.
Existing System:
Currently, there is a lack of automated systems capable of accurately classifying various baby cries. Most existing systems focus on basic audio monitoring without the ability to distinguish between different cry patterns, making it challenging for parents to interpret their baby’s needs effectively.
Proposed System:
The proposed system introduces a robust solution for baby cry classification using machine learning algorithms. By analyzing acoustic features of baby cries, the system aims to categorize cries into distinct patterns corresponding to different needs, such as hunger, discomfort, or sleepiness. The system will be accessible through a web interface, providing an intuitive platform for users.
System Requirements:
- Hardware:
  - Standard computer hardware with sufficient processing power for machine learning tasks.
  - Microphones or audio recording devices for capturing baby cries.
- Software:
  - Python for machine learning model development.
  - Web development frameworks (e.g., Django or Flask) for creating the user interface.
  - Audio processing libraries (e.g., LibROSA) for extracting features from baby cries.
Algorithms:
The project will explore various machine learning algorithms for audio classification, including:
- Convolutional Neural Networks (CNNs) for feature extraction.
- Long Short-Term Memory networks (LSTMs) for sequence modeling.
- Ensemble methods to enhance overall classification accuracy.
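One simple ensemble method is hard majority voting, in which each model predicts a label and the most frequent label wins. The sketch below illustrates only the voting step; the three stand-in "models" are hypothetical threshold rules written for the example, not the CNN or LSTM models listed above, and the feature values are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often; ties go to the earliest such label."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(models, features):
    """Ask every model for a label, then combine the answers by majority vote."""
    return majority_vote([model(features) for model in models])


# Hypothetical stand-ins for trained classifiers (assumptions for illustration).
def model_a(features):
    return "hunger" if features[0] > 0.5 else "discomfort"


def model_b(features):
    return "hunger" if features[1] > 0.5 else "sleepiness"


def model_c(features):
    return "hunger" if sum(features) > 1.0 else "discomfort"


models = [model_a, model_b, model_c]
print(ensemble_predict(models, (0.2, 0.9)))  # hunger (two of three votes)
print(ensemble_predict(models, (0.2, 0.1)))  # discomfort (two of three votes)
```

Voting helps most when the individual models make different kinds of errors; averaging predicted probabilities is a common alternative when the models expose them.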
Hardware and Software Requirements:
- Hardware:
  - A standard computer with sufficient RAM and GPU capabilities for model training.
  - Microphones or audio recording devices for collecting baby cry samples.
- Software:
  - Python for implementing machine learning algorithms.
  - Audio processing libraries (e.g., LibROSA) for feature extraction.
  - Web development frameworks (e.g., Django or Flask) for building the user interface.
  - Database system (e.g., SQLite or PostgreSQL) for storing and managing cry data.
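As one possible shape for the cry data store, the sketch below uses Python's built-in sqlite3 module. The table name, column names, and helper functions are assumptions made for this example; the project does not specify a schema.

```python
import sqlite3

# Hypothetical schema: one row per analysed recording.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cry_samples (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    recorded_at     TEXT NOT NULL,   -- ISO 8601 timestamp
    audio_path      TEXT NOT NULL,   -- where the recording is stored
    predicted_label TEXT,            -- e.g. hunger, discomfort, sleepiness
    confidence      REAL             -- classifier score, if available
)
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_prediction(conn, recorded_at, audio_path, label, confidence):
    """Insert one classified recording and return its row id."""
    cursor = conn.execute(
        "INSERT INTO cry_samples (recorded_at, audio_path, predicted_label, confidence) "
        "VALUES (?, ?, ?, ?)",
        (recorded_at, audio_path, label, confidence),
    )
    conn.commit()
    return cursor.lastrowid


def label_counts(conn):
    """Count stored recordings per predicted label, e.g. for a history chart."""
    rows = conn.execute(
        "SELECT predicted_label, COUNT(*) FROM cry_samples "
        "GROUP BY predicted_label ORDER BY predicted_label"
    ).fetchall()
    return dict(rows)


store = open_store()  # in-memory database for the demo
save_prediction(store, "2024-01-01T08:00:00", "cries/a.wav", "hunger", 0.91)
save_prediction(store, "2024-01-01T11:30:00", "cries/b.wav", "hunger", 0.78)
save_prediction(store, "2024-01-01T14:15:00", "cries/c.wav", "sleepiness", 0.66)
print(label_counts(store))  # {'hunger': 2, 'sleepiness': 1}
```

The same table layout would carry over to PostgreSQL with minor type changes, and the parameterized queries avoid building SQL from user input.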
Architecture:
The system will consist of three main components:
- Data Collection: Acquiring and recording baby cries for model training.
- Machine Learning Model: Developing and training the model to classify different cry patterns.
- Web Interface: Creating a user-friendly web application for easy interaction with the system.
Technologies Used:
- Machine Learning: Python, scikit-learn, TensorFlow, Keras.
- Web Development: Django or Flask, HTML5, CSS, JavaScript.
- Database: SQLite or PostgreSQL.
- Audio Processing: LibROSA or similar libraries.
Web User Interface:
The web interface will provide users with:
- An intuitive dashboard displaying real-time cry analysis results.
- Historical data visualization for tracking cry patterns over time.
- User-friendly options for inputting additional contextual information.
The “Baby Cry Classification” project aims to provide an innovative and practical solution for parents and caregivers, enhancing their ability to respond effectively to their baby’s needs through advanced audio analysis and classification.