International Journal of All Research Education & Scientific Methods

An ISO Certified Peer-Reviewed Journal

ISSN: 2455-6211

Diabetes Prediction with improved Data Preprocessing and using Linear Support Vector Classifier

Author Name : Revathi Addala, Dr. Chandra Sekhar Vasamsetty

ABSTRACT

Diabetes mellitus is a chronic disease characterized by high blood sugar, and it can become dangerous if left untreated. Loss of beta cells in the pancreas impairs the production of insulin, the hormone that allows the body's cells to take up glucose derived from carbohydrates and use it for energy. The number of diabetic patients rises every day, and several microvascular and macrovascular complications are associated with the disease. Early prediction of diabetes therefore helps a person maintain a healthy life. Machine learning has been applied in many fields to solve real-world problems. This paper presents an approach for predicting diabetes on the PIMA dataset using a supervised machine learning (ML) technique, the support vector machine (SVM). The paper focuses on outliers that arise from high skewness in the data. The Box-Cox transformation is used to reduce skewness, and with it the outliers that skewed features tend to produce, by transforming the data toward a normal distribution. This is done in the data preprocessing step. The data are then split into 80% for training and 20% for testing, and a model is built with the linear SVC method, which is based on the liblinear library, supports a linear kernel, and scales better to large numbers of data points. The resulting support vector machine model yielded an accuracy of 88% on the test data. After parameter tuning, the accuracy improved to 92.20%.

Keywords: Diabetes, Linear Support Vector Machine, Skewness, Parameter tuning.
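
The skewness-reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The sample values are hypothetical (they are not taken from the PIMA data), and the helper names `box_cox`, `skewness`, `box_cox_log_likelihood` and `best_lambda`, as well as the lambda grid from -2 to 2, are assumptions made for illustration. The sketch picks the Box-Cox lambda by maximizing the standard profile log-likelihood over that grid and then compares skewness before and after the transformation. Note that Box-Cox requires strictly positive inputs, so features containing zeros would need to be shifted or otherwise handled first.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        # The limit of (x**lam - 1) / lam as lam -> 0 is log(x).
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Population skewness: third central moment divided by std**3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model (up to a constant)."""
    n = len(values)
    y = box_cox(values, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick the lambda on a coarse grid that maximizes the log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # assumed grid: -2.0 .. 2.0
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))


# Hypothetical right-skewed feature values (illustration only).
sample = [15.0, 18.0, 23.0, 36.0, 44.0, 54.0, 76.0, 94.0, 140.0, 480.0, 846.0]

lam = best_lambda(sample)
transformed = box_cox(sample, lam)

print("chosen lambda:", lam)
print("skewness before:", round(skewness(sample), 3))
print("skewness after: ", round(skewness(transformed), 3))
```

In practice a library routine such as `scipy.stats.boxcox` would normally be used to estimate lambda. The grid search here only keeps the sketch dependency-free.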