A Linear Model Based on Principal Component Analysis for Disease Prediction

Abstract:

Various classification methods are applied in the medical field to predict diseases such as diabetes and tuberculosis. Diabetes can be diagnosed by comparing a patient's blood sugar level against known normal levels, together with measurements such as blood pressure, BMI, and skin thickness. Several classification methods have been applied to diabetes data. The main aim of this paper is to build a statistical model for diabetes data that achieves better classification accuracy. Features extracted from the diabetes data are projected into a new space using principal component analysis (PCA), and a linear regression model is then fitted to the newly formed attributes. This method achieves an accuracy of 82.1% in predicting diabetes, an improvement over existing classification methods.

Existing System:

The attributes of the Pima Indians Diabetes Dataset (PIDD) are inspected from different angles to obtain the information required for processing the data, so feature extraction is a major step in examining PIDD. The work concentrates on mapping feature values from PIDD into a new feature space using PCA. The new feature values are inspected for their importance and relevance and are then passed to a data mining method, such as a linear regression model (LRM), to classify the data and predict diabetes.
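As an illustration of the projection step, the following is a minimal sketch of PCA in plain Python. It assumes a small invented numeric table in place of PIDD. The function names (center, covariance, top_component, pca), the toy rows, and the choice of power iteration with deflation are assumptions made for this sketch and are not taken from the paper. In practice a numerical library would normally do this work.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500, seed=0):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigval


def pca(rows, k):
    """Return the first k principal directions and the projected scores."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v, lam = top_component(cov)
        components.append(v)
        # Deflation: remove the direction just found before searching for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return components, scores


# Invented toy rows (three attributes per record); NOT values from PIDD.
rows = [[2.0, 1.9, 0.5], [4.0, 4.1, 0.4], [6.0, 6.0, 0.6],
        [8.0, 7.9, 0.5], [10.0, 10.1, 0.4]]
components, scores = pca(rows, k=2)
```

Each row of scores holds the new feature values for one record, ordered so that the first column carries the most variance.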

Disadvantage:

Model building aims to provide a good fit to a given set of data. A linear statistical model estimates the unknown dependent PIDD feature value from the known independent PIDD feature values. Representing the relationship between a dependent PIDD feature and a set of multiple independent PIDD features is known as regression analysis.
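The regression relationship described above can be written out as a short worked formula. The notation is an assumption introduced here for illustration and does not come from the paper: y_i is the dependent value for record i, x_{i1}, ..., x_{ip} are its independent feature values, and the beta terms are the coefficients to be estimated by ordinary least squares.

```latex
% Linear model for record i with p independent features
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i ,
\qquad i = 1, \dots, n

% Matrix form, where X has a leading column of ones for the intercept
\mathbf{y} = X \boldsymbol{\beta} + \boldsymbol{\varepsilon}

% Ordinary least-squares estimate (assuming X^{\top} X is invertible)
\hat{\boldsymbol{\beta}} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}
```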

Proposed System:

The proposed method extracts a new group of features from PIDD using PCA. The resulting values are inspected for their importance and relevance and are then passed to a data mining method, such as the Linear Regression Model (LRM), to classify the data and predict diabetes.
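The classification step can be sketched as follows, assuming the projected feature values are already available. This is a minimal plain-Python illustration and not the paper's implementation. The function names (solve, fit_linear, predict, accuracy), the 0.5 decision threshold, and the toy scores and labels are all assumptions. The toy data is invented and separates cleanly, so its accuracy says nothing about the 82.1% reported for PIDD.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_linear(scores, labels):
    """Least-squares coefficients [intercept, w1, ..., wk] via the normal equations."""
    X = [[1.0] + list(s) for s in scores]  # leading 1.0 models the intercept
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, labels)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, scores, threshold=0.5):
    """Turn the regression output into a 0/1 class by thresholding (assumed rule)."""
    preds = []
    for s in scores:
        y_hat = coef[0] + sum(w * v for w, v in zip(coef[1:], s))
        preds.append(1 if y_hat >= threshold else 0)
    return preds


def accuracy(preds, labels):
    """Fraction of records whose predicted class matches the label."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Invented PCA scores (two components per record) and 0/1 labels; NOT PIDD data.
scores = [[-2.1, 0.3], [-1.4, -0.2], [-0.9, 0.1],
          [0.8, -0.4], [1.5, 0.2], [2.1, 0.0]]
labels = [0, 0, 0, 1, 1, 1]

coef = fit_linear(scores, labels)
preds = predict(coef, scores)
print("training accuracy on toy data:", accuracy(preds, labels))
```

Thresholding a linear regression output is only one way to obtain class labels. A real evaluation would also hold out test records instead of scoring on the training data.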