Intelligent Heart Disease Prediction System Using Data Mining Technique
ABSTRACT
The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not “mined” to discover hidden information for effective decision making. Discovery of hidden patterns and relationships often goes unexploited. Advanced data mining techniques can help remedy this situation. This research has developed a prototype Intelligent Heart Disease Prediction System (IHDPS) using data mining techniques, namely, Decision Trees, Naïve Bayes and Neural Network. Results show that each technique has its unique strength in realizing the objectives of the defined mining goals. IHDPS can answer complex “what if” queries which traditional decision support systems cannot. Using medical profiles such as age, sex, blood pressure and blood sugar it can predict the likelihood of patients getting a heart disease. It enables significant knowledge, e.g. patterns, relationships between medical factors related to heart disease, to be established. IHDPS is Web-based, user-friendly, scalable, reliable and expandable. It is implemented on the .NET platform.
Existing Systems:
Clinical decisions are often made based on doctors' intuition and experience rather than on the knowledge-rich data hidden in the database.
This practice leads to unwanted biases, errors and excessive medical costs which affect the quality of service provided to patients.
There are many ways that a medical misdiagnosis can present itself. Whether a doctor is at fault, or hospital staff, a misdiagnosis of a serious illness can have very extreme and harmful effects.
The National Patient Safety Foundation cites that 42% of medical patients feel they have experienced a medical error or missed diagnosis. Patient safety is sometimes negligently given the back seat for other concerns, such as the cost of medical tests, drugs, and operations.
Medical misdiagnoses are a serious risk to our healthcare profession. If they continue, then people will fear going to the hospital for treatment. We can put an end to medical misdiagnosis by informing the public and filing claims and suits against the medical practitioners at fault.
Proposed Systems:
This practice leads to unwanted biases, errors and excessive medical costs which affect the quality of service provided to patients.
Thus we proposed that integration of clinical decision support with computer-based patient records could reduce medical errors, enhance patient safety, decrease unwanted practice variation, and improve patient outcome.
This suggestion is promising as data modeling and analysis tools, e.g., data mining, have the potential to generate a knowledge-rich environment which can help to significantly improve the quality of clinical decisions.
The main objective of this research is to develop a prototype Intelligent Heart Disease Prediction System (IHDPS) using three data mining modeling techniques, namely, Decision Trees, Naïve Bayes and Neural Network.
Besides supporting effective treatment, it also helps to reduce treatment costs. To enhance visualization and ease of interpretation,
Modules:
1. Analyzing the Data set
2. Naive Bayes Implementation in Mining
3. Designing the Questionnaire
4. Heart Disease in Web
Analyzing the Data set:
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question. It lists values for each of the variables, such as height and weight of an object or values of random numbers. Each value is known as a datum. The data set may comprise data for one or more members, corresponding to the number of rows. The values may be numbers, such as real numbers or integers, for example representing a person's height in centimeters, but may also be nominal data (i.e., not consisting of numerical values), for example representing a person's ethnicity. More generally, values may be of any of the kinds described as a level of measurement. For each variable, the values will normally all be of the same kind. However, there may also be "missing values", which need to be indicated in some way.
A total of 909 records with 15 medical attributes (factors) were obtained from the Heart Disease database. The attributes are listed below. The records were split equally into two datasets: training dataset (455 records) and testing dataset (454 records). To avoid bias, the records for each set were selected randomly. The attribute “Diagnosis” was identified as the predictable attribute with value “1” for patients with heart disease and value “0” for patients with no heart disease. The attribute “PatientID” was used as the key; the rest are input attributes. It is assumed that problems such as missing data, inconsistent data, and duplicate data have all been resolved.
Here in our project we get the data set from a .dat file. Our file reader program reads the records from this file and passes them as input to the Naïve Bayes based mining process.
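A minimal sketch of how such a file reader might load the records and split them at random into training and testing sets. The comma-separated layout, the column order, the sample rows and the function names are all assumptions made for illustration; the format of the .dat file is not described here.

```python
import csv
import io
import random

def read_records(stream):
    """Parse comma-separated rows into lists of strings, skipping blank lines."""
    return [row for row in csv.reader(stream) if row]

def split_records(records, seed=0):
    """Shuffle a copy of the records and split it into two near-equal halves."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    half = (len(shuffled) + 1) // 2  # the training set takes the extra record
    return shuffled[:half], shuffled[half:]

# Hypothetical sample rows: PatientID, Sex, ChestPainType, Diagnosis
sample = io.StringIO("1,1,4,1\n2,0,3,0\n3,1,2,0\n4,0,4,1\n5,1,1,0\n")
records = read_records(sample)
training, testing = split_records(records)
print(len(training), len(testing))
```

With a real file, the in-memory sample would be replaced by an opened file object. A fixed seed keeps the random split reproducible between runs.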
Naive Bayes Implementation in Mining:
I recommend using Probability For Data Mining for a more in-depth introduction to Density estimation and general use of Bayes Classifiers, with Naive Bayes Classifiers as a special case. But if you just want the executive summary bottom line on learning and using Naive Bayes classifiers on categorical attributes then these are the slides for you. Bayes' Theorem finds the probability of an event occurring given the probability of another event that has already occurred. If B represents the dependent event and A represents the prior event, Bayes' theorem can be stated as follows:

P(B | A) = P(A | B) P(B) / P(A)

Designing the Input attributes
Questionnaires have advantages over some other ways of collecting medical symptoms in that they are cheap, do not require as much effort from the questioner as verbal or telephone surveys, and often have standardized answers that make it simple to compile data. However, such standardized answers may frustrate users. Questionnaires are also sharply limited by the fact that respondents must be able to read the questions and respond to them. Here our questionnaire is based on the attributes given in the data set, so our questionnaire contains:
Input attributes
1. Sex (value 1: Male; value 0: Female)
2. Chest Pain Type (value 1: typical type 1 angina; value 2: typical type angina; value 3: non-angina pain; value 4: asymptomatic)
3. Fasting Blood Sugar (value 1: > 120 mg/dl; value 0: < 120 mg/dl)
4. Restecg - resting electrographic results (value 0: normal; value 1: having ST-T wave abnormality; value 2: showing probable or definite left ventricular hypertrophy)
5. Exang - exercise induced angina (value 1: yes; value 0: no)
6. Slope - the slope of the peak exercise ST segment (value 1: upsloping; value 2: flat; value 3: downsloping)
7. CA - number of major vessels colored by fluoroscopy (value 0 - 3)
8. Thal (value 3: normal; value 6: fixed defect; value 7: reversible defect)
9. Trest Blood Pressure (mm Hg on admission to the hospital)
10. Serum Cholesterol (mg/dl)
11. Thalach - maximum heart rate achieved
12. Oldpeak - ST depression induced by exercise relative to rest
13. Age in Year
14. Height in cms
15. Weight in Kgs.
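The Naive Bayes step over categorical attributes such as those listed above can be sketched as follows. This is an illustrative sketch, not the project's implementation: the function names, the Laplace smoothing and the toy rows (Sex, Chest Pain Type, Exang) are assumptions, and the sample values are invented rather than taken from the Heart Disease database.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict_proba(model, row):
    """Return P(class | row) under the naive independence assumption."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # Laplace smoothing (an added assumption) avoids zero probabilities
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score *= numerator / denominator
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}

# Invented toy rows: (Sex, Chest Pain Type, Exang); label 1 = heart disease
rows = [(1, 4, 1), (1, 4, 1), (0, 3, 0), (0, 2, 0), (1, 4, 0), (0, 3, 1)]
labels = [1, 1, 0, 0, 1, 0]
model = train_naive_bayes(rows, labels)
proba = predict_proba(model, (1, 4, 1))
print(proba)
```

Each class score is the prior multiplied by one conditional probability per attribute, and the scores are then normalized so that they sum to one. Continuous attributes such as age or cholesterol would first need to be grouped into categories before they could be used this way.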
Designing Dynamic User Interface
In our heart disease development, modeling and standardized notations allow complex ideas to be expressed precisely, facilitating communication among the project participants, who generally have different technical and cultural backgrounds. MVC architecture has had wide acceptance for corporate software development. It divides the system into three layers that are in charge of the interface, control logic and data access; this facilitates the maintenance and evolution of systems because the classes in each layer are independent. To illustrate a successful application built under MVC, in this work we introduce the different phases of analysis, design and implementation of a database and web application.
Ajax (asynchronous JavaScript and XML), or AJAX, is a group of interrelated web development techniques used for creating interactive web applications or rich Internet applications. With Ajax, web applications can retrieve data from the server asynchronously in the background without interfering with the display and behavior of the existing page. In many cases, the pages on a website consist of much content that is common between them. Using traditional methods, that content would have to be reloaded on every request. However, using Ajax, a web application can request only the content that needs to be updated, thus drastically reducing bandwidth usage and load time. The use of asynchronous requests allows the client's Web browser UI to be more interactive and to respond quickly to inputs, and sections of pages can also be reloaded individually. Users may perceive the application to be faster or more responsive, even if the application has not changed on the server side. The use of Ajax can reduce connections to the server, since scripts and style sheets only have to be requested once.
System Requirement Specification:
Hardware Requirements:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 MB.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• Ram : 1 GB.
Software Requirements:
• Operating system : Windows XP.
• Coding Language : ASP.Net with C#.
• Data Base : SQL Server 2005.