PRACTICUM 2 PRESENTATION
Business Context
To understand the behavior, patterns and
factors that lead to the readmission of
diabetes patients.
Business Expectations
We intend to find the patterns in readmission
of diabetes patients and propose preventive
action to reduce medical costs.
Business Decomposition
From the hospital/health center
perspective, this analysis could be
helpful in procuring medicines,
recruiting doctors based on need
and also preventing the harmful
effects of diabetes.
From a developer/analyst
perspective, translating the
business expectation into a model
is essential. For this we consult
domain experts and draw on their
opinions and knowledge.
Connecting the dots
Based on the domain knowledge, we
have to decide which models are
suitable for the analysis, and we also
need to validate the predictions in
real time.
Problems
The primary roadblock here is the
availability of data. Data is not
available in abundance because
records are not maintained by all
healthcare centers/hospitals.
Building a model
With the data collected, we must proceed to build an efficient model to
predict the diabetes readmission cases.
Data Requirements and Processing
What data is needed for the analysis, and in what form?
Data Requirement
Data provided by UCI Repository
is used in the analysis.
Data Cleaning
For data cleaning and
pre-processing, we imputed
values for age by substituting the
midpoints of the age buckets and
dropped records whose values
are unknown or special
characters.
We also dropped features that do
not provide any meaningful
insight.
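The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the code used in the practicum: the sample rows, the column names, the bucket format `[70-80)` and the placeholder markers `?` and `Unknown/Invalid` are all assumptions made for the example.

```python
import pandas as pd

# Tiny hypothetical sample; column names and placeholder markers are assumptions.
raw = pd.DataFrame({
    "age": ["[0-10)", "[70-80)", "[50-60)", "[40-50)"],
    "gender": ["Female", "Male", "?", "Unknown/Invalid"],
    "readmitted": ["NO", "<30", ">30", "NO"],
})

# Values treated as "unknown"; adjust to whatever the real data uses.
PLACEHOLDERS = ["?", "Unknown/Invalid"]


def bucket_midpoint(bucket: str) -> float:
    """Turn an age bucket such as '[70-80)' into its midpoint (75.0)."""
    low, high = bucket.strip("[)").split("-")
    return (int(low) + int(high)) / 2


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Replace age buckets by midpoints and drop rows with unknown values."""
    out = df.copy()
    out["age"] = out["age"].map(bucket_midpoint)
    has_unknown = out.isin(PLACEHOLDERS).any(axis=1)
    return out.loc[~has_unknown].reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Only the first two sample rows survive, because the other two contain a placeholder value.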
Data Understanding
Making sense out of the data
Step 1
Data collection. We
have collected data
from UCI repository.
Data Cleaning
We imputed values
for age by substituting the
midpoints of the age buckets
and dropped incomplete
records.
Data Visualization
Python plotting libraries like
graphviz, matplotlib and
seaborn were used to
visualize the data.
Feature Engineering
New variables were created in the dataset,
like Service Utilization, clubbing diagnosis
categories, medication changes, which would
provide more meaningful insights on the
dataset.
Modelling and Inferences
We used five models, that is, Random Forests
(Gini and Entropy), Decision Tree (Gini and
Entropy) and Logistic Regression, with two
sets of features and evaluated them.
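The engineered variables named under Feature Engineering might look something like the sketch below. The slides do not define them, so every formula here is an assumption: service utilization is taken as the sum of three visit counts, medication changes as the number of drugs whose dose went up or down, and the diagnosis grouping as a coarse ICD-9 bucketing. Column names and sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumptions for illustration.
visits = pd.DataFrame({
    "number_outpatient": [0, 2, 1],
    "number_emergency": [1, 0, 0],
    "number_inpatient": [0, 3, 0],
    "metformin": ["No", "Up", "Steady"],
    "insulin": ["Down", "Up", "No"],
    "diag_1": ["250.83", "428", "486"],
})

VISIT_COLS = ["number_outpatient", "number_emergency", "number_inpatient"]
MED_COLS = ["metformin", "insulin"]


def diagnosis_group(code: str) -> str:
    """Map an ICD-9 code to a coarse category (illustrative grouping only)."""
    try:
        value = float(code)
    except ValueError:
        return "other"  # non-numeric codes, e.g. those starting with V or E
    if 250 <= value < 251:
        return "diabetes"
    if 390 <= value <= 459 or value == 785:
        return "circulatory"
    if 460 <= value <= 519 or value == 786:
        return "respiratory"
    return "other"


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add the three engineered columns to a copy of the input frame."""
    out = df.copy()
    # Assumed definition: total prior visits of any kind.
    out["service_utilization"] = out[VISIT_COLS].sum(axis=1)
    # Assumed definition: number of drugs whose dose was raised or lowered.
    out["medication_changes"] = out[MED_COLS].isin(["Up", "Down"]).sum(axis=1)
    out["diag_group"] = out["diag_1"].map(diagnosis_group)
    return out


featured = add_features(visits)
print(featured[["service_utilization", "medication_changes", "diag_group"]])
```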
Modelling, Evaluation & Feedback
How good are the models
Step 1
We analyzed the requirements specific to the healthcare domain.
Step 3
We chose five models with high interpretability for each of the feature sets.
Step 5
All the models are built using the train data and validated using the test data. The metric scores of the models are compared.
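The build-and-compare loop described in these steps could be sketched with scikit-learn as below. This is an assumption-laden illustration: the data is synthetic (`make_classification`), the two feature sets are arbitrary column subsets, the hyperparameters are defaults, and accuracy stands in for whichever metric scores the presentation actually compared.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned readmission data.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two hypothetical feature sets, given as column indices.
feature_sets = {"set_a": list(range(8)), "set_b": [0, 1, 2, 3]}


def make_models():
    """Return fresh instances of the five model variants named in the slides."""
    return {
        "rf_gini": RandomForestClassifier(criterion="gini", random_state=0),
        "rf_entropy": RandomForestClassifier(criterion="entropy", random_state=0),
        "dt_gini": DecisionTreeClassifier(criterion="gini", random_state=0),
        "dt_entropy": DecisionTreeClassifier(criterion="entropy", random_state=0),
        "logreg": LogisticRegression(max_iter=1000),
    }


scores = {}
for set_name, cols in feature_sets.items():
    for model_name, model in make_models().items():
        # Fit on the train split only, then score on the held-out test split.
        model.fit(X_train[:, cols], y_train)
        predictions = model.predict(X_test[:, cols])
        scores[(set_name, model_name)] = accuracy_score(y_test, predictions)

for key, value in sorted(scores.items()):
    print(key, round(value, 3))
```

For an imbalanced outcome such as readmission, accuracy alone can be misleading, so recall, precision or AUC may be better choices for the comparison.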
Business Recommendation
Helpful insights that are derived from the analysis
For Hospitals
For hospitals, we highly recommend
decreasing the usage of Repaglinide and
Insulin for treating diabetic patients, as
they appear to increase the odds of
readmission. We also recommend the usage
of Chlorpropamide, as it appears to
decrease the odds.
For Patients
For patients, we highly recommend
following the medical advice and
not leaving before the prescribed
treatment is completed.
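Statements about a drug increasing or decreasing the odds of readmission are typically read off a logistic regression: exponentiating a coefficient gives an odds ratio, and a ratio above 1 means higher odds. The sketch below assumes that is how the recommendations were derived, which the slides do not confirm. The coefficient values and drug names are made-up placeholders, not results from this analysis.

```python
import math

# Placeholder coefficients for illustration only; NOT values from the analysis.
coefficients = {"drug_a": 0.30, "drug_b": -0.20, "drug_c": 0.0}


def odds_ratio(coef: float) -> float:
    """Convert a logistic regression coefficient into an odds ratio."""
    return math.exp(coef)


def direction(coef: float, tol: float = 1e-9) -> str:
    """Describe whether the feature raises or lowers the odds of the outcome."""
    ratio = odds_ratio(coef)
    if ratio > 1 + tol:
        return "increases odds"
    if ratio < 1 - tol:
        return "decreases odds"
    return "no change"


summary = {
    name: (odds_ratio(coef), direction(coef))
    for name, coef in coefficients.items()
}

for name, (ratio, label) in summary.items():
    print(f"{name}: odds ratio {ratio:.3f} ({label})")
```

An odds ratio describes association in the fitted model, not causation, so recommendations based on it should be reviewed with domain experts.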