This blog is my analysis of the data provided in the case "Tahoe Healthcare Systems", from Columbia Business School.  As my professor used to say, if the outcome is wrong, blame the data :)

Preface: 
In the case study, the management team at Tahoe Healthcare Systems has to evaluate an application called CareTracker.  The application carries a per-patient cost, and there are regulatory penalties tied to patient readmission.  Based on the patient dataset, the data analytics team (or person) has to evaluate whether the cost of CareTracker outweighs the penalties, and by how much.  The findings then go to the management team so that a decision can be made.
1.     What is the cost if we apply CareTracker to the entire population?
The dataset contains 4,382 patients.  CareTracker costs $1,200 per patient, and each readmission carries an $8,000 penalty.  If the CareTracker program is applied to the entire population, without any statistical analysis, the cost would be:
4,382 * $1,200 = $5,258,400.
Once a statistical model is applied, the 657 patients in the validation dataset fall into four groups, assuming a cutoff probability of 50%:
a.     498 patients are not flagged for readmission and are not readmitted
       a.     Model prediction and actual outcome match
       b.     Cost = $0
b.     23 patients are flagged for readmission but are not readmitted
       a.     Model is incorrect
       b.     Cost = $1,200 (already covered by CareTracker)
c.     100 patients are readmitted but were not flagged
       a.     Model is incorrect
       b.     Cost = $8,000 penalty, applicable to 60% of these patients, as CareTracker has a 40% success rate
d.     36 patients are flagged for readmission and are readmitted
       a.     Model prediction and actual outcome match
       b.     Cost = $8,000 penalty, applicable to 60% of these patients, as CareTracker has a 40% success rate
As such, two costs need to be considered:
1.     Cost for all validation patients on the CareTracker program = 657 * $1,200 = $788,400
2.     Penalty for the 60% of readmitted patients whose readmission CareTracker does not prevent = 136 * 0.6 * $8,000 = $652,800
Total cost for the validation population (including penalties) = $788,400 + $652,800 = $1,441,200
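The arithmetic in this section can be checked with a short Python sketch. The original analysis was done in RStudio and Excel, so the variable names here are illustrative and not taken from the case files; the figures are the ones quoted above.

```python
# Figures quoted in the text; variable names are illustrative.
CARETRACKER_COST = 1_200   # cost per patient enrolled in CareTracker
READMIT_PENALTY = 8_000    # penalty per readmitted patient
SUCCESS_RATE = 0.40        # share of readmissions CareTracker prevents

# Validation-set error matrix at a 50% cutoff
true_neg, false_pos, false_neg, true_pos = 498, 23, 100, 36

# Whole dataset, no model: every patient is enrolled.
cost_full_population = 4_382 * CARETRACKER_COST

# Validation set with every patient enrolled.
validation_patients = true_neg + false_pos + false_neg + true_pos
program_cost = validation_patients * CARETRACKER_COST

# 60% of all actual readmissions still incur the penalty.
actual_readmits = false_neg + true_pos
penalty_cost = round(actual_readmits * (1 - SUCCESS_RATE) * READMIT_PENALTY)

total_cost = program_cost + penalty_cost
print(f"Full population: ${cost_full_population:,}")
print(f"Validation set:  ${program_cost:,} + ${penalty_cost:,} = ${total_cost:,}")
```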
2.     What is the cost if we use the cutoff value “cleverly” and treat only those flagged at higher risk of readmission?
Six models were evaluated.  While Model-5 and Model-6 are almost identical, Model-5 was selected as the better fit, so the cutoff values were applied to this model.
Model-1: Age as predictor of readmittance
Explanation: Age has a high z value and a low p-value, implying it is significant in explaining readmittance.  However, with a low Pseudo R-Sq of 9% and an AIC of 3320.1, other predictor variables are needed to explain readmittance.
Model-2 : Female (or sex categorical variable) as predictor of readmittance
Explanation: Female has a low z value and a high p-value, implying it is not significant in explaining readmittance.  With a low Pseudo R-Sq of 3% and an AIC of 3343.6 (higher than Model-1), other predictor variables are needed to explain readmittance.
Model-3 : Age and Female (or sex categorical variable) as predictor of readmittance
Explanation: Age and Female combined, with a Pseudo R-Sq of 9%, explain the same variation as Model-1.  The AIC of 3320.7 is slightly higher than Model-1's but lower than Model-2's, so Model-1 remains the better fit for explaining readmittance.
Model-4: Age, Female (or sex categorical variable) and their interaction (age * female) as predictor of readmittance
Explanation: Age, Female and female_age combined, with a Pseudo R-Sq of 9%, explain similar variation to Model-1.  The AIC of 3320.5 is slightly higher than Model-1's but lower than those of Model-2 and Model-3, so Model-1 remains the better fit for explaining readmittance.
Model-5: All variables in the dataset as predictors of readmittance
Explanation: All variables in the dataset, with a Pseudo R-Sq of 45%, explain the most variation in readmittance.  Its AIC of 2742.4 is lower than those of Models 1, 2, 3 and 4, so it is the best fit of the five.  In this model, flu_season, severity of case and comorbidity are the significant explanatory variables.
For Model-5, the Error Matrix shows that the model misclassifies 23 patients who are flagged for readmission but are not readmitted, and 100 patients who are readmitted but were not flagged.  The associated cost is $1,043,600 at a probability cutoff of .5 (or 50%) (see the RStudio and Excel output below).
From RStudio
From Excel
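The $1,043,600 figure is not broken down in the text. One decomposition that reproduces it exactly is sketched below in Python. It is an assumption about the Excel formula, not a confirmed copy of it: the CareTracker fee is charged for every flagged patient, the full penalty for every missed readmission, and the penalty for the 60% of flagged readmissions that CareTracker does not prevent. The function and parameter names are hypothetical.

```python
def targeted_cost(false_pos, false_neg, true_pos,
                  program_cost=1_200, penalty=8_000, success_rate=0.40):
    """Total cost when only flagged patients receive CareTracker (assumed cost model)."""
    flagged = false_pos + true_pos                     # patients the model flags
    enrollment = flagged * program_cost                # CareTracker fee for flagged patients
    missed = false_neg * penalty                       # readmitted, never treated
    unprevented = true_pos * (1 - success_rate) * penalty  # treated, still readmitted
    return round(enrollment + missed + unprevented)


# Error matrix for Model-5 at a 50% cutoff, as quoted in the text
print(targeted_cost(false_pos=23, false_neg=100, true_pos=36))
```

With the 50% cutoff counts, the three components come to $70,800, $800,000 and $172,800.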
When the cutoff is optimized to reduce the highest cost factor (Actual 1 and Predicted 0 = 100 patients), the updated Error Matrix and associated cost for Model-5 are as shown below:
From Excel
Cost Analysis (From Excel)
By applying a cutoff value of 28%, the number of readmitted patients the model fails to flag drops from 100 to 47, and the associated penalty cost drops from $800,000 to $376,000 (47 * $8,000).  The overall cost drops from $1.04M to $1.01M.
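Finding such a cutoff amounts to sweeping candidate thresholds over the model's predicted probabilities and keeping the one with the lowest total cost. The Python sketch below shows the idea. The probabilities and outcomes are made-up toy values, not the case data, and the cost model (fee for every flagged patient, full penalty for a missed readmission, penalty for the 60% of flagged readmissions not prevented) is an assumption rather than the workbook's formula.

```python
def total_cost(probs, actual, cutoff,
               program_cost=1_200, penalty=8_000, success_rate=0.40):
    """Cost of treating only patients whose predicted probability >= cutoff."""
    cost = 0.0
    for p, readmitted in zip(probs, actual):
        flagged = p >= cutoff
        if flagged:
            cost += program_cost                      # CareTracker fee
        if readmitted:
            # Flagged patients avoid the penalty 40% of the time (on average).
            cost += penalty * (1 - success_rate) if flagged else penalty
    return cost


def best_cutoff(probs, actual, candidates):
    """Return the candidate cutoff with the lowest total cost (first one on ties)."""
    return min(candidates, key=lambda c: total_cost(probs, actual, c))


# Hypothetical toy data: predicted probabilities and actual readmission flags
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.80]
actual = [0, 0, 0, 1, 0, 1, 1, 1]
candidates = [c / 100 for c in range(5, 100, 5)]

cutoff = best_cutoff(probs, actual, candidates)
print(cutoff, total_cost(probs, actual, cutoff), total_cost(probs, actual, 0.5))
```

On this toy data the sweep settles on a cutoff below 50%, because missing a readmission costs far more than enrolling a patient who did not need the program.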
Model-6: Flu Season, severity.score, cmorbidity.score and female_age as predictors of readmittance
Explanation: Flu Season, severity.score, cmorbidity.score and female_age, with a Pseudo R-Sq of 45.49%, explain almost exactly the same variation as Model-5.  Its AIC of 2739.5 is lower than Model-5's, but because of its slightly elevated z and p values, Model-5 is still considered better overall.
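For reference, the AIC values reported for the six models can be ranked in a few lines of Python. This only restates the figures quoted above; it does not refit the models.

```python
# AIC values as reported in the model explanations (lower is better)
reported_aic = {
    "Model-1": 3320.1,
    "Model-2": 3343.6,
    "Model-3": 3320.7,
    "Model-4": 3320.5,
    "Model-5": 2742.4,
    "Model-6": 2739.5,
}

# Sort model names from lowest to highest AIC
ranking = sorted(reported_aic, key=reported_aic.get)
for name in ranking:
    print(f"{name}: AIC {reported_aic[name]:.1f}")

# Gap between the two leading models
gap = reported_aic["Model-5"] - reported_aic["Model-6"]
print(f"Model-6 leads Model-5 by {gap:.1f} AIC points")
```

By AIC alone Model-6 ranks first, 2.9 points ahead of Model-5, so the preference for Model-5 rests on the coefficient z and p values rather than on AIC.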
3.     Make your recommendations on what Houssein should do.
As can be seen from the Cost Analysis above, applying a cutoff of .28 (or 28%) optimizes the cost.  While there are still errors in the model, the number of patients the model predicts will not be readmitted, but who are, drops from 100 to 47.  Since this category carries the highest penalty of $8,000, which would increase over time, reducing it provides the highest benefit, both short and long term.
The recommendation to Houssein is to apply Model-5 with a cutoff of .28 (28%).  This reduces the overall cost of the CareTracker program and associated readmission penalties by $28,000 (see Cost Analysis below).  Additionally, as CareTracker is currently applied only at the Seattle hospital, a similar cost-benefit analysis should be done at the other 13 locations as well.  CareTracker should be rolled out at the locations where it provides a benefit.
Cost Analysis
Another recommendation would be to work with the providers of the CareTracker program to raise its success rate above 40%.  This would further reduce readmission penalties, controlling the bottom line and increasing the top line.
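The effect of a higher success rate can be illustrated with the all-patients scenario from question 1, since its formula is spelled out explicitly there. In this Python sketch the 50% and 60% success rates are hypothetical values chosen for illustration; only the 40% rate comes from the case.

```python
PROGRAM_COST = 1_200     # CareTracker cost per patient
PENALTY = 8_000          # penalty per readmission
ENROLLED = 657           # validation patients, all on CareTracker
ACTUAL_READMITS = 136    # 100 + 36 patients who are readmitted


def cost_all_enrolled(success_rate):
    """Program cost plus penalties for readmissions CareTracker fails to prevent."""
    enrollment = ENROLLED * PROGRAM_COST
    penalties = ACTUAL_READMITS * (1 - success_rate) * PENALTY
    return round(enrollment + penalties)


# 0.40 is the rate from the case; 0.50 and 0.60 are hypothetical improvements.
for rate in (0.40, 0.50, 0.60):
    print(f"success rate {rate:.0%}: ${cost_all_enrolled(rate):,}")
```

Under these assumptions, each 10-point gain in success rate removes 136 * 0.1 * $8,000 = $108,800 in penalties.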