1.2 Goals of Data-Driven Approaches for High Utilizers
Chapter 2 Overview of Health Care Data
2.1 Type of Health Care Data
2.2 Structure of Health Care Data
2.2.1 Structured Health Care Data
2.2.1.3 Pharmaceutical Codes
2.2.2 Unstructured Health Care Data
2.3 Common Data Sources for High Utilizers
2.3.1 Administrative Claims Data
2.3.2 PCORnet Common Data Model
Chapter 3 Machine Learning Modeling from Health Care Data
3.1.1 Ordinary Least Squares Linear Regression (LR)
3.1.2 Regularized Regression (LASSO)
3.1.3 Gradient Boosting Machine (GBM)
3.1.4 Recurrent Neural Networks (RNN)
3.2 Interpreting Supervised Models
3.2.1 Global Interpretation: Understand Trained Model
3.2.2 Local Interpretation: Understand Each Prediction
3.2.3 Prediction Confidence
3.2.3.1 Voting and Consensus Rate
3.2.3.2 Providing Confidence Intervals
3.3.1 Clinical Phenotyping
3.3.2 Behavioral Phenotyping: Clustering Inter-Arrival Time of Health Care Encounters
3.3.2.1 Histogram Representations of Asynchronous Time Series
3.3.2.2 Wasserstein Distance
3.3.2.3 Spectral Clustering
Chapter 4 Descriptive Analysis of High Utilizers
4.1 Threshold-Based Methods for Frequent Emergency Department Users
4.1.2.3 Operational Definitions
4.1.2.4 Medical Expenditures
4.1.2.5 Enrollee Sociodemographics
4.1.2.6 Diagnostic History
4.1.2.7 New York University ED Profiling Algorithm
4.1.2.8 Frequent and Persistent Users
4.1.2.9 Annualized Visits
4.1.2.10 Statistical Analyses
4.1.4 Characteristics of ED Users
4.1.5.1 Sociodemographics
4.1.5.2 Setting-Specific, High-Frequency Use
4.1.5.3 Cost Concentrations
4.1.5.4 Chronic, Comorbid Conditions, Mental Illness, and SUDs
4.1.5.5 Inappropriate and/or Avoidable Visits
4.2 Temporal Consistency of High Utilizers
4.2.3.1 Entire Adult Population
4.2.3.2 Temporal Correlation for the Top 10% Population
4.2.3.3 Chronic Conditions Cohorts
Chapter 5 Residuals Analysis for Identifying High Utilizers
5.2.3.1 Linear Regression
5.2.4.1 Fitting Linear Regression
5.2.4.2 Fitting Tree-Based Model
5.2.5 Identifying the High Residuals Population
5.2.6 Breakdown Residuals
5.3.1 Compare Linear Regression and Tree-Based Model
5.3.2 Characterizing the High Utilizers
5.3.2.1 Demographics, Health Conditions, and Utilization
5.3.2.2 Temporal Consistency of Residuals
5.3.3 Breakdown Residuals to ICD-9-CM Codes
5.3.3.1 Essential Hypertension
5.3.3.2 Chronic Kidney Disease
5.3.4 Stratified Models by Service Settings
5.3.4.1 Residuals and Potentially Preventable Readmissions (PPR)
5.3.4.2 Residuals and Potentially Preventable Emergency Department Visits (PPV)
5.3.4.3 Residuals and Future Potentially Preventable Events
Chapter 6 Machine Learning Results for High Utilizers
6.1 Predicting Hospital Readmissions
6.1.2.3 Regularized Logistic Regression (LASSO)
6.1.2.4 Gradient Boosting Machine (GBM)
6.1.2.5 Deep Neural Networks (DNN)
6.1.3.1 Prediction Accuracy
6.1.3.2 Interpret Models and Predictions
6.1.3.3 Prediction Confidence
6.2 Predicting Health Care Expenditure
6.2.2.4 Predictive Models
6.2.2.5 Model Selection and Validation
6.2.3 Prediction Performance
6.2.3.2 Choice of Period Length
6.2.3.3 Using Additional Information
6.2.3.4 Including Additional Prior Periods
6.2.4 Interpreting the Models
6.2.5 Choosing the Best Model
6.3 Clustering Asynchronous Health Care Encounters Time Series
6.3.1 Emergency Department Visits Time Series
6.3.2 Inpatient Hospital Stays Time Series
Appendix A Acknowledgment
Bibliography
Index