About This Book .... xi
About The Author .... xiii
Chapter 1 Research Strategy .... 1
  1.2.1 Measurement Scales for Variables .... 2
  1.2.2 Predictive Models with Textual Data .... 2
  1.3.1 Predicting Response to Direct Mail .... 2
  1.3.2 Predicting Risk in the Auto Insurance Industry .... 4
  1.3.3 Predicting Rate Sensitivity of Bank Deposit Products .... 5
  1.3.4 Predicting Customer Attrition .... 7
  1.3.5 Predicting a Nominal Categorical (Unordered Polychotomous) Target .... 8
  1.4 Sources of Modeling Data .... 10
  1.4.1 Comparability between the Sample and Target Universe .... 10
  1.4.2 Observation Weights .... 10
  1.5 Pre-Processing the Data .... 10
  1.5.1 Data Cleaning Before Launching SAS Enterprise Miner .... 11
  1.5.2 Data Cleaning After Launching SAS Enterprise Miner .... 11
  1.6 Alternative Modeling Strategies .... 12
  1.6.1 Regression with a Moderate Number of Input Variables .... 12
  1.6.2 Regression with a Large Number of Input Variables .... 13
Chapter 2 Getting Started with Predictive Modeling .... 15
  2.2 Opening SAS Enterprise Miner 14.1 .... 16
  2.3 Creating a New Project in SAS Enterprise Miner 14.1 .... 16
  2.4 The SAS Enterprise Miner Window .... 17
  2.5 Creating a SAS Data Source .... 18
  2.6 Creating a Process Flow Diagram .... 27
  2.7.2 Data Partition Node .... 29
  2.8 Tools for Initial Data Exploration .... 56
  2.8.4 Variable Clustering Node .... 73
  2.8.6 Variable Selection Node .... 85
  2.9 Tools for Data Modification .... 94
  2.9.4 Interactive Binning Node .... 99
  2.9.5 Principal Components Node .... 106
  2.9.6 Transform Variables Node .... 112
  2.11 Appendix to Chapter 2 .... 126
  2.11.1 The Type, the Measurement Scale, and the Number of Levels of a Variable .... 126
  2.11.2 Eigenvalues, Eigenvectors, and Principal Components .... 129
  2.11.4 Calculation of Chi-Square Statistic and Cramer's V for a Continuous Input .... 133
Chapter 3 Variable Selection and Transformation of Variables .... 139
  3.2.1 Continuous Target with Numeric Interval-scaled Inputs (Case 1) .... 140
  3.2.2 Continuous Target with Nominal-Categorical Inputs (Case 2) .... 147
  3.2.3 Binary Target with Numeric Interval-scaled Inputs (Case 3) .... 153
  3.2.4 Binary Target with Nominal-scaled Categorical Inputs (Case 4) .... 158
  3.3 Variable Selection Using the Variable Clustering Node .... 162
  3.3.1 Selection of the Best Variable from Each Cluster .... 164
  3.3.2 Selecting the Cluster Components .... 174
  3.4 Variable Selection Using the Decision Tree Node .... 176
  3.5 Transformation of Variables .... 179
  3.5.1 Transform Variables Node .... 179
  3.5.2 Transformation before Variable Selection .... 181
  3.5.3 Transformation after Variable Selection .... 183
  3.5.4 Passing More Than One Type of Transformation for Each Interval Input to the Next Node .... 185
  3.5.5 Saving and Exporting the Code Generated by the Transform Variables Node .... 189
  3.7.1 Changing the Measurement Scale of a Variable in a Data Source .... 190
  3.7.2 SAS Code for Comparing Grouped Categorical Variables with the Ungrouped Variables .... 192
Chapter 4 Building Decision Tree Models to Predict Response and Risk .... 195
  4.2 An Overview of the Tree Methodology in SAS® Enterprise Miner .... 196
  4.2.2 Decision Tree Models .... 196
  4.2.3 Decision Tree Models vs. Logistic Regression Models .... 198
  4.2.4 Applying the Decision Tree Model to Prospect Data .... 198
  4.2.5 Calculation of the Worth of a Tree .... 199
  4.2.6 Roles of the Training and Validation Data in the Development of a Decision Tree .... 201
  4.3 Development of the Tree in SAS Enterprise Miner .... 202
  4.3.1 Growing an Initial Tree .... 202
  4.3.2 P-value Adjustment Options .... 209
  4.3.3 Controlling Tree Growth: Stopping Rules .... 211
  4.3.3.1 Controlling Tree Growth through the Split Size Property .... 211
  4.3.4 Pruning: Selecting the Right-Sized Tree Using Validation Data .... 211
  4.3.5 Step-by-Step Illustration of Growing and Pruning a Tree .... 213
  4.3.6 Average Profit vs. Total Profit for Comparing Trees of Different Sizes .... 218
  4.3.7 Accuracy/Misclassification Criterion in Selecting the Right-sized Tree (Classification of Records and Nodes by Maximizing Accuracy) .... 218
  4.3.8 Assessment of a Tree or Sub-tree Using Average Square Error .... 220
  4.3.9 Selection of the Right-sized Tree .... 220
  4.4 Decision Tree Model to Predict Response to Direct Marketing .... 221
  4.4.1 Testing Model Performance with a Test Data Set .... 230
  4.4.2 Applying the Decision Tree Model to Score a Data Set .... 231
  4.5 Developing a Regression Tree Model to Predict Risk .... 236
  4.5.1 Summary of the Regression Tree Model to Predict Risk .... 243
  4.6 Developing Decision Trees Interactively .... 244
  4.6.1 Interactively Modifying an Existing Decision Tree .... 244
  4.6.3 Developing the Maximal Tree in Interactive Mode .... 266
  4.8.1 Pearson's Chi-Square Test .... 270
  4.8.2 Calculation of Impurity Reduction using Gini Index .... 271
  4.8.3 Calculation of Impurity Reduction/Information Gain using Entropy .... 272
  4.8.4 Adjusting the Predicted Probabilities for Over-sampling .... 274
  4.8.5 Expected Profits Using Unadjusted Probabilities .... 275
  4.8.6 Expected Profits Using Adjusted Probabilities .... 275
Chapter 5 Neural Network Models to Predict Response and Risk .... 279
  5.1.1 Target Variables for the Models .... 280
  5.1.2 Neural Network Node Details .... 281
  5.2 General Example of a Neural Network Model .... 281
  5.2.3 Output Layer or Target Layer .... 288
  5.2.4 Activation Function of the Output Layer .... 289
  5.3 Estimation of Weights in a Neural Network Model .... 290
  5.4 Neural Network Model to Predict Response .... 291
  5.4.1 Setting the Neural Network Node Properties .... 293
  5.4.2 Assessing the Predictive Performance of the Estimated Model .... 297
  5.4.3 Receiver Operating Characteristic (ROC) Charts .... 300
  5.4.4 How Did the Neural Network Node Pick the Optimum Weights for This Model? .... 303
  5.4.5 Scoring a Data Set Using the Neural Network Model .... 305
  5.5 Neural Network Model to Predict Loss Frequency in Auto Insurance .... 308
  5.5.1 Loss Frequency as an Ordinal Target .... 309
  5.5.1.1 Target Layer Combination and Activation Functions .... 311
  5.5.3 Classification of Risks for Rate Setting in Auto Insurance with Predicted Probabilities .... 321
  5.6 Alternative Specifications of the Neural Networks .... 322
  5.6.1 A Multilayer Perceptron (MLP) Neural Network .... 322
  5.6.2 Radial Basis Function (RBF) Neural Network .... 324
  5.7 Comparison of Alternative Built-in Architectures of the Neural Network Node .... 330
  5.7.1 Multilayer Perceptron (MLP) Network .... 332
  5.7.2 Ordinary Radial Basis Function with Equal Heights and Widths (ORBFEQ) .... 333
  5.7.3 Ordinary Radial Basis Function with Equal Heights and Unequal Widths (ORBFUN) .... 335
  5.7.4 Normalized Radial Basis Function with Equal Widths and Heights (NRBFEQ) .... 338
  5.7.5 Normalized Radial Basis Function with Equal Heights and Unequal Widths (NRBFEH) .... 340
  5.7.6 Normalized Radial Basis Function with Equal Widths and Unequal Heights (NRBFEW) .... 343
  5.7.7 Normalized Radial Basis Function with Equal Volumes (NRBFEV) .... 346
  5.7.8 Normalized Radial Basis Function with Unequal Widths and Heights (NRBFUN) .... 348
  5.7.9 User-Specified Architectures .... 351
  5.10 Dmine Regression Node .... 358
  5.11 Comparing the Models Generated by DMNeural, AutoNeural, and Dmine Regression Nodes .... 360
  5.13 Appendix to Chapter 5 .... 363
Chapter 6 Regression Models .... 369
  6.2 What Types of Models Can Be Developed Using the Regression Node? .... 369
  6.2.1 Models with a Binary Target .... 369
  6.2.2 Models with an Ordinal Target .... 373
  6.2.3 Models with a Nominal (Unordered) Target .... 379
  6.2.4 Models with Continuous Targets .... 383
  6.3 An Overview of Some Properties of the Regression Node .... 383
  6.3.1 Regression Type Property .... 384
  6.3.2 Link Function Property .... 384
  6.3.3 Selection Model Property .... 386
  6.3.4 Selection Criterion Property .... 403
  6.4 Business Applications .... 415
  6.4.1 Logistic Regression for Predicting Response to a Mail Campaign .... 417
  6.4.2 Regression for a Continuous Target .... 431
  6.6.2 Examples of the Selection Criteria When the Model Selection Property Is Set to Forward .... 447
Chapter 7 Comparison and Combination of Different Models .... 453
  7.2 Models for Binary Targets: An Example of Predicting Attrition .... 454
  7.2.1 Logistic Regression for Predicting Attrition .... 456
  7.2.2 Decision Tree Model for Predicting Attrition .... 458
  7.2.3 A Neural Network Model for Predicting Attrition .... 460
  7.3 Models for Ordinal Targets: An Example of Predicting Accident Risk .... 464
  7.3.1 Lift Charts and Capture Rates for Models with Ordinal Targets .... 465
  7.3.2 Logistic Regression with Proportional Odds for Predicting Risk in Auto Insurance .... 466
  7.3.3 Decision Tree Model for Predicting Risk in Auto Insurance .... 469
  7.3.4 Neural Network Model for Predicting Risk in Auto Insurance .... 473
  7.4 Comparison of All Three Accident Risk Models .... 476
  7.5 Boosting and Combining Predictive Models .... 476
  7.5.2 Stochastic Gradient Boosting .... 479
  7.5.3 An Illustration of Boosting Using the Gradient Boosting Node .... 479
  7.5.5 Comparing the Gradient Boosting and Ensemble Methods of Combining Models .... 485
  7.6.2 Least Absolute Deviation Loss .... 486
Chapter 8 Customer Profitability .... 489
  8.6 The Optimum Cutoff Point .... 495
  8.7 Alternative Scenarios of Response and Risk .... 496
  8.8 Customer Lifetime Value .... 496
  8.9 Suggestions for Extending Results .... 497
Chapter 9 Introduction to Predictive Modeling with Textual Data .... 499
  9.1.1 Quantifying Textual Data: A Simplified Example .... 500
  9.1.2 Dimension Reduction and Latent Semantic Indexing .... 503
  9.1.3 Summary of the Steps in Quantifying Textual Information .... 506
  9.2 Retrieving Documents from the World Wide Web .... 507
  9.2.1 The %TMFILTER Macro .... 507
  9.3 Creating a SAS Data Set from Text Files .... 509
  9.5 Creating a Data Source for Text Mining .... 514
  9.7.1 Frequency Weighting .... 521
  9.7.3 Adjusted Frequencies .... 521
  9.7.4 Frequency Weighting Methods .... 521
  9.7.5 Term Weighting Methods .... 523
  9.8.1 Developing a Predictive Equation Using the Output Data Set Created by the Text Topic Node .... 533
  9.9.1 Hierarchical Clustering .... 535
  9.9.2 Expectation-Maximization (EM) Clustering .... 536
  9.9.3 Using the Text Cluster Node .... 542
Index .... 547