Statistical Performance Measures
Is the decision threshold really well chosen?
To answer this question and validate the credit approval system, we will use total profit and various statistical performance measures.
The decision threshold turns a credit score into a prediction. If a person's credit score is greater than or equal to the threshold, we predict that they will repay the loan; if it is below the threshold, we predict that they will not. The threshold was initially set to 70, so everyone with a score >= 70 is predicted to repay and everyone else is predicted not to repay. These predictions can then be compared with the known outcomes.
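The threshold rule can be written as a one-line comparison. The following is a minimal sketch, not the system's actual implementation: the function name `predict_repays` and the example scores are hypothetical and chosen only for illustration.

```python
THRESHOLD = 70  # initial decision threshold from the text


def predict_repays(score, threshold=THRESHOLD):
    """Return True if repayment is predicted (score at or above the threshold)."""
    return score >= threshold


# Hypothetical credit scores, made up for illustration only
example_scores = [55, 70, 82, 69.9]
predictions = [predict_repays(s) for s in example_scores]
print(predictions)  # [False, True, True, False]
```

Note that a score of exactly 70 counts as a positive prediction, because the rule uses "greater than or equal to".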
Reminder: We are working with data on individuals for whom it is already known whether they repaid a past loan, so each prediction can be checked against the actual outcome.
The Confusion Matrix
The number of correct and incorrect predictions for both groups (“repays” and “does not repay”) is shown in the table below. This table is also called a confusion matrix.
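To illustrate how the four cells of such a table are counted, here is a small sketch. The function name `confusion_counts`, the cell labels TP/FP/FN/TN, and the sample data are assumptions made for this example; they are not the data set used in this lesson.

```python
def confusion_counts(scores, repaid, threshold=70):
    """Count the four cells of the confusion matrix.

    scores: credit scores; repaid: known outcomes (True = loan was repaid).
    """
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, repaid):
        predicted = score >= threshold  # prediction: repays
        if predicted and actual:
            tp += 1  # predicted repays, actually repaid
        elif predicted and not actual:
            fp += 1  # predicted repays, actually did not repay
        elif not predicted and actual:
            fn += 1  # predicted does not repay, actually repaid
        else:
            tn += 1  # predicted does not repay, actually did not repay
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical scores and outcomes, made up for illustration only
scores = [85, 72, 70, 64, 58, 91, 40, 75]
repaid = [True, True, False, True, False, True, False, False]
counts = confusion_counts(scores, repaid)
print(counts)  # {'TP': 3, 'FP': 2, 'FN': 1, 'TN': 2}
```

The four counts always add up to the number of data points, which is a quick sanity check for any confusion matrix.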
Evaluating the Decision Model
There are several performance metrics we can use to evaluate our model:
- Accuracy: Percentage of correct predictions out of all data points.
- Positive Rate: Percentage of positive predictions (prediction: repays) out of all data points.
- True Positive Rate: Percentage of correct positive predictions out of all actual positive cases (data: repays).
- Profit: Total profit of the bank (Reminder: the bank earns €300 for each repaid loan and loses €700 for each loan that is not repaid).
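The four measures above can be computed directly from the confusion matrix counts. The sketch below rests on two assumptions that the text does not state explicitly: loans are granted only to people predicted to repay, so profit depends only on the positive predictions, and the rates are returned as fractions (multiply by 100 for percentages). The function name `evaluate` and the example counts are hypothetical.

```python
def evaluate(tp, fp, fn, tn, gain=300, loss=700):
    """Compute the performance measures from confusion matrix counts.

    tp/fp: predicted "repays" and actually repaid / did not repay.
    fn/tn: predicted "does not repay" and actually repaid / did not repay.
    Assumes at least one data point and at least one actual positive case.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,         # correct predictions / all
        "positive_rate": (tp + fp) / total,    # positive predictions / all
        "true_positive_rate": tp / (tp + fn),  # correct positives / actual positives
        # Assumption: only predicted repayers receive a loan, so only
        # TP (earns 300 each) and FP (loses 700 each) affect profit.
        "profit": gain * tp - loss * fp,
    }


# Hypothetical counts, made up for illustration only
metrics = evaluate(tp=3, fp=2, fn=1, tn=2)
print(metrics)
# {'accuracy': 0.625, 'positive_rate': 0.625, 'true_positive_rate': 0.75, 'profit': -500}
```

In this made-up example the accuracy is above 60%, yet the profit is negative, because each loan that is not repaid costs more than twice what a repaid loan earns. This is why profit is considered alongside the statistical measures.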