Statistical Performance Measures

Is the decision threshold really well chosen?

To answer this question and validate the credit approval system, we will use total profit and various statistical performance measures.

The decision threshold turns a credit score into a prediction. If a person's credit score is greater than or equal to the threshold, we predict that they will repay the loan; if it is below the threshold, we predict that they will not. The threshold was initially set to 70, so we predict repayment for everyone with a score >= 70 and non-repayment for everyone else. These predictions can now be compared with the known outcomes.
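
The threshold rule can be sketched in a few lines of Python. This is only an illustration: the function name predict_repays and the example scores are hypothetical and are not taken from the dataset.

```python
THRESHOLD = 70  # initial decision threshold from the text


def predict_repays(score, threshold=THRESHOLD):
    """Return True if we predict repayment, i.e. score >= threshold."""
    return score >= threshold


# Hypothetical scores, chosen only to show both sides of the threshold.
example_scores = [55, 70, 82, 69]
predictions = [predict_repays(s) for s in example_scores]
print(predictions)  # [False, True, True, False]
```

Note that a score of exactly 70 counts as a predicted repayment, because the rule uses "greater than or equal to".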

Reminder: We are working with historical data on individuals for whom we know whether they repaid their loan, and we compare these known outcomes with our predictions.

The Confusion Matrix

The table below shows the number of correct and incorrect predictions for both groups (“repays” and “does not repay”). This table is called a confusion matrix.
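
As a rough sketch of how the four cells of such a table could be counted, assuming the data is available as a list of scores and a list of known outcomes: the function name confusion_counts, the labels TP, FP, FN and TN (the usual abbreviations for true/false positives/negatives) and the toy data are assumptions for illustration, not part of the original material.

```python
def confusion_counts(scores, repaid, threshold=70):
    """Count the four cells of the confusion matrix.

    scores: credit scores
    repaid: known outcomes (True = the loan was repaid)
    "Positive" means: we predict that the person repays.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for score, did_repay in zip(scores, repaid):
        predicted_repay = score >= threshold
        if predicted_repay and did_repay:
            counts["TP"] += 1  # predicted: repays, data: repays
        elif predicted_repay and not did_repay:
            counts["FP"] += 1  # predicted: repays, data: does not repay
        elif did_repay:
            counts["FN"] += 1  # predicted: does not repay, data: repays
        else:
            counts["TN"] += 1  # predicted: does not repay, data: does not repay
    return counts


# Hypothetical toy data, not the dataset used in this section.
toy_scores = [82, 75, 70, 64, 58, 91]
toy_repaid = [True, False, True, True, False, True]
toy_counts = confusion_counts(toy_scores, toy_repaid)
print(toy_counts)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 1}
```

In this toy example, TP + FP = 4 people would receive a loan, and FP = 1 of them is not creditworthy.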

How many people receive a loan with a threshold of 70?

How many people who are not creditworthy receive a loan with a threshold of 70?

Evaluating the Decision Model

There are several performance metrics we can use to evaluate our model:

Accuracy: Percentage of correct predictions out of all data points.
Positive Rate: Percentage of positive predictions (prediction: repays) out of all data points.
True Positive Rate: Percentage of correct positive predictions out of all actual positive cases (data: repays).
Profit: Total profit of the bank (Reminder: the bank earns €300 for each loan that is repaid and loses €700 for each loan that is not repaid).
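
The four definitions above can be written as a short Python sketch that starts from the cell counts of a confusion matrix. The function name evaluate and the example counts are hypothetical. The sketch also assumes that only granted loans (positive predictions) affect the profit, so rejected applicants contribute €0.

```python
PROFIT_PER_REPAID_LOAN = 300  # euros earned per repaid loan (from the text)
LOSS_PER_UNPAID_LOAN = 700    # euros lost per loan that is not repaid (from the text)


def evaluate(tp, fp, fn, tn):
    """Compute the four performance measures from confusion matrix counts.

    tp: predicted repays, data repays        fp: predicted repays, data does not repay
    fn: predicted does not repay, data repays  tn: predicted and data both "does not repay"
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,          # correct predictions / all data points
        "positive_rate": (tp + fp) / total,     # positive predictions / all data points
        "true_positive_rate": tp / (tp + fn),   # correct positives / actual positives
        # Assumption: only granted loans (tp and fp) earn or lose money.
        "profit": PROFIT_PER_REPAID_LOAN * tp - LOSS_PER_UNPAID_LOAN * fp,
    }


# Hypothetical counts, not the values from this section's confusion matrix.
metrics = evaluate(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(name, value)
```

With these made-up counts, the accuracy is 0.7, the positive rate is 0.5, the true positive rate is about 0.67, and the profit is 300 * 40 - 700 * 10 = €5000. Multiply the three rates by 100 to express them as percentages.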

Calculate the values of the four performance metrics using the confusion matrix. Enter your results in the table below.

What percentage of creditworthy individuals actually receive a loan?

What percentage of individuals in the dataset receive a loan?