Is your predictive model really 90% accurate?
- Elom Goka

The ability to predict future outcomes is what drew me to Data Science; I knew immediately it was the career I wanted to pursue. When the predictive modelling classes for my data science master’s program began, I wondered how accurate model predictions could be. Could predictions be 70%, 80%, or even 90% accurate? “That would be amazing!”, I thought to myself. The answer turned out to be yes, but it didn’t end there; the more important questions were:
What accuracy metric is being used?
What is it conveying?
Is it the appropriate accuracy metric to report in light of the project context?
In this article, I will explain why the model accuracy metric you choose to report is important.
Understanding the importance of choosing & reporting the right accuracy metric
Before delving into the heart of this topic, I will explain two concepts, the classification problem and class imbalance, as they are needed to understand this article.
A classification problem is one where the model predicts which class an observation belongs to - such as “spam vs not spam,” “disease vs no disease,” or “customer will churn vs will not churn.” When there are two classes, it is called binary classification; when there are more than two, it is called multi-class classification.
Class imbalance refers to a situation in a classification problem where one class appears much more frequently than the other(s). For example: 97% non-fraud cases vs. 3% fraud cases.
Now, let’s say 5% of a telecommunications company’s high-value enterprise customers (those that bring in millions of Ghana Cedis) left last year. The company has therefore tasked you, the resident data scientist, with building a model to predict which high-value customers will leave in the future. The company’s operations team would then reach out to these customers and, for example, present them with a special offer to mitigate the risk of them leaving.
You build the model and find that it is 90% accurate. You share this with your stakeholders and everyone is excited; but as mentioned earlier, what accuracy metric are you reporting and is it a true reflection of the value the model will bring given the project context?
Very often, assuming the model is not over-fitting, when you encounter such a high model accuracy, what is being reported is the overall accuracy, and there is usually an accompanying class imbalance. In the telecommunications scenario, the overall accuracy captures how good the model is at predicting both those who will leave and those who will stay. Since 5% of customers leave and 95% (the majority) stay, the model will by default be very good at predicting those who will stay. This is what drives the high overall accuracy (90%), and yet those who will stay obviously aren’t the sub-population the company is interested in targeting for risk mitigation.
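To make this concrete, here is a minimal, purely illustrative sketch using the scenario’s 5%/95% split (the customer counts are hypothetical): a “model” that simply predicts every customer will stay still scores 95% overall accuracy while catching zero churners.

```python
# Hypothetical illustration: 100 customers, 5% of whom churn.
y_true = ["churn"] * 5 + ["stay"] * 95  # actual outcomes
y_pred = ["stay"] * 100                 # naive "model": predicts stay for everyone

# Overall accuracy: fraction of predictions that are correct.
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Churners correctly identified: none at all.
churners_caught = sum(t == "churn" and p == "churn" for t, p in zip(y_true, y_pred))

print(overall_accuracy)  # 0.95
print(churners_caught)   # 0
```

The overall accuracy looks impressive, yet the model is useless for the one thing the company actually cares about: finding the customers who will leave.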
What is more important is how good the model is at predicting those who will leave, and there are two accuracy metrics that provide this insight.
1) Precision:
Of all the customers the model predicts will leave, how many will truly leave?
For example, 30% of predicted churners will actually leave.
Formula: Precision = correctly predicted churners / total predicted churners
Note: a churner is a customer who will leave.
2) Recall / Sensitivity:
Of all the customers that will truly leave, how many does the model correctly predict will leave?
For example, 20% of true churners are correctly identified by the model.
Formula: Recall = correctly predicted churners / total true churners
Tip: To help with your understanding, note the difference in the denominators of the formulas for precision and recall.
Precision tells you: “How often will the model be right when it raises an alert?”
Recall tells you: “What proportion of real cases will be identified?”
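The two formulas can be put directly into code. The counts below are toy numbers I have made up purely for illustration (they happen to give 10% precision and 20% recall):

```python
# Toy data: 100 customers, 5 true churners (1 = churner, 0 = stayer).
y_true = [1] * 5 + [0] * 95
# Hypothetical model: flags 10 customers as churners, only 1 of whom truly churns.
y_pred = [1] * 1 + [0] * 4 + [1] * 9 + [0] * 86

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # correctly predicted churners
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # stayers wrongly flagged
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # churners the model missed

precision = tp / (tp + fp)  # correctly predicted churners / total predicted churners
recall = tp / (tp + fn)     # correctly predicted churners / total true churners

print(precision)  # 0.1
print(recall)     # 0.2
```

Note how the numerator is the same in both formulas; only the denominator changes, exactly as in the tip above.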

These accuracy metrics (precision and recall):
- provide a true reflection of the value the model will actually deliver.
- will very likely be significantly lower than the reported overall accuracy of 90%, because customer churn, in this particular scenario, is a rare event (5%), which makes it hard to predict.
Having provided context for these accuracy metrics, my recommendation is to avoid presenting the overall model accuracy in classification problems when there is a class imbalance. Why do I say this?
Presenting only the overall accuracy when there is a class imbalance gives your business stakeholders an unrealistic expectation of the model’s performance in relation to the outcome of interest: customers who will leave. When that expectation is not met, it leads to a lack of trust in data-driven solutions.
Now, what about presenting the overall accuracy alongside the model’s precision / recall when there is a class imbalance? Wouldn’t that solve the problem of setting unrealistic expectations for your stakeholders? While it would, speaking from experience, presenting several different accuracy metrics can be confusing for non-technical business stakeholders. It is best to keep things as simple and concise as possible for clear understanding. For this same reason, even when there isn’t a class imbalance, you could probably skip presenting the overall accuracy: your business stakeholders are ultimately more interested in the outcome of interest (customers who will leave), and precision and recall provide clearer insight into it.
Side note: The precision / recall being significantly lower than the overall accuracy is not a bad thing, and it does not mean the model is not valuable. Remember that in this scenario the goal is to identify HIGH VALUE customers who are likely to leave, so a model with a “lower” accuracy can still be very valuable. Digging further into this: in the absence of a model, if the telecommunications company reached out to customers at random, it would identify churners only at the 5% base rate. So, all else being equal, a model that identifies even 20% of churners (20% recall) will likely add value. This improvement over the baseline is referred to as model lift.
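Under the framing above, the improvement can be quantified in one line (both figures are the hypothetical ones from this scenario, not real results):

```python
baseline_rate = 0.05  # without a model: churners identified at the 5% base rate
model_recall = 0.20   # with the model: 20% of true churners identified

# Lift: how many times better the model is than the no-model baseline.
lift = model_recall / baseline_rate
print(lift)  # 4.0
```

A 4x lift over doing nothing is a far more honest (and still compelling) story for stakeholders than a headline 90% overall accuracy.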
Conclusion
From my personal experience and interactions with other data professionals, it is quite uncommon in practice for the accuracy metrics that reflect a model’s true value to reach the 80% range and above.
It is therefore important to evaluate whether you’re reporting the right accuracy metric in light of the project context, so you give your stakeholders a true reflection of the model’s value. In addition, I would recommend ensuring that your model is not over-fitting (standard practice anyway), as over-fitting can produce a misleadingly high model accuracy.
Thank you for reading! Don’t forget to subscribe to the newsletter and share this article with your network! 🙂
