Probabilities are Better than Scores

[Figure: Strategic Analytics slide from Fair Isaac Interact on the 2007 mortgage meltdown]

During a panel at Fair Isaac’s Interact conference last week, a banker from Abbey National in the UK suggested that part of the credit crunch was due to the use of the FICO score.  Unlike other panelists, who were former Fair Isaac employees, this gentleman was formerly of Experian!  So there was perhaps some friendly rivalry, but his point was a good one.  He cited an earlier presentation by the founder of Strategic Analytics that touched on the divergence between FICO scores and the probability of default.  The panelist’s key point was that some part of the mortgage crisis could be blamed on credit scores, a point that was first raised in the media last fall.

The FICO score is not a probability. 

Fair Isaac people describe the FICO score as a ranking of creditworthiness.  And banks rely on the FICO score for pricing and qualification for mortgages.  The ratio of the loan to value is also critical, but for any two applicants seeking a loan with the same LTV, the one with the better FICO score is more likely to qualify and receive the better price.

Ideally, a bank’s pricing and qualification criteria would accurately reflect the likelihood of default.  The mortgage crisis demonstrates that their assessment, expressed with the FICO score, was wrong.  Their probabilities were off.

  • Was the FICO score a useful metric of creditworthiness before the crisis but not during?
  • Is the FICO score a reliable metric going forward?

In these mid-crunch days, Fair Isaac is reminding its customers that the FICO score is a ranking not a probability.  The underlying point they seek to make is that the relationship between the FICO score and the probability of default is more complex and dynamic than their banking customers understood last year.  (Another post on predictive analytics also discussed stationarity.)
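To illustrate how a score can keep ranking borrowers correctly while the probability behind it moves, here is a minimal Python sketch.  It assumes a log-odds scaling in which the odds of repayment double every fixed number of points.  The reference score, the points-to-double value, and both sets of odds are hypothetical numbers chosen for illustration, not Fair Isaac’s actual calibration.

```python
def default_probability(score, odds_at_ref, ref_score=700.0, points_to_double=20.0):
    """Map a score to a probability of default under an assumed log-odds scaling.

    odds_at_ref is the good:bad odds at ref_score.  All parameters are
    hypothetical; this is not Fair Isaac's published calibration.
    """
    odds = odds_at_ref * 2.0 ** ((score - ref_score) / points_to_double)
    return 1.0 / (1.0 + odds)


# Two hypothetical environments: the same scores, different odds at score 700.
BENIGN_ODDS = 50.0    # assumed 50:1 good:bad odds in a calm market
STRESSED_ODDS = 10.0  # assumed 10:1 odds once house prices fall

if __name__ == "__main__":
    for score in (620, 660, 700, 740):
        benign = default_probability(score, BENIGN_ODDS)
        stressed = default_probability(score, STRESSED_ODDS)
        print(f"score {score}: benign {benign:.1%}, stressed {stressed:.1%}")
```

In both environments the higher score always carries the lower probability, so the ranking holds.  Yet under these made-up odds a 700 means roughly a 2% chance of default in one case and roughly 9% in the other.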

It’s the probability that matters, not the score!

In his keynote, Ian Ayres also focused on the inadequacy of scores.

He was explicit that bankers need the probability of default and, further, that they need to know how reliable such probabilities are.  As an example, he cited polls where one candidate is leading by 6 points within a margin of error of 3 points as almost meaningless.  More meaningful would be the probability that the leading candidate will win.  Even better would be an estimate of the probability of default along with an assessment of the reliability or accuracy of that probability.
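The polling example can be turned into a probability with a few lines of Python.  This is only a sketch under stated assumptions: sampling error is normal, the quoted margin of error is a 95% interval on each candidate’s share, and the race has two candidates, so the standard error of the lead is twice that of a single share.  It ignores non-sampling error and anything that changes between the poll and election day, so it estimates the chance that the leader is truly ahead, not the chance of winning.  The function name is my own.

```python
from statistics import NormalDist


def prob_leader_ahead(lead_pts, margin_pts, confidence=0.95):
    """Probability that the true lead is positive, given a poll lead and margin.

    Assumes normal sampling error and a two-candidate race, so the standard
    error of the lead is twice the standard error of one candidate's share.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    se_share = margin_pts / z   # standard error of one candidate's share
    se_lead = 2.0 * se_share    # standard error of the difference in shares
    return NormalDist().cdf(lead_pts / se_lead)


if __name__ == "__main__":
    p = prob_leader_ahead(lead_pts=6.0, margin_pts=3.0)
    print(f"Probability the leader is truly ahead: {p:.1%}")
```

Under these assumptions, a 6-point lead with a 3-point margin of error corresponds to roughly a 97.5% chance that the leader is truly ahead, which says far more than the lead and margin quoted separately.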

Bankers increasingly understand that the FICO score is not the probability of default that they need when originating and underwriting credit.  As a result, they are also coming to see that there is no adequate external source of the probabilities they need in order to optimize their portfolio performance.

This realization has several ramifications:

  • A market opportunity for predictive analytics in credit has opened on Fair Isaac’s turf.
  • Scorecards have lost much of their “solutions” luster, becoming just another technique.

But several things also became clear as I talked with numerous practitioners last week.  Fair Isaac doesn’t have much competition.  In fact, it is shocking how little competition they have in such a large and lucrative market. 

What decisioning market?

Although there is a market opportunity for more rigorous decisioning solutions, there is no significant challenger to Fair Isaac.  I expected to hear more about the Experian Group, but the only direct competitor identified by more than one person was Austin Logistics.  Several people indicated that they were using statistical tools directly, especially SAS, and Fair Isaac itself is placing a great deal of emphasis on its own predictive analytic tools, especially Model Builder.

Note that Vantage Score is really having an impact on Fair Isaac scoring revenues, as reflected in their most recent earnings call transcript.  So there is more competition than may seem apparent to the audience that attends Interact.

Another chasm to cross

Generally speaking, this market needs the benefits of broader statistical and machine learning techniques and a more rigorous understanding of, and emphasis on, probabilities.  The audience, however, is not technically sophisticated enough to become aggressive adopters, despite recent harsh lessons.  In the same panel, every banker in turn solicited risk analysts to join their organizations, headquartered in Asia-Pacific, London, Canada, and on the west coast.  They also agreed that it is easier to learn finance than analytics.

The market for analytics is crowded with sophisticated tools and intellectually demanding techniques that are simply too hard for most people to understand and use, let alone to use effectively and reliably.  These are precisely the circumstances that decision management was in during the late nineties when business rules technology started going mainstream.  In 2000, we crossed that chasm by introducing natural language business rules (see Haley’s Authority).  At the same time, Blaze Advisor, now owned by Fair Isaac, was crossing that chasm using a form-based approach called “Innovator”.

Similar advances will be forthcoming in analytics.  As with business rules, this will not eliminate the need for highly skilled consultants, but their criticality and marginal value will diminish as analytics becomes more effective in the hands of non-experts (and as better solutions develop in key markets, such as in credit, risk, fraud and other criminality or terrorism).

Until it’s easy, use expertise

If you are in this market and could use some help with modeling, analytics, or adaptive decision management, feel free to get in touch.  We have some excellent capabilities and partners in these areas.  We are happy to help recommend approaches or products, or simply to make referrals.  Of course, there are also highly specialized consultancies, such as Strategic Analytics, that can give excellent implementation-agnostic advice.

One thing worth noting, but only in passing for now, is Fair Isaac’s acquisition of Dash Optimization.  This reflects the increasing trend towards broader and deeper application of technology within credit decisioning.  It is also a response to the decline of scoring and the increasing need for decision optimization, which is a broader subject than decision management, with or without predictive analytics and adaptation.

Nonetheless, optimizing portfolios will not optimize profits if the scores used are not reliably correlated with probabilities.
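A small worked example shows why.  The sketch below uses a deliberately simplified one-period loan: the lender earns an interest margin if the borrower repays and loses a fixed fraction of principal on default.  The margin, the loss given default, and both probabilities are hypothetical values picked for illustration, and the function names are my own.

```python
def expected_profit(p_default, interest_margin, loss_given_default):
    """Expected profit per unit of principal for a simplified one-period loan."""
    return (1.0 - p_default) * interest_margin - p_default * loss_given_default


def break_even_probability(interest_margin, loss_given_default):
    """Probability of default at which the expected profit is exactly zero."""
    return interest_margin / (interest_margin + loss_given_default)


# Hypothetical figures, per unit of principal.
MARGIN = 0.03          # assumed interest margin earned on a performing loan
LGD = 0.40             # assumed loss given default
ASSUMED_PD = 0.02      # probability the lender believed the score implied
ACTUAL_PD = 0.09       # probability that actually applied in the downturn

if __name__ == "__main__":
    print(f"Break-even probability: {break_even_probability(MARGIN, LGD):.2%}")
    print(f"Expected profit as priced: {expected_profit(ASSUMED_PD, MARGIN, LGD):+.4f}")
    print(f"Expected profit in fact:   {expected_profit(ACTUAL_PD, MARGIN, LGD):+.4f}")
```

With these figures the loan looks profitable at the assumed probability and loses money at the actual one, so any optimization run on the assumed figure would be optimizing the wrong portfolio.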

It is also interesting how Ilog and Fair Isaac continue to converge from a technological perspective. 

5 Responses to “Probabilities are Better than Scores”

  1. jamet123 Says:

    Paul
    Normally I find your posts very accurate but today, I fear, I have to take issue:
    - The FICO score is not, never was and was never said to be a measure of the likelihood of default on a mortgage. Any lender who took it as such is probably beyond help - they would have taken any other analytic measure and misused it too.
    - Scorecards are a technique for PRESENTING an analytic that is defensible with regulators and easy to explain. Scores can be built using any and all analytic techniques and often are. Lots of other techniques, like Neural Nets, can create scores. What they struggle to do is explain them. Scorecards get used because they can be used to explain, not because of any particular predictive power. This has not changed in the current meltdown.
    - Vantage is impacting scoring revenues because the bureaus are practically giving it away, necessary to promote adoption, and the lenders can obviously use this to create price pressure on Fair Isaac. I don’t know but I suspect that every Vantage user is also using the FICO score. Competition, yes, but just another version of the same kind of model created to try and capture some of Fair Isaac’s revenue not to create any new kind of score.

    I do agree that there is a huge gap in the analytics world figuring out how to bring non-technical (in this case non-statistical) people into the process to enable effective collaboration. The rules community has done a lot here, the process community a little, the analytic community almost nothing. This needs to change.

    I actually started the comment simply to point out that ILOG’s recent release of a scorecard modeler (blogged about here) is another sign of convergence but you had made some unusually sloppy statements that I felt needed correction.

  2. paul@haleyAI.com Says:

    I would not disagree with you, James, that the FICO score was not intended as a probability, but practically speaking, that is how it has been used. Dr. Breeden’s presentation specifically addresses how lenders do not account for the variation between FICO scores and probabilities through interest rate and real estate cycles. My thesis here is not that scores are useless but that probabilities are better and more direct.

    Yep, Vantage is a nasty, effective move by the credit agencies against Fair Isaac’s prior monopoly. I feel for Fair Isaac on that one. But Dr. Greene, FICO CEO, explicitly acknowledged the continuing negative impact of Vantage on revenues.

    So, I think my facts are in order, I’m afraid. But I truly understand how upsetting the recent wounds from sloppy lending practices and unfair competition are to those who have enjoyed working with Fair Isaac.

    As for Ilog’s announcement, I was going to blog on that today! I’ll pass now that you’ve covered it.

    Respectfully, Paul

  3. snshor Says:

    I think the FICO score still proved to be a reliable tool for measuring relative probabilities. Another thing: the crisis started in subprime mortgages, by definition the sector with a higher probability of default. IMHO, considering the breadth of the problem, it is not the model’s fault; it is the much more general systemic fault of the people responsible for the decisions not knowing the limitations of the model. As is the case with Newtonian physics, for example.

    The statistical models are vulnerable to “black swan” events, as Nassim Taleb’s very timely book has pointed out. It is a bit ironic that the people putting the “past performance” clause into all the fine print could not apply the same principle to statistical models that always look in the rear-view mirror. It is too much to expect that the FICO score, based solely on an individual’s credit behaviour, could predict a housing market meltdown that happened a few years after most of the subjects had happily granted and received their mortgages, packaged them and sold them to investors as “bulletproof” instruments. Last year Goldman and others were explaining their losses as caused by a 25-sigma event. Anyone with the slightest bit of statistical knowledge can judge this statement as being either an admission of using a model out of touch with reality or, even worse, of not understanding it.

  4. nraden Says:

    Hi Paul,

    I’m a little late to this conversation, but I would add the following. If, in the best of all possible worlds, a score like FICO was a nearly perfect predictor of the outcome you want to predict, how would a probabilistic score be any better? Now I know that’s a big IF, but the point I’m trying to make is that it may not be the nature of the measurement system that matters so much as the model that is used. In my experience, most models are vastly simplified and consider only the most basic criteria. You wouldn’t size an air conditioning unit by only two of the three dimensions of a space. I see no reason why some stochastic process would be necessarily superior to a scorecard like FICO. It all depends on the quality of the underlying model.

    For example, a FICO score gives an indication of credit worthiness at a point in time based on historical information, but it doesn’t explicitly predict whether or not I will default. If my score is 700 today, it will probably be 700 a week from now, even if Lehman Bros. just collapsed and AIG had to be bailed out. It doesn’t take into account those kinds of externalities. And they aren’t all black swans. A model is only as useful as it can predict, and unless it’s robust enough for that, instead of wet finger in the wind, it’s going to fail. As an old statistician myself, I understand the lure of probabilistic measures, but down at the bottom of things, it’s the model that counts, and without better models, it doesn’t really matter what kind of metrics we use.

    Campbell’s law of corruption of measurement indicators applies here: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (Campbell, Donald. “Assessing the Impact of Planned Social Change.” Evaluation and Program Planning, Vol 2 (1979), p. 85). Gaming the FICO score in mortgage underwriting is a good example. What credible evidence is there that another type of metric would have fared any better?

    -Neil Raden
    Partner, Smart (enough) Systems LLC

  5. paul@haleyAI.com Says:

    Greetings Neil,

    I think all your points are good but you are missing that the meaning of a score is unclear. The meaning of a probability is not. For example, “an indication of credit worthiness” is less clearly useful than “a probability of default”. Of course, the accuracy of the output is also important. The lack of clarity of what a FICO score meant to its market was a factor (according to the speaker). I’m comfortable with Campbell’s perspective on quantitative metrics that have no fundamental semantics, but if it’s a probability it is inherently resistant to pressure, distortion or corruption. This point was driven home by Mr. Ayres. BTW, I don’t blame FI for marketing something other than a probability. It’s easier, practical and liability limiting.
