AI and ML Misunderstandings
October 22, 2021 | By Rick Haskell
AI and ML Misunderstandings – PART 1 of 3
Put your seatbelts on, we’re about to get controversial… No, we’re not talking politics, but we are talking about two of today’s most misunderstood acronyms—AI and ML. There’s lots to unpack here, but we’re going to keep it simple, and we promise you’ll come away with a clear understanding of the basics.
Artificial Intelligence (AI)
First let’s tackle Artificial Intelligence, and please forgive our clichéd reference to robots… but will robots ever be able to produce original thoughts? The answer is no. Machines are created by humans and follow our instructions. Imagine 1 million IF STATEMENTS, each with its own 1 million nested IF STATEMENTS, and each of those with 1 million more nested IF STATEMENTS—yikes, a gazillion unique business rules firing off actions based on a massive network of conditional logic. And while there is no doubt that one day all those rules will simulate human actions akin to Westworld’s finest AI, it still isn’t original thought. And even if the machine adds new IF STATEMENTS on the fly, even those actions must trace back to previous logic written by humans. Human brains are biologically, incomprehensibly… human, and only organic beings are capable of original thought. Machines are not.
So what is Artificial Intelligence? The phrase was coined in 1956 in the context of computing, so it doesn’t apply to mankind’s amazing inventions prior to the computing era. And for the sake of discussion, let’s limit the topic to loan management software. One example of AI is found in Lendisoft Servicing, which comprises dozens of screens with hundreds of input fields. When certain fields have the right combination of inputs, that combination triggers a rule that recommends the next action for the agent to take. The recommendation appears instantly because these rules are constantly running in the background. We call this feature NEXT ACTION AI. And while there’s really no magic to it, it’s an amazing time saver for agents and makes them rockstar collectors, because they can work an estimated 20% more accounts per day. Imagine if it didn’t exist: agents would need to jump around all the different screens just to get their bearings. So try to think of AI as a set of timesaving tools that improve productivity while reducing errors. We include a bunch of AI tools throughout Lendisoft—little widgets that appear on various screens, each designed to make users more successful in their jobs.
One small but powerful example of AI within Lendisoft:
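The idea behind NEXT ACTION AI (background rules watching field combinations and recommending an action) can be sketched in a few lines of code. This is a hypothetical illustration only: the field names, rules, actions, and priorities below are invented for the sketch and are not Lendisoft’s actual logic.

```python
# A minimal, hypothetical "next action" rules engine.
# Field names and rules are illustrative, not Lendisoft's actual logic.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # fires when the account fields match
    action: str                        # recommended next action for the agent
    priority: int = 0                  # higher priority wins when several fire

RULES = [
    Rule("promise_broken",
         lambda a: a["days_past_due"] > 5 and a["promise_to_pay_broken"],
         "Call customer: broken promise-to-pay", priority=10),
    Rule("early_delinquency",
         lambda a: 1 <= a["days_past_due"] <= 5,
         "Send courtesy payment reminder", priority=5),
]

def next_action(account: dict) -> Optional[str]:
    """Return the highest-priority recommendation that applies, if any."""
    fired = [r for r in RULES if r.condition(account)]
    return max(fired, key=lambda r: r.priority).action if fired else None

account = {"days_past_due": 12, "promise_to_pay_broken": True}
print(next_action(account))  # -> Call customer: broken promise-to-pay
```

Because the rules are just data, new ones can be added without touching the engine itself; in a real system they would run continuously as fields change, rather than on demand.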
In PART 2 of this article, we’ll cover what Machine Learning is, and equally important, what it isn’t. For now, think of ML models as a secret weapon found inside some AI tools, one that super-charges their effectiveness. More on that next…
AI and ML Misunderstandings – PART 2 of 3
Building on last week’s post, we’ll continue with a clear understanding of what Machine Learning is, and how ML models can be used in loan servicing software.
Machine Learning (ML)
ML is as misunderstood these days as AI, but it’s really not complicated. ML models are statistical models that were “trained” (i.e., built) on a set of historical data with known outcomes. Some of you were around back when scorecards took over credit decision-making. In the old days, loan officers were tasked with making credit decisions using their good judgment and experience. Each decision was a human choice based on jumping around screens, reviewing various inputs, reading credit bureau reports, reviewing various KPIs, and finally making a call. All this analysis took time (perhaps 20 minutes of critical thinking to adjudicate each loan application). Meanwhile, ML models were mostly used in other industries like medical research, but somewhere along the way those research analysts started having lunch with the banking folks, and the lending industry took a major step forward in its evolution.
So, ML models are statistically trained models that dramatically improve accuracy and speed when compared to human decision-making. Got it.
There are different types of ML models, but let’s not worry too much about that—all of them are powerful. The lending industry is dominated by one type, logistic regression models, because they are empirically derived (machine “trained”), very powerful, and it’s easy to explain why a prediction is what it is. For example, if you decline an applicant for credit you must provide the reason, and logistic regression models make this easy. And while logistic regression models are by far the most common ML models in the lending industry, the new kid in town is the deep-learning neural-network model (not really new, just becoming mainstream as of late). Neural-network models are more complex and have the potential to produce a slightly more predictive result than their logistic regression counterparts. So while different ML model types may vary slightly in power from one another, the bigger news is that ML models, regardless of type, are far more powerful than human experience-based decisioning.
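To make this concrete, here is a hedged, from-scratch sketch of a logistic regression credit model. Everything in it—the three features, the toy data, the training loop—is invented for illustration; real scorecards are trained on thousands of historical loans with many more variables. The last function also shows why logistic regression makes adverse-action reasons easy: each feature’s contribution to the score is just its weight times its value, so the feature pushing hardest toward default is a natural decline reason.

```python
# Hedged sketch: a tiny logistic regression credit model trained from scratch.
# Features, data, and numbers are made up for illustration only.

import math

FEATURES = ["util_ratio", "late_payments", "years_on_job"]

# toy training data: (feature vector, defaulted? 1/0)
DATA = [
    ([0.9, 4, 0.5], 1), ([0.8, 3, 1.0], 1), ([0.7, 5, 0.2], 1),
    ([0.2, 0, 6.0], 0), ([0.3, 1, 4.0], 0), ([0.1, 0, 8.0], 0),
]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.1, epochs=2000):
    """Plain stochastic gradient descent on the logistic loss."""
    w = [0.0] * len(FEATURES)
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def score_with_reasons(w, b, x):
    """Probability of default plus the top adverse-action reason."""
    contribs = {f: wi * xi for f, wi, xi in zip(FEATURES, w, x)}
    p = sigmoid(sum(contribs.values()) + b)
    top_reason = max(contribs, key=contribs.get)  # pushes hardest toward default
    return p, top_reason

w, b = train(DATA)
p, reason = score_with_reasons(w, b, [0.85, 4, 1.0])  # a risky-looking applicant
```

The same transparency is much harder to achieve with a neural network, which is one reason logistic regression remains the lending industry’s workhorse.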
Here’s a chart that helps put different types of decision-making into perspective:
First, don’t get caught up in the actual percentages in the chart. The percentages estimate how many predictions turned out to be correct for each approach. True accuracy depends on many factors, including data accessibility and what it is we’re trying to predict. The main point is the relative accuracy between the bars. Bars #4 and #5 are both examples of ML models (machine “trained”), and while the #5 type is garnering lots of media attention these days, understand that #4 will continue to be the first choice among the broadest swath of lenders for a long time to come. Also notice that none of these models’ predictions are right 100% of the time. The way AI and ML are over-hyped these days, it might come across as if ML models are practically clairvoyant. They are not.
One other point worth mentioning: where do generic scores like FICO and Vantage fall in the chart? Answer: somewhere pretty close to #3, and in our experience their predictive accuracy starts to drop as you go deeper into subprime. But someone may argue, “Aren’t FICO and Vantage examples of empirically derived ML models?” The answer is yes, which should make them compete with #4 and #5; however, the Achilles heel of those models is that they were built on many different types of consumers drawn from many different sources, and that generalization waters down their power. The models we’re referring to in the chart (#3, #4, and #5) are built on your own customer data, which means they precisely fit your type of customer, like a glove. And that means more power!
In PART 3 of this series, we’ll cover truths and misunderstandings about one final concept being perpetuated out there: model replacement frequency. There’s a lot of hype around models that apparently rebuild themselves every few seconds, which allegedly translates into super-powerful predictions. With these sorts of things, there’s usually some truth and lots of propaganda—we’ll help sort through all of it.
AI and ML Misunderstandings – PART 3 of 3
In this final post of the series, we’ll tackle the question of model replacement frequency—another important topic, and one where once again advertisers seem to be misleading folks with some of their messaging.
Model Replacement Frequency
Despite what you may be hearing out there, all models are static: they are built on a vintage slice of historical data. With traditional credit scoring models for the auto-finance industry, a typical modeling setup might use loans originated from Jan 1 – Dec 31, 2019, with performance outcomes measured on Dec 31, 2020. Now in theory, since your model was trained on 2019 originations and is being used at present day (Oct 2021 as of this writing), two years have passed. What if the types of customers you get today are materially different from those of 2019? That would be a problem, because a model trained on 2019 customers would no longer predict well. Thankfully, customer diversity doesn’t shift around all that quickly in most finance scenarios, so odds are your ML model is still performing just fine. Of course, you don’t leave this stuff to chance: there are model validations you should be running each quarter that tell you precisely how much “shifting” has been going on. At some point it will be time to retire the old models and replace them with new ones, but it’s quite common in most finance settings for an ML model to run in production for at least a year or two before data diversity starts to shift materially.
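One standard statistic used in those quarterly validations is the Population Stability Index (PSI), which compares the score distribution at model-build time with the distribution today. Here is a minimal sketch; the bin counts are illustrative, and the thresholds in the final comment are a commonly cited rule of thumb rather than a regulatory standard.

```python
# A minimal sketch of the Population Stability Index (PSI), one common way to
# quantify how much a scored population has "shifted" since model training.

import math

def psi(expected_counts, actual_counts):
    """PSI over matching score bins: sum of (a% - e%) * ln(a% / e%).

    Assumes the same bin edges for both populations and no empty bins.
    """
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = e / e_total
        a_pct = a / a_total
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total

# score distribution at model build (e.g., 2019 originations) vs. today
build_counts = [120, 300, 450, 300, 130]   # accounts per score bin
today_counts = [110, 310, 440, 310, 130]

shift = psi(build_counts, today_counts)
# common rule of thumb: < 0.10 stable, 0.10-0.25 watch closely, > 0.25 rebuild
# here the distributions barely moved, so shift lands well under 0.10
```

When the PSI (and similar checks on individual model variables) stays low quarter after quarter, the model keeps running; when it climbs past the upper threshold, that is the signal to rebuild.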
Now let’s look at an entirely different industry—say, driverless cars. We don’t know much about this industry, but we can imagine a setting where ML models predict what’s coming up ahead based on the latest road conditions and other data gathered on an almost constant basis. In this case, data diversity may be shifting almost constantly. You may find you need to retrain or completely rebuild models daily, hourly, or even minute by minute. New technologies are being developed where ML models can be built extremely fast, and rebuilt repeatedly, so they are always based on the latest data. The moral of the story is this: when model validations show that data diversity has materially changed, it’s time to rebuild your models.
So what industry are you in? Does swapping out models minute by minute bring you more predictive power? If you’ve been following along, you know the answer: only if your underlying data diversity is significantly different today than it was when you built your last model. So let’s imagine building new ML models minute by minute for the lending industry. Wow, that’s a lot of effort in a setting where each new model will be a near-exact replica of the prior one (because the underlying data hasn’t materially changed, and won’t likely anytime soon). If you want to deploy an infrastructure that can pop out new models every day, knock yourself out (forgive the sarcasm).
Final Thoughts On AI, ML, and Proper Risk Management…
As it relates to portfolio risk, it is the risk manager’s fundamental job to maximize loan origination volume and otherwise grow the loan portfolio while keeping defaults (cumulative net losses) at their targeted levels. Notice we didn’t say while keeping defaults to a minimum. Risk management is a precise science. Once you set thresholds for acceptable loss volume (e.g., cumulative net losses), it’s the risk manager’s task to implement any number of AI tools, some fueled by ML models, to help maximize portfolio growth while keeping losses on target.
There is no room for exaggeration and hyperbole, despite what Hollywood and today’s marketers are touting about artificial intelligence. You can rest assured Lendisoft will always refrain from exaggerating the benefits of AI and ML in our advertising. That said, AI tools are a game changer, and ML models add speed and accuracy to your decision-making. If you’ve been thinking about how to modernize your operation with AI and ML models, look no further than Lendisoft Servicing. Lendisoft includes 19 scoring models and numerous AI tools right out of the box. We also include a risk management staff of analysts ready to assist you at a moment’s notice. We’ll help with model validations, ML model development, and various other risk management projects to help ensure you remain on the right path. Our unique blend of great software paired with risk management consulting, all in one package, can transform your operation very quickly. If you’re a small lender wanting the same competitive advantage that the enterprise lenders have, Lendisoft makes this possible at an affordable rate. Schedule a demo today!
Rick Haskell is the Founder & Chief Operations Officer at Lendisoft, a SaaS technologies company looking to disrupt the lending industry with its unique blend of enterprise software, risk management tools, and risk consulting services.