Banking Services Case Study

Large Bank – Card Auth Fraud Case Study

Company: Large Credit Card Issuer – US Co-Brand

The company launched a card product that was experiencing higher-than-expected fraud losses. Its team had built fraud strategies based on past experience, but several unique product features and a lack of data for strategy development left gaps. The company was looking for support in evaluating its fraud authorization and new account strategies against industry standards and best-in-class fraud controls.

PAG Solution:
PAG conducted a deep dive into the bank’s existing fraud data, reporting, authorization and fraud queuing systems, and detailed rules. Drawing on deep industry experience and the available data and reporting, PAG recommended rule changes to address emerging fraud trends and system enhancements to support next-level fraud avoidance. In addition, PAG completed a deep dive of payment controls and recommended vendors and additional rules to reduce payment returns and first-party fraud.

Client Benefits:
The changes PAG recommended drove:

  • $1.2MM+ in annual application fraud avoidance.
  • $13.2MM in annual fraud loss avoidance through incremental ring and bust-out transaction rules (decline: 35% $ fraud loss rate).
  • Incremental $46MM annually in low-risk spend (approve: 1.5% $ fraud loss rate).
  • Net impact: $14MM in fraud loss avoidance and $8MM in incremental spend annually.
Banking Services

Machine Learning Keep

Analytical Decision Making: A Quick Guided Tour

How AI and Machine Learning can optimize business strategy in any industry
By Brian Kurz, Business Analyst, Predictive Analytics Group

With the advent of AI and machine learning, some of the most complicated problems in industry have been not only understood but optimized. How can your business utilize these new techniques, and why are they so important?

First, what is machine learning and why is it important?

Machine learning is defined as “the scientific study of algorithms and statistical models to perform a specific task.” When implemented in business systems, machine learning and AI can not only automate many business processes but also bring unknown patterns and predictors to light and take analysis to a whole new level. Used incorrectly, however, they mislead: overfitting a model or mischaracterizing the problem is a frequent mistake, and one that even the most talented data scientists struggle with.

First, let’s go through a general overview of machine learning techniques; then we’ll go through some examples. Machine learning can be broken into 3 main groups with multiple subsets, as specified in the chart below.

Supervised learning algorithms use historical data to find the relationship between variables that has the most predictive power. The two main methods are regression and classification.

  • Regression-based supervised learning methods try to predict outputs based on input variables.
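
As a minimal sketch of what a regression-based method does, the snippet below fits a straight line to historical input/output pairs by ordinary least squares and uses it to predict a new output. The function name `fit_line` and the sample numbers are hypothetical illustrations, not data from the analysis in this article.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical historical data (made up for illustration only).
percent_admitted = [40, 55, 60, 75, 90]
admissions_yield = [52, 45, 42, 35, 28]

slope, intercept = fit_line(percent_admitted, admissions_yield)
# Predict the output for a new input value of 70.
predicted = slope * 70 + intercept
print(round(slope, 3), round(intercept, 2), round(predicted, 1))
```

A classification method follows the same pattern, except that the output is a class label rather than a number.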

Unsupervised learning algorithms attempt to understand the structure of the data and to identify the main drivers behind it. These methods are based on clustering of variables and on factor analysis.

Deep learning uses multi-layered neural networks to analyze trends and is an attempt to artificially recreate human intelligence.

  • Deep learning works best on large, unstructured data sets. A deep learning model could use a hypothetical financial data series to estimate the probability of a stock crash or market correction. [1]

Reinforcement learning encourages an algorithm to explore and find the most rewarding actions, such as the most profitable trading strategies.

This all sounds great, but what is a relevant example? Imagine a machine is given an entire set of returns from assets and must decide which of 200 variables are dependent and independent as well as their interactions. In this case using a deep learning algorithm would be appropriate and could lead to insights our minds couldn’t come to due to the complexity of a given data set.

Let’s use some of these techniques on an applicable problem. Many universities base their cost structure on projections of admissions yield (matriculation rate), or the expected number of students who will accept an offer of admission for the next class. If a university cannot predict its admissions yield within a certain error range, the negative impacts can be significant, because of high fixed costs and the expenses budgeted for students who do not accept their offer. [2][3]

Given an easily accessible data set [4] from Kaggle (a data science collaboration platform), let’s run some basic algorithms to see what insights we can gain in predicting admissions yield.

Since there are over 50 other variables in this data set, let’s run a random forest classifier to see which variables have the greatest impact on admissions yield. This is a common practice for cutting down convoluted data sets with many columns that are often correlated/collinear with each other.

Random forests or random decision forests are an ensemble learning method for classification, regression and other tasks that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision trees’ habit of overfitting to their training set.
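
The aggregation step described above can be sketched in a few lines. This is only the final voting/averaging stage, under the assumption that the individual trees have already been trained; the function names and the sample tree outputs are hypothetical.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_votes):
    """Classification: return the mode of the class labels predicted by the trees."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_predictions):
    """Regression: return the mean of the numeric predictions of the trees."""
    return mean(tree_predictions)

# Hypothetical outputs of five already-trained trees for one observation.
votes = ["high", "low", "high", "high", "low"]
estimates = [0.31, 0.35, 0.28, 0.34, 0.32]

print(forest_classify(votes))
print(round(forest_regress(estimates), 2))
```

Because each tree sees a different random sample of the data, averaging their outputs smooths out the overfitting of any single tree.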

[1] Butcher, Sarah. “JPMorgan’s Massive Guide to Machine Learning Jobs in Finance.” EFinancialCareers. May 05, 2018. Accessed May 05, 2019.
[2] Weiner, Brad. “Can Artificial Intelligence Automate College Admissions?” Capture Higher Ed. May 02, 2019. Accessed May 05, 2019.
[3] Baig, Edward C. “Who’s Going to Review Your College Applications – a Committee or a Computer?” USA Today. December 03, 2018. Accessed May 05, 2019.

This random forest aggregates 1,000 decision trees; one can be seen below (cut down to a maximum depth of 7).

We can also use the random forest model as a predictor. Using a training/test split, we can train the model on most of the data and test it on a held-out subset (around 10%) of our original data set.

Across 153 test cases, the model predicts admissions yield with 78% accuracy, based on the 23 variables we are using from the total data set.
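
A rough sketch of how such a held-out test and accuracy score can be computed is shown below. The helper names are hypothetical, and the row count of 1,530 is an assumption chosen only so that a 10% split yields 153 test cases; the actual size of the data set is not stated here.

```python
import random

def holdout_split(rows, test_fraction=0.10, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Share of test cases where the predicted label matches the actual label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

rows = list(range(1530))  # hypothetical row ids; 1,530 is an assumed size
train, test = holdout_split(rows)
print(len(train), len(test))

# Toy labels to show the accuracy calculation itself.
print(accuracy(["a", "b", "a", "a"], ["a", "b", "b", "a"]))
```

The model is fit only on `train`; scoring it on `test` estimates how it would perform on data it has not seen.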

Before we get ahead of ourselves, let’s do some basic exploratory data analysis for each variable and see if there are any obvious correlations before we run prediction algorithms.

As we can see, the average number of applicants hovers around 5,000, while the average percent admitted hovers around 75%.

Most freshmen are receiving student loans.

Total in-state pricing hovers around $20,000 but also goes as high as $70,000.

We can also see that the average university gives 75% of freshmen grant aid, and that the average price for out-of-state students is around $20,000 more than for in-state students.

Many of these relationships are to be expected but will help us understand possible correlations.

Many variables are heavily correlated, and most of these correlations involve total applicants and total enrolled, as we would expect. For admissions yield, few variables seem to have a high impact besides total percent admitted, which is also to be expected, but there do seem to be high correlations with the percent of students submitting ACT scores, graduation rate, and cost of attendance.
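
For readers who want to see what a pairwise correlation measures, here is a minimal sketch of the Pearson correlation coefficient. The column values are made up for illustration and are not taken from the data set.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns: total applicants and total enrolled at five schools.
applicants = [1200, 3500, 5000, 8000, 15000]
enrolled = [400, 900, 1300, 1900, 3600]

print(round(pearson(applicants, enrolled), 3))
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.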

We can run a few machine learning methods to see how accurately they predict admissions yield. Let’s start with an unsupervised learning process, k-means clustering.

K-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, allowing us to understand how our data clusters as well as run further analysis.

We first need to find the optimal number of clusters to use, which we can do with an elbow curve. This shows the score as a function of the number of clusters. As we can see, the score plateaus at around 5 clusters for some, and at around 7 or 10 for others.
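
To make the clustering and elbow-curve idea concrete, the sketch below runs a simple one-variable k-means (Lloyd's algorithm) for several values of k and records the within-cluster sum of squares, often called inertia. The function name, the score definition, and the sample values are assumptions for illustration; the original analysis may have used a different score and real data.

```python
import random

def kmeans_1d(values, k, iterations=50, seed=0):
    """Lloyd's algorithm on one variable; returns (centers, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # start from k observations picked at random
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each observation to the cluster with the nearest mean.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (unchanged if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    inertia = sum(min((v - c) ** 2 for c in centers) for v in values)
    return centers, inertia

# Hypothetical values with three visible groups.
yields = [10, 11, 12, 30, 31, 32, 60, 61, 62]

inertia_by_k = {k: kmeans_1d(yields, k)[1] for k in range(1, 6)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 1))
```

Plotting inertia against k gives the elbow curve: the point where adding more clusters stops reducing inertia by much is a reasonable choice for k.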

Now we can visualize the relationship between the two variables with a few graphs and a KNN regression to model and fit the data.

Let’s try this with a few more variables as well:

We can see the different clusters, and the red dots show the k-nearest neighbors algorithm’s predictions of admissions yield for the same clusters, using both a uniform and a distance method.
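
The difference between the uniform and distance methods can be sketched as follows: with uniform weights the prediction is the plain average of the k nearest neighbors, while with distance weights closer neighbors count more. The function `knn_predict`, the 1/distance weighting, and the sample points are illustrative assumptions.

```python
def knn_predict(train, query, k=3, weights="uniform"):
    """Predict a value for `query` from the k nearest (x, y) training pairs."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    if weights == "uniform":
        # Every neighbor contributes equally.
        return sum(y for _, y in nearest) / len(nearest)
    # Distance weighting: an exact match decides the prediction outright.
    exact = [y for x, y in nearest if x == query]
    if exact:
        return sum(exact) / len(exact)
    # Otherwise weight each neighbor by 1 / distance.
    w = [1 / abs(x - query) for x, _ in nearest]
    return sum(wi * y for wi, (_, y) in zip(w, nearest)) / sum(w)

# Hypothetical (input, output) training pairs.
train = [(1, 10), (2, 20), (4, 40), (8, 80)]

uniform_pred = knn_predict(train, 3, k=3, weights="uniform")
distance_pred = knn_predict(train, 3, k=3, weights="distance")
print(round(uniform_pred, 2), round(distance_pred, 2))
```

Here the far neighbor at x = 1 pulls the uniform prediction down, while the distance-weighted prediction stays closer to the two nearest points.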

These predict the general trend but not as well as we would like, each one scoring around 38% accuracy. We can also try a basic linear case.

The linear case also seems to be accurate, although the prediction score could be higher.

We can also try other methods, such as Linear Discriminant Analysis, although this gives us a poor fit.

The last models we’ll use are a few Naive Bayesian methods. Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. These models also failed to predict well.
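
As a rough illustration of how a Naive Bayes classifier assigns a class label to a vector of feature values, here is a small Gaussian variant written from scratch. It is a sketch under the assumption that each feature is normally distributed within each class; the function names and the toy data are hypothetical and are not the models or data used above.

```python
from collections import defaultdict
from math import log, pi
from statistics import mean, pvariance

def fit_gaussian_nb(rows, labels):
    """Store, per class, its prior and each feature's mean and variance."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        columns = list(zip(*members))
        model[label] = (
            len(members) / len(rows),
            [mean(col) for col in columns],
            [pvariance(col) + 1e-9 for col in columns],  # floor avoids zero variance
        )
    return model

def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    def score(label):
        prior, means, variances = model[label]
        total = log(prior)
        for x, m, v in zip(row, means, variances):
            total += -0.5 * log(2 * pi * v) - (x - m) ** 2 / (2 * v)
        return total
    return max(model, key=score)

# Toy feature vectors (two features each) with made-up class labels.
rows = [(20, 70), (22, 65), (25, 72), (60, 30), (62, 35), (58, 28)]
labels = ["high", "high", "high", "low", "low", "low"]

model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, (24, 68)), predict_gaussian_nb(model, (61, 31)))
```

The "naive" part is the assumption that features are independent within a class, which is why the per-feature log likelihoods are simply added.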

Overall, we’ve been able to obtain a lot of information from this basic and general analysis with a few algorithms. There is much more that can be done, from finding the right variables to optimize prediction to understanding and predicting many other variables and trends. How far to go depends on the timeline, the budget, and the depth of the model needed.

Although this data set comes from the higher education sphere, these methods can be applied to any industry, as long as the proper checks and balances are in place. It’s important to know your data, but even more important to continue to learn new ways of interacting with it so you can obtain the best results.

Of all the methods we have used, I would suggest the random forest regressor; it has the highest prediction accuracy and also handles the large scale of this data set.

Using these different methods, we have identified a few variables of interest that seem predictive of admissions yield: “Applicants total”, “Percent of freshmen submitting ACT scores”, “Percent of freshmen receiving federal student loans”, “Total price for out-of-state students living on campus 2013-14”, and “Percent of freshmen receiving institutional grant aid.” Many of these variables make sense; price and aid should be highly correlated, especially if the university is expensive. Next steps in this analysis would include optimizing your models and cutting down the number of variables used, based upon qualitative research or expert insights.

Applying mathematics, statistics, and state-of-the-art algorithms to uncover patterns and insights can substantially reduce time and increase savings. It’s critical to understand both your data and the why behind it: without the right models, techniques and questions, the explanatory power of the data is greatly diminished. Data analysis is incredibly powerful and important; pairing the art with this science will bring your analysis to the next level, keep you on the cutting edge of the industry, ensure that your models are applicable in the real world and improve your speed to market.

If you would like to learn more about how these methods can be applied in your business, please contact Predictive Analytics Group at 844-SEEK-PAG.


How to Start Your Own Business

Step 1: Set Goals

Define SMART goals. Are you trying to make a lot of money? Do you want to have freedom? Do you want to be creative? Make goals that are specific, measurable, achievable, relevant and time based.

Step 2: Do Research

What industry are you entering and who are the major players today? Who will buy your product or service? How much money are you going to need to begin?

Step 3: Create a Sample

Show your idea or sample product to 100 people. Do they like it? Do they find it useful? What suggestions do they have for improvement? Revise your product based on their feedback.

Step 4: Make It Official

Create a name and register your company with the state and federal government. Obtain necessary permits and licenses. Purchase insurance.

Step 5: Plan

What is your strategy? How will you execute? What is your budget including expected expenses and realistic revenues? What is your expected cash flow? Set timelines for sales to hit targets.

Step 6: Evaluate Finances

Explore your options for financing your business. How much savings can you invest? Do you have “angel investors” or private parties, friends, relatives who can support you?
Other Sources: Credit Cards, Home Equity Loans, Savings, Personal Loans, Private Equity

Step 7: Execute

Create your business. Do you need an office or storefront? Do you need a website? At minimum, how many employees do you need? Make sure everything is set and ready.

Step 8: Measure and Grow

Test and learn from your mistakes. Expand your product offerings. Think about social media. What can be improved? What will help you to better sell your product?


Risk Management

Risk is inherent in everything we do. Whether it be choosing to leave the house, turning on a burner, or bungee jumping from a cliff, every action has the potential for something to go wrong. The same is true in business: business decisions carry risk. Sometimes it’s the expansion or growth that drives risk, but doing nothing is often a risk as well. A competitor could be leaping ahead while you are standing still. Inevitably there’s money that could have been made, if you had taken a chance.

Thus, strong risk management isn’t about stopping risk; it’s about managing risk effectively.

Effective risk management is a term that’s often thrown about as a sound bite. What does it really mean, and how do you get there?

  1. Balance risk and reward: Ensure you are getting paid (or value) commensurate with the risk. In finance this means setting pricing and fees at a point that balances future risk; in a casino, this means driving volume to offset wins or regulatory hassle; in health care this means driving enough timely value and customer satisfaction to offset the cost and regulatory headache.
  2. Measure small, miss small: Develop reporting and data infrastructure to measure and monitor the risk short term. Reporting at a more granular / marginal level soon after implementation drives faster response to problems and greater long-term savings. Many banks responded far too late during the Great Recession, primarily because they were monitoring the long-term blended performance. The short-term marginal performance gave early warning signals that were missed by many due to inadequate measures and reporting.
  3. Open Discussion and Debate: Robust discussion and debate of risks allows decision makers to be fully informed in making the risk / reward decisions. Hiding potential negatives may smooth the path to approval today, but those negatives will usually come back to hurt the company (and your career) long-term. Identifying risks and mitigations before implementation ensures that risk management is effective. This process cannot be a rote repetition of last month’s risks. Critically thinking about what risks apply and what makes this decision or action different from others is just as important as thinking about how the decision and actions are the same. Reporting and measurements must be matched up to risks as mitigating factors that drive successful early identification and action on issues as they arise.
  4. Triggers: Of particular use in tracking performance vs. expectations, triggers are metrics that are established prior to implementation and set at expected levels with a small cushion. The cushion should consider the standard deviation / seasonal movement of the metric within normal conditions and be set so that any unusual movement triggers a response. Most often, these are established with red, yellow, and green indicators to show if a metric is within tolerance. Measures that go above the limit will be flagged for deeper analysis or explanation to management. This ensures that decisions are followed up and monitored so that they perform as expected.
  5. Opportunity Cost: Decision making must also consider opportunity costs associated with taking or not taking an action. If you commit resources to this project, what other projects or needs are sidelined? If you don’t act on this opportunity, will your competitors steal market share? Will you lose customers?
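
The trigger idea from item 4 can be sketched in code. The example below sets the expected level from a metric's history and uses its standard deviation as the cushion; the one- and two-standard-deviation thresholds for yellow and red, the function name, and the sample loss rates are assumptions for illustration, not prescribed values.

```python
from statistics import mean, stdev

def trigger_status(history, current, yellow_sigmas=1.0, red_sigmas=2.0):
    """Compare the current metric with its expected level plus a cushion."""
    expected = mean(history)   # expected level under normal conditions
    cushion = stdev(history)   # normal movement of the metric
    if current > expected + red_sigmas * cushion:
        return "red"           # outside tolerance: escalate to management
    if current > expected + yellow_sigmas * cushion:
        return "yellow"        # unusual movement: flag for deeper analysis
    return "green"             # within tolerance

# Hypothetical monthly loss rates (%) observed under normal conditions.
history = [2.0, 2.2, 1.9, 2.1, 2.0, 1.8]

for current in (2.05, 2.2, 2.5):
    print(current, trigger_status(history, current))
```

Setting the thresholds before implementation, rather than after results arrive, is what makes the trigger an objective early warning rather than a judgment call.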

In Ben Franklin’s words: “An ounce of prevention is worth a pound of cure.”


COVID-19: Back to Work

Has your business been drastically affected by the COVID-19 virus? Are you unsure of how your business needs to adapt to compete? If so, check out the latest whitepaper from PAG, “Optimizing Your Business Model – COVID ERA”. Let the experienced subject matter experts at PAG explain how the COVID Era has changed consumer mindset and, more importantly, how you can evolve your business to thrive now and into the future. The answers are just a click away.


News Newsletters

Credit-Card Debt in U.S. Rises to Record $930 Billion

Serious delinquencies increase, particularly among younger borrowers

WASHINGTON—Credit-card debt rose to a record in the final quarter of 2019 as Americans spent aggressively amid a strong economy and job market, and the proportion of people seriously behind on their payments increased.

Total credit-card balances increased by $46 billion to $930 billion, well above the previous peak seen before the 2008 financial crisis, according to data released by the Federal Reserve Bank of New York on Tuesday.

Some cardholders, particularly younger ones, are running into trouble.

The proportion of credit-card debt in serious delinquency, meaning payments were late by 90 days or more, rose to 5.32% in the fourth quarter, the highest level in almost eight years, from 5.16% in the third quarter. The serious-delinquency rate for borrowers from 18 to 29 years old rose to 9.36%, the highest level since the fourth quarter of 2010, from 8.91%.

Read the full article here.

News Whitepapers

The Decade When Numbers Broke Sports

In the 2010s, data and analytics changed the way games are played—for better and worse

Brad Pitt (left) played Billy Beane in ‘Moneyball,’ which was released in 2011, at the beginning of a decade that would change sports forever. PHOTO: COLUMBIA PICTURES/EVERETT COLLECTION

By Ben Cohen, Jared Diamond and Andrew Beaton

It wasn’t long ago that baseball players still bunted, football coaches were unapologetically conservative and basketball teams doubted Stephen Curry. It was only the beginning of this decade.

But what happened over the last 10 years inside MLB ballparks, NFL stadiums and NBA arenas rendered the sports almost unrecognizable. The games barely resemble the previous iterations of themselves. They have been reinvented in front of our eyes.

Read the full article here.