STOR-i Annual Conference: Fraud Detection Using Machine Learning

I recently attended my first STOR-i Conference which took place on the 9-10th January 2020. During the conference we heard from several research leaders on cutting edge statistics and operational research topics, including professors from MIT, University of Oslo, University of Edinburgh, Columbia University, University of Southampton, Brunel University, Copenhagen Business School and University College Dublin. We also heard from current STOR-i students Henry Moss and Georgia Souli, who discussed their research, as well as, STOR-i alumni Tom Flowerdew who currently works for Featurespace and Ciara Pike-Burke who works as a researcher at the Univeritat Pompeu Fabra in Barcelona.

On the Thursday evening there was a poster session where 1st and 2nd year PhD students displayed their work. This was a valuable session as I was able to see the kind of projects they are working on before I choose my PhD topic later this term.

In this blog I will be discussing a talk I particularly enjoyed during the conference. This was STOR-i alum Tom Flowerdew’s presentation on Real-Time Fraud Detection. Tom was part of the second cohort of STOR-i students in 2011 where he completed a project funded by ATASS Sports, supervised by Prof Kevin Glazebrook, Dr Chris Kirkbride and Prof Jon Tawn. He now works at Featurespace, which is a world-leader in Adaptive Behavioural Analytics.

*Tom Flowerdew’s presentation at the 9th annual STOR-i Conference.*

The talk Tom gave during the conference was regarding how to score transactions on their risk of fraud.

Fraud detection needs to be deployed in real-time, as in we need to determine as the card is used whether it’s a fraudulent purchase or not. The speed of the scoring is therefore of high importance.

Before we go into the techniques in which transactions are scored, we first need a background in how payments actually occur. This is detailed in the diagram below. It begins with a cardholder using their card at a store. The store then passes the card information to an acquirer who then forward these details to card networks such as Visa or MasterCard. The card network then requests authorization from the bank who decide whether or not to approve based on the amount of money in the account.

*Payment process. Diagram based off the presentation given by Tom Flowerdew.*

If a cardholder believes they have been a victim to fraud they can raise a chargeback complaint to their bank. The bank then passes information back through the acquirer and merchant who decide whether to accept or dispute the fraud claim.

Featurespace try to spot fraud at all stages in the payment cycle using machine learning.

Machine Learning: Machine Learning is where machines can imitate the learning behavior of humans. That is, they can learn based on experiences, observations and analysis of data.

In this fraud detection application, we are interested in using machines to detect fraud and learn from the experience to further detect cases of fraudulent actions.

Tom and Featurespace are tackling this problem through the use of labels. They label cases based on whether or not they were fraudulent. They do so through cardholders going to their bank and claiming a chargeback. The bank then flags the transaction as risky.

These labels tend to be out of date as it may take some time for the cardholder to alert the bank of a case of fraud. For example, some people, particularly the elderly, do their banking through monthly statements rather than through online banking. This will obviously cause a delay in them noticing any odd transactions.

We could set a limit on how long you can wait to label a transaction, say set it to 35 days. But this means we are still 35 days out of date in terms of our data. In this time, fraudsters may change the way they commit their fraud and thus they may still go undetected. This will mean our fraud classifier will have an unpredictable performance. If we lower our limit to 20 days then we are lowering the number of correct labels (as any fraud detected after 20 days will be taken as genuine) and bias the fraud classifier towards detecting fraud for young people.

Another setback is that if a transaction is considered as fraud and then declined by the bank, the transaction won’t get a label and cannot be used in the data set. This will hamper the performance of the classifier as not all data is available.

Another interesting component of fraud detection that Tom mentioned was on the use of online banking. It is possible to compare mouse movement and clicks on an online banking site to the usual account user. If there are differences, it can be flagged up as fraud.

Tom’s talk on fraud detection was my first encounter of machine learning and I have to say it’s a field I found very interesting.

I look forward to attending more conferences in the future. For more information regarding events held by STOR-i including the conference, please visit https://www.lancaster.ac.uk/maths/about-us/events/.

References

Pachanekar, R., Tahsildar, S. (2019) Quant Insti: Machine Learning Basics (https://blog.quantinsti.com/machine-learning-basics/#what )

Posts

STOR-i Annual Conference: Fraud Detection Using Machine Learning

References

2 thoughts on “STOR-i Annual Conference: Fraud Detection Using Machine Learning”