"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 09, 2022

Fraud Detection Research Papers

Paper #1 - Credit Card Fraud Detection in e-Commerce: An Outlier Detection Approach

Notes

  • No prior knowledge of outliers or inliers is needed
  • The proposed algorithm scales easily because it can be implemented in a distributed manner
  • The proposed algorithm is general: k-means is not the only base clustering algorithm it can use.
  • Estimate a measure of consistent (good) behavior for each data point, then identify outliers as the data points with low consistency scores.
  • Approach outlier detection as the problem of estimating a consistency score
  • In the authors' experiments, incrementally increasing k with a fixed step worked just as well as an ensemble created by carefully selecting k with a principled approach such as the Silhouette Score
  • For #Fraud #detection with a limited dataset, algorithms to start with for finding potentially fraudulent transactions: #IsolationForest, #OneClassSVM, #Clusteringbasedoutlierdetection
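
To make the consistency-score idea concrete, here is a minimal sketch of a k-means ensemble in which k grows by a fixed step. These notes do not record the paper's exact scoring formula, so the score below (the fraction of points sharing a point's cluster, averaged over runs) is only an illustrative stand-in. The function names, the deterministic farthest-first initialization, and the toy data are all assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    """Deterministic initialization (an assumption, chosen for reproducibility)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers picked so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_labels(points, k, iters=20):
    """Tiny Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consistency_scores(points, k_values):
    """Illustrative score: average fraction of points sharing each point's cluster."""
    n = len(points)
    totals = [0.0] * n
    for k in k_values:
        labels = kmeans_labels(points, k)
        sizes = {}
        for lab in labels:
            sizes[lab] = sizes.get(lab, 0) + 1
        for i, lab in enumerate(labels):
            totals[i] += sizes[lab] / n
    return [t / len(k_values) for t in totals]


# Toy data (hypothetical): two dense groups plus one far-away transaction.
blob_a = [(0.1 * i, 0.1 * (i % 3)) for i in range(10)]
blob_b = [(5 + 0.1 * i, 5 + 0.1 * (i % 3)) for i in range(10)]
points = blob_a + blob_b + [(50.0, 50.0)]

# k grows with a fixed step of 1, as in the note above.
scores = consistency_scores(points, list(range(2, 6)))
lowest = scores.index(min(scores))  # index of the least consistent point
```

On this toy data the far-away point sits in a cluster of its own in every run, so it receives the lowest score. With real transactions the base clusterer, the range of k, and the score definition would all need tuning.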

Paper #2 - A Comparison Study of Credit Card Fraud Detection: Supervised versus Unsupervised

Notes

  • 6 supervised classification models, i.e., Logistic Regression (LR), K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Tree (DT), Random Forest (RF), Extreme Gradient Boosting (XGB)
  • 4 unsupervised anomaly detection models, i.e., One-Class SVM (OCSVM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), and Generative Adversarial Networks (GAN)

Supervised Learning Methods

  • Logistic regression estimates the probability of a categorical response from one or more predictor variables x.
  • The KNN algorithm boils down to a majority vote among the K instances most similar to a given unseen observation
  • SVM derives an optimal hyperplane that maximizes the margin between two classes
  • Decision trees are simple, intuitive models that take a top-down approach: starting at the root node, they create binary splits until a stopping criterion is met
  • XGB uses gradient descent to improve predictive accuracy at each optimization step, following the negative of the gradient toward the minimum (the "sink") of an n-dimensional loss surface
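
Three of the ideas above fit in a few lines each. The sketch below shows a logistic-regression probability, a KNN majority vote, and a plain gradient-descent loop. It is not the paper's code, and the gradient-descent loop is ordinary gradient descent on a toy function, not XGBoost itself. All function names, weights, and data are made up for illustration.

```python
import math
from collections import Counter


def logistic_probability(x, weights, bias):
    """P(y = 1 | x) under a logistic regression model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def gradient_descent(grad, start, lr=0.1, steps=100):
    """Repeatedly step against the gradient of a one-dimensional function."""
    w = start
    for _ in range(steps):
        w -= lr * grad(w)
    return w


# Hypothetical labelled transactions: (features, label).
train = [
    ((0.0, 0.0), "legit"), ((0.0, 1.0), "legit"), ((1.0, 0.0), "legit"),
    ((5.0, 5.0), "fraud"), ((5.0, 6.0), "fraud"),
]

p_half = logistic_probability([0.0, 0.0], [1.0, 1.0], 0.0)  # z = 0, so p = 0.5
near_legit = knn_predict(train, (0.2, 0.2))
near_fraud = knn_predict(train, (5.0, 5.5))

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
```

Here `near_legit` comes out as "legit" and `near_fraud` as "fraud" because two of the three nearest neighbours of the second query are fraud cases, and `w_min` ends up very close to 3.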

Unsupervised Learning Methods

  • One-Class SVM: the algorithm learns a soft boundary around the normal data instances in the training set and then, using the test instances, tunes itself to identify the abnormalities that fall outside the learned region
  • RBM: the model consists of visible and hidden layers connected through symmetric weights. The objective of generative training in an RBM is to learn the unknown hidden units (h) iteratively from the input (x).
  • An auto-encoder (AE) learns to map from input to output through a pair of encoding and decoding phases
  • GAN (AnoGAN-style): an encoder E that maps input samples x to a latent representation z is learned simultaneously with a generator G and a discriminator D during training.
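
As a rough illustration of the encode/decode idea, the sketch below fits a one-dimensional linear encoder and decoder on normal data only and scores new points by reconstruction error. It is a deliberately simplified linear stand-in (the first principal direction, found by power iteration) for the neural auto-encoder in the paper. Using reconstruction error as the anomaly score is a common convention that I am assuming here, and the names and data are hypothetical.

```python
import math


def fit_linear_autoencoder(rows, iters=50):
    """Fit a 1-D linear encoder/decoder; returns (mean, direction)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    v = [1.0] * d  # start vector; assumes the data is not orthogonal to it
    for _ in range(iters):
        # Power iteration on X^T X: v <- X^T (X v), then normalize.
        proj = [sum(a * b for a, b in zip(r, v)) for r in centered]
        w = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(x, mean, v):
    """Squared error between x and its encode-then-decode reconstruction."""
    centered = [a - m for a, m in zip(x, mean)]
    code = sum(a * b for a, b in zip(centered, v))  # encode to one latent number
    recon = [code * b for b in v]                   # decode back to input space
    return sum((a - b) ** 2 for a, b in zip(centered, recon))


# Hypothetical "normal" data lying on the line y = 2x.
normal = [(float(i), 2.0 * i) for i in range(-5, 6)]
mean, direction = fit_linear_autoencoder(normal)

err_normal = reconstruction_error((3.0, 6.0), mean, direction)   # follows the pattern
err_anomaly = reconstruction_error((3.0, -6.0), mean, direction)  # breaks the pattern
```

A point that follows the learned pattern reconstructs almost perfectly, while one that breaks it leaves a large error. A threshold on that error then separates the two.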

Paper #3 - xFraud: Explainable Fraud Transaction Detection

Key Notes

  • Fraudster user detection
  • Fraud transaction detection
  • Methods that do not need meta-paths to be defined a priori and can instead learn these patterns automatically using a GNN.
  • xFraud detector: inspired by Transformer [39] and HGT [18], the detector's design includes heterogeneous mutual attention and heterogeneous message passing with key, value, and query vector operations (a self-attention mechanism).
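
The key, value, and query operations mentioned above can be sketched as plain scaled dot-product attention. This is only the generic mechanism. Whatever type-specific machinery xFraud adds for heterogeneous nodes and edges is not reproduced here, and the function names and vectors are illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(q . k / sqrt(d)) weighted sum of values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# One target node (query) attending over two neighbours (keys and values).
# The keys are identical, so both neighbours receive equal weight.
out = attention(
    queries=[[1.0, 1.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

With identical keys the output is simply the average of the two value vectors. In a graph setting the values would be messages from neighbouring nodes, and the weights would decide how much each neighbour contributes.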

Paper #4 - TitAnt: Online Real-time Transaction Fraud Detection in Ant Financial

Key Notes

  • Rule-based methods have been studied extensively over the years [46] for the fraud detection problem
  • Several unsupervised learning and anomaly detection methods are introduced
  • Recurrent neural networks exploit the temporal information in account behavior
  • Anomaly detection methods, such as isolation forest, shed light on fraud detection tasks



Paper #5 - A Comprehensive Survey on Machine Learning Techniques and User Authentication Approaches for Credit Card Fraud Detection

Key Notes

  • A combination of Hidden Markov Model (HMM) and K-Means algorithms was used in (Kumari and Choubey, 2017) to identify fraudulent activity on credit cards
  • A transaction is considered suspicious if its distance to the center of the cluster exceeds a pre-set threshold
  • Self-Organizing Map (SOM) is an unsupervised neural network learning model, which has been used to form customer profiles and visualize fraudulent patterns
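
The distance-to-center rule above is simple enough to sketch directly. The features, the threshold value, and the function names below are hypothetical. In practice the center would come from a clustering step such as K-Means rather than from a hand-picked group.

```python
import math


def centroid(points):
    """Mean of a group of points, used here as the cluster center."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def is_suspicious(transaction, center, threshold):
    """Flag a transaction whose distance to the cluster center exceeds the threshold."""
    return math.dist(transaction, center) > threshold


# Hypothetical customer history: (amount, number of items).
history = [(10.0, 1.0), (12.0, 1.0), (11.0, 2.0), (11.0, 0.0)]
center = centroid(history)  # (11.0, 1.0)

typical = is_suspicious((11.5, 1.0), center, threshold=3.0)
unusual = is_suspicious((40.0, 1.0), center, threshold=3.0)
```

The first transaction lies 0.5 from the center and is not flagged, while the second lies 29 away and is.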

Paper #6 - A Survey of Credit Card Fraud Detection Techniques: Data and Technique Oriented Perspective

Key Notes

  • A Hidden Markov Model is a doubly embedded stochastic process that can model much more complicated stochastic processes than a traditional Markov model
  • Genetic algorithms have been used in data mining tasks mainly for feature selection.
  • A Bayesian network is a graphical model that represents conditional dependencies among random variables. The underlying graphical model is a directed acyclic graph
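
"Doubly embedded" means a hidden Markov chain of states sits underneath a second random process that emits the observations. A minimal sketch of the forward algorithm shows how the two layers combine into the likelihood of an observed sequence. The states, the observation symbols, and every probability below are made-up values for illustration, not taken from the survey.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under an HMM (forward algorithm)."""
    n = len(start)
    # Initialization: start in state s and emit the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        # Recursion: move between hidden states, then emit the next observation.
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


# Two hypothetical hidden spending profiles; observations: 0 = low amount, 1 = high amount.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

p_low = forward_likelihood([0], start, trans, emit)  # 0.6 * 0.9 + 0.4 * 0.2 = 0.62
p_all = sum(
    forward_likelihood([a, b], start, trans, emit) for a in (0, 1) for b in (0, 1)
)  # the probabilities of all length-2 sequences sum to 1
```

One way such a model might be used for fraud screening is to flag a cardholder's new transaction sequence when its likelihood under their trained HMM is unusually low.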


More Reads

Keep Exploring!!!
