Anti-Money Laundering and Fraud Prevention with MongoDB Vector Search and OpenAI (AI Reading Summary)

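Baoyue reading guide summary

Keywords: Anti-Money Laundering, Fraud Prevention, Vector Search, OpenAI, MongoDB

Summary: This article discusses fraud and anti-money laundering, points out the limitations of traditional methods, and introduces the role of vector search in improving detection, including its stages of development and its applications in fraud and AML. It emphasizes the advantages of combining MongoDB with vector search and OpenAI, such as eliminating manual feature engineering and dynamically incorporating new data sources.

Main points:

– Fraud and anti-money laundering are major concerns for businesses and consumers, and traditional methods have limitations.

– The Risk 1.0 stage relied on manual and rule-based systems, whose rules were static and lacked adaptability.

– The Risk 2.0 stage applied machine learning and predictive modeling, but faced problems such as feature engineering overhead and lack of context.

– The Risk 3.0 stage is driven by vector search and can address the earlier limitations.

– The role of MongoDB Atlas Vector Search

– It is an effective vector database that, combined with real-time analytics and OpenAI embeddings, helps uncover insights that are hard to obtain with traditional methods.

– Separate vector embeddings are created for fraud and for AML, and transactions are analyzed based on those embeddings.

– Advantages of MongoDB for anti-money laundering and fraud prevention

– It is a unified data platform that needs no dedicated vector database and can handle all kinds of data.

– Combined with vector search, it can be used to build intelligent applications, optimize resources, and more.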

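Article URL: https://www.mongodb.com/blog/post/anti-money-laundering-fraud-prevention-mongodb-vector-search-openai

Source: mongodb.com

Author: Ainhoa Múgica

Published: 2024/7/31 13:59

Language: English

Total word count: 1529

Estimated reading time: 7 minutes

Score: 90

Tags: fraud detection, anti-money laundering (AML), MongoDB Atlas Vector Search, OpenAI, machine learning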

Original article content follows

Fraud and anti-money laundering (AML) are major concerns for both businesses and consumers, affecting sectors like financial services and e-commerce. Traditional methods of tackling these issues, including static, rule-based systems and predictive artificial intelligence (AI) methods, work but have limitations, such as a lack of context and the feature engineering overhead of keeping models relevant, which can be time-consuming and costly.

Vector search can significantly improve fraud detection and AML efforts by addressing these limitations, representing the next step in the evolution of machine learning for combating fraud. Any organization that is already benefiting from real-time analytics will find that this breakthrough in anomaly detection takes fraud and AML detection accuracy to the next level.

In this post, we examine how real-time analytics powered by Atlas Vector Search enables organizations to uncover deeply hidden insights before fraud occurs.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

The evolution of fraud and risk technology

Over the past few decades, fraud and risk technology have evolved in stages, with each stage building upon the strengths of previous approaches while also addressing their weaknesses:

Risk 1.0: In the early stages (the late 1990s to 2010), risk management relied heavily on manual processes and human judgment, with decision-making based on intuition, past experiences, and limited data analysis. Rule-based systems emerged during this time, using predefined rules to flag suspicious activities. These rules were often static and lacked adaptability to changing fraud patterns.
Risk 2.0: With the evolution of machine learning and advanced analytics (from 2010 onwards), risk management entered a new era, Risk 2.0. Predictive modeling techniques were employed to forecast future risks and detect fraudulent behavior. Systems were trained on historical data and became more integrated, allowing for real-time data processing and the automation of decision-making processes. However, these systems faced limitations such as:

Feature engineering overhead: Risk 2.0 systems often require manual feature engineering.
Lack of context: Risk 1.0 and Risk 2.0 may not incorporate a wide range of variables and contextual information.

Risk 2.0 solutions are often used in combination with rule-based approaches because rules cannot be avoided. Companies have their business- and domain-specific heuristics and other rules that must be applied.

Risk 3.0: The latest stage (2023 and beyond) in fraud and risk technology evolution is driven by vector search. This advancement leverages real-time data feeds and continuous monitoring to detect emerging threats and adapt to changing risk landscapes, addressing the limitations of data imbalance, manual feature engineering, and the need for extensive human oversight while incorporating a wider range of variables and contextual information.

Depending on the particular use case, organizations can use these solutions individually or in combination to effectively manage and mitigate the risks associated with fraud and AML.

Now, let us look into how MongoDB Atlas Vector Search (Risk 3.0) can help enhance existing fraud detection methods.

How Atlas Vector Search can help

A vector database is an organized collection of information that makes it easier to find similarities and relationships between different pieces of data. By this definition, MongoDB is particularly well positioned, because it removes the need for a standalone or bolt-on vector database. The versatility of MongoDB’s developer data platform empowers users to store their operational data, metadata, and vector embeddings on MongoDB Atlas and seamlessly use Atlas Vector Search to index, retrieve, and build performant gen AI applications.

The combination of real-time analytics and vector search offers a powerful synergy that enables organizations to discover insights that are otherwise elusive with traditional methods. MongoDB facilitates this through Atlas Vector Search integrated with OpenAI embedding, as illustrated in Figure 1 below.

Figure 1: Atlas Vector Search in action for fraud detection and AML

Business perspective: Fraud detection vs. AML

Understanding the distinct business objectives and operational processes driving fraud detection and AML is crucial before diving into the use of vector embeddings.

Fraud detection is centered on identifying unauthorized activities aimed at immediate financial gain through deceptive practices. The detection models, therefore, look for specific patterns in transactional data that indicate such activities. For instance, they might focus on high-frequency, low-value transactions, which are common indicators of fraudulent behavior.

AML, on the other hand, targets the complex process of disguising the origins of illicitly gained funds. The models here analyze broader and more intricate transaction networks and behaviors to identify potential laundering activities. For instance, AML could look at the relationships between transactions and entities over a longer period.

Creation of Vector Embeddings for Fraud and AML

Fraud and AML models require different approaches because they target distinct types of criminal activities. To accurately identify these activities, machine learning models use vector embeddings tailored to the features of each type of detection.

In the solution shown in Figure 1, vector embeddings for fraud detection are created from a combination of text, transaction, and counterparty data. The embeddings for AML, by contrast, are generated from data on transactions, relationships between counterparties, and their risk profiles. The selection of data sources, including the use of unstructured data and the creation of one or more vector embeddings, can be customized to meet specific needs. This particular solution uses OpenAI to generate vector embeddings, though other software options can also be used.

Historical vector embeddings are representations of past transaction data and customer profiles encoded into a vector format. The demo database is prepopulated with synthetically generated test data for both fraud and AML embeddings. In real-world scenarios, you can create embeddings by encoding historical transaction data and customer profiles as vectors.
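The encoding step described above can be sketched in Python. This is a minimal illustration, not the solution's actual code: the transaction field names, the `build_fraud_text` and `embed_history` helpers, the `toy_embed` stand-in, and the choice of the `text-embedding-3-small` model are all assumptions, since the article does not specify them. The embedder is passed in as a callable so the same code works with the OpenAI client or with an offline stand-in.

```python
from typing import Callable, Iterable, Sequence


def build_fraud_text(txn: dict) -> str:
    """Aggregate text, transaction, and counterparty fields into one string.

    The field names are hypothetical; adapt them to your own schema.
    """
    return (
        f"Transaction of {txn['amount']:.2f} {txn['currency']} "
        f"from {txn['sender']} to {txn['counterparty']} "
        f"via {txn['channel']}. Note: {txn['description']}"
    )


def openai_embed(text: str, model: str = "text-embedding-3-small") -> list[float]:
    """Embed text with the OpenAI Python client (requires OPENAI_API_KEY)."""
    from openai import OpenAI  # imported lazily so the rest runs without it

    client = OpenAI()
    response = client.embeddings.create(model=model, input=text)
    return response.data[0].embedding


def toy_embed(text: str) -> list[float]:
    """Deterministic stand-in for offline experiments; not a real model."""
    return [float(len(text)), float(text.count(" ")), float(sum(map(ord, text)) % 97)]


def embed_history(
    txns: Iterable[dict], embed: Callable[[str], Sequence[float]]
) -> list[dict]:
    """Return copies of the transactions with an 'embedding' field added."""
    return [{**t, "embedding": list(embed(build_fraud_text(t)))} for t in txns]


history = [
    {
        "amount": 12.5,
        "currency": "EUR",
        "sender": "ACC-001",
        "counterparty": "ACC-XYZ",
        "channel": "mobile",
        "description": "gift card top-up",
        "flagged": True,
    }
]
docs = embed_history(history, toy_embed)  # swap in openai_embed for real vectors
```

Passing `openai_embed` instead of `toy_embed` would produce real embeddings, at the cost of one API call per transaction in this simple form.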

In the fraud and AML detection workflow shown in Figure 1, aggregated fraud and AML text for each incoming transaction is used to generate embeddings with OpenAI. Atlas Vector Search then retrieves previous transactions with similar embeddings, and the incoming transaction is assessed based on the percentage of those similar transactions that were flagged for suspicious activity.
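A rough sketch of this lookup, assuming a collection whose documents carry an `embedding` array and a boolean `flagged` field: the index name, field names, and the `numCandidates` heuristic below are illustrative assumptions, not details from the article. The `$vectorSearch` stage and the `vectorSearchScore` metadata are part of the Atlas aggregation syntax; the helper functions are hypothetical.

```python
from typing import Iterable, Sequence


def similar_transactions_pipeline(
    query_vector: Sequence[float],
    k: int = 10,
    index_name: str = "fraud_vector_index",  # hypothetical index name
    path: str = "embedding",  # hypothetical field holding the vectors
) -> list[dict]:
    """Build an aggregation pipeline that returns the k nearest transactions."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": k * 10,  # heuristic: oversample, then keep k
                "limit": k,
            }
        },
        {
            "$project": {
                "_id": 0,
                "flagged": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def flagged_percentage(neighbours: Iterable[dict]) -> float:
    """Percentage of similar past transactions that were flagged."""
    neighbours = list(neighbours)
    if not neighbours:
        return 0.0
    flagged = sum(1 for n in neighbours if n.get("flagged"))
    return 100.0 * flagged / len(neighbours)


pipeline = similar_transactions_pipeline([0.1, 0.2, 0.3], k=5)

# With PyMongo the results would come from collection.aggregate(pipeline);
# here a hand-written sample stands in for the query results.
sample_results = [
    {"flagged": True, "score": 0.97},
    {"flagged": True, "score": 0.95},
    {"flagged": False, "score": 0.91},
    {"flagged": True, "score": 0.90},
]
share = flagged_percentage(sample_results)
```

Keeping the pipeline builder separate from the scoring function makes the scoring logic easy to check without a database connection.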

In Figure 1, the term “Classified Transaction” indicates a transaction that has been processed and categorized by the detection system. This classification helps determine whether the transaction is considered normal, potentially fraudulent, or indicative of money laundering, thus guiding further actions.

If flagged for fraud: The transaction request is declined.
If not flagged: The transaction is completed successfully, and a confirmation message is shown.

For rejected transactions, users can contact case management services with the transaction reference number for details. No action is needed for successful transactions.
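The classification and response steps above could look roughly like the following. The 50 percent threshold, the label strings, and the message wording are assumptions made for illustration; the article states only that fraud-flagged transactions are declined and unflagged ones complete, so the sketch leaves the handling of AML-classified transactions to the caller.

```python
def classify(fraud_pct: float, aml_pct: float, threshold: float = 50.0) -> str:
    """Label a transaction from the share of similar flagged transactions.

    Inputs are percentages (0 to 100); the threshold is an assumed value.
    """
    if fraud_pct >= threshold:
        return "potentially fraudulent"
    if aml_pct >= threshold:
        return "potential money laundering"
    return "normal"


def respond(flagged_for_fraud: bool, reference: str) -> str:
    """Return the user-facing outcome for a transaction request."""
    if flagged_for_fraud:
        # Declined requests keep their reference number so the user can
        # follow up with case management.
        return f"Declined. Contact case management with reference {reference}."
    return "Transaction completed successfully."


label = classify(fraud_pct=75.0, aml_pct=10.0)
message = respond(label == "potentially fraudulent", reference="TXN-0001")
```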

Combining Atlas Vector Search and OpenAI embeddings for fraud detection

By using Atlas Vector Search with OpenAI embeddings, organizations can:

Eliminate the need for batch and manual feature engineering required by predictive (Risk 2.0) methods.
Dynamically incorporate new data sources to perform more accurate semantic searches, addressing emerging fraud trends.
Adopt this method for mobile solutions, as traditional methods are often costly and performance-intensive.

Why MongoDB for AML and fraud prevention

Fraud and AML detection require a holistic platform approach as they involve diverse data sets that are constantly evolving. Customers choose MongoDB because it is a unified data platform (as shown in Figure 2 below) that eliminates the need for niche technologies, such as a dedicated vector database.

What’s more, MongoDB’s document data model incorporates any kind of data—any structure (structured, semi-structured, and unstructured), any format, any source—no matter how often it changes, allowing you to create a holistic picture of customers to better predict transaction anomalies in real time.

By incorporating Atlas Vector Search, institutions can:

Build intelligent applications powered by semantic search and generative AI over any type of data.
Store vector embeddings right next to your source data and metadata. Vectors inserted or updated in the database are automatically synchronized to the vector index.
Optimize resource consumption, improve performance, and enhance availability with Search Nodes.
Remove operational heavy lifting with the battle-tested, fully managed MongoDB Atlas developer data platform.
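To make "embeddings next to source data" concrete, here is a small sketch of a transaction document that carries its own embedding, together with a vector index definition in the Atlas Vector Search format. The field names and helper functions are hypothetical, and the 1536 dimensions are an assumption that matches the default output size of OpenAI's `text-embedding-3-small`; use whatever your embedding model produces.

```python
EMBEDDING_DIMENSIONS = 1536  # assumption: default size of text-embedding-3-small


def vector_index_definition(
    path: str = "embedding",
    dims: int = EMBEDDING_DIMENSIONS,
    similarity: str = "cosine",
) -> dict:
    """Definition for an Atlas Vector Search index over one vector field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dims,
                "similarity": similarity,
            }
        ]
    }


def transaction_document(txn: dict, embedding: list, flagged: bool) -> dict:
    """Keep operational data, metadata, and the embedding in one document."""
    return {**txn, "flagged": flagged, "embedding": list(embedding)}


definition = vector_index_definition()
doc = transaction_document(
    {"amount": 12.5, "currency": "EUR", "counterparty": "ACC-XYZ"},
    embedding=[0.0] * EMBEDDING_DIMENSIONS,
    flagged=False,
)
# With PyMongo, a definition like this is typically passed to
# Collection.create_search_index (assumption: a driver version that
# supports vector search indexes), and doc goes in via insert_one.
```

Because the vector lives in the same document as the transaction fields, a similarity query can return the flag and other metadata without a second lookup.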

Figure 2: Unified risk management and fraud detection data platform

Given the broad and evolving nature of fraud detection and AML, these areas typically require multiple methods and a multimodal approach. Therefore, a unified risk data platform offers several advantages for organizations that are aiming to build effective solutions. Using MongoDB, you can develop solutions for Risk 1.0, Risk 2.0, and Risk 3.0, either separately or in combination, tailored to meet your specific business needs.

The concepts are demonstrated with two examples: a card fraud solution accelerator for Risk 1.0 and Risk 2.0 and a new Vector Search solution for Risk 3.0, as discussed in this blog. It’s important to note that the vector search-based Risk 3.0 solution can be implemented on top of Risk 1.0 and Risk 2.0 to enhance detection accuracy and reduce false positives.
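One way to picture layering Risk 3.0 on top of the earlier approaches is a decision function that consults all three signals. The policy below is purely hypothetical (the article does not describe how the layers are combined), as are the `Signals` type and both thresholds: a rule hit declines outright, while the model score and the vector-search evidence must agree before a decline, which is one possible way to cut false positives.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    rule_hit: bool  # Risk 1.0: a static business rule fired
    model_score: float  # Risk 2.0: predicted fraud probability, 0 to 1
    similar_flagged_pct: float  # Risk 3.0: flagged share of similar past transactions, 0 to 100


def layered_decision(
    s: Signals,
    model_threshold: float = 0.8,  # assumed value
    vector_threshold: float = 50.0,  # assumed value
) -> str:
    """Combine rule, model, and vector-search evidence into one outcome."""
    if s.rule_hit:
        return "decline"  # mandatory business rules always apply
    model_high = s.model_score >= model_threshold
    vector_high = s.similar_flagged_pct >= vector_threshold
    if model_high and vector_high:
        return "decline"  # both layers agree
    if model_high or vector_high:
        return "review"  # layers disagree: send to manual review
    return "approve"


outcome = layered_decision(
    Signals(rule_hit=False, model_score=0.9, similar_flagged_pct=20.0)
)
```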

If you would like to discover more about how MongoDB can help you supercharge your fraud detection systems, take a look at the following resources:

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the DeepLearning.AI course “Prompt Compression and Query Optimization” for free today.
