Make Data Governance Automation Suck Less With a Supergraph: AI Reading Summary | 包阅AI

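包阅AI Reading Guide and Summary

1. Keywords: data governance, Supergraph, GraphQL, data mesh, automation

2. Summary: This article discusses why data governance is difficult, introduces the role of the Supergraph and GraphQL in improving it, highlights the Supergraph's advantages and value, and argues that using its architecture to achieve data governance goals is strategically significant and delivers a return on investment.

3. Main points:

– Data governance is usually difficult, costly and unsatisfying, and it faces many challenges, such as keeping up with regulatory changes.

– A Supergraph is a technology with a specific architecture and operating model that helps design, build, maintain and operate data domains.

– It combines an operating model and an architectural pattern, and it can improve data delivery, including data governance.

– In a typical data mesh pattern, the Supergraph pattern links APIs and data; strong API governance combined with the Supergraph's metadata is critical.

– GraphQL is at the core of the Supergraph paradigm and, if managed properly, can provide value for data governance.

– The Supergraph architecture has many advantages; using it to achieve data governance goals is a strategic move that reduces complexity and cost and improves metadata accuracy.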

Article URL: https://thenewstack.io/make-data-governance-automation-suck-less-with-a-supergraph/

Source: thenewstack.io

Author: Ken Stott

Published: 2024/8/5 19:39

Language: English

Total word count: 1170

Estimated reading time: 5 minutes

Score: 89

Tags: data governance, Supergraph, GraphQL, API Governance, automation


The original article follows

When it comes to data governance, my colleague Tina Sebert has channeled the fiery and controversial Fight Club character Tyler Durden. (And before you ask, no, she wasn’t punching co-workers in the face.) She said: “The first rule of data governance is: Don’t talk about data governance.”

Why? Because it is notoriously difficult, expensive and not always satisfying. It can sometimes feel like filing your state and federal taxes, but year-round.

Why Data Governance Usually Sucks

Data governance involves managing an organization’s data through technology, processes and people. It ensures that data is secure, accurate, available and suitable for decision-making. It also involves adhering to enterprise policies and relevant laws, rules and regulations. One significant challenge is keeping up with ever-changing legal and regulatory requirements and the organization’s evolving data environment.

Data governance is a tough job — even if you just stop there.

However, a successful data governance team also understands how data enters the organization, who owns it, how it is transformed, where it is stored, how it is transported and who consumes it and why.

There are a lot of moving parts, a lot of opportunity for error and steep consequences for not getting it right. (Not to mention a consistent mandate from management to find faster and more cost-effective ways to manage all of this.)

So yes, the first rule of data governance is: Don’t talk about data governance. And the second rule of data governance is: Don’t talk about data governance.

In my experience, leveraging a Supergraph makes addressing these concerns more manageable while providing clear business value. In other words, focus less on compliance and more on the business value of data governance.

What Is a Supergraph?

Supergraphs aren’t brand new to the data landscape but they are still relatively unknown. A Supergraph has two dimensions: an architecture and an operating model that facilitates designing, building, maintaining and operating a collection of data domains as a unified graph of composable entities and operations.

(If you’re curious to learn more, Tanmai Gopal, Hasura CEO and co-founder, is a vocal champion of the Supergraph and synthesizes the architecture and its benefits really well in The Supergraph Manifesto.)

The Supergraph pattern combines an operating model and an architecture to produce a powerful, virtuous cycle that you can leverage to improve many aspects of data delivery, including data governance.

A typical data mesh pattern that incorporates the Supergraph pattern might look like this:

[Figure: A typical data mesh pattern that incorporates the Supergraph pattern]

Although these domains are illustrated using a medallion architecture, there is sufficient freedom to architect and optimize to local needs, including using a Supergraph architecture, data lakehouse, data warehouse, virtual databases, or other means to define and manage data products.

APIs play a prominent and critical role in an organization’s data consumption and, by proxy, are essential in ensuring your data governance is sound.

Countless articles on data mesh architectures make the same point: The Supergraph pattern links APIs and data. Strong API governance, combined with harvesting and using the metadata produced by the Supergraph, is critical to getting insight into consumption, establishing feedback loops and developing self-correcting processes.
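To make the "harvest the metadata" idea concrete, here is a minimal sketch of turning API usage records into consumption insight. Everything in it is an assumption for illustration: the log record shape, the consumer names, the field names and the `summarize_usage` helper are hypothetical and do not come from any particular Supergraph product.

```python
from collections import Counter, defaultdict

# Hypothetical access-log records; in practice these would come from the
# usage logging of whatever API layer you run. All names are illustrative.
ACCESS_LOG = [
    {"consumer": "billing-app", "field": "Customer.email"},
    {"consumer": "billing-app", "field": "Order.total"},
    {"consumer": "marketing-dash", "field": "Customer.email"},
    {"consumer": "billing-app", "field": "Order.total"},
]

# Fields the data domains publish (assumed to be known from the schema).
PUBLISHED_FIELDS = {"Customer.email", "Customer.phone", "Order.total"}


def summarize_usage(log):
    """Count requests per field and collect the consumers of each field."""
    field_counts = Counter(rec["field"] for rec in log)
    consumers_by_field = defaultdict(set)
    for rec in log:
        consumers_by_field[rec["field"]].add(rec["consumer"])
    return field_counts, dict(consumers_by_field)


field_counts, consumers_by_field = summarize_usage(ACCESS_LOG)

# Published fields that nobody requested: candidates for review or retirement.
unused_fields = sorted(PUBLISHED_FIELDS - set(field_counts))
```

A report like this is one possible starting point for a feedback loop: it shows who consumes which data, and which published fields nobody uses.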

So, by my math, if you want good data governance, you need a good understanding of your data access and consumption, which means you better have governance over your API production, maintenance and consumption.

To run with an analogy, if data governance is heart health for your business, then data is your bloodline and APIs are your arteries. Do you want heart health? You had better think about your cholesterol, eat your Cheerios, and pay closer attention to your APIs.

Good data governance, good API governance and strong links between their platforms are essential in creating a robust, secure digital foundation.

Enter GraphQL to Accelerate Data Governance Automation

At the core of the Supergraph paradigm is a unified semantic layer, expressed in GraphQL SDL, combined with tooling that secures access to that model and provides observability and usage logging.
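As a rough illustration of why a model expressed in GraphQL SDL is both human- and machine-readable, the sketch below derives a small field catalog from an SDL string. The `Customer` and `Order` types are hypothetical, and the regex-based reader is a deliberate simplification that only handles plain `type X { field: T }` blocks; a real GraphQL parser would be the better choice in practice.

```python
import re

# Hypothetical SDL for two data domains; type and field names are illustrative.
SDL = """
type Customer {
  id: ID!
  email: String!
  orders: [Order!]!
}

type Order {
  id: ID!
  total: Float!
  customer: Customer!
}
"""

# Matches "type Name { ... }" blocks without nested braces.
TYPE_BLOCK = re.compile(r"type\s+(\w+)\s*\{([^}]*)\}")
# Matches "fieldName: FieldType" at the start of a line inside a block.
FIELD_LINE = re.compile(r"^\s*(\w+)\s*:\s*([^\s#]+)", re.MULTILINE)


def build_catalog(sdl: str) -> dict:
    """Map each object type to its fields and their declared GraphQL types.

    Simplified on purpose: no arguments, directives, interfaces or
    descriptions are supported.
    """
    catalog = {}
    for type_name, body in TYPE_BLOCK.findall(sdl):
        catalog[type_name] = dict(FIELD_LINE.findall(body))
    return catalog


catalog = build_catalog(SDL)
```

Under these assumptions, `catalog["Customer"]` maps `id`, `email` and `orders` to their declared types, which is the kind of metadata a governance tool could index.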

The value of GraphQL is sometimes associated with the value of the existing runtime engines and transport protocols, such as the Apollo Server on HTTP and Node.js. However, the GraphQL standard is technology-agnostic. Its design allows delivery on other transports, languages and environments, which means it can cater to new use cases with different connectivity, data volume or performance requirements.

GraphQL sometimes gets a bad rap, but that’s because organizations think of it as a tactic rather than a strategy. In the context of data governance, it can be very valuable. If managed well, it can be the Rosetta Stone for your data, providing platform-agnostic data definitions and relationships that can be the contract between data producers and data consumers, as well as the metadata used to operate your data access platform.
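One way to picture the "contract" role is a check that compares two versions of a schema's metadata before a producer ships a change. The sketch below is hypothetical: the `V1` and `V2` dictionaries and the `breaking_changes` helper are invented for illustration, and the check is intentionally conservative, flagging every changed field type even though some type changes are harmless to consumers.

```python
def breaking_changes(old, new):
    """List removed types, removed fields and changed field types.

    `old` and `new` map type names to {field name: declared type}.
    Conservative by design: any difference in a field's type is reported.
    """
    problems = []
    for type_name, old_fields in old.items():
        new_fields = new.get(type_name)
        if new_fields is None:
            problems.append(f"type removed: {type_name}")
            continue
        for field, old_type in old_fields.items():
            if field not in new_fields:
                problems.append(f"field removed: {type_name}.{field}")
            elif new_fields[field] != old_type:
                problems.append(
                    f"type changed: {type_name}.{field} "
                    f"{old_type} -> {new_fields[field]}"
                )
    return problems


# Hypothetical schema metadata before and after a producer's change.
V1 = {
    "Customer": {"id": "ID!", "email": "String!"},
    "Order": {"id": "ID!", "total": "Float!"},
}
V2 = {
    "Customer": {"id": "ID!", "email": "String", "phone": "String"},
    "Order": {"id": "ID!"},
}

changes = breaking_changes(V1, V2)
```

Adding `Customer.phone` is not reported because new fields do not break existing consumers, while the removed `Order.total` and the changed `Customer.email` are.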

But wait. Isn’t GraphQL hard? That’s why it has the reputation it has. Here is my pro tip (and if I were a consultant, I could charge you a lot of money for this advice): Automate your GraphQL. This allows you to quickly build a composable data access layer and create the backbone of an automated data governance strategy.

The GraphQL tooling landscape has quietly become very advanced, and many of those tools allow you to reap GraphQL’s benefits without the usual building pains or need for developer expertise. (I use Hasura, but there are plenty of other options in the market; that’ll cost you another imaginary consultancy fee.) These tools enabled us to simplify the development and operation of GraphQL endpoints from existing data sources and quickly achieve a composable data layer, aka a supergraph.

The ‘Sucks Less’ Part

The Supergraph architecture has four important things going for it:

  1. It uses a standard that emphasizes establishing a composable unified semantic layer based on a human- and machine-readable metadata format that aligns with data governance objectives and facilitates automation.
  2. It has a savvy standards committee that thoughtfully refines the underlying standard without hindering the creativity of practitioners.
  3. It has a vibrant vendor and development community that churns out creative new tooling and advancements at a rapid pace.
  4. It’s needed. Data mesh architecture makes sense because it aligns with the real world. That requires a way to organize data products, govern access and satisfy self-service requirements, which Supergraph provides.

GraphQL is resilient enough, specific enough and open-ended enough to be a Goldilocks standard. That balance gives Supergraph the potential to sit at the top of the data-delivery stack, incorporating virtually all data transport mechanisms beneath it, including REST APIs described with OpenAPI and remote procedure call (RPC) frameworks such as tRPC or gRPC.

Maybe You Can Talk About Data Governance

What does all this mean? Leveraging the Supergraph architecture to achieve data governance objectives is a strategic play that delivers a solid return on investment. It reduces complexity, lowers costs and improves metadata accuracy. It also allows people who find themselves in the data governance club to talk about data governance — with a solid business case and real payback.
