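Baoyue Reading Guide Summary
1. Keywords: Microsoft Fabric, Real-Time Data, Analytics, Enhancements, Data Intelligence
2. Summary:
Microsoft’s Fabric end-to-end platform aims to remove customer friction from all aspects of data analytics. At its Build developer conference, Microsoft announced significant enhancements to Fabric’s real-time data processing and analytics capabilities, including consolidated components and connectivity to new data sources, making the platform more event-driven and its real-time capabilities more accessible.
3. Main points:
– Microsoft Fabric aims to remove customer friction from data analytics
– At the Build developer conference, Microsoft announced a preview of enhancements to real-time data processing and analytics
– Components are merged: Synapse Real-Time Analytics and Data Activator become the unified Real-Time Intelligence component
– New streaming data sources can be connected, including sources on Amazon Web Services and Google Cloud
– New real-time dashboards and visual data exploration are introduced
– The Real-Time hub is introduced, making streaming data sources easier to discover and integrate
– Individual components are improved and more tightly integrated, making the platform more event-driven
– Streaming data ingest and processing are improved, with ingestion from more sources
– Data discoverability is enhanced
– Analysis is improved, and Event house technology is generally available
– Visualization and no-code exploration are provided through real-time dashboards, interactive exploration and Copilot
– Data-driven triggers and alerts are easier to create and can kick off more units of work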
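Article URL: https://thenewstack.io/microsoft-fabric-goes-all-in-on-real-time-data-intelligence/
Source: thenewstack.io
Author: Andrew Brust
Published: 2024/6/23 22:06
Language: English
Word count: 1786
Estimated reading time: 8 minutes
Score: 91
Tags: cloud services, data
The original article follows.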
With its Fabric end-to-end platform, Microsoft has been working hard to remove the friction for its customers from all aspects of data analytics.
In that pursuit, at its Build developer conference in Seattle today, the company announced the public preview of significant enhancements to Fabric’s real-time data processing and analytics capabilities.
These enhancements include the merging of Synapse Real-Time Analytics and Data Activator into the unified Real-Time Intelligence component; connectivity to a range of new streaming data sources, including those on Amazon Web Services and Google Cloud; new real-time dashboards and accompanying visual data exploration; and the introduction of the Real-Time hub, which makes streaming data sources more discoverable and facilitates their integration with Fabric Lakehouses to address the disconnect between data-in-motion and data-at-rest.
A Step Further
Fabric’s Synapse Real-Time Analytics workload (module) has already done a lot to make real-time analytics easier, and more integrated with batch analytics, BI and machine learning. That module’s powerful Event stream abstraction, along with Fabric’s Data Activator component for data monitoring and alerting, and its KQL database (which has gradually been rebranded as “Event house”) had already created a pretty compelling solution. But each of the aforementioned Fabric components has been somewhat disjoint, and using them together has required some savvy and effort on the part of the customer. Ironically, it’s just that kind of burden on the customer that Fabric aims to eliminate, so the situation has been a bit of an anomaly.
In response, the Fabric team has worked to improve each of these components and integrate them more tightly, requiring less customer effort and expertise to use them together. As a result, the Fabric platform as a whole is now more event-driven, and its real-time capabilities are more accessible to business users and analysts. Microsoft has made improvements in the areas of streaming data ingest and processing; discoverability; analysis; visualization and no-code exploration; and event-driven triggers. Highlights of each area of improvement follow, and I conclude with some thoughts on what all this means for Microsoft’s competitive position in the data analytics arena.
Streaming Data Ingest and Processing
Already a powerful abstraction, Fabric Event streams can now ingest data from Amazon Kinesis Data Streams, Google Pub/Sub and even from Kafka topics on the Confluent Cloud platform, all in addition to the Azure Event Hubs and IoT Hub connectivity they already had. Event streams can now also ingest from a range of Microsoft’s change data capture (CDC) sources, including Azure SQL Database (the cloud implementation of SQL Server), Azure Cosmos DB, and the Azure implementations of the open source MySQL and PostgreSQL databases.
Finally, Event streams can receive event data from Azure Blob Storage (as well as Azure Data Lake Storage) and even from Fabric itself. While the last of these capabilities may seem somewhat niche, it’s actually quite significant. Responding to Fabric events means that the entire Fabric platform can become event-driven, enabling scenarios like ingesting data into the Lakehouse when a new file arrives in cloud storage, or retraining a machine learning model when a Lakehouse is updated.

The new array of streaming data sources available in Fabric’s “Get Events” experience
The functionality within Event streams has itself been enhanced as well. Transforming the data as it arrives is now easier; routing data based on such transformations, or filters, is now also possible. And creating “derived streams” based on the output of these transformations and filters, for further downstream consumption, has become trivial to implement. Event streams now also have distinct Edit and Preview modes. The Edit mode allows development to occur in a quasi-offline fashion that assures Event streams in production won’t be disrupted. Once everything has been sufficiently tested, the new or updated Event stream can be explicitly published.
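To make the filter, transform and route ideas concrete, here is a minimal conceptual sketch in Python. It does not use any Fabric API: the `DerivedStream` class, the `route` function and the sensor readings are hypothetical names invented purely for illustration of how one incoming stream can fan out into several derived streams.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]


@dataclass
class DerivedStream:
    """Hypothetical derived stream: one filter plus one transformation."""
    name: str
    keep: Callable[[Event], bool]         # filter predicate
    transform: Callable[[Event], Event]   # per-event transformation
    events: List[Event] = field(default_factory=list)


def route(source: Iterable[Event], streams: List[DerivedStream]) -> None:
    """Send each incoming event to every derived stream whose filter accepts it."""
    for event in source:
        for stream in streams:
            if stream.keep(event):
                stream.events.append(stream.transform(event))


# Made-up sensor readings standing in for an incoming event stream.
readings = [
    {"device": "pump-1", "temp_c": 71.5},
    {"device": "pump-2", "temp_c": 48.0},
    {"device": "pump-1", "temp_c": 90.2},
]

# Derived stream 1: only hot readings, enriched with a Fahrenheit column.
hot = DerivedStream(
    name="overheating",
    keep=lambda e: e["temp_c"] > 70,
    transform=lambda e: {**e, "temp_f": round(e["temp_c"] * 9 / 5 + 32, 1)},
)

# Derived stream 2: every reading, copied unchanged.
everything = DerivedStream(name="all-readings", keep=lambda e: True, transform=dict)

route(readings, [hot, everything])
print([e["device"] for e in hot.events])
print(len(everything.events))
```

In this sketch a downstream consumer would read only from `hot.events` or `everything.events`, never from the raw source, which is roughly the role a derived stream plays for downstream users.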
Discoverability
The addition of the new Fabric Real-Time hub, alongside the previously implemented OneLake data hub, makes streaming data sources far easier to discover, consume and analyze. Separate lists for data streams, Microsoft sources, and Fabric events are provided. The data streams list includes both default and derived streams from Event streams, as well as tables in Event house databases. By default, these lists include everything to which the user has access, but filtering them is possible. Data streams can be filtered by workspace, owner, type (stream or table) or parent item (Event stream or Event house database) and the Microsoft sources list can be filtered by source type or by Azure subscription, resource group or region.
It bears mention that the ability to create derived streams, then surface them (and endorse them as recommended or certified) to downstream users in the Real-Time hub, essentially makes them available as data products. While doing so was always possible in Fabric with the more static data in Data Lakehouses, this way of sharing streaming data that has already been cleansed, transformed and curated is a powerful implementation of the data mesh methodology, especially when paired with Fabric’s ability to create organizational domains.
Analysis
While most of what’s being announced today is in public preview, Fabric’s Event house technology is now generally available (GA). Event houses are both a rebranding of KQL databases, which are based on Microsoft’s powerful “Kusto” time series database technology, and an enhancement to their functionality and management tooling. Event houses allow multiple KQL databases to be used and managed together, enabling them to be federated, and treated as a kind of partitioning mechanism.
This is especially true since a single pool of compute can serve all of the individual KQL databases in an Event house. Also GA is the ability for Eventhouses/KQL databases to replicate their data into OneLake, allowing all of the other engines in Fabric, including Fabric Data Warehouses, Apache Spark, and Power BI, to query and analyze the accumulated streaming data.
Visualization and No-Code Exploration
But if you are going to stick with the Event house technology, you need a way to visualize the data, explore it and query it in an ad hoc fashion. And given that KQL is a separate query language from SQL, there can be learning curve impediments there. Microsoft aims to overcome these impediments through three new capabilities: real-time dashboards, interactive visual data exploration, and a special Copilot that can generate KQL from natural language questions.
The dashboards are extremely interesting, as they resemble Power BI reports, but are in fact based on different technology. There are a few reasons this makes sense. First, KQL databases (and their Azure Data Explorer and Synapse Analytics Data Explorer pool precursors) have long had the ability to produce their own visualizations — doing so is a built-in primitive of KQL. Taking that capability and extending it to combining multiple tiled, query-based visualizations together into dashboards makes sense. Second, analysis of time series data has its own semantics and demands specialized visualization. While basic ones like bar, column, pie, area, and line charts are part of the mix, so too are specialized viz types like time charts, anomaly charts and stat/multi-stat charts. Meanwhile, like their Power BI counterparts, real-time dashboards support cross-filtering and drillthrough.

A Fabric real-time dashboard, with tiled KQL visualizations
Furthermore, since each visualization in a real-time dashboard is based on a distinct KQL query, it becomes easier to take any one of them in isolation and open it up for iterative tweaking and modification (including the addition of filters, creation of aggregations, and switching of visualization types) without hand-editing the underlying queries. This forms the basis of Fabric Real-Time Intelligence’s visual data exploration. Users can make these tweaks through a user interface, and each modification manifests as a corresponding change to the underlying KQL query. This time series analysis-specific approach simply wouldn’t be possible in today’s Power BI, which is more focused on dimensional aggregation and drill down of tabular data.

Fabric’s real-time data exploration experience, allowing no-code modification of KQL queries
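As a rough illustration of how no-code choices could map onto query text, the Python sketch below assembles a KQL query from a few UI-style selections. The `build_kql` helper, the `SensorReadings` table and its columns are hypothetical. The `where`, `summarize`, `bin()` and `render` pieces are standard KQL, but this is only an assumption about the general idea, not a description of how Fabric generates queries internally.

```python
from typing import List, Optional


def build_kql(table: str,
              filters: Optional[List[str]] = None,
              time_column: str = "Timestamp",
              bin_size: str = "5m",
              aggregation: str = "count()",
              chart: str = "timechart") -> str:
    """Assemble a KQL query string from UI-style selections (hypothetical helper)."""
    lines = [table]
    # Each filter chosen in the UI becomes one `where` clause.
    for condition in filters or []:
        lines.append(f"| where {condition}")
    # The chosen aggregation is bucketed over time with bin().
    lines.append(f"| summarize {aggregation} by bin({time_column}, {bin_size})")
    # The chosen visualization type becomes the `render` operator.
    lines.append(f"| render {chart}")
    return "\n".join(lines)


# Starting tile: event counts per five-minute bucket, shown as a time chart.
base = build_kql("SensorReadings")

# After three "clicks": add a filter, change the aggregation, switch the chart.
tweaked = build_kql(
    "SensorReadings",
    filters=['Device == "pump-1"'],
    aggregation="avg(TempC)",
    chart="columnchart",
)
print(tweaked)
```

Comparing `base` with `tweaked` shows the point made above: each visual change corresponds to one predictable change in the query text.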
If that’s still not good enough, Microsoft is launching a Copilot for Real-Time Intelligence, which is smart enough to take natural language (“plain English”) questions and produce KQL queries from them that can be pasted into a KQL Queryset editor and executed. This query-generation approach has the side effect of teaching KQL by example to less-technical users, enabling the power users among them to learn the language and eventually to write those queries from scratch, should they feel interested and able.
Triggers
The last piece to the Fabric Real-Time Intelligence puzzle is the ability to create data-driven triggers and alerts much more easily than before. Rather than having to go to the Data Activator user interface, Fabric users can create triggers and alerts in context, right as they’re editing streams and dashboard tiles or while they’re in the Real-Time hub. Each such action will create a new Fabric Reflex object in the workspace, and these objects can do more than before. In addition to the pre-existing ability to send alerts as emails or messages in Teams, triggers can now kick off true units of work, including data pipelines, notebooks, and Spark job definitions. That means all of these executable packages graduate from just running on-demand or on a scheduled basis to being able to run on an event-driven basis too.
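The condition-then-action pattern behind such triggers can be sketched in a few lines of Python. This is a conceptual model only, not the Reflex or Data Activator API: the `Trigger` class, the `dispatch` function, the two action functions and the temperature thresholds are all hypothetical, and the actions merely return strings where a real system would send a message or start a job.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class Trigger:
    """Hypothetical trigger: run `action` whenever `condition` holds for an event."""
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], str]


def send_alert(event: Event) -> str:
    # Stand-in for sending an email or a Teams message.
    return f"alert: {event['device']} at {event['temp_c']} C"


def start_pipeline(event: Event) -> str:
    # Stand-in for kicking off a unit of work such as a pipeline or notebook.
    return f"pipeline started for {event['device']}"


def dispatch(event: Event, triggers: List[Trigger]) -> List[str]:
    """Evaluate every trigger against one event and run the actions that match."""
    return [t.action(event) for t in triggers if t.condition(event)]


triggers = [
    Trigger("warm-alert", lambda e: e["temp_c"] > 70, send_alert),
    Trigger("hot-remediation", lambda e: e["temp_c"] > 85, start_pipeline),
]

print(dispatch({"device": "pump-1", "temp_c": 75.0}, triggers))
print(dispatch({"device": "pump-1", "temp_c": 90.2}, triggers))
```

The design point is that alerts and units of work share one mechanism: both are just actions attached to a condition, so the same event can notify a person and start a job.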
Conclusion
For years I’ve been saying that working with real-time, streaming event data has been a segregated specialty within analytics. Real-time analytics has demanded its own platforms and skill sets, therefore often requiring distinct personnel to work with it. This has made the notion of “360-degree analytics,” be it for the customer experience, predictive maintenance or financial market analysis, challenging and often prone to failure. That’s always been frustrating, but in the age of AI, it’s become unacceptable.
Microsoft is working earnestly to close this gap. Whether this release really does that is up for debate. I happen to think there’s a lot more work to do, and that integration of the Eventstream, Data Activator/Reflex and Eventhouse technologies has more distance to travel. I’m also concerned that the divergence of the real-time dashboards from Power BI could get dicey.
But I believe Microsoft is thinking more seriously than most of its competitors about how to bring streaming event data and data-at-rest together, in a way that is usable and intuitive to engineers, analysts and business users. And this isn’t just abstract strategy: the company is building and shipping things, and this set of enhancements to Fabric proves it.
Disclosure: Post author Andrew Brust is a Microsoft Data Platform MVP and member of Microsoft’s Regional Directors Program for independent influencers. His company, Blue Badge Insights [www.bluebadgeinsights.com], has done work for Microsoft, including for parts of the Fabric team.