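Baoyue Reading Guide: Summary
1. Keywords: CrowdStrike, Windows, faulty update, global impact, remediation
2. Summary: A product update from the US cybersecurity company CrowdStrike caused failures on an estimated 8.5 million Windows computers worldwide, affecting many users and businesses. The problem stemmed from a conflict between the update and Windows system files. CrowdStrike has halted the update and is working on a fix, and Microsoft has released a recovery tool.
3. Key points:
– The incident
  – CrowdStrike's product update left an estimated 8.5 million Windows computers worldwide unable to boot.
  – The update affected core components of CrowdStrike's Falcon agent.
  – The problem stemmed from a conflict between the update and specific Windows system files; only Windows was affected.
– Reactions
  – A Reddit commenter said CrowdStrike force-pushed an update that could not be skipped.
  – A Hacker News commenter called it a multi-layer failure involving several parties.
– Response
  – CrowdStrike halted the update and provided repair guidance.
  – Microsoft released a recovery tool to assist with repairs.
  – CrowdStrike promised transparency about the incident and steps to prevent a recurrence.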
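Source: infoq.com
Author: Steef-Jan Wiggers
Published: 2024/7/23 0:00
Language: English
Word count: 583
Estimated reading time: 3 minutes
Score: 86
Tags: cybersecurity, Windows, CrowdStrike, software update failure, system compatibility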
Original article follows
CrowdStrike, an American cybersecurity technology company, recently released a product update that bricked an estimated 8.5 million Windows computers worldwide, affecting businesses, individual users, and software companies. CrowdStrike provides cloud workload protection, endpoint security, threat intelligence, and cyberattack response services intended to secure critical areas of risk and prevent breaches.
The problematic update affected the core components of CrowdStrike’s Falcon agent, a critical piece of software designed to detect and prevent security threats. According to initial investigations, the issue stemmed from a conflict between the update and specific low-level system files in Windows. Other operating systems, such as macOS and Linux, were not impacted.
Specifically, the update caused an incompatibility with the Windows kernel, the core part of the operating system responsible for managing hardware and system resources. This incompatibility led to a failure in the boot sequence, resulting in what is loosely called a “bricked” machine: a device that cannot start up or function until it is repaired.
On one of the many Reddit threads, a respondent explained:
Crowdstrike pushed an “unskippable” update to all of their phone-home endpoints. Anyone set with an N-1 or N-2 configuration (where N represents the most recent version of the software, and the -# is how many versions behind someone chooses to be) had that option ignored.
This is logical for this product in some sense. A 0-day fix needs to be propagated immediately. Being N-1 on a 0-day is not wise.
Everyone believed that CrowdStrike was doing its due diligence in staging before pushing it out to the rest of the world. Obviously, someone in CrowdStrike skipped a step. Whatever approval/implementation system they used failed them. Anyone using the CrowdStrike program got the update and died. “Blue Screen of Death (BSOD) as a Service.”
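The N-1/N-2 policy the commenter describes can be illustrated with a minimal sketch. Everything here is hypothetical: the `Endpoint` class, the `target_version` function, the `forced` flag, and the version numbers are invented for illustration and do not reflect CrowdStrike's actual update mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical endpoint record; not a CrowdStrike data structure."""
    name: str
    versions_behind: int  # 0 = N (latest), 1 = N-1, 2 = N-2


def target_version(releases: list[str], endpoint: Endpoint, forced: bool = False) -> str:
    """Pick the release an endpoint should run.

    `releases` is ordered oldest to newest. A forced push ignores the
    endpoint's chosen lag, which is the behaviour the commenter describes.
    """
    if not releases:
        raise ValueError("no releases available")
    if forced:
        return releases[-1]
    # Clamp so an endpoint never asks for a release older than the first one.
    index = max(len(releases) - 1 - endpoint.versions_behind, 0)
    return releases[index]


releases = ["7.14", "7.15", "7.16"]  # made-up version numbers
conservative = Endpoint("pos-terminal-01", versions_behind=2)

print(target_version(releases, conservative))               # honours the N-2 setting
print(target_version(releases, conservative, forced=True))  # setting is ignored
```

In this sketch the lag setting only protects an endpoint while every update path respects it; a single forced path is enough to deliver the newest release everywhere at once.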
In addition, a respondent on a Hacker News thread wrote:
This is a global multi-layer failure: Microsoft allowing kernel mods by third-party software, CrowdStrike not testing this, DevSecOps not doing a staged/canary deployment, half the world running the same OS, things that should not be connected to the internet but are by default. Microsoft and CrowdStrike drove a horse and a cart through all redundancy and failover designs and showed very clearly where no such designs were in place.
CrowdStrike responded swiftly by halting the update’s rollout and working on a patch to resolve the issue. The company provided detailed instructions for affected users to restore functionality, including booting into safe mode and uninstalling the problematic update, a manual process that has to be carried out on each affected machine. In a Reddit thread on the CrowdStrike BSOD issue, a respondent wrote:
I am very interested in the scale of resolving this globally because if it’s causing hardware to boot-loop with BSODs, you won’t be able to deploy a patch/script to fix it. We’re going to have to go to every boot-looping machine and manually fix it!
Furthermore, Microsoft released a recovery tool to help IT admins with the repair process.
Shyam Sundar, Cloud Architect at Novac Technology Solutions, concluded in a Medium blog post detailing the CrowdStrike BSOD incident:
This has been a disaster of monumental proportions for many businesses worldwide. We are yet to see what measures companies will take to prevent such incidents from happening again. Some A/B testing or staggered rollout would likely have prevented such a massive outage.
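To make the idea of a staggered rollout concrete, here is a minimal sketch of a ring-based deployment that halts when too many hosts in a ring fail. The function name `staged_rollout`, the ring sizes, and the failure threshold are assumptions chosen for illustration; they do not describe any vendor's real pipeline.

```python
from typing import Callable


def staged_rollout(
    hosts: list[str],
    apply_update: Callable[[str], bool],
    ring_fractions: tuple[float, ...] = (0.01, 0.10, 1.0),
    max_failure_rate: float = 0.02,
) -> tuple[list[str], bool]:
    """Update hosts ring by ring; stop when a ring fails too often.

    `apply_update` returns True if the host is healthy after the update.
    Returns (hosts that received the update, whether the rollout completed).
    """
    updated: list[str] = []
    for fraction in ring_fractions:
        # Each ring extends the cumulative share of the fleet that is updated.
        cutoff = max(1, int(len(hosts) * fraction))
        ring = hosts[len(updated):cutoff]
        if not ring:
            continue
        failures = sum(1 for host in ring if not apply_update(host))
        updated.extend(ring)
        if failures / len(ring) > max_failure_rate:
            return updated, False  # halt: later rings never receive the update
    return updated, True


fleet = [f"host-{i}" for i in range(1000)]
# Simulate an update that breaks every host it touches.
updated, completed = staged_rollout(fleet, lambda host: False)
print(len(updated), completed)
```

With these assumed ring sizes, a universally broken update stops after the first ring, so only 1% of the simulated fleet is affected instead of all of it.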
Lastly, CrowdStrike Founder and CEO George Kurtz stated that the company will provide full transparency on how this occurred and on the steps it will take to prevent anything like this from happening again.