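Reading guide summary
1. Keywords:
– Decathlon
– Software Architecture
– Architecture Committee
– C4 Model
– Architecture Decision Records (ADRs)
2. Summary:
Raphaël Tahar, a staff engineer at Decathlon, shares lessons from running software architecture at scale: the role of the architecture committee, the use of the C4 model, and the importance of ADRs. He also covers how KPIs are used to assess effectiveness and the challenges the committee faces.
3. Main points:
– Decathlon's software architecture experience
  – Raphaël Tahar shares insights from co-leading an architecture process at scale
  – The process supports a large number of engineers and covers designing new systems, optimizing existing ones, and aligning with global guidelines
– Role of the architecture committee
  – Its necessity is explained through the Garbage Can Model
  – Its mission is to provide support and guidance
  – Its effectiveness is assessed through KPIs such as engineer satisfaction, impact on SLOs, and DORA metrics
– The C4 model and System Thinking
  – Explicit scopes and boundaries are defined to manage complexity
  – Reductionism and Holism are combined for effective decision-making
– Architecture Decision Records (ADRs)
  – They provide a structured way to capture decision information
  – They are stored with Structurizr, and centralized documentation promotes collaboration
– Challenges and responses
  – Keeping the organization aligned with the process and involving engineers at the right time
  – Adjusting the committee to improve KPIs requires careful oversight and feedback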
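Source: infoq.com
Author: Eran Stiller
Published: 2024/7/24 0:00
Language: English
Word count: 1261
Estimated reading time: 6 minutes
Score: 91
Tags: software architecture, C4 model, ADRs, architecture committee, System Thinking
The original article follows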
Raphaël Tahar, staff engineer at Decathlon, recently published his insights from co-leading an architecture process at scale. In a 4-part blog post series, Tahar describes how Decathlon ensures its teams are well-equipped to make informed, strategic decisions by combining methodologies such as architecture committees, the C4 model, and System Thinking, and by emphasizing the importance of ADRs and centralized documentation.
He is part of a group supporting over 120 engineers across 23 feature teams, which together make up one domain among the 1500+ engineers at Decathlon globally. Supporting developers at this scale is no small feat: it involves helping teams design new systems, optimize existing ones, and stay aligned with global guidelines. To tackle this, Decathlon established an architecture committee, which plays a crucial role in guiding teams through the intricate decision-making process.
Tahar explains the need for an architecture committee via the Garbage Can Model. Developed in the 1970s, this model describes organizational decision-making as a chaotic process where problems, solutions, and decision-makers exist in separate streams. These streams interact in unpredictable ways, much like items in a garbage can, leading to decision opportunities that emerge from this interplay.
However, according to Tahar, this “garbage can flow” is missing three items: decision alternatives, consequences, and consequences vs objectives.
Organizations must ensure that those three points are covered; if they aren’t, the chances of late projects, late no-gos, and backfires (like unreliable services causing penalties) increase significantly, resulting in huge budget waste.
The Garbage Can Model and its missing items (source)
The Architecture Committee steps in to bridge this gap. Its mission is not to make architectural decisions for software engineers but to provide them with the necessary support and guidance, similar to the Advice Process. The committee assists engineers in defining and refining their problems and context, involving the right stakeholders, identifying solutions, and searching for the most relevant alternatives. Its members help clarify the consequences of each solution, ensure its alignment with Decathlon’s technical strategy and guidelines, and document all of this in Architecture Decision Records and Diagrams as Code through C4 representations.
Decathlon assesses the effectiveness of the architecture committee through several key performance indicators (KPIs). One primary measure is the engineers’ satisfaction, gauged using a Net Promoter Score (NPS) survey, which reflects the teams’ confidence in their ability to share knowledge and identify blind spots.
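As a rough illustration of how such a satisfaction score is computed, the sketch below applies the standard NPS formula: the percentage of promoters (answers of 9 or 10) minus the percentage of detractors (answers of 0 to 6). The function name and the sample answers are hypothetical; the article does not describe Decathlon's actual survey or tooling.

```python
def net_promoter_score(scores):
    """Return the NPS (from -100 to 100) for a list of 0-10 survey answers."""
    if not scores:
        raise ValueError("at least one survey answer is required")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("answers must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)   # answers of 9 or 10
    detractors = sum(1 for s in scores if s <= 6)  # answers of 0 to 6
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical answers from ten engineers (illustrative data only).
answers = [10, 9, 9, 8, 8, 7, 7, 6, 5, 10]
print(net_promoter_score(answers))  # 4 promoters, 2 detractors -> 20.0
```

Answers of 7 and 8 (passives) count toward the total but toward neither group, so a team can be broadly content and still score close to zero.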
Additionally, the committee monitors the impact on Service Level Objectives (SLOs), focusing on metrics aligned with organizational priorities, such as reliability during peak periods. Furthermore, the committee evaluates its influence on DORA metrics, which include deployment frequency, lead time for changes, change failure rate, and time to restore service. These KPIs help determine the committee’s role in enhancing deployment efficiency and reducing operational issues.
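The four DORA metrics listed above can be derived from basic deployment records. The following is a minimal sketch under stated assumptions: the `Deployment` record, its field names, and the sample data are hypothetical, and medians are used as the summary statistic by choice. It is not a description of how Decathlon collects these figures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service recovered, if it failed


def dora_metrics(deployments, period_days):
    """Summarize the four DORA metrics over an observation period."""
    if not deployments:
        raise ValueError("no deployments in the period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at
                     for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Illustrative data only: four deployments in one week, one of which failed.
t0 = datetime(2024, 7, 1, 9, 0)
deployments = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=8)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6),
               failed=True, restored_at=t0 + timedelta(days=2, hours=7)),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=2)),
]
metrics = dora_metrics(deployments, period_days=7)
print(metrics["median_lead_time"])     # 5:00:00
print(metrics["change_failure_rate"])  # 0.25
```

Comparing these values before and after a committee review is one plausible way to look for the influence the article mentions, keeping in mind that many other factors move the same numbers.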
Decathlon manages complexity in software architecture by defining explicit scopes and boundaries using the C4 model. Different methodologies are applied to reduce complexity depending on the problem’s boundaries. Reductionism, which breaks problems into smaller, manageable tasks, and Holism, which focuses on understanding interdependencies within the system, are both essential for effective decision-making. System Thinking combines these approaches to navigate the layers of interconnected problems and solutions.
Finally, Architecture Decision Records (ADRs) provide a structured way to capture the context, the considered alternatives, the decision outcome, and its implications. Documenting architectural decisions is critical for maintaining coherence and continuity in large organizations: it makes decisions traceable, facilitates knowledge transfer, and helps prevent the recurrence of past mistakes. Decathlon stores ADRs using Structurizr, which supports “Diagrams as Code” and centralizes documentation to promote team collaboration and consistency.
Structurizr auto-generated website example (source)
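To make the ADR structure described in this section concrete (context, considered alternatives, decision outcome, implications), here is a minimal sketch that models a record and renders it as Markdown. The class names, field names, section headings, and the example decision are all hypothetical; Decathlon's actual template is not shown in the article, and this sketch does not use any Structurizr API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    """One option that was considered, with its expected consequences."""
    name: str
    consequences: List[str]


@dataclass
class ADR:
    """Hypothetical in-memory form of an Architecture Decision Record."""
    number: int
    title: str
    status: str
    context: str
    alternatives: List[Alternative]
    decision: str
    implications: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the record as a Markdown document."""
        lines = [f"# {self.number}. {self.title}", "",
                 f"Status: {self.status}", "",
                 "## Context", self.context, "",
                 "## Considered alternatives"]
        for alt in self.alternatives:
            lines.append(f"- {alt.name}")
            lines.extend(f"  - {c}" for c in alt.consequences)
        lines += ["", "## Decision", self.decision, "", "## Implications"]
        lines.extend(f"- {item}" for item in self.implications)
        return "\n".join(lines) + "\n"


# Entirely made-up example decision, for illustration only.
adr = ADR(
    number=1,
    title="Use asynchronous messaging between order and stock services",
    status="Accepted",
    context="Synchronous calls fail under peak load.",
    alternatives=[
        Alternative("Keep synchronous HTTP calls", ["Simple", "Fragile at peak"]),
        Alternative("Publish events to a message broker",
                    ["Decoupled", "Eventual consistency"]),
    ],
    decision="Publish events to a message broker.",
    implications=["Consumers must handle duplicate events."],
)
print(adr.to_markdown())
```

Listing each alternative with its consequences keeps the record aligned with the three items the article says the Garbage Can Model is missing.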
InfoQ spoke with Tahar about Decathlon’s architecture process, associated challenges, and organizational impact.
InfoQ: What are some of the challenges in running the architecture committee? How do you ensure that decisions that should visit the committee for review do indeed arrive at its table?
Raphaël Tahar: We encounter two primary challenges in our operations. The first challenge involves ensuring that a large and diverse organization, encompassing various professions and experiencing turnover, remains aligned with our processes. This blog post series addresses this challenge to some extent.
The second challenge is involving a staff engineer, with enough context, at the right moment to help calibrate a project during its functional and technical discovery phases. Given the sheer volume of initiatives in our domain, and with only five staff engineers at our disposal, it is not always feasible to involve them in every initiative.
Furthermore, the specializations of our staff engineers and the uneven workload distribution across emerging initiatives further complicate this. To address this, we rely on the responsible involvement of the VP of engineering, directors, group product managers, engineering managers, and tech leads to engage the committee as necessary.
InfoQ: In the article, you mention the domain committee’s KPIs. How do the KPIs reflect the committee’s work? How long is the feedback cycle, and how do you adjust the committee to improve the KPIs?
Tahar: Quantifying the impact of the architecture committee is challenging, primarily due to its transitive nature. The committee’s role is to influence teams, which impact their products, thereby affecting the business. To effectively evaluate the initial phases of a committee, it is crucial to oversee each step of the process carefully.
Gathering feedback from the engineering team allowed us to make necessary adjustments and bring clarity to nuanced aspects. For instance, we realized that the specific changes necessitating the committee’s attention lacked a precise definition, so we went through multiple iterations to arrive at an accurate delineation. This process also facilitated addressing minor yet crucial considerations, such as identifying the most relevant time slots and optimal pace for reviews, workshops, and touchpoints. Gathering this feedback is essential in the initial months of the committee and should then be requested quarterly.
Finally, we can typically assess the business impact within a period of six months to a year after decision-making. Decisions must be implemented and deployed, and products must run in production for a certain amount of time before an impact on the business becomes visible.
InfoQ: What have been the most significant impacts of implementing the C4 model at Decathlon, and how has it helped manage complexity and improve system understanding among engineers?
Tahar: The C4 model is inherently declarative. It requires teams to address and synchronize their mental models of the code, components, containers, and even the context of their applications. This led to valuable discussions where knowledge was shared, and beliefs were reconsidered.
It also assisted leadership teams in understanding the interdependencies among teams and external systems. In other words, it helped identify risks and visualize the landscape for organizational optimizations.
Lastly, as the saying goes, “A picture is worth a thousand words.” The C4 model makes cross-team and cross-domain discussions much more seamless by standardizing the capture and sharing of contexts.
InfoQ: What are the most significant benefits you have observed from using ADRs in your projects, and how do you ensure that these records remain relevant and valuable over time?
Tahar: ADRs provide immediate benefits by capturing the context within a written document, thus reducing ambiguities and misunderstandings. Writing the reasoning down also exposes any gaps or biases in the thought process.
Considering the potential for a large number of ADRs with 23 teams, we want to avoid unnecessary noise. Therefore, the C4 repository should only contain C1 and C2 changes, while decisions regarding C3 and C4 should be integrated into the relevant codebases. This approach limits the number of ADRs, centralizing only the most critical decisions, which facilitates understanding the context and the rationale behind the evolution of products.
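The routing rule Tahar describes can be sketched in a few lines. The C4 levels are System Context (C1), Container (C2), Component (C3), and Code (C4); the enum, the function name, and the location labels below are hypothetical and only illustrate the rule, not any tooling Decathlon uses.

```python
from enum import IntEnum


class C4Level(IntEnum):
    """The four abstraction levels of the C4 model."""
    CONTEXT = 1    # C1: system context
    CONTAINER = 2  # C2: containers
    COMPONENT = 3  # C3: components
    CODE = 4       # C4: code


def adr_location(level: C4Level) -> str:
    """Return where an ADR about a change at this level should be stored."""
    if level <= C4Level.CONTAINER:
        return "central C4 repository"  # C1 and C2 decisions are centralized
    return "team codebase"              # C3 and C4 decisions stay with the code


for level in C4Level:
    print(f"C{level.value}: {adr_location(level)}")
```

Under this rule only context-level and container-level decisions reach the shared repository, which is how the approach keeps the central set of ADRs small.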