Posted in

为什么内部开发者门户需要自动化_AI阅读总结 — 包阅AI

包阅导读总结

1. 关键词:Internal Developer Portals、Automations、Workflow Management、Software Development、Policy Enforcement

2. 总结:本文论述了内部开发者门户需要自动化的原因,介绍了自动化能做的事,如消除重复任务、提供高效警报与通知、执行策略等,通过实际场景展示其作用,并解释了自动化的工作原理,强调其对管理软件开发生命周期的重要性。

3. 主要内容:

– 为何需要自动化:软件开发复杂,内部开发者门户要发挥更好作用需连接各支柱实现自动化,以优化工作流程、减少人工干预和维护组织规范。

– 自动化能做什么:

– 消除重复任务:处理清理、权限管理等,避免错误和安全风险。

– 高效警报和通知:为开发者、经理、安全和 SRE 团队提供相关信息。

– 政策执行:设定资源消耗限制、即时访问等。

– 实际应用:事件管理:对比有无自动化的凌晨紧急警报处理情况,展示自动化的优势。

– 自动化工作原理:由触发事件和执行动作组成,触发时自动执行相关动作。

– 结论:自动化重新定义软件开发生命周期的管理,实现端到端流程编排,让团队更高效工作。

思维导图:

文章地址:https://thenewstack.io/why-internal-developer-portals-need-automations/

文章来源:thenewstack.io

作者:Zohar Einy

发布时间:2024/6/20 17:14

语言:英文

总字数:1166字

预计阅读时间:5分钟

评分:81分

标签:赞助-端口,赞助-帖子-贡献


以下为原文内容

本内容来源于用户推荐转载,旨在分享知识与观点,如有侵权请联系删除 联系邮箱 media@ilingban.com

Software development is inherently complex, involving numerous stakeholders, tools and steps. Effective workflow management is key to getting it right, whether it’s ensuring developers can self-serve efficiently or managers can uphold standards seamlessly. Linking the steps within the workflow can provide a better developer experience, and lead to a more efficient software development life cycle (SDLC).

An internal developer portal provides a better developer experience and yields better outcomes for platform engineering teams. But to get the best out of the portal’s pillars (the software catalog, self-service actions, scorecards and dashboards), you need a way to connect the pillars, automating the entire process.

This is why internal developer portals need automations. Automations provide the process orchestration required to create seamless workflows, reduce manual intervention and maintain organizational guardrails. They help engineering teams to derive even more value from their internal developer portals.

Let’s break down what automations are and what you can do with them.

What Automations Can Do

Eliminate Repetitive Tasks

Automations can handle tasks like clean-ups, permission management and terminating unused development environments. Traditionally, these tasks require manual oversight, ad-hoc scripts or cron jobs, leading to errors and security risks. For example, neglecting to revoke access for a former employee could lead to the exposure of sensitive data. Similarly, failing to terminate an unused development environment could result in unnecessary cloud expenses.

Automations can ensure access is revoked for ex-employees or automatically terminate environments after a predefined time-to-live (TTL) expires.

Efficient Alerts and Notifications

Automations improve alert management by making sure the right information gets to the relevant person (or people) at the optimal time. Here are some examples:

Alerts and Notifications for Developers

Automations provide developers with relevant information from the catalog, helping them to complete tasks with the context they need in-hand.

You can notify developers of:

  • Pending pull request reviews, nudging them when the request has been in review for too long
  • Deployment status updates, including successful and failed deployments, enabling them to quickly react and identify issues
  • Changes to dependencies when services or components are modified and their dependencies are affected.

Alerts and Notifications for Managers

Automations provide managers with the information they need to better understand and manage their team’s performance and goals. They’ll be automatically provided with a link to the relevant section in the software catalog, allowing them to quickly identify any issues.

Some examples include being informed of unmet service-level objectives (SLOs), performance degradation or escalating cloud costs.

Alerts and Notifications for Security and SREs

Alerting security and site reliability engineering (SRE) teams enables them to swiftly respond to critical issues.

For example, alerting the security team about a critical vulnerability that affects a high-priority asset, or notifying SREs of unusual patterns of system behaviors, enabling rapid response and mitigation.

Policy Enforcement

There’s more to automation than doing away with mundane tasks, speeding up processes or sending out alerts. Automation can also help enforce organizational policies consistently by integrating these policies directly into your development workflows.

You can set:

  • Resource consumption limits: Trigger approval workflows when a developer exceeds resource limits; for example if they attempt to create a third development environment in a specific month. This can also depend on other factors; for instance, granting automatic approval if the developer has been with the company for more than two years.
  • Just-in-time access: Automatically grant and revoke production access for on-call engineers during their shifts.

Real-World Application: Incident Management

Consider a 3 a.m. critical alert scenario.

Without automations, the on-call engineer will have to:

  • Assess impact, including the severity of the issue and the impact on staging and production.
  • Coordinate with stakeholders: They would usually open an incident ticket and trawl through outdated documentation to try and identify owners, dependencies and stakeholders.
  • Investigate the issue: They will immediately be put on the back burner as information is spread across numerous tools and dashboards. After looking at infrastructure health, Kubernetes logs and recent deployments, they find that a recent deployment is causing the issue as it is causing a spike in memory usage.
  • Seek approvals for remediation: When a rollback is required, the change request can take a long time to approve.
  • Resolve and communicate: They deploy a rollback and the system returns to normal. Now the engineer has to manually update stakeholders, with an overview of the incident, its impact and the resolution steps taken.

All in all, the process takes a number of hours; in fact, the engineer has to get ready to head to work again as it’s the start of a new working day.

This is a stressful way of dealing with incidents; it’s inefficient, can lead to errors along the way and it puts far too much onus on the on-call engineer.

Now let’s revisit that 3 a.m. scenario, but with automations integrated into the process:

  • Swift, targeted alerts: The critical alert triggers an automation that immediately notifies the designated on-call engineer, emphasizing the high-priority nature of the issue.
  • Seamless incident creation: The automation promptly opens an incident in the incident management system such as PagerDuty, Opsgenie, populating it with essential details from the software catalog.
  • Streamlined communication: An incident-specific Slack channel is automatically set up, inviting all relevant service owners, team members and stakeholders. This eliminates the need for frantic searches to gather the right people.
  • Centralized investigation: The on-call engineer receives a direct link to the portal catalog within the incident channel, enabling quick access to asset dependencies, recent deployments and monitoring dashboards. They immediately pinpoint a memory spike following a recent deployment.
  • Efficient self-service remediation: With all necessary information at hand, the engineer initiates a rollback via a self-service action in the portal. Given their on-call status, the rollback is automatically approved, and the automation also ensures their production access is revoked once their shift ends.
  • Automated resolution and communication: Upon successful completion of the rollback and the system being stabilized, the automation updates the incident status in both the Slack channel and the incident management system, notifying all stakeholders and closing the loop.

By 3:30 a.m., the engineer has effectively resolved the incident, conserving precious time and minimizing the business impact. They can return to sleep, confident that the system is stable and all parties are informed.

How Automations Work

Automations consist of triggers (events in your software catalog) and actions (logic executed when triggers occur). Supported actions include webhook, GitHub workflows, GitLab pipelines, Terraform Cloud and Azure Pipelines, among others. When a trigger event occurs, the portal automatically executes the associated action, ensuring seamless and scalable process orchestration.

Conclusion

Automations redefine how you manage your SDLC, from automating tedious tasks to delivering intelligent alerts and enforcing policies. They provide a cohesive, end-to-end process orchestration that empowers your teams to work smarter and more efficiently.

Want to see how automations for your internal developer work? Book a demo with Port, here.

YOUTUBE.COM/THENEWSTACK

Tech moves fast, don’t miss an episode. Subscribe to our YouTubechannel to stream all our podcasts, interviews, demos, and more.

GroupCreated with Sketch.