
Improving Data Integrity in Distributed Systems with Amazon S3 Conditional Writes: AI Reading Summary | 包阅AI

包阅 Reading Guide Summary

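1. Keywords: Amazon S3, Conditional Writes, Data Integrity, Distributed System, AWS

2. Summary:

AWS has announced that Amazon S3 supports conditional writes, which prevent existing objects from being overwritten and simplify data updates for distributed applications, improving performance and efficiency at no additional charge. The feature suits workloads such as large-scale analytics, is available in all Regions, and comes with sample code.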

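3. Main points:

– Amazon S3 adds conditional writes

  – Lets users check whether an object exists before creating it

  – Prevents existing objects from being overwritten when uploading data

– Impact on distributed applications

  – Simplifies how multiple clients update shared datasets

  – No client-side mechanism is needed to coordinate updates

  – Developers can hand validation over to S3

– How to use it

  – Add the HTTP if-none-match conditional header to PutObject and CompleteMultipartUpload API requests

  – An example of uploading an object with the AWS CLI is given

– Conditional write behavior

  – The write succeeds if no object with the same name exists, and fails otherwise

  – Different cases under versioning

  – Outcomes of multiple conditional writes to the same name and of concurrent requests

  – Delete requests take precedence

  – Error responses and possible retries

– Other information

  – Available at no additional charge in all AWS Regions, including GovCloud and the China Regions

  – Samples are available on GitHub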


Article URL: https://www.infoq.com/news/2024/08/amazon-s3-conditional-writes/?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=global

Source: infoq.com

Author: Steef-Jan Wiggers

Published: 2024/8/28 0:00

Language: English

Word count: 495

Estimated reading time: 2 minutes

Score: 85

Tags: Amazon S3, conditional writes, distributed systems, data integrity, AWS


The original article follows


AWS recently announced support for conditional writes in Amazon S3, allowing users to check for the existence of an object before creating it. This feature helps prevent overwriting existing objects when uploading data, making it easier for applications to manage data.

Conditional writes simplify how distributed applications with multiple clients can update data in parallel across shared datasets. Each client can write objects conditionally, ensuring that it doesn’t overwrite any objects already written by another client. This means there’s no need to build client-side consensus mechanisms to coordinate updates or use additional API requests to check for the presence of an object before uploading data.

Instead, developers can offload such validations to S3, which improves performance and efficiency for large-scale analytics, distributed machine learning, and highly parallelized workloads. To use conditional writes, developers can add the HTTP if-none-match conditional header to PutObject and CompleteMultipartUpload API requests.
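To make the "many clients, one winner" idea concrete, here is a minimal Python simulation. It does not call S3: `InMemoryBucket`, `put_if_none_match`, and `run_writers` are hypothetical names invented for this sketch, and the internal lock only stands in for S3 performing the existence check and the write as one step.

```python
import threading


class InMemoryBucket:
    """Toy stand-in for a bucket; NOT the S3 API, only a model of put-if-absent."""

    def __init__(self):
        self._objects = {}
        # Models the store doing the existence check and the write atomically.
        self._lock = threading.Lock()

    def put_if_none_match(self, key, body):
        """Store body under key only if the key is absent; return an HTTP-like status."""
        with self._lock:
            if key in self._objects:
                return 412  # Precondition Failed: an object with this key exists
            self._objects[key] = body
            return 200

    def get(self, key):
        return self._objects[key]


def run_writers(bucket, key, n_writers):
    """Start n_writers threads that all try to create the same key."""
    statuses = [None] * n_writers

    def writer(i):
        body = f"written by client {i}".encode()
        statuses[i] = bucket.put_if_none_match(key, body)

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statuses


if __name__ == "__main__":
    bucket = InMemoryBucket()
    statuses = run_writers(bucket, "results/part-0001", 8)
    print(statuses.count(200), "writer succeeded;", statuses.count(412), "were rejected")
```

None of the clients coordinates with the others; each one simply learns from the status code whether its write was the one that landed.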

Using the AWS CLI, a put-object call that uploads an object with the conditional write header, set through the if-none-match parameter, could look like this:

aws s3api put-object --bucket amzn-s3-demo-bucket --key dir-1/my_images.tar.bz2 --body my_images.tar.bz2 --if-none-match "*"
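A rough Python counterpart to the CLI call might look like the sketch below. It assumes a boto3 S3 client recent enough to accept an `IfNoneMatch` argument on `put_object`; the helper name `put_if_absent` is made up for this example. `FakeS3Client` and `FakeClientError` are hypothetical stubs, included only so the sketch runs without AWS credentials.

```python
def put_if_absent(s3_client, bucket, key, body):
    """Upload body to bucket/key only if no object with that key exists.

    Returns True if the object was created and False if the service answered
    412 Precondition Failed. Any other error is re-raised.
    """
    try:
        # Assumption: IfNoneMatch="*" maps to the if-none-match header,
        # mirroring --if-none-match "*" in the CLI example.
        s3_client.put_object(Bucket=bucket, Key=key, Body=body, IfNoneMatch="*")
        return True
    except Exception as exc:
        # botocore's ClientError exposes the parsed HTTP response as `.response`.
        response = getattr(exc, "response", None) or {}
        status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
        if status == 412:
            return False
        raise


class FakeClientError(Exception):
    """Hypothetical stub that imitates the shape of botocore's ClientError."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.response = {"ResponseMetadata": {"HTTPStatusCode": status}}


class FakeS3Client:
    """Hypothetical in-memory stub; a real run would use boto3.client("s3")."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body, IfNoneMatch=None):
        if IfNoneMatch == "*" and (Bucket, Key) in self.objects:
            raise FakeClientError(412)
        self.objects[(Bucket, Key)] = Body
        return {"ResponseMetadata": {"HTTPStatusCode": 200}}


if __name__ == "__main__":
    client = FakeS3Client()
    print(put_if_absent(client, "amzn-s3-demo-bucket", "dir-1/my_images.tar.bz2", b"v1"))
    print(put_if_absent(client, "amzn-s3-demo-bucket", "dir-1/my_images.tar.bz2", b"v2"))
```

Checking the HTTP status code rather than an error name keeps the helper independent of how a particular SDK version labels the failure.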

On a Hacker News thread, someone asked whether most current systems requiring a reliable managed service for distributed locking use DynamoDB, and whether there are any scenarios where S3 is preferable to DynamoDB for implementing such distributed locking. Another commenter answered:

Using only s3 would be more straightforward, with less setup, less code, and less expensive
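The locking idea raised in that thread could be sketched roughly as follows: acquiring the lock means conditionally creating a lock object, and releasing it means deleting that object. This is an illustration only. `InMemoryBucket` and `ObjectLock` are hypothetical names, the bucket is an in-memory model rather than S3, and the sketch has no lease expiry or fencing, so a holder that crashes would keep the lock forever.

```python
import threading
import uuid


class InMemoryBucket:
    """Toy model of a bucket offering put-if-absent, get, and delete; not the S3 API."""

    def __init__(self):
        self._objects = {}
        self._mutex = threading.Lock()

    def put_if_none_match(self, key, body):
        with self._mutex:
            if key in self._objects:
                return False
            self._objects[key] = body
            return True

    def get(self, key):
        with self._mutex:
            return self._objects.get(key)

    def delete(self, key):
        with self._mutex:
            self._objects.pop(key, None)


class ObjectLock:
    """Lock represented by the existence of a single object (hypothetical helper)."""

    def __init__(self, bucket, key):
        self.bucket = bucket
        self.key = key
        self.owner_id = uuid.uuid4().hex.encode()  # identifies this holder

    def try_acquire(self):
        # Only one client can create the lock object; all others are rejected.
        return self.bucket.put_if_none_match(self.key, self.owner_id)

    def release(self):
        # Delete only if this holder owns the lock, so it never frees another's.
        if self.bucket.get(self.key) == self.owner_id:
            self.bucket.delete(self.key)
            return True
        return False


if __name__ == "__main__":
    bucket = InMemoryBucket()
    first = ObjectLock(bucket, "locks/job-42")
    second = ObjectLock(bucket, "locks/job-42")
    print(first.try_acquire(), second.try_acquire())
    first.release()
    print(second.try_acquire())
```

A production lock would also need some way to expire stale holders, which is outside what the conditional write alone provides.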

The conditional write behavior is as follows, according to the company’s documentation:

  • When performing conditional writes in Amazon S3, if no object with the same key name exists in the bucket, the write operation succeeds with a 200 response.
  • If an object with the same key name already exists, the write operation fails with a 412 Precondition Failed response.
  • When versioning is enabled, S3 checks for the presence of a current object version with the same name. If no current object version exists or the current version is a delete marker, the write operation succeeds.
  • Multiple conditional writes for the same object name will result in the first write operation succeeding and subsequent writes failing with a 412 Precondition Failed response. Additionally, concurrent requests may result in a 409 Conflict response.
  • If a delete request to an object succeeds before a conditional write operation completes, the delete request takes precedence. After receiving a 409 error from PutObject or CompleteMultipartUpload, a retry may be needed.

412 Precondition Failed response (Source: Conditional Requests Documentation)
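The outcomes listed above can be modeled in a few lines of Python. This is a toy model written from the list itself, not the S3 API: `VersionedBucketModel` and `put_with_retry` are hypothetical names, and the model never produces a 409 on its own, so the retry helper is shown against a scripted sequence of statuses.

```python
import time


class VersionedBucketModel:
    """Toy model of the listed outcomes; not the S3 API."""

    DELETE_MARKER = object()

    def __init__(self, versioning=False):
        self.versioning = versioning
        self._versions = {}  # key -> list of versions, newest last

    def _current(self, key):
        versions = self._versions.get(key, [])
        return versions[-1] if versions else None

    def conditional_put(self, key, body):
        """Return 200 if the write is accepted, 412 if a current object exists."""
        current = self._current(key)
        # A delete marker as the current version counts as "no object".
        if current is not None and current is not self.DELETE_MARKER:
            return 412
        if self.versioning:
            self._versions.setdefault(key, []).append(body)
        else:
            self._versions[key] = [body]
        return 200

    def delete(self, key):
        if self.versioning:
            # With versioning, a delete adds a delete marker as the current version.
            self._versions.setdefault(key, []).append(self.DELETE_MARKER)
        else:
            self._versions.pop(key, None)


def put_with_retry(put_once, attempts=3, delay_s=0.0):
    """Call put_once() and retry while it reports 409 Conflict."""
    status = None
    for _ in range(attempts):
        status = put_once()
        if status != 409:
            return status
        time.sleep(delay_s)
    return status


if __name__ == "__main__":
    bucket = VersionedBucketModel(versioning=True)
    print(bucket.conditional_put("report.csv", b"v1"))
    print(bucket.conditional_put("report.csv", b"v2"))
    bucket.delete("report.csv")
    print(bucket.conditional_put("report.csv", b"v3"))
```

A 412 is a final answer for that key, while a 409 only says the request collided with another in-flight operation, which is why only the latter is retried here.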

Paul Meighan, a product manager at AWS, stated in a LinkedIn post:

This is a big simplifier for distributed applications that have the potential for many concurrent writers and, in general, a win for data integrity.

Followed by a comment from Gregor Hohpe:

Now, that’s what I call a distributed system “primitive”: conditional write.

Currently, the conditional writes feature in Amazon S3 is available at no additional charge in all AWS regions, including the AWS GovCloud (US) Regions and the AWS China regions. In addition, samples are available in a GitHub repository.