
Improve Throughput with Cloud Storage Client Libraries: AI Reading Summary - 包阅AI

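Reading guide summary

1. Keywords: Cloud Storage, Client Libraries, Throughput, Performance, Transfer Manager

2. Summary: This article describes how the transfer manager in the Cloud Storage client libraries improves throughput. The effect of configuring parallelism varies with the environment and workload. The article also reports test results for different file sizes on a Compute Engine instance, and closes with references for getting started and channels for feedback.

3. Main points:

– Performance gains and configurable parallelism

  – Parallelism can be configured to suit different environments and workloads; how it is implemented depends on the programming language

  – Switching to the transfer manager pays off most when a lot of data is transferred

– Test results

  – Downloading a large number of files under 16 KB: 64 workers achieved a 50x throughput improvement

  – Moving large 64 MB files: 8 workers increased throughput by 4.5x

– Factors that affect the optimal configuration

  – These include network latency, CPU type, and memory

– Getting started and feedback

  – Refer to the code samples for the major use cases and the API documentation

  – Feedback is welcome through the button on the documentation page or a GitHub issue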


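Article URL: https://cloud.google.com/blog/products/storage-data-transfer/improve-throughput-with-cloud-storage-client-libraries/

Source: cloud.google.com

Authors: Andrew Gorcester, Vivek Saraswat

Published: 2024/7/19 0:00

Language: English

Word count: 704

Estimated reading time: 3 minutes

Score: 90

Tags: cloud computing, cloud storage, client libraries, parallel processing, performance optimization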


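Original article content follows

This content was recommended by a user and reposted to share knowledge and viewpoints. If it infringes your rights, please contact us for removal at media@ilingban.com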

Performance benefits

You can configure parallelism to accommodate different operating environments and workloads, and the performance impact of the transfer manager varies according to these variables. Whether the parallelism uses threads, processes, or coroutines depends on the programming language.
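To make the threads-versus-processes choice concrete, here is a minimal standard-library sketch of a worker pool whose worker type is selectable. It is an illustration only, not the transfer manager's implementation: `make_executor`, `transfer_all`, and `fake_transfer` are hypothetical names, and `fake_transfer` stands in for a real object transfer. The Python transfer manager exposes a comparable choice through its `worker_type` and `max_workers` arguments.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, List, Sequence

THREAD = "thread"
PROCESS = "process"


def make_executor(worker_type: str, max_workers: int) -> Executor:
    """Return a pool of the requested kind (hypothetical helper)."""
    if worker_type == THREAD:
        return ThreadPoolExecutor(max_workers=max_workers)
    if worker_type == PROCESS:
        return ProcessPoolExecutor(max_workers=max_workers)
    raise ValueError(f"unknown worker type: {worker_type!r}")


def transfer_all(
    transfer_one: Callable[[str], int],
    names: Sequence[str],
    worker_type: str = THREAD,
    max_workers: int = 8,
) -> List[int]:
    """Run transfer_one for every name in parallel, keeping input order."""
    with make_executor(worker_type, max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(transfer_one, names))


def fake_transfer(name: str) -> int:
    """Assumed stand-in for one transfer; returns the bytes 'moved'."""
    return len(name)
```

Threads are usually enough for I/O-bound transfers, while processes sidestep Python's global interpreter lock at the cost of higher startup and memory overhead; which one wins depends on the workload.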

Switching from ordinary transfers to transfers performed by the transfer manager will have the greatest impact in your application when a lot of data needs to be moved at once. The more there is to transfer in terms of the number of objects, the size of objects, or both, the more your application will benefit.

For example, when downloading a large number of files under 16 KB using 64 workers on a c3-highcpu-8 Compute Engine instance, the Python library transfer manager module achieved a 50x throughput improvement over a single-worker solution! Testing showed that large numbers of workers are most effective for very small files. While this example uses a fairly extreme number of workers for a relatively small instance, a smaller number of workers still delivers a significant performance improvement.
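A many-small-files download like the one described above might look roughly like this with the Python client library. This is a sketch under assumptions: the bucket name, blob names, and destination directory are placeholders supplied by the caller, `split_results` and `download_small_files` are hypothetical helper names, and the 64-worker default simply mirrors the test in the text rather than a recommendation.

```python
from typing import List, Sequence, Tuple


def split_results(
    blob_names: Sequence[str], results: Sequence[object]
) -> Tuple[List[str], List[Tuple[str, Exception]]]:
    """Separate successful downloads from failures.

    Assumes one result per blob name, in order, where a failed
    transfer is represented by the exception that was raised.
    """
    downloaded: List[str] = []
    failed: List[Tuple[str, Exception]] = []
    for name, result in zip(blob_names, results):
        if isinstance(result, Exception):
            failed.append((name, result))
        else:
            downloaded.append(name)
    return downloaded, failed


def download_small_files(
    bucket_name: str,
    blob_names: Sequence[str],
    destination_directory: str,
    workers: int = 64,
) -> Tuple[List[str], List[Tuple[str, Exception]]]:
    """Download many objects in parallel with the transfer manager."""
    # Imported lazily so split_results works without the library installed.
    from google.cloud.storage import Client, transfer_manager

    bucket = Client().bucket(bucket_name)
    results = transfer_manager.download_many_to_path(
        bucket,
        blob_names,
        destination_directory=destination_directory,
        max_workers=workers,
    )
    return split_results(blob_names, results)
```

Running `download_small_files` requires the `google-cloud-storage` package and valid credentials; checking the failure list afterwards matters because, by default, individual errors are returned rather than raised.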

On the same instance, when moving larger files of 64 MB using only 8 workers, the Cloud Storage client library transfer manager increased the throughput by 4.5x from a much higher initial baseline. Sharded uploads and downloads with chunk sizes in the 32 to 64 MB range performed similarly.
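As a rough illustration of a sharded transfer, the sketch below first shows how an object can be cut into byte ranges, then hands a large download to the Python transfer manager with an explicit chunk size and worker count. `shard_ranges` and `download_large_object` are hypothetical names; `shard_ranges` only illustrates the idea of sharding and is not how the library computes its ranges internally. The bucket, object, and file names are placeholders.

```python
from typing import List, Tuple

MIB = 1024 * 1024


def shard_ranges(object_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split object_size bytes into inclusive (start, end) byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, object_size) - 1)
        for start in range(0, object_size, chunk_size)
    ]


def download_large_object(
    bucket_name: str,
    blob_name: str,
    filename: str,
    chunk_size: int = 32 * MIB,
    workers: int = 8,
) -> None:
    """Download one large object in parallel chunks."""
    # Imported lazily so shard_ranges works without the library installed.
    from google.cloud.storage import Client, transfer_manager

    blob = Client().bucket(bucket_name).blob(blob_name)
    transfer_manager.download_chunks_concurrently(
        blob,
        filename,
        chunk_size=chunk_size,
        max_workers=workers,
    )
```

With 32 MiB chunks, a 64 MiB object yields only two shards, so extra workers beyond the shard count sit idle; chunk size and worker count are best tuned together.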

The optimal configuration for throughput improvements on a given workload varies depending on a number of factors including networking latency, CPU type, and memory. For example, Compute Engine instances have different networking configurations, as well as different CPU and memory resources. Likewise, accessing Cloud Storage from outside of Compute Engine imposes radically different constraints on network throughput and round-trip time.
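Because the best setting depends on the environment, one practical approach is to measure a representative transfer at several worker counts and compare. The harness below is a minimal sketch of that idea using only the standard library; `measure_throughput` and `best_worker_count` are hypothetical names, and `transfer` is any caller-supplied function that performs the same workload with the given number of workers.

```python
import time
from typing import Callable, Dict, Iterable


def measure_throughput(
    transfer: Callable[[int], None],
    total_bytes: int,
    worker_counts: Iterable[int],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[int, float]:
    """Return bytes per second for each worker count tried."""
    results: Dict[int, float] = {}
    for workers in worker_counts:
        start = clock()
        transfer(workers)  # run the same workload with this many workers
        elapsed = clock() - start
        results[workers] = total_bytes / elapsed if elapsed > 0 else float("inf")
    return results


def best_worker_count(throughputs: Dict[int, float]) -> int:
    """Pick the worker count with the highest measured throughput."""
    return max(throughputs, key=throughputs.get)
```

A single run per setting is noisy; repeating each measurement and running it from the machine and network where the application will actually live gives more trustworthy numbers.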

Getting started

To get started with the Cloud Storage client library transfer manager, refer to our code samples for some major use cases, and the API reference documentation for each client library.

Whether our new client library features solve problems for you or could use improvement, we are eager to hear your feedback. Please reach out via the “Send Feedback” button on the Cloud Storage Client Libraries documentation page, or by filing a GitHub issue on any client library repo.