
When it comes to managing AWS S3 storage, keeping track of the size of your buckets is crucial for maintaining cost efficiency and operational performance. With some handy scripting, you can easily keep tabs on your data storage. Here, we discuss three different approaches to listing the size of each S3 bucket.
Approach 1: The Simple ls and Recursive Technique
The first approach uses the AWS CLI’s ls command with the --recursive and --summarize flags. This method is straightforward and effective for buckets with a smaller number of objects. However, it can be slow for buckets that contain a large number of files due to the need to list every object.
#!/usr/bin/env bash
set -e
# List every bucket in the account, then total each one up with `aws s3 ls`.
aws s3api list-buckets | jq -r '.Buckets[] | .Name' | \
while read -r bucketName; do
  # --summarize appends "Total Objects" and "Total Size" lines after the object
  # listing; grep and cut keep only the human-readable size.
  echo "$bucketName,$(aws s3 ls "s3://$bucketName" --recursive --summarize --human-readable | grep "Total Size" | cut -d: -f2)"
done
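For reference, here is roughly what the summary block that the grep and cut pipeline parses looks like when the command is run against a single bucket (the bucket name and figures below are illustrative):
aws s3 ls "s3://example-bucket" --recursive --summarize --human-readable
# ...one line per object, followed by a summary such as:
# Total Objects: 1234
#    Total Size: 2.3 GiB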
Approach 2: Fetching Average Size from CloudWatch Metric Statistics
The second approach queries CloudWatch metric statistics for the BucketSizeBytes metric, which S3 publishes daily for each bucket, and reports the average size over the past month. Because it never lists objects, it gives a much quicker overview and its runtime does not depend on the number of objects in each bucket.
#!/usr/bin/env bash
set -e
aws s3api list-buckets | jq -r '.Buckets[] | .Name' | \
while read -r bucketName; do
  # Fetch the average BucketSizeBytes datapoint for the last month.
  # Note: `date -d` is GNU date syntax.
  size=$(aws cloudwatch get-metric-statistics --namespace AWS/S3 \
    --start-time "$(date -d '1 month ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date +%Y-%m-%dT%H:%M:%SZ)" \
    --period 31536000 \
    --statistics Average \
    --metric-name BucketSizeBytes \
    --dimensions Name=BucketName,Value="$bucketName" Name=StorageType,Value=StandardStorage \
    --output json)
  # Buckets with no datapoints (for example, empty buckets) are reported as 0.
  size_in_bytes=$(jq 'if .Datapoints == [] then 0 else .Datapoints[0].Average end' <<< "$size")
  # Convert bytes to GB with two decimal places.
  echo "$bucketName,$(echo "scale=2; $size_in_bytes / (1024 * 1024 * 1024)" | bc) GB"
done
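One caveat: BucketSizeBytes is reported per storage class, and the script above only queries StandardStorage. If a bucket keeps data in other classes, the same call can be repeated with a different StorageType dimension value. As a rough sketch for a single bucket (the bucket name is illustrative, and StandardIAStorage is assumed to be the dimension value for the Standard-IA class):
# Illustrative single-bucket query for a non-Standard storage class (Standard-IA).
aws cloudwatch get-metric-statistics --namespace AWS/S3 \
  --start-time "$(date -d '1 month ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date +%Y-%m-%dT%H:%M:%SZ)" \
  --period 31536000 \
  --statistics Average \
  --metric-name BucketSizeBytes \
  --dimensions Name=BucketName,Value=example-bucket Name=StorageType,Value=StandardIAStorage \
  --output json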
Approach 3: Enhancing with GNU Parallel
Building on the second method, this approach uses GNU parallel to speed up processing. It’s particularly useful for accounts with a large number of buckets: by parallelizing the CloudWatch requests, you can drastically reduce the total time taken to get the size of each bucket.
#!/usr/bin/env bash
set -e
# Same CloudWatch lookup as approach 2, wrapped in a function so GNU parallel can run it.
get_average_bucket_size() {
  local bucketName=$1
  local size size_in_bytes
  size=$(aws cloudwatch get-metric-statistics --namespace AWS/S3 \
    --start-time "$(date -d '1 month ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date +%Y-%m-%dT%H:%M:%SZ)" \
    --period 31536000 \
    --statistics Average \
    --metric-name BucketSizeBytes \
    --dimensions Name=BucketName,Value="$bucketName" Name=StorageType,Value=StandardStorage \
    --output json)
  size_in_bytes=$(jq 'if .Datapoints == [] then 0 else .Datapoints[0].Average end' <<< "$size")
  echo "$bucketName,$(echo "scale=2; $size_in_bytes / (1024 * 1024 * 1024)" | bc) GB"
}
# Export the function so the worker shells spawned by parallel can see it.
export -f get_average_bucket_size
# jq -r already emits raw bucket names, one per line; query up to 10 of them at a time.
aws s3api list-buckets | jq -r '.Buckets[] | .Name' | \
parallel --will-cite --jobs 10 get_average_bucket_size
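To keep the results, one option is to save the script above as a file (bucket-sizes.sh is just an assumed name) and redirect its output into a CSV with a header row:
# Illustrative usage: write the per-bucket sizes to bucket-sizes.csv.
{ echo "bucket,size"; ./bucket-sizes.sh; } > bucket-sizes.csv
Note that parallel may return results in any order; add --keep-order to the parallel invocation if you want the buckets listed in input order.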
Example Output
bucket-name-1, 2.3 GB
bucket-name-2, 754.8 MB
bucket-name-3, 68.5 GB
...
Use Cases
- Cost Management: By regularly monitoring bucket sizes, organizations can better predict and manage AWS costs.
- Data Lifecycle: Understanding bucket sizes can help in applying data lifecycle policies more effectively.
- Performance Optimization: Large bucket sizes can affect performance; knowing sizes can be the first step in optimization.
- Compliance: For regulatory compliance, it’s essential to have an overview of the data footprint.
Conclusion
Monitoring the size of your S3 buckets is key to maintaining a cost-effective and efficient cloud storage solution. Whether you prefer a simple command-line request, a CloudWatch metric, or a parallel processing script, you have options to get the insights you need. By using one of these methods, you can easily integrate bucket size monitoring into your regular AWS maintenance routine, ensuring you stay on top of your storage needs.