How to transfer files from Google Cloud Storage (GCS) into an Amazon S3 bucket without downloading the files
- September 13, 2021
- Posted by: Vignesh Kumar Sivanadan
- Categories: Amazon S3, Google Cloud, Tutorial

- This also works well for transferring data from BigQuery to Amazon S3 (by exporting the tables to GCS first; a sketch of that export step follows this list)
- This tutorial is written for beginners with no scripting experience and can be followed by anyone
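If your data is still sitting in BigQuery, the usual first step is to export the table to GCS with the bq command-line tool (run from Cloud Shell or any machine with the Cloud SDK installed). The line below is only a sketch: mydataset.mytable and the bucket path are placeholders, and the wildcard in the destination lets BigQuery split large tables into multiple files.
bq extract --destination_format=CSV --compression=GZIP 'mydataset.mytable' gs://myfolder/myfilefolder/export-*.csv.gz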
Pre-Requirements
To transfer files from Google Cloud Storage into Amazon S3, you will need the following:
- Access to create ‘VM Instances’ in Google Cloud (steps explained below). Pricing will be around $0.50 per hour for an instance with 4 vCPUs
- Write access to the S3 folder (a quick way to verify this is sketched after this list)
- Path of the S3 folder where the files need to be copied, plus the AWS credentials
- Path Example : s3://mybucket/filetransferfrombigquery/
- Access Key Example : AKIAJPBXUHVICKYDEEPIA
- Secret Access Key Example : z+9VVickyDeepi+hVaAtmbepw9gA1vjJeshX
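If you would like to confirm that these credentials really have write access before spinning anything up, one optional check is a throwaway upload with the AWS CLI (assuming it is installed and configured on your machine; test.txt here is just any small local file).
aws s3 cp test.txt s3://mybucket/filetransferfrombigquery/test.txt
aws s3 rm s3://mybucket/filetransferfrombigquery/test.txt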
Transferring files from Google Cloud Storage to Amazon S3
To summarize how this method works: we will create a virtual machine instance that acts as an intermediate system, and use the gsutil command on it to copy the files (this will not consume any bandwidth on our own system)
Step 1: Create a VM instance and open an SSH terminal
- Go to the Google Cloud Console’s VM Instances page and sign in with your Google Cloud credentials
- Click on ‘Create Instance’
- For the region and zone, select a region that matches your S3 bucket’s region (this is shown in the AWS console URL)
- Based on the size of the data you want to transfer, select the number of vCPUs and the memory. For transferring 40 GB of data, I created an instance with 4 vCPUs and 30 GB of memory (a gcloud command-line alternative is sketched after this list)
- Let the instance get up and running, then go back to the VM instances list and open an SSH terminal by clicking SSH next to the instance
- This will open a terminal
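If you prefer the command line to the console, roughly the same instance can be created and reached with gcloud. This is only a sketch: the instance name and zone are placeholders (pick a zone in the same region as your S3 bucket), and e2-custom-4-30720 requests a custom machine type with 4 vCPUs and 30 GB of memory to match the setup above.
gcloud compute instances create gcs-to-s3-transfer --zone=us-east1-b --machine-type=e2-custom-4-30720
gcloud compute ssh gcs-to-s3-transfer --zone=us-east1-b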
Step 2: Create a configuration file in the VM instance for S3 access
Now that the VM instance is ready and we have SSH access to it, we have to create a configuration file that stores the credentials for the Amazon S3 bucket. Access to the GCS folder is already available by default (as long as it is accessible through your account)
- The first step is to navigate to the user’s home directory (this is usually the default folder, but let’s not leave it to chance)
[username@instancename ~]$ cd ~
- Now create an empty .boto file where we will store our credentials
[username@instancename ~]$ touch .boto
- Now append three lines to the .boto file from the command line (the header, the access key and the secret access key)
[username@instancename ~]$ echo "[Credentials]" >> ~/.boto
[username@instancename ~]$ echo "aws_access_key_id = AKIAJPBXUHVICKYDEEPIA" >> ~/.boto
[username@instancename ~]$ echo "aws_secret_access_key = z+9VVickyDeepi+hVaAtmbepw9gA1vjJeshX" >> ~/.boto
- Now read the file back to confirm that the .boto configuration file has the right credentials
[username@instancename ~]$ cat .boto
- This will display the credentials in the terminal, as shown below. Once you confirm that the credentials are right, you are good to transfer the files
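If everything was written correctly, the output should look like this (with your own keys, of course):
[Credentials]
aws_access_key_id = AKIAJPBXUHVICKYDEEPIA
aws_secret_access_key = z+9VVickyDeepi+hVaAtmbepw9gA1vjJeshX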
Step 3: Transferring files using gsutil
Now that we have the credentials in place, use the gsutil command-line tool to transfer the files from GCS to Amazon S3
- The following command will copy all the files in the GCS folder into the S3 bucket folder
[username@instancename ~]$ gsutil cp -r gs://myfolder/myfilefolder/ s3://mybucket/filetransferfrombigquery/
- If you want to copy only specific files matching a wildcard pattern, keep the wildcard inside the GCS path and use the following command
[username@instancename ~]$ gsutil cp "gs://myfolder/myfilefolder/*.txt" s3://mybucket/filetransferfrombigquery/
Once this is running, the SSH terminal will show a progress indicator. In my case, the 40 GB transfer was complete in under 20 minutes. Now you have all the files in the S3 folder
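If the transfer feels slow, one optional tweak is gsutil’s -m flag, which runs the copy with multiple parallel threads; for example:
[username@instancename ~]$ gsutil -m cp -r gs://myfolder/myfilefolder/ s3://mybucket/filetransferfrombigquery/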
Do not forget to delete the instance once you are done, or you will keep being billed for as long as it is running.
Got questions? Feel free to comment below.