Lead Image © erikdegraaf, 123RF.com

Getting data from AWS S3 via Python scripts

Pumping Station

Article from ADMIN 41/2017

By Mike Schilli

Data on AWS S3 is not necessarily stuck there. If you want your data back, you can siphon it out all at once with a little Python pump.

Data produced on EC2 instances or by AWS Lambda functions often ends up in Amazon S3 storage. If the data is spread across many small files, of which the customer only needs a selection, downloading them one by one in the browser quickly becomes fiddly. Luckily, the Amazon toolshed offers awscli and boto3, Python tools that act as pipes for draining data programmatically.

At the command line, the Python tool aws copies S3 files from the cloud onto the local computer. Install this using

pip3 install --user awscli

and then run aws configure, which asks for your access key ID, your secret access key, and the applicable AWS region. The tool stores the keys in ~/.aws/credentials and, from then on, no longer prompts you for them [1].
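Because ~/.aws/credentials is a plain INI file, a script can check which profiles it holds before relying on them. The following is a minimal sketch using only the standard library; the function name credential_profiles and the shape of its return value are illustrative assumptions, not part of awscli.

```python
import configparser
from pathlib import Path

# Key names that awscli writes for each profile in the credentials file.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def credential_profiles(path=None):
    """Map each profile name to True if both access keys are present.

    Hypothetical helper: it reads the file only and never prints or
    returns the secrets themselves.
    """
    if path is None:
        path = Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    # read() quietly skips a missing file, which leaves the result empty.
    parser.read(path)
    return {
        name: all(parser.has_option(name, key) for key in REQUIRED_KEYS)
        for name in parser.sections()
    }
```

Calling credential_profiles() without arguments inspects the file in the home directory; an empty dictionary suggests that the configuration step has not been run yet.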

Data exists in S3 as objects indexed by string keys. If a prosnapshot bucket contains a video file under the hello.mp4 key, you can use the

aws s3 cp s3://prosnapshot/hello.mp4 video.mp4

command to retrieve it from the cloud and store it on the local hard disk as video.mp4, just as in the browser (Figure 1).
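The same copy can be sketched in Python with boto3, the second library mentioned above. This is a hedged illustration, not necessarily the script the article goes on to develop: the helper names split_s3_url and fetch are assumptions, and the client parameter exists only so the function can be tried without touching the network.

```python
def split_s3_url(url):
    """Split 's3://bucket/key' into (bucket, key)."""
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError("not an S3 URL: %r" % url)
    bucket, _, key = url[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError("S3 URL needs a bucket and a key: %r" % url)
    return bucket, key


def fetch(url, local_path, client=None):
    """Download one S3 object to local_path and return that path.

    Hypothetical helper. Without an explicit client, it builds a boto3
    S3 client, which picks up the keys stored in ~/.aws/credentials.
    """
    if client is None:
        import boto3  # imported lazily; only needed for real downloads
        client = boto3.client("s3")
    bucket, key = split_s3_url(url)
    # download_file(Bucket, Key, Filename) writes the object to disk.
    client.download_file(bucket, key, local_path)
    return local_path
```

With working credentials, fetch("s3://prosnapshot/hello.mp4", "video.mp4") would correspond to the aws s3 cp call; looping over a list of keys extends the idea to many small files.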

Figure 1: The browser moves selected S3 files from the cloud to the

...
