Delete a file in a sub-directory of S3 using Python

Hi All,
We use the boto and boto3 libraries to connect to S3 and perform actions on buckets and objects: upload, download, copy, and delete. But say you want to work with a specific object that sits under a sub-directory in the bucket; it is less well known how to do this.
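
In S3 a "sub-directory" is really just a key prefix, so a quick way to see what lives under one is to list objects by that prefix. Here is a minimal boto3 sketch (the bucket and prefix names are placeholders):

import boto3

s3 = boto3.client('s3')

# List the keys that start with the given prefix ("sub-directory")
response = s3.list_objects_v2(Bucket='testProjBucket-1', Prefix='Dir1/subDir1/')
for obj in response.get('Contents', []):
    print(obj['Key'])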

Below are a few Python script examples of using the sub-directory prefix with the boto and boto3 libraries.

Example 1: Copy a file/object residing in a sub-directory of Bucket1 to Bucket2

import boto
conn = boto.connect_s3()

srcBucket = conn.get_bucket('testProjBucket-1')  # Source bucket
dstBucket = conn.get_bucket('testProjBucket-2')  # Destination bucket
fileName = 'test.txt'

# copy_key expects the *name* of the source bucket, not the Bucket object
dstBucket.copy_key('Dir2/subDir2/' + fileName, srcBucket.name, 'Dir1/subDir1/' + fileName)

Example 2: Download test.txt from the bucket 'testProjBucket-1' to the local system path /home/ec2-user/mydownloads/
Here the downloaded file will be saved as hai.txt

import boto3
s3 = boto3.resource('s3')

fileName = 'test.txt'
prefix1 = 'Dir1/subDir1/' + fileName  # Full key of the object, including its prefix

# download_file(bucket, key, local_path)
s3.meta.client.download_file('testProjBucket-1', prefix1, '/home/ec2-user/mydownloads/hai.txt')

Example 3: Delete a specific object from a specific sub-directory inside a bucket (using the boto library)

import boto
# connect_s3 takes only credentials; to target a specific region use boto.s3.connect_to_region
conn = boto.connect_s3(aws_access_key_id='', aws_secret_access_key='')

fileName = 'test.py'

srcBucket = conn.get_bucket('testProjBucket-1')
srcBucket.delete_key('Dir1/subDir1/' + fileName)

Example 4: Delete a specific object from a specific sub-directory inside a bucket (using the boto3 library)

import boto3
client = boto3.client('s3', region_name='us-east-1', aws_access_key_id='', aws_secret_access_key='')

fileName = 'test.txt'
prefix1 = 'Dir1/subDir1/' + fileName  # Full key of the object to delete

response = client.delete_object(
    Bucket='testProjBucket-1',
    Key=prefix1
)

Note: There is no move operation for objects in the boto3 library; we can only copy an object and then delete the original. The aws-cli, however, does provide a move command (aws s3 mv).
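
For reference, here is a rough sketch of how a "move" can be emulated in boto3 by copying and then deleting (the bucket and key names are just examples):

import boto3

s3 = boto3.client('s3')

srcKey = 'Dir1/subDir1/test.txt'
dstKey = 'Dir2/subDir2/test.txt'

# Copy the object to the new location, then remove the original
s3.copy_object(Bucket='testProjBucket-2', Key=dstKey,
               CopySource={'Bucket': 'testProjBucket-1', 'Key': srcKey})
s3.delete_object(Bucket='testProjBucket-1', Key=srcKey)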


Fetch the Elastic Beanstalk environment details using a Python script

Hi there!!

It has been quite some time, and I have been busy working on multiple technologies. Recently my lead asked me to create a Python script to fetch the minimum and maximum instance counts of all the Elastic Beanstalk environments. It was great to work on this requirement. Below is the Python script.

import csv
import boto3
from botocore.exceptions import ClientError

def get_details():
	# Write the CSV header row first
	row1 = ['Application Name', 'Environment Name', 'Min Count', 'Max Count']
	with open('EB-instances-count.csv', "a") as csvDataFile:
		writer = csv.writer(csvDataFile)
		writer.writerow(row1)
	try:
		eb = boto3.client('elasticbeanstalk', "us-east-1")
		NameInfo = eb.describe_environments()
		for names in NameInfo['Environments']:
			app_name = names['ApplicationName']
			env_name = names['EnvironmentName']
			response = eb.describe_configuration_settings(
				EnvironmentName=env_name,
				ApplicationName=app_name
			)
			# Indices 4 and 3 point at the MinSize/MaxSize option settings
			# in this account's responses; the position is not guaranteed
			minCount = response['ConfigurationSettings'][0]['OptionSettings'][4]
			maxCount = response['ConfigurationSettings'][0]['OptionSettings'][3]
			minVal = minCount['Value']
			maxVal = maxCount['Value']
			print("Gathering count for Environment: " + env_name)
			fields = [app_name, env_name, minVal, maxVal]
			with open('EB-instances-count.csv', "a") as csvDataFile:
				writer = csv.writer(csvDataFile)
				writer.writerow(fields)
	except ClientError as e:
		# The environment can disappear between the two API calls;
		# skip it instead of failing the whole run
		if e.response['Error']['Code'] == "InvalidParameterValue":
			print(env_name + " Environment not found, so skipping it")
		pass

if __name__ == '__main__':
	get_details()

In this script I get the application names and environment names of the Elastic Beanstalk environments in a region, loop through them, and fetch the min and max instance counts. After fetching the counts I write them to a .csv (spreadsheet) file. We can run this script at any time to know the current count of instances in use, and it can be further updated/modified to fetch other information about the Elastic Beanstalk environments.

The interesting part is filtering the required information out of the response. Another thing to note: if an environment is deleted, it takes some time to disappear from the console, so we may still get the environment name from describe_environments but not its settings, since it has already been deleted.
So in this case we have to capture that particular error in the exception handler and ignore it.
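
One way to make the filtering less fragile than fixed list indices is to match on the option names themselves; a small sketch, assuming the standard aws:autoscaling:asg MinSize/MaxSize options:

# Pick out MinSize/MaxSize by name instead of by position
for opt in response['ConfigurationSettings'][0]['OptionSettings']:
    if opt.get('Namespace') == 'aws:autoscaling:asg':
        if opt['OptionName'] == 'MinSize':
            minVal = opt['Value']
        elif opt['OptionName'] == 'MaxSize':
            maxVal = opt['Value']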

Note: Be careful about the indentation 🙂

Let me know for any questions. Thanks

EBS snapshot deletion by filtering tags

Hey guys!!

Having daily backups of your data is one of the most important things in the IT industry. EBS snapshots are used to back up Amazon EBS volumes along with their data. Taking regular backups of the volumes decreases the risk of disaster in case of failures. For more detail refer to this post here

Here we take EBS snapshots of the Production environment daily, and it is not necessary to keep many snapshots since the cost will increase. So in such cases we delete each snapshot 10 days after its backup date, so that we end up having 10 snapshots at any given point in time.

The Python script below uses the boto3 library to connect to AWS and fetch the details of the services. When an EBS snapshot is created for an EC2 instance, a tag is created on the snapshot with the instance ID details, plus a DateToDelete key whose value is the date 10 days in the future.
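
For reference, the tagging at snapshot-creation time could look roughly like this (a sketch only; the volume and instance IDs are placeholders, and the snap_InstanceID / ebsSnaphots_clean tag keys follow the convention used by the script below):

import datetime
import boto3

ec2 = boto3.client('ec2')

# Create the snapshot, then tag it with the owning instance
# and the date on which it becomes eligible for deletion
snap = ec2.create_snapshot(VolumeId='vol-0123456789abcdef0')  # example volume ID
delete_on = (datetime.date.today() + datetime.timedelta(days=10)).strftime('%Y-%m-%d')
ec2.create_tags(Resources=[snap['SnapshotId']], Tags=[
    {'Key': 'snap_InstanceID', 'Value': 'i-0123456789abcdef0'},  # example instance ID
    {'Key': 'ebsSnaphots_clean', 'Value': 'true'},
    {'Key': 'DateToDelete', 'Value': delete_on},
])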

We will use two filter arrays: one to select snapshots tagged ebsSnaphots_clean:true with a DateToDelete equal to today's date, and one to select instances tagged Environment:Prod.
Next we use a for loop to go through all the EC2 instance details that carry the tag Environment:Prod.

Similarly we go through the EBS snapshots that have ebsSnaphots_clean:true and a DateToDelete of today's date.
Then we fetch the tags and compare each snapshot's instance ID tag with the respective EC2 instance ID of the Production environment; if they match, that snapshot is deleted.

import datetime
import boto3

ec = boto3.client('ec2')

def lambda_handler(event, context):
    Deletion_date = datetime.date.today().strftime('%Y-%m-%d')

    # Snapshots tagged for cleanup whose DateToDelete is today.
    # The 'tag:<key>' filter form matches a key together with its value,
    # unlike separate 'tag-key'/'tag-value' filters which match independently.
    firstFilter = [
        {'Name': 'tag:DateToDelete', 'Values': [Deletion_date]},
        {'Name': 'tag:ebsSnaphots_clean', 'Values': ['true']},
    ]

    # Production instances only
    secondFilter = [
        {'Name': 'tag:Environment', 'Values': ['Prod']},
    ]

    # OwnerIds=['self'] restricts the search to snapshots in our own account
    snapshot_details = ec.describe_snapshots(OwnerIds=['self'], Filters=firstFilter)
    ec2_details = ec.describe_instances(Filters=secondFilter)

    for myinst in ec2_details['Reservations']:
        for instID in myinst['Instances']:
            Instance_ID = instID['InstanceId']
            print("The instanceID is %s" % Instance_ID)
            for snap in snapshot_details['Snapshots']:
                print("Checking Snapshot %s" % snap['SnapshotId'])
                for tag in snap['Tags']:
                    if tag['Key'] == 'snap_InstanceID':
                        match_instance = tag['Value']
                        if Instance_ID == match_instance:
                            print("The instanceID " + Instance_ID + " matches the snapshot's assigned instanceID tag " + match_instance + " for snapshot %s" % snap['SnapshotId'])
                            print("Deleting snapshot %s" % snap['SnapshotId'])
                            ec.delete_snapshot(SnapshotId=snap['SnapshotId'])
                        else:
                            print("The instance " + Instance_ID + " is of a different environment and does not match snapshot " + match_instance)
                    else:
                        print("no matches")

Note: Please check and take care of indentation

Thanks!!