Check whether a key exists in an S3 bucket using boto3

Question

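I would like to know whether a key exists in boto3. I can loop over the bucket contents and check each key to see if it matches.

But that seems longer than necessary and overkill. The official Boto3 documentation explicitly states how to do this.

Maybe I am missing the obvious. Can anybody point me to how I can achieve this?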

Wander Nauta · Accepted Answer

Boto 2's boto.s3.key.Key object used to have an exists method that checked whether the key existed on S3 by doing a HEAD request and looking at the result, but it seems that it no longer exists. You have to do it yourself:

    import boto3
    import botocore

    s3 = boto3.resource('s3')

    try:
        s3.Object('my-bucket', 'dootdoot.jpg').load()
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            # The object does not exist.
            ...
        else:
            # Something else has gone wrong.
            raise
    else:
        # The object does exist.
        ...

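load() does a HEAD request for a single key, which is fast even if the object in question is large or the bucket contains many objects.

Of course, you might be checking whether the object exists because you are planning on using it. If that is the case, you can forget about load() and do a get() or download_file() directly, then handle the error case there.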
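As a rough sketch of that last suggestion, the helper below skips the existence check, fetches the object directly, and treats a missing key as None. The name read_object_or_none and its parameters are made up for illustration; it assumes a boto3 S3 client is passed in and relies on the client's modeled NoSuchKey exception.

```python
def read_object_or_none(client, bucket, key):
    """Return the object's bytes, or None if the key does not exist.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client('s3').
    The function name is hypothetical, not part of boto3.
    """
    try:
        response = client.get_object(Bucket=bucket, Key=key)
    except client.exceptions.NoSuchKey:
        # get_object reports a missing key as NoSuchKey.
        return None
    # Body is a streaming object; read() pulls the whole payload into memory.
    return response['Body'].read()
```

Any other error (permissions, wrong bucket, and so on) still propagates to the caller, which is usually what you want.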

EvilPuppetMaster · Answer

I'm not a big fan of using exceptions for control flow. This is an alternative approach that works in boto3:

    import boto3

    s3 = boto3.resource('s3')
    bucket = s3.Bucket('my-bucket')
    key = 'dootdoot.jpg'
    objs = list(bucket.objects.filter(Prefix=key))
    if len(objs) > 0 and objs[0].key == key:
        print("Exists!")
    else:
        print("Doesn't exist")

o_c · Answer

The easiest way I found (and probably the most efficient) is this:

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client('s3', aws_access_key_id='aws_key', aws_secret_access_key='aws_secret')
    try:
        s3.head_object(Bucket='bucket_name', Key='file_path')
    except ClientError:
        # Not found
        pass
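One caveat with catching every ClientError is that it also swallows unrelated failures, such as a 403 for missing permissions. A minimal sketch that treats only a 404 as "not found" might look like this; object_exists is a hypothetical name, and client is assumed to be a boto3 S3 client, whose exceptions attribute exposes ClientError.

```python
def object_exists(client, bucket, key):
    """Return True if head_object succeeds, False on a 404, re-raise otherwise.

    `client` is assumed to be a boto3 S3 client; the function name is hypothetical.
    """
    try:
        client.head_object(Bucket=bucket, Key=key)
    except client.exceptions.ClientError as e:
        if e.response['Error']['Code'] == '404':
            # The HEAD request found nothing under this key.
            return False
        # Anything else (403, throttling, ...) is a real problem; let it surface.
        raise
    return True
```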

Lucian Thorr · Answer

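In Boto3, if you are checking for either a folder (prefix) or a file using list_objects, you can use the existence of 'Contents' in the response dict as a check for whether the object exists. It's another way to avoid the try/except catches, as @EvilPuppetMaster suggests.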

    import boto3

    client = boto3.client('s3')
    results = client.list_objects(Bucket='my-bucket', Prefix='dootdoot.jpg')
    # 'Contents' is only present when at least one key matches the prefix
    exists = 'Contents' in results
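Keep in mind that Prefix matches any key that starts with the given string, so a prefix of dootdoot.jpg would also match dootdoot.jpg.bak. Here is a small sketch of an exact-match check on the response dictionary. The helper name is hypothetical, and the sample dict is hand-written to mimic the shape of a list_objects response rather than real output.

```python
def listing_contains_key(response, key):
    """Return True only if `key` appears exactly among the listed objects."""
    # 'Contents' is absent from the response when nothing matched the prefix.
    return any(obj['Key'] == key for obj in response.get('Contents', []))


# Hand-written sample in the shape of a list_objects response (not real output).
sample = {'Contents': [{'Key': 'dootdoot.jpg'}, {'Key': 'dootdoot.jpg.bak'}]}
print(listing_contains_key(sample, 'dootdoot.jpg'))  # True
print(listing_contains_key(sample, 'dootdoot'))      # False
print(listing_contains_key({}, 'dootdoot.jpg'))      # False
```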

Vitaly Zdanevich · Answer

This works not only with client but with bucket too:

    import boto3
    import botocore

    bucket = boto3.resource('s3', region_name='eu-west-1').Bucket('my-bucket')

    try:
        bucket.Object('my-file').get()
    except botocore.exceptions.ClientError as ex:
        if ex.response['Error']['Code'] == 'NoSuchKey':
            print('NoSuchKey')

Vivek · Answer

    import boto3
    from botocore.exceptions import ClientError

    client = boto3.client('s3')
    s3_key = 'Your file without bucket name e.g. abc/bcd.txt'
    bucket = 'your bucket name'
    try:
        client.head_object(Bucket=bucket, Key=s3_key)
    except ClientError as e:
        # head_object raises for a missing key rather than returning an empty response
        if e.response['Error']['Code'] != '404':
            raise
        print("File does not exist - s3://%s/%s" % (bucket, s3_key))
    else:
        print("File exists - s3://%s/%s" % (bucket, s3_key))

Alkesh Mahajan · Answer

Try this simple approach:

    import boto3

    s3 = boto3.resource('s3')
    bucket = s3.Bucket('mybucket_name')  # just Bucket name
    file_name = 'A/B/filename.txt'       # full file path
    obj = list(bucket.objects.filter(Prefix=file_name))
    if len(obj) > 0:
        print("Exists")
    else:
        print("Not Exists")

Andy Reagan · Answer

FWIW, here are the very simple functions that I am using:

    import os

    import boto3


    def get_resource(config: dict = {}):
        """Loads the s3 resource.

        Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
        or in a config dictionary.
        Looks in the environment first."""
        s3 = boto3.resource(
            's3',
            aws_access_key_id=os.environ.get(
                "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
            aws_secret_access_key=os.environ.get(
                "AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
        return s3


    def get_bucket(s3, s3_uri: str):
        """Get the bucket from the resource.

        A thin wrapper, use with caution.

        Example usage:

        >> bucket = get_bucket(get_resource(), s3_uri_prod)"""
        return s3.Bucket(s3_uri)


    def isfile_s3(bucket, key: str) -> bool:
        """Returns T/F whether the file exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) == 1 and objs[0].key == key


    def isdir_s3(bucket, key: str) -> bool:
        """Returns T/F whether the directory exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) > 1

VinceP · Answer

You can use S3Fs, which is essentially a wrapper around boto3 that exposes typical file-system style operations:

    import s3fs

    s3 = s3fs.S3FileSystem()
    s3.exists('myfile.txt')

Veedka · Answer

For Boto3, ObjectSummary can be used to check whether an object exists.

Contains the summary of an object stored in an Amazon S3 bucket. This object doesn't contain the object's full metadata or any of its contents.

    import boto3
    from botocore.exceptions import ClientError


    def path_exists(path, bucket_name):
        """Check to see if an object exists on S3"""
        s3 = boto3.resource('s3')
        try:
            s3.ObjectSummary(bucket_name=bucket_name, key=path).load()
        except ClientError as e:
            if e.response['Error']['Code'] == "404":
                return False
            else:
                raise e
        return True


    # The bucket name is a placeholder; the function needs both arguments
    path_exists('path/to/file.html', 'my-bucket')

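In ObjectSummary.load:

Calls S3.Client.head_object to update the attributes of the ObjectSummary resource.

This shows that you can use ObjectSummary instead of Object if you are not planning on using get(). The load() function does not retrieve the object; it only obtains the summary.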

Vitaly Zdanevich · Answer

If you have no more than 1000 keys in a directory or bucket, you can get a set of them and then check whether the key you want is in that set:

    import boto3

    s3_client = boto3.client('s3')
    files_in_dir = {d['Key'].split('/')[-1] for d in s3_client.list_objects_v2(
        Bucket='mybucket',
        Prefix='my/dir').get('Contents') or []}

Such code works even if my/dir does not exist.

http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2
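The 1000-key limit comes from list_objects_v2 returning at most 1000 keys per call. If there may be more, a paginator can walk all the pages. This is only a sketch: keys_under_prefix is a hypothetical name, and client is assumed to be a boto3 S3 client.

```python
def keys_under_prefix(client, bucket, prefix):
    """Collect every key under `prefix`, following pagination past 1000 keys.

    `client` is assumed to be a boto3 S3 client; the function name is hypothetical.
    """
    paginator = client.get_paginator('list_objects_v2')
    keys = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches the prefix.
        for obj in page.get('Contents', []):
            keys.add(obj['Key'])
    return keys
```

With the set in hand, the check itself is just 'my/dir/file.txt' in keys_under_prefix(client, 'mybucket', 'my/dir'). For a single key this lists far more than needed, so it mostly pays off when you test many keys against the same prefix.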

Rush S · Answer

Here is a solution that works for me. One caveat is that I know the exact format of the key ahead of time, so I am only listing the single file.

    import boto3


    # The s3 base class to interact with S3
    class S3(object):
        def __init__(self):
            self.s3_client = boto3.client('s3')

        def check_if_object_exists(self, s3_bucket, s3_key):
            response = self.s3_client.list_objects(
                Bucket=s3_bucket,
                Prefix=s3_key
            )
            if 'ETag' in str(response):
                return True
            else:
                return False


    if __name__ == '__main__':
        s3 = S3()
        # bucket and key are expected to be defined by the caller
        if s3.check_if_object_exists(bucket, key):
            print("Found S3 object.")
        else:
            print("No object found.")

user 923227 · Answer

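I noticed that just to catch the exception using botocore.exceptions.ClientError we need to install botocore. botocore takes up 36M of disk space. This is particularly impactful if we use AWS Lambda functions. Instead, if we just use Exception, we can skip using the extra library.

I am validating that the file extension is '.csv'.
This will not throw an exception if the bucket does not exist.
This will not throw an exception if the bucket exists but the object does not exist.
This throws an exception if the bucket is empty.
This throws an exception if the bucket has no permissions.

The code looks like this. Please share your thoughts: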

    import boto3
    import traceback


    def download4mS3(s3bucket, s3Path, localPath):
        s3 = boto3.resource('s3')

        print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
        if s3Path.endswith('.csv') and s3Path != '':
            try:
                s3.Bucket(s3bucket).download_file(s3Path, localPath)
            except Exception as e:
                print(e)
                print(traceback.format_exc())
                # Not every exception carries a .response, so read the code defensively
                code = getattr(e, 'response', {}).get('Error', {}).get('Code')
                if code == "404":
                    print("Downloading the file from: [", s3Path, "] failed")
                    exit(12)
                else:
                    raise
            print("Downloading the file from: [", s3Path, "] succeeded")
        else:
            print("csv file not found in : [", s3Path, "]")
            exit(12)

jms · Answer

    import boto3
    from botocore.client import Config

    S3_REGION = "eu-central-1"
    bucket = "mybucket1"
    name = "objectname"


    # Wrapped in a function (the name is a placeholder) so the return statements are valid
    def key_exists(bucket, name):
        client = boto3.client('s3', region_name=S3_REGION,
                              config=Config(signature_version='s3v4'))
        response = client.list_objects_v2(Bucket=bucket, Prefix=name)
        for obj in response.get('Contents', []):
            if obj['Key'] == name:
                return True
        return False

Peter Kahn · Answer

If you are looking for a key that is equivalent to a directory, you might want this approach:

    import boto3

    session = boto3.session.Session()
    resource = session.resource("s3")
    bucket = resource.Bucket('mybucket')

    key = 'dir-like-or-file-like-key'
    objects = [o for o in bucket.objects.filter(Prefix=key).limit(1)]
    has_key = len(objects) > 0

This works for a parent key, a key that equates to a file, or a key that does not exist. I tried the favored approach above and it failed on parent keys.
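For anyone using the client rather than the resource API, roughly the same check can be sketched with list_objects_v2 and MaxKeys=1. The name prefix_has_keys is hypothetical, and client is assumed to be a boto3 S3 client.

```python
def prefix_has_keys(client, bucket, prefix):
    """Return True if at least one key in `bucket` starts with `prefix`.

    `client` is assumed to be a boto3 S3 client; the function name is hypothetical.
    """
    # MaxKeys=1 asks S3 for a single entry, which is enough to answer yes/no.
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    # KeyCount is the number of keys returned in this response.
    return response.get('KeyCount', 0) > 0
```

Like the version above, this answers "does anything start with this prefix", so it is true for parent keys and for exact file keys alike.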

Mahesh Mogal · Answer

There is one simple way to check whether a file exists in an S3 bucket. It does not require exceptions:

    import boto3

    # aws_access_key_id and aws_secret_access_key are expected to be defined elsewhere
    session = boto3.Session(aws_access_key_id, aws_secret_access_key)
    s3 = session.client('s3')

    object_name = 'filename'
    bucket = 'bucketname'
    obj_status = s3.list_objects(Bucket=bucket, Prefix=object_name)
    if obj_status.get('Contents'):
        print("File exists")
    else:
        print("File does not exist")