Upload an image available at a public URL to S3 using boto

Question

I'm working in a Python web environment, and I can simply upload a file from the filesystem to S3 using boto's key.set_contents_from_filename(path/to/file). However, I'd like to upload an image that is already on the web (say, https://pbs.twimg.com/media/A9h_htACIAAaCf6.jpg:large).

Do I have to download the image to the filesystem, upload it to S3 with boto as usual, and then delete the image?

Ideally, there would be a way to get boto's key.set_contents_from_file, or some other command, to accept a URL and stream the image to S3 without explicitly downloading a copy of the file to my server.

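    import boto
    from boto.s3.key import Key

    # `settings` is the application's own settings module (not shown in the question).

    def upload(url):
        try:
            conn = boto.connect_s3(settings.AWS_ACCESS_KEY_ID,
                                   settings.AWS_SECRET_ACCESS_KEY)
            bucket_name = settings.AWS_STORAGE_BUCKET_NAME
            bucket = conn.get_bucket(bucket_name)
            k = Key(bucket)
            k.key = "test"
            k.set_contents_from_file(url)  # fails: url is a string, not a file object
            k.make_public()
            return "Success?"
        except Exception as e:
            return e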

Using set_contents_from_file as above, I get a "string object has no attribute 'tell'" error. Using set_contents_from_filename with the URL, I get a "No such file or directory" error. The boto storage documentation stops at uploading local files and does not mention uploading files stored remotely.
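The 'tell' error comes from passing a plain string where a file object is expected. As a minimal illustration (the URL below is a hypothetical placeholder and nothing is downloaded), a string exposes none of the usual file methods such as tell, while an in-memory buffer does:

```python
import io

# Hypothetical URL, used here only as a plain string.
url = "https://example.com/image.jpg"

# A str has no file methods, so a call such as url.tell() raises AttributeError.
str_has_file_api = all(hasattr(url, name) for name in ("read", "tell", "seek"))

# An in-memory buffer exposes the same methods as an open file.
buffer = io.BytesIO(b"fake image bytes")
buffer_has_file_api = all(hasattr(buffer, name) for name in ("read", "tell", "seek"))

print(str_has_file_api, buffer_has_file_api)
```

This is why wrapping the downloaded bytes in a buffer is enough to satisfy set_contents_from_file.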

garnaat · Accepted Answer

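Unfortunately, there really isn't any way to do this, at least not at the moment. We could add a method to boto, say set_contents_from_url, but that method would still have to download the file to the local machine and then upload it. It might still be a convenient method, but it wouldn't save you anything.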

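To do what you really want, the S3 service itself would need a capability that lets us pass it a URL and have it store the content at that URL in a bucket for us. That sounds like a pretty useful feature. You might want to post this to the S3 forums.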
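To make the point concrete, here is a rough sketch of what such a convenience method might look like. The name set_contents_from_url is the hypothetical one from this answer, not part of boto, and the `opener` parameter is an assumption added so the download step can be swapped out. The bytes still pass through the local machine:

```python
import io
from urllib.request import urlopen


def set_contents_from_url(key, url, opener=urlopen):
    """Hypothetical helper: download `url` locally, then hand the bytes to `key`.

    `key` is any object with a set_contents_from_file(fp) method,
    such as a boto Key.
    """
    # Step 1: the whole body is downloaded into local memory.
    with opener(url) as response:
        data = response.read()
    # Step 2: the same bytes are uploaded from the local machine.
    key.set_contents_from_file(io.BytesIO(data))
    return len(data)
```

Nothing is saved compared with doing the two steps by hand; the helper only hides them.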

dgh · Answer

OK, from @garnaat, it doesn't sound like S3 currently allows uploads by URL. I managed to upload remote images to S3 by reading them into memory only. This works.

    import StringIO
    import urllib2

    import boto
    from boto.s3.key import Key

    # Python 2 code; `settings` is the application's own settings module.

    def upload(url):
        try:
            conn = boto.connect_s3(settings.AWS_ACCESS_KEY_ID,
                                   settings.AWS_SECRET_ACCESS_KEY)
            bucket_name = settings.AWS_STORAGE_BUCKET_NAME
            bucket = conn.get_bucket(bucket_name)
            k = Key(bucket)
            k.key = url.split('/')[::-1][0]  # In my situation, ids at the end are unique
            file_object = urllib2.urlopen(url)  # 'Like' a file object
            fp = StringIO.StringIO(file_object.read())  # Wrap object
            k.set_contents_from_file(fp)
            return "Success"
        except Exception as e:
            return e

See also: How can I create a GzipFile instance from the "file-like object" that urllib.urlopen() returns?

GISD · Answer

For a 2017-relevant answer to this question that uses the official 'boto3' package (instead of the old 'boto' package from the original answer):

Python 3.5

If you're on a clean Python install, pip install both packages first:

pip install boto3

pip install requests

    import boto3
    import requests

    # Uses the creds in ~/.aws/credentials
    s3 = boto3.resource('s3')
    bucket_name_to_upload_image_to = 'photos'
    s3_image_filename = 'test_s3_image.png'
    internet_image_url = 'https://docs.python.org/3.7/_static/py.png'

    # Do this as a quick and easy check to make sure your S3 access is OK
    good_to_go = False  # initialised so the check below cannot raise NameError
    for bucket in s3.buckets.all():
        if bucket.name == bucket_name_to_upload_image_to:
            print('Good to go. Found the bucket to upload the image into.')
            good_to_go = True

    if not good_to_go:
        print('Not seeing your s3 bucket, might want to double check permissions in IAM')

    # Given an Internet-accessible URL, download the image and upload it to S3,
    # without needing to persist the image to disk locally
    req_for_image = requests.get(internet_image_url, stream=True)
    file_object_from_req = req_for_image.raw
    req_data = file_object_from_req.read()

    # Do the actual upload to s3
    s3.Bucket(bucket_name_to_upload_image_to).put_object(Key=s3_image_filename, Body=req_data)

blaklaybul · Answer

Here is how I did it with requests. The key is to set stream=True when initially making the request and to upload to S3 with the upload_fileobj() method:

    import requests
    import boto3

    url = "https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg"
    r = requests.get(url, stream=True)

    session = boto3.Session()
    s3 = session.resource('s3')

    bucket_name = 'your-bucket-name'
    key = 'your-key-name'  # key is the name of file on your bucket

    bucket = s3.Bucket(bucket_name)
    bucket.upload_fileobj(r.raw, key)
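The same streaming idea can be sketched with only the standard library for the download side. This is a variant under stated assumptions rather than this answer's code: the function name stream_url_to_s3 and the `opener` parameter are made up for illustration, and `s3_client` is expected to be something like boto3.client('s3'), whose upload_fileobj(Fileobj, Bucket, Key) reads from any object with a read() method that returns bytes:

```python
from urllib.request import urlopen


def stream_url_to_s3(url, bucket, key, s3_client, opener=urlopen):
    """Pass the open HTTP response straight to upload_fileobj.

    The response is a file-like object, so the uploader reads from it
    directly and nothing is written to local disk.
    """
    with opener(url) as response:
        s3_client.upload_fileobj(response, bucket, key)
```

With boto3 installed and credentials configured, a call might look like stream_url_to_s3(url, 'your-bucket-name', 'your-key-name', boto3.client('s3')).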

Filippo Vitale · Answer

A simple 3-line implementation that works in a Lambda out of the box:

    import boto3
    import requests

    # bucket_name, object_key and url are supplied by the caller
    s3_object = boto3.resource('s3').Object(bucket_name, object_key)

    with requests.get(url, stream=True) as r:
        s3_object.put(Body=r.content)

The .get part is taken straight from the requests documentation.

x00x70 · Answer

With the boto3 upload_fileobj method, you can stream a file to an S3 bucket without saving it to disk. Here is my function:

    import contextlib
    from io import BytesIO

    import boto3
    import requests

    def upload(url):
        # Get the service client
        s3 = boto3.client('s3')

        # Remember to set stream=True.
        # Note: verify=False disables TLS certificate verification.
        with contextlib.closing(requests.get(url, stream=True, verify=False)) as response:
            # Set up file stream from response content.
            fp = BytesIO(response.content)
            # Upload data to S3
            s3.upload_fileobj(fp, 'my-bucket', 'my-dir/' + url.split('/')[-1])
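Several answers here derive the object key with url.split('/')[-1]. That keeps any query string or fragment in the key. A small sketch of a more careful variant follows, using only the standard library; the helper name key_from_url and its `prefix` parameter are hypothetical:

```python
import posixpath
from urllib.parse import unquote, urlsplit


def key_from_url(url, prefix=""):
    """Build an S3 key from the last path segment of `url`.

    The query string and fragment are dropped, and percent-escapes
    in the file name are decoded.
    """
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    if not name:
        raise ValueError("URL has no file name in its path: %r" % url)
    return prefix + name


print(key_from_url("https://example.com/images/photo.png?size=large", prefix="my-dir/"))
```

Whether the resulting names are unique enough still depends on the source URLs, as noted in the earlier answer.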

yunus kula · Answer

I tried the following with boto3 and it works for me:

    import contextlib
    from io import BytesIO

    import boto3
    import requests

    s3 = boto3.resource('s3')
    s3Client = boto3.client('s3')

    for bucket in s3.buckets.all():
        print(bucket.name)

    url = "@resource url"
    with contextlib.closing(requests.get(url, stream=True, verify=False)) as response:
        # Set up file stream from response content.
        fp = BytesIO(response.content)
        # Upload data to S3
        s3Client.upload_fileobj(fp, 'aws-books', 'reviews_Electronics_5.json.gz')

Shabari Prasad H.D · Answer

    # Python 2 code using the legacy boto package
    import boto
    from boto.s3.key import Key
    from boto.s3.connection import OrdinaryCallingFormat
    from urllib import urlopen

    def upload_images_s3(img_url):
        try:
            connection = boto.connect_s3('access_key', 'secret_key',
                                         calling_format=OrdinaryCallingFormat())
            bucket = connection.get_bucket('boto-demo-1519388451')
            file_obj = Key(bucket)
            file_obj.key = img_url.split('/')[::-1][0]
            fp = urlopen(img_url)
            result = file_obj.set_contents_from_string(fp.read())
        except Exception as e:
            return e

Prateek Bhuwania · Answer

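S3 doesn't support remote upload as of now. You can use the class below to upload images to S3. The upload method first tries to download the image and keeps it in memory until it has been uploaded. To be able to connect to S3, install the AWS CLI with the command pip install awscli, then enter your credentials with the command aws configure.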

    import uuid
    from io import BytesIO
    from pathlib import Path

    import boto3
    import urllib3

    # The author's own exceptions module (not shown in the answer)
    from errors import custom_exceptions as cex

    BUCKET_NAME = "xxx.yyy.zzz"
    POSTERS_BASE_PATH = "assets/wallcontent"
    CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"


    class S3(object):
        def __init__(self):
            self.client = boto3.client('s3')
            self.bucket_name = BUCKET_NAME
            self.posters_base_path = POSTERS_BASE_PATH

        def __download_image(self, url):
            manager = urllib3.PoolManager()
            try:
                res = manager.request('GET', url)
            except Exception:
                print("Could not download the image from URL: ", url)
                raise cex.ImageDownloadFailed
            return BytesIO(res.data)  # any file-like object that implements read()

        def upload_image(self, url):
            try:
                image_file = self.__download_image(url)
            except cex.ImageDownloadFailed:
                raise cex.ImageUploadFailed

            extension = Path(url).suffix
            id = uuid.uuid1().hex + extension
            final_path = self.posters_base_path + "/" + id
            try:
                self.client.upload_fileobj(image_file,
                                           self.bucket_name,
                                           final_path)
            except Exception:
                print("Image Upload Error for URL: ", url)
                raise cex.ImageUploadFailed

            return CLOUDFRONT_BASE_URL + id