Reading a BSON file in Python?

Question

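I want to read a Mongo dump in BSON format in Python and process the data. I'm using the Python bson package (which I'd prefer to use rather than taking on a pymongo dependency), but it doesn't explain how to read from a file.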

This is what I'm trying:

    bson_file = open('statistics.bson', 'rb')
    b = bson.loads(bson_file)
    print b[0]

But I get:

    Traceback (most recent call last):
      File "test.py", line 11, in <module>
        b = bson.loads(bson_file)
      File "/Library/Python/2.7/site-packages/bson/__init__.py", line 75, in loads
        return decode_document(data, 0)[1]
      File "/Library/Python/2.7/site-packages/bson/codec.py", line 235, in decode_document
        length = struct.unpack("<i", data[base:base + 4])[0]
    TypeError: 'file' object has no attribute '__getitem__'

What am I doing wrong?

njzk2 · Accepted Answer

The documentation states:

> help(bson.loads)
> Given a BSON string, outputs a dict.

You need to pass a string, not the file object. For example:

> b = bson.loads(bson_file.read())
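The traceback comes from the decoder slicing its argument (`data[base:base + 4]`), which a file object does not support. Here is a minimal sketch of the difference, written for Python 3 and using only the standard library. The file name `example.bson` and the helper `read_length_prefix` are made up for illustration, and the 5-byte payload is an empty BSON document:

```python
import os
import struct
import tempfile


def read_length_prefix(data):
    """Mimic the failing line in the traceback: slice 4 bytes, unpack an int32."""
    return struct.unpack("<i", data[0:4])[0]


# Hypothetical sample file holding one empty BSON document (5 bytes).
path = os.path.join(tempfile.mkdtemp(), "example.bson")
with open(path, "wb") as f:
    f.write(b"\x05\x00\x00\x00\x00")

with open(path, "rb") as bson_file:
    try:
        read_length_prefix(bson_file)  # a file object cannot be sliced
        slicing_file_failed = False
    except TypeError:
        slicing_file_failed = True

    bson_file.seek(0)
    length = read_length_prefix(bson_file.read())  # bytes can be sliced
```

Calling `.read()` first hands the decoder the file's contents rather than the file handle, which is all the fix above amounts to.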

Marc Maxmeister · Answer

I found that this works with a MongoDB 2.4 BSON file and Python's 'bson' module:

    import bson

    with open('survey.bson', 'rb') as f:
        data = bson.decode_all(f.read())

This returned a list of dictionaries matching the JSON documents stored in that mongo collection.
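A dump file of this kind is a run of BSON documents written back to back, each starting with its own little-endian int32 length, which is why decoding the whole file yields a list. Below is a rough sketch of splitting such a byte string into per-document chunks using only `struct`. Note that `iter_bson_documents` is a hypothetical helper, not part of any bson package, and each chunk would still have to be handed to a real decoder:

```python
import struct


def iter_bson_documents(data):
    """Yield the raw bytes of each document in concatenated BSON data."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated length prefix at offset %d" % offset)
        # Each document begins with its total size as a little-endian int32.
        (length,) = struct.unpack_from("<i", data, offset)
        if length < 5 or offset + length > len(data):
            raise ValueError(
                "bad document length %d at offset %d" % (length, offset)
            )
        yield data[offset:offset + length]
        offset += length


# Two hand-built documents: {} and {"a": 1} (int32 value).
empty_doc = b"\x05\x00\x00\x00\x00"
a_doc = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
chunks = list(iter_bson_documents(empty_doc + a_doc))
```

Splitting first can be handy when the decoder at hand only accepts one document per call.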

The raw data returned by f.read() looks like this in BSON:

    >>> rawdata[:100]
    '\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02_type\x00\x07\x00\x00\x00simple\x00\tchanged\x00\xd0\xbb\xb2\x9eI\x01\x00\x00\tcreated\x00\xd0L\xdcfI\x01\x00\x00\x02description\x00\x14\x00\x00\x00testing the bu'
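The leading bytes of that output can be read by hand. Here is a small sketch, assuming the BSON layout of an int32 document length followed by elements of the form type byte, NUL-terminated name, value. It only inspects the first 17 bytes shown above:

```python
import struct

# First 17 bytes of the raw data shown above.
head = b"\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00"

# Bytes 0-3: total size of the first document, little-endian int32.
(doc_length,) = struct.unpack_from("<i", head, 0)

# Byte 4: element type; 0x12 is a 64-bit integer in BSON.
element_type = head[4]

# The element name runs up to the next NUL byte.
name_end = head.index(b"\x00", 5)
name = head[5:name_end].decode("utf-8")

# The 8 bytes after the name hold the int64 value.
(value,) = struct.unpack_from("<q", head, name_end + 1)

print(doc_length, hex(element_type), name, value)
```

Read this way, the first document is 260 bytes long and begins with an `_id` field whose value is 1.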

Wander Nauta · Answer

loads expects a string (that's what the 's' stands for), not a file. Try reading from the file and passing the result to loads.
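The same naming convention appears in the standard library's `json` module, which makes for an easy illustration without any BSON involved. This is only an analogy; whether a given bson package also offers a file-based `load` is not something this thread confirms.

```python
import io
import json

text = '{"answer": 42}'

# loads: takes the serialized data itself.
from_string = json.loads(text)

# load: takes a file-like object and reads from it.
from_file = json.load(io.StringIO(text))
```

Both calls produce the same dictionary; only the kind of argument differs.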