How to push CSV data to MongoDB using Python

Question

I'm trying to push CSV data into MongoDB using Python. I'm a beginner with both Python and MongoDB. I used the following code:

    import csv
    import json
    import pandas as pd
    import sys, getopt, pprint
    from pymongo import MongoClient

    #CSV to JSON Conversion
    csvfile = open('C://test//final-current.csv', 'r')
    jsonfile = open('C://test//6.json', 'a')
    reader = csv.DictReader( csvfile )
    header= [ "S.No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V" ,"Time"]
    #fieldnames=header
    output=[]
    for each in reader:
        row={}
        for field in header:
            row[field]=each[field]
        output.append(row)

    json.dump(output, jsonfile, indent=None, sort_keys=False , encoding="UTF-8")
    mongo_client=MongoClient()
    db=mongo_client.october_mug_talk
    db.segment.drop()
    data=pd.read_csv('C://test//6.json', error_bad_lines=0)
    df = pd.DataFrame(data)
    records = csv.DictReader(df)
    db.segment.insert(records)

But the output comes out in this format:

    /* 0 */
    {
        "_id" : ObjectId("54891c4ffb2a0303b0d43134"),
        "[{\"AverageTradedPrice\":\"0\"" : "BuyPrice:\"349.75\""
    }

    /* 1 */
    {
        "_id" : ObjectId("54891c4ffb2a0303b0d43135"),
        "[{\"AverageTradedPrice\":\"0\"" : "BuyQuantity:\"3000\""
    }

    /* 2 */
    {
        "_id" : ObjectId("54891c4ffb2a0303b0d43136"),
        "[{\"AverageTradedPrice\":\"0\"" : "ClosePrice:\"350\""
    }

    /* 3 */
    {
        "_id" : ObjectId("54891c4ffb2a0303b0d43137"),
        "[{\"AverageTradedPrice\":\"0\"" : "HighPrice:\"0\""
    }

What I actually want is one document per _id, with all the other fields stored as fields of that document:

    "_id" : ObjectId("54891c4ffb2a0303b0d43137")
    AveragetradedPrice : 0
    HighPrice : 0
    ClosePrice : 350
    buyprice : 350.75

Please help me. Thanks in advance.

Viswanathan · Accepted Answer

Thanks for the suggestions. Here is the corrected code:

    import csv
    import json
    import pandas as pd
    import sys, getopt, pprint
    from pymongo import MongoClient

    # Read the CSV directly; no intermediate JSON file is needed
    csvfile = open('C://test//final-current.csv', 'r')
    reader = csv.DictReader( csvfile )
    mongo_client=MongoClient()
    db=mongo_client.october_mug_talk
    db.segment.drop()
    header= [ "S No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V" ,"Time"]

    for each in reader:
        # Build one document per CSV row
        row={}
        for field in header:
            row[field]=each[field]

        # insert() was removed in PyMongo 4; insert_one() is the current call
        db.segment.insert_one(row)
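
The loop above stores every value as a string, while the desired output in the question shows numbers. A minimal sketch of one way to convert numeric-looking values before inserting is given here. The helper names (`to_number`, `rows_to_documents`, `insert_documents`) and the two-row sample are made up for illustration, and the insert step assumes a MongoDB server on localhost.

```python
import csv
import io


def to_number(value):
    """Return int or float if the string looks numeric, else the value unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


def rows_to_documents(csv_text, fields):
    """Build one dict per CSV row, keeping only the requested fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{field: to_number(row[field]) for field in fields} for row in reader]


# Hypothetical two-row sample, using column names from the question.
SAMPLE = "Instrument Name,Buy Price,Close Price\nABC,349.75,350\nXYZ,12.5,13\n"
documents = rows_to_documents(SAMPLE, ["Instrument Name", "Buy Price", "Close Price"])
print(documents[0])


def insert_documents(docs):
    """Insert the documents; pymongo is imported here so the rest runs without it."""
    from pymongo import MongoClient
    client = MongoClient()  # assumption: MongoDB server on localhost:27017
    return client.october_mug_talk.segment.insert_many(docs)
```

Calling `insert_documents(documents)` would then write one document per row, each with its own `_id`.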

Adil · Answer

The easiest way is to use pandas. My code is:

    import json
    import pymongo
    import pandas as pd

    myclient = pymongo.MongoClient()

    df = pd.read_csv('yourcsv.csv',encoding = 'ISO-8859-1')   # loading csv file
    df.to_json('yourjson.json')                               # saving to json file
    jdf = open('yourjson.json').read()                        # loading the json file
    data = json.loads(jdf)                                    # reading json file

Now you can insert this JSON into your MongoDB database :-]
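
One detail to watch: by default `DataFrame.to_json()` writes column-oriented JSON (`{column: {row_index: value}}`), so `data` is a single dict rather than a list of row documents. Passing `orient='records'` to `to_json` is one way around that. Another is to reshape the loaded dict, as in this rough sketch. The helper names, the sample JSON, and the database and collection names are placeholders, and the sort assumes the default integer row index.

```python
import json


def columns_to_records(column_data):
    """Turn {column: {row_index: value}} into a list of {column: value} dicts."""
    records = {}
    for column, cells in column_data.items():
        for index, value in cells.items():
            records.setdefault(index, {})[column] = value
    # Row indexes arrive as strings after a JSON round trip; sort them numerically.
    return [records[index] for index in sorted(records, key=int)]


# Hypothetical sample in the shape that DataFrame.to_json() writes by default.
raw = '{"Buy Price": {"0": 349.75, "1": 12.5}, "Close Price": {"0": 350, "1": 13}}'
records = columns_to_records(json.loads(raw))
print(records)


def insert_records(docs):
    """Insert the records; pymongo is imported lazily so the rest runs without it."""
    from pymongo import MongoClient
    collection = MongoClient()["your_db"]["your_collection"]  # placeholder names
    return collection.insert_many(docs)
```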

deenaik · Answer

Assuming your CSV has a header row, there is a better way that needs fewer imports.

    from pymongo import MongoClient
    import csv

    # DB connectivity
    client = MongoClient('localhost', 27017)
    db = client.db
    collection = db.collection

    FILEPATH = 'your_file.csv'  # placeholder: path to your CSV file

    # Function to parse csv to dictionary
    def csv_to_dict():
        with open(FILEPATH, newline='') as csvfile:
            reader = csv.DictReader(csvfile)
            result = {}
            for row in reader:
                # Use the 'First_value' column as the key for each row
                key = row.pop('First_value')
                result[key] = row
        return result

    # Final insert statement
    collection.insert_one(csv_to_dict())

Hope that helps.
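
To make the resulting shape concrete: this approach stores the whole file as one document, with each row nested under the value of its key column. The sketch below reproduces just the dictionary-building step on an in-memory sample. The function name `csv_to_nested_dict` and the sample data are illustrative only; `First_value` is the key column named in the answer.

```python
import csv
import io


def csv_to_nested_dict(csv_text, key_column):
    """Key every row by one column's value, collecting all rows in a single dict."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.pop(key_column)
        result[key] = row
    return result


# Hypothetical sample; "First_value" is the key column named in the answer.
SAMPLE = "First_value,Buy Price,Close Price\nABC,349.75,350\nXYZ,12.5,13\n"
nested = csv_to_nested_dict(SAMPLE, "First_value")
print(nested)
```

Rows that share the same key overwrite one another, and a single MongoDB document is limited to 16 MB, so this layout fits small files with a unique key column.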

Perfect · Answer

Why insert the data one row at a time? Take a look at this.

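    import pandas as pd
    from pymongo import MongoClient

    client = MongoClient(<your_credentials>)  # replace with your connection details
    database = client['YOUR_DB_NAME']
    collection = database['your_collection']

    def csv_to_json(filename, header=0):
        # header=0 takes column names from the first row; with header=None the
        # columns become integers, which MongoDB rejects as document keys
        data = pd.read_csv(filename, header=header)
        return data.to_dict('records')

    # Insert all rows in a single call
    collection.insert_many(csv_to_json('your_file_path'))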

Note that if the file is too large, your app may crash, because the whole file is loaded into memory before the insert.
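
For large files, one option is to stream the CSV and insert it in batches rather than all at once. This is a minimal sketch using only the standard library. The names `iter_batches` and `insert_in_batches` and the batch size are assumptions, and a plain list stands in for the collection so the example runs without a server.

```python
import csv
import io
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(csv_file, insert_many, batch_size=1000):
    """Stream rows from an open CSV file and hand them to insert_many in batches."""
    total = 0
    for batch in iter_batches(csv.DictReader(csv_file), batch_size):
        insert_many(batch)
        total += len(batch)
    return total


# Demo with a list standing in for the collection, so no server is needed.
received = []
SAMPLE = "name,price\na,1\nb,2\nc,3\n"
count = insert_in_batches(io.StringIO(SAMPLE), received.append, batch_size=2)
print(count, [len(batch) for batch in received])
```

With a real collection, `collection.insert_many` can be passed as the callback and the file opened with `open(path, newline='')`, so that only one batch of rows is held in memory at a time.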