Pandas groupby to to_csv

Question

I want to output a Pandas groupby dataframe to CSV. I have tried various StackOverflow solutions, but none of them have worked.

Python 3.6.1, Pandas 0.20.1

The groupby result looks like this:

             id  month   year  count
    week
    0      9066     82  32142    895
    1      7679     84  30112    749
    2      8368    126  42187    872
    3     11038    102  34165    976
    4      8815    117  34122    767
    5     10979    163  50225   1252
    6      8726    142  38159    996
    7      5568     63  26143    582

I want a CSV that looks like this:

    week  count
    0       895
    1       749
    2       872
    3       976
    4       767
    5      1252
    6       996
    7       582

Current code:

    week_grouped = df.groupby('week')
    week_grouped.sum()  # At this point you have the groupby result
    week_grouped.to_csv('week_grouped.csv')  # Can't do this - .to_csv is not a df function.

SO solutions I have read:

output groupby to csv file pandas

    week_grouped.drop_duplicates().to_csv('week_grouped.csv')

Result: AttributeError: Cannot access callable attribute 'drop_duplicates' of 'DataFrameGroupBy' objects, try using the 'apply' method

Python pandas - writing groupby output to file

    week_grouped.reset_index().to_csv('week_grouped.csv')

Result: AttributeError: "Cannot access callable attribute 'reset_index' of 'DataFrameGroupBy' objects, try using the 'apply' method"

Alex Luis Arias · Answer

Try this:

    week_grouped = df.groupby('week')
    week_grouped.sum().reset_index().to_csv('week_grouped.csv')

This writes the entire dataframe to the file. If you only want those two columns, then:

    week_grouped = df.groupby('week')
    week_grouped.sum().reset_index()[['week', 'count']].to_csv('week_grouped.csv')

Here is a line-by-line explanation of your original code:

    # This creates a "groupby" object (not a dataframe object)
    # and you store it in the week_grouped variable.
    week_grouped = df.groupby('week')

    # This instructs pandas to sum up all the numeric type columns in each
    # group. This returns a dataframe where each row is the sum of the
    # group's numeric columns. You're not storing this dataframe in your
    # example.
    week_grouped.sum()

    # Here you're calling the to_csv method on a groupby object... but
    # that object type doesn't have that method. Dataframes have that method.
    # So we should store the previous line's result (a dataframe) into a variable
    # and then call its to_csv method.
    week_grouped.to_csv('week_grouped.csv')

    # Like this:
    summed_weeks = week_grouped.sum()
    summed_weeks.to_csv('...')

    # Or with less typing simply
    week_grouped.sum().to_csv('...')
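As a self-contained sketch of the approach above, the following uses a small made-up dataframe (the values are hypothetical, not the asker's data) and returns the CSV as a string so the result is easy to inspect. The `index=False` argument is not part of the original answer; it is added here on the assumption that the leftover integer index should not appear in the file.

```python
import pandas as pd

# Hypothetical raw data with several rows per week (not the asker's real data).
df = pd.DataFrame({
    "id":    [1, 2, 3, 4],
    "month": [1, 1, 2, 2],
    "year":  [2017, 2017, 2017, 2017],
    "count": [10, 20, 5, 7],
    "week":  [0, 0, 1, 1],
})

# groupby(...).sum() returns a DataFrame indexed by week.
summed = df.groupby("week").sum()

# reset_index() turns the week index back into an ordinary column;
# selecting two columns keeps only what the desired CSV needs.
out = summed.reset_index()[["week", "count"]]

# With no path, to_csv returns the CSV text; index=False leaves out
# the default 0..n-1 index.
csv_text = out.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument, for example `out.to_csv('week_grouped.csv', index=False)`, writes the same text to disk instead of returning it.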

Peter Leimbigler · Answer

Change your second line to week_grouped = week_grouped.sum() and re-run all three lines.

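If you run week_grouped.sum() in its own Jupyter notebook cell, you will see that the statement returns its result to the cell's output instead of assigning it back to week_grouped. Some pandas methods have an inplace=True argument (e.g., df.sort_values(by=col_name, inplace=True)), but sum does not.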
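The difference between a returned result and an in-place change can be sketched as follows. The dataframe and the variable names `grouped`, `summed`, `kind_after_sum` and `is_frame` are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"week": [0, 0, 1], "count": [1, 2, 3]})

grouped = df.groupby("week")

# The sum is computed and returned, but nothing keeps it:
# grouped is still a groupby object afterwards.
grouped.sum()
kind_after_sum = type(grouped).__name__

# Assigning the return value keeps the aggregated DataFrame.
summed = grouped.sum()
is_frame = isinstance(summed, pd.DataFrame)

# sort_values is one of the methods that does accept inplace=True,
# so here df itself is modified and nothing needs to be assigned.
df.sort_values(by="count", ascending=False, inplace=True)
```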

Edit: does each week number appear only once in your CSV? If so, here is a simpler solution that does not use groupby:

    import pandas as pd

    df = pd.read_csv('input.csv')
    df[['id', 'count']].to_csv('output.csv')

Revaz · Answer

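Group By returns key, value pairs, where the key is the identifier of the group and the value is the group itself, i.e. the subset of the original df that matched the key.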

In your example, week_grouped = df.groupby('week') is a set of groups (a pandas.core.groupby.DataFrameGroupBy object), which you can explore in detail as follows:

    for k, gr in week_grouped:
        # do your stuff instead of print
        print(k)
        print(type(gr))  # This will output <class 'pandas.core.frame.DataFrame'>
        print(gr)
        # You can save each 'gr' in a csv as follows
        gr.to_csv('{}.csv'.format(k))

Or you can compute an aggregation function on the grouped object:

    result = week_grouped.sum()
    # This will be already one row per key and its aggregation result
    result.to_csv('result.csv')

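In your example, you need to assign the result of the function to some variable, because sum() returns a new object and does not modify week_grouped in place.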

    some_variable = week_grouped.sum()
    some_variable.to_csv('week_grouped.csv')  # This will work

Basically, result.csv and week_grouped.csv are meant to be the same.
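Both options from this answer, one file per group and one aggregated file, can be tried on a toy dataframe. This is only a sketch: the data, the temporary output directory and the names `out_dir` and `written` are assumptions made for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"week": [0, 0, 1], "count": [1, 2, 3]})

# Throwaway directory so the example does not clutter the working directory.
out_dir = tempfile.mkdtemp()

written = []
for key, group in df.groupby("week"):
    # key is the week number; group is a DataFrame holding that week's rows.
    path = os.path.join(out_dir, "{}.csv".format(key))
    group.to_csv(path, index=False)
    written.append((int(key), len(group)))

# One aggregated file with one row per week; week is the index here,
# so it is written as the first CSV column.
result = df.groupby("week").sum()
result.to_csv(os.path.join(out_dir, "result.csv"))
```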

Lucas Dresl · Answer

I don't think you need to use groupby. You can just drop the columns you don't want.

    df = df.drop(['month', 'year'], axis=1)
    df = df.reset_index()  # reset_index returns a new dataframe, so assign it
    df.to_csv('Your path')
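A minimal sketch of this column-dropping idea, assuming the starting point is the already aggregated table from the question (only its first two rows are reproduced here). Dropping `id` as well and passing `index=False` are assumptions added so that the output matches the week/count layout the asker wants.

```python
import pandas as pd

# First two rows of the aggregated table from the question, with week as the index.
summed = pd.DataFrame(
    {
        "id": [9066, 7679],
        "month": [82, 84],
        "year": [32142, 30112],
        "count": [895, 749],
    },
    index=pd.Index([0, 1], name="week"),
)

# Drop the unwanted columns. Dropping 'id' too is an assumption;
# the answer above only drops month and year.
trimmed = summed.drop(["id", "month", "year"], axis=1)

# reset_index() returns a new frame, so the result has to be assigned.
trimmed = trimmed.reset_index()

# With no path, to_csv returns the CSV text instead of writing a file.
csv_text = trimmed.to_csv(index=False)
print(csv_text)
```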