Pandas: Converting a date "object" column to int

Question

I have a Pandas DataFrame and need to convert the column containing dates to int, but unfortunately every solution I have found ends in an error (shown below).

test_df.info()
<class 'pandas.core.frame.DataFrame'>
Data columns (total 4 columns):
Date        1505 non-null object
Avg         1505 non-null float64
TotalVol    1505 non-null float64
Ranked      1505 non-null int32
dtypes: float64(2), int32(1), object(1)

Sample data:

         Date          Avg  TotalVol  Ranked
0  2014-03-29  4400.000000  0.011364       1
1  2014-03-30  1495.785714  4.309310       1
2  2014-03-31  1595.666667  0.298571       1
3  2014-04-01  1523.166667  0.270000       1
4  2014-04-02  1511.428571  0.523792       1

I think I have tried everything, but nothing seems to work:

test_df['Date'].astype(int):

TypeError: int() argument must be a string, a bytes-like object or a number, not 'datetime.date'

test_df['Date']=pd.to_numeric(test_df['Date']):

TypeError: Invalid object type at position 0

test_df['Date'].astype(str).astype(int):

ValueError: invalid literal for int() with base 10: '2014-03-29'

test_df['Date'].apply(pd.to_numeric, errors='coerce'):

Converts the whole column to NaN

Neroksi · Accepted Answer

If the dates are stored as strings, test_df['Date'].astype(int) fails because they still contain hyphens ("-"). First remove them with test_df['Date'].str.replace("-",""), then apply your first method to the resulting series. The whole solution is therefore:

test_df['Date'].str.replace("-","").astype(int)

Note that this will not work if the "Date" column does not hold strings, typically because Pandas has already parsed the series as Timestamp. In that case you can use:

test_df['Date'].dt.strftime("%Y%m%d").astype(int)
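The first traceback in the question names 'datetime.date', which suggests the object column may hold datetime.date values, not strings or Timestamps. In that situation neither the .str nor the .dt accessor is expected to work directly, so one option is to pass the column through pd.to_datetime first. A minimal sketch, assuming that column content (the three-row test_df below is a hypothetical stand-in for the real data):

```python
import datetime

import pandas as pd

# Hypothetical stand-in for test_df: an object column of datetime.date values,
# which is what the TypeError quoted in the question points to.
test_df = pd.DataFrame(
    {
        "Date": [
            datetime.date(2014, 3, 29),
            datetime.date(2014, 3, 30),
            datetime.date(2014, 3, 31),
        ],
        "Ranked": [1, 1, 1],
    }
)

# Parse to datetime64 first so that the .dt accessor is available,
# then format as YYYYMMDD text and cast that text to integers.
date_int = pd.to_datetime(test_df["Date"]).dt.strftime("%Y%m%d").astype("int64")

print(date_int.tolist())
```

Because pd.to_datetime also accepts strings such as '2014-03-29', the same line should cover the string case as well.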

Rakesh · Answer

It looks like you need pd.to_datetime().dt.strftime("%Y%m%d").

Demo:

import pandas as pd
df = pd.DataFrame({"Date": ["2014-03-29", "2014-03-30", "2014-03-31"]})
df["Date"] = pd.to_datetime(df["Date"]).dt.strftime("%Y%m%d")
print(df)

Output:

       Date
0  20140329
1  20140330
2  20140331
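The printed values look like numbers, but strftime returns text, so the column in this demo holds strings. Since the question asks for int, a final cast is probably still needed. The sketch below reuses the demo data; the names as_text and as_int are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2014-03-29", "2014-03-30", "2014-03-31"]})

# strftime formats each timestamp as text, e.g. "20140329"
as_text = pd.to_datetime(df["Date"]).dt.strftime("%Y%m%d")
print(type(as_text.iloc[0]))

# Cast the digit strings to integers to get a numeric column
as_int = as_text.astype("int64")
print(as_int.dtype)
```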

msolomon87 · Answer

This should work:

df['Date'] = pd.to_numeric(df.Date.str.replace('-',''))
print(df['Date'])
0    20140329
1    20140330
2    20140331
3    20140401
4    20140402
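One way to sanity-check the YYYYMMDD integers is to parse them back into dates and compare with the original strings. This is a sketch under the assumption that the column starts out as strings, as this answer requires; the variable names are illustrative only:

```python
import pandas as pd

original = ["2014-03-29", "2014-03-30", "2014-03-31", "2014-04-01", "2014-04-02"]
df = pd.DataFrame({"Date": original})

# Same idea as the answer: drop the hyphens, then convert the digits to numbers
date_int = pd.to_numeric(df["Date"].str.replace("-", ""))

# Round trip: turn the integers back into text and parse with an explicit format
restored = pd.to_datetime(date_int.astype(str), format="%Y%m%d")

print(restored.dt.strftime("%Y-%m-%d").tolist() == original)
```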