Convert a float Series to an integer Series in pandas

Question

I have the following DataFrame:

In [31]: rise_p
Out[31]:
         time    magnitude
0  1379945444   156.627598
1  1379945447  1474.648726
2  1379945448  1477.448999
3  1379945449  1474.886202
4  1379945699  1371.454224

Now I want to group the rows that fall within one minute of each other, so I divide the time series by 100. It becomes:

In [32]: rise_p/100
Out[32]:
          time  magnitude
0  13799454.44   1.566276
1  13799454.47  14.746487
2  13799454.48  14.774490
3  13799454.49  14.748862
4  13799456.99  13.714542

As explained above, I want to create groups based on time, so the expected subgroups are the rows with times 13799454 and 13799456. I do this:

In [37]: ts = rise_p['time']/100
In [38]: s = rise_p/100
In [39]: new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-5ea498cf32b2> in <module>()
----> 1 new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]

TypeError: only length-1 arrays can be converted to Python scalars

Since int() does not accept a Series or a list as its argument, how can I convert ts to an integer Series? Is there a way to do this in pandas?

drexiya · Accepted Answer

Try converting with astype:

new_re_df = [s.iloc[np.where(ts.astype(int) == int(i))] for i in ts]
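To make the difference between int() and astype concrete, here is a minimal sketch that rebuilds only the time column from the question (the variable name ts_int is mine, not the asker's). It assumes a reasonably recent pandas.

```python
import pandas as pd

# Values copied from the `time` column in the question, divided by 100.
ts = pd.Series([1379945444, 1379945447, 1379945448, 1379945449, 1379945699]) / 100

# int(ts) fails because ts holds five values rather than one scalar.
# astype(int) converts element by element and drops the fractional part.
ts_int = ts.astype(int)
print(ts_int.tolist())  # [13799454, 13799454, 13799454, 13799454, 13799456]
```

The comparison ts.astype(int) == int(i) in the answer then yields a boolean Series, which is what np.where needs.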

Edit

As @Rutger Kassies suggested, a better approach is to cast the Series first and then group by it:

rise_p['ts'] = (rise_p.time / 100).astype('int')
ts_grouped = rise_p.groupby('ts')
...
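The elided part depends on what you want to do with each group. As one possibility, the sketch below rebuilds the question's frame and collects the sub-frames into a dictionary keyed by the integer time; the name groups is only illustrative.

```python
import pandas as pd

# Data copied from the question.
rise_p = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
    "magnitude": [156.627598, 1474.648726, 1477.448999, 1474.886202, 1371.454224],
})

# Cast first, then group, as in the answer.
rise_p["ts"] = (rise_p.time / 100).astype("int")
ts_grouped = rise_p.groupby("ts")

# Iterating over a groupby yields (key, sub-DataFrame) pairs.
groups = {key: group for key, group in ts_grouped}
for key, group in groups.items():
    print(key, len(group))
```

Note that dividing by 100 produces 100-second buckets, not 60-second ones, so this only approximates "within one minute".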

Jeff · Answer

Here is another way to solve your problem.

In [3]: df
Out[3]:
         time    magnitude
0  1379945444   156.627598
1  1379945447  1474.648726
2  1379945448  1477.448999
3  1379945449  1474.886202
4  1379945699  1371.454224

In [4]: df.dtypes
Out[4]:
time           int64
magnitude    float64
dtype: object

Convert the epoch timestamps to datetimes, treating them as seconds:

In [7]: df['time'] = pd.to_datetime(df['time'],unit='s')

Set the index:

In [8]: df.set_index('time',inplace=True)
In [9]: df
Out[9]:
                       magnitude
time
2013-09-23 14:10:44   156.627598
2013-09-23 14:10:47  1474.648726
2013-09-23 14:10:48  1477.448999
2013-09-23 14:10:49  1474.886202
2013-09-23 14:14:59  1371.454224

Group by 1 minute and take the mean of the results (how= can also be an arbitrary function):

In [10]: df.resample('1Min',how=np.mean)
Out[10]:
                       magnitude
time
2013-09-23 14:10:00  1145.902881
2013-09-23 14:11:00          NaN
2013-09-23 14:12:00          NaN
2013-09-23 14:13:00          NaN
2013-09-23 14:14:00  1371.454224
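The session above reflects the pandas version current when the answer was written. In later releases, as far as I know, resample no longer accepts how= and the aggregation is chained as a method instead. A sketch of the same steps under that assumption (per_minute is just an illustrative name):

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
    "magnitude": [156.627598, 1474.648726, 1477.448999, 1474.886202, 1371.454224],
})

# Interpret the integers as seconds since the epoch and index by them.
df["time"] = pd.to_datetime(df["time"], unit="s")
df = df.set_index("time")

# Chain the aggregation instead of passing how=; empty minutes become NaN.
per_minute = df.resample("1min").mean()
print(per_minute)
```

An arbitrary function can be supplied with .agg(func) or .apply(func) in place of .mean().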

winni2k · Answer

Another, very general way to convert ts to a Series of ints is:

rise_p['ts'] = (rise_p.time / 100).apply(lambda val: int(val))

apply lets you apply an arbitrary function to each value of a Series object. apply also works on the columns of a DataFrame object.
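To illustrate the "arbitrary function" point, here is a small sketch with a hypothetical helper, to_bucket, which is not part of pandas or of the original answer. Extra keyword arguments given to Series.apply are forwarded to the function.

```python
import pandas as pd

def to_bucket(seconds, width=100):
    """Hypothetical helper: map an epoch second to its bucket number."""
    return int(seconds // width)

# Time values copied from the question.
rise_p = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
})

# Same result as (rise_p.time / 100).apply(lambda val: int(val)).
rise_p["ts"] = rise_p["time"].apply(to_bucket)

# width=60 is passed through to to_bucket, giving true one-minute buckets.
rise_p["ts_60"] = rise_p["time"].apply(to_bucket, width=60)
print(rise_p)
```

For a plain cast, astype is usually faster than apply because it avoids calling a Python function per element; apply is the better fit when the per-value logic is more than a type conversion.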