Round to the nearest second - Python

Question

I have a large dataset with more than 500,000 date and time stamps that look like this:

    date        time
    2017-06-25  00:31:53.993
    2017-06-25  00:32:31.224
    2017-06-25  00:33:11.223
    2017-06-25  00:33:53.876
    2017-06-25  00:34:31.219
    2017-06-25  00:35:12.634

How do I round these timestamps to the nearest second?

My code looks like this:

    readcsv = pd.read_csv(filename)
    log_date = readcsv.date
    log_time = readcsv.time

    readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
    readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time
    timestamp = [datetime.datetime.combine(log_date[i], log_time[i]) for i in range(len(log_date))]

So now I have combined the dates and times into a list of datetime.datetime objects that look like this:

    datetime.datetime(2017,6,25,00,31,53,993000)
    datetime.datetime(2017,6,25,00,32,31,224000)
    datetime.datetime(2017,6,25,00,33,11,223000)
    datetime.datetime(2017,6,25,00,33,53,876000)
    datetime.datetime(2017,6,25,00,34,31,219000)
    datetime.datetime(2017,6,25,00,35,12,634000)

Where do I go from here? The df.timestamp.dt.round('1s') function does not seem to be working. Also, when I used .split(), I ran into problems when the seconds and minutes exceeded 59.

Many thanks

Srisaila · Accepted Answer

Using a for loop and str.split():

    dts = ['2017-06-25 00:31:53.993',
           '2017-06-25 00:32:31.224',
           '2017-06-25 00:33:11.223',
           '2017-06-25 00:33:53.876',
           '2017-06-25 00:34:31.219',
           '2017-06-25 00:35:12.634']

    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]

        print(date + ' ' + h + ':' + m + ':' + s)

Output:

    2017-06-25 00:31:54
    2017-06-25 00:32:31
    2017-06-25 00:33:11
    2017-06-25 00:33:54
    2017-06-25 00:34:31
    2017-06-25 00:35:13

You can turn that into a function:

    def round_seconds(dts):
        result = []
        for item in dts:
            date = item.split()[0]
            h, m, s = [item.split()[1].split(':')[0],
                       item.split()[1].split(':')[1],
                       str(round(float(item.split()[1].split(':')[-1])))]
            result.append(date + ' ' + h + ':' + m + ':' + s)

        return result

Testing the function:

    dts = ['2017-06-25 00:31:53.993',
           '2017-06-25 00:32:31.224',
           '2017-06-25 00:33:11.223',
           '2017-06-25 00:33:53.876',
           '2017-06-25 00:34:31.219',
           '2017-06-25 00:35:12.634']

    from pprint import pprint

    pprint(round_seconds(dts))

Output:

    ['2017-06-25 00:31:54',
     '2017-06-25 00:32:31',
     '2017-06-25 00:33:11',
     '2017-06-25 00:33:54',
     '2017-06-25 00:34:31',
     '2017-06-25 00:35:13']

Since you appear to be using Python 2.7, where round() returns a float, you need to strip the trailing zero by changing

    str(round(float(item.split()[1].split(':')[-1])))

to

    str(round(float(item.split()[1].split(':')[-1]))).rstrip('0').rstrip('.')

I tried the function with Python 2.7 at repl.it and it ran as expected.
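One edge case the string-splitting approach does not cover is a seconds value that rounds up to 60 (for example `00:31:59.600`), which has to carry into the minutes. A minimal sketch of one way around that, assuming every input string follows the `YYYY-MM-DD HH:MM:SS.fff` layout shown in the question, is to parse each string into a datetime first. The name `round_seconds_safe` is made up for this illustration.

```python
from datetime import datetime, timedelta

def round_seconds_safe(dts):
    """Round timestamp strings to the nearest second, carrying over at 60."""
    result = []
    for item in dts:
        # Assumes the 'YYYY-MM-DD HH:MM:SS.fff' layout from the question.
        parsed = datetime.strptime(item, '%Y-%m-%d %H:%M:%S.%f')
        # Adding half a second and dropping the microseconds rounds half up.
        rounded = (parsed + timedelta(milliseconds=500)).replace(microsecond=0)
        result.append(rounded.strftime('%Y-%m-%d %H:%M:%S'))
    return result

print(round_seconds_safe(['2017-06-25 00:31:53.993', '2017-06-25 23:59:59.600']))
```

Because the arithmetic happens on datetime objects, a carry into the next minute, hour, or day needs no special handling, and the seconds stay zero-padded.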

electrovir · Answer

Without any additional packages, a datetime object can be rounded to the nearest second with the following simple function:

    import datetime

    def roundSeconds(dateTimeObject):
        newDateTime = dateTimeObject

        if newDateTime.microsecond >= 500000:
            newDateTime = newDateTime + datetime.timedelta(seconds=1)

        return newDateTime.replace(microsecond=0)

cs95 · Answer

If you're using pandas, you can round the data to the nearest second using dt.round:

    df

                    timestamp
    0 2017-06-25 00:31:53.993
    1 2017-06-25 00:32:31.224
    2 2017-06-25 00:33:11.223
    3 2017-06-25 00:33:53.876
    4 2017-06-25 00:34:31.219
    5 2017-06-25 00:35:12.634

    df.timestamp.dt.round('1s')

    0   2017-06-25 00:31:54
    1   2017-06-25 00:32:31
    2   2017-06-25 00:33:11
    3   2017-06-25 00:33:54
    4   2017-06-25 00:34:31
    5   2017-06-25 00:35:13
    Name: timestamp, dtype: datetime64[ns]

If timestamp isn't a datetime column, convert it first using pd.to_datetime:

df.timestamp = pd.to_datetime(df.timestamp)

After that, dt.round should work.
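For the layout in the question, where the date and the time sit in separate columns, the conversion step might look like the sketch below. The small DataFrame is a hypothetical stand-in for the CSV, and the column names `timestamp` and `rounded` are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question: separate date and time columns.
df = pd.DataFrame({
    'date': ['2017-06-25', '2017-06-25', '2017-06-25'],
    'time': ['00:31:53.993', '00:32:31.224', '00:59:59.634'],
})

# Join the two string columns and parse them into a single datetime64 column.
df['timestamp'] = pd.to_datetime(df['date'] + ' ' + df['time'])

# Round to the nearest second; the carry into minutes and hours is handled by pandas.
df['rounded'] = df['timestamp'].dt.round('1s')
print(df['rounded'])
```

Keeping the values in a datetime64 column, rather than building a Python list of datetime.datetime objects, is what makes the `.dt` accessor available.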

mike rodent · Answer

The question doesn't say how you want to round. For time values, rounding down is often the appropriate choice. This isn't statistics.

rounded_down_datetime = raw_datetime.replace(microsecond=0)

Rudradev Pal · Answer

If you are storing the dataset in a file, you can do it like this:

    with open('../dataset.txt') as fp:
        for line in fp:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            print("\n" + line)
            # Everything after the last ':' is the seconds field.
            sec = line[line.rfind(':') + 1:]
            rounded_num = int(round(float(sec)))
            print(line[:line.rfind(':') + 1] + str(rounded_num))
            # Size of the rounding adjustment, in seconds.
            print(abs(float(sec) - rounded_num))

If you are storing the dataset in a list:

    dts = ['2017-06-25 00:31:53.993',
           '2017-06-25 00:32:31.224',
           '2017-06-25 00:33:11.223',
           '2017-06-25 00:33:53.876',
           '2017-06-25 00:34:31.219',
           '2017-06-25 00:35:12.634']

    for line in dts:
        print("\n" + line.strip())
        # Everything after the last ':' is the seconds field.
        sec = line[line.rfind(':') + 1:]
        rounded_num = int(round(float(sec)))
        print(line[:line.rfind(':') + 1] + str(rounded_num))
        # Size of the rounding adjustment, in seconds.
        print(abs(float(sec) - rounded_num))

gerardw · Answer

An alternate version of @electrovir's solution:

    import datetime

    def roundSeconds(dateTimeObject):
        newDateTime = dateTimeObject + datetime.timedelta(seconds=.5)
        return newDateTime.replace(microsecond=0)
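To see how the half-second shift behaves next to plain truncation (the rounding-down approach suggested earlier in the thread), here is a small comparison sketch. The helper names `round_seconds` and `truncate_seconds` and the sample values are assumptions made for this illustration.

```python
import datetime

def round_seconds(dt):
    # Same idea as the half-second variant above: shift by 0.5 s, then truncate.
    return (dt + datetime.timedelta(seconds=0.5)).replace(microsecond=0)

def truncate_seconds(dt):
    # Rounding down simply discards the microseconds.
    return dt.replace(microsecond=0)

samples = [
    datetime.datetime(2017, 6, 25, 0, 31, 53, 993000),
    datetime.datetime(2017, 6, 25, 0, 35, 12, 434000),
    datetime.datetime(2017, 6, 25, 23, 59, 59, 500000),
]

for dt in samples:
    print(dt, '->', round_seconds(dt), '|', truncate_seconds(dt))
```

The last sample sits exactly on the half second just before midnight: rounding moves it to the next day, while truncation keeps it at 23:59:59.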

Maciej Gumułka · Answer

If anyone wants to round a single datetime item to the nearest second, this works well:

pandas.to_datetime(your_datetime_item).round('1s')

Gerardsson · Answer

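An elegant solution that only requires the standard datetime module. Note that it subtracts the microseconds, so it rounds down to the whole second rather than to the nearest one.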

    import datetime

    currentimemili = datetime.datetime.now()
    currenttimesecs = currentimemili - \
        datetime.timedelta(microseconds=currentimemili.microsecond)
    print(currenttimesecs)