python / pandas: convert month int to month name

Question

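Most of the information I found did not cover python > pandas > DataFrame, hence the question.

I want to convert an integer between 1 and 12 into an abbreviated month name.

I have a df that looks like this:

       client Month
    1  sss    02
    2  yyy    12
    3  www    06

I want the df to look like this:

       client Month
    1  sss    Feb
    2  yyy    Dec
    3  www    Jun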

EoinS · Accepted Answer

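You can do this efficiently by combining calendar.month_abbr with df[col].apply():

    import calendar

    # int(x) lets the lookup work whether Month holds integers or strings such as '02'
    df['Month'] = df['Month'].apply(lambda x: calendar.month_abbr[int(x)])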
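As a minimal sketch of that approach on the question's sample data: the frame below assumes the Month column holds zero-padded strings, which is a guess based on how the values are displayed. The abbreviations come from the current locale, so the output is only Jan to Dec under an English or default locale.

```python
import calendar

import pandas as pd

# Sample frame from the question; storing Month as strings is an assumption.
df = pd.DataFrame({"client": ["sss", "yyy", "www"], "Month": ["02", "12", "06"]})

# calendar.month_abbr is indexed by the integers 1..12, so cast before the lookup.
df["Month"] = df["Month"].astype(int).apply(lambda m: calendar.month_abbr[m])

print(df)
```

If the column already holds integers, the astype(int) call is harmless and can be dropped.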

Vagee · Answer

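I would use strptime and a lambda function for this. Note that this snippet goes in the opposite direction: it parses an abbreviated month name into its month number.

    from time import strptime

    # 'Feb' -> 2
    df['Month'] = df['Month'].apply(lambda x: strptime(x, '%b').tm_mon)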
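To make the direction of this conversion concrete, here is a small sketch that starts from a frame already holding abbreviated names (a hypothetical input, the reverse of the question's data) and recovers the month numbers. Parsing with '%b' depends on the current locale, so English abbreviations are assumed.

```python
from time import strptime

import pandas as pd

# Hypothetical input: the column already contains abbreviated month names.
df = pd.DataFrame({"client": ["sss", "yyy", "www"], "Month": ["Feb", "Dec", "Jun"]})

# strptime parses the abbreviation; tm_mon is the month number (1..12).
df["Month"] = df["Month"].apply(lambda name: strptime(name, "%b").tm_mon)

print(df)
```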

andrew · Answer

This can be done easily with apply on the column.

    import pandas as pd

    df = pd.DataFrame({'client': ['sss', 'yyy', 'www'], 'Month': ['02', '12', '06']})

    look_up = {'01': 'Jan', '02': 'Feb', '03': 'Mar', '04': 'Apr', '05': 'May',
               '06': 'Jun', '07': 'Jul', '08': 'Aug', '09': 'Sep', '10': 'Oct',
               '11': 'Nov', '12': 'Dec'}

    df['Month'] = df['Month'].apply(lambda x: look_up[x])

    df

      Month client
    0   Feb    sss
    1   Dec    yyy
    2   Jun    www

pekapa · Answer

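One way of doing this is with the apply method on the dataframe, but to do that you need a map to convert the months. You can use a function or dictionary for that, or Python's own datetime.

With datetime it would look something like this:

    import datetime

    def mapper(month):
        # You need a date object with the proper month (month must be an integer)
        date = datetime.datetime(2000, month, 1)
        # %b returns the month's abbreviation; see the strftime format codes for other options
        return date.strftime('%b')

    df['Month'].apply(mapper)

In a similar way, you could build your own map for custom names. It would look like this:

    # Integer keys: literals with a leading zero such as 01 are a syntax error in Python 3
    months_map = {1: 'Jan', 2: 'Feb'}  # continue for the remaining months

    def mapper(month):
        return months_map[month]

Obviously, you don't need to define this function explicitly and can use a lambda directly in the apply method.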
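The lambda form mentioned above could be sketched as follows. The sample frame with integer months is an assumption made for illustration, and the year 2000 and day 1 are arbitrary placeholders since only the month matters.

```python
import datetime

import pandas as pd

# Assumed sample data with integer month numbers.
df = pd.DataFrame({"client": ["sss", "yyy", "www"], "Month": [2, 12, 6]})

# Build a throwaway date in the given month and format it with %b.
df["Month"] = df["Month"].apply(
    lambda m: datetime.date(2000, int(m), 1).strftime("%b")
)

print(df)
```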

today · Answer

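Since the abbreviated month names are the first three letters of their full names, you can first convert the Month column to datetime, then use dt.month_name() to get the full name, and finally use str.slice() to take the first three letters. This uses only pandas and fits in one line of code:

    df['Month'] = pd.to_datetime(df['Month'], format='%m').dt.month_name().str.slice(stop=3)

    df

      Month client
    0   Feb    sss
    1   Dec    yyy
    2   Jun    www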

jpp · Answer

The calendar module is useful, but calendar.month_abbr is array-like: it cannot be used directly in a vectorised fashion. For an efficient mapping, construct a dictionary and then use pd.Series.map:

    import calendar

    d = dict(enumerate(calendar.month_abbr))
    df['Month'] = df['Month'].map(d)
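A self-contained sketch of the dictionary approach on the question's sample data follows. The string-typed Month column and the names month_names and out_of_range are assumptions for illustration. It also shows one behaviour worth knowing: keys missing from the dictionary map to NaN instead of raising an error.

```python
import calendar

import pandas as pd

# {0: '', 1: 'Jan', ..., 12: 'Dec'} under an English or default locale.
month_names = dict(enumerate(calendar.month_abbr))

# Sample frame from the question; Month stored as strings is an assumption.
df = pd.DataFrame({"client": ["sss", "yyy", "www"], "Month": ["02", "12", "06"]})

# The dictionary keys are integers, so cast the column before mapping.
df["Month"] = df["Month"].astype(int).map(month_names)

# Values with no matching key (13 here) become NaN.
out_of_range = pd.Series([1, 13]).map(month_names)

print(df)
print(out_of_range)
```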

Performance benchmarking shows a difference of about 130x:

    import calendar

    import numpy as np
    import pandas as pd

    d = dict(enumerate(calendar.month_abbr))
    mapper = calendar.month_abbr.__getitem__

    np.random.seed(0)
    n = 10**5
    df = pd.DataFrame({'A': np.random.randint(1, 13, n)})

    # %timeit is an IPython magic, so run these lines in IPython or Jupyter
    %timeit df['A'].map(d)       # 7.29 ms per loop
    %timeit df['A'].map(mapper)  # 946 ms per loop

Suhas_Pote · Answer

    def mapper(month):
        # month must be a datetime-like value, since it needs a strftime method
        return month.strftime('%b')

    df['Month'] = df['Month'].apply(mapper)

Reference:

http://strftime.org/
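When the column already holds datetime values, the same formatting can be done without apply through the dt accessor. The sketch below assumes a hypothetical Date column with made-up dates, and the result depends on the current locale.

```python
import pandas as pd

# Hypothetical frame with a real datetime column.
df = pd.DataFrame(
    {"Date": pd.to_datetime(["2020-02-15", "2020-12-01", "2020-06-30"])}
)

# Series.dt.strftime formats every timestamp in the column at once.
df["Month"] = df["Date"].dt.strftime("%b")

print(df)
```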

Heather · Answer

Having tested all of these on a large dataset, I found the following to be the fastest:

    import calendar

    def month_mapping():
        # I'm lazy so I have a stash of functions already written so
        # I don't have to write them out every time. This returns the
        # {1:'Jan'....12:'Dec'} dict in the laziest way...
        abbrevs = {}
        for month in range(1, 13):
            abbrevs[month] = calendar.month_abbr[month]
        return abbrevs

    abbrevs = month_mapping()

    # 'Date Col' must be a datetime column for the .dt accessor to work
    df['Month Abbrev'] = df['Date Col'].dt.month.map(abbrevs)