How do I iterate over a Python sequence in multiples of n?

Question

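How do I process the elements of a sequence in batches, idiomatically?

For example, with the sequence "abcdef" and a batch size of 2, I would like to do something like the following:

    for x, y in "abcdef":
        print("%s%s\n" % (x, y))

    ab
    cd
    ef

Of course, this doesn't work, because it expects a single element from the list that itself contains 2 elements.

What is a nice, short, clean, Pythonic way to process the next n elements of a list in a batch, or sub-strings of length n from a larger string (two similar problems)?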

Paolo Bergantino · Accepted Answer

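I am sure someone is going to come up with something more "Pythonic", but how about:

    for y in range(0, len(x), 2):
        print("%s%s" % (x[y], x[y+1]))

Note that this only works if you know that len(x) % 2 == 0. Otherwise x[y+1] raises an IndexError on the last iteration.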

rpr · Answer

A generator function would be neat:

    def batch_gen(data, batch_size):
        for i in range(0, len(data), batch_size):
            yield data[i:i+batch_size]

Example use:

    a = "abcdef"
    for i in batch_gen(a, 2):
        print(i)

prints:

    ab
    cd
    ef
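As a quick check of how the slicing approach treats a length that is not a multiple of the batch size, here is a minimal Python 3 sketch. It repeats batch_gen from the answer; the inputs are only illustrative.

```python
def batch_gen(data, batch_size):
    """Yield consecutive slices of data, each at most batch_size long."""
    for i in range(0, len(data), batch_size):
        yield data[i:i + batch_size]


# Slicing past the end of a sequence is safe, so the final batch is just shorter.
print(list(batch_gen("abcdefg", 3)))        # ['abc', 'def', 'g']
print(list(batch_gen([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Because it relies on len() and slicing, this works for strings, lists and tuples, but not for arbitrary iterators.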

Silverfish · Answer

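I've got an alternative approach that works for iterables that don't have a known length.

    # Python 2 (uses xrange and it.next())
    def groupsgen(seq, size):
        it = iter(seq)
        while True:
            values = ()
            for n in xrange(size):
                values += (it.next(),)
            yield values

It works by iterating over the sequence (or other iterator) in groups of size, collecting the values in a tuple. At the end of each group, it yields the tuple.

When the iterator runs out of values, it raises a StopIteration exception, which then propagates up, indicating that groupsgen is out of values.

It assumes that the values come in sets of size (sets of 2, 3, etc.). If not, any values left over are simply discarded.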
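On Python 3.7 and later, a StopIteration that escapes from inside a generator body is converted to a RuntimeError (PEP 479), so the propagation described above no longer ends the generator cleanly. The following is a sketch of the same idea built on itertools.islice, not the answer's original code; the name groups_of is made up for illustration. It keeps the behaviour of discarding an incomplete final group.

```python
from itertools import islice


def groups_of(iterable, size):
    """Yield tuples of exactly `size` items; leftover items are discarded."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        values = tuple(islice(it, size))
        if len(values) < size:
            return  # input exhausted (or only a partial group is left)
        yield values


# Works on a plain iterator, which has no len() and cannot be sliced.
print(list(groups_of(iter("abcdefg"), 3)))  # [('a', 'b', 'c'), ('d', 'e', 'f')]
```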

Jason Coon · Answer

Don't forget about the zip() function:

    a = 'abcdef'
    for x, y in zip(a[::2], a[1::2]):
        print('%s%s' % (x, y))
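One detail worth seeing in action: zip() stops at the shorter of its inputs, so with an odd-length sequence the unpaired last item is silently dropped. A small sketch, where the helper name pairs is hypothetical:

```python
def pairs(seq):
    """Pair even-indexed items with odd-indexed ones; an unpaired last item is dropped."""
    return zip(seq[::2], seq[1::2])


print(["%s%s" % (x, y) for x, y in pairs("abcdef")])  # ['ab', 'cd', 'ef']
print(["%s%s" % (x, y) for x, y in pairs("abcde")])   # ['ab', 'cd']
```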

SilentGhost · Answer

But the more general way would be (inspired by this answer):

    for i in zip(*(seq[i::size] for i in range(size))):
        print(i)  # Tuple of individual values
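To make the snippet above concrete, here is a sketch that wraps it in a function with sample inputs. The name zip_chunks and the inputs are assumptions for illustration. As with any zip()-based approach, a trailing partial group is dropped.

```python
def zip_chunks(seq, size):
    """Group a sliceable sequence into tuples of `size` items using strided slices."""
    # seq[0::size], seq[1::size], ... are the 1st, 2nd, ... members of every group.
    return list(zip(*(seq[i::size] for i in range(size))))


print(zip_chunks("abcdefg", 3))                       # [('a', 'b', 'c'), ('d', 'e', 'f')]
print(["".join(t) for t in zip_chunks("abcdef", 2)])  # ['ab', 'cd', 'ef']
```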

tzot · Answer

And then there's always the documentation.

    from itertools import chain, izip, repeat, tee  # Python 2

    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        try:
            b.next()
        except StopIteration:
            pass
        return izip(a, b)

    def grouper(n, iterable, padvalue=None):
        "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
        return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

Note: these produce tuples instead of substrings when given a string sequence as input.
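For Python 3, the same two recipes can be sketched with the built-in zip() and next(). This is an adaptation, not the text of the current documentation. The last lines show how to join the tuples back into substrings.

```python
from itertools import chain, repeat, tee


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one; the default avoids StopIteration
    return zip(a, b)


def grouper(n, iterable, padvalue=None):
    """grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"""
    # The same chain object appears n times, so zip pulls n consecutive items per tuple.
    return zip(*[chain(iterable, repeat(padvalue, n - 1))] * n)


print(list(pairwise("abcd")))                            # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(["".join(g) for g in grouper(3, "abcdefg", "x")])  # ['abc', 'def', 'gxx']
```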

dbr · Answer

    >>> a = "abcdef"
    >>> size = 2
    >>> [a[x:x+size] for x in range(0, len(a), size)]
    ['ab', 'cd', 'ef']

..or, not as a list comprehension:

    a = "abcdef"
    size = 2
    output = []
    for x in range(0, len(a), size):
        output.append(a[x:x+size])

Or, as a generator, which would be best if used multiple times (for a one-off, the list comprehension is probably "best"):

    def chunker(thelist, segsize):
        for x in range(0, len(thelist), segsize):
            yield thelist[x:x+segsize]

..and its usage:

    >>> for seg in chunker(a, 2):
    ...     print(seg)
    ...
    ab
    cd
    ef

SilentGhost · Answer

You can create the following generator

    def chunks(seq, size):
        a = range(0, len(seq), size)
        b = range(size, len(seq) + 1, size)
        for i, j in zip(a, b):
            yield seq[i:j]

and use it like this:

    for i in chunks('abcdef', 2):
        print(i)
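It may help to see what this generator does when the length is not a multiple of size. The sketch below repeats chunks with renamed local variables (starts and stops are my names, not the answer's) and runs it on an odd-length input. The stop range ends at len(seq), so a trailing partial chunk is never yielded.

```python
def chunks(seq, size):
    """Yield full-size slices of seq; a trailing partial slice is skipped."""
    starts = range(0, len(seq), size)
    stops = range(size, len(seq) + 1, size)
    # zip() ends with the shorter range, which is always `stops`.
    for i, j in zip(starts, stops):
        yield seq[i:j]


print(list(chunks("abcdef", 2)))   # ['ab', 'cd', 'ef']
print(list(chunks("abcdefg", 2)))  # ['ab', 'cd', 'ef']  (the final 'g' is dropped)
```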

Gregor Melhorn · Answer

From the docs of more_itertools: more_itertools.chunked()

    more_itertools.chunked(iterable, n)

Break an iterable into lists of a given length:

    >>> list(chunked([1, 2, 3, 4, 5, 6, 7], 3))
    [[1, 2, 3], [4, 5, 6], [7]]

If the length of iterable is not evenly divisible by n, the last returned list will be shorter.
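If installing a third-party package is not an option, the documented behaviour can be approximated with the standard library. This is only a sketch of the behaviour described above, not more_itertools' actual implementation, and the name chunked_sketch is made up.

```python
from itertools import islice


def chunked_sketch(iterable, n):
    """Yield lists of up to n items; the last list may be shorter."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return  # nothing left to read
        yield chunk


print(list(chunked_sketch([1, 2, 3, 4, 5, 6, 7], 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
```

Python 3.12 also added itertools.batched, which covers the same need but yields tuples rather than lists.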

Herbert · Answer

Except for two answers, I saw a lot of premature materialization of the batches, and subscripting (which does not work for all iterators). Hence I came up with this alternative:

    def iter_x_and_n(iterable, x, n):
        yield x
        try:
            for _ in range(n):
                yield next(iterable)
        except StopIteration:
            pass

    def batched(iterable, n):
        if n < 1:
            raise ValueError("Can not create batches of size %d, number must be strictly positive" % n)
        iterable = iter(iterable)
        try:
            for x in iterable:
                yield iter_x_and_n(iterable, x, n-1)
        except StopIteration:
            pass

It beats me that there is no one-liner or few-liner solution for this (to the best of my knowledge). The key issue is that both the outer generator and the inner generator need to handle StopIteration correctly. The outer generator should only yield something if there is at least one element left. The intuitive way to check this is to call next(...) and catch the StopIteration.
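One possible shorter formulation of the same lazy idea combines itertools.chain and itertools.islice. This is a sketch under the same constraint as the code above (each batch must be consumed before the next one is requested), and lazy_batches is a name chosen here for illustration.

```python
from itertools import chain, islice


def lazy_batches(iterable, n):
    """Yield lazy iterators of up to n items each, without building lists."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    # The for loop pulls the first item of each batch, so an exhausted input
    # simply ends the loop and no empty batch is ever yielded.
    for first in it:
        yield chain([first], islice(it, n - 1))


print([list(b) for b in lazy_batches(iter(range(7)), 3)])  # [[0, 1, 2], [3, 4, 5], [6]]
```

The batches share the underlying iterator, so skipping ahead without exhausting a batch shifts the items that later batches see.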

Jacob Engelbrecht · Answer

    s = 'abcdefgh'
    for e in (s[i:i+2] for i in range(0, len(s), 2)):
        print(e)

dano · Answer

The itertools docs have a recipe for this:

    from itertools import izip_longest  # Python 2; named zip_longest in Python 3

    def grouper(iterable, n, fillvalue=None):
        "Collect data into fixed-length chunks or blocks"
        # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
        args = [iter(iterable)] * n
        return izip_longest(fillvalue=fillvalue, *args)

Usage:

    >>> l = [1,2,3,4,5,6,7,8,9]
    >>> [z for z in grouper(l, 3)]
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
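The usage above happens to divide evenly. Here is a Python 3 sketch of the same recipe (izip_longest became zip_longest) on inputs that need padding, to show where fillvalue ends up.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one with fillvalue."""
    # n references to one iterator: each output tuple takes n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


print(["".join(g) for g in grouper("ABCDEFG", 3, "x")])  # ['ABC', 'DEF', 'Gxx']
print(list(grouper([1, 2, 3, 4], 3)))                    # [(1, 2, 3), (4, None, None)]
```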

pylang · Answer

Given

    from __future__ import print_function  # python 2.x
    seq = "abcdef"
    n = 2

Code

    while seq:
        print("{}".format(seq[:n]), end="\n")
        seq = seq[n:]

Output

    ab
    cd
    ef

Jochen Wersdörfer · Answer

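How about itertools?

    from itertools import islice, groupby

    def chunks_islice(seq, size):
        it = iter(seq)  # islice must consume one shared iterator, or the loop never advances
        while True:
            aux = list(islice(it, size))
            if not aux:
                break
            yield "".join(aux)

    def chunks_groupby(seq, size):
        # Floor division keeps the group key an integer on Python 3 as well
        for k, chunk in groupby(enumerate(seq), lambda x: x[0] // size):
            yield "".join([i[1] for i in chunk])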

Karol Trojanowski · Answer

An adaptation of this answer (the groupsgen generator above) for Python 3:

    def groupsgen(seq, size):
        it = iter(seq)
        iterating = True
        while iterating:
            values = ()
            try:
                for n in range(size):
                    values += (next(it),)
            except StopIteration:
                iterating = False
                if not len(values):
                    return None
            yield values

It terminates safely and doesn't discard values when their number is not divisible by size.

Craig McQueen · Answer

Here is a solution that yields a series of iterators, each of which iterates over n items.

    def groupiter(thing, n):
        def countiter(nextthing, thingiter, n):
            yield nextthing
            for _ in range(n - 1):
                try:
                    nextitem = next(thingiter)
                except StopIteration:
                    return
                yield nextitem
        thingiter = iter(thing)
        while True:
            try:
                nextthing = next(thingiter)
            except StopIteration:
                return
            yield countiter(nextthing, thingiter, n)

I use it as follows:

    table = list(range(250))
    for group in groupiter(table, 16):
        print(' '.join('0x{:02X},'.format(x) for x in group))

Note that it can handle the length of the object not being a multiple of n.

Jason Coon · Answer

One solution, although I challenge someone to do better ;-)

    a = 'abcdef'
    b = [[a[i-1], a[i]] for i in range(1, len(a), 2)]

    for x, y in b:
        print("%s%s\n" % (x, y))