How to perform max/mean pooling on a 2D array using numpy

Question

Given a 2D (M x N) matrix and a 2D kernel (K x L), how do I return the matrix that results from max or mean pooling over the image with the given kernel?

I'd like to use numpy if possible.

Note: M, N, K, and L can each be even or odd, and they need not divide each other evenly, e.g. a 7x5 matrix with a 2x2 kernel.

Example of max pooling:

    matrix:
    array([[  20,  200,   -5,   23],
           [ -13,  134,  119,  100],
           [ 120,   32,   49,   25],
           [-120,   12,    9,   23]])

    kernel: 2 x 2

    soln:
    array([[200, 119],
           [120,  49]])

mdh · Answer

You can use scikit-image's block_reduce:

    import numpy as np
    import skimage.measure

    a = np.array([
        [  20,  200,   -5,   23],
        [ -13,  134,  119,  100],
        [ 120,   32,   49,   25],
        [-120,   12,    9,   23]
    ])
    skimage.measure.block_reduce(a, (2, 2), np.max)

This gives:

    array([[200, 119],
           [120,  49]])

Elliot · Answer

If the image size is evenly divisible by the kernel size, you can reshape the array and apply max or mean as appropriate.

    import numpy as np

    mat = np.array([[  20,  200,   -5,   23],
                    [ -13,  134,  119,  100],
                    [ 120,   32,   49,   25],
                    [-120,   12,    9,   23]])

    M, N = mat.shape
    K = 2
    L = 2

    MK = M // K
    NL = N // L
    print(mat[:MK*K, :NL*L].reshape(MK, K, NL, L).max(axis=(1, 3)))
    # [[200, 119], [120, 49]]
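As a small illustration that is not part of the original answer, the same reshape trick can be wrapped in a helper that accepts either reducer. The name `pool_reshape` and its `reducer` parameter are hypothetical. This sketch assumes a 2D input and silently drops trailing rows and columns that do not fill a whole block.

```python
import numpy as np

def pool_reshape(mat, K, L, reducer=np.max):
    """Non-overlapping K x L pooling; incomplete edge blocks are dropped."""
    M, N = mat.shape
    MK, NL = M // K, N // L
    # Axis 1 indexes rows inside a block, axis 3 indexes columns inside a block.
    blocks = mat[:MK * K, :NL * L].reshape(MK, K, NL, L)
    return reducer(blocks, axis=(1, 3))

mat = np.array([[  20,  200,   -5,   23],
                [ -13,  134,  119,  100],
                [ 120,   32,   49,   25],
                [-120,   12,    9,   23]])

max_pooled = pool_reshape(mat, 2, 2, np.max)
mean_pooled = pool_reshape(mat, 2, 2, np.mean)
print(max_pooled)   # [[200 119] [120  49]]
print(mean_pooled)  # [[85.25 59.25] [11.   26.5 ]]
```

Passing the reducer as an argument keeps the reshape logic in one place. Any function that accepts a tuple `axis` argument, such as `np.min` or `np.median`, should work the same way.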

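If the image size is not an exact multiple of the kernel size, you have to handle the boundaries separately. (As pointed out in the comments, this copies the matrix, which affects performance.)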

    import numpy as np

    mat = np.array([[  20,  200,   -5,   23,    7],
                    [ -13,  134,  119,  100,    8],
                    [ 120,   32,   49,   25,   12],
                    [-120,   12,    9,   23,   15],
                    [ -57,   84,   19,   17,   82],
                    ])
    # soln
    # [200, 119, 8]
    # [120, 49, 15]
    # [84, 19, 82]
    M, N = mat.shape
    K = 2
    L = 2

    MK = M // K
    NL = N // L

    # split the matrix into 'quadrants'
    Q1 = mat[:MK * K, :NL * L].reshape(MK, K, NL, L).max(axis=(1, 3))
    Q2 = mat[MK * K:, :NL * L].reshape(-1, NL, L).max(axis=2)
    Q3 = mat[:MK * K, NL * L:].reshape(MK, K, -1).max(axis=1)
    Q4 = mat[MK * K:, NL * L:].max()

    # compose the individual quadrants into one new matrix
    soln = np.vstack([np.c_[Q1, Q3], np.c_[Q2, Q4]])
    print(soln)
    # [[200 119   8]
    #  [120  49  15]
    #  [ 84  19  82]]
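A different way to handle ragged edges, offered here as an alternative sketch rather than as the answer's method, is NumPy's `ufunc.reduceat`. It reduces over the slices between consecutive start indices, so the last block is simply shorter. The helper name `pool_reduceat` is hypothetical, and the code assumes a 2D input.

```python
import numpy as np

def pool_reduceat(mat, K, L, method="max"):
    """Non-overlapping pooling that keeps partial blocks at the bottom/right edges."""
    M, N = mat.shape
    row_starts = np.arange(0, M, K)  # first row of each block
    col_starts = np.arange(0, N, L)  # first column of each block
    if method == "max":
        out = np.maximum.reduceat(mat, row_starts, axis=0)
        return np.maximum.reduceat(out, col_starts, axis=1)
    # Mean: sum each block, then divide by the number of elements it really holds.
    sums = np.add.reduceat(mat.astype(float), row_starts, axis=0)
    sums = np.add.reduceat(sums, col_starts, axis=1)
    rows_per_block = np.diff(np.append(row_starts, M))
    cols_per_block = np.diff(np.append(col_starts, N))
    return sums / np.outer(rows_per_block, cols_per_block)

mat = np.array([[  20,  200,   -5,   23,    7],
                [ -13,  134,  119,  100,    8],
                [ 120,   32,   49,   25,   12],
                [-120,   12,    9,   23,   15],
                [ -57,   84,   19,   17,   82]])

max_pooled = pool_reduceat(mat, 2, 2, "max")
mean_pooled = pool_reduceat(mat, 2, 2, "mean")
print(max_pooled)  # [[200 119   8] [120  49  15] [ 84  19  82]]
```

This reproduces the quadrant result above for max pooling, and the mean variant averages each edge block over only the elements that exist in it.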

Jason · Answer

Instead of building "quadrants" as in Elliot's answer, you can pad the array so that it divides evenly and then perform max or mean pooling.

Because pooling is often used in CNNs, the input array is usually 3D. So I wrote a function that works on either 2D or 3D arrays.

    import numpy

    def pooling(mat, ksize, method='max', pad=False):
        '''Non-overlapping pooling on 2D or 3D data.

        <mat>: ndarray, input array to pool.
        <ksize>: tuple of 2, kernel size in (ky, kx).
        <method>: str, 'max' for max-pooling, 'mean' for mean-pooling.
        <pad>: bool, pad <mat> or not. If no pad, output has size
               n//f, n being <mat> size, f being kernel size.
               If pad, output has size ceil(n/f).

        Return <result>: pooled matrix.
        '''
        m, n = mat.shape[:2]
        ky, kx = ksize

        _ceil = lambda x, y: int(numpy.ceil(x / float(y)))

        if pad:
            # Pad with NaN so the padded cells are ignored by nanmax/nanmean.
            ny = _ceil(m, ky)
            nx = _ceil(n, kx)
            size = (ny * ky, nx * kx) + mat.shape[2:]
            mat_pad = numpy.full(size, numpy.nan)
            mat_pad[:m, :n, ...] = mat
        else:
            # Crop rows/columns that do not fill a whole kernel.
            ny = m // ky
            nx = n // kx
            mat_pad = mat[:ny * ky, :nx * kx, ...]

        new_shape = (ny, ky, nx, kx) + mat.shape[2:]

        if method == 'max':
            result = numpy.nanmax(mat_pad.reshape(new_shape), axis=(1, 3))
        else:
            result = numpy.nanmean(mat_pad.reshape(new_shape), axis=(1, 3))

        return result
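To make the padding idea concrete on the 7x5 case from the question, here is a compact 2D-only sketch that uses `numpy.pad` instead of `numpy.full`. The helper name `pool_padded` is hypothetical. Because NaN is a floating-point value, the input is cast to float, so the result is a float array even for integer input.

```python
import numpy as np

def pool_padded(mat, ky, kx, method="max"):
    """Pad the bottom/right edges with NaN, then pool with NaN-aware reducers."""
    m, n = mat.shape
    pad_y = (-m) % ky  # rows needed to reach the next multiple of ky
    pad_x = (-n) % kx  # columns needed to reach the next multiple of kx
    padded = np.pad(mat.astype(float), ((0, pad_y), (0, pad_x)),
                    mode="constant", constant_values=np.nan)
    blocks = padded.reshape((m + pad_y) // ky, ky, (n + pad_x) // kx, kx)
    reducer = np.nanmax if method == "max" else np.nanmean
    return reducer(blocks, axis=(1, 3))

mat = np.arange(35).reshape(7, 5)  # 7x5 input, not divisible by the 2x2 kernel
max_pooled = pool_padded(mat, 2, 2, "max")
mean_pooled = pool_padded(mat, 2, 2, "mean")
print(max_pooled.shape)  # (4, 3), i.e. ceil(7/2) x ceil(5/2)
```

Each edge block contains at least one real value, so `nanmax` and `nanmean` never see an all-NaN slice here.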

Sometimes you may want overlapping pooling, with a stride that is not equal to the kernel size. Here is a function that does that, with or without padding.

    import numpy

    def asStride(arr, sub_shape, stride):
        '''Get a strided sub-matrices view of an ndarray.

        See also skimage.util.shape.view_as_windows()
        '''
        s0, s1 = arr.strides[:2]
        m1, n1 = arr.shape[:2]
        m2, n2 = sub_shape
        view_shape = (1 + (m1 - m2) // stride[0],
                      1 + (n1 - n2) // stride[1], m2, n2) + arr.shape[2:]
        strides = (stride[0] * s0, stride[1] * s1, s0, s1) + arr.strides[2:]
        subs = numpy.lib.stride_tricks.as_strided(arr, view_shape, strides=strides)
        return subs

    def poolingOverlap(mat, ksize, stride=None, method='max', pad=False):
        '''Overlapping pooling on 2D or 3D data.

        <mat>: ndarray, input array to pool.
        <ksize>: tuple of 2, kernel size in (ky, kx).
        <stride>: tuple of 2 or None, stride of pooling window.
                  If None, same as <ksize> (non-overlapping pooling).
        <method>: str, 'max' for max-pooling, 'mean' for mean-pooling.
        <pad>: bool, pad <mat> or not. If no pad, output has size
               (n-f)//s+1, n being <mat> size, f being kernel size, s stride.
               If pad, output has size ceil(n/s).

        Return <result>: pooled matrix.
        '''
        m, n = mat.shape[:2]
        ky, kx = ksize
        if stride is None:
            stride = (ky, kx)
        sy, sx = stride

        _ceil = lambda x, y: int(numpy.ceil(x / float(y)))

        if pad:
            # Pad with NaN so the padded cells are ignored by nanmax/nanmean.
            ny = _ceil(m, sy)
            nx = _ceil(n, sx)
            size = ((ny - 1) * sy + ky, (nx - 1) * sx + kx) + mat.shape[2:]
            mat_pad = numpy.full(size, numpy.nan)
            mat_pad[:m, :n, ...] = mat
        else:
            # Crop so that the last window ends exactly at the array edge.
            mat_pad = mat[:(m - ky) // sy * sy + ky, :(n - kx) // sx * sx + kx, ...]

        view = asStride(mat_pad, ksize, stride)

        if method == 'max':
            result = numpy.nanmax(view, axis=(2, 3))
        else:
            result = numpy.nanmean(view, axis=(2, 3))

        return result
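For the unpadded 2D case, NumPy 1.20 and later provide `numpy.lib.stride_tricks.sliding_window_view`, which builds the window view without hand-computed strides. The sketch below swaps that function in for `asStride`. The helper name `pool_overlap` is hypothetical, and the sketch assumes a 2D input with no padding.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pool_overlap(mat, ksize, stride=(1, 1), method="max"):
    """Overlapping pooling on a 2D array without padding."""
    ky, kx = ksize
    sy, sx = stride
    # All windows at stride 1: shape (m-ky+1, n-kx+1, ky, kx).
    # Slicing then keeps every sy-th / sx-th window start.
    windows = sliding_window_view(mat, (ky, kx))[::sy, ::sx]
    reducer = np.max if method == "max" else np.mean
    return reducer(windows, axis=(2, 3))

mat = np.array([[  20,  200,   -5,   23],
                [ -13,  134,  119,  100],
                [ 120,   32,   49,   25],
                [-120,   12,    9,   23]])

overlapped = pool_overlap(mat, (2, 2), stride=(1, 1))
non_overlapped = pool_overlap(mat, (2, 2), stride=(2, 2))
print(overlapped)      # [[200 200 119] [134 134 119] [120  49  49]]
print(non_overlapped)  # [[200 119] [120  49]]
```

The output shape is `((m - ky) // sy + 1, (n - kx) // sx + 1)`, matching the no-pad formula in the docstring above. The view is read-only, which avoids the accidental writes that raw `as_strided` permits.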