What is the fastest way to slice a scipy.sparse matrix?

Question

I normally use

matrix[:, i:]

but it does not seem to run as fast as I expected.

Saullo G. P. Castro · Accepted Answer

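If you want a sparse matrix as output, the fastest way to do row slicing is to use the csr type, and for column slicing the csc type, as detailed here. In both cases you just have to do what you are currently doing: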

matrix[l1:l2,c1:c2]
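As a minimal sketch of the point above, the following converts one matrix to both formats and slices each along its preferred axis. The test matrix and the slice bounds `l1`, `l2`, `c1`, `c2` are made up for illustration; only `csr_matrix`, `tocsc()` and `toarray()` come from SciPy.

```python
import numpy as np
from scipy import sparse

# Small illustrative matrix with roughly 5% non-zero entries (assumed data).
rng = np.random.default_rng(0)
dense = rng.random((200, 150))
dense[dense < 0.95] = 0.0

csr = sparse.csr_matrix(dense)  # row-oriented storage, suited to row slices
csc = csr.tocsc()               # column-oriented storage, suited to column slices

l1, l2, c1, c2 = 20, 120, 10, 90  # hypothetical slice bounds

rows = csr[l1:l2, :]       # row slice taken from the CSR copy
cols = csc[:, c1:c2]       # column slice taken from the CSC copy
block = csr[l1:l2, c1:c2]  # the same syntax works for a 2-D block

print(rows.shape, cols.shape, block.shape)
```

Both slices stay sparse, so the choice of format changes only the speed, not the result.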

If you want a different type as output, there may be faster ways. This other answer explains many methods for slicing a matrix and compares their timings. For example, if you want an ndarray as output, the fastest slicing is:

matrix.A[l1:l2,c1:c2]

or:

matrix.toarray()[l1:l2,c1:c2]

which is much faster than:

matrix[l1:l2,c1:c2].A #or .toarray()
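A rough way to check this on your own data is sketched below. The matrix size, density and slice bounds are assumptions chosen for illustration, and the timings depend on the matrix and the SciPy version, so no result is claimed here. Note that `toarray()` on the full matrix allocates rows × columns elements, which may not fit in memory for large matrices.

```python
import timeit

import numpy as np
from scipy import sparse

# Illustrative matrix (assumed size and density).
rng = np.random.default_rng(0)
dense = rng.random((500, 400))
dense[dense < 0.95] = 0.0
matrix = sparse.csr_matrix(dense)

l1, l2, c1, c2 = 10, 300, 20, 350  # hypothetical slice bounds

# Both orders produce the same ndarray.
dense_then_slice = matrix.toarray()[l1:l2, c1:c2]
slice_then_dense = matrix[l1:l2, c1:c2].toarray()
same = np.array_equal(dense_then_slice, slice_then_dense)

# Time each order; compare the two numbers on your own machine.
t1 = timeit.timeit(lambda: matrix.toarray()[l1:l2, c1:c2], number=20)
t2 = timeit.timeit(lambda: matrix[l1:l2, c1:c2].toarray(), number=20)
print(f"toarray then slice: {t1:.4f} s, slice then toarray: {t2:.4f} s")
```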

Sorig · Answer

I have found that the advertised fast row indexing of scipy.sparse.csr_matrix can be made much faster by rolling your own row indexer. Here is the idea:

import numpy as np
from scipy import sparse


class SparseRowIndexer:
    def __init__(self, csr_matrix):
        n_rows = csr_matrix.shape[0]
        # Object arrays hold one variable-length array per row.
        data = np.empty(n_rows, dtype=object)
        indices = np.empty(n_rows, dtype=object)
        indptr = np.empty(n_rows, dtype=np.int64)

        # Iterating over the rows this way is significantly more efficient
        # than csr_matrix[row_index, :] and csr_matrix.getrow(row_index)
        bounds = zip(csr_matrix.indptr[:-1], csr_matrix.indptr[1:])
        for i, (row_start, row_end) in enumerate(bounds):
            data[i] = csr_matrix.data[row_start:row_end]
            indices[i] = csr_matrix.indices[row_start:row_end]
            indptr[i] = row_end - row_start  # nnz of the row

        self.data = data
        self.indices = indices
        self.indptr = indptr
        self.n_columns = csr_matrix.shape[1]

    def __getitem__(self, row_selector):
        # row_selector is an index array, boolean mask, or slice of rows
        data = np.concatenate(self.data[row_selector])
        indices = np.concatenate(self.indices[row_selector])
        indptr = np.append(0, np.cumsum(self.indptr[row_selector]))
        shape = (indptr.shape[0] - 1, self.n_columns)
        return sparse.csr_matrix((data, indices, indptr), shape=shape)

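That is, you can take advantage of the fast indexing of numpy arrays by storing the non-zero values of each row in separate arrays (with a different length for each row) and putting all of those row arrays in an object-typed array, which allows each row to have a different size and can still be indexed efficiently. The column indices are stored the same way. This approach differs slightly from the standard CSR data structure, which stores all non-zero values in a single array and requires lookups to find where each row starts and ends. Those lookups can slow down random access, but should be efficient for retrieving contiguous rows.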
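To make the difference between the two layouts concrete, here is a small sketch on a 4×3 matrix. The matrix and the variable name `row_values` are made up for illustration; `data`, `indices` and `indptr` are the standard attributes of a SciPy CSR matrix.

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 0, 2],
                  [0, 0, 0],
                  [0, 3, 0],
                  [4, 5, 6]])
csr = sparse.csr_matrix(dense)

# Standard CSR: one flat array of values, plus row boundaries in indptr.
# Row i occupies data[indptr[i]:indptr[i + 1]].
print(csr.data)     # non-zero values, row by row
print(csr.indices)  # column index of each value
print(csr.indptr)   # start/end offsets of each row

# Per-row layout: an object array whose elements are the rows' value arrays.
row_values = np.empty(csr.shape[0], dtype=object)
for i in range(csr.shape[0]):
    start, end = csr.indptr[i], csr.indptr[i + 1]
    row_values[i] = csr.data[start:end]

# Selecting rows 3 and 0 is a single NumPy fancy-indexing step,
# followed by one concatenation of the selected row arrays.
picked = row_values[[3, 0]]
flat = np.concatenate(picked)
print(flat)
```

The column indices can be split the same way, and the new `indptr` for the selected rows is the cumulative sum of their lengths.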

Profiling results

My matrix mat is a 1,900,000x1,250,000 csr_matrix with 400,000,000 non-zero elements. ilocs is an array of 200,000 random row indices.

>>> %timeit mat[ilocs]
2.66 s ± 233 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

compared to:

>>> row_indexer = SparseRowIndexer(mat)
>>> %timeit row_indexer[ilocs]
59.9 ms ± 4.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

SparseRowIndexer seems to be faster when using fancy indexing than when using boolean masks.
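If your selection starts out as a boolean mask, one option is to convert it to integer indices first. The sketch below shows this on a plain NumPy object array standing in for the indexer's per-row storage; the array `rows` and its contents are hypothetical, and whether the conversion pays off for your data is something to measure.

```python
import numpy as np

# Hypothetical per-row storage, standing in for the indexer's object array.
rows = np.empty(4, dtype=object)
rows[0] = np.array([1.0, 2.0])
rows[1] = np.array([7.0])
rows[2] = np.array([3.0])
rows[3] = np.array([4.0, 5.0, 6.0])

mask = np.array([True, False, False, True])  # boolean row selection
ilocs = np.flatnonzero(mask)                 # equivalent integer indices

# Both forms select the same rows; only the indexing path differs.
by_mask = np.concatenate(rows[mask])
by_index = np.concatenate(rows[ilocs])
print(ilocs, by_index)
```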