Get pixel value from CVPixelBufferRef in Swift

Question

How can I get the RGB (or any other format) pixel value from a CVPixelBufferRef? I have tried many approaches but have had no success yet.

    func captureOutput(captureOutput: AVCaptureOutput!,
                       didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                       fromConnection connection: AVCaptureConnection!) {
        let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
        CVPixelBufferLockBaseAddress(pixelBuffer, 0)
        let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)

        // Get individual pixel values here

        CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
    }

Codo · Accepted Answer

baseAddress is an unsafe mutable pointer, or more precisely an UnsafeMutablePointer<Void>. Once you have converted the pointer from Void to a more specific type, you can easily access the memory:

    // Convert the base address to a typed pointer of the appropriate type
    // (the pointer is still unsafe; no bounds checking is performed)
    let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

    // read the data (returns value of type UInt8)
    let firstByte = byteBuffer[0]

    // write data
    byteBuffer[3] = 90

Make sure you use the correct type (8-, 16- or 32-bit unsigned integer). It depends on the video format. Most likely it is 8 bit.

Update on buffer formats:

You can specify the format when you initialize the AVCaptureVideoDataOutput instance. You basically have these choices:

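BGRA: a single plane in which each pixel is stored as one 32-bit integer holding the blue, green, red and alpha values (8 bits each)
420YpCbCr8BiPlanarFullRange: two planes. The first plane contains one byte per pixel with the Y (luma) value; the second contains the Cb and Cr (chroma) values, shared by groups of pixels.
420YpCbCr8BiPlanarVideoRange: the same as 420YpCbCr8BiPlanarFullRange, but the Y values are restricted to the range 16–235 (for historical reasons)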

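If you are interested in the color values and speed (or rather the maximum frame rate) is not an issue, go for the simpler BGRA format. Otherwise use one of the more efficient native video formats.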
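As a rough sketch of how the format choice can be expressed, the snippet below requests BGRA through the `videoSettings` dictionary of an `AVCaptureVideoDataOutput`. The function name `makeBGRAVideoOutput` is made up for this illustration, and the code assumes an iOS or macOS target; swap in `kCVPixelFormatType_420YpCbCr8BiPlanarFullRange` or `kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange` for the bi-planar formats.

```swift
#if os(iOS) || os(macOS)
import AVFoundation

// Hypothetical helper (not from the answer): builds a video data output
// that asks Core Video to deliver frames as 32-bit BGRA pixel buffers.
func makeBGRAVideoOutput() -> AVCaptureVideoDataOutput {
    let output = AVCaptureVideoDataOutput()
    output.videoSettings = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ]
    return output
}
#endif
```

The output still has to be added to a capture session and given a sample buffer delegate, as in the question.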

If you have two planes, you must get the base address of the desired plane (see the video format example).

Video format example

    let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
    CVPixelBufferLockBaseAddress(pixelBuffer, 0)
    let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
    let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
    let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

    // Get luma value for pixel (43, 17)
    let luma = byteBuffer[17 * bytesPerRow + 43]

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
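The example above reads only the luma plane. For the second plane, a minimal sketch of the index arithmetic might look like the following. It assumes the bi-planar 4:2:0 layout, where each Cb/Cr pair is stored as two interleaved bytes and covers a 2x2 block of pixels. The function and parameter names are hypothetical; with Core Video the pointer and stride would come from `CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1)` and `CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1)`.

```swift
/// Hypothetical helper: returns the (Cb, Cr) pair that covers pixel (x, y)
/// in the chroma plane of a bi-planar 4:2:0 buffer.
func chromaAt(x: Int, y: Int,
              chromaPlane: UnsafePointer<UInt8>,
              chromaBytesPerRow: Int) -> (cb: UInt8, cr: UInt8) {
    // One chroma row serves two pixel rows; one Cb/Cr pair (2 bytes)
    // serves two pixel columns.
    let index = (y / 2) * chromaBytesPerRow + (x / 2) * 2
    return (chromaPlane[index], chromaPlane[index + 1])
}
```

As with the luma plane, the buffer must stay locked while the pointer is in use.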

BGRA example

    let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
    CVPixelBufferLockBaseAddress(pixelBuffer, 0)
    let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
    // Row stride in UInt32 elements: bytes per row divided by 4 bytes per pixel
    let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4
    let int32Buffer = UnsafeMutablePointer<UInt32>(baseAddress)

    // Get BGRA value for pixel (43, 17)
    let bgra = int32Buffer[17 * int32PerRow + 43]

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
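The 32-bit value read in the BGRA example still packs all four channels. One way to split it, sketched here under the assumption of a little-endian CPU (so the bytes B, G, R, A in memory load as 0xAARRGGBB), is shown below. The name `unpackBGRA` is invented for this illustration.

```swift
/// Hypothetical helper: splits one 32-bit pixel read from a BGRA buffer
/// into its channels. Assumes little-endian byte order.
func unpackBGRA(_ pixel: UInt32) -> (r: UInt8, g: UInt8, b: UInt8, a: UInt8) {
    let b = UInt8(pixel & 0xFF)          // lowest byte: first in memory
    let g = UInt8((pixel >> 8) & 0xFF)
    let r = UInt8((pixel >> 16) & 0xFF)
    let a = UInt8((pixel >> 24) & 0xFF)  // highest byte: last in memory
    return (r, g, b, a)
}
```

Reading the buffer as UInt8 and indexing the channels directly avoids the byte-order assumption.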

swift taylor · Answer

Update for Swift 3:

    let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
    let int32Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<UInt32>.self)
    // Row stride in UInt32 elements: bytes per row divided by 4 bytes per pixel
    let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4

    // Get BGRA value for pixel (43, 17)
    let bgra = int32Buffer[17 * int32PerRow + 43]

    CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

Josh Bernfeld · Answer

In addition to Codo's answer, here is how to get the individual RGB values from a BGRA pixel buffer. Note: the buffer must be locked before calling this.

    func pixelFrom(x: Int, y: Int, movieFrame: CVPixelBuffer) -> (UInt8, UInt8, UInt8) {
        let baseAddress = CVPixelBufferGetBaseAddress(movieFrame)
        let bytesPerRow = CVPixelBufferGetBytesPerRow(movieFrame)
        let buffer = baseAddress!.assumingMemoryBound(to: UInt8.self)

        let index = x*4 + y*bytesPerRow
        let b = buffer[index]
        let g = buffer[index+1]
        let r = buffer[index+2]

        return (r, g, b)
    }
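Since the function expects a locked buffer, one option is to wrap the lock and unlock calls in a small helper so the unlock cannot be forgotten. This is only a sketch; `withLockedBaseAddress` is a made-up name, and it assumes read-only access to a non-planar buffer.

```swift
#if canImport(CoreVideo)
import CoreVideo

/// Hypothetical helper: locks `pixelBuffer` for reading, passes the base
/// address and bytes-per-row to `body`, and always unlocks afterwards.
/// Returns nil if Core Video reports no base address.
func withLockedBaseAddress<T>(of pixelBuffer: CVPixelBuffer,
                              _ body: (UnsafeRawPointer, Int) -> T) -> T? {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    // defer runs on every exit path, including the early return below.
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    return body(UnsafeRawPointer(base), CVPixelBufferGetBytesPerRow(pixelBuffer))
}
#endif
```

A call such as `withLockedBaseAddress(of: frame) { _, _ in pixelFrom(x: 43, y: 17, movieFrame: frame) }` would then read one pixel while the lock is held.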

Rumo · Answer

Swift 5

I had the same problem and ended up with the following solution. My CVPixelBuffer had dimensions 68 x 68, which can be checked with:

    CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
    print(CVPixelBufferGetWidth(pixelBuffer))
    print(CVPixelBufferGetHeight(pixelBuffer))

You also need to know the number of bytes per row:

    print(CVPixelBufferGetBytesPerRow(pixelBuffer))

In my case this was 320.

Furthermore, you need to know the data type of your pixel buffer, which was Float32 in my case.

I then created a byte buffer and read the values consecutively as follows (remember to lock the base address as shown above):

    var byteBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<Float32>.self)
    var pixelArray: Array<Array<Float>> = Array(repeating: Array(repeating: 0, count: 68), count: 68)
    for row in 0...67 {
        for col in 0...67 {
            pixelArray[row][col] = byteBuffer.pointee
            byteBuffer = byteBuffer.successor()
        }
        // Skip the 12 padding floats (48 bytes) at the end of each row
        byteBuffer = byteBuffer.advanced(by: 12)
    }
    CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

You might wonder about the line byteBuffer = byteBuffer.advanced(by: 12). The reason it is needed is as follows.

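We know there are 320 bytes per row. However, the buffer is 68 values wide and the data type is Float32, i.e. 4 bytes per value. That means each row effectively holds only 68 * 4 = 272 bytes of data, followed by zero padding. The padding is probably there for memory layout (alignment) reasons.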

We therefore have to skip the last 320 - 272 = 48 bytes of each row, which is what byteBuffer = byteBuffer.advanced(by: 12) does (12 * 4 = 48).

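This approach differs somewhat from the other solutions because it steps a pointer through the buffer rather than computing an index for each pixel. However, I find it easier and more intuitive.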
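If the dimensions are not known in advance, the skip can be derived from the bytes-per-row value instead of being hard-coded as 12. The following is a minimal sketch of that idea, not the code from this answer; `readFloatRows` and its parameters are hypothetical, and the values would come from `CVPixelBufferGetBaseAddress`, `CVPixelBufferGetWidth`, `CVPixelBufferGetHeight` and `CVPixelBufferGetBytesPerRow` while the buffer is locked.

```swift
/// Hypothetical helper: copies a row-padded Float32 image into a 2D array.
/// Each row starts `bytesPerRow` bytes after the previous one, so any
/// padding at the end of a row is skipped without a hard-coded count.
func readFloatRows(base: UnsafeRawPointer,
                   width: Int, height: Int,
                   bytesPerRow: Int) -> [[Float]] {
    var rows: [[Float]] = []
    rows.reserveCapacity(height)
    for row in 0..<height {
        let rowStart = base.advanced(by: row * bytesPerRow)
            .assumingMemoryBound(to: Float.self)
        // Copy only the `width` real values; the padding is never read.
        rows.append(Array(UnsafeBufferPointer(start: rowStart, count: width)))
    }
    return rows
}
```

With the numbers from this answer (width 68, height 68, 320 bytes per row), each row copy reads 272 bytes and the remaining 48 are ignored.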