How to use dplyr to generate a frequency table

Question

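I would like to create a table that holds the frequencies of several columns in my data frame. Part of the data frame is copied below.

The table should contain the frequency (both n and %) of "Red" in Color and "F" in Gender.

I think the dplyr package can do this, but I cannot figure out how.

Thank you.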

       RespondentID  Color  Gender
    1          1503    Red       F
    2          1653     NA       M
    3          1982    Red       F
    4          4862    Red      NA
    15         4880   Blue       M

JasonAizkalns · Answer

    library(dplyr)

    df %>%
      count(Color, Gender) %>%
      group_by(Color) %>%          # now required with changes to dplyr::count()
      mutate(prop = prop.table(n))

    # Source: local data frame [4 x 4]
    # Groups: Color [3]
    #
    #    Color Gender     n      prop
    #   (fctr) (fctr) (int)     (dbl)
    # 1   Blue      M     1 1.0000000
    # 2    Red      F     2 0.6666667
    # 3    Red     NA     1 0.3333333
    # 4     NA      M     1 1.0000000
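To see what the `group_by(Color)` step changes, here is a minimal sketch that computes the proportions both ways. The `df` below is retyped from the question's sample; storing the columns as character vectors rather than factors is an assumption, and the names `counts`, `overall`, and `within_color` are made up for this illustration.

```r
library(dplyr)

# Sample data retyped from the question (character columns are an assumption).
df <- data.frame(
  RespondentID = c(1503, 1653, 1982, 4862, 4880),
  Color  = c("Red", NA, "Red", "Red", "Blue"),
  Gender = c("F", "M", "F", NA, "M"),
  stringsAsFactors = FALSE
)

# One row per observed Color/Gender combination, with its count in `n`.
counts <- df %>% count(Color, Gender)

# Ungrouped: each n is divided by the total over all rows.
overall <- counts %>% mutate(prop = n / sum(n))

# Grouped: each n is divided by the total for its own Color,
# so the proportions sum to 1 within each Color (NA is its own group).
within_color <- counts %>%
  group_by(Color) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()
```

Here `n / sum(n)` should give the same values as `prop.table(n)` in the answer; which denominator you want depends on whether the percentage is meant to be "share of all respondents" or "share within each color".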

Update per comment: if you want to look at each variable separately, you will need to rearrange the data frame first. You can accomplish this with tidyr:

    library(tidyr)
    library(dplyr)

    gather(df, "var", "value", -RespondentID) %>%
      count(var, value) %>%
      group_by(var) %>%            # now required with changes to dplyr::count()
      mutate(prop = prop.table(n))

    # Source: local data frame [6 x 4]
    # Groups: var [2]
    #
    #      var value     n  prop
    #   (fctr) (chr) (int) (dbl)
    # 1  Color  Blue     1   0.2
    # 2  Color   Red     3   0.6
    # 3  Color    NA     1   0.2
    # 4 Gender     F     2   0.4
    # 5 Gender     M     2   0.4
    # 6 Gender    NA     1   0.2
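In current tidyr, `gather()` is superseded by `pivot_longer()`, so the same reshaping could be sketched as follows. This is not part of the original answer: the `df` is retyped from the question's sample with character columns (an assumption), and `freq`, `pct`, and `wanted` are illustrative names. The final `filter()` keeps only the two rows the question asks about.

```r
library(dplyr)
library(tidyr)

# Sample data retyped from the question (character columns are an assumption).
df <- data.frame(
  RespondentID = c(1503, 1653, 1982, 4862, 4880),
  Color  = c("Red", NA, "Red", "Red", "Blue"),
  Gender = c("F", "M", "F", NA, "M"),
  stringsAsFactors = FALSE
)

freq <- df %>%
  # Stack Color and Gender into name/value pairs, one row per respondent and variable.
  pivot_longer(cols = -RespondentID, names_to = "var", values_to = "value") %>%
  count(var, value) %>%
  group_by(var) %>%
  # Proportion and percentage within each variable.
  mutate(prop = n / sum(n), pct = 100 * prop) %>%
  ungroup()

# Keep only "Red" for Color and "F" for Gender; rows with NA values drop out.
wanted <- freq %>%
  filter((var == "Color" & value == "Red") | (var == "Gender" & value == "F"))
```

Note that the denominators here include the NA rows. If percentages among non-missing answers are wanted instead, one option is to drop the missing values before `count()`, for example with `filter(!is.na(value))`.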