Plot multiple boxplots in one graph

Question

I saved my data as a .csv file with 12 columns. Columns 2 through 12 (labeled F1, F2, ..., F11) are features. Column one contains the label of these features, either good or bad.

I would like to draw a boxplot of all 11 features against the label, separated by good or bad. My code so far is:

    qplot(Label, F1, data = testData, geom = "boxplot", fill = Label,
          binwidth = 0.5, main = "Test") +
      xlab("Label") +
      ylab("Features")

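However, this only shows F1 against the label.

My question is: how can I show F2, F3, ..., F11 against the label in one graph, with some dodge position? I have normalized the features so that they are on the same scale, within the range [0, 1].

The test data can be found here. I have drawn something by hand to explain the problem (see below).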

hand-drawn boxplot example

Arun · Accepted Answer

You need to get the data into a specific format by melting it before you plot (see below for what the melted data looks like). Otherwise, what you have done seems fine.

    require(reshape2)

    df <- read.csv("TestData.csv", header = TRUE)

    # Melt by "Label". melt() is from the reshape2 package.
    # See ?melt for what else it can do (you will surely need it).
    df.m <- melt(df, id.var = "Label")

    # Some rows of the melted data.frame:
    # > df.m
    #    Label variable      value
    # 1   Good       F1 0.64778924
    # 2   Good       F1 0.54608791
    # 3   Good       F1 0.46134200
    # 4   Good       F1 0.79421221
    # 5   Good       F1 0.56919951
    # 6   Good       F1 0.73568570
    # 7   Good       F1 0.65094207
    # 8   Good       F1 0.45749702
    # 9   Good       F1 0.80861929
    # 10  Good       F1 0.67310067
    # 11  Good       F1 0.68781739
    # 12  Good       F1 0.47009455
    # 13  Good       F1 0.95859182
    # 14  Good       F1 1.00000000
    # 15  Good       F1 0.46908343
    # 16   Bad       F1 0.57875528
    # 17   Bad       F1 0.28938046
    # 18   Bad       F1 0.68511766

    require(ggplot2)

    ggplot(data = df.m, aes(x = variable, y = value)) +
      geom_boxplot(aes(fill = Label))

boxplot_ggplot2

Edit: I realize that you might need faceting. Here is an implementation of that as well:

    p <- ggplot(data = df.m, aes(x = variable, y = value)) +
      geom_boxplot(aes(fill = Label))
    p + facet_wrap(~ variable, scales = "free")

ggplot2_faceted

Edit 2: How do I add x-labels, y-labels, and a title, change the legend heading, and add jitter?

    p <- ggplot(data = df.m, aes(x = variable, y = value))
    p <- p + geom_boxplot(aes(fill = Label))
    p <- p + geom_jitter()
    p <- p + facet_wrap(~ variable, scales = "free")
    p <- p + xlab("x-axis") + ylab("y-axis") + ggtitle("Title")
    p <- p + guides(fill = guide_legend(title = "Legend_Title"))
    p

ggplot2_geom_plot

Edit 3: How do I align the geom_point() points with the center of each boxplot? This can be done using position_dodge. The following should work.

    require(ggplot2)

    p <- ggplot(data = df.m, aes(x = variable, y = value))
    p <- p + geom_boxplot(aes(fill = Label))
    # if you want color for points, replace group with colour = Label
    p <- p + geom_point(aes(y = value, group = Label),
                        position = position_dodge(width = 0.75))
    p <- p + facet_wrap(~ variable, scales = "free")
    p <- p + xlab("x-axis") + ylab("y-axis") + ggtitle("Title")
    p <- p + guides(fill = guide_legend(title = "Legend_Title"))
    p

ggplot2_position_dodge_geom_point

agstudy · Answer

Since you don't mention a plotting package, I propose a lattice version here (I think there are more ggplot2 answers than lattice ones, at least since I have been here).

    ## reshaping the data (similar to the other answer)
    library(reshape2)
    dat.m <- melt(TestData, id.vars = 'Label')

    library(lattice)
    bwplot(value ~ Label | variable,   ## see the powerful conditional formula
           data = dat.m,
           between = list(y = 1),
           main = "Bad or Good")


JT85 · Answer

The ggplot version of the lattice plot:

    library(reshape2)
    library(ggplot2)

    df <- read.csv("TestData.csv", header = TRUE)
    df.m <- melt(df, id.var = "Label")

    ggplot(data = df.m, aes(x = Label, y = value)) +
      geom_boxplot() +
      facet_wrap(~ variable, ncol = 4)


dww · Answer

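Using base graphics, we can use at = to control the box positions, combined with boxwex = for the width of the boxes. The first boxplot statement creates a blank plot. The following two statements then add the two traces.

Note that in the following we use df[,-1] to exclude the first (id) column from the values to plot. With a different data frame, you may need to change this to select whichever columns contain the data you want to plot.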

    df <- data.frame(id = c(rep("Good", 200), rep("Bad", 200)),
                     F1 = c(rnorm(200, 10, 2), rnorm(200, 8, 1)),
                     F2 = c(rnorm(200, 7, 1),  rnorm(200, 6, 1)),
                     F3 = c(rnorm(200, 6, 2),  rnorm(200, 9, 3)),
                     F4 = c(rnorm(200, 12, 3), rnorm(200, 8, 2)))

    # invisible boxes
    boxplot(df[,-1], xlim = c(0.5, ncol(df[,-1]) + 0.5),
            boxfill = rgb(1, 1, 1, alpha = 1),
            border = rgb(1, 1, 1, alpha = 1))

    # shift these left by -0.15
    boxplot(df[which(df$id == "Good"), -1], xaxt = "n", add = TRUE,
            boxfill = "red", boxwex = 0.25,
            at = 1:ncol(df[,-1]) - 0.15)

    # shift these right by +0.15
    boxplot(df[which(df$id == "Bad"), -1], xaxt = "n", add = TRUE,
            boxfill = "blue", boxwex = 0.25,
            at = 1:ncol(df[,-1]) + 0.15)

user2103050 · Answer

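I know this is a bit of an old question, but it is one I had as well, and while the accepted answer works, there is a way to do something similar without using additional packages like ggplot or lattice. It isn't quite as nice, in that the boxplots overlap rather than appearing side by side, but: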

    boxplot(data1[,1:4])
    boxplot(data2[,1:4], add = TRUE, border = "red")

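This draws two sets of boxplots, with the second set outlined in red (no fill) and its outliers in red as well. The nice thing is that it works with two separate data frames as they are, without having to reshape them. A quick and dirty way.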

Karolis Koncevičius · Answer

In base R, this can be achieved with the formula interface using an interaction (:).

    df <- read.csv("~/Desktop/TestData.csv")
    df <- data.frame(stack(df[,-1]), Label = df$Label)  # reshape to long format

    boxplot(values ~ Label:ind, data = df, col = c("red", "limegreen"), las = 2)
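Since the test data file is not reproduced in this thread, here is a minimal sketch of the same stack() reshape on synthetic data, to make the long format and the resulting box order easier to see. The data frame, the three feature columns, and the uniform random values are assumptions made up for illustration; they are not the asker's data.

```r
# Synthetic stand-in for TestData.csv (hypothetical values, 3 features only)
set.seed(1)
df <- data.frame(
  Label = rep(c("Good", "Bad"), each = 20),
  F1 = runif(40),
  F2 = runif(40),
  F3 = runif(40)
)

# stack() turns the feature columns into two columns:
#   values = the measurements, ind = the name of the source column
long <- data.frame(stack(df[, -1]), Label = df$Label)
head(long)

# One box per Label/feature combination; boxplot() returns its
# statistics invisibly, including the group names in drawing order
bp <- boxplot(values ~ Label:ind, data = long,
              col = c("red", "limegreen"), las = 2)
bp$names
```

The groups come out with Label varying fastest inside each feature, so the two colors in col are recycled across features: with these labels, red is used for Bad and limegreen for Good.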