How to replace NA with a set of values

I have the following data frame:

library(dplyr)
library(tibble)


df <- tibble(
  source = c("a", "b", "c", "d", "e"),
  score = c(10, 5, NA, 3, NA)
)


df

It looks like this:

# A tibble: 5 x 2
  source score
  <chr>  <dbl>
1 a         10   # current max value
2 b          5
3 c         NA
4 d          3
5 e         NA

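I want to replace the NA values in the score column with values starting from the existing max + n, where n ranges from 1 up to the total number of rows in df.

The result should look like this (coded by hand):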

  source score
  a         10
  b          5
  c         11 # obtained from 10 + 1
  d          3
  e         12 #  obtained from 10 + 2

How can I achieve that?

r dplyr tibble

2020/02/14 scamander

This data.table approach is not as elegant as the base R solutions, but it is still possible:

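library(data.table)
setDT(df)  # convert df to a data.table by reference

# current maximum of score, ignoring NA
max.score <- df[, max(score, na.rm = TRUE)]
# fill each NA with max + 1, max + 2, ... (.N is the number of NA rows)
df[is.na(score), score := seq_len(.N) + max.score]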

Or in one line, which is slightly slower:

df[is.na(score), score := (1:.N) + df[, max(score, na.rm = TRUE)]]
df
   source score
1:      a    10
2:      b     5
3:      c    11
4:      d     3
5:      e    12
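
The answer above refers to base R solutions without showing one. A minimal sketch of what such a solution might look like, assuming the same data as in the question (the names df2, na_rows, and max_score are illustrative and do not come from the original posts):

```r
# Base R sketch; df2 is an illustrative copy of the question's data
df2 <- data.frame(
  source = c("a", "b", "c", "d", "e"),
  score = c(10, 5, NA, 3, NA)
)

na_rows <- is.na(df2$score)                  # TRUE where score is missing
max_score <- max(df2$score, na.rm = TRUE)    # current maximum, ignoring NA

# seq_len(sum(na_rows)) yields 1, 2, ... with one value per missing score
df2$score[na_rows] <- max_score + seq_len(sum(na_rows))

df2$score
# [1] 10  5 11  3 12
```

Note that this sketch assumes at least one non-missing score; if every value is NA, max() with na.rm = TRUE returns -Inf and emits a warning.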

2020/02/14 Sergiy