Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y', and I have tried everything

Question

My dataset is pd, and I have split it into training and test data as pd_train1 and pd_train2.

          sku national_inv lead_time in_transit_qty forecast_3_month forecast_6_month
    1 3921548            8        12              0                0                0
    2 3191009           83         2             33              157              377
    3 2935810            8         4              0                0                0
    4 2205847           31         4             63               70              160
    5 4953497            3        12              0                0                0
    6 2286884            0         8              0                0                0
      forecast_9_month sales_1_month sales_3_month sales_6_month sales_9_month min_bank
    1                0             1             1             2             5        2
    2              603            44            98           148           156       53
    3                0             0             0             1             1        0
    4              223            27            90           164           219        0
    5                0             0             0             0             0        0
    6                0             0             0             0             0        0
      potential_issue pieces_past_due perf_6_month_avg perf_12_month_avg local_bo_qty
    1               0               0             0.63              0.75            0
    2               0               0             0.68              0.66            0
    3               0               0             0.73              0.78            0
    4               0               0             0.73              0.78            0
    5               0               0             0.81              0.74            0
    6               0               0             0.91              0.96            0
      deck_risk oe_constraint ppap_risk stop_auto_buy rev_stop went_on_backorder  data
    1         0             0         0             1        0                No train
    2         0             0         0             1        0                No train
    3         0             0         0             1        0                No train
    4         0             0         1             1        0                No train
    5         0             0         0             1        0                No train
    6         0             0         0             1        0                No train

I wanted to fit an lm model on my training data pd_train1, but I get the following error:

    > fit=lm(went_on_backorder~.,data=pd_train1)
    Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
      NA/NaN/Inf in 'y'
    In addition: Warning message:
    In storage.mode(v) <- "double" : NAs introduced by coercion

I checked for infinite values:

    sapply(pd_train1, function(x) sum(is.infinite(x)))
                  sku      national_inv         lead_time    in_transit_qty  forecast_3_month
                    0                 0                 0                 0                 0
     forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month
                    0                 0                 0                 0                 0
        sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg
                    0                 0                 0                 0                 0
    perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk
                    0                 0                 0                 0                 0
        stop_auto_buy          rev_stop went_on_backorder              data
                    0                 0                 0                 0

I also checked for NA/NaN values in the training data I want to fit the linear model on:

                  sku      national_inv         lead_time    in_transit_qty  forecast_3_month
                    0                 0                 0                 0                 0
     forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month
                    0                 0                 0                 0                 0
        sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg
                    0                 0                 0                 0                 0
    perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk
                    0                 0                 0                 0                 0
        stop_auto_buy          rev_stop went_on_backorder
                    0                 0                 0

    Inf %in% pd_train1$went_on_backorder
    [1] FALSE
    NaN %in% pd_test$went_on_backorder
    [1] FALSE

So I cannot find any NA/NaN/Inf values in my dataset. Can anyone tell me what is causing the error? Here went_on_backorder is the target variable.

cddt · Accepted Answer

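The column went_on_backorder is categorical (a factor or character column holding labels such as "No"), not numeric. Linear regression requires a numeric response variable.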
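A quick way to confirm this is to look at the class of the response before fitting. The sketch below uses a small made-up data frame named toy (hypothetical values, not the asker's data) and assumes the column is stored as character, which is the case that produces the "NAs introduced by coercion" warning. It also shows why the is.na() and is.infinite() checks come back clean: the strings themselves are not missing, they only become NA when lm() coerces them to numbers.

```r
# Toy stand-in for pd_train1 (hypothetical values, not the original data)
toy <- data.frame(
  went_on_backorder = c("No", "Yes", "No", "Yes"),
  lead_time = c(8, 2, 4, 12),
  stringsAsFactors = FALSE
)

# The response holds text labels, not numbers
resp_class <- class(toy$went_on_backorder)
print(resp_class)

# Neither check flags anything, because "No" and "Yes" are valid strings
n_missing <- sum(is.na(toy$went_on_backorder))
n_infinite <- sum(is.infinite(toy$went_on_backorder))

# lm() coerces the response to double, the labels become NA, and lm.fit() stops;
# the error message is captured here instead of aborting the script
fit_error <- tryCatch(
  suppressWarnings(lm(went_on_backorder ~ ., data = toy)),
  error = function(e) conditionMessage(e)
)
print(fit_error)
```

Running str(pd_train1) on the real data gives the same information for every column at once.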

To use logistic regression instead, use glm from base R or a package such as VGAM. Here is a simple example:

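    # Toy data; factor() makes the response a two-level factor, which glm() accepts
    pd_train1 <- data.frame('went_on_backorder' = factor(c('No','Yes','Yes')), 'lead_time' = 1:3)
    # Logistic regression: models the probability of the second level ('Yes')
    model <- glm(went_on_backorder ~ ., data = pd_train1, family = 'binomial')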

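You can then predict for new data. With type = "response", the result is the predicted probability of the second factor level ("Yes"):

    predict(model, newdata = data.frame('lead_time' = c(0,1,2.5,3.5)), type = "response")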
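Since predict() with type = "response" returns probabilities rather than labels, a cutoff is needed to get classes. The following is a minimal sketch, assuming a cutoff of 0.5 and a made-up data frame named train; neither comes from the original answer, and the cutoff in particular should be chosen to suit the problem.

```r
# Toy training data (hypothetical values); the response is an explicit factor
train <- data.frame(
  went_on_backorder = factor(c("No", "No", "Yes", "No", "Yes", "Yes")),
  lead_time = c(1, 2, 3, 4, 5, 6)
)

# Logistic regression; with levels No < Yes, glm() models P(went_on_backorder == "Yes")
model <- glm(went_on_backorder ~ lead_time, data = train, family = binomial)

# Predicted probabilities of "Yes" for new observations
new_data <- data.frame(lead_time = c(0, 1, 2.5, 3.5))
prob_yes <- predict(model, newdata = new_data, type = "response")

# Turn probabilities into labels using an assumed cutoff of 0.5
cutoff <- 0.5
pred_class <- ifelse(prob_yes > cutoff, "Yes", "No")
print(data.frame(new_data, prob_yes, pred_class))
```

With imbalanced classes a cutoff other than 0.5 often makes more sense, which is a modelling decision rather than something glm decides for you.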

Roland · Answer

went_on_backorder is not a numeric variable. lm cannot deal with non-numeric response variables. Use logistic regression instead.