Rで操作したい1500個以上のjsonオブジェクトを含むファイルがあります。データをリストとしてインポートすることはできましたが、有用な構造に強制するのに問題があります。各jsonオブジェクトの行と各key:valueペアの列を含むデータフレームを作成します。
この小さな偽のデータセットで状況を再現しました。
[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]
データの機能:
この質問に基づいて: R list(structure(list()))to data frame 、私は次のことを試しました:
json_file <- "test.json"
json_data <- fromJSON(json_file)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))
私の実際のデータとこの偽のデータの両方で、最後の行は私にこのエラーを与えます:
Error in data.frame(name = "Doe, John", group = "Red", `age (y)` = 24, :
arguments imply differing number of rows: 1, 0
NULLをNAに置き換えるだけです。
require(RJSONIO)
json_file <- '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'
json_file <- fromJSON(json_file)
json_file <- lapply(json_file, function(x) {
x[sapply(x, is.null)] <- NA
unlist(x)
})
各要素にnull以外の値を設定したら、エラーを発生させることなくrbind
を呼び出すことができます。
do.call("rbind", json_file)
name group age (y) height (cm) wieght (kg) score
[1,] "Doe, John" "Red" "24" "182" "74.8" NA
[2,] "Doe, Jane" "Green" "30" "170" "70.1" "500"
[3,] "Smith, Joan" "Yellow" "41" "169" "60" NA
[4,] "Brown, Sam" "Green" "22" "183" "75" "865"
[5,] "Jones, Larry" "Green" "31" "178" "83.9" "221"
[6,] "Murray, Seth" "Red" "35" "172" "76.2" "413"
[7,] "Doe, Jane" "Yellow" "22" "164" "68" "902"
library(jsonlite)
と関数fromJSON
を使用する場合、これは非常に簡単です。また、null
値を処理し、NA
に変換します。
json_file <- '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'
library(jsonlite)
fromJSON(json_file)
# name group age (y) height (cm) wieght (kg) score
# 1 Doe, John Red 24 182 74.8 NA
# 2 Doe, Jane Green 30 170 70.1 500
# 3 Smith, Joan Yellow 41 169 60.0 NA
# 4 Brown, Sam Green 22 183 75.0 865
# 5 Jones, Larry Green 31 178 83.9 221
# 6 Murray, Seth Red 35 172 76.2 413
# 7 Doe, Jane Yellow 22 164 68.0 902
str(fromJSON(json_file))
# 'data.frame': 7 obs. of 6 variables:
# $ name : chr "Doe, John" "Doe, Jane" "Smith, Joan" "Brown, Sam" ...
# $ group : chr "Red" "Green" "Yellow" "Green" ...
# $ age (y) : int 24 30 41 22 31 35 22
# $ height (cm): int 182 170 169 183 178 172 164
# $ wieght (kg): num 74.8 70.1 60 75 83.9 76.2 68
# $ score : int NA 500 NA 865 221 413 902
Null値を削除するには、パラメーターnullValueを使用します
json_data <- fromJSON(json_file, nullValue = NA)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))
このようにして、出力に不要な引用符はありません
library(rjson)
Lines <- readLines("Yelp_academic_dataset_business.json")
business <- as.data.frame(t(sapply(Lines, fromJSON)))
これを試してJSONデータをRにロードできます
dplyr::bind_rows(fromJSON(file_name))