I want to combine several JSON objects into a FeatureCollection of LineStrings.
For example, I have the following malformed JSON objects:
{"lat":16.0269337,"lon":40.073042,"score":1,"ID":"13800006252028","TYPES":"Regional","N2C":"2","NAME":"Strada Statale della Val Sinni"}
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2"}
{"lat":16.0316523,"lon":41.6808932,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2","NAME":"Giro del Gargano"}
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006203620","TYPES":"Scenic","N2C":"2"}
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006203620","TYPES":"Regional","N2C":"2","NAME":"Giro del Gargano"}
{"lat":16.032322,"lon":40.6097567,"score":1,"ID":"13800005963252","TYPES":"Scenic","N2C":"2"}
{"lat":16.032322,"lon":40.6097567,"score":1,"ID":"13800005963252","TYPES":"Regional","N2C":"2","NAME":"Via Appia"}
I intend to do two things
Posted on 2017-12-29 00:14:23
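I'm not an expert, but here is a first pass. This post helped me get the basics. For each of the "Geo", "JSON" and pandas steps there are probably more elegant approaches than mine.
It isn't clear exactly what format your data is in, but if it really is a list of dictionaries you can load it directly with pd.DataFrame, or read it straight from a JSON file with pd.read_json. I have only worked with the former, but that is easy to change.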
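If the records actually arrive as one JSON object per line, as they are laid out in the question, a minimal sketch like the following could turn them into the list of dictionaries that pd.DataFrame accepts. The helper name parse_json_lines and the two-record sample are illustrative assumptions, not part of the original answer; for a file on disk, pd.read_json(path, lines=True) should achieve the same thing in one call.

```python
import json

# Two sample records in the one-object-per-line layout shown in the question.
raw = """\
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2"}
{"lat":16.0316523,"lon":41.6808932,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2","NAME":"Giro del Gargano"}
"""


def parse_json_lines(text):
    """Parse one JSON object per non-empty line into a list of dicts.

    Hypothetical helper for illustration; it assumes every non-empty
    line is a complete JSON object.
    """
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_json_lines(raw)
print(len(records), records[0]["ID"])
```

The resulting list can be passed to pd.DataFrame unchanged; keys that are missing from some records (such as NAME here) simply become NaN in those rows.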
json_objs = [{"lat":16.0269337,"lon":40.073042,"score":1,"ID":"13800006252028","TYPES":"Regional","N2C":"2","NAME":"Strada Statale della Val Sinni"},
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2"},
{"lat":16.0316523,"lon":41.6808932,"score":1,"ID":"13800006260148","TYPES":"Scenic","N2C":"2","NAME":"Giro del Gargano"},
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006203620","TYPES":"Scenic","N2C":"2"},
{"lat":16.0316522,"lon":41.6808931,"score":1,"ID":"13800006203620","TYPES":"Regional","N2C":"2","NAME":"Giro del Gargano"},
{"lat":16.032322,"lon":40.6097567,"score":1,"ID":"13800005963252","TYPES":"Scenic","N2C":"2"},
{"lat":16.032322,"lon":40.6097567,"score":1,"ID":"13800005963252","TYPES":"Regional","N2C":"2","NAME":"Via Appia"}]
Then manipulate the data with pandas and use the geojson package to create the LineString objects.
import geojson
import pandas as pd

df = pd.DataFrame(json_objs)
# Group the points by record ID and type; each group becomes one LineString.
grouped = df.groupby(["ID", "TYPES"])
features = []
for name, indices in grouped.groups.items():
    # grouped.groups maps each key to index labels, so select with .loc.
    group = df.loc[indices]
    # Cast to a plain int so the value can be serialized to JSON.
    score = int(group["score"].sum())
    points = list(zip(group["lat"].tolist(), group["lon"].tolist()))
    properties = {"ID": name[0], "record_type": name[1], "score": score}
    ls = geojson.LineString(coordinates=points, properties=properties)
    print(ls)
    features.append(ls)

Output:
{"coordinates": [[16.032322, 40.6097567]], "properties": {"ID": "13800005963252", "record_type": "Scenic", "score": 1}, "type": "LineString"}
{"coordinates": [[16.0269337, 40.073042]], "properties": {"ID": "13800006252028", "record_type": "Regional", "score": 1}, "type": "LineString"}
{"coordinates": [[16.032322, 40.6097567]], "properties": {"ID": "13800005963252", "record_type": "Regional", "score": 1}, "type": "LineString"}
{"coordinates": [[16.0316522, 41.6808931], [16.0316523, 41.6808932]], "properties": {"ID": "13800006260148", "record_type": "Scenic", "score": 2}, "type": "LineString"}
{"coordinates": [[16.0316522, 41.6808931]], "properties": {"ID": "13800006203620", "record_type": "Scenic", "score": 1}, "type": "LineString"}
{"coordinates": [[16.0316522, 41.6808931]], "properties": {"ID": "13800006203620", "record_type": "Regional", "score": 1}, "type": "LineString"}
Then convert these LineStrings into a FeatureCollection:
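feature = geojson.FeatureCollection(features)
Obviously you could also build this format by hand, but I expect geojson offers more flexibility. Apparently there is also a geopandas package, which I was unable to install.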
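For comparison, here is a rough sketch of the by-hand route using only the standard library. It is not the answer's method: the function name to_feature_collection and the trimmed sample records are assumptions for illustration. It also makes two choices that follow from the GeoJSON specification (RFC 7946) rather than from the code above: each geometry is wrapped in a Feature object that carries the properties, and a group with a single point is emitted as a Point, since a LineString needs at least two positions. Falling back to Point is only one possible policy; skipping such groups would be another.

```python
import json
from collections import defaultdict

# Trimmed sample of the question's records (assumed subset, for illustration).
records = [
    {"lat": 16.0316522, "lon": 41.6808931, "score": 1, "ID": "13800006260148", "TYPES": "Scenic"},
    {"lat": 16.0316523, "lon": 41.6808932, "score": 1, "ID": "13800006260148", "TYPES": "Scenic"},
    {"lat": 16.032322, "lon": 40.6097567, "score": 1, "ID": "13800005963252", "TYPES": "Regional"},
]


def to_feature_collection(rows):
    """Group rows by (ID, TYPES) and build a FeatureCollection as plain dicts."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["ID"], row["TYPES"])].append(row)

    features = []
    for (rec_id, rec_type), members in groups.items():
        # Same coordinate order as the pandas version: the "lat" field first,
        # then the "lon" field, exactly as they are named in the data.
        coords = [[m["lat"], m["lon"]] for m in members]
        if len(coords) >= 2:
            geometry = {"type": "LineString", "coordinates": coords}
        else:
            # Assumed policy: a lone point cannot form a LineString, so emit a Point.
            geometry = {"type": "Point", "coordinates": coords[0]}
        features.append({
            "type": "Feature",
            "geometry": geometry,
            "properties": {
                "ID": rec_id,
                "record_type": rec_type,
                "score": sum(m["score"] for m in members),
            },
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_feature_collection(records)
print(json.dumps(collection, indent=2))
```

With the geojson package, the same wrapping should be possible by passing each geometry to geojson.Feature(geometry=..., properties=...) before building the FeatureCollection.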
https://stackoverflow.com/questions/48015136