我目前正在使用Python,没有太多使用此代码的经验。我正在使用VScode,并尝试将CSV文件中的数据重新格式化为新的CSV文件。我有以下数据,我想知道如何重新格式化它,因为一列是JSON消息,另外两列是唯一的传感器ID和时间戳。理想情况下,我希望为每个传感器ID创建一个单独的列,其中JSON消息中的所有参数都作为列。
我还希望将此输出放入新的CSV中,并为JSON消息中的每种数据类型提供单独的列。当前代码如下:
with open(file name here) as file:
reader = csv.reader(file)
for row in reader:
if row[1] == 'sensor/200187000000000000000000/501':
print (row[2])
data = (r'file name', 'r')
df = pd.read_json(data, orient='index')
print我认为pd_read_json会为我将消息转换为单独的列,但我收到以下错误:
ValueError: Invalid file path or buffer object type: <class 'tuple'>以下是文本格式的数据集示例:
receivedTS,topic,messageData
1623887376046,sensor/200187000000000000000000/501,"{""m1"":0,""m2"":0,""t1"":222,""t2"":232,""t3"":230,""t4"":200,""timestamp"":1623887386}"
1623887070457,sensor/200187000000000000000000/501,"{""m1"":0,""m2"":0,""t1"":222,""t2"":232,""t3"":232,""t4"":205,""timestamp"":1623887079}"
1623887062762,transceiver/004900314E46500D2033334D/event/ex/scan,"{""sensors"":[{""id"":""200187000000000000000000"",""type"":""501"",""modAdr"":3,""fw"":""48.50"",""hw"":""15.15""}],""timestamp"":1623887071}"
1623887056853,transceiver/004900314E46500D2033334D/data/network,"{""c"":1,""d"":0,""n"":[""53001""]}"
1623887051515,transceiver/004900314E46500D2033334D/data/cell-strength,"{""rssi"":-61,""ber"":3,""timestamp"":1623887028}"
1623887045991,transceiver/004900314E46500D2033334D/data/mains-voltage,"{""mv"":24288,""timestamp"":1623886996}"
1623887040884,transceiver/004900314E46500D2033334D/data/battery-voltage,"{""mv"":13648,""timestamp"":1623886996}"
1623887030393,transceiver/004900314E46500D2033334D/event/boot,"{""rebootReason"":""0"",""version"":""17.501"",""mBoard"":""0.0.4"",""expBoard"":""0.0.7"",""sBoard"":""0.0.6"",""timestamp"":1623886993,""sensors"":[]}"
1623887017480,transceiver/004900314E46500D2033334D/event/netid,"{""imsi"":""204047116508564"",""imei"":""352909081612471""}"
1623886957737,transceiver/004900314E46500D2033334D/data/mains-voltage,"{""mv"":24280,""timestamp"":1623886908}"新文件的预期输出列:
传感器ID、时间戳、m1、m2、t1、t2、t3、t4
希望这是足够的信息。
发布于 2021-06-22 15:02:19
希望这是一次简单而明显的尝试。
with open(csvfilename) as infile, open(newcsv, 'w') as outfile:
reader = csv.reader(infile)
writer = csv.writer(outfile)
keys = ['sensor ID', 'timestamp', 'm1', 'm2', 't1', 't2', 't3', 't4']
writer.writerow(keys)
for row in reader:
if not row[1].startswith('sensor/'):
continue
jdata = json.loads(row[2])
jdata['sensor ID'] = row[1][7:]
writer.writerow(jdata[key] for key in keys)这里主要的heureka时刻可能是最后一行,它从JSON字典中以特定的顺序从keys中取出请求的字段。在前一行中,我们将传感器ID添加到提取的字典中,以便可以使后续代码变得非常简单。
我假设您只需要删除前缀的sensor/行(这就是[7:]所做的),并且时间戳应该是来自JSON数据的时间戳,而不是来自input CSV中类似格式的receivedTS字段的时间戳。(下一步,可能会尝试更具体地说明这些细节。)
https://stackoverflow.com/questions/68062085
复制相似问题