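I am trying to merge several .json files so that I can run sentiment analysis on them later. I have tried other approaches, but they always end in an error. I checked that the .json files are correctly formatted and found no problems. I have also attached a sample .json file.
The error message is attached below my code.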
import glob
import json
# list all files containing News from Guardian API
files = list(glob.iglob('/Users/xxx/tempdata/articles_data/*.json'))
news_data = []
for file in files:
    news_file = open(file, "r", encoding = 'utf-8')
    # Read in news and store in list: news_data
    for line in news_file:
        news = json.loads(line)
        news_data.append(news)
    news_file.close()
Updated error output:
AttributeError Traceback (most recent call last)
<ipython-input-86-3019ee85b15b> in <module>
12 # Read in news and store in list: news_data
13 for line in news_file:
---> 14 news = json.load(line)
15 news_data.append(news)
16
~/opt/anaconda3/lib/python3.8/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
291 kwarg; otherwise ``JSONDecoder`` is used.
292 """
--> 293 return loads(fp.read(),
294 cls=cls, object_hook=object_hook,
295 parse_float=parse_float, parse_int=parse_int,
AttributeError: 'str' object has no attribute 'read'
2019-11-01.json
{
"id":"business/2019/nov/01/google-snaps-up-fitbit-for-21bn",
"type":"article",
"sectionId":"business",
"sectionName":"Business",
"webPublicationDate":"2019-11-01T14:26:19Z",
"webTitle":"Google snaps up Fitbit for $2.1bn",
"webUrl":"https://www.theguardian.com/business/2019/nov/01/google-snaps-up-fitbit-for-21bn",
"apiUrl":"https://content.guardianapis.com/business/2019/nov/01/google-snaps-up-fitbit-for-21bn",
"fields":{
"headline":"Google snaps up Fitbit for $2.1bn",
"standfirst":"<p>Takeover allows web giant to take on Apple in fast-growing smartwatch and wearables business</p>",
"trailText":"Takeover allows web giant to take on Apple in fast-growing smartwatch and wearables business",
"byline":"Kalyeena Makortoff",
"main":"<figure class=\"element element-image\" data-media-id=\"fc8abb0f70105fcab3aee86dea6c89e211337660\"> <img src=\"https://media.guim.co.uk/fc8abb0f70105fcab3aee86dea6c89e211337660/0_158_3571_2143/1000.jpg\" alt=\"The wireless activity tracker Zip by Fitbit Inc\" width=\"1000\" height=\"600\" class=\"gu-image\" /> <figcaption> <span class=\"element-image__caption\">The wireless activity tracker Zip by Fitbit Inc. Google has confirmed it will buy Fitbit for $2.1bn.</span> <span class=\"element-image__credit\">Photograph: Franck Robichon/EPA</span> </figcaption> </figure>",
"body":"<p>Google has snapped up the Fitbit... ",
"newspaperPageNumber":"38",
"wordcount":"679",
"firstPublicationDate":"2019-11-01T14:25:58Z",
"isInappropriateForSponsorship":"false",
"isPremoderated":"false",
"lastModified":"2019-11-01T18:56:38Z",
"newspaperEditionDate":"2019-11-02T00:00:00Z",
"productionOffice":"UK",
"publication":"The Guardian",
"shortUrl":"https://gu.com/p/cjeze",
"shouldHideAdverts":"false",
"showInRelatedContent":"true",
"thumbnail":"https://media.guim.co.uk/fc8abb0f70105fcab3aee86dea6c89e211337660/0_158_3571_2143/500.jpg",
"legallySensitive":"false",
"lang":"en",
"isLive":"true",
"bodyText":"Google has snapped up the Fitbit activity tracker business in a $2.1bn (\u00a31.6bn) deal that will enable the search giant to go toe-to-toe with Apple in the fast-growing smartwatch and wearables business..." ,
"charCount":"4149",
"shouldHideReaderRevenue":"false",
"showAffiliateLinks":"false",
"bylineHtml":"<a href=\"profile/kalyeena-makortoff\">Kalyeena Makortoff</a>"
},
"isHosted":false,
"pillarId":"pillar/news",
"pillarName":"News"
},
Posted 2020-10-12 13:45:34
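If you are reading JSON from a file, use json.load rather than json.loads. json.load takes an open file object and parses its entire contents, whereas json.loads parses JSON from a string. See the json.load documentation.
For example: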
import json
with open('ts.json', 'r') as f:
    content = json.load(f)
print(content)
https://stackoverflow.com/questions/64318935
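Applied to the loop in the question, the same idea might look like the sketch below. It assumes each file contains exactly one valid JSON document (a single article object or a list of articles); the helper name load_json_files is hypothetical and not part of any library. The key change is that each file is parsed as a whole with json.load instead of being read line by line, since a pretty-printed article spans many lines and no single line is valid JSON on its own.

```python
import glob
import json


def load_json_files(pattern):
    """Parse every file matching `pattern` and collect the results in one list.

    Hypothetical helper: assumes each file holds one valid JSON document.
    """
    merged = []
    # sorted() keeps the order stable, e.g. by date-based file name
    for path in sorted(glob.glob(pattern)):
        # the with-block closes the file even if parsing fails
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)  # parse the whole file, not one line at a time
        # A file may hold one article (dict) or several (list); flatten lists
        if isinstance(data, list):
            merged.extend(data)
        else:
            merged.append(data)
    return merged
```

With the directory from the question, this could be called as news_data = load_json_files('/Users/xxx/tempdata/articles_data/*.json'). If a file is not a single valid document (for example, several objects separated by commas without enclosing brackets), json.load raises json.JSONDecodeError and the file would need to be repaired first.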