I have a list of products containing many objects with attributes such as id, image_url, and so on. It looks like this:
total_products
[{u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQCG1ObwtCgqxZIk&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000000.png&cfs=1&_nc_hash=AQAPdo31zo9WJk8j', u'id': u'1539966686030963', u'retailer_id': u'product-1000000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQDyc-Yyic5QLOqH&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F0.png&cfs=1&_nc_hash=AQDhmhPJxFZEpMFX', u'id': u'993388404100117', u'retailer_id': u'product-0'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQAwTzrzAjdKFjmB&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000.png&cfs=1&_nc_hash=AQCMMJRJ_r7QB06I', u'id': u'642820939176165', u'retailer_id': u'product-1000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQBHdbRqB7F6aMKM&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1.png&cfs=1&_nc_hash=AQDx7P52g0NYBB-3', u'id': u'1411912028843607', u'retailer_id': u'product-1'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB7aSPmk_j21umz&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F100000.png&cfs=1&_nc_hash=AQAPV5oe_ymaAcXr', u'id': u'942522339181104', u'retailer_id': u'product-100000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB69V2cgASUIci1&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F100.png&cfs=1&_nc_hash=AQAk3eZ4vqWYbOW4', u'id': u'1347112758661660', u'retailer_id': u'product-100'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQD44rjEUMk6Yp2H&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000001.png&cfs=1&_nc_hash=AQBT_0iB417B08ux', u'id': u'1354204821311003', u'retailer_id': u'product-1000001'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB4ucqXEbo2DyC7&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000002.png&cfs=1&_nc_hash=AQAQ2vuj0WmuXSqw', u'id': u'1776841739008769', u'retailer_id': u'product-1000002'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQBM75VZTNuxqaoq&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F10.png&cfs=1&_nc_hash=AQAUdkc6II5eu47D', u'id': u'1358784964179738', u'retailer_id': u'product-10'}, {u'image_url': 
u'https://external.xx.fbcdn.net/safe_image.php?d=AQAY0kmVnHXBbhHe&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F10000.png&cfs=1&l&_nc_hash=AQCT1PHl5h1Rhc5r', u'id': u'1337513966312571', u'retailer_id': u'product-10000'}]
I am reading a csv file that contains the following data:
csv_file_data:

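As you can see, the id in the csv file and the retailer_id in total_products are the same for some products, so I want to change image_link in the csv file whenever retailer_id and id match.
To do this, I read the csv file row by row and iterate over every product in total_products; if I find a match, I change image_link.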
Code:

import csv

def update_csv(file):
    print file
    reader = csv.DictReader(open(file))
    out_file_name = str(file).replace(".csv", "")
    writer = csv.DictWriter(open(out_file_name + "_updated.csv", "wb"), fieldnames=reader.fieldnames)
    writer.writeheader()
    for current_row in reader:
        # Linear scan over every product for each csv row
        for product in total_products:
            retailer_id = product['retailer_id']
            if current_row['id'] == retailer_id:
                current_row['image_link'] = "RajSharma"
                print "Match = " + str(retailer_id) + " in " + file
                break
        # Write every row, whether or not it was changed
        writer.writerow(current_row)

The problem with this approach is that when total_products contains 1,000 to 10,000+ items, it takes too long to run.
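Is there a way to check whether a row's id exists as a retailer_id in total_products without looping over the whole list, and if it does, change image_link?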
Posted on 2017-01-19 02:32:33
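First, create a set of ids from total_products:

product_ids = set([product['retailer_id'] for product in total_products])

Then check whether current_row['id'] is in the set:

for current_row in reader:
    if current_row['id'] in product_ids:
        current_row['image_link'] = 'RajSharma'
    # Write every row, as in the original function
    writer.writerow(current_row)

Searching a set is much faster, and all we need is the unique collection of product IDs to check against. The test if current_row['id'] in product_ids is a hash lookup that runs in roughly constant time on average, so each csv row no longer triggers a scan of the whole total_products list.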
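Putting the two steps together, the whole function could look roughly like the sketch below. It is written for Python 3 (the question's code is Python 2), so the files are opened with newline="" instead of "wb". The parameters total_products and new_link, and the returned (output path, match count) pair, are assumptions added for illustration and are not part of the original code.

```python
import csv


def update_csv(path, total_products, new_link="RajSharma"):
    """Copy the csv at `path` to *_updated.csv, replacing image_link on matching rows.

    `total_products` and `new_link` are hypothetical parameters added for this sketch.
    """
    # Build the lookup set once; each membership test is then a hash lookup.
    product_ids = {product["retailer_id"] for product in total_products}

    # Same output naming scheme as the question's code.
    out_path = path.replace(".csv", "") + "_updated.csv"
    matched = 0

    with open(path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row["id"] in product_ids:
                row["image_link"] = new_link
                matched += 1
            # Every row is written, whether or not it was changed.
            writer.writerow(row)

    return out_path, matched
```

If the new image_link had to come from the matching product rather than a fixed string, a dict keyed by retailer_id could presumably replace the set in the same way, with the lookup returning the product's image_url.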
https://stackoverflow.com/questions/41732602