I'm getting JSON data from Facebook's Graph API.
Right now my program looks like the following (in Python pseudo-code; note that some variables have been changed for privacy):
import json
import requests
# protected
_accessCode = "someAccessToken"
_accessStr = "?access_token=" + _accessCode
_myID = "myIDNumber"
r = requests.get("https://graph.facebook.com/" + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)
terminate = len(raw["data"])
# list used to store the friend/friend relationships
a = list()
for j in range(0, terminate + 1):
    # calculate terminating displacement:
    term_displacement = terminate - (j + 1)
    print("Currently processing: " + str(j) + " of " + str(terminate))
    for dj in range(1, term_displacement + 1):
        # construct urls based on the raw data:
        url = "https://graph.facebook.com/" + raw["data"][j]["id"] + "/friends/" + raw["data"][j + dj]["id"] + "/" + _accessStr
        # visit site *THIS IS THE BOTTLENECK*:
        reqTemp = requests.get(url)
        rawTemp = json.loads(reqTemp.text)
        if len(rawTemp["data"]) != 0:
            # data dumps to list which dumps to file
            a.append(str(raw["data"][j]["id"]) + "," + str(rawTemp["data"][0]["id"]))
outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")
# write all me/friend relationships to file
for k in range(0, terminate):
    output.write(_myID + "," + raw["data"][k]["id"] + "\n")
# write all friend/friend relationships to file
for i in range(0, len(a)):
    output.write(a[i] + "\n")
output.close()

So here is what it does: first, it calls my own page and gets my list of friends (Facebook does not allow calling a friend's friend list with just an access_token, but I can work around this by requesting the relationship between one friend on my list and another friend on my list). So in the second part (the double for loop), I make another request to check whether some friend A is also a friend of B (both of whom are on my list); if so, the response will be a JSON object of length one containing that friend's name.
But with about 357 friends, that is literally tens of thousands of page requests to make (roughly 357 × 356 / 2 ≈ 63,546 pair checks). In other words, the program spends most of its time just waiting on the JSON requests.

My question is: can this be rewritten to be more efficient? Currently, due to the security restriction, calling a friend's friend-list attribute is disallowed, and the API does not seem to permit it. Are there any Python tricks that could make this run faster? Perhaps parallelism?
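Since each pair check is an independent, I/O-bound HTTP request, one common approach is a thread pool. The sketch below is only illustrative: check_friendship and the friend ids are hypothetical stand-ins for the real requests.get call and the real Graph API data, and max_workers would need tuning against Facebook's rate limits.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real fetch: in the actual program this
# would call requests.get(url) and parse the JSON response.
def check_friendship(pair):
    id_a, id_b = pair
    # simulate a "friends with each other" hit for exactly one pair
    return (id_a, id_b) if (id_a, id_b) == ("2", "3") else None

friend_ids = ["1", "2", "3", "4"]
# every unordered pair of friends, same as the double for loop above
pairs = list(itertools.combinations(friend_ids, 2))

# a pool of worker threads issues many requests concurrently instead of
# blocking on one response at a time; map() preserves input order
with ThreadPoolExecutor(max_workers=20) as pool:
    results = [r for r in pool.map(check_friendship, pairs) if r is not None]
```

Because the work is almost entirely waiting on the network, threads help here despite the GIL.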
Update: the modified code is pasted in the answer section below.
Posted on 2012-12-31 22:00:23
This is unlikely to be optimal, but I tweaked your code slightly to use the requests async API (untested):
import json
import requests
from requests import async

# protected
_accessCode = "someAccessToken"
_accessStr = "?access_token=" + _accessCode
_myID = "myIDNumber"

r = requests.get("https://graph.facebook.com/" + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)
terminate = len(raw["data"])

# list used to store the friend/friend relationships
a = list()

def make_hook(source_id):
    # bind the outer friend's id now; a plain closure over j would see
    # j's final value by the time the response callback actually fires
    def add_to_list(reqTemp):
        rawTemp = json.loads(reqTemp.text)
        if len(rawTemp["data"]) != 0:
            # data dumps to list which dumps to file
            a.append(str(source_id) + "," + str(rawTemp["data"][0]["id"]))
    return add_to_list

async_list = []
for j in range(0, terminate + 1):
    # calculate terminating displacement:
    term_displacement = terminate - (j + 1)
    print("Currently processing: " + str(j) + " of " + str(terminate))
    for dj in range(1, term_displacement + 1):
        # construct urls based on the raw data:
        url = "https://graph.facebook.com/" + raw["data"][j]["id"] + "/friends/" + raw["data"][j + dj]["id"] + "/" + _accessStr
        req = async.get(url, hooks = {'response': make_hook(raw["data"][j]["id"])})
        async_list.append(req)

# gather up all the results
async.map(async_list)

outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")

https://stackoverflow.com/questions/14106255
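A note for modern readers: the requests.async module was later removed from requests (the same pattern lives on in the standalone grequests package), so on current Python the equivalent "fire all requests concurrently, then gather" idea can be sketched with asyncio. The fetch coroutine below is a hypothetical stand-in for a real HTTP call (e.g. via an async HTTP client such as aiohttp):

```python
import asyncio

# Hypothetical stand-in for an HTTP fetch; a real version would use an
# async HTTP client to request each Graph API URL.
async def fetch(url):
    await asyncio.sleep(0)        # stands in for waiting on the network
    return url, '{"data": []}'    # stands in for the JSON response body

async def fetch_all(urls):
    # issue every request concurrently and wait for all of the responses
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = ["https://graph.facebook.com/a/friends/b/"] * 3
results = asyncio.run(fetch_all(urls))
```

gather() returns the results in the same order as the input, which keeps the pairing between URL and response straightforward.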