All dictionary or list values get updated with the same value in Python

Stack Overflow user
Asked 2018-03-28 06:21:33
Answers: 2, Views: 71, Followers: 0, Votes: 1

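Background

Using Python, I crawl a set of websites whose URLs are stored in a list, by iterating over that list. Each URL is taken from the list and passed to a crawling function. The function returns a response, and the crawled data is added to a dictionary.

Problem

Every time a new response comes back from the crawling function and is added to the dictionary, all values in the dictionary are updated to the latest value. I also tried appending the responses to a list, and all values in the list are likewise updated to the latest response.

Debugging attempted

Before and after adding them to the dictionary or list, I printed the individual responses in each iteration. Each response is the same before and after being added, and differs from one iteration to the next. In other words, the responses are distinct, as expected. Yet the whole list gets updated with the latest value.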
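A further check that fits the debugging described above is to compare object identities rather than printed contents. The sketch below is only an illustration: `fake_find_job_details` is a hypothetical stand-in (the real `find_job_details` is not shown in the question), written on the assumption that the crawler reuses one dictionary across calls. If the real function behaves this way, every stored entry is the same object, which would produce exactly the symptom described.

```python
# Hypothetical stand-in for find_job_details: it reuses ONE module-level dict.
# This is an assumption for illustration; the real function is not shown.
_shared_details = {}

def fake_find_job_details(url):
    _shared_details.clear()
    _shared_details["job_url"] = url
    return [0, _shared_details]

# Placeholder URLs, not taken from the question.
urls = [
    "http://example.com/jobs/1",
    "http://example.com/jobs/2",
    "http://example.com/jobs/3",
]

collected = []
for url in urls:
    status, details = fake_find_job_details(url)
    if status == 0:
        collected.append(details)

# Count how many distinct objects were stored, and check identity with `is`.
distinct_objects = len({id(d) for d in collected})
all_same_object = all(d is collected[0] for d in collected)
print(distinct_objects, all_same_object)  # 1 True
```

If the same check on the real data reports a single distinct object, the list holds several references to one dictionary rather than several dictionaries.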

for jobListingPage in jobListingPages:
    try:
        r = urllib.urlopen(jobListingPage).read()
        soup = BeautifulSoup(r, "html.parser")
        jobsSummaryMarkup = soup.find_all("h2", class_=["g-col10"])
        i = 0
        for jobSummaryMarkup in jobsSummaryMarkup:
            jobDetailsURL = base_url_sof+str(jobSummaryMarkup.a["href"])
            jobDetailsFindRes = find_job_details(jobDetailsURL)
            if(jobDetailsFindRes[0] == 0):
                #print("******crawled response before adding")
                #print(jobDetailsFindRes[1])
                i=i+1
                all_jobs_data["job "+str(i)] = jobDetailsFindRes[1]
                #print("******crawled response after adding")
                #print(jobDetailsFindRes[1])
                #print("******cumulative dictionary")
                #print(all_jobs_data)
                #print("###########################################")
        return([0, all_jobs_data])
    except Exception as e:
        return([-1, e])

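Output of the above code

After uncommenting the print statements, the following output is obtained after three iterations, i.e. after crawling three websites from the list.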

******crawled response before adding
{'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}
******crawled response after adding
{'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}
******cumulative dictionary
{'job 1': {'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}}
#########################################
******crawled response before adding
{'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}
******crawled response after adding
{'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}
******cumulative dictionary
{'job 1': {'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}, 'job 2': {'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}}
#########################################
******crawled response before adding
{'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}
******crawled response after adding
{'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}
******cumulative dictionary
{'job 1': {'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}, 'job 2': {'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}, 'job 3': {'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}}
#########################################

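The last item is propagated through the whole dictionary and overwrites every entry. If I append the last item to a list instead, the whole list is updated with the last item.

How can I add distinct items to the dictionary, instead of having the entire dictionary updated by the last item?

Edit: a version of the code that appends the responses to a list instead of adding them to a dictionary.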

for jobListingPage in jobListingPages:
    try:
        r = urllib.urlopen(jobListingPage).read()
        soup = BeautifulSoup(r, "html.parser")
        jobsSummaryMarkup = soup.find_all("h2", class_=["g-col10"])
        for jobSummaryMarkup in jobsSummaryMarkup:
            jobDetailsURL = base_url_sof+str(jobSummaryMarkup.a["href"])
            jobDetailsFindRes = find_job_details(jobDetailsURL)
            if(jobDetailsFindRes[0] == 0):
                #print("******crawled response before adding")
                #print(jobDetailsFindRes[1])
                all_jobs_data_list.append(jobDetailsFindRes[1])
                #print("******crawled response after adding")
                #print(jobDetailsFindRes[1])
                #print("******cumulative list")
                #print(all_jobs_data_list)
                #print("###########################################")
        return([0, all_jobs_data])
    except Exception as e:
        return([-1, e])

The output of the above code is:

******crawled response before adding
{'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}
******crawled response after adding
{'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}
******cumulative dictionary
[{'location_name': 'Bengaluru', 'tags': ['user-interface', 'html5', 'javascript', 'angularjs', 'reactjs'], 'job_url': 'http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix', 'Experience level': ['Mid-Level', ' Senior', ' Lead'], 'Job type': ['Permanent'], 'Role': ['Frontend Developer'], 'company_name': 'Citrix', 'job_name': 'UI /Front-End Developer'}]
#########################################
******crawled response before adding
{'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}
******crawled response after adding
{'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}
******cumulative dictionary
[{'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}, {'location_name': 'Bengaluru', 'tags': ['python', 'django', 'java'], 'job_url': 'http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay', 'Industry': ['Mobile Payments', ' POS', ' Retail'], 'Experience level': ['Mid-Level'], 'Job type': ['Permanent'], 'Role': ['Full Stack Developer'], 'company_name': 'MishiPay', 'job_name': 'Full Stack Developer'}]
#########################################
******crawled response before adding
{'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}
******crawled response after adding
{'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}
******cumulative dictionary
[{'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}, {'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}, {'location_name': 'Hyderabad', 'tags': ['architecture', 'web-services', 'togaf', 'websecurity', 'bigdata'], 'job_url': 'http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe', 'Industry': ['Financial Services', ' Financial Technology', ' Information Technology'], 'Experience level': ['Mid-Level', ' Senior'], 'Job type': ['Permanent'], 'Role': ['System Administrator'], 'company_name': 'Paysafe', 'job_name': 'Web Security Architect  in Fintech & Big Data'}]
#########################################
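Two details of the posted loops are visible in the code itself: the `return` sits inside the outer `for` loop, so the function exits after the first listing page, and the list version returns `all_jobs_data` rather than `all_jobs_data_list`. The sketch below shows one possible restructuring, not the asker's actual function. The names `collect_jobs`, `list_job_urls`, and `find_details` are hypothetical, and the fetching and parsing steps are injected as parameters so the example runs without network access. Storing `copy.deepcopy(details)` is a precaution that assumes the details function might reuse one dictionary.

```python
import copy

def collect_jobs(listing_pages, list_job_urls, find_details):
    """Collect job details from every listing page (hypothetical sketch).

    list_job_urls(page) -> iterable of job URLs found on that page
    find_details(url)   -> [status, details_dict], status 0 meaning success
    """
    all_jobs = []
    try:
        for page in listing_pages:
            for job_url in list_job_urls(page):
                status, details = find_details(job_url)
                if status == 0:
                    # Store an independent copy so later calls cannot change it.
                    all_jobs.append(copy.deepcopy(details))
    except Exception as e:
        return [-1, e]
    # Return only after ALL pages have been processed.
    return [0, all_jobs]

# Fakes used only to exercise the sketch; they deliberately reuse one dict.
_reused = {}

def fake_list_job_urls(page):
    return [page + "/job-a", page + "/job-b"]

def fake_find_details(job_url):
    _reused["job_url"] = job_url
    return [0, _reused]

status, jobs = collect_jobs(["page1", "page2"], fake_list_job_urls, fake_find_details)
print(status, [j["job_url"] for j in jobs])
# 0 ['page1/job-a', 'page1/job-b', 'page2/job-a', 'page2/job-b']
```

Even though the fake details function hands back the same dictionary every time, the collected entries stay distinct because each one is copied at the moment it is stored.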

Sample data for jobListingPages:

['https://stackoverflow.com/jobs?sort=p&l=India&d=100&u=Km', 'https://stackoverflow.com/jobs?l=India&d=100&u=Km&sort=i&pg=2']

Sample data for jobDetailsURL:

http://www.stackoverflow.com/jobs/170630/ui-front-end-developer-citrix
http://www.stackoverflow.com/jobs/171885/full-stack-developer-mishipay
http://www.stackoverflow.com/jobs/168402/web-security-architect-in-fintech-big-data-paysafe

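2 Answers

Stack Overflow user

Posted 2018-03-28 06:32:26

I believe i = 0 is the culprit. Please move it outside the outer loop and try again. The job counter is reset at every URL element of the list, so it updates the existing value of the same key (e.g. job 1).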
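The effect this answer describes can be sketched in isolation. The page contents below are made-up placeholders and `collect` is a hypothetical helper, not code from the question. Note that resetting the counter only causes later pages to overwrite the keys of earlier pages. On its own it would not make every value identical, and in the posted code the `return` inside the outer loop ends the function after the first page anyway.

```python
# Placeholder data: two "pages", each with two job entries.
pages = [["a", "b"], ["c", "d"]]

def collect(pages, reset_per_page):
    """Build a dict keyed 'job N', optionally resetting N for each page."""
    data = {}
    i = 0
    for page in pages:
        if reset_per_page:
            i = 0  # mirrors `i = 0` placed inside the outer loop
        for job in page:
            i += 1
            data["job " + str(i)] = job
    return data

overwritten = collect(pages, reset_per_page=True)
kept = collect(pages, reset_per_page=False)
print(overwritten)  # {'job 1': 'c', 'job 2': 'd'}
print(kept)         # {'job 1': 'a', 'job 2': 'b', 'job 3': 'c', 'job 4': 'd'}
```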

Votes: 0

Stack Overflow user

Posted 2018-03-28 07:25:38

Solved.

I don't know how it works, but for the list, using all_jobs_data_list.append(str(jobDetailsFindRes[1])) instead of all_jobs_data_list.append(jobDetailsFindRes[1]) did the job for me.

Similarly, using all_jobs_data_list["job "+str(i)] = str(jobDetailsFindRes[1]) instead of all_jobs_data_list["job "+str(i)] = jobDetailsFindRes[1] gave distinct entries.

If anyone can explain this, I would appreciate it :)
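One plausible explanation, offered as an assumption because `find_job_details` is not shown: the function returns the same dictionary object on every call and modifies it in place. Adding that object to a list or dictionary stores a reference, so every entry reflects the latest contents. `str()` builds a new, immutable string at that moment, which is why the entries became distinct. The sketch below uses a hypothetical stand-in function to show the difference, and uses `copy.deepcopy` as an alternative that keeps each entry as a dictionary instead of a string.

```python
import copy

# Hypothetical: one dict (with a nested list) reused by the crawler.
shared = {"tags": []}

def fake_find_job_details(name, tags):
    # Mutates the shared dict in place and returns the same object each time.
    shared["job_name"] = name
    shared["tags"].clear()
    shared["tags"].extend(tags)
    return [0, shared]

# Made-up inputs for illustration.
jobs = [("UI Developer", ["html5"]), ("Full Stack Developer", ["python", "django"])]

by_reference, as_strings, as_copies = [], [], []
for name, tags in jobs:
    status, details = fake_find_job_details(name, tags)
    by_reference.append(details)             # stores a reference to `shared`
    as_strings.append(str(details))          # new string snapshot
    as_copies.append(copy.deepcopy(details)) # independent dict, nested list included

print([d["job_name"] for d in by_reference])  # ['Full Stack Developer', 'Full Stack Developer']
print([d["job_name"] for d in as_copies])     # ['UI Developer', 'Full Stack Developer']
print([d["tags"] for d in as_copies])         # [['html5'], ['python', 'django']]
```

A shallow copy such as `dict(details)` would separate the top-level keys but still share nested lists like `tags`, which is why the sketch uses `copy.deepcopy`. If the assumption holds, the more direct fix is to create a fresh dictionary inside `find_job_details` on each call.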

Votes: 0

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/49527445
