Concatenating URL pages into a single DataFrame

Stack Overflow user
Asked 2015-06-24 15:01:59
Answers: 1, Views: 1.7K, Followers: 0, Votes: 3

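I'm trying to download historical weather data for a given location. I adapted an example given on FlowingData, but I'm stuck on the last step: how to concatenate multiple DataFrames.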

MWE:

import pandas as pd

frames = pd.DataFrame(columns=['TimeEET', 'TemperatureC', 'Dew PointC', 'Humidity', 'Sea Level PressurehPa',
                               'VisibilityKm', 'Wind Direction', 'Wind SpeedKm/h', 'Gust SpeedKm/h', 'Precipitationmm',
                               'Events', 'Conditions', 'WindDirDegrees', 'DateUTC<br />'])

# Iterate through year, month, and day
for y in range(2006, 2007):
    for m in range(1, 13):
        for d in range(1, 32):

            # Check if leap year
            if y % 400 == 0:
                leap = True
            elif y % 100 == 0:
                leap = False
            elif y % 4 == 0:
                leap = True
            else:
                leap = False

            # Skip day numbers that do not exist in this month
            if m == 2 and leap and d > 29:
                continue
            elif m == 2 and not leap and d > 28:
                continue
            elif m in [4, 6, 9, 11] and d > 30:
                continue

            # Open wunderground.com url
            url = "http://www.wunderground.com/history/airport/EFHK/" + str(y) + "/" + str(m) + "/" + str(d) + "/DailyHistory.html?req_city=Vantaa&req_state=&req_statename=Finlandia&reqdb.zip=00000&reqdb.magic=4&reqdb.wmo=02974&format=1"
            df = pd.read_csv(url, sep=',', skiprows=2)
            frames = pd.concat(df)  # this is the call that fails

This produces an error:

 first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

The desired output is a single DataFrame containing all days, months, and years.


1 Answer

Stack Overflow user

Accepted answer

Answered 2015-06-24 15:08:38

Declare a list outside the loop and append each DataFrame to it. Then, once the loop has finished, concatenate all the dfs into a single df:

import pandas as pd

# Collect one DataFrame per day; no pre-declared empty frame is needed
df_list = []

# Iterate through year, month, and day
for y in range(2006, 2007):
    for m in range(1, 13):
        for d in range(1, 32):

            # Check if leap year
            if y % 400 == 0:
                leap = True
            elif y % 100 == 0:
                leap = False
            elif y % 4 == 0:
                leap = True
            else:
                leap = False

            # Skip day numbers that do not exist in this month
            if m == 2 and leap and d > 29:
                continue
            elif m == 2 and not leap and d > 28:
                continue
            elif m in [4, 6, 9, 11] and d > 30:
                continue

            # Open wunderground.com url
            url = "http://www.wunderground.com/history/airport/EFHK/" + str(y) + "/" + str(m) + "/" + str(d) + "/DailyHistory.html?req_city=Vantaa&req_state=&req_statename=Finlandia&reqdb.zip=00000&reqdb.magic=4&reqdb.wmo=02974&format=1"
            df = pd.read_csv(url, sep=',', skiprows=2)
            df_list.append(df)

# Concatenate once, after the loop, passing the whole list
frames = pd.concat(df_list, ignore_index=True)
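
To try the list-then-concat pattern without hitting the network, here is a minimal sketch that uses in-memory CSV text in place of the downloads. The sample rows, the `DAY_1`/`DAY_2` names, and the `read_all` helper are made up for illustration and are not part of the original code. It also shows, as an optional alternative to the manual leap-year and month-length checks, that `pd.date_range` only produces valid calendar dates.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two daily CSV downloads (made-up values).
DAY_1 = "TimeEET,TemperatureC\n12:20 AM,-3.0\n12:50 AM,-4.0\n"
DAY_2 = "TimeEET,TemperatureC\n12:20 AM,1.0\n"


def read_all(sources):
    """Read each CSV source into a DataFrame and concatenate them once."""
    df_list = [pd.read_csv(src) for src in sources]
    # pd.concat takes an iterable of DataFrames, not a single DataFrame.
    return pd.concat(df_list, ignore_index=True)


frames = read_all([io.StringIO(DAY_1), io.StringIO(DAY_2)])
print(frames.shape)  # (3, 2)

# Passing one DataFrame directly reproduces the error from the question.
try:
    pd.concat(frames)
except TypeError as exc:
    print("TypeError:", exc)

# pd.date_range yields only valid calendar dates, so no leap-year or
# month-length checks are needed when building the per-day URL parts.
days = pd.date_range("2006-01-01", "2006-12-31", freq="D")
url_parts = [f"{d.year}/{d.month}/{d.day}" for d in days]
print(len(url_parts), url_parts[0], url_parts[-1])
```

Each element of `url_parts` could be dropped into the URL where the original code joins `str(y)`, `str(m)`, and `str(d)`. Calling `pd.concat` once at the end also avoids copying the growing result on every iteration.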
Votes: 3
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/31030096
