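I have a DataFrame with 2 columns: 'Route Name' and 'Date'. 'Route Name' contains the names of various cities, and 'Date' holds the dates on which events were recorded in those cities.
For example, the df filtered for London looks like this: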
Route Name Date
London 2019-10-09
London 2019-10-09
London 2019-10-10
London 2019-10-10
London 2019-10-11
London 2019-10-11
London 2019-10-11
London 2019-10-11
London 2019-10-12
London 2019-10-12
London 2019-10-12
How can I get the number of days for each city? The expected answer looks like this:
Route Name Frequency(days)
London 4
Manchester 5
Glasgow 5
and so on.
Posted on 2020-04-15 04:50:59
Adapted from Quang Hoang's comment, and modified to cover your question about monthly counts:
import pandas as pd
from io import StringIO
string = """Route Name\tDate
London\t2019-10-09
London\t2019-10-09
London\t2019-10-10
London\t2019-10-10
London\t2019-10-11
London\t2019-10-11
London\t2019-10-11
London\t2019-10-11
London\t2019-10-12
London\t2019-10-12
London\t2019-10-12
"""
# Parse the tab-separated sample text, then convert "Date" to datetime
df = pd.read_csv(StringIO(string), sep="\t")
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
print(df.head())
"""
Route Name Date
0 London 2019-10-09
1 London 2019-10-09
2 London 2019-10-10
3 London 2019-10-10
4 London 2019-10-11
"""
# Extract year and month so they can be used as grouping keys
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
print(df.head())
"""
Route Name Date Year Month
0 London 2019-10-09 2019 10
1 London 2019-10-09 2019 10
2 London 2019-10-10 2019 10
3 London 2019-10-10 2019 10
4 London 2019-10-11 2019 10
"""
# Count distinct dates per route and year
annual_df = df.groupby(["Route Name", "Year"])["Date"].nunique()
print(annual_df.head())
"""
Route Name Year
London 2019 4
Name: Date, dtype: int64
"""
# Count distinct dates per route, year, and month
monthly_df = df.groupby(["Route Name", "Year", "Month"])["Date"].nunique()
print(monthly_df.head())
"""
Route Name Year Month
London 2019 10 4
Name: Date, dtype: int64
"""
https://stackoverflow.com/questions/61211464
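For the overall count per city that the question asks for (one row per city, not split by year or month), a minimal sketch could look like the following. It rebuilds the London rows from the question and adds a few Manchester rows that are made up purely for illustration. The column name "Frequency(days)" is taken from the expected output in the question.

```python
import pandas as pd

# London dates are from the question; the Manchester rows are
# hypothetical and only here to show more than one group.
london_dates = [
    "2019-10-09", "2019-10-09",
    "2019-10-10", "2019-10-10",
    "2019-10-11", "2019-10-11", "2019-10-11", "2019-10-11",
    "2019-10-12", "2019-10-12", "2019-10-12",
]
manchester_dates = ["2019-10-09", "2019-10-09", "2019-10-10"]  # made up

df = pd.DataFrame({
    "Route Name": ["London"] * len(london_dates)
                  + ["Manchester"] * len(manchester_dates),
    "Date": pd.to_datetime(london_dates + manchester_dates),
})

# Count the distinct dates for each city, then turn the resulting
# Series back into a two-column DataFrame.
days_per_city = (
    df.groupby("Route Name")["Date"]
      .nunique()
      .reset_index(name="Frequency(days)")
)
print(days_per_city)
```

If the real 'Date' column also carries a time of day, the timestamps would probably need to be truncated first (for example with `df["Date"].dt.normalize()`), so that two events on the same day count as one.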