Q: How to filter data with Python and Streamlit

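Stack Overflow user

Asked 2022-08-31 00:39:20

Answers: 1 · Views: 126 · Followers: 0 · Votes: 0

The DataFrame below has a field tech that contains string values separated by -. I want to use this field as tags, so that once the user selects one of the values, the app displays the data grouped by project_title.

Code: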

import streamlit as st
import pandas as pd
import numpy as np

data = {
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp', 'python-RegExp-BeautifulSoup-pandas', 'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer ', 'Python developer', 'Python developer', 'Tech lead']
}

df = pd.DataFrame(data)

# Split the tech column into multiple columns (the longest value has five parts)
df[['tech1','tech2','tech3','tech4','tech5']]=df['tech'].str.split('-', expand=True)

# Create a separate list of unique values for each tech column
tech1 = df["tech1"].unique().tolist()
tech2 = df["tech2"].unique().tolist()
tech3 = df["tech3"].unique().tolist()
tech4 = df["tech4"].unique().tolist()
tech5 = df["tech5"].unique().tolist()

# Concatenate all the lists into one list and drop the empty (None) entries.
tech_all = tech1+tech2+tech3+tech4+tech5
tech_all = list(filter(None, tech_all))

#create multiselect widget that includes the created list of tech
regular_search_term =tech_all
choices = st.multiselect(" ",regular_search_term)

#return the dataframe based on the selected values from the multiselect widget.
df_result_search=df[df.loc[:,"tech1":"tech5"].isin(choices)]

st.write(df_result_search)

The code above does not return the result I want.
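
One likely cause, offered as a sketch rather than a definitive diagnosis: indexing a DataFrame with a boolean DataFrame keeps every row and blanks out the non-matching cells with NaN, instead of dropping rows. Reducing the cell-level mask to one boolean per row with .any(axis=1) selects whole rows. The hard-coded choices list below is an assumption standing in for the value returned by the multiselect widget.

```python
import pandas as pd

data = {
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
}
df = pd.DataFrame(data)

tech_cols = ['tech1', 'tech2', 'tech3', 'tech4', 'tech5']
df[tech_cols] = df['tech'].str.split('-', expand=True)

# Assumption: stand-in for the list returned by st.multiselect
choices = ['pandas']

# Cell-level mask: True only in the cells that hold a selected value
cell_mask = df[tech_cols].isin(choices)

# Boolean-DataFrame indexing keeps the shape and fills non-matches with NaN
masked = df[tech_cols][cell_mask]

# Row-level mask: True when any tech column in the row matches
row_mask = cell_mask.any(axis=1)
df_result_search = df.loc[row_mask, ['project_title', 'tech', 'Role']]

print(df_result_search)
```

With this data, only the rows for Job-detection and Task management & Organizer remain.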

Update, based on @BeRT2me's answer:

regular_search_term =df.tech.unique().tolist()

choices = st.selectbox(" ",regular_search_term)
df.loc[df.tech.eq(choices), 'project_title']

#  the below code doesn't return the correct result
choices = st.multiselect(" ",regular_search_term)
df.loc[df.tech.isin(choices), 'project_title']

python

pandas

1 Answer

Stack Overflow user

Answered 2022-08-31 01:02:10

You probably don't want expand=True here; instead, you want to explode the column.

import pandas as pd

data = {
        'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
        'tech': ['python-RegExp-PyQt5', 'python-RegExp', 'python-RegExp-BeautifulSoup-pandas', 'python-pandas-MS_SQL-CSS/HTML-Javascript'],
        'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead']
}

df = pd.DataFrame(data)

df.tech = df.tech.str.split('-')
df = df.explode('tech', ignore_index=True)

print(df)

# Output:

                  project_title           tech              Role
0                           LSE         python  Junior developer
1                           LSE         RegExp  Junior developer
2                           LSE          PyQt5  Junior developer
3                           DCP         python  Python developer
4                           DCP         RegExp  Python developer
5                 Job-detection         python  Python developer
6                 Job-detection         RegExp  Python developer
7                 Job-detection  BeautifulSoup  Python developer
8                 Job-detection         pandas  Python developer
9   Task management & Organizer         python         Tech lead
10  Task management & Organizer         pandas         Tech lead
11  Task management & Organizer         MS_SQL         Tech lead
12  Task management & Organizer       CSS/HTML         Tech lead
13  Task management & Organizer     Javascript         Tech lead
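
The exploded column also yields the option list for a tag picker directly. A minimal sketch, assuming the options should be unique and sorted (the sorting is an assumption made here, not part of the answer):

```python
import pandas as pd

df = pd.DataFrame({
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
})

# One row per (project, technology) pair
exploded = df.assign(tech=df['tech'].str.split('-')).explode('tech', ignore_index=True)

# Unique technologies; uppercase names sort before lowercase ones
tech_options = sorted(exploded['tech'].unique())
print(tech_options)
```

In a Streamlit app, tech_options could then be passed as the options argument of st.multiselect.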

Now you can look up project titles by a specific technology:

>>> df.loc[df.tech.eq('pandas'), 'project_title']
8                   Job-detection
10    Task management & Organizer
Name: project_title, dtype: object
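
The same lookup extends to several selected values, which is what a multiselect returns. Calling isin on the original, unexploded tech column compares each selection against the whole 'python-RegExp-...' string, so it only matches when the full string is selected; on the exploded column it compares against single technologies. A sketch, assuming a project should match when it uses at least one of the selected technologies; explode_tech and projects_with_any are hypothetical helper names.

```python
import pandas as pd

DATA = {
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
}


def explode_tech(df):
    """Return a copy with one row per (project, technology) pair."""
    return df.assign(tech=df['tech'].str.split('-')).explode('tech', ignore_index=True)


def projects_with_any(exploded, choices):
    """Titles of projects that use at least one of the chosen technologies."""
    mask = exploded['tech'].isin(choices)
    # A project can match several times, so keep each title once
    return exploded.loc[mask, 'project_title'].drop_duplicates().tolist()


exploded = explode_tech(pd.DataFrame(DATA))
print(projects_with_any(exploded, ['PyQt5', 'MS_SQL']))
# ['LSE', 'Task management & Organizer']
```

An empty selection gives an empty list, so the app shows nothing until the user picks a tag.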

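There may be a more efficient way, but this is the best I can think of right now.

Given an iterable of terms, we can find the projects that have all of them like this: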

>>> terms = ('pandas', 'RegExp')
>>> df.groupby('project_title')['tech'].agg(lambda x: all(x.eq(term).any() for term in terms))[lambda x: x].index.to_list()
['Job-detection']
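
An alternative way to express the same all-terms check, sketched with Python sets on the unexploded column instead of the groupby. This is a swapped-in technique, not the answer's method; projects_with_all and render_app are hypothetical names, and the Streamlit wiring assumes the script is launched with streamlit run.

```python
import pandas as pd

DATA = {
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
}


def projects_with_all(df, terms):
    """Titles of projects whose tech string contains every term."""
    wanted = set(terms)
    # Turn 'a-b-c' into {'a', 'b', 'c'} for each project
    tech_sets = df['tech'].str.split('-').apply(set)
    mask = tech_sets.apply(wanted.issubset)
    return df.loc[mask, 'project_title'].tolist()


def render_app():
    """Hypothetical Streamlit wiring; call this at the bottom of the app script."""
    import streamlit as st  # imported lazily so the helper above works without Streamlit

    df = pd.DataFrame(DATA)
    options = sorted(set(df['tech'].str.split('-').explode()))
    choices = st.multiselect("Technologies", options)
    if choices:
        st.write(projects_with_all(df, choices))


print(projects_with_all(pd.DataFrame(DATA), ('pandas', 'RegExp')))
# ['Job-detection']
```

Note that an empty terms iterable matches every project, which is why render_app only filters once something is selected.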

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/73549772
