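The dataframe below has a field tech that contains string values separated by -. I want to use this field as a set of tags, so that once the user selects one of the values, the matching data is displayed grouped by project_title.
Code: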
import streamlit as st
import pandas as pd
import numpy as np
data = {
'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
'tech': ['python-RegExp-PyQt5', 'python-RegExp', 'python-RegExp-BeautifulSoup-pandas', 'python-pandas-MS_SQL-CSS/HTML-Javascript'],
'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead']
}
df = pd.DataFrame(data)
#split the tech column into multiple columns
df[['tech1','tech2','tech3','tech4','tech5']]=df['tech'].str.split('-', expand=True)
#create a separate list for each tech column
tech1 = df["tech1"].unique().tolist()
tech2 = df["tech2"].unique().tolist()
tech3 = df["tech3"].unique().tolist()
tech4 = df["tech4"].unique().tolist()
tech5 = df["tech5"].unique().tolist()
#concatenate all the lists into one list.
tech_all = tech1+tech2+tech3+tech4+tech5
tech_all = list(filter(None, tech_all))
#create multiselect widget that includes the created list of tech
regular_search_term =tech_all
choices = st.multiselect(" ",regular_search_term)
#return the dataframe based on the selected values from the multiselect widget.
df_result_search=df[df.loc[:,"tech1":"tech5"].isin(choices)]
st.write(df_result_search)
The code above does not return the result I want.
Based on @BeRT2me's answer:
regular_search_term =df.tech.unique().tolist()
choices = st.selectbox(" ",regular_search_term)
df.loc[df.tech.eq(choices), 'project_title']
# the below code doesn't return the correct result
choices = st.multiselect(" ",regular_search_term)
df.loc[df.tech.isin(choices), 'project_title']
Posted on 2022-08-31 01:02:10
You probably don't want expand=True; you want to explode the column instead.
data = {
'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
'tech': ['python-RegExp-PyQt5', 'python-RegExp', 'python-RegExp-BeautifulSoup-pandas', 'python-pandas-MS_SQL-CSS/HTML-Javascript'],
'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead']
}
df = pd.DataFrame(data)
df.tech = df.tech.str.split('-')
df = df.explode('tech', ignore_index=True)
print(df)
# Output:
                  project_title           tech              Role
0                           LSE         python  Junior developer
1                           LSE         RegExp  Junior developer
2                           LSE          PyQt5  Junior developer
3                           DCP         python  Python developer
4                           DCP         RegExp  Python developer
5                 Job-detection         python  Python developer
6                 Job-detection         RegExp  Python developer
7                 Job-detection  BeautifulSoup  Python developer
8                 Job-detection         pandas  Python developer
9   Task management & Organizer         python         Tech lead
10  Task management & Organizer         pandas         Tech lead
11  Task management & Organizer         MS_SQL         Tech lead
12  Task management & Organizer       CSS/HTML         Tech lead
13  Task management & Organizer     Javascript         Tech lead
Now you can search for project titles by a specific technology:
>>> df.loc[df.tech.eq('pandas'), 'project_title']
8                   Job-detection
10    Task management & Organizer
Name: project_title, dtype: object
There may be a more efficient way, but this is the best I can come up with right now.
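To connect this back to the multiselect in the question, here is a minimal sketch of how the exploded frame could supply both the widget options and an "any of the selected terms" filter. The names `exploded`, `tech_options`, `any_match` and the hard-coded `choices` list are illustrative assumptions; in the app, `choices` would be whatever `st.multiselect` returns.

```python
import pandas as pd

data = {
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
}

# One row per (project, technology) pair, as in the answer above.
exploded = pd.DataFrame(data)
exploded['tech'] = exploded['tech'].str.split('-')
exploded = exploded.explode('tech', ignore_index=True)

# Distinct technologies, sorted, to use as the multiselect options.
tech_options = sorted(exploded['tech'].unique())

# Hypothetical selection standing in for the list returned by st.multiselect.
choices = ['PyQt5', 'MS_SQL']

# Projects that use at least one of the selected technologies.
any_match = (
    exploded.loc[exploded['tech'].isin(choices), 'project_title']
    .unique()
    .tolist()
)
print(any_match)  # ['LSE', 'Task management & Organizer']
```

Calling `unique()` matters here because a project that matches several selected terms would otherwise appear once per matching row.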
Given an iterable of terms, we can find the projects that have all of the terms like so:
>>> terms = ('pandas', 'RegExp')
>>> df.groupby('project_title')['tech'].agg(lambda x: all(x.eq(term).any() for term in terms))[lambda x: x].index.to_list()
['Job-detection']
Source: https://stackoverflow.com/questions/73549772
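As an alternative to the groupby approach, the same "all terms" check can be sketched with Python sets on the original, un-exploded frame. This is only one possible formulation, not the answer's method; `projects_matching` and its `match_all` flag are hypothetical names introduced for illustration.

```python
import pandas as pd

def projects_matching(df, choices, match_all=True):
    """Return project titles whose '-'-separated tech string contains the chosen terms."""
    choices = set(choices)
    if not choices:
        # Nothing selected: return no projects rather than all of them.
        return []
    # Turn each 'a-b-c' string into the set {'a', 'b', 'c'}.
    tech_sets = df['tech'].str.split('-').apply(set)
    if match_all:
        # Keep rows whose technologies include every selected term.
        mask = tech_sets.apply(choices.issubset)
    else:
        # Keep rows that share at least one term with the selection.
        mask = tech_sets.apply(lambda techs: not choices.isdisjoint(techs))
    return df.loc[mask, 'project_title'].tolist()

df = pd.DataFrame({
    'project_title': ['LSE', 'DCP', 'Job-detection', 'Task management & Organizer'],
    'tech': ['python-RegExp-PyQt5', 'python-RegExp',
             'python-RegExp-BeautifulSoup-pandas',
             'python-pandas-MS_SQL-CSS/HTML-Javascript'],
    'Role': ['Junior developer', 'Python developer', 'Python developer', 'Tech lead'],
})

# In a Streamlit app, the second argument would be the list from st.multiselect.
print(projects_matching(df, ['pandas', 'RegExp']))  # ['Job-detection']
```

Because each project occupies a single row in the un-exploded frame, no deduplication or grouping step is needed before returning the titles.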