首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >获取openpyxl中的子头

获取openpyxl中的子头
EN

Stack Overflow用户
提问于 2021-11-21 14:35:57
回答 1查看 182关注 0票数 0

我目前正在使用openpyxl来使用一个excel文件,该文件具有多个索引(两个级别的头),并且我试图根据标题的子头来执行操作。我有一些经验。在熊猫身上这样做,但在这个项目中,我必须使用openpyxl,这是我以前经常使用的。我唯一能想到的就是手动方式:

  • 对行进行迭代
  • 将第一行保存为标题,第二行保存为子标题。
  • 做些清洁。
  • 手动将标题及其子标题保存在dics中。然后通过迭代所有的cols在值中进行归档。

我的代码如下:

代码语言:javascript
复制
#reading the excel file
path = r'path to file'
wb = load_workbook(path) #loading the excel table
ws = wb.active #grab the active worksheet


#Setting the doc Header
for h in ws.iter_rows(max_row = 1, values_only = True): #getting the first row (Headers) in the table
    header = list(h)

for sh in ws.iter_rows(min_row = 1 ,max_row = 2, values_only = True):
    sub_header = list(sh)

#removing all of the none Values
header = list(filter(None, header))
sub_header = list(filter(None, sub_header))

print(header)
print(sub_header)
#creating a list of all the Columns in the excel file
col_list = []

for col in ws.iter_cols(min_row=3,min_col = 1): #Iteration over every single row starting from the third row since first two are the headers
    col = [cell.value for cell in col] #Creating a list from each row
    col = list(filter(None, col)) #removing the none values from each row
    col_list.append(col) #creating a list of all rows (starting from the 3d one)
#print (col_list)

但我确信肯定有一个更好的方法,我无法找到在文档或通过检查这个网站。

提前感谢!

最后,我的目标是通过迭代头来自动化代码的这一部分,并在每次代码时使用该头的子头及其值:

代码语言:javascript
复制
#bulding the templates using yattag "yattag.org"
doc , tag , text = Doc().tagtext()

#building the tags of the xml file
with tag("Data"): #root tag
    for row in row_list :
        with tag("Row"):
            with tag("Input"):
                with tag(header[0].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[0] + " the Precentage of Students with "+ sub_header[0] + " is "+ str(row[1]) + " whereas the " + sub_header[1] + " are " +   str(row[2]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[0] + " | " + sub_header[0]+ " | " + str(row[1]) + " | " + sub_header[1] + " | " + str(row[2]))
                with tag(header[1].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[1] + " the Precentage of Students with "+ sub_header[2] + " is "+ str(row[3]) + " whereas the " + sub_header[3] + " are " +   str(row[4]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[1] + " | " + sub_header[2]+ " | " + str(row[3]) + " | " + sub_header[3] + " | " + str(row[4]))
                with tag(header[2].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[2] + " the Precentage of Students with "+ sub_header[4] + " is "+ str(row[5]) + " whereas the " + sub_header[5] + " are " +   str(row[6]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[2] + " | " + sub_header[4]+ " | " + str(row[5]) + " | " + sub_header[5] + " | " + str(row[6]))  
                with tag(header[3].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[3] + " the Precentage of Students with "+ sub_header[6] + " is "+ str(row[7]) + " whereas the " + sub_header[7] + " are " +   str(row[8]) +" and for " + sub_header[8] + str(row[9]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[3] + " | " + sub_header[6]+ " | " + str(row[7]) + " | " + sub_header[7] + " | " + str(row[8]) + " | " + sub_header[8] + " | " + str(row[9]))
                with tag(header[4].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[4] + " the Precentage of Students with "+ sub_header[9] + " is "+ str(row[10]) + " whereas the " + sub_header[10] + " are " +   str(row[11]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[4] + " | " + sub_header[9]+ " | " + str(row[10]) + " | " + sub_header[10] + " | " + str(row[11]))  
                with tag(header[5].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[5] + " the Precentage of Students with "+ sub_header[11] + " is "+ str(row[12]) + " whereas the " + sub_header[12] + " are " +   str(row[13]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[5] + " | " + sub_header[11]+ " | " + str(row[12]) + " | " + sub_header[12] + " | " + str(row[13]))  
                with tag(header[6].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[6] + " the Precentage of Students with "+ sub_header[13] + " is "+ str(row[14]) + " whereas the " + sub_header[14] + " are " +   str(row[15]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[6] + " | " + sub_header[13]+ " | " + str(row[14]) + " | " + sub_header[14] + " | " + str(row[15]))  
            
#print(doc.getvalue())
result = indent(
    doc.getvalue(),
    indentation='   ',
    indent_text=True
)

#saving the xml file
with open("output.xml", "w") as f:
    f.write(result)
EN

回答 1

Stack Overflow用户

发布于 2021-11-22 08:40:35

除非潘达完全不在餐桌上,否则我想你也许能做些熊猫和开百叶草的事情。文档提到将openpyxl中的数据读取到熊猫的dataframe:与Pandas和Numpy合作中。你能用:

代码语言:javascript
复制
data = ws.values
df = DataFrame(data[2:,:], index=data[0], columns=data[1])

可能需要对None值进行一些必要的筛选。

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/70055488

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档