文章/答案/技术大牛

发布

问获取openpyxl中的子头
EN

Stack Overflow用户

提问于 2021-11-21 14:35:57

回答 1查看 182关注 0票数 0

我目前正在使用openpyxl来使用一个excel文件，该文件具有多个索引(两个级别的头)，并且我试图根据标题的子头来执行操作。我有一些经验。在熊猫身上这样做，但在这个项目中，我必须使用openpyxl，这是我以前经常使用的。我唯一能想到的就是手动方式：

对行进行迭代
将第一行保存为标题，第二行保存为子标题。
做些清洁。
手动将标题及其子标题保存在dics中。然后通过迭代所有的cols在值中进行归档。

我的代码如下：

#reading the excel file
path = r'path to file'
wb = load_workbook(path) #loading the excel table
ws = wb.active #grab the active worksheet


#Setting the doc Header
for h in ws.iter_rows(max_row = 1, values_only = True): #getting the first row (Headers) in the table
    header = list(h)

for sh in ws.iter_rows(min_row = 1 ,max_row = 2, values_only = True):
    sub_header = list(sh)

#removing all of the none Values
header = list(filter(None, header))
sub_header = list(filter(None, sub_header))

print(header)
print(sub_header)
#creating a list of all the Columns in the excel file
col_list = []

for col in ws.iter_cols(min_row=3,min_col = 1): #Iteration over every single row starting from the third row since first two are the headers
    col = [cell.value for cell in col] #Creating a list from each row
    col = list(filter(None, col)) #removing the none values from each row
    col_list.append(col) #creating a list of all rows (starting from the 3d one)
#print (col_list)

但我确信肯定有一个更好的方法，我无法找到在文档或通过检查这个网站。

提前感谢！

最后，我的目标是通过迭代头来自动化代码的这一部分，并在每次代码时使用该头的子头及其值：

#bulding the templates using yattag "yattag.org"
doc , tag , text = Doc().tagtext()

#building the tags of the xml file
with tag("Data"): #root tag
    for row in row_list :
        with tag("Row"):
            with tag("Input"):
                with tag(header[0].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[0] + " the Precentage of Students with "+ sub_header[0] + " is "+ str(row[1]) + " whereas the " + sub_header[1] + " are " +   str(row[2]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[0] + " | " + sub_header[0]+ " | " + str(row[1]) + " | " + sub_header[1] + " | " + str(row[2]))
                with tag(header[1].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[1] + " the Precentage of Students with "+ sub_header[2] + " is "+ str(row[3]) + " whereas the " + sub_header[3] + " are " +   str(row[4]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[1] + " | " + sub_header[2]+ " | " + str(row[3]) + " | " + sub_header[3] + " | " + str(row[4]))
                with tag(header[2].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[2] + " the Precentage of Students with "+ sub_header[4] + " is "+ str(row[5]) + " whereas the " + sub_header[5] + " are " +   str(row[6]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[2] + " | " + sub_header[4]+ " | " + str(row[5]) + " | " + sub_header[5] + " | " + str(row[6]))  
                with tag(header[3].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[3] + " the Precentage of Students with "+ sub_header[6] + " is "+ str(row[7]) + " whereas the " + sub_header[7] + " are " +   str(row[8]) +" and for " + sub_header[8] + str(row[9]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[3] + " | " + sub_header[6]+ " | " + str(row[7]) + " | " + sub_header[7] + " | " + str(row[8]) + " | " + sub_header[8] + " | " + str(row[9]))
                with tag(header[4].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[4] + " the Precentage of Students with "+ sub_header[9] + " is "+ str(row[10]) + " whereas the " + sub_header[10] + " are " +   str(row[11]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[4] + " | " + sub_header[9]+ " | " + str(row[10]) + " | " + sub_header[10] + " | " + str(row[11]))  
                with tag(header[5].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[5] + " the Precentage of Students with "+ sub_header[11] + " is "+ str(row[12]) + " whereas the " + sub_header[12] + " are " +   str(row[13]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[5] + " | " + sub_header[11]+ " | " + str(row[12]) + " | " + sub_header[12] + " | " + str(row[13]))  
                with tag(header[6].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the "  + header[6] + " the Precentage of Students with "+ sub_header[13] + " is "+ str(row[14]) + " whereas the " + sub_header[14] + " are " +   str(row[15]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | "  + header[6] + " | " + sub_header[13]+ " | " + str(row[14]) + " | " + sub_header[14] + " | " + str(row[15]))  
            
#print(doc.getvalue())
result = indent(
    doc.getvalue(),
    indentation='   ',
    indent_text=True
)

#saving the xml file
with open("output.xml", "w") as f:
    f.write(result)

python

excel

openpyxl

multi-index

回答 1

Stack Overflow用户

发布于 2021-11-22 08:40:35

除非潘达完全不在餐桌上，否则我想你也许能做些熊猫和开百叶草的事情。文档提到将openpyxl中的数据读取到熊猫的dataframe：与Pandas和Numpy合作中。你能用：

data = ws.values
df = DataFrame(data[2:,:], index=data[0], columns=data[1])

可能需要对None值进行一些必要的筛选。

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/70055488

复制

相似问题

问获取openpyxl中的子头
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问获取openpyxl中的子头EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问获取openpyxl中的子头
EN