
Appending a Pandas Series to a DataFrame in a loop

Stack Overflow user
Asked 2022-01-30 15:53:09
Answers: 2 · Views: 45 · Followers: 0 · Votes: 1

I am trying to append the results of an nmap scan to a DataFrame.

def vulnScan(targets):
    portInfo =[]
    columnNames = ["Port","Protocol","State","Service"]
    for target in targets:
        portsDF = pd.DataFrame(columns = columnNames)
        print("Executing: nmap -Pn "+target[1])
        result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
        for line in result.stdout.split("\n"):
            if "/" in line and "Starting" not in line:
                tableInfo = line.split(" ")
                port = tableInfo[0].split("/")[0]
                protocol = tableInfo[0].split("/")[1]
                status = tableInfo[1]
                service = tableInfo[3]
                print(port,protocol,status,service)
                newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
                portsDF = portsDF.append(newRow, ignore_index=True)
                print(tabulate(portsDF, headers="keys",tablefmt='psql'))
        portInfo = portInfo.append([target[0],portsDF])
    print("")
    print(tabulate(portInfo, headers="keys", tablefmt='psql'))

However, as you can see from the output, the DataFrame is never populated.

80 tcp
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
135 tcp open msrpc
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
139 tcp open netbios-ssn
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
443 tcp open https
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
445 tcp open microsoft-ds
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+

+-----------------+-------------------------------------------+
| 0               | 1                                         |
|-----------------+-------------------------------------------|
| DESKTOP-30UOSMD | Empty DataFrame                           |
|                 | Columns: [Port, Protocol, State, Service] |
|                 | Index: []                                 |
+-----------------+-------------------------------------------+

I am not sure what I am missing. I have checked the documentation at https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html, and I believe I am using append() correctly.

Update

Richard's answer seems to work, but it has left portInfo no longer a list: it is now of type NoneType.

def vulnScan(targets):
    portInfo = []
    print(type(portInfo))
    columnNames = ["Port","Protocol","State","Service"]
    for target in targets:
        rows = []
        print("Executing: nmap -Pn "+target[1])
        result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
        for line in result.stdout.split("\n"):
            #This could be improved "/" indicates a row in table output
            if "/" in line and "Starting" not in line:
                tableInfo = line.split(" ")
                port = tableInfo[0].split("/")[0]
                protocol = tableInfo[0].split("/")[1]
                status = tableInfo[1]
                service = tableInfo[3]
                print(port,protocol,status,service)
                newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
                rows.append(newRow)

        portsDF = pd.DataFrame(rows, columns = columnNames)
        print(tabulate(portsDF, headers="keys", tablefmt='psql'))
        portInfo = portInfo.append([target[0],portsDF])
        print(type(portInfo))
        print(portInfo)

Output:

<class 'list'>
Executing: nmap -Pn 192.168.1.86
80 tcp
135 tcp open msrpc
139 tcp open netbios-ssn
443 tcp open https
445 tcp open microsoft-ds
+----+--------+------------+---------+--------------+
|    |   Port | Protocol   | State   | Service      |
|----+--------+------------+---------+--------------|
|  0 |     80 | tcp        |         |              |
|  1 |    135 | tcp        | open    | msrpc        |
|  2 |    139 | tcp        | open    | netbios-ssn  |
|  3 |    443 | tcp        | open    | https        |
|  4 |    445 | tcp        | open    | microsoft-ds |
+----+--------+------------+---------+--------------+
<class 'NoneType'>
None

The portInfo list should contain one list object per host, holding the hostname (a string) and the port information (a DataFrame).


2 Answers

Stack Overflow user

Accepted answer

Answered 2022-01-30 16:10:42

pandas.DataFrame.append does not work in place: it returns a new object, as the docs page you linked says. So you would normally write:

portsDF = portsDF.append(newRow, ignore_index=True)

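However, because you are filling the data in a loop, every such call builds a new DataFrame and rebinds the name portsDF to it rather than modifying the original object, so all existing rows are copied on each iteration.

In this case, I would therefore create a list, append each row to it, and build portsDF from that list once the loop has finished: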

columnNames = ["Port","Protocol","State","Service"]
for target in targets:
    # New code:
    rows = []

    print("Executing: nmap -Pn "+target[1])
    result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
    for line in result.stdout.split("\n"):
        if "/" in line and "Starting" not in line:
            tableInfo = line.split(" ")
            port = tableInfo[0].split("/")[0]
            protocol = tableInfo[0].split("/")[1]
            status = tableInfo[1]
            service = tableInfo[3]
            print(port,protocol,status,service)
            
            newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
            # New Code:
            rows.append(newRow)
    
    # New code:
    portsDF = pd.DataFrame(rows, columns=columnNames)
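
The collect-then-construct pattern above can be tried without running nmap. The following is a minimal sketch, not the answerer's code: SAMPLE_OUTPUT is a made-up stand-in for nmap's port table, and parse_port_lines is a hypothetical helper name. It also splits each line with str.split() and no argument, on the assumption that the column padding in nmap's output is what produced the empty State and Service fields for port 80 in the question.

```python
import pandas as pd

COLUMN_NAMES = ["Port", "Protocol", "State", "Service"]

# Hypothetical sample of nmap's port table; real output may differ.
SAMPLE_OUTPUT = """Starting Nmap ( https://nmap.org )
PORT    STATE SERVICE
80/tcp  open  http
135/tcp open  msrpc
445/tcp open  microsoft-ds
"""


def parse_port_lines(text):
    """Collect one plain list per port line, then build the frame once."""
    rows = []
    for line in text.splitlines():
        if "/" in line and "Starting" not in line:
            # split() with no argument collapses runs of spaces, so
            # column padding does not produce empty fields.
            fields = line.split()
            port, protocol = fields[0].split("/")
            rows.append([port, protocol, fields[1], fields[2]])
    # A single constructor call replaces the per-row DataFrame.append.
    return pd.DataFrame(rows, columns=COLUMN_NAMES)


ports_df = parse_port_lines(SAMPLE_OUTPUT)
print(ports_df)
```

Building the frame once also sidesteps DataFrame.append entirely, which matters on current pandas: the method was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat is the replacement for joining frames.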
Votes: 0

Stack Overflow user

Answered 2022-01-30 18:12:25

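Because a Series is designed to hold atomic values of a single type, avoid using one to represent a row that spans several columns. Instead, build a list of dictionaries and pass it to the DataFrame constructor after the inner loop.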
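
As a small illustration of that point (the values and key names here are invented for the example and use an integer port to make the difference visible), a Series holding a whole row falls back to a single object dtype, while a list of dictionaries gives each column its own dtype:

```python
import pandas as pd

# One Series holding a whole row: mixed values force a single object dtype.
row_as_series = pd.Series(
    [80, "tcp", "open", "http"],
    index=["port", "protocol", "status", "service"],
)
print(row_as_series.dtype)  # object

# A list of dictionaries: each key becomes a column with its own dtype.
rows = [
    {"port": 80, "protocol": "tcp", "status": "open", "service": "http"},
    {"port": 443, "protocol": "tcp", "status": "open", "service": "https"},
]
ports_df = pd.DataFrame(rows)
print(ports_df.dtypes)
```

In the question's code every field is a string, so the dtype difference would not show up there; the dictionary form mainly keeps each value tied to its column name.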

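The following code moves the command-line call into its own function, run_cmd, for organization and possible exception handling. In addition, target[0] can serve as a dictionary key (rather than a list element) that identifies each DataFrame, so the result is a list of dictionaries of DataFrames.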

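import subprocess

import pandas as pd


def run_cmd(t):
    """Run nmap against one target and return its stdout as a list of lines."""
    print("Executing: nmap -Pn " + t)
    result = subprocess.run(
        ['nmap', '-Pn', t],
        universal_newlines=True,
        stdout=subprocess.PIPE
    )

    return result.stdout.split("\n")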
         
def vulnScan(targets): 
    portInfo = []
    for target in targets: 
        rows = [] 
        output_lines = run_cmd(target[1])
        for line in output_lines:
            if "/" in line and "Starting" not in line: 
                tableInfo = line.split(" ")
                d = {
                    "port": tableInfo[0].split("/")[0],
                    "protocol": tableInfo[0].split("/")[1],
                    "status": tableInfo[1],
                    "service": tableInfo[3]
                }
                rows.append(d)

        portDF = {target[0]: pd.DataFrame(rows)}
        portInfo.append(portDF)

    return portInfo
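
Note that this version calls portInfo.append(portDF) without assigning the result, which is likely why it avoids the NoneType problem reported in the question's update. A minimal sketch of the difference, using placeholder strings instead of real DataFrames:

```python
port_info = []

# list.append mutates the list in place and returns None.
result = port_info.append(["DESKTOP-30UOSMD", "ports table goes here"])
print(result)      # None
print(port_info)   # the list now holds one [hostname, table] pair

# Reassigning the return value throws the list away:
lost = []
lost = lost.append(["host", "table"])
print(type(lost))  # <class 'NoneType'>
```

So the line portInfo = portInfo.append([target[0],portsDF]) in the question replaces the list with None after the first target; dropping the assignment keeps portInfo a list.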
Votes: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/70916322
