Getting date information from folders on Linux and saving it in a DataFrame column

Stack Overflow user
Asked on 2022-09-29 12:18:57

I'm not quite sure how to ask this question, but here goes.

I have this df:

df

JOB_STREAM_NAME         JOB_NAME                        JOB_Command
0   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_INVE_D     /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh
1   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_EMPF_D     /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh
2   P26_NEXT_NBA_TOD    PP_NEXT_NBA_AS110001_D          /data/app_next_best_action/call_nba_as11.sh
3   P26_AAIN_TOD        PP_AAIN_SPARK_CDLC_ING_DFLT_D   /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing

For each row, I want to get the date (from Linux) of the fourth element in the JOB_Command path.

The folder aanx-dataeng-slas-sysyphus:

[m292121@mz-vl-vb-415 ~]$ ll /data/application/AANX/
total 1348
ldrwxrwsr-x 12 root bgdt 4096 Sep 26 11:30 aanx-dataeng-slas-sysyphus

Here there is no fourth element, so it should take the last one, which is the file call_nba_as11.sh:

[m292121@al-vl-vb-408 ~]$ ll /data/app_next_best_action/call_nba_as11.sh
-rwxrwsr-x 1 root bgdt 371 Sep 20 19:20 /data/app_next_best_action/call_nba_as11.sh

The folder aain-srv-motor-extracao-next:

[m292121@mz-vl-vb-415 ~]$ ll /data/application/AAIN/
total 136
ldrwxrwsr-x 12 root bgdt 4096 Jul 15 10:30 aain-srv-motor-extracao-next

Basically, this is what I want to end up with:

df

JOB_STREAM_NAME         JOB_NAME                        Last_Update         JOB_Command
0   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_INVE_D     2022-09-26 11:30:00 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh
1   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_EMPF_D     2022-09-26 11:30:00 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh
2   P26_NEXT_NBA_TOD    PP_NEXT_NBA_AS110001_D          2022-09-20 19:20:00 /data/app_next_best_action/call_nba_as11.sh
3   P26_AAIN_TOD        PP_AAIN_SPARK_CDLC_ING_DFLT_D   2022-07-15 10:30:00 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing

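I was thinking of splitting JOB_Command into a new column and using that for the lookup, but I still need to figure out how to get the date information.

Any ideas?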


1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-10-02 09:41:29

With the data you provided:

import pandas as pd

df = pd.DataFrame(
    {
        "JOB_STREAM_NAME": [
            "P26_NEXT_MAU_TOD",
            "P26_NEXT_MAU_TOD",
            "P26_NEXT_NBA_TOD",
            "P26_AAIN_TOD",
        ],
        "JOB_NAME": [
            "PP_NEXT_RTBA_MAU_IND_INVE_D",
            "PP_NEXT_RTBA_MAU_IND_EMPF_D",
            "PP_NEXT_NBA_AS110001_D",
            "PP_AAIN_SPARK_CDLC_ING_DFLT_D",
        ],
        "JOB_Command": [
            "/data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh",
            "/data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh",
            "/data/app_next_best_action/call_nba_as11.sh",
            "/data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing",
        ],
    }
)

Here is one way to do it using the pathlib and datetime modules from the Python standard library:

import datetime
import numpy as np
from pathlib import Path


def get_fourth_elem(file_path):
    """Helper function.

    Args:
        file_path: file path as a string.

    Returns:
        absolute path to the fourth element (or last one if shorter) as a Pathlib object.
    """
    file_path_length = len(file_path.strip("/").split("/"))
    file_path = Path(file_path)
    if file_path_length > 4:
        for _ in range(file_path_length - 4):
            file_path = Path(file_path.parent)
        return file_path
    else:
        return file_path
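
As a side note, the same lookup can be sketched more compactly with the `parts` attribute of a path object. This is only an illustrative variant, not part of the approach above: the name `fourth_elem` and the `depth` parameter are made up here, and it assumes every command is non-empty, starts with an absolute POSIX path, and that anything after the first shell token is an argument to be ignored.

```python
import shlex
from pathlib import PurePosixPath


def fourth_elem(command, depth=4):
    """Return the path made of the first `depth` components of a command's script path."""
    # Assumption: the script path is the first shell token; arguments follow it.
    script = shlex.split(command)[0]
    parts = PurePosixPath(script).parts  # e.g. ('/', 'data', 'application', ...)
    # parts[0] is the root '/', so keep depth + 1 entries; shorter paths stay whole.
    return PurePosixPath(*parts[: depth + 1])


print(fourth_elem("/data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing"))
print(fourth_elem("/data/app_next_best_action/call_nba_as11.sh"))
```

The first call prints the `aain-srv-motor-extracao-next` folder and the second prints the full script path, since it has fewer than four components. Wrap the result in `Path(...)` before calling `.exists()` or `.stat()`, because pure paths do not touch the file system.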
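
# Build the Last_Update column from the modification time of the fourth path element.
df["Last_Update"] = df["JOB_Command"].apply(
    lambda x: datetime.datetime.fromtimestamp(
        get_fourth_elem(x).stat().st_mtime
    ).strftime("%Y-%m-%d %H:%M:%S")  # %M is minutes
    # Check the resolved element rather than the raw command,
    # which may carry trailing arguments such as "cdlc_ing".
    if get_fourth_elem(x).exists()
    else np.nan
)
# Put Last_Update before JOB_Command, as in the expected result.
df = df.reindex(columns=["JOB_STREAM_NAME", "JOB_NAME", "Last_Update", "JOB_Command"])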
print(df)
# Output
JOB_STREAM_NAME         JOB_NAME                        Last_Update         JOB_Command
0   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_INVE_D     2022-09-26 11:30:00 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh
1   P26_NEXT_MAU_TOD    PP_NEXT_RTBA_MAU_IND_EMPF_D     2022-09-26 11:30:00 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh
2   P26_NEXT_NBA_TOD    PP_NEXT_NBA_AS110001_D          2022-09-20 19:20:00 /data/app_next_best_action/call_nba_as11.sh
3   P26_AAIN_TOD        PP_AAIN_SPARK_CDLC_ING_DFLT_D   2022-07-15 10:30:00 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing
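
If you do not have access to the `/data` tree, the timestamp step can be tried on a throwaway directory. The following is a minimal sketch under that assumption: `last_update` and `demo-folder` are hypothetical names, and the modification time is pinned with `os.utime` only so that the result is predictable.

```python
import datetime
import os
import tempfile
from pathlib import Path


def last_update(path, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the formatted modification time of `path`, or None if it does not exist."""
    path = Path(path)
    if not path.exists():
        return None
    return datetime.datetime.fromtimestamp(path.stat().st_mtime).strftime(fmt)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "demo-folder"  # hypothetical folder name
    folder.mkdir()
    # Pin access and modification times to a known local timestamp.
    known = datetime.datetime(2022, 9, 26, 11, 30, 0).timestamp()
    os.utime(folder, (known, known))
    stamp = last_update(folder)
    missing = last_update(folder / "does-not-exist")

print(stamp)    # formatted mtime of the demo folder
print(missing)  # None, because the path does not exist
```

Returning `None` for a missing path plays the same role as `np.nan` above: pandas treats both as missing values in an object column.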
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/73895432
