Parsing text into structured JSON with PHP

Stack Overflow user
Asked 2019-10-16 10:52:37
1 answer, 54 views, 0 followers, 0 votes

I have text like this:

some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete

I need structured JSON like this:

     {
        "data":
            {
                "Team": "Xª",
                "ID": "1234567-89.0123.45.6789",
                "Type": "(YZ)",
                "Date 1": "01/01/2011",
                "Name": "Esbjörn Svensson",
                "Date 2: "02/02/2022",
                "Obs": "Awesome Trio",
                "Date 3": ""
            },
            {
                "Team": "Wª",
                "ID": "0987654-32.1098.76.5432",
                "Type": "(KBoo)",
                "Date 1": "07/09/2013",
                "Name": "Some Full Name",
                "Date 2: "09/07/2017",
                "Obs": "Observation",
                "Date 3": "12/12/2018"
            },
            {
                "Team": "Xª",
                "ID": "4335678-98.7123.95.5689",
                "Type": "",
                "Date 1": "09/10/2010",
                "Name": "Name Here",Name Here
                "Date 2: "08/09/2020",
                "Obs": "Observation",
                "Date 3": ""
            }
     }

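I have searched through a lot of code here, but I can't get it to work the way I need. I tried splitting the text on spaces and on the "ª" character, but that doesn't work.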

foreach ($textsource as &$lista) {
    $y = implode(' ', $lista);
    $x = preg_split(' ', $y);
    $delimiter = '/\ª/';
    $childIndex = array_keys(preg_grep($delimiter, $x));
    $chunks = [];
    $final = [];
    for ($i = 0; $i < count($childIndex); $i++) {
        $chunks[$i]['begin'] = $childIndex[$i];
        if (isset($childIndex[$i+1])) {
            $chunks[$i]['len'] = $childIndex[$i+1] - $childIndex[$i];
        }
    }
    foreach ($chunks as $chunk) {
        if (isset($chunk['len'])) {
            $final[] = array_slice($x, $chunk['begin'], $chunk['len']);
        } else {
            $final[] = array_slice($x, $chunk['begin']);
        }
    }
    echo "<pre>";
    print_r($final);
    echo "</pre>";
}

I'd appreciate your help.

1 Answer

Stack Overflow user

Accepted answer

Posted 2019-10-16 16:10:10

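So I had a go at this, and here is your working solution. By the way, your JSON is invalid: the objects under "data" are not wrapped in an array, the "Date 2 keys are missing their closing quote, and there is stray text after one "Name" value. Check it with jsonlint.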
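If you'd rather check validity from PHP than paste into jsonlint, something along these lines should do. It's only a minimal sketch: `json_error` is a helper name made up for this example, and the two sample strings are cut-down versions of the structure in the question, not the full data.

```php
<?php
// Hypothetical helper (not part of the solution): returns null when $json
// parses, otherwise the decoder's error message.
function json_error(string $json): ?string
{
    json_decode($json);
    return json_last_error() === JSON_ERROR_NONE ? null : json_last_error_msg();
}

// Same shape as in the question: several objects after "data" with no array.
$invalid = '{"data": {"Team": "Xª"}, {"Team": "Wª"}}';

// Wrapping the objects in [...] makes it parse.
$valid = '{"data": [{"Team": "Xª"}, {"Team": "Wª"}]}';

var_dump(json_error($invalid) !== null); // the first string is rejected
var_dump(json_error($valid) === null);   // the second one parses
```

The solution itself follows.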

$text = "some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete";

// Split on the team marker. Every chunk except the last ends with the letter
// of the next record's team (e.g. "... Awesome Trio W").
$arr = explode("ª", $text);

// The last character of each chunk is a team letter. substr() works on bytes,
// so this assumes the team letter is a single ASCII character.
$team_arr = array_map(function ($chunk) { return substr($chunk, -1) . "ª"; }, $arr);
array_pop($team_arr);  // the final chunk is not followed by a team marker
array_shift($arr);     // the first chunk is only the text before the first team

// Return the text that follows $date in $chunk, up to the next digit.
function text_after_date($chunk, $date) {
    if ($date === "") {
        return "";
    }
    $parts = explode($date . " ", $chunk, 2);
    $rest = isset($parts[1]) ? $parts[1] : "";
    return trim(substr($rest, 0, strcspn($rest, "0123456789")));
}

$res = [];
$last = count($arr) - 1;

foreach ($arr as $i => $chunk) {
    // Drop the trailing letter that belongs to the next record's team.
    if ($i < $last) {
        $chunk = substr($chunk, 0, -1);
    }
    $chunk = trim($chunk);

    // The ID is the first space-separated token of the chunk.
    $tokens = explode(' ', $chunk);

    // Type is the first parenthesised group, if any; it keeps its parentheses.
    preg_match('#\((.*?)\)#', $chunk, $match);
    $type = (!empty($match[0]) ? $match[0] : "");

    preg_match_all('/(\d{2})\/(\d{2})\/(\d{4})/', $chunk, $dates);
    $date1 = (!empty($dates[0][0]) ? $dates[0][0] : "");
    $date2 = (!empty($dates[0][1]) ? $dates[0][1] : "");
    $date3 = (!empty($dates[0][2]) ? $dates[0][2] : "");

    $obj = [];
    $obj['Team'] = $team_arr[$i];
    $obj['ID'] = $tokens[0];
    $obj['Type'] = $type;
    $obj['Date 1'] = $date1;
    $obj['Name'] = text_after_date($chunk, $date1);
    $obj['Date 2'] = $date2;
    // In the last chunk, Obs also picks up any trailing text, because
    // nothing in the input marks where the observation ends.
    $obj['Obs'] = text_after_date($chunk, $date2);
    $obj['Date 3'] = $date3;

    $res[] = $obj;
}

// The second argument of json_encode() is a bitmask of flags, not a boolean.
$json_res = json_encode($res, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
print_r($json_res);
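An alternative to the split-and-scan code above is to describe one whole record with a single regular expression and let `preg_match_all` collect the fields. This is only a sketch, not the method from the original answer, and it rests on assumptions taken from the sample text: the team is one letter followed by "ª", the ID consists of digits, dots and hyphens, dates are `dd/mm/yyyy`, and names and observations contain no digits. The group names (`Date1`, `Date2`, `Date3`) and the variable names are made up for this example. The file must be saved as UTF-8 for the `u` modifier to work.

```php
<?php
$text = "some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete";

$date = '\d{2}/\d{2}/\d{4}';

// x: whitespace in the pattern is ignored; u: treat pattern and subject as UTF-8.
$pattern = '~
    (?<Team>\p{L}ª) \s+
    (?<ID>[\d.-]+) \s+
    (?:(?<Type>\([^)]*\)) \s+)?
    (?<Date1>' . $date . ') \s+
    (?<Name>\D+?) \s+
    (?<Date2>' . $date . ') \s+
    (?<Obs>\D+?) (?= \s+(?:' . $date . '|\p{L}ª\s) | \s*$ )
    (?:\s+(?<Date3>' . $date . '))?
~xu';

preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$records = [];
foreach ($matches as $m) {
    $records[] = [
        'Team'   => $m['Team'],
        'ID'     => $m['ID'],
        'Type'   => $m['Type'] ?? '',   // optional group
        'Date 1' => $m['Date1'],
        'Name'   => $m['Name'],
        'Date 2' => $m['Date2'],
        'Obs'    => $m['Obs'],
        'Date 3' => $m['Date3'] ?? '',  // optional group
    ];
}

echo json_encode(['data' => $records], JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
```

Text between records that does not start with a team marker is skipped automatically. The same limitation applies as before: the observation of the last record runs to the end of the input, since nothing in the text marks where it stops.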
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58405070
