1
00:00:00,074 --> 00:00:02,564
Previously on Breaking Bad...
2
00:00:02,663 --> 00:00:04,393
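Words...
I need to parse an SRT file with PHP and print every subtitle in the file using variables.
I can't find the right regular expression. While doing this I need to capture the id, time, and subtitle into variables. The printed output must not contain array() wrappers or the like; it has to look the same as the original file.
That is, I have to print it like this: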
$number <br> (e.g. 1)
$time <br> (e.g. 00:00:00,074 --> 00:00:02,564)
$subtitle <br> (e.g. Previously on Breaking Bad...)
By the way, I have this code, but it doesn't match the lines. It needs to be edited, but how?
$srt_file = file('test.srt',FILE_IGNORE_NEW_LINES);
$regex = "/^(\d)+ ([\d]+:[\d]+:[\d]+,[\d]+) --> ([\d]+:[\d]+:[\d]+,[\d]+) (\w.+)/";
foreach($srt_file as $srt){
preg_match($regex,$srt,$srt_lines);
print_r($srt_lines);
echo '<br />';
}
Posted on 2012-07-26 06:01:42
Here is a short state machine that parses the SRT file line by line:
define('SRT_STATE_SUBNUMBER', 0);
define('SRT_STATE_TIME', 1);
define('SRT_STATE_TEXT', 2);
define('SRT_STATE_BLANK', 3);
$lines = file('test.srt');
$subs = array();
$state = SRT_STATE_SUBNUMBER;
$subNum = 0;
$subText = '';
$subTime = '';
foreach($lines as $line) {
switch($state) {
case SRT_STATE_SUBNUMBER:
$subNum = trim($line);
$state = SRT_STATE_TIME;
break;
case SRT_STATE_TIME:
$subTime = trim($line);
$state = SRT_STATE_TEXT;
break;
case SRT_STATE_TEXT:
if (trim($line) == '') {
$sub = new stdClass;
$sub->number = $subNum;
list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
$sub->text = $subText;
$subText = '';
$state = SRT_STATE_SUBNUMBER;
$subs[] = $sub;
} else {
$subText .= $line;
}
break;
}
}
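if ($state == SRT_STATE_TEXT) {
// If the file is missing its trailing blank line, the last sub is still
// pending here, so build it from the values read so far and add it.
$sub = new stdClass;
$sub->number = $subNum;
list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
$sub->text = $subText;
$subs[] = $sub;
}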
print_r($subs);
Result:
Array
(
[0] => stdClass Object
(
[number] => 1
[stopTime] => 00:00:24,400
[startTime] => 00:00:20,000
[text] => Altocumulus clouds occur between six thousand
)
[1] => stdClass Object
(
[number] => 2
[stopTime] => 00:00:27,800
[startTime] => 00:00:24,600
[text] => and twenty thousand feet above ground level.
)
)
You can then loop over the $subs array or access entries by array offset:
echo $subs[0]->number . ' says ' . $subs[0]->text . "\n";
To display all the subs, loop through them and print each one:
foreach($subs as $sub) {
echo $sub->number . ' begins at ' . $sub->startTime .
' and ends at ' . $sub->stopTime . '. The text is: <br /><pre>' .
$sub->text . "</pre><br />\n";
}
Further reading: SubRip Text File Format
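To print each entry in the layout the question asks for (number, time range, text, each followed by a line break), a minimal sketch could look like the following. It assumes objects shaped like the $subs entries above; formatSub() and the sample data are illustrative names and values, not part of the answer.

```php
<?php
// Hypothetical sample data, shaped like the $subs array built by the state machine.
$subs = [
    (object) [
        'number'    => '1',
        'startTime' => '00:00:20,000',
        'stopTime'  => '00:00:24,400',
        'text'      => "Altocumulus clouds occur between six thousand\n",
    ],
    (object) [
        'number'    => '2',
        'startTime' => '00:00:24,600',
        'stopTime'  => '00:00:27,800',
        'text'      => "and twenty thousand feet above ground level.\n",
    ],
];

// formatSub() is an illustrative helper, not a built-in function.
function formatSub(stdClass $sub): string
{
    // Rebuild the time line exactly as it appears in the SRT file.
    $time = $sub->startTime . ' --> ' . $sub->stopTime;

    // rtrim() drops the trailing newline that file() leaves on the text;
    // htmlspecialchars() escapes markup; nl2br() keeps multi-line text readable in HTML.
    $text = nl2br(htmlspecialchars(rtrim($sub->text)));

    return $sub->number . "<br />\n" . $time . "<br />\n" . $text . "<br />\n";
}

foreach ($subs as $sub) {
    echo formatSub($sub);
}
```

Note that htmlspecialchars() will also escape inline tags such as <i> that some SRT files contain; drop that call if the raw markup should pass through unchanged.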
Posted on 2012-07-26 05:55:05
This won't match, because your $srt_file array probably looks like this:
Array
([0] => '1',
[1] => '00:00:00,074 --> 00:00:02,564',
[2] => 'Previously on Breaking Bad...',
[3] => '',
[4] => '2',
...
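)
Your regular expression won't match any of these elements, because it expects the number, time range, and text to all sit on a single line.
If you intend to read the whole file into memory as one long string, use file_get_contents to read the entire file contents into a single string, then use preg_match_all to collect all the regex matches.
Otherwise, you could loop over the array and test each line against different regex patterns to determine whether it is an id, a time range, or text, then act accordingly. You will obviously also need some logic to make sure you receive the values in the right order (id, then time range, then text).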
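The first suggestion (read everything into one string, then run preg_match_all) might be sketched as follows. This is only one possible pattern, under the assumption that timestamps use the usual HH:MM:SS,mmm form and that entries are separated by blank lines; parseSrt() is a hypothetical helper name. With a real file you would pass it the result of file_get_contents('test.srt').

```php
<?php
// parseSrt() is an illustrative helper, not a built-in function.
function parseSrt(string $contents): array
{
    // Normalise Windows/old-Mac line endings so the pattern only deals with "\n".
    $contents = str_replace(["\r\n", "\r"], "\n", $contents);

    // One entry = number line, time line, then text up to a blank line or end of input.
    // m: ^ matches at the start of every line; s: . also matches newlines.
    $pattern = '/^(\d+)\n'
             . '(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\n'
             . '(.*?)(?=\n{2,}|\n*\z)/ms';

    preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);

    $subs = [];
    foreach ($matches as $m) {
        $subs[] = [
            'number'   => (int) $m[1],
            'time'     => $m[2] . ' --> ' . $m[3],
            'subtitle' => $m[4],
        ];
    }
    return $subs;
}

// Sample input taken from the question.
$sample = "1\n00:00:00,074 --> 00:00:02,564\nPreviously on Breaking Bad...\n\n"
        . "2\n00:00:02,663 --> 00:00:04,393\nWords...\n";

$subs = parseSrt($sample);
foreach ($subs as $sub) {
    echo $sub['number'], "<br />\n", $sub['time'], "<br />\n", $sub['subtitle'], "<br />\n";
}
```

Because the text group runs lazily up to the next blank line, subtitles that span several lines stay together in one entry.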
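Posted on 2012-07-26 06:01:13
Use array_chunk() to split the file() array into chunks of 4 lines, then ignore the last entry of each chunk, since it is a blank line. Note that this assumes every subtitle is exactly one line of text: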
foreach( array_chunk( file( 'test.srt'), 4) as $entry) {
list( $number, $time, $subtitle) = $entry;
echo $number . '<br />';
echo $time . '<br />';
echo $subtitle . '<br />';
}
https://stackoverflow.com/questions/11659118