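This is my first question here. I have files like this (EPF data exported from iTunes), for example the EPF dataset.
Columns are separated by SOH (ASCII character 1) and rows by STX (ASCII character 2) followed by "\n". That all works, but the application descriptions span multiple lines and contain end-of-line characters. So the problem appears when I try to read the file line by line: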
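$fn = fopen("application_stripped","r");
while(! feof($fn)) {
    $result = fgets($fn);
    print_r($result);
}
fclose($fn);
It stops at the first end-of-line (the one inside the description) instead of at the actual end-of-record marker. The input files are very large (up to 4-5 GB). Do you know how to handle this?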
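To make the format concrete, here is a minimal sketch using a made-up two-record sample. The field values and the three-column layout are assumptions for illustration, not real EPF data. It shows that counting newlines over-counts records, while splitting on the two-byte terminator STX + "\n" does not.

```php
<?php
// Hypothetical sample: fields separated by SOH ("\x01"), records ended by
// STX + newline ("\x02\n"). The first description contains a bare newline.
$sample = "1\x01App One\x01Line one\nline two\x02\n"
        . "2\x01App Two\x01Single line\x02\n";

// A line-based reader such as fgets() stops at every "\n".
$newlineCount = substr_count($sample, "\n"); // 3 newlines, but only 2 records

// Split on the real record terminator and drop the empty tail.
$records = array_filter(explode("\x02\n", $sample), 'strlen');

$rows = [];
foreach ($records as $record) {
    // Split each record into its columns.
    $rows[] = explode("\x01", $record);
}

print_r($rows);
```

This only works for data that fits in memory, so it illustrates the format rather than solving the problem for multi-gigabyte files.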
PS: Sorry for my English! :-)
Posted on 2020-03-26 17:31:33
Thanks everyone for the help, it looks like I managed to solve this. Here is the code I ended up with:
$columnBreakpoint = 17;
$handle = @fopen("inputfile.txt", "r");
if ($handle) {
    // Columns of each record, in order:
    #export_date
    #application_id
    #title
    #recommended_age
    #artist_name
    #seller_name
    #company_url
    #support_url
    #view_url
    #artwork_url_large
    #artwork_url_small
    #itunes_release_date
    #copyright
    #description
    #version
    #itunes_version
    #download_size
    $fileSeekPointer = 0;
    while (!feof($handle)) {
        // Read a chunk of up to 10000 bytes (a whole record is assumed to fit in it)
        $result = stream_get_line($handle, 10000);
        $positions = array();
        $pos = -1;
        // Collect the positions of every column separator in the chunk
        while (($pos = strpos($result, "\x01", $pos + 1)) !== false) {
            $positions[] = $pos;
        }
        if (!isset($positions[$columnBreakpoint])) {
            // Too few separators: treat the chunk as the final record and stop
            if ($result !== false && $result !== '') {
                print '------------------'."\n";
                var_dump(rtrim($result, "\n"));
            }
            break;
        }
        // Take the separator at index $columnBreakpoint, because each record must contain at least 17 columns
        $breakpointPos = $positions[$columnBreakpoint];
        // Cut the chunk at that position
        $resultS = substr($result, 0, $breakpointPos);
        // Find the last newline in the substring (the file uses "\n", not the platform's PHP_EOL) and cut there
        $eolPos = strrpos($resultS, "\n");
        $resultS = substr($resultS, 0, $eolPos);
        // Advance to the first byte after the actual end of the record
        $fileSeekPointer += $eolPos + 1;
        // And move the file pointer there
        fseek($handle, $fileSeekPointer);
        print '------------------'."\n";
        var_dump($resultS);
    }
    fclose($handle);
}
https://stackoverflow.com/questions/60846575
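If every record really does end with STX followed by "\n", as the question describes, the seek arithmetic may not be needed: stream_get_line() accepts a delimiter string and reads up to it without returning it. The following is only a sketch under that assumption. The function name readEpfRecords and the $maxRecordBytes limit are hypothetical, and the limit must be larger than the longest record, otherwise a record is returned in pieces. The temporary file stands in for the real input.

```php
<?php
/**
 * Yield one record at a time as an array of columns.
 * $maxRecordBytes is an assumed upper bound on the size of one record.
 */
function readEpfRecords(string $path, int $maxRecordBytes = 1048576): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Read up to the STX + "\n" terminator; the terminator is not returned.
        while (($record = stream_get_line($handle, $maxRecordBytes, "\x02\n")) !== false) {
            if ($record === '') {
                continue; // skip empty fragments
            }
            // Split the record into columns on SOH.
            yield explode("\x01", $record);
        }
    } finally {
        fclose($handle);
    }
}

// Demo on a small made-up file (not real EPF data).
$path = tempnam(sys_get_temp_dir(), 'epf');
file_put_contents(
    $path,
    "1\x01App One\x01Line one\nline two\x02\n2\x01App Two\x01Single line\x02\n"
);
$rows = iterator_to_array(readEpfRecords($path), false);
unlink($path);
print_r($rows);
```

Because the generator holds only one record in memory at a time, the same loop should also be usable on multi-gigabyte files by iterating with foreach instead of collecting everything with iterator_to_array().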