Q: MATLAB: parsing irregular text with textscan, trouble debugging the format specifier

Stack Overflow user

Asked 2014-06-24 23:38:12

1 answer, 439 views, 0 followers, 0 votes

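I have been scouring Stack Overflow and the MathWorks site for a way to read an irregularly formatted text file into MATLAB with textscan, but have not found a good solution yet.

The text file is formatted as follows: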

// Reference=MNI
// Citation=Beauregard M, 1998
// Condition=Primed - Unprimed Semantic Category Decision
// Domain=Semantics
// Modality=Visual
// Subjects=13
-55 -25 -23
33 -9 -20
// Citation=Beauregard M, 1998
// Condition=Unprimed Semantic Category Decision - Baseline
// Domain=Semantics
// Modality=Visual
// Subjects=13
0 -73 9
-25 -59 47
0 -14 59
8 -18 63
-21 -90 -11
-24 -4 62
24 -93 -6
-21 15 47
-35 -26 -21
9 13 44
// Citation=Binder J R, 1996
// Condition=Words > Tones - Passive
// Domain=Language Perception
// Modality=Auditory
// Subjects=12
-58.73 -12.05 -4.61

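I would like to end up with a cell array along the lines of {nx3 double} {nx1 cellstr} {nx1 double}.

Here the first element of the array holds the 3-D coordinates, the second the citation, the third the condition, the fourth the domain, the fifth the modality, and the sixth the number of subjects.

I would then like to use these cell arrays to organize the data into a structure, so that the coordinates can easily be indexed by each of the features extracted from the text file.

I have tried many approaches, but have only managed to extract the coordinates as strings and the features as a single cell array.

Here is how far I got after searching Stack Overflow and the MathWorks site: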

fid = fopen(fullfile(path2proj,path2loc),'r');
data = textscan(fid,'%s %s %s','HeaderLines',1,...
    'delimiter',{...
        sprintf('// '),...
        'Citation=',...
        'Condition=',...
        'Domain=',...
        'Modality=',...
        'Subjects='});

The code above produces the following output:

data =
    {16470x1 cell}    {16470x1 cell}    {16470x1 cell}

data{1}(1:20)
ans =
    '-55 -25 -23'
    '33 -9 -20'
    '0 -73 9'
    '-25 -59 47'
    '0 -14 59'
    '8 -18 63'
    '-21 -90 -11'
    '-24 -4 62'
    '24 -93 -6'
    '-21 15 47'

data{2}(1:20)
ans =
    ''

data{3}(1:20)
ans =
    'Beauregard M, 1998'
    'Primed - Unprimed Semantic Category Decision'
    'Semantics'
    'Visual'
    '13'
    ''
    'Beauregard M, 1998'
    'Unprimed Semantic Category Decision - Baseline'
    'Semantics'

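Although I can work with the data in this form, it would be nice to understand how to write the format specifier correctly so that each kind of data is extracted into its own cell array. Does anyone have any ideas?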

matlab

text-parsing

textscan

1 Answer

Stack Overflow user

Accepted answer

Answered 2014-06-25 01:31:25

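Assuming the Reference line appears only on the first line of the file, you can do the following to get the desired values from each Citation section.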

% read the file and split it into sections based on Citation
filecontents = strsplit(fileread('data.txt'), '// Citation=');

% iterate through the sections and extract the desired info from each.
% We start from i=2, because section 1 holds only the 'Reference' line.
for i = 2:numel(filecontents)

    % split into lines (tolerating Windows line endings) and trim whitespace
    lines = strtrim(regexp(filecontents{i}, '\r?\n', 'split'));

    % remove empty lines
    lines(cellfun(@isempty, lines)) = [];

    % get values of the fields; the split above already consumed 'Citation='
    citation  = lines{1};
    condition = get_value(lines{2}, 'Condition');
    domain    = get_value(lines{3}, 'Domain');
    modality  = get_value(lines{4}, 'Modality');
    subjects  = str2double(get_value(lines{5}, 'Subjects'));

    % convert each coordinate line to a numeric row and stack into an n-by-3 matrix
    coords = cellfun(@(s) sscanf(s, '%f')', lines(6:end), 'UniformOutput', false);
    coordinates = vertcat(coords{:});

    % now you can save in some global cell,
    % display or process the extracted values as you please.

end
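One way to do the "save and process" step is to collect every section into a struct array and then pick out coordinates by feature. The following is only a minimal sketch under some assumptions: it parses a shortened copy of the sample text from the question held in a string rather than reading data.txt, and the names txt, field, entry, sections and semanticCoords are made up for illustration.

```matlab
% Shortened sample text from the question (each argument becomes one line).
txt = sprintf('%s\n', ...
    '// Reference=MNI', ...
    '// Citation=Beauregard M, 1998', ...
    '// Condition=Primed - Unprimed Semantic Category Decision', ...
    '// Domain=Semantics', ...
    '// Modality=Visual', ...
    '// Subjects=13', ...
    '-55 -25 -23', ...
    '33 -9 -20', ...
    '// Citation=Binder J R, 1996', ...
    '// Condition=Words > Tones - Passive', ...
    '// Domain=Language Perception', ...
    '// Modality=Auditory', ...
    '// Subjects=12', ...
    '-58.73 -12.05 -4.61');

% Hypothetical helper: strip the leading '// Key=' from a line.
field = @(textLine, key) regexprep(textLine, ['^//\s*' key '=\s*'], '');

% Empty struct array with the fields we want to fill.
sections = struct('citation', {}, 'condition', {}, 'domain', {}, ...
                  'modality', {}, 'subjects', {}, 'coordinates', {});

parts = strsplit(txt, '// Citation=');
for i = 2:numel(parts)              % parts{1} is the Reference line
    rows = strtrim(regexp(parts{i}, '\r?\n', 'split'));
    rows(cellfun(@isempty, rows)) = [];

    entry.citation  = rows{1};
    entry.condition = field(rows{2}, 'Condition');
    entry.domain    = field(rows{3}, 'Domain');
    entry.modality  = field(rows{4}, 'Modality');
    entry.subjects  = str2double(field(rows{5}, 'Subjects'));

    % One numeric 1x3 row per coordinate line, stacked into n-by-3.
    coords = cellfun(@(s) sscanf(s, '%f')', rows(6:end), 'UniformOutput', false);
    entry.coordinates = vertcat(coords{:});

    sections(end+1) = entry; %#ok<SAGROW>
end

% Index coordinates by a feature: all coordinates in the 'Semantics' domain.
isSemantic = strcmp({sections.domain}, 'Semantics');
semanticCoords = vertcat(sections(isSemantic).coordinates);
disp(semanticCoords)
```

The same logical-mask pattern should work for any other field, for example strcmp({sections.modality}, 'Auditory'), or [sections.subjects] > 12 for the numeric one.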

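where get_value is:

function value = get_value(line, search_for)
    % Return the text that follows 'search_for=' on the given line.
    % With 'once', regexp returns the tokens of the first match directly,
    % so tokens{1} is the captured string rather than a nested cell.
    tokens = regexp(line, [search_for, '=(.+)'], 'tokens', 'once');
    value = strtrim(tokens{1});
end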
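As a side note, the shape of what regexp returns for 'tokens' is easy to trip over, so here is a small sketch of it. The sample line comes from the question's file; the variable names are illustrative only.

```matlab
textLine = '// Domain=Language Perception';   % sample line from the question
key      = 'Domain';

% Without 'once': one cell per match, each holding a cell array of tokens.
allTokens = regexp(textLine, [key '=(.+)'], 'tokens');
nested    = allTokens{1};       % still a 1x1 cell, not a string
value1    = allTokens{1}{1};    % the captured text as a char array

% With 'once': the tokens of the first match are returned directly.
firstTokens = regexp(textLine, [key '=(.+)'], 'tokens', 'once');
value2      = firstTokens{1};   % the captured text as a char array

fprintf('%s | %s\n', value1, value2);
```

So if a char value is wanted, either index twice or pass 'once'.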

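Hope this helps.

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link: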

https://stackoverflow.com/questions/24398021
