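I have a TextGrid file generated by the Prosodylab-Aligner, which can be opened in Praat. Is it possible to extract from it a text file similar to the following:
Word in text | Pronunciation started at
Hello 0:0:0.000
my 0:0:1.125
friends 0:0:2.750
The attached TextGrid file: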
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0.0
xmax = 2.53
tiers? <exists>
size = 2
item []:
item [1]:
class = "IntervalTier"
name = "phones"
xmin = 0.0
xmax = 2.53
intervals: size = 13
intervals [1]:
xmin = 0.0
xmax = 0.62
text = "sil"
intervals [2]:
xmin = 0.62
xmax = 0.78
text = "K"
intervals [3]:
xmin = 0.78
xmax = 0.81
text = "L"
intervals [4]:
xmin = 0.81
xmax = 0.92
text = "IH1"
intervals [5]:
xmin = 0.92
xmax = 1.02
text = "K"
intervals [6]:
xmin = 1.02
xmax = 1.07
text = ""
intervals [7]:
xmin = 1.07
xmax = 1.22
text = "T"
intervals [8]:
xmin = 1.22
xmax = 1.31
text = "UW1"
intervals [9]:
xmin = 1.31
xmax = 1.51
text = "S"
intervals [10]:
xmin = 1.51
xmax = 1.67
text = "T"
intervals [11]:
xmin = 1.67
xmax = 1.85
text = "AA1"
intervals [12]:
xmin = 1.85
xmax = 1.88
text = "P"
intervals [13]:
xmin = 1.88
xmax = 2.53
text = "sil"
item [2]:
class = "IntervalTier"
name = "words"
xmin = 0.0
xmax = 2.53
intervals: size = 6
intervals [1]:
xmin = 0.0
xmax = 0.62
text = "sil"
intervals [2]:
xmin = 0.62
xmax = 1.02
text = "CLICK"
intervals [3]:
xmin = 1.02
xmax = 1.07
text = "sp"
intervals [4]:
xmin = 1.07
xmax = 1.31
text = "TO"
intervals [5]:
xmin = 1.31
xmax = 1.88
text = "STOP"
intervals [6]:
xmin = 1.88
xmax = 2.53
text = "sil"
Posted on 2014-07-16 04:23:07
Since this is a Praat file and you can open it in Praat, I think the better solution is to use Praat itself. A script like the following involves far fewer leaps of faith:
form Parse TextGrid...
    sentence File /path/to/your.TextGrid
    integer Tier 2
endform

# Load the TextGrid and count the intervals on the chosen tier
Read from file: file$
intervals = Get number of intervals: tier

# Write a header row, then one row per interval with a non-empty label
writeInfoLine: "Word in text", tab$, "Pronunciation started at"
for i to intervals
    label$ = Get label of interval: tier, i
    if label$ != ""
        start = Get start point: tier, i
        appendInfoLine: label$, tab$, string$(start)
    endif
endfor

If you save this as a script, you can call it from the command line as praat /path/to/your/script.praat "/path/to/your.TextGrid" 2 and get the desired output from stdout.
You can also run it manually and write the output to a file yourself.
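The script prints start times as plain seconds, while the question asks for an h:m:s.mmm layout. A minimal Python sketch of that conversion follows, assuming the script's tab-separated output has been captured as a list of lines. `format_timestamp` and `reformat_rows` are hypothetical helper names, and the sample rows are illustrative rather than captured Praat output.

```python
def format_timestamp(seconds):
    """Format a time in seconds as h:m:s.mmm, e.g. 1.125 -> '0:0:1.125'."""
    # Work in whole milliseconds to avoid floating-point drift.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours}:{minutes}:{secs}.{ms:03d}"


def reformat_rows(lines):
    """Turn 'label<TAB>seconds' rows into 'label h:m:s.mmm' rows.

    Assumes the first line is the header row and skips it.
    """
    rows = []
    for line in lines[1:]:
        label, start = line.rstrip("\n").split("\t")
        rows.append(f"{label} {format_timestamp(float(start))}")
    return rows


# Illustrative rows in the shape the script writes (an assumption).
sample_lines = [
    "Word in text\tPronunciation started at",
    "CLICK\t0.62",
    "TO\t1.07",
    "STOP\t1.31",
]

for row in reformat_rows(sample_lines):
    print(row)
```

Rounding to whole milliseconds first keeps values such as 0.62 from being truncated to 619 ms by floating-point error.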
Posted on 2014-07-01 19:07:12
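The syntax of TextGrid files is a bit odd. For your limited purpose, listing the words and their start points, your parser can be very simple:
The result of this program is:
0.0 0.62 1.02 1.07 1.31 1.88
"sil" "CLICK" "sp" "TO" "STOP" "sil"
Now just pair up the two arrays element by element and you have your table (the numbers are the start points in seconds).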
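A parser of this kind could be sketched in Python as follows. This is an illustration, not the program that produced the output above. `parse_tier` is a hypothetical helper, and it assumes the long TextGrid text format shown in the question, plain decimal times, and labels without embedded double quotes. `SAMPLE` is a cut-down copy of the question's file in which only the first interval of the phones tier is kept.

```python
import re

# Cut-down copy of the question's TextGrid (phones tier shortened).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0.0
xmax = 2.53
tiers? <exists>
size = 2
item []:
item [1]:
class = "IntervalTier"
name = "phones"
xmin = 0.0
xmax = 2.53
intervals: size = 1
intervals [1]:
xmin = 0.0
xmax = 0.62
text = "sil"
item [2]:
class = "IntervalTier"
name = "words"
xmin = 0.0
xmax = 2.53
intervals: size = 6
intervals [1]:
xmin = 0.0
xmax = 0.62
text = "sil"
intervals [2]:
xmin = 0.62
xmax = 1.02
text = "CLICK"
intervals [3]:
xmin = 1.02
xmax = 1.07
text = "sp"
intervals [4]:
xmin = 1.07
xmax = 1.31
text = "TO"
intervals [5]:
xmin = 1.31
xmax = 1.88
text = "STOP"
intervals [6]:
xmin = 1.88
xmax = 2.53
text = "sil"
'''

# One interval: an 'intervals [n]:' header followed by xmin, xmax, text.
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.]+)\s*'
    r'xmax = [\d.]+\s*'
    r'text = "([^"]*)"'
)


def parse_tier(textgrid_text, tier_name):
    """Return (start_seconds, label) pairs for the named interval tier."""
    # Split into per-tier chunks at each numbered 'item [n]:' header.
    for chunk in re.split(r'item \[\d+\]:', textgrid_text):
        name = re.search(r'name = "([^"]*)"', chunk)
        if name and name.group(1) == tier_name:
            return [(float(start), label)
                    for start, label in INTERVAL.findall(chunk)]
    raise ValueError(f"no tier named {tier_name!r}")


# Print one 'label<TAB>start' row per interval of the words tier.
for start, label in parse_tier(SAMPLE, "words"):
    print(f"{label}\t{start}")
```

Selecting the tier by name rather than by position means the sketch does not depend on the words tier being the second item in the file.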
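Keep in mind that "sil" is a meta-label short for "silence" and "sp" is short for "speech pause". Silence at the beginning and end of an utterance is expected, but the speech pause here may be erroneous: the /t/ sound of the word "to" begins with an articulatory closure, which looks very much like a speech pause but is actually part of the sound.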
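If those meta-labels are not wanted in the table, they can be filtered out before printing. A minimal sketch, assuming the intervals are already available as (start, label) pairs. `NON_WORD_LABELS` and `drop_non_words` are hypothetical names, and treating "sp" as a non-word is a choice, given that the pause before "TO" may really belong to the word.

```python
# Labels treated as non-words; the empty label marks an unlabeled interval.
NON_WORD_LABELS = {"sil", "sp", ""}


def drop_non_words(intervals):
    """Keep only the (start, label) pairs whose label is an actual word."""
    return [(start, label) for start, label in intervals
            if label not in NON_WORD_LABELS]


# Start points and labels of the words tier from the question.
words_tier = [
    (0.0, "sil"),
    (0.62, "CLICK"),
    (1.02, "sp"),
    (1.07, "TO"),
    (1.31, "STOP"),
    (1.88, "sil"),
]

for start, label in drop_non_words(words_tier):
    print(f"{label}\t{start}")
```

Dropping "sp" leaves "TO" starting at 1.07; whether its start should instead be moved back to 1.02 is a judgement call that this sketch does not make.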
https://stackoverflow.com/questions/24493126