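I am trying to extract audio features for each video frame. I know the video file has 30 video frames and 16000 audio samples per second. I am using the pyAudioAnalysis Python library to do this, but without success. Here is my code.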
from __future__ import print_function
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures,MidTermFeatures
import matplotlib.pyplot as plt
import os,shlex, subprocess
import pandas as pd
import numpy as np
# -acodec pcm_s16le -vn -ar 16000
command_line = "ffmpeg -i test.mp4 -ac 1 -ar 16000 -vn test_mono.wav"
#command_line = "ffmpeg -i test.mp4 -ab 160k -ac 2 -ar 44100 -vn test_stereo.wav"
args = shlex.split(command_line)
print(args)
processResult = subprocess.call(args) # Success!
print(processResult)
[SamplingRate, signals] = audioBasicIO.read_audio_file("test_mono.wav")
print(SamplingRate)
Output:
['ffmpeg', '-i', 'test.mp4', '-ac', '1', '-ar', '16000', '-vn', 'test_mono.wav']
1
16000
#mid_feature_extraction(signal, sampling_rate, mid_window, mid_step,short_window, short_step):
MidFeatures,ShortFeatures,MidFeatureLabels=MidTermFeatures.mid_feature_extraction(signals, SamplingRate, 0.043*SamplingRate,
0.043*SamplingRate,0.00016*SamplingRate,
0.00016*SamplingRate)
print('Mid Features Extr Success')
MidFeatures_Dataframe = pd.DataFrame(data=MidFeatures.transpose(), columns=MidFeatureLabels)
print(type(MidFeatures))
#print(MidFeatures)
print(type(ShortFeatures))
#print(ShortFeatures)
print(type(MidFeatureLabels))
#print(MidFeatureLabels)
MidFeatures_Dataframe.to_csv('Audio_MidFeatures.csv')
print('Mid Features File Success')
Output:
<class 'list'>
Mid Features File Success
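By my calculation I should get 338 rows of audio features. After a long struggle I get 326 rows with the parameters above, but I still don't understand why. Could someone help me understand how the window and step work here? I know the basic concept of window and step as used in CNNs, but not in this case.
Answered on 2020-08-25 13:03:22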
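I don't know all of your calculations, but according to the documentation in the code, short_window and short_step should be given in samples (and probably mid_window and mid_step as well).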
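As a rough illustration of what "in samples" means, the sketch below converts a duration in seconds to a whole number of samples. The helper `seconds_to_samples` is a hypothetical name, not part of pyAudioAnalysis, and the 50 ms window is only an assumed example value. The 16000 Hz rate and 30 frames per second come from the question.

```python
def seconds_to_samples(duration_s, sampling_rate):
    """Convert a duration in seconds to a whole number of samples (hypothetical helper)."""
    return int(round(duration_s * sampling_rate))


sampling_rate = 16000  # audio samples per second, from the question
video_fps = 30         # video frames per second, from the question

# Audio samples that span one video frame (16000 / 30 is about 533.3).
samples_per_video_frame = seconds_to_samples(1.0 / video_fps, sampling_rate)

# An assumed 50 ms analysis window, expressed in samples.
window_50ms = seconds_to_samples(0.050, sampling_rate)

print(samples_per_video_frame, window_50ms)
```

Rounding explicitly, rather than leaving a float for the library to truncate, makes it obvious which integer window and step are actually used.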
However, in your code:
# mid_feature_extraction(
#     signal, sampling_rate,
#     mid_window, mid_step,
#     short_window, short_step
# )
midFeat, shortFeat, midFeatLabels = MidTermFeatures.mid_feature_extraction(
    signals, SamplingRate,
    0.043*SamplingRate, 0.043*SamplingRate,
    0.00016*SamplingRate, 0.00016*SamplingRate
)
short_step = short_window = 0.00016 * 16000 = 2.56, which does not appear to be a valid number of samples. It will therefore be converted to an integer and become 2 instead of 2.56.
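To see the effect of that truncation, here is a small sketch. `count_frames` is a hypothetical helper that uses a generic sliding-window count (full windows only); pyAudioAnalysis may count frames slightly differently internally, and the one-second signal length is an assumption chosen for easy arithmetic.

```python
def count_frames(num_samples, window, step):
    """Number of full windows of `window` samples taken every `step` samples."""
    if num_samples < window:
        return 0
    return 1 + (num_samples - window) // step


sampling_rate = 16000
requested = 0.00016 * sampling_rate  # about 2.56, not a whole number of samples
truncated = int(requested)           # int() drops the fraction, giving 2

num_samples = sampling_rate          # assumed: one second of audio

# Frame count one might expect from a 2.56-sample step (about 6250).
intended_frames = num_samples / requested

# Frame count with the truncated 2-sample window and step.
actual_frames = count_frames(num_samples, truncated, truncated)

print(truncated, round(intended_frames), actual_frames)
```

Under these assumptions the short-term frame count differs noticeably from what the fractional step suggests, which is one plausible reason the number of rows does not match the expected 338.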
Hope this helps.
https://datascience.stackexchange.com/questions/80743