I am trying to use pyAudioAnalysis to analyze a live audio feed from an HTTP stream. My goal is to use the zero-crossing rate (ZCR) and other methods in this library to detect events in the stream.
pyAudioAnalysis only supports input from files, but converting the HTTP stream to .wav files would create a lot of overhead and temporary-file management, which I want to avoid.
My approach is as follows:
Using ffmpeg, I am able to pipe the raw audio bytes into a subprocess pipe:
import subprocess

song = subprocess.Popen(
    ["ffmpeg", "-i", "https://media-url/example", "-acodec", "pcm_s16le",
     "-ac", "1", "-f", "wav", "pipe:1"],
    stdout=subprocess.PIPE)

I then buffer this data with PyAudio, hoping to be able to use the bytes in pyAudioAnalysis:
import pyaudio

CHUNK = 65536
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                output=True)
data = song.stdout.read(CHUNK)
while len(data) > 0:
    stream.write(data)
    data = song.stdout.read(CHUNK)

However, feeding this data into AudioBasicIO.read_audio_generic() produces an empty numpy array.
Is there an efficient solution to this problem that avoids creating temporary files?
Posted on 2022-03-29 23:49:18
You can try my ffmpegio package:
pip install ffmpegio

import ffmpegio

# read the entire stream
fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='s16')
# fs - sampling rate
# x - [nx1] numpy array

# or read a block at a time:
with ffmpegio.open("https://media-url/example", "ra", blocksize=1024, ac=1, sample_fmt='s16') as f:
    fs = f.rate
    for x in f:
        # x: [1024x1] numpy array (or shorter for the last block)
        process_data(x)

Note that if you need normalized samples, you can set sample_fmt to 'flt' or 'dbl'.
If you prefer to keep your dependencies minimal, the key to calling the ffmpeg subprocess yourself is to request a raw output format:
import subprocess as sp
import numpy as np

song = sp.Popen(["ffmpeg", "-i", "https://media-url/example",
                 "-f", "s16le", "-c:a", "pcm_s16le", "-ac", "1", "pipe:1"],
                stdout=sp.PIPE)
CHUNK = 65536  # bytes per read; at 2 bytes/sample this yields CHUNK/2 samples
data = np.frombuffer(song.stdout.read(CHUNK), np.int16)
while len(data) > 0:
    # process the samples in data here
    data = np.frombuffer(song.stdout.read(CHUNK), np.int16)

I cannot speak for pyAudioAnalysis, but I suspect it expects samples rather than bytes.
https://stackoverflow.com/questions/71669701