How to analyze an MP3 for beat/drum timestamps, trigger actions, and play it back at the same time (Rust)

Stack Overflow user

Asked on 2020-07-11 17:30:25

Answers: 1 | Views: 318 | Followers: 0 | Votes: 0

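I want to trigger an action (for example, a bright flash of light) whenever a beat or drum hit occurs while an MP3 file is playing. I don't know which procedure/approach I should take in theory.

First, I thought about statically analyzing the MP3 in a first step. The result of the analysis would be the timestamps at which the action should be triggered. Then I start the MP3, and another thread triggers the action at those specific times. This should be easy because I can use the rodio crate for playback. The static analysis part is still hard, though.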
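The "another thread triggers the action at specific times" part of this plan can be sketched with the standard library alone. This is only a minimal illustration under assumptions: the names `fire_at_timestamps` and `spawn_trigger_thread` are hypothetical, the timestamps are assumed to be sorted and already known from the analysis step, and starting playback (e.g. with rodio) is not shown.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `action` once per timestamp, sleeping until each one is due.
/// `timestamps` must be sorted ascending and are measured from the moment
/// this function is called (assumed to coincide with the start of playback).
fn fire_at_timestamps<F: FnMut(usize, Duration)>(timestamps: &[Duration], mut action: F) {
    let start = Instant::now();
    for (index, &due) in timestamps.iter().enumerate() {
        // Sleep only for the remaining time, so small delays do not accumulate.
        if let Some(remaining) = due.checked_sub(start.elapsed()) {
            thread::sleep(remaining);
        }
        action(index, start.elapsed());
    }
}

/// Hypothetical usage: run the trigger loop on its own thread while the
/// calling thread starts playback. Returns how many actions were fired.
fn spawn_trigger_thread(timestamps: Vec<Duration>) -> thread::JoinHandle<usize> {
    thread::spawn(move || {
        let mut fired = 0;
        fire_at_timestamps(&timestamps, |index, at| {
            println!("beat {} at {:?}: flash!", index, at);
            fired += 1;
        });
        fired
    })
}
```

Sleeping relative to a single start `Instant` keeps the trigger times from drifting, but the sketch does not compensate for audio output latency.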

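Analysis algorithm:

My idea is to read the raw audio data of the MP3 with the minimp3 crate and run a fast Fourier transform on it with the rustfft crate. In the spectrum produced by the FFT I should be able to see where low frequencies have a high volume, and those should be the beats of the song.

I tried combining minimp3 and rustfft, but I have absolutely no clue what the data I get really means. I also can't write a test for it.

This is my approach so far: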

use minimp3::{Decoder, Frame, Error};

use std::fs::File;
use std::sync::Arc;
use rustfft::FFTplanner;
use rustfft::num_complex::Complex;
use rustfft::num_traits::{Zero, FromPrimitive, ToPrimitive};

fn main() {
    let mut decoder = Decoder::new(File::open("08-In the end.mp3").unwrap());

    loop {
        match decoder.next_frame() {
            Ok(Frame { data, sample_rate, channels, .. }) => {
                // We only need mono data. `data` is interleaved: data[0] is the first
                // left sample, data[1] the first right sample, and so on, so each
                // group of `channels` samples is averaged into one mono sample.
                let mono_audio: Vec<i16> = data
                    .chunks_exact(channels)
                    .map(|frame| {
                        let sum: i32 = frame.iter().map(|&s| s as i32).sum();
                        (sum / channels as i32) as i16
                    })
                    .collect();
                // unnormalized spectrum; now check where the beat/drums are 
                // by checking for high volume in low frequencies
                let spectrum = calc_fft(&mono_audio);
            },
            Err(Error::Eof) => break,
            Err(e) => panic!("{:?}", e),
        }
    }
}

fn calc_fft(raw_mono_audio_data: &Vec<i16>) -> Vec<i16> {
    // Perform a forward FFT whose size equals the number of input samples

    let len = raw_mono_audio_data.len();

    let mut input:  Vec<Complex<f32>> = vec![];
    //let mut output: Vec<Complex<f32>> = vec![Complex::zero(); 256];
    let mut spectrum: Vec<Complex<f32>> = vec![Complex::zero(); len];

    // from Vec<i16> to Vec<Complex<f32>>
    raw_mono_audio_data.iter().for_each(|val| {
        let compl = Complex::from_i16(*val).unwrap();
        input.push(compl);
    });

    let mut planner = FFTplanner::new(false);
    let fft = planner.plan_fft(len);
    fft.process(&mut input, &mut spectrum);

    // to Vec<i16>
    let mut output_i16 = vec![];
    spectrum.iter().for_each(|val| {
        if let Some(val) = val.to_i16() {
            output_i16.push(val);
        }
    });

    output_i16
}

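My problem is also that the FFT function has no parameter where I could specify the sample_rate (48 kHz). All I get from decoder.next_frame() is a Vec<i16> with 2304 entries.

Any ideas how I can make this work, and what the numbers I currently get actually mean?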

audio

rust

fft

1 Answer

Stack Overflow user

Accepted answer

Answered on 2021-06-27 20:23:46

TL;DR:

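Decouple the analysis from the preparation of the audio data. (1) Read the MP3/WAV data, merge the two channels into mono (easier to analyze), and take slices of the data whose length is a power of 2 (for the FFT; pad with additional zeroes if required). Finally, (2) feed that data to the spectrum-analyzer crate and learn from its code (which is well documented) how the presence of certain frequencies can be obtained from the FFT.

Longer version

Break the problem down into smaller problems/subtasks.

Analysis of audio data in discrete windows => beat: yes or no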

- a "window" is usually a fixed-size view into the on-going stream of audio data
- choose a strategy here: for example a lowpass filter, an FFT, a combination, ... search for "beat detection algorithm" in the literature
    - if you are doing an FFT, you should always extend your data window to the next power of 2 (e.g. fill with zeroes).
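As an illustration of the FFT option, the sketch below pads a window to the next power of two and shows how a bin index maps to a frequency. The FFT itself never needs the sample rate; the rate only enters when bin `k` is interpreted as `k * sample_rate / fft_len` Hz. With the numbers from the question, a 2304-sample frame is padded to 4096 samples, and at 48 kHz each bin is then about 11.72 Hz wide. The function names are hypothetical, and the single-bin DFT is a deliberately naive stand-in for a real FFT crate.

```rust
use std::f32::consts::PI;

/// Copies `window` and appends zeroes until the length is a power of two.
fn pad_to_power_of_two(window: &[f32]) -> Vec<f32> {
    let mut padded = window.to_vec();
    padded.resize(window.len().next_power_of_two(), 0.0);
    padded
}

/// Frequency in Hz that FFT bin `k` corresponds to for an FFT of `fft_len` samples.
fn bin_frequency_hz(k: usize, sample_rate: u32, fft_len: usize) -> f32 {
    k as f32 * sample_rate as f32 / fft_len as f32
}

/// Magnitude of a single DFT bin, computed naively in O(n).
/// An FFT computes the same value for all bins at once, only much faster.
fn dft_bin_magnitude(samples: &[f32], k: usize) -> f32 {
    let n = samples.len() as f32;
    let (mut re, mut im) = (0.0f32, 0.0f32);
    for (i, &x) in samples.iter().enumerate() {
        let phase = -2.0 * PI * k as f32 * i as f32 / n;
        re += x * phase.cos();
        im += x * phase.sin();
    }
    (re * re + im * im).sqrt()
}
```

A large magnitude in the bins that correspond to low frequencies is what "high volume in low frequencies" means in terms of the FFT output.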

Read the MP3, convert it to mono, and then pass the audio samples to the analysis algorithm step by step.

- You can use the **sampling rate** and the **sample index** to calculate the point in time
- => attach "beat: yes/no" to timestamps inside the song
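The two steps above can be sketched as follows. This is a minimal example with hypothetical function names; it assumes interleaved 16-bit samples, which is how the decoder in the question delivers its frames, and a non-zero channel count.

```rust
use std::time::Duration;

/// Averages interleaved frames (L, R, L, R, ...) into one mono sample each.
/// An incomplete trailing frame is ignored. `channels` must not be zero.
fn downmix_to_mono(interleaved: &[i16], channels: usize) -> Vec<i16> {
    interleaved
        .chunks_exact(channels)
        .map(|frame| {
            // Sum in i32 so that adding two i16 values cannot overflow.
            let sum: i32 = frame.iter().map(|&s| i32::from(s)).sum();
            (sum / channels as i32) as i16
        })
        .collect()
}

/// Position in the song of the mono sample with the given index.
fn sample_timestamp(sample_index: u64, sample_rate: u32) -> Duration {
    Duration::from_secs_f64(sample_index as f64 / f64::from(sample_rate))
}
```

If the analysis reports a beat in the window that starts at mono sample `i`, then `sample_timestamp(i, sample_rate)` is the timestamp to attach to it.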

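The analysis part should stay generally usable, so that it works for live audio as well as for files. Music is usually sampled at 44100 Hz or 48000 Hz with 16-bit resolution. All common audio libraries give you an interface to access audio input from the microphone with these properties. If you read an MP3 or WAV file instead, the music (the audio data) is usually in the same format. If you analyze windows of length 2048 at 44100 Hz, for example, each window covers 1/f * n == T * n == n/f == (2048/44100) s == ~46.4 ms. The shorter the time window, the faster your beat detection can react, but the lower your accuracy; it is a trade-off :) Your algorithm can keep knowledge about previous windows and overlap them to reduce noise and wrong data.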
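The window arithmetic and the overlapping of windows can be sketched like this. The function names and the hop size are assumptions chosen for illustration, not something the answer prescribes.

```rust
/// Duration of one analysis window in milliseconds: n / f.
fn window_duration_ms(window_len: usize, sample_rate: u32) -> f64 {
    window_len as f64 / f64::from(sample_rate) * 1000.0
}

/// Splits `samples` into windows of `window_len` samples that start every
/// `hop` samples. With `hop < window_len`, neighbouring windows overlap.
/// A trailing partial window is dropped.
fn overlapping_windows(samples: &[i16], window_len: usize, hop: usize) -> Vec<&[i16]> {
    assert!(window_len > 0 && hop > 0, "window_len and hop must be non-zero");
    samples.windows(window_len).step_by(hop).collect()
}
```

For example, `window_duration_ms(2048, 44100)` is roughly 46.4, and a hop of 1024 with a window length of 2048 gives windows that overlap by half.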

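To see existing code that solves these sub-problems, I suggest the following crates:

https://crates.io/crates/lowpass-filter: a simple low-pass filter to get the low frequencies of a data window => (probably a) beat
https://crates.io/crates/spectrum-analyzer: spectrum analysis of an audio window using a fast Fourier transform, with excellent documentation in the repository about how it is implemented.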
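To give a feeling for the low-pass idea without depending on either crate, here is a standard-library-only sketch. It does not reproduce the API of lowpass-filter or spectrum-analyzer: the single-pole filter, the 120 Hz cutoff, the RMS loudness measure, and all function names are assumptions made for illustration.

```rust
use std::f32::consts::PI;

/// Single-pole (RC) low-pass filter; attenuates content above `cutoff_hz`.
fn lowpass(samples: &[f32], sample_rate: u32, cutoff_hz: f32) -> Vec<f32> {
    let dt = 1.0 / sample_rate as f32;
    let rc = 1.0 / (2.0 * PI * cutoff_hz);
    let alpha = dt / (rc + dt);
    let mut previous = 0.0f32;
    samples
        .iter()
        .map(|&x| {
            // Move the output a fraction `alpha` towards the current input.
            previous += alpha * (x - previous);
            previous
        })
        .collect()
}

/// Root mean square of a window, used here as a simple loudness measure.
fn rms(window: &[f32]) -> f32 {
    if window.is_empty() {
        return 0.0;
    }
    (window.iter().map(|x| x * x).sum::<f32>() / window.len() as f32).sqrt()
}

/// "beat: yes" when the low-frequency loudness of the window exceeds `threshold`.
/// Samples are expected in the range -1.0..=1.0; the 120 Hz cutoff is arbitrary.
fn window_has_beat(window: &[f32], sample_rate: u32, threshold: f32) -> bool {
    rms(&lowpass(window, sample_rate, 120.0)) > threshold
}
```

A fixed threshold like this reacts to any loud bass content, not only to drum hits, so a practical detector would likely compare each window against the recent history instead.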

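With the beat-detector crate there is also a solution that implements pretty much what this question originally asked for. It connects live audio input with the analysis algorithm.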

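Votes: 0

The original page content is provided by Stack Overflow. Translation support was provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link: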

https://stackoverflow.com/questions/62852446
