A C/C++ library for audio and music analysis
audioFlux is a deep learning library for audio analysis, featuring new pitch algorithms in version 0.1.8. It supports Python 3.6+, with modules for transformations, features, and music information retrieval.
audioFlux is a deep learning library designed for audio and music analysis, feature extraction, and various audio-related tasks. It supports multiple time-frequency analysis methods and is suitable for applications such as classification, separation, music information retrieval (MIR), and automatic speech recognition (ASR). The library emphasizes performance through C implementations and hardware acceleration for FFT, making it effective for large-scale data processing. The latest version, v0.1.8, introduces several pitch algorithms, including YIN, CEP, PEF, NCF, HPS, LHS, STFT, and FFP, along with new algorithms for pitch shifting and time stretching. audioFlux is compatible with Python 3.6 and above and can be installed via PyPI or Anaconda. The main modules include Transform, which offers various time-frequency representation algorithms; Feature, which provides spectral features and cepstrum coefficients; and MIR, which focuses on pitch and onset detection. Comprehensive documentation is available, and contributions are encouraged. The project is licensed under the MIT License.
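To give a feel for what a pitch algorithm does, here is a minimal sketch of a correlation-based estimator written in plain NumPy. It is not audioFlux's implementation and does not use the audioFlux API; the function name `estimate_pitch_autocorr` and its parameters are hypothetical, chosen only for illustration. The idea is that a periodic signal correlates strongly with a copy of itself shifted by one period, so the lag with the highest normalized autocorrelation gives the fundamental frequency.

```python
import numpy as np


def estimate_pitch_autocorr(frame, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Hypothetical helper for illustration only: plain normalized
    autocorrelation, not the algorithm shipped in audioFlux.
    """
    frame = np.asarray(frame, dtype=np.float64)
    frame = frame - frame.mean()  # remove DC offset

    # Full autocorrelation; keep the non-negative lags only.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if corr[0] <= 0:
        return 0.0  # silent frame, no pitch
    corr = corr / corr[0]  # normalize so that lag 0 equals 1

    # Restrict the search to lags matching the allowed pitch range.
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(corr) - 1)
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Usage: a synthetic 220 Hz sine sampled at 16 kHz.
sr = 16000
t = np.arange(2048) / sr
tone = np.sin(2 * np.pi * 220.0 * t)
f0 = estimate_pitch_autocorr(tone, sr)
print(f"estimated f0: {f0:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate is quantized; production estimators typically refine it with interpolation and add voicing decisions, which this sketch omits.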
- audioFlux is a deep learning library for audio analysis and feature extraction.
- Version 0.1.8 includes new pitch algorithms and features for audio manipulation.
- The library can be installed via PyPI or Anaconda for Python 3.6+.
- Main modules cover time-frequency transformations, spectral features, and MIR tasks.
- Contributions to the project are welcome, and it is licensed under the MIT License.
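As a rough illustration of how a time-frequency transformation feeds a spectral feature, the sketch below computes a short-time Fourier transform magnitude and a per-frame spectral centroid using only NumPy. It does not call audioFlux; the helpers `stft_magnitude` and `spectral_centroid`, and their default frame and hop sizes, are assumptions made for this example.

```python
import numpy as np


def stft_magnitude(signal, frame_length=1024, hop_length=256):
    """Magnitude spectrogram, shape (n_frames, frame_length // 2 + 1).

    Hypothetical helper: a bare-bones STFT with a Hann window.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def spectral_centroid(magnitude, sample_rate, frame_length):
    """Magnitude-weighted mean frequency (Hz) of each frame."""
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sample_rate)
    total = magnitude.sum(axis=1)
    total = np.where(total > 0, total, 1.0)  # avoid division by zero
    return (magnitude * freqs).sum(axis=1) / total


# Usage: half a second of a 1 kHz sine sampled at 16 kHz.
sr = 16000
t = np.arange(8000) / sr
tone = np.sin(2 * np.pi * 1000.0 * t)
mag = stft_magnitude(tone)
centroid = spectral_centroid(mag, sr, frame_length=1024)
print(mag.shape, centroid[:3])
```

For a pure tone the centroid sits near the tone's frequency; for real recordings it tracks where the spectral energy is concentrated, which is why it is a common input to classification tasks.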
Related
FileFlows: Execute actions against files in a tree flow structure
FileFlows is a versatile tool for processing various file types like text, images, audio, and video. It supports transcoding, converting, and optimizing files, offering detailed reporting and customization options for users.
Black Forest Labs – FLUX.1 open weights SOTA text to image model
Black Forest Labs has launched to develop generative deep learning models for media, securing $31 million in funding. Their FLUX.1 suite includes three model variants, outperforming competitors in image synthesis.
The open weight Flux text to image model is next level
Black Forest Labs has launched Flux, the largest open-source text-to-image model with 12 billion parameters, available in three versions. It features enhanced image quality and speed, alongside the release of AuraSR V2.
FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention
FlexAttention is a new PyTorch API that enhances flexibility and performance in attention mechanisms, allowing users to implement various attention variants efficiently while leveraging existing infrastructure and improving performance through sparsity.
Forget Midjourney – Flux is the new king of AI image generation
Flux, an open-source AI image generator by Black Forest Labs, competes with Midjourney and Stable Diffusion, offering three versions and a developing text-to-video model for enhanced media production.
- Users are interested in comparisons with existing C++ music information retrieval libraries.
- There are questions about the necessity of GPU-accelerated functions for deep learning applications.
- Some commenters clarify the programming languages used, noting that it is C and Python, not C++.
- There is curiosity about the library's capabilities, such as audio fingerprinting and feature parity with other libraries like librosa.
- One user humorously inquires about a version in Rust, highlighting interest in language compatibility.
- https://github.com/marsyas/marsyas
- https://github.com/ircam-ismm/pipo
- https://github.com/flucoma/flucoma-core/tree/main/include/al...