Audapolis: Edit audio files by word, not waveform
The Audapolis project on GitHub offers a tailored editor for spoken-word media with audio-to-text transcription. It supports various media types, works on Windows, Linux, and macOS, and stores data locally. Funding sources include governmental and foundation support.
The audapolis project on GitHub offers an editor tailored for spoken-word media, featuring a word-processor-like interface and automatic audio-to-text transcription. It supports editing various media types, including video, audio, and mixed media, and is compatible with Windows, Linux, and macOS. Notably, all data is stored locally without cloud storage. Users can download the latest version from the provided link and report bugs or provide feedback through the GitHub repository. A survey is also available for users to share their needs and expectations. The project received funding from September 2021 to February 2022 from the "Bundesministerium für Bildung und Forschung", Prototype Fund, and Open Knowledge Foundation Deutschland.
Related
Groqnotes: Generate structured notes from audio using Groq, Whisper, and Llama3
The GitHub project "Groqnotes" is a Streamlit app that uses Groq, Whisper, and Llama3 to create structured notes from audio content. It offers rapid transcription, markdown styling, and download options. It can be accessed online or set up locally.
That Editor
The GitHub repository hosts a DOS-like editor created for video production, not ideal for general use. It reflects historical hardware and software limitations, tailored for specific vintage computing requirements.
Show HN: AI assisted image editing with audio instructions
The GitHub repository hosts "AAIELA: AI Assisted Image Editing with Language and Audio," a project enabling image editing via audio commands and AI models. It integrates various technologies for object detection, language processing, and image inpainting. Future plans involve model enhancements and feature integrations.
Transcribro: On-device Accurate Speech-to-text
The GitHub repository for "Transcribro" offers project details, downloads, community links, contribution guidelines, donations, branding guidelines, and keyboard UI screenshots. Contact for project-specific support or inquiries.
Audacity 3.6
Audacity 3.6 brings master effects, a new compressor, and limiter with gain reduction history. It offers factory presets, dark and light themes, improved performance, and custom theme installation. Users can switch themes in Preferences. Audacity is a free, open-source audio editor for various operating systems.
- Some users appreciate it as a free alternative to Descript and praise its open-source nature.
- Several users suggest improvements, such as adding a demo video and supporting modern speech recognition models like Whisper.
- There are comparisons to other tools like Adobe's demo, Hindenburg, and iOS voice memos, highlighting similar functionalities.
- Some users express skepticism about the practicality of text-based audio editing for serious audio work.
- Comments also touch on the project's funding, with some noting the support from the German government.
EDIT: I could also definitely see Audapolis being useful if you could integrate it into a podcast's post-processing flow (volume normalization, de-essing) by recognizing certain verbal tics, such as "ummmm...", and automatically removing them from the audio.
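The filler-removal idea in the comment above can be sketched in a few lines. This is only an illustration, not how Audapolis works: it assumes a transcription step has already produced word-level timestamps, and the `Word` class, the `FILLERS` set, and the `keep_segments` function are hypothetical names made up for this example. The output is a list of time ranges to keep, which an audio tool could then concatenate.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its position in the audio (hypothetical format)."""
    text: str
    start: float  # seconds
    end: float    # seconds


# Filler words in "collapsed" form (repeated letters reduced to one).
# This list is an assumption for the example, not a complete inventory.
FILLERS = {"um", "uh", "er", "hm"}


def normalize(token: str) -> str:
    """Lowercase, drop non-letters, collapse repeated letters ('ummmm...' -> 'um')."""
    collapsed = []
    for ch in token.lower():
        if ch.isalpha() and (not collapsed or collapsed[-1] != ch):
            collapsed.append(ch)
    return "".join(collapsed)


def keep_segments(words, duration):
    """Return (start, end) ranges of audio that remain after cutting filler words."""
    segments = []
    cursor = 0.0  # start of the next range to keep
    for w in words:
        if normalize(w.text) in FILLERS:
            if w.start > cursor:
                segments.append((cursor, w.start))
            cursor = max(cursor, w.end)
    if cursor < duration:
        segments.append((cursor, duration))
    return segments


words = [
    Word("So", 0.0, 0.4),
    Word("ummmm...", 0.4, 1.1),
    Word("hello", 1.1, 1.6),
    Word("uh", 1.6, 1.9),
    Word("world", 1.9, 2.5),
]
print(keep_segments(words, duration=2.5))
```

A real workflow would probably add short crossfades at each cut and let the user review the proposed removals, since collapsing repeated letters can also match legitimate words.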
I've always liked the idea of Descript and was considering building something similar before it came out. The problem is that my use case is a couple of videos a year, so it doesn't fit with an expensive monthly subscription.
Take a look at https://matcha.video
This functionality is one of my favorite things about editing videos in Descript. It’s so much easier than chopping up waveforms in Audacity.
[0] descript.com/
> Hindenburg’s manuscript feature gives you a complete overview of your audio. You can select the text just as you would in a text document and watch as your edits are made in real-time. If you need to export your text in a specific format, no problem. Hindenburg supports the most common text and transcription export formats.
I built something similar here: https://bigwav.app
A number of comments turned me on to Descript. I made a similar comment on another audio thread recently: it drives me absolutely insane how all audio tools with any AI are web-based monthly SaaS instead of an offline, private, GPU-based upfront purchase.
Is 1 emoji for each commit title a new trend?
Can anyone clarify if this project is active?