Show HN: Local voice assistant using Ollama, transformers and Coqui TTS toolkit
The GitHub project "june" combines Ollama, Hugging Face Transformers, and Coqui TTS Toolkit for a private voice chatbot on local machines. It includes setup, usage, customization details, and FAQs. Contact for help.
The GitHub project "june" is a local voice chatbot merging Ollama, Hugging Face Transformers, and the Coqui TTS Toolkit. It offers a privacy-centric voice interaction solution for local machines. The project covers installation guidelines, usage instructions, customization options, and a FAQ section. For additional information or support, reach out for assistance.
Related
Show HN: Pomoglorbo, a TUI Pomodoro timer for your terminal
A Pomodoro Technique timer, Pomoglorbo, enhances productivity with customizable features like audio settings and work intervals. Users can contribute to the project following guidelines for development and testing.
Groqnotes: Generate structured notes from audio using Groq, Whisper, and Llama3
The GitHub project "Groqnotes" is a streamlit app utilizing Groq, Whisper, and Llama3 to create structured notes from audio content efficiently. It offers rapid transcription, markdown styling, and download options. Access online or set up locally.
Show HN: Feedback on Sketch Colourisation
The GitHub repository contains SketchDeco, a project for colorizing black and white sketches without training. It includes setup instructions, usage guidelines, acknowledgments, and future plans. Users can seek support if needed.
LibreChat: Enhanced ChatGPT clone for self-hosting
LibreChat introduces a new Resources Hub, featuring a customizable AI chat platform supporting various providers and services. It aims to streamline AI interactions, offering documentation, blogs, and demos for users.
Gren 0.4: New Foundations
Gren 0.4 updates its functional language with enhanced core packages, a new compiler, revamped FileSystem API, improved functions, and a community shift to Discord. These updates aim to boost usability and community engagement.
These are easy to make and fun to play with, and it's awesome to have everything local. But it will take more to build something truly usable. A truly natural conversational AI needs to understand the nuances of conversation, most importantly when to speak and when to wait. It also needs to pick up on subtleties of the user's voice that no speech recognizer can output, and it needs more precise control over the output voice than any TTS provides. Audio-to-audio models in the style of GPT-4o are clearly the way forward. (And someday soon, video-to-video models for video calling with a virtual avatar. The step after that is robotics for physical avatars.)
There aren't any open source audio-to-audio models yet, but there are some promising approaches. https://ultravox.ai has the input half at least. https://tincans.ai/slm has a cool approach too.
Docker is a great option if you want lots of people to try out your project, but not many apps in this space come with a Dockerfile.
We are also working on a complete open source stack for ASR+TTS+LLM and will be releasing it shortly.