August 22nd, 2024

How we built Townie – an app that generates fullstack apps

Townie is a beta app that generates full-stack applications using large language models, simplifying software development for non-programmers with instant deployment and features like code editing and multiple application generation.

Townie is a newly redesigned app that generates full-stack applications, currently in beta. Its development leverages recent advances in code generation, particularly large language models (LLMs) like Claude 3.5 Sonnet. This approach lets users create software through conversational interaction, making it accessible even to those without programming knowledge. The app aims to streamline the generation of both frontend and backend components, addressing a significant barrier for non-programmers by enabling instant deployment of generated code.

The author, JP Posma, detailed the prototyping process, which included building a basic version of code generation that could create applications like a Hacker News clone with a backend and database. The prototype, named VALL-E, incorporates features such as code editing, syntax highlighting, and the ability to generate multiple applications simultaneously. Challenges included ensuring database persistence and optimizing the costs of using LLMs. The author also introduced an evaluation system to assess the performance of generated applications, focusing on error detection and functionality. Overall, the project reflects a vision of making the full capabilities of computing accessible to all users.

- Townie is an app that generates full-stack applications using LLMs.

- The app aims to simplify software development for non-programmers by enabling instant deployment.

- The prototype, VALL-E, includes features for code editing and multiple application generation.

- Challenges included database persistence and cost optimization for LLM usage.

- An evaluation system was implemented to assess the functionality of generated applications.

AI: What people are saying
The comments on the Townie app reveal a mix of experiences and concerns regarding AI-generated code.
  • Users appreciate the speed and ease of generating applications, with some successfully creating functional projects quickly.
  • Many encounter limitations with complex projects, where the AI struggles to resolve issues or produces suboptimal code.
  • There are concerns about the security and reliability of AI-generated applications, particularly regarding the AI's decision-making process.
  • Some users express frustration with the need for manual corrections, noting that while AI can handle a majority of the work, the final adjustments can be time-consuming.
  • Questions arise about the technology's flexibility, such as whether it can support various backend languages and if self-hosting is an option.
12 comments
By @devbent - 6 months
One problem that I run into with LLM code generation on large projects is that at some point the LLM hits a problem it just cannot fix no matter how it is prompted. This manifests in a number of ways: sometimes it bounces back and forth between two invalid solutions, while other times it fixes one issue while breaking something else in another part of the code.

Another issue with complex projects is that LLMs will not tell you what you don't know. They will happily go about designing crappy code if you ask them for a crappy solution, and they don't have the ability to recommend a better path forward unless explicitly prompted.

That said, I had Claude generate most of a tile-based 2D pixel art rendering engine[1] for me, but again, once things got complicated I had to go and start hand fixing the code because Claude was no longer able to make improvements.

I've seen these failure modes across multiple problem domains, from CSS (alternating between two broken styles, neither of which came close to fixing the issue) to backend code to rendering code (trying to get character sprites placed correctly on the tiles).

[1] https://www.generativestorytelling.ai/town/index.html notice the tons of rendering artifacts. I've realized I'm going to need to rewrite a lot of how rendering happens to resolve them. Claude wrote 80% of the original code but by the time I'm done fixing everything maybe only 30% or so of Claude's code will remain.

By @hebejebelus - 6 months
Reviving my long-dead account to say that I built a perfectly functional small site to help schedule my dungeons and dragons group within about 5 minutes, on my phone, from my bed. If this isn't the future I don't want to go there. Fantastic work.
By @anonzzzies - 6 months
Whenever there is a 'big new AI model' launch, I try to build one of my side projects fully with the new AI. I do not touch anything myself; I only talk English to it. I do read the generated code and instructions so I can correct them in English, but no code at all. It has worked twice: with the ChatGPT-4 launch and the Sonnet launch. All the others did not manage without significant code or ops help.

It is a very annoying experience even if you know what you are doing. It is still much faster than writing the code yourself, but getting the last 10% right is very frustrating: you spend a day on the first 80%, two days on the next 10%, and a week on the last 10%. If I just jump in and fix the code myself, it is about one day for the same project, which is still amazing (and not imaginable before).

People complaining that it sucks and cannot figure things out are often right. However, it is a lot better than what we had before, which was doing all of this by hand (causing many people to procrastinate over even starting a side project while having thousands in mind every day).

These types of services are important and I like this val.town idea. Well done and keep going.

By @osigurdson - 5 months
LLMs are massively useful, just not in the way that people think they should work. No, you are not the manager with AI doing the work. Instead, it is more like someone to bounce ideas off of, a teacher (who is often wrong) and a reviewer. The human, ironically, is the specialist that can get details right, not AI (at least not for now).
By @deckiedan - 6 months
I just played with townie AI for an hour or so... Very cool! Very fun.

There are still some glitches: occasionally the entire app code would get replaced by just the function the LLM was trying to update. I could fix it by telling it that's what had happened, and it would then fill everything in again... Waiting for the entire app to be rewritten each time was a bit annoying.

It got the initial concepts of the app running very quickly, but then struggled with some CSS, saying it would try a different approach, or apologising for missing things repeatedly... and eventually it told me it would try more radical approaches and wrote inline styles... I wonder if the single-file approach has limitations in that respect.

Very interesting, very fun to play with.

I'm kind of concerned about security with LLM-written apps: you can ask it to do things and it says yes, without really thinking about whether it's a good idea or not.

But cool!

And the more anything helps the internet fill up with small, independent, quirky, creative ideas, the better.

By @wonger_ - 6 months
The author's bit about "IterateGPT" reminds me of this "AI in a loop" post from last year: https://til.simonwillison.net/llms/python-react-pattern

  prompt = """
  You run in a loop of Thought, Action, PAUSE, Observation.
  At the end of the loop you output an Answer
  Use Thought to describe your thoughts about the question you have been asked.
  Use Action to run one of the actions available to you - then  return PAUSE.
  Observation will be the result of running those actions.
  ..."""
Seems like a really powerful technique to have LLMs act on their own feedback.
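For illustration, the Thought/Action/Observation loop in that prompt could be driven by a small harness like the one below. This is a hedged sketch, not code from the linked post: `call_llm`, the `lookup` tool, and the `Action:` line format are assumptions chosen to match the quoted prompt.

```python
import re

# Matches lines like "Action: lookup: python" emitted by the model.
ACTION_RE = re.compile(r"^Action: (\w+): (.*)$", re.MULTILINE)

def lookup(term: str) -> str:
    # Stand-in tool; a real loop would call a search API, run code, etc.
    return {"python": "Python is a programming language."}.get(term.lower(), "unknown")

ACTIONS = {"lookup": lookup}

def react_loop(question: str, call_llm, max_turns: int = 5) -> str:
    """Feed the model its own Observations until it produces an Answer."""
    transcript = f"Question: {question}"
    for _ in range(max_turns):
        reply = call_llm(transcript)
        transcript += "\n" + reply
        match = ACTION_RE.search(reply)
        if not match:
            return reply  # no Action requested: treat the reply as the Answer
        name, arg = match.groups()
        observation = ACTIONS[name](arg)
        transcript += f"\nObservation: {observation}"
    return transcript
```

The `max_turns` cap matters in practice: without it, a model stuck between two bad Actions (as described in the comments above) loops forever.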
By @janpaul123 - 6 months
Post author here! Happy to answer any questions.
By @thelastparadise - 5 months
Couldn't this essentially be used as a training data generator?

E.g. have humans and LLMs generate a bunch of prompts that go into this system, and it spits out a bunch of fully-fledged applications, which can then be used to train an even bigger model.
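The idea in this comment amounts to a filter-then-collect pipeline, which could be sketched like this. All names here (`generate_app`, `passes_checks`) are hypothetical placeholders, not part of Townie.

```python
def build_dataset(prompts, generate_app, passes_checks):
    """Turn prompts into (prompt, completion) training pairs,
    keeping only generations that pass automated checks."""
    dataset = []
    for prompt in prompts:
        code = generate_app(prompt)
        if passes_checks(code):  # filter out broken generations
            dataset.append({"prompt": prompt, "completion": code})
    return dataset
```

The check step is what makes the data usable: without filtering, a generator that fails on hard prompts would teach the bigger model its own failure modes.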

By @syspec - 6 months
Does fullstack here mean using javascript on the backend? Is it able to generate code for other backend languages?
By @01HNNWZ0MV43FF - 6 months
And they can't be self-hosted?