Converting Codebases with LLMs
Mantle discusses using Large Language Models (LLMs) to convert codebases, emphasizing benefits like improved maintainability and performance. They highlight strategies for automating code translation and optimizing the process.
In the article "Working with AI (Part 2): Code Conversion," Mantle describes their approach to converting a prototype project into a production project by leveraging Large Language Models (LLMs). They discuss the challenges of converting a codebase from one language to another and the benefits organizations can reap from such conversions, including improved maintainability, better performance, access to a larger talent pool, and better suitability for production use cases. Mantle's strategy relied on LLMs with context windows of over 1 million tokens to automate the translation of code while engineers focused on high-value tasks. By providing context such as existing code patterns, libraries, screenshots, and already generated code, Mantle optimized the code generation process. They emphasize the importance of compiling comprehensive context so the model can reason about the code, and outline a systematic approach to generating files, working from the backend to the frontend. The article concludes by highlighting the efficiencies gained through LLMs in code conversion and the potential for further improvements as token windows expand and models improve at understanding and generating code.
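The article itself does not include code, but the context-stuffing approach it describes can be sketched roughly as below. The file layout and the `call_llm` helper are hypothetical placeholders, not Mantle's actual tooling.

```python
from pathlib import Path

# Hypothetical helper; stands in for whichever LLM client is actually used.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up an LLM client here")

def build_conversion_prompt(source_file: Path,
                            pattern_examples: list[Path],
                            already_converted: list[Path],
                            library_notes: str) -> str:
    """Assemble a large-context conversion prompt: target-language code
    patterns, library choices, previously generated files, and finally
    the source file to convert."""
    parts = [
        "You are converting a prototype codebase to the production stack.",
        f"Target libraries and conventions:\n{library_notes}",
    ]
    for p in pattern_examples:
        parts.append(f"Example of the target code style ({p.name}):\n{p.read_text()}")
    for p in already_converted:
        parts.append(f"Already converted file ({p.name}):\n{p.read_text()}")
    parts.append(f"Convert this file, keeping behaviour identical:\n{source_file.read_text()}")
    return "\n\n---\n\n".join(parts)
```

Each newly generated file can then be appended to `already_converted`, so later files (e.g. the frontend) are generated with the converted backend already in context.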
Related
Sounds more like a sales pitch than reality. I have seen developers excited to port code from one language to another many times, but only because it was an opportunity to learn something new, do something different for a change, and rewrite old code.
What is the value if it is done automatically, nobody learns anything, and the code is just a transcript of the old one?
The python project is https://github.com/ml-explore/mlx and the converted project is https://github.com/frost-beta/node-mlx
I wrote a long prompt: https://github.com/frost-beta/node-mlx/blob/main/tests/promp...
The first result was almost always bad, but after I manually modified the assistant's answer, the following generations usually went much better.
With the right prompt, it produced extremely clean and workable code.
~20 controller files and over 100 route handlers were converted in about 20 minutes and for about $5.
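A minimal sketch of the workflow described above (hand-correct the first conversion, then replay it as a worked example), assuming an OpenAI-style chat message list; `chat`, the file paths, and the system prompt are placeholders, not the commenter's actual setup.

```python
# Hypothetical stand-in for an LLM chat-completion call.
def chat(messages: list[dict]) -> str:
    raise NotImplementedError("plug in an LLM chat client here")

SYSTEM = "Convert Python (mlx) code to TypeScript matching node-mlx conventions."

first_source = open("python/ops.py").read()   # first file attempted
corrected_first = open("ts/ops.ts").read()    # its manually fixed conversion

def convert(source: str) -> str:
    # Replaying the corrected exchange steers the model toward the fixed
    # style instead of its own (often poor) first draft.
    return chat([
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": f"Convert this file:\n{first_source}"},
        {"role": "assistant", "content": corrected_first},
        {"role": "user", "content": f"Convert this file:\n{source}"},
    ])
```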
The engineering cost of migrating codebases is trending toward zero.
Really? I've only seen that twice in my career, and both times it was because the code was written in the most obsolete tech imaginable.
I have the same comment about the "patterns" that GPT-bros seem to be stuck in all the time. What kind of software are they writing that is 80% duplicated/useless code and 20% business code? They should first read Refactoring by Martin Fowler and try to avoid those mistakes in the future, because it's bad to rely on an AI for what should be their job, i.e. engineering software.
> the database querying layer was quite verbose and greatly exceeded an LLM’s output token limit
No technical details as usual, only high-level stories. And how is it possible to hit that kind of issue nowadays, when most languages have SQL or REST libraries that let you do everything in at most 500 lines of code, even allowing for some duplication? (A chunking sketch follows after this comment.)
Last but not least, the main web site is pretty much an empty page if JS is disabled. They should fix that with an LLM and write a blog post about it; that would be more interesting.
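One plausible workaround for the output-token limit quoted above is to split the oversized file and convert it chunk by chunk. The sketch below is an assumption, not what the article describes: it uses Python-style `def `/`class ` markers and a crude chars/4 token estimate.

```python
def split_for_conversion(source: str, max_output_tokens: int = 4000) -> list[str]:
    """Split a large source file into chunks that each fit an output budget,
    breaking only at top-level definitions so each chunk stays coherent."""
    chunks: list[str] = []
    current: list[str] = []
    estimated = 0

    def flush() -> None:
        if current:
            chunks.append("\n".join(current))
            current.clear()

    for line in source.splitlines():
        # Start a new chunk at a top-level definition once the budget is near.
        if line.startswith(("def ", "class ")) and estimated > max_output_tokens * 0.8:
            flush()
            estimated = 0
        current.append(line)
        estimated += max(1, len(line) // 4)  # rough chars-to-tokens estimate
    flush()
    return chunks
```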
IMHO, the only way this could work is if you have very good test coverage so you can run the tests after each conversion (see the sketch after this comment). Without that, this can easily go off the rails.
https://chatgpt.com/share/5d2245e8-135e-44f4-a204-401e625183...
Other than that, I'm very interested to see how easily open-source libraries could be converted from ecosystem A to ecosystem B.
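A rough sketch of the test-coverage guardrail suggested above: apply each converted file, run the existing suite, and roll back on failure. The `pytest` invocation and the paths are assumptions about the target project.

```python
import subprocess
from pathlib import Path

def apply_and_verify(target: Path, converted_code: str) -> bool:
    """Write a converted file, run the test suite, and roll back if it regresses."""
    backup = target.read_text() if target.exists() else None
    target.write_text(converted_code)
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    if result.returncode == 0:
        return True
    # Roll back and keep the failure output for manual review.
    if backup is not None:
        target.write_text(backup)
    else:
        target.unlink()
    print(f"Conversion of {target} failed tests:\n{result.stdout[-2000:]}")
    return False
```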
Which current models are better than Sonnet for code? (Plain old HTML/JS is my use case, btw.)