AI-powered conversion from Enzyme to React Testing Library
Slack engineers transitioned from Enzyme to React Testing Library due to React 18 compatibility issues. They used AST transformations and LLMs for automated conversion, achieving an 80% success rate.
In the world of frontend development, Slack engineers faced the challenge of transitioning from Enzyme to React Testing Library (RTL) because Enzyme does not support React 18. With over 15,000 tests to convert, they explored automated solutions using Abstract Syntax Tree (AST) transformations and Large Language Models (LLMs). AST transformations required manually written rules to be accurate, while LLMs showed promise but lacked consistency. A hybrid approach combining AST and LLM techniques achieved an 80% conversion success rate. To improve the process further, the team collected rendered DOM trees as contextual information and used prompts plus AST-derived instructions to constrain the LLM. Despite the complexity and the need for manual intervention in some cases, the project significantly sped up the migration, showing the value of combining human insight with automated tooling for large code migrations.
Related
We no longer use LangChain for building our AI agents
Octomind switched from LangChain due to its inflexibility and excessive abstractions, opting for modular building blocks instead. This change simplified their codebase, increased productivity, and emphasized the importance of well-designed abstractions in AI development.
Exposition of Front End Build Systems
Frontend build systems are crucial in web development, involving transpilation, bundling, and minification steps. Tools like Babel and Webpack optimize code for performance and developer experience. Various bundlers like Webpack, Rollup, Parcel, esbuild, and Turbopack are compared for features and performance.
Software Engineering Practices (2022)
Gergely Orosz sparked a Twitter discussion on software engineering practices. Simon Willison elaborated on key practices in a blog post, emphasizing documentation, test data creation, database migrations, templates, code formatting, environment setup automation, and preview environments. Willison highlights the productivity and quality benefits of investing in these practices and recommends tools like Docker, Gitpod, and Codespaces for implementation.
My weekend project turned into a 3 years journey
Anthony's note-taking app journey spans 3 years, evolving from a secure Markdown tool to a complex Electron/React project with code execution capabilities. Facing challenges in store publishing, he prioritizes user feedback and simplicity, opting for a custom online deployment solution.
Homegrown Rendering with Rust
Embark Studios develops a creative platform for user-generated content, emphasizing gameplay over graphics. They leverage Rust for 3D rendering, introducing the experimental "kajiya" renderer for learning purposes. The team aims to simplify rendering for user-generated content, utilizing Vulkan API and Rust's versatility for GPU programming. They seek to enhance Rust's ecosystem for GPU programming.
This is basically our whole business at grit.io and we also take a hybrid approach. We've learned a fair amount from building our own tooling and delivering thousands of customer migrations.
1. Pure AI is likely to be inconsistent in surprising ways, and it's hard to iterate quickly. Especially on a large codebase, you can't interactively re-apply the full transform a bunch.
2. A significant reason syntactic tools (like jscodeshift) fall down is just that most codemod scripts are pretty verbose and hard to iterate on. We ended up open-sourcing our own codemod engine[1], which has its own warts, but the declarative model makes handling exception cases much faster (see the sketch after this list for the kind of imperative transform you'd otherwise write).
3. No matter what you do, you need an interactive feedback loop. We do two levels of iteration/feedback: (a) automatically run tests and verify/edit transformations based on their output, and (b) present candidate files for approval/feedback and actually feed that feedback back into your transformation engine.
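For context on point 2, here's a minimal sketch of what a jscodeshift-style transform looks like. Even a trivial import rewrite takes this much ceremony, and a real Enzyme-to-RTL migration needs dozens of rules like it; this is purely illustrative and not our actual tooling.

    import type { API, FileInfo } from 'jscodeshift';

    // Illustrative transform: rewrite `import { mount } from 'enzyme'`
    // to `import { render } from '@testing-library/react'`.
    export default function transformer(fileInfo: FileInfo, api: API): string {
      const j = api.jscodeshift;
      const root = j(fileInfo.source);

      root
        .find(j.ImportDeclaration, { source: { value: 'enzyme' } })
        .forEach((path) => {
          path.node.source = j.literal('@testing-library/react');
          path.node.specifiers = [j.importSpecifier(j.identifier('render'))];
        });

      return root.toSource();
    }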
[0] https://slack.engineering/balancing-old-tricks-with-new-feat...
Reading that leads me to believe that 22% of the conversions succeeded and someone at Slack is making up numbers about developer time.
I wonder how much time or money it would take to just update Enzyme to support React 18? (Fork it, or, god forbid, support development of the actual project.)
Nah, let's play with LLMs instead, and retask all the frontend teams in the company to rewrite unit tests in a new framework we won't support either.
I guess when you're swimming in pools of money there's no need to do reasonable things.
Each test made assertions about a rendered DOM from a given React component.
Enzyme’s API lets you query a snippet of rendered DOM using a traditional selector, e.g. get the text of the DOM node with id="foo". RTL’s API instead requires you to say something like "get the text of the second header element" and prevents you from using selectors.
To do the transformation successfully, you have to run the tests first to render each snippet, then have some system that takes those rendered snippets plus the Enzyme code that queries them and converts the Enzyme code into roughly equivalent RTL calls.
That’s what the LLM was tasked with here.
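As a rough illustration of the gap the conversion has to bridge, here is the same check written both ways (the component, its markup, and the text are made up for the example, and the RTL assertion assumes @testing-library/jest-dom is installed):

    import React from 'react';
    import { mount } from 'enzyme';
    import { render, screen } from '@testing-library/react';
    import Header from './Header'; // hypothetical component rendering <h1 id="title">Dashboard</h1>

    it('shows the title (Enzyme)', () => {
      const wrapper = mount(<Header />);
      // Selector-based: find the node with id="title" and read its text
      expect(wrapper.find('#title').text()).toBe('Dashboard');
    });

    it('shows the title (RTL)', () => {
      render(<Header />);
      // Query by role and accessible name instead of a CSS selector
      expect(screen.getByRole('heading', { name: 'Dashboard' })).toBeInTheDocument();
    });

The converter has to know that #title is a heading containing "Dashboard" before it can pick the right RTL query, which is why the rendered DOM was fed to the LLM as context.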
New React version made the lib obsolete; we used an LLM to fix it (1/5 success rate)
The next frontier is ditching Jest and jsdom in favor of testing in a real browser. But I am not sure the path for getting there is clear yet in the community.
Perhaps reiterating my previous sentiment: such applications of LLMs together with discrete structures bring (and hide) much more value than chatbots, which will soon be considered mere console UIs.
I’ve been playing with a mutation-injection framework for my master’s thesis for some time. I had to use LibCST to preserve syntax information that is usually lost during AST serialization/deserialization (whitespace, indentation, and so on). My understanding is that the difference between abstract and concrete syntax trees is that a CST is guaranteed not to lose any information, so it can be used for specific tasks where ASTs are useless. So, did they actually use a CST-based approach?
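I don't know what Slack used internally, but in the JS ecosystem recast (which jscodeshift builds on) plays a role similar to LibCST: untouched nodes are printed from the original source text, so whitespace and layout survive the round trip, whereas a plain AST pretty-printer regenerates everything. A tiny sketch of the difference, illustrative only:

    import * as recast from 'recast';

    const source = 'const   answer =   42;';
    const ast = recast.parse(source);

    // Reprints unchanged nodes from the original text, preserving layout.
    console.log(recast.print(ast).code);       // const   answer =   42;

    // Regenerates code purely from the tree, losing the original spacing.
    console.log(recast.prettyPrint(ast).code); // const answer = 42;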
While 22% sounds low, saving yourself the effort to rewrite 3,300 tests is a good achievement.
It was supposed to take 6 months for 12 people.
Using an AST parser, I wrote a program in two weeks that converted about half the tests flawlessly, with roughly another third needing minor massaging and the rest having to be done by hand (I could've done better by handling more corner cases, but I kinda gave up once I hit diminishing returns).
Although it helped a bunch that most tests were brain dead simple.
Reaction was mixed - the newly appointed manager was kinda fuming that his first project's glory was stolen from him by an Assi, and the guys under him missed out on half a year of leisurely work.
I left a month after that, but from what I heard, management decided to pretend my solution didn't exist, and the devs just ended up manually copy-pasting the output of my tool, doing a day's planned work in 20 minutes, with the whole thing taking 6 months as planned.
"Slack uses ASTs to convert test code from Enzyme to React with 22% success rate"
This article is a poor summary of the actual article, which at least links to Slack's engineering blog [0].
[0] https://slack.engineering/balancing-old-tricks-with-new-feat...
[updated]
> We examined the conversion rates of approximately 2,300 individual test cases spread out within 338 files. Among these, approximately 500 test cases were successfully converted, executed, and passed. This highlights how effective AI can be, leading to a significant saving of 22% of developer time. It’s important to note that this 22% time saving represents only the documented cases where the test case passed.
So the blog post says they converted roughly 500 of 2,300 tests (about 22%), which they claim as saving 22% of developer time, and which InfoQ interpreted as converting 80% of tests automatically?
Am I missing something? Or is this InfoQ article just completely misinterpreting the blog post it’s supposed to be reporting on?
The topic itself is interesting, but between all of the statistics games and editorializing of the already editorialized blog post, it feels like I’m doing heavy work just to figure out what’s going on.