The plan-execute pattern
The plan-execute pattern in software engineering involves planning decisions in a data structure before execution, aiding testing and debugging. Practical examples and benefits are discussed, emphasizing improved code quality and complexity management.
The article discusses the plan-execute pattern, a universal technique often overlooked in software engineering discussions. The pattern involves two stages: planning, where decisions are encapsulated into a data structure, and execution, where the plan is realized. By separating decision-making from action, the pattern allows for comprehensive testing and better debugging capabilities. The text provides a practical example of a build system design to illustrate how the pattern works in real-world scenarios. Additionally, it mentions instances where the pattern is applied, such as in query planning in RDBMS and interpreter patterns. The conclusion emphasizes the benefits of the plan-execute pattern in managing complexity and improving code quality compared to the more intertwined "just do it" approach. Overall, the article highlights the importance of considering alternative implementation strategies to enhance software development practices.
Related
Software design gets worse before it gets better
The "Trough of Despair" in software design signifies a phase where design worsens before improving. Designers must manage expectations, make strategic decisions, and take incremental steps to navigate this phase successfully.
Formal methods: Just good engineering practice?
Formal methods in software engineering, highlighted by Marc Brooker from Amazon Web Services, optimize time and money by exploring designs effectively before implementation. They lead to faster development, reduced risk, and more optimal systems, proving valuable in well-understood requirements.
The software world is destroying itself (2018)
The software development industry faces sustainability challenges like application size growth and performance issues. Emphasizing efficient coding, it urges reevaluation of practices for quality improvement and environmental impact reduction.
Software Engineering Practices (2022)
Gergely Orosz sparked a Twitter discussion on software engineering practices. Simon Willison elaborated on key practices in a blog post, emphasizing documentation, test data creation, database migrations, templates, code formatting, environment setup automation, and preview environments. Willison highlights the productivity and quality benefits of investing in these practices and recommends tools like Docker, Gitpod, and Codespaces for implementation.
Optimizing the Roc parser/compiler with data-oriented design
The blog post explores optimizing a parser/compiler with data-oriented design (DoD), comparing Array of Structs and Struct of Arrays for improved performance through memory efficiency and cache utilization. Restructuring data in the Roc compiler showcases enhanced efficiency and performance gains.
However, when I later revisited the design of Ninja I found that removing the separate planning phase better reflected the dynamic nature of how builds end up working out, where you discover information during the build that impacts what work you have planned. See the "Single pass" discussion in my design notes[2], which includes some negatives of the change.
If the author reads this: in [2] I describe a Rust-based Ninja rewrite that uses data structures similar to those in the blog post; you might find it interesting!
[1] https://github.com/ninja-build/ninja/blob/dcefb838534a56b262... [2] https://neugierig.org/software/blog/2022/03/n2.html
I have yet to see a graceful workflow here in a production environment and would love to hear about one. (No, I will not use Terraform Cloud.)
This pattern seems like a specific application of the general idea of favoring data structures over logic. If you manage to make a clear data structure that encapsulates all the different ways in which your data/users/etc. might behave, the implementation of the logic is often very straightforward. Conversely, if your data structure is either too generic or too specific, you end up coding a lot of special cases, exceptions, and mutable state in the logic to compensate.
- You can attach the plan to your Change Management, both for documentation and approval.
- You can manually edit the plan.
- The plan can be processed by other tools (linting, optimizations).
- The "complexity" of the pair planner+executor is often lower than a monolithic algorithm (above a certain size).
- Having a side-effect-free planner makes testing so much easier.
The big downside, of course, is that you're building a VM. You'll have to design operations, parameters, encoding. And stacktraces will make much less sense.
My rule of thumb is to use plan-execute for scripts that combine business logic with destructive operations or long runtime.
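As a concrete illustration of that rule of thumb, here is a minimal sketch (all names are illustrative, not from the article): a side-effect-free planner that emits plain data, and a tiny executor that interprets it.

```python
from dataclasses import dataclass


# Each plan step is plain data: an operation name plus its parameters.
@dataclass(frozen=True)
class Step:
    op: str      # e.g. "copy", "delete"
    args: tuple  # operation parameters


def plan_sync(src_files: set, dst_files: set) -> list:
    """Pure planner: decides what to do, touches nothing."""
    steps = [Step("copy", (f,)) for f in sorted(src_files - dst_files)]
    steps += [Step("delete", (f,)) for f in sorted(dst_files - src_files)]
    return steps


def execute(steps: list, handlers: dict) -> None:
    """Executor: realizes the plan via side-effecting handlers."""
    for step in steps:
        handlers[step.op](*step.args)


# The planner can be tested exhaustively with no I/O at all:
plan = plan_sync({"a", "b"}, {"b", "c"})
assert [s.op for s in plan] == ["copy", "delete"]
```

Because the planner is pure, the destructive part shrinks to a trivial dispatch loop, which is exactly where you want the code to be boring.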
Reducing program logic to a small state machine, plus business logic that interacts with it in predefined ways, is a pattern I learned while doing asynchronous netcode for a game project, but I gradually realized it was applicable in a ton of places. Framing it as a robust version of a "plan-execute" pattern is intuitive and powerful. Great article.
For example, one application of this was a long migration project where a large collection of files (some petabytes of data) was to be migrated from an on-prem NAS to a cloud filesystem. The files on the NAS were managed by an additional asset management solution, which stored metadata (actual filenames, etc.) in a PostgreSQL database.
The application I wrote was composed of a series of small tools. One tool would contact all of the sources of information and create a file with a series of commands (copy file from location A to location B, create a folder, set metadata on a file, etc.)
Other tools could take that large file and run operations on it. Split it into smaller pieces, prioritise specific folders, filter out modified files by date, calculate fingerprints from actual file data for deduplication, etc. These tools would just operate on this common file format without actually doing any operations on files.
And finally, a tool that could be instantiated somewhere to execute a plan.
I designed it all this way because it made for a much more resilient process. I could have many different processes running at the same time, and I had a common format (a file with a collection of directives) as a way to communicate between all these different tools.
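In a setup like the one described, the common file format could be as simple as one tab-separated directive per line. The details below are invented for illustration, but they show how intermediate tools can filter and rewrite a plan without touching any actual files:

```python
# Hypothetical directive format: "verb<TAB>arg1<TAB>arg2...".
# Every tool in the pipeline reads and emits this same format.

def parse_plan(text: str) -> list:
    """Turn the plan file into a list of directive tuples."""
    return [tuple(line.split("\t")) for line in text.splitlines() if line]


def filter_prefix(plan: list, prefix: str) -> list:
    """A 'prioritise specific folders' style transformation:
    keep only directives whose first argument is under prefix."""
    return [d for d in plan if d[1].startswith(prefix)]


plan_text = """mkdir\t/cloud/proj
copy\t/nas/proj/a.bin\t/cloud/proj/a.bin
copy\t/nas/other/b.bin\t/cloud/other/b.bin"""

plan = parse_plan(plan_text)
urgent = filter_prefix(plan, "/nas/proj")  # only the /nas/proj copy survives
```

Splitting, prioritising, and deduplicating all become ordinary list transformations over this data, which is what makes running many executors in parallel safe.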
There are always temptations to loosen the constraints this imposes, but they nearly always come at a cost of undermining the value of the pattern.
That also explains why it isn't a good and useful "pattern": inventing new DSLs with custom compilers all the time is not a trivial thing to pull off without shooting yourself in the foot. So if you are writing a VM (e.g. an RDBMS query planner) you must already be perfectly aware that you are writing a fucking VM with its own programming language (e.g. SQL). And if you are not, inventing a data structure that encapsulates a whole program is probably a bigger task than the problem you are actually trying to solve.
The way I think about when to use a plan is whenever you're doing batches of I/O of some kind, or anything that you might want to make idempotent.
The best form for plans is a DAG without control flow.
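One reason a control-flow-free DAG works well as a plan format: a generic executor can topologically order it and run independent steps in parallel. A minimal sketch using Python's stdlib graphlib:

```python
from graphlib import TopologicalSorter

# A plan as a DAG: each key lists the steps it depends on.
plan = {
    "link": {"compile_a", "compile_b"},
    "compile_a": set(),
    "compile_b": set(),
}

# static_order() yields a valid execution order; compile_a and
# compile_b have no edge between them, so they could run in parallel.
order = list(TopologicalSorter(plan).static_order())
assert order[-1] == "link"
```

With no branches or loops in the plan itself, scheduling, dry runs, and progress reporting all fall out of the data structure for free.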
You may run perception->planning->execution in a loop.
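A hedged sketch of that loop (names are illustrative): observe the world, replan from the fresh observation, execute one step, and repeat until done.

```python
def control_loop(observe, plan, act, done, max_iters=100):
    """Perception -> planning -> execution, repeated until done."""
    for _ in range(max_iters):
        state = observe()          # perception
        if done(state):
            return state
        steps = plan(state)        # pure planner: state -> actions
        act(steps[0])              # execute one step, then re-observe
    raise RuntimeError("did not converge")


# Toy example: drive a counter toward a target value.
world = {"x": 0}
final = control_loop(
    observe=lambda: world["x"],
    plan=lambda x: ["inc"] * (5 - x),
    act=lambda step: world.update(x=world["x"] + 1),
    done=lambda x: x == 5,
)
```

Replanning every iteration is what lets the loop absorb information discovered mid-execution, the same issue the Ninja comment above raises about single-pass builds.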
I don't understand the ambivalence. I have to make some choice of language and paradigm, and then I have to solve problems using it.
Great article and description of the plan-execute pattern, though!
One nice advantage is that an application can show the whole plan for review before executing it. This is basically what Terraform did too.
This pattern also makes coding and testing much much easier.
My relationship with design patterns changed when I stopped viewing them as prescriptive and started viewing them as descriptive.
Reminds me of a thread a few days back when someone was talking about state machines as if they were obscure. At a systems programming level they aren't obscure at all (how do you write non-blocking network code without async/await? how do regexes and lexers work?). Application programmers are usually "parsing" with regex.match() or index() or split() and frequently run into brick walls when it gets complex (say, parsing email headers that can span several lines), but state machines make it easy.
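As a sketch of the email-header case (simplified; real RFC 5322 parsing handles more edge cases), here is a tiny two-state parser: either we are building a header, or we are not, and a line starting with whitespace is a folded continuation of the current one.

```python
def parse_headers(lines: list) -> dict:
    """Unfold email-style headers spanning multiple lines."""
    headers = {}
    current = None                          # state: header being built
    for line in lines:
        if line[:1] in (" ", "\t") and current:
            # Continuation line: append to the current header.
            headers[current] += " " + line.strip()
        elif ":" in line:
            # New header line: switch state to this header.
            current, value = line.split(":", 1)
            current = current.strip()
            headers[current] = value.strip()
    return headers


hdrs = parse_headers([
    "Subject: a very long",
    "\tsubject line",
    "From: alice@example.com",
])
```

The split()-and-regex approach falls over exactly here, because whether a line stands alone depends on state carried over from the previous line.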