July 27th, 2024

How much of your binary executable is just ASCII text?

Daniel Lemire's blog analyzes ASCII text in the binary executables of JavaScript runtimes using a Python script. The findings show significant ASCII content: about one-third of the Node 22 binary is ASCII text.

Daniel Lemire's blog post measures how much ASCII text is contained in binary executable files, focusing on popular JavaScript runtimes. He wrote a Python script that scans each binary and totals the size of all sequences of at least 16 ASCII characters. The heuristic is imperfect: it can miss short strings and can misidentify non-text byte sequences as text. Even so, it shows that a significant portion of these binaries is ASCII text, approximately one-third in the case of Node 22. The ASCII totals reported are 33 MB for Node 22, 24 MB for Node 20, 18 MB for Node 18, 22 MB for Deno 1.32, and 5 MB for Bun 1.1. Lemire also mentions an alternative method using the 'strings' command, which yields similar results. The text found in these binaries is often attributed to debug symbols and unminified JavaScript code. The post argues that understanding the composition of binary executables matters to developers and researchers interested in software performance and optimization.
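
The heuristic described above can be sketched in a few lines of Python. This is not Lemire's actual script, only a minimal illustration under stated assumptions: "ASCII text" is taken to mean printable bytes (0x20 to 0x7E) plus tab, line feed, and carriage return, and the function name `ascii_text_bytes` is hypothetical.

```python
import re
import sys


def ascii_text_bytes(data: bytes, min_run: int = 16) -> int:
    """Total length of all runs of at least `min_run` ASCII text bytes.

    Assumption: a "text" byte is printable ASCII (0x20-0x7E) or one of
    tab, LF, CR. The original script may define this differently.
    """
    pattern = re.compile(rb"[\t\n\r\x20-\x7e]{%d,}" % min_run)
    return sum(len(match.group()) for match in pattern.finditer(data))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read the whole binary into memory; fine for files of ~100 MB.
    with open(sys.argv[1], "rb") as f:
        contents = f.read()
    text_size = ascii_text_bytes(contents)
    ratio = text_size / len(contents) if contents else 0.0
    print(f"{text_size} of {len(contents)} bytes ({ratio:.1%}) look like ASCII text")
```

Saved under a name of your choosing (say, `ascii_ratio.py`), it could be run as `python ascii_ratio.py /path/to/node`. As the summary notes, runs shorter than the threshold are ignored, and arbitrary data that happens to fall in the ASCII range is counted as text.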

1 comment
By @basementcat - 6 months
100% if you use tom7's compiler.

http://tom7.org/abc/