September 24th, 2024

OpenBSD now enforcing no invalid NUL characters in shell scripts

OpenBSD's ksh now prohibits invalid NUL characters in scripts to unify shell behavior and prevent inconsistencies. Users must be on OpenBSD-current to utilize this update for improved reliability.


OpenBSD has implemented a new enforcement in its default shell, ksh, which prohibits the inclusion of invalid NUL characters in shell scripts. This change, noted in a commit message from Theo de Raadt, states that if a NUL byte is detected during the parsing of a script, the shell will abort with a "syntax error: NUL byte unexpected" message. The decision stems from the observation that various shells exhibit inconsistent behaviors when encountering NUL bytes, leading to potential issues in script execution. The majority of shells, written in C, cannot handle embedded NUL characters due to the nature of C strings. The change aims to standardize shell behavior and prevent further divergence among different shell implementations. Users must be running OpenBSD-current to benefit from this update, as it was introduced after the tagging of OpenBSD 7.6.

- OpenBSD's ksh now disallows invalid NUL characters in scripts.

- The change aims to unify shell behavior and prevent inconsistencies.

- NUL bytes in scripts previously led to divergent behaviors across different shells.

- Users need to be on OpenBSD-current to utilize this new enforcement.

- The update reflects OpenBSD's commitment to improving software reliability.
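
To see whether an existing script would be affected, one option is to scan it for NUL bytes before running it under the new ksh. The following is a minimal sketch, not part of the OpenBSD change itself; the helper name `has_nul` and the demo file names are assumptions made for illustration. It only reports whether a NUL byte appears anywhere in the file and does not try to tell the parsed part of a script from a trailing payload.

```shell
#!/bin/sh
# has_nul FILE: succeed (exit 0) if FILE contains at least one NUL byte.
# Hypothetical helper for illustration; not part of ksh or OpenBSD.
has_nul() {
    # Count bytes with NULs stripped and compare with the full size.
    # $(( ... )) normalizes the padded output some wc implementations print.
    stripped=$(( $(tr -d '\000' < "$1" | wc -c) ))
    original=$(( $(wc -c < "$1") ))
    [ "$stripped" -ne "$original" ]
}

# Demo: one clean script and one with an embedded NUL byte.
tmpdir=$(mktemp -d)
printf 'echo hello\n' > "$tmpdir/clean.sh"
printf 'echo hel\000lo\n' > "$tmpdir/dirty.sh"

for f in "$tmpdir/clean.sh" "$tmpdir/dirty.sh"; do
    if has_nul "$f"; then
        echo "${f##*/}: contains NUL"
    else
        echo "${f##*/}: clean"
    fi
done
rm -rf "$tmpdir"
```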

AI: What people are saying
The update to OpenBSD's ksh regarding NUL characters in scripts has generated a variety of reactions among users.
  • Many users express concern about the impact on existing scripts and software that rely on NUL characters, particularly self-extracting scripts and shar archives.
  • Comments highlight the importance of strict input validation in software development, referencing the concept of a "post-Postel world" where leniency in handling input can lead to security issues.
  • Some users question the practicality of the change, wondering if it will lead to more problems than it solves, especially regarding compatibility with other shells.
  • There is a general appreciation for OpenBSD's efforts to unify shell behavior and improve reliability, despite concerns about potential disruptions.
  • Several comments touch on the historical context of software behavior and the evolution of standards in Unix-like systems.
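
On the self-extracting script question: the commit message quoted in the comments says arbitrary bytes remain possible after the parts of the script that get parsed and executed. The sketch below shows that layout in a generic POSIX shell, with the text part ending in `exit` and the binary payload appended after it. The file names and the fixed line offset are assumptions for illustration, and this has not been checked against OpenBSD-current.

```shell
#!/bin/sh
# Sketch: build a self-extracting script whose binary payload sits
# entirely AFTER the last line the shell parses and executes.
tmpdir=$(mktemp -d)
sfx="$tmpdir/sfx.sh"

# Text part: exactly 4 lines. It copies everything from line 5 onward
# into the file named by $1, then exits before the payload is reached.
cat > "$sfx" <<'EOF'
#!/bin/sh
# hypothetical self-extractor: payload begins on line 5
tail -n +5 "$0" > "$1"
exit 0
EOF

# Payload: arbitrary bytes, including NULs, appended after the exit line.
printf 'bin\000ary\000data' >> "$sfx"

# Run the extractor and check the payload survived byte for byte.
sh "$sfx" "$tmpdir/payload.out"
size=$(( $(wc -c < "$tmpdir/payload.out") ))
if printf 'bin\000ary\000data' | cmp -s - "$tmpdir/payload.out"; then
    match=yes
else
    match=no
fi
echo "extracted $size bytes, identical to original: $match"
rm -rf "$tmpdir"
```

Putting the same bytes in the middle of the script, for example inside a heredoc, is the case the new check rejects.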
24 comments
By @amiga386 - 7 months
Here's the actual diff:

https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/bin/ksh/shf.c....

And it looks like that covers all parsed parts of the shell script or history file, including heredocs. I get the feeling it's going to break all shar archives with binary files (not that they're particularly common). It will stop NULs being in the script itself, but it won't stop them coming from other sources, e.g.

    $ var=$(printf '\0hello')
    -bash: warning: command substitution: ignored null byte in input
    $ echo $var
    hello

It remains to be seen if this will be adopted by anyone else, or if it'll be another reason to use OpenBSD only as a restricted environment and not as a general computing platform.

> "If there is ONE THING the Unix world needs, it is for bash/ksh/sh to stop diverging further"

> OpenBSD ksh: diverges further

By @mcculley - 7 months
"We are in a post-Postel world" is a great way to put it. This needs to be repeated by everyone working with file formats or accepting untrusted input.
By @jrockway - 7 months
I like the term post-Postel.

There are two reliability constraints that all software faces: security and interoperability. The more lax you are about validation, the more likely interoperability is. "That's weird, I'll just do whatever" is doing SOMETHING, and it's often to the end user's liking. But you also enter a more and more undefined state inside the software on the other side, and that's where weird things happen. Weird things happening typically manifest as security problems. So the more effort you put into minimizing the possibility of entering a weird state, the more confidence you have that your software is working as specified.

Postel's Law made a lot of sense to me when developing the early Internet. A lot of people were reading imperfect RFCs, and it was nice when your HP server could communicate with a Sun workstation, even though maybe some bit in the TCP header was set wrong. But now? You just gotta get it right and push a hotfix when you realize you messed something up. (Sadly, I don't think it's possible. Middleboxes are getting more and more popular. At work, we make a product where the CLI talks to the server over HTTP/2. We also install Zscaler on every workstation. Zscaler simply blocks HTTP/2. So you can't use our product. Awkward.)

By @saagarjha - 7 months
> There appears to be one piece of software which is misinterpreting guidance of this, and trying to depend upon embedded NUL.

Curious what this is

By @sneela - 7 months
> This was in snapshots for more than 2 months, and only spotted one other program depending on the behaviour (and that test program did not observe that it was therefore depending in incorrect behaviour!!)

Fascinating. I wonder what that program is, and why it depends on the NUL character.

By @bell-cot - 7 months
Kudos to OpenBSD!

Similar to the olde-tyme "-o noexec" and "-o nosuid" options for `mount`, there should be easy, no-exceptions ways to blanket ban other types of simply obvious red-flag activity.

By @parasense - 7 months
Is this going to murder those fancy shell scripts that self-extract a program appended to the tail, which is really just an encoded blob of some kind, presumably compressed, etc.?
By @chasil - 7 months
I was going to check the status of mksh (the Android system shell), but the project page returns:

"Unavailable For Legal Reasons - Sorry, no detailled error message available."

http://www.mirbsd.org/mksh.htm

The Android system shell is now abandoned? This is also in RHEL 9 BaseOS.

By @chrisfinazzo - 7 months
Related: The installer for iTunes 12.2.1 included a bug which might recursively delete a volume if the path given as input included incorrectly escaped spaces.
By @Taikonerd - 7 months
On a similar note, I sometimes think about how newline characters are allowed in filenames, and how that can break simple...

    for filename in `ls`; do ...; done
loops, because in many contexts UNIX treats newlines as a delimiter.

Is there any legitimate use for filenames with newlines?

By @whiterknight - 7 months
Side note: tell your startup to switch its “hardware with Ubuntu Linux inside” to BSD. You will have a much more stable and simple platform that can last a long time.
By @raverbashing - 7 months
> There appears to be one piece of software which is misinterpreting guidance of this, and trying to depend upon embedded NUL.

Big oof here. Why? How?

> If there is ONE THING the Unix world needs, it is for bash/ksh/sh to stop diverging further by permitting STUPID INPUT that cannot plausibly work in all other shells. We are in a post-Postel world.

Amen

By @opk - 7 months
I've always found the fact that zsh copes with NUL characters in variables etc to be really useful. I can see why this approach makes sense for OpenBSD but they can't prevent NULs appearing in certain places like piped input.
By @lupusreal - 7 months
Does this break those self-extracting script/tar files? I forget how those are done, I haven't seen one in many years.
By @klooney - 7 months
Does this break the self extracting tarball trick, where you have a bootstrap shell script with a binary payload appended?
By @nubinetwork - 7 months
So I can't bury a tarball inside a shell script anymore?
By @soupbowl - 7 months
I wish FreeBSD replaced /bin/sh with OpenBSD's.
By @chmorgan_ - 7 months
Wow, they still use CVS...
By @enriquto - 7 months
Great. Now forbid spaces in filenames.
By @sph - 7 months
Is this in reference to something? Judging from the comments, NUL bytes in shell scripts are such a common occurrence that everybody is celebrating this change as if it were groundbreaking.

I mean, it's a good idea, but I wonder what I am missing here. Also, what do they mean by post-Postel?

By @2snakes - 7 months
Surprised no one has mentioned the CrowdStrike issue, which was due to NUL characters, wasn't it?
By @0xbadcafebee - 7 months

  > If there is ONE THING the Unix world needs, it is for bash/ksh/sh to
  > stop diverging further by permitting STUPID INPUT that cannot
  > plausibly work in all other shells.  We are in a post-Postel world.
  > 
  > It remains possible to put arbitrary bytes *AFTER* the parts of the
  > shell script that get parsed & executed (like some Solaris patch files
  > do).  But you can't put arbitrary bytes in the middle, ahead of shell
  > script parsed lines, because shells can't jump to arbitrary offsets
  > inside the input file, they go THROUGH all the 'valid shell script
  > text lines' to get there.

  > So here it is again, an example of OpenBSD making software behavior saner for all of us.

I don't consider use of all caps over a minor issue to be sane behavior. At best it's immaturity (trying to force your point rather than persuade), and at worst it's an emotional imbalance that affects judgment. That said, it's ksh, on OpenBSD, so I couldn't care less what they do.