June 21st, 2024

The End-of-Line Story (2004)

The ASCII standard lacks a unique end-of-line character, leading to varied EOL conventions in early systems. ARPAnet researchers mandated the CR LF sequence to standardize text transmission across protocols like Telnet, FTP, and SMTP. Modern systems handle EOL translation, but problems such as stray extra characters can still occur. Windows systems use CR LF for EOL, while Unix uses LF. RFCs specify CR LF for Internet transmission, and FTP can preserve EOL characters in binary mode. The RFC Editor website adapts EOL conventions for different systems in its compressed RFC collections.

The ASCII standard for text lacks a unique end-of-line character; it defines separate Carriage Return (CR) and Line Feed (LF) movements instead. Early operating systems varied in their EOL conventions, which complicated network communication. To standardize, ARPAnet researchers mandated the CR LF sequence for ASCII text transmission. Jon Postel enforced this convention across protocols like Telnet, FTP, and SMTP. Modern systems usually handle EOL translation seamlessly, but problems such as stray extra characters or broken formatting can still arise. Windows systems, following MS-DOS, use CR LF as their EOL convention, which matches the network convention and so simplifies cross-system text transfers. RFCs specify CR LF line endings for Internet transmission, though Unix systems store text with LF endings. Binary FTP mode preserves the source EOL characters during a transfer, which works when both systems share a convention. Compressed RFC collections on the RFC Editor website use Unix LF endings in .tar.Z files and MS-DOS CR LF endings in .zip files, catering to the different system conventions.
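
As a rough illustration of the translation each system performs at the network boundary, the following shell sketch converts between Unix LF endings and the CR LF wire convention. The function names `to_crlf` and `to_lf` are made up for this example, and the sketch assumes the text contains no bare CR characters that need to be kept.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the article.

# Convert LF (or CR LF) line endings to CR LF, the network convention.
to_crlf() {
  # Drop any existing CR first so CR LF input does not become CR CR LF.
  tr -d '\r' | awk '{ printf "%s\r\n", $0 }'
}

# Convert CR LF line endings to the Unix LF convention.
to_lf() {
  # Assumption: every CR in the input is part of a CR LF pair.
  tr -d '\r'
}

# Dump the bytes of a converted line: "hi" followed by CR (0d) and LF (0a).
printf 'hi\n' | to_crlf | od -An -tx1
```

A Unix sender would pipe outgoing text through something like `to_crlf`, and a Unix receiver would apply `to_lf` before storing the file; a system that already stores CR LF needs neither step.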


10 comments
By @chasil - 7 months
This goes back further.

Teletype machines needed a delay to move the printing apparatus back to the beginning of a line. The two characters provided that delay.

I never used one of these; I was too young.

https://en.m.wikipedia.org/wiki/Teletype_Model_33

https://www.revk.uk/2022/02/crlf-has-long-history.html?m=1

By @imglorp - 7 months
> This choice was designed to spread the pain equally among all operating systems of the day; each has to translate to and from the CR LF convention when text was transferred across the network.

This compromise might have made sense from a political or goodwill standpoint at the time. But it meant that everyone using these protocols would have to store, convert, or transmit the extra character forever.

By @jakedata - 7 months
I tried to post an homage to reading text that had LF without CR but the input filter helpfully corrected it. You will just have to imagine text moving down like a set of stairs descending from left to right.

Kermit was the file transfer protocol of last resort for sorting out weird translation issues. It was slow but super effective.

HN is +10 proof against ASCII art.

By @demurgos - 7 months
It's pretty unfortunate that we ended up with system-dependent newlines. This means that text files are not byte-for-byte the same across systems, so comparisons or hashing need normalization. This also makes reliable line splitting less convenient. I was also bitten by FTP transferring binary files in text mode instead of binary mode, causing corruption.
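
A minimal sketch of the normalization mentioned above, assuming it is acceptable to drop every CR before checksumming. `normalized_cksum` is a hypothetical helper name, and the two sample files are created only for the demonstration.

```shell
#!/bin/sh
# Checksum a file's text independent of its line-ending convention.
# normalized_cksum is a hypothetical helper name.
normalized_cksum() {
  # Strip CR so CR LF and LF files yield the same byte stream, then checksum.
  tr -d '\r' < "$1" | cksum
}

# Two sample files with the same text but different line endings.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'one\ntwo\n'     > "$tmp/unix.txt"
printf 'one\r\ntwo\r\n' > "$tmp/dos.txt"

# Raw checksums differ because the bytes differ.
cksum < "$tmp/unix.txt"
cksum < "$tmp/dos.txt"

# Normalized checksums match.
normalized_cksum "$tmp/unix.txt"
normalized_cksum "$tmp/dos.txt"
```

The same idea applies to any hash tool: normalize first, then hash the normalized stream rather than the file on disk.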

Nowadays I tend to enforce LF everywhere. Windows can deal with it, and it ensures consistent cross-platform handling. I know that Rust also uses a single LF on all platforms, not sure if there are other such languages.
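
One way an LF-only policy could be checked, sketched in shell: flag any file that contains a CR byte. `has_cr` is a hypothetical helper, and treating every CR as a violation is an assumption of this sketch rather than something the comment prescribes.

```shell
#!/bin/sh
# has_cr is a hypothetical helper: it succeeds if the file contains a CR byte.
has_cr() {
  cr=$(printf '\r')    # a literal carriage return character
  grep -q "$cr" "$1"
}

# Sample files created only for this demonstration.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'a\nb\n'     > "$tmp/lf.txt"
printf 'a\r\nb\r\n' > "$tmp/crlf.txt"

# Report each file that violates the LF-only policy.
for f in "$tmp/lf.txt" "$tmp/crlf.txt"; do
  if has_cr "$f"; then
    echo "CR found: $(basename "$f")"
  fi
done
```

A check like this could run before committing or publishing files, with `tr -d '\r'` as the fix for anything it flags.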

By @nextaccountic - 7 months
Today it's a mistake to open files in a "text mode" that translates between line endings, or otherwise to save text files with any line ending that isn't LF. Even Notepad opens files with LF line endings now. (While we're at it, save all text data in UTF-8 too.)

Some "text protocols" have CR LF line endings but this should be relegated to a library; your own code shouldn't deal with this.

And then we can all forget this mistake.

By @shmeeed - 7 months
The irony of finding the formatting of this article completely messed up on mobile is not lost on me.

By @finisher2201 - 7 months
So interesting. I've somehow always supposed that it was MS-DOS that decided to be different.

By @gabrielsroka - 7 months
Using curl in Linux, I find it useful to delete \r. Makes it easier to use sed.

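  # Follow rel="next" Link headers page by page until none is left.
  while [ "$url" ]; do
    # -s silences the progress meter; tr strips the CR of each CR LF so /^$/ matches.
    res=$(curl -s -i "$url" | tr -d '\r')
    headers=$(printf '%s\n' "$res" | sed '/^$/q')    # up to the first blank line
    body=$(printf '%s\n' "$res" | sed '1,/^$/d')     # everything after it
    printf '%s\n' "$body"
    # The trailing "i" (case-insensitive match) is a GNU sed extension.
    url=$(printf '%s\n' "$headers" | sed -n -E 's/link: <(.*)>; rel="next"/\1/pi')
  done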
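
To see why deleting \r matters in the script above, here is a small sketch that uses a simulated response instead of a real request. The `response` function and its contents are made up for the illustration.

```shell
#!/bin/sh
# A simulated HTTP response with CR LF line endings (contents are made up).
response() {
  printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\r\n'
}

# Without stripping CR, the header/body separator is a line holding a lone CR,
# so /^$/ never matches and sed passes through all 4 lines.
response | sed '/^$/q' | wc -l

# With CR removed, sed quits at the blank line: 2 header lines plus the blank one.
response | tr -d '\r' | sed '/^$/q' | wc -l

# The body is whatever follows the blank line.
response | tr -d '\r' | sed '1,/^$/d'
```

The empty-line patterns only work once the separator line is truly empty, which is why the `tr -d '\r'` step comes before any `sed` call.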
By @gabrielsroka - 7 months
I posted this last year and was invited to repost it using the second chance pool. After posting it, I noticed someone else posted it yesterday.