October 7th, 2024

How do HTTP servers figure out Content-Length?

HTTP servers determine Content-Length by measuring the size of the response body. Small responses get an explicit Content-Length header, while larger ones use chunked transfer encoding, which sends the data in segments.


HTTP servers determine the Content-Length of a response by calculating the size of the response body before sending it. In simple implementations, the server can write the response in parts, as demonstrated in a Go program that handles HTTP requests. When a response is small enough to fit into a buffer, the server can easily calculate its length and include it in the Content-Length header. However, if the response exceeds the buffer size, the server uses chunked transfer encoding, which allows it to send the response in smaller segments without needing to know the total length in advance. This method was introduced in HTTP/1.1 and is widely supported. The response includes hexadecimal numbers indicating the size of each chunk, allowing the client to reconstruct the full message. Additionally, chunked responses can include trailers, which are headers sent after the body, useful for scenarios like digital signatures. While HTTP/2 and HTTP/3 do not support chunked transfer encoding, they have their own mechanisms for streaming data. Understanding these underlying processes is crucial for developers working with HTTP in programming languages like Go.

- HTTP servers calculate Content-Length based on the response body size.

- Small responses can use a Content-Length header, while larger ones use chunked transfer encoding.

- Chunked transfer encoding allows sending data in segments without knowing the total length beforehand.

- Trailers can be included in chunked responses for additional metadata.

- HTTP/2 and HTTP/3 utilize different streaming mechanisms and do not support chunked transfer encoding.
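
The behaviour the summary describes can be seen with a minimal Go net/http server. This is only a rough sketch; the exact buffer threshold that triggers chunking is an internal detail of the standard library, not something the handler controls.

```go
package main

import (
	"log"
	"net/http"
	"strings"
)

func main() {
	// Small body: it fits in the server's internal write buffer, so the
	// server can count the bytes after the handler returns and emit a
	// Content-Length header.
	http.HandleFunc("/small", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello"))
	})

	// Large body: the buffer fills before the handler returns, so on
	// HTTP/1.1 the server falls back to Transfer-Encoding: chunked.
	http.HandleFunc("/large", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte(strings.Repeat("x", 1<<20)))
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Hitting `/small` and `/large` with `curl -sv` should show a Content-Length header on the first and `Transfer-Encoding: chunked` on the second.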

AI: What people are saying
The comments reflect a range of insights and experiences related to HTTP server implementations and content length handling.
  • Many commenters share their experiences with different HTTP server implementations, noting variations in how Content-Length is handled.
  • There is a discussion about the complexities of managing content length, especially with compression and chunked transfer encoding.
  • Several users highlight the importance of correctly calculating Content-Length to avoid issues with browsers and data transmission.
  • Some comments touch on the challenges of streaming large responses without buffering, emphasizing the need for efficient handling.
  • There is a mention of the historical context of HTTP and its evolving complexities, particularly with newer protocols like HTTP/2.
22 comments
By @hobofan - 5 months
I think the article should be called "How do Go standard library HTTP servers figure out Content-Length?".

In most HTTP server implementations from other languages I've worked with I recall having to either:

- explicitly define the Content-Length up-front (clients then usually don't like it if you send too little and servers don't like it if you send too much)

- have a single "write" operation with an object where the Content-Length can be figured out quite easily

- turn on chunking myself and handle the chunk writing myself

I don't recall having seen the kind of automatic chunking described in the article before (and I'm not too sure whether I'm a fan of it).

By @pkulak - 5 months
And if you set your own content length header, most http servers will respect it and not chunk. That way, you can stream a 4-gig file that you know the size of per the metadata. This makes downloading nicer because browsers and such will then show a progress bar and time estimate.

However, you'd better be right! I just found a bug in some really old code that was gzipping every response when appropriate (i.e., asked for, textual, etc.), but ignoring the Content-Length header. So if the header was set manually, it would be wrong after compression. That caused insidious bugs for years. The fix, obviously, was to just delete the manual header if the stream was going to be compressed.
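
A hedged Go sketch of that pattern (the function name and parameters are illustrative, not from the original code): set Content-Length from file metadata so clients can show progress, but drop it when the response is going to be compressed, because the header would then describe the wrong byte count.

```go
package main

import (
	"compress/gzip"
	"io"
	"net/http"
	"os"
	"strconv"
	"strings"
)

func serveBigFile(w http.ResponseWriter, r *http.Request, path string, compress bool) {
	f, err := os.Open(path)
	if err != nil {
		http.Error(w, "not found", http.StatusNotFound)
		return
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		http.Error(w, "stat failed", http.StatusInternalServerError)
		return
	}
	// Known size up front: the server respects this and does not chunk,
	// so browsers get a progress bar and time estimate.
	w.Header().Set("Content-Length", strconv.FormatInt(info.Size(), 10))

	if compress {
		// The length above describes the uncompressed file and would be
		// wrong after gzip, so delete it and let the server chunk instead.
		w.Header().Del("Content-Length")
		w.Header().Set("Content-Encoding", "gzip")
		gz := gzip.NewWriter(w)
		defer gz.Close()
		io.Copy(gz, f)
		return
	}
	io.Copy(w, f)
}

func main() {
	http.HandleFunc("/download", func(w http.ResponseWriter, r *http.Request) {
		// Hypothetical path; a naive Accept-Encoding check for the sketch.
		gzipOK := strings.Contains(r.Header.Get("Accept-Encoding"), "gzip")
		serveBigFile(w, r, "/tmp/disk.img", gzipOK)
	})
	http.ListenAndServe(":8080", nil)
}
```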

By @simonjgreen - 5 months
Along this theme of knowledge, there is the lost art of tuning your page and content sizes so that they fit in as few packets as possible, to speed up transmission. The front page of Google, for example, famously fit in a single packet (I don't know if that's still the case). There is a brilliant book from the Yahoo Exceptional Performance Team that used to be a bit of a bible in the world of web sysadmin; it's less relevant these days, but interesting for understanding the era.

https://www.oreilly.com/library/view/high-performance-web/97...

By @flohofwoe - 5 months
Unfortunately the article doesn't mention compression, which is where it gets really ugly (especially with range requests): IIRC the Content-Length reported in HTTP responses and the range specified in range requests apply to the compressed data, but at least in browsers you only get the uncompressed data back and don't even have access to the compressed bytes.
By @jaffathecake - 5 months
The results might be totally different now, but back in 2014 I looked at how browsers behave if the resource is different to the content-length https://github.com/w3c/ServiceWorker/issues/362#issuecomment...

Also in 2018, some fun where when downloading a file, browsers report bytes written to disk vs content-length, which is wildly out when you factor in gzip https://x.com/jaffathecake/status/996720156905820160

By @AndrewStephens - 5 months
When I worked on a commercial HTTP proxy in the early 2000s, it was very common for servers to return off-by-one values for Content-Length - so much so that we had to implement heuristics to ignore and fix such errors.

It may be better now but a huge number of libraries and frameworks would either include the terminating NULL byte in the count but not send it, or not include the terminator in the count but include it in the stream.

By @aragilar - 5 months
Note that there can be trailer fields (the phrase "trailing header" is both an oxymoron and a good description of it): https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Tr...
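
In Go's net/http (the library the article is about), a handler declares trailers up front via the Trailer header and fills in their values after writing the body. A small sketch, with a made-up X-Body-SHA256 trailer name:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Announce which trailer will follow; this must happen before
		// the first Write, since the announcement goes out with the headers.
		w.Header().Set("Trailer", "X-Body-SHA256")

		body := []byte("streamed body goes here\n")
		sum := sha256.Sum256(body)
		w.Write(body)

		// Setting the value after the body has been written sends it as
		// a trailer following the final chunk.
		w.Header().Set("X-Body-SHA256", hex.EncodeToString(sum[:]))
	})
	http.ListenAndServe(":8080", nil)
}
```

Trailers only make sense on responses without a Content-Length (chunked HTTP/1.1, or HTTP/2), since the client needs to know when the body ends before it can read what comes after it.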
By @matthewaveryusa - 5 months
Next up is how forms with (multiple) attachments are uploaded with Content-Type=multipart/form-data; boundary=$something_unique

https://notes.benheater.com/books/web/page/multipart-forms-a...
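
A rough Go sketch of building such a body with mime/multipart, which generates the boundary for you and embeds it in the Content-Type it reports (names here are illustrative):

```go
package main

import (
	"bytes"
	"io"
	"mime/multipart"
	"net/http"
	"strings"
)

// buildUpload assembles a multipart/form-data body for a single file field.
func buildUpload(fieldName, fileName string, file io.Reader) (*bytes.Buffer, string, error) {
	var buf bytes.Buffer
	mw := multipart.NewWriter(&buf)

	part, err := mw.CreateFormFile(fieldName, fileName)
	if err != nil {
		return nil, "", err
	}
	if _, err := io.Copy(part, file); err != nil {
		return nil, "", err
	}
	// Close writes the terminating boundary; without it the body is incomplete.
	if err := mw.Close(); err != nil {
		return nil, "", err
	}
	// FormDataContentType returns "multipart/form-data; boundary=...".
	return &buf, mw.FormDataContentType(), nil
}

func main() {
	body, contentType, err := buildUpload("file", "hello.txt", strings.NewReader("hello\n"))
	if err != nil {
		panic(err)
	}
	req, _ := http.NewRequest("POST", "https://example.com/upload", body)
	// Because the body is a *bytes.Buffer, net/http knows its length and
	// can set Content-Length on the request automatically.
	req.Header.Set("Content-Type", contentType)
}
```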

By @jillesvangurp - 5 months
It's a nice exercise in any web framework to figure out how you would serve a big response without buffering it in memory. This can be surprisingly hard with some frameworks that just assume that you are buffering the entire response in memory. Usually, if you look hard there is a way around this.

Buffering can be appropriate for small responses; or at least convenient. But for bigger responses this can be error prone. If you do this right, you serve the first byte of the response to the user before you read the last byte from wherever you are reading (database, file system, S3, etc.). If you do it wrong, you might run out of memory. Or your user's request times out before you are ready to respond.

This is a thing that's gotten harder with non-blocking frameworks. Spring Boot in particular can be a PITA on this front if you use it with non-blocking IO. I had some fun figuring that out some years ago. Using Kotlin makes it slightly easier to deal with low level Spring internals (fluxes and what not).

Sometimes the right answer is that it's too expensive to figure out the content length, or a content hash. Whatever you do, you need to send the headers with that information before you send anything else. And if you need to read everything before you can calculate that information and send it, your choices are buffering or omitting that information.
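
A rough illustration of the "first byte out before last byte in" point, assuming the source is exposed as an io.Reader (a file, an S3 object body, a cursor wrapper):

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

// stream copies src to the client in fixed-size chunks, so memory use stays
// flat no matter how large the body is. Anything that must go in the headers
// (Content-Length or a content hash, if known) has to be set before the
// first byte of the body is written.
func stream(w http.ResponseWriter, r *http.Request, src io.Reader) {
	w.Header().Set("Content-Type", "application/octet-stream")
	if _, err := io.Copy(w, src); err != nil {
		// The client may have disconnected mid-transfer.
		log.Printf("stream aborted: %v", err)
	}
}

func main() {
	http.HandleFunc("/export", func(w http.ResponseWriter, r *http.Request) {
		f, err := os.Open("/tmp/big-export.csv") // hypothetical source file
		if err != nil {
			http.Error(w, "unavailable", http.StatusInternalServerError)
			return
		}
		defer f.Close()
		stream(w, r, f)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```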

By @lloeki - 5 months
Chunked encoding is fun: not many people know it supports more than just sending the chunk size; it can synchronously multiplex information!

E.g., I drafted this a long time ago: if you generate something live and send it in a streaming fashion, you can't have progress reporting, since you don't know the final size in bytes even though, server-side, you know how far into the generation you are.

This was used for multiple things: generating CSV exports from a bunch of RDBMS records, compressed tarballs from a set of files, and other silly things like generating sequences (Fibonacci, random integers, whatever...) that could take "a while" (as in, long enough that it's friendly to report progress).

https://github.com/lloeki/http-chunked-progress/blob/master/...
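
Go's net/http doesn't expose chunk extensions, so this is not the multiplexing scheme the comment and the linked draft describe; it is only a rough sketch of the underlying idea, emitting progress lines as the work happens and flushing so each one goes out as its own chunk:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/generate", func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		for i := 1; i <= 10; i++ {
			time.Sleep(200 * time.Millisecond) // stand-in for real generation work
			fmt.Fprintf(w, "progress %d/10\n", i)
			flusher.Flush() // pushes a chunk out immediately on HTTP/1.1
		}
	})
	http.ListenAndServe(":8080", nil)
}
```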

By @dicroce - 5 months
At least in the implementation I wrote the default way to provide the body was a string... which has a length. For binary data I believe the API could accept either a std::vector<uint8_t> (which has a size) or a pointer and a size. If you needed chunked transfer encoding you had to ask for it and then make repeated calls to write chunks (that each have a fixed length).

To me the more interesting question is how web servers receive an incoming request. You want to be able to read the whole thing into a single buffer, but you don't know how long it's going to be until you actually read some of it. I learned recently that libc has a way to "peek" at some data without removing it from the recv buffer... I'm curious if this is ever used to optimize the receive process?
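
The libc facility is the MSG_PEEK flag on recv(2). A hedged Go sketch of using it through the raw file descriptor follows; it is Linux-flavoured, and Go sockets are non-blocking, so a real version would have to wait for readability and handle EAGAIN.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"syscall"
)

// peekRequest looks at up to 4 KiB of a TCP connection's receive buffer
// without consuming it, so a later Read still returns the same bytes.
func peekRequest(conn *net.TCPConn) ([]byte, error) {
	raw, err := conn.SyscallConn()
	if err != nil {
		return nil, err
	}
	buf := make([]byte, 4096)
	var n int
	var recvErr error
	// Control lends us the fd without detaching it from Go's poller.
	ctrlErr := raw.Control(func(fd uintptr) {
		// MSG_PEEK copies queued bytes but leaves them in the kernel buffer.
		// On a non-blocking socket this returns EAGAIN if nothing is queued yet.
		n, _, recvErr = syscall.Recvfrom(int(fd), buf, syscall.MSG_PEEK)
	})
	if ctrlErr != nil {
		return nil, ctrlErr
	}
	if recvErr != nil {
		return nil, recvErr
	}
	return buf[:n], nil
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	c, err := ln.Accept()
	if err != nil {
		log.Fatal(err)
	}
	// In a real server you would wait until the socket is readable first.
	head, err := peekRequest(c.(*net.TCPConn))
	if err != nil {
		log.Printf("peek: %v", err)
	}
	fmt.Printf("peeked %d bytes without consuming them\n", len(head))
	c.Close()
}
```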

By @skrebbel - 5 months
I thought I knew basic HTTP 1(.1), but I didn't know about trailers! Nice one, thanks.
By @nraynaud - 5 months
I have done crazy stuff to compute the content length of some payloads. For context, one of my clients works in cloud stuff, and I worked on converting HDD formats on the fly in a UI VM. The webserver that accepts the files doesn't do chunked encoding, and there is no space to store the file. So I had to resort to passing over the input file once to transform it, computing its allocation table and transformed size, then throwing away everything but the file and the table, restarting the scan with the correct header, and re-doing the transformation.
By @Sytten - 5 months
There is a whole class of attacks called HTTP Desync Attacks that target just that problem, since it is hard to get right, especially across multiple different HTTP stacks. And if you don't get it right, the result is that bytes are left on the TCP connection and read as the next request if the connection is reused.
By @6383353950 - 5 months
My account not open plz help me
By @6383353950 - 5 months
Help me sir
By @Am4TIfIsER0ppos - 5 months
stat()?
By @TZubiri - 5 months
len(response)
By @remon - 5 months
Totally worth an article.
By @_ache_ - 5 months
> Anyone who has implemented a simple HTTP server can tell you that it is a really simple protocol

It's not. Like, hell no. It is so complex: multiplexing, underlying TCP specifications, Server Push, stream prioritization (vs. priorization!), encryption (ALPN or NPN?), extensions like HSTS, CORS, WebDAV or HLS, ...

It's a great protocol, nowhere near simple.

> Basically, it’s a text file that has some specific rules to make parsing it easier.

Nope: since HTTP/2, that is just a textual representation, not the real on-the-wire protocol. HTTP/2 is 10 years old now.

By @pknerd - 5 months
Why would someone implement the chunk logic when websockets are here? Am I missing something? What are the use cases?