Falsehoods programmers believe about TCP
The discussion covers the relative merits of NetworkManager and `wpa_supplicant` for wireless connections, misconceptions about TCP reliability, and the complexity of achieving consensus over unreliable links, which feeds into problems like bufferbloat.
The discussion revolves around the use of NetworkManager versus networkd for network management, particularly in the context of wireless connectivity issues. A user shared their experience of switching from NetworkManager to `wpa_supplicant` due to unreliable wireless connections, which led to applications misinterpreting network status during packet loss. This prompted a broader commentary on misconceptions about TCP reliability, highlighting several common false beliefs programmers hold regarding TCP's behavior and reliability. The conversation also touched on the challenges of achieving consensus over unreliable links and the limitations of TCP in ensuring message delivery. Another participant argued that while it is difficult to guarantee consensus on all bytes transferred, it is possible to agree on a subset of bytes that have been successfully received. The discussion concluded with a note on how misconceptions about network protocols can lead to issues like bufferbloat, particularly when router manufacturers overlook these complexities.
- The debate centers on the effectiveness of NetworkManager versus `wpa_supplicant` for managing wireless connections.
- Misconceptions about TCP reliability and behavior are highlighted, indicating a need for better understanding among developers.
- Achieving consensus on data transfer over unreliable links is complex, with some bytes being reliably acknowledged while others remain ambiguous.
- The conversation underscores the impact of network protocol misunderstandings on real-world issues like bufferbloat.
Related
Timeliness without datagrams using QUIC
The debate between TCP and UDP for internet applications emphasizes reliability and timeliness. UDP suits real-time scenarios like video streaming, while QUIC with congestion control mechanisms ensures efficient media delivery.
Beyond bufferbloat: End-to-end congestion control cannot avoid latency spikes
End-to-end congestion control methods like TCP and QUIC face challenges in preventing latency spikes, especially in dynamic networks like Wi-Fi and 5G. Suggestions include anticipating capacity changes and prioritizing latency-sensitive traffic for a reliable low-latency internet.
Golang is evil on shitty networks (2022)
The impact of Golang's default setting disabling Nagle's algorithm on network performance is discussed. Concerns include slow uploads, increased latency, and network saturation, questioning the decision's efficiency and suggesting considerations for optimization.
Private Internet
The article highlights the inadequacies of current internet protocols regarding security and privacy, advocating for a new protocol with features like non-sensitive addresses and DoS resistance, while suggesting onion routing.
I wish (Linux) WireGuard had a simple way to restrict peer public IPs
Chris Siebenmann highlights WireGuard's limitations in restricting peer public IP addresses, suggesting the need for multiple ports or interfaces for security, and proposes potential kernel-level extensions for better peer management.
- Several commenters express frustration with the article's lack of clarity and depth, particularly regarding the "falsehoods" it presents.
- There is a discussion about the nature of TCP packets and the misconceptions surrounding them, with some arguing that the article's statements are contradictory.
- Commenters highlight the importance of error correction and the complexities of network communication, especially in unreliable environments.
- Real-world examples are shared, illustrating the challenges of message delivery over flaky connections.
- Some suggest that the article oversimplifies or misrepresents technical concepts, leading to misunderstandings among readers.
> 5. There is a such thing as a TCP packet
> 6. There is no such thing as a TCP packet
I don't understand this at all. Either the concept of a TCP packet exists, or the concept does not exist. Even if it's not being used in certain scenarios, I don't see how you can argue that "there's no such thing" any of the time. This might just be me misunderstanding whatever point they're trying to make, but I don't remember ever having such philosophical confusion from anything in any other "falsehoods programmers believe about..." article before.
Additionally, I'm sure they're aware that HTTP over TLS has encrypted data frames, which would be unreceivable in a lot of cases if these situations arose a bunch. And considering how much of the modern Internet is built on this paradigm, I think that many of these points are rare and probably extremely pedantic.
This is coming from someone who agrees with much of the nuance implied (but not explained!) by the post.
All great technical writing (which I assume these clickbait articles are at least attempting to be) is written with mutual discovery and deeper understanding in mind, and if you leave no actual explanation in the post, you can't really achieve either of those.
The real question is: why should this be a problem that TCP must solve? TCP gives you a bidirectional, water-flow-like pipe, and that's enough for you to create many useful applications. TCP never provided a guarantee of correct delivery; that's your job.
For example, if an HTTP request is interrupted before the response is received, the sender should assume the request never reached the server and try again with a new connection, while the server should mitigate duplicate requests (reject them or return a success code).
Well, maybe that's the point of the article, because many web pages get confused if you send duplicate requests to them.
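As a rough sketch of the server side of that idea (the handler, route, and in-memory store here are illustrative, not from the comment): the service keys each request on a client-supplied `Idempotency-Key` header so that retries become safe.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"
)

// seen remembers idempotency keys that have already been processed, so a
// client retry of the same logical request does not apply its side effect
// twice. A real service would persist this and expire old entries.
var (
	mu   sync.Mutex
	seen = map[string]bool{}
)

func sendHandler(w http.ResponseWriter, r *http.Request) {
	key := r.Header.Get("Idempotency-Key")
	if key == "" {
		http.Error(w, "missing Idempotency-Key", http.StatusBadRequest)
		return
	}
	mu.Lock()
	duplicate := seen[key]
	seen[key] = true
	mu.Unlock()

	if duplicate {
		// The earlier attempt already went through; just confirm it.
		fmt.Fprintln(w, "already processed")
		return
	}
	// ... perform the side effect exactly once here ...
	fmt.Fprintln(w, "ok")
}

func main() {
	http.HandleFunc("/send", sendHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```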
“Mostly” because you still care about bandwidth limits and packet RPS limits and latency of course.
The problem: you're on a subway train and you send a message as it departs a station. The request does get to the server, but by the time the response arrives, the train is already in the tunnel and you don't have a signal any more. So the client thinks that the message failed to send, but it was, in fact, sent successfully. The client would retry when it's back online, and would send another copy of that message.
The solution was to send a client-generated "random ID" with each request. I much later learned that this is conventionally called an "idempotency token". This worked, except there was now another problem: you sometimes receive your own message over the long-polling thing before the response to the request that sent it. You don't know for sure whether it's the message you just sent, or something else sent by a different client on the same account, because you don't know the ID of your message yet. This was solved by me delaying the processing of outgoing messages on the client side until all outstanding messages are fully sent and their IDs are known.
Telegram solved this much more elegantly: when the client reconnects to the server, the server sends it all the responses that were not acknowledged during the previous connection. MTProto has its own acknowledgement mechanism in addition to TCP's.
So yeah, instant messaging seems trivial at first glance, but it turns out that TCP is a leaky enough abstraction that you need to somehow plug those leaks at the application level.
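A minimal sketch of the client side of that "random ID"/idempotency-token approach (the endpoint, JSON shape, and helper names are hypothetical): the client picks the message ID before the first send, so a retry after a lost response carries the same ID and the server can deduplicate it.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
	"net/http"
	"time"
)

// newMessageID generates the client-side "random ID" (idempotency token)
// before the first send attempt, so every retry reuses the same ID.
func newMessageID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// sendMessage keeps retrying until it actually sees a response. Because the
// ID never changes across retries, the server can drop duplicates when the
// first request made it through but the response was lost in the tunnel.
func sendMessage(text string) error {
	id := newMessageID()
	body := fmt.Sprintf(`{"id":%q,"text":%q}`, id, text)
	for {
		resp, err := http.Post("https://chat.example/send",
			"application/json", bytes.NewBufferString(body))
		if err == nil {
			resp.Body.Close()
			return nil // a response arrived, so the server has seen this ID
		}
		time.Sleep(2 * time.Second) // wait until we're back online
	}
}

func main() {
	if err := sendMessage("hello from the subway"); err != nil {
		log.Fatal(err)
	}
}
```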
Recently a new rate limiter for TCP went by that was so terribly, terribly broken, and I cannot help but imagine that most of the containers of the world suffer from Bufferbloat in general.
In what way is that a falsehood?
but you can get round that in a lot of cases by just having a load of TCP connections in parallel.
TCP is cheap and well optimised, especially if you are keeping a bunch of connections open. (opening can be expensive)
so if you have a high-latency connection, or a bit of packet loss, and you want to reach line speed without having to figure out corner cases with UDP, just open up 100-1k TCP connections and multiplex them.
bish bash bosh, mostly line speed over a high-latency line (mind you, this was in the days of 100m-500m cross-Atlantic internet; you'll probably need more connections to saturate a 10-gig line.)
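A rough sketch of that parallel-connections trick in Go (the host, port, and connection count are placeholders, and a real transfer would also need to split and reassemble the data): open many streams and read them concurrently, so no single connection's congestion window caps the aggregate throughput.

```go
package main

import (
	"io"
	"log"
	"net"
	"sync"
)

func main() {
	const conns = 100 // scale up for fatter or lossier pipes
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int64
	)

	for i := 0; i < conns; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c, err := net.Dial("tcp", "data.example.com:9000")
			if err != nil {
				log.Println(err)
				return
			}
			defer c.Close()
			// Each stream pulls its own share; the aggregate is no longer
			// limited by a single connection's congestion window.
			n, _ := io.Copy(io.Discard, c)
			mu.Lock()
			total += n
			mu.Unlock()
		}()
	}
	wg.Wait()
	log.Printf("received %d bytes over %d connections", total, conns)
}
```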
The hashgraph algorithm is pretty sweet too and doesn't have the issue of a single write leader like Paxos and Raft. Basically multi-writers / leaderless
https://www.swirlds.com/downloads/SWIRLDS-TR-2016-01.pdf
But to be fair, I'm not certain that CAP theorem and partition tolerance really belong in a conversation about TCP anyway
Regarding the ack not being received by the sender when the connection breaks: it's a weak and dishonest argument, made in the belief that it strengthens their position while completely ignoring the fact that TCP's reliability depends on the simple and obvious fact that the connection exists!
I mean, that's true, insofar as pipes have incredibly weak guarantees too — after all, the other end of a pipe might be a program reading from/writing to a network socket, or other unreliable transport. Whenever you let your program be plugged into an arbitrary pipe, you have to expect all that same flakiness and then some.
Yeah pretty much.
maybe don't write contradictory unexplained nonsense.
Now that is a very interesting one!
It's sort of related to the question:
"How much of the Internet is accessible from any given point (location, locality, etc.) at any given point of time?"
Which is sort of unknowable, at least, without attempting to connect with every possible connection point on the Internet, which (if it could be done) would still consist of a range of time, and every point in time following that point would bring changes, perhaps small relative to the whole -- but accruing over time -- more and more, as more time elapses...
Observation: That same (or possibly similar!) phenomenon would seem to be at play with respect to the measurement (observation) of quantum systems, i.e., the more certain you are of position, the less certain you are of velocity, and vice-versa...
Well, the more you measure the connectivity to all points of the Internet at one point in time, the less certain you might be of the state of the entire system as more time elapses from that point in time...
But now, why?
Observation: Generally speaking, the larger a system is, the more degrees of freedom it has; and in attempting to "lock down" (know by observation, be "certain" of) the entire state of that system at one point in time, the more those parts of the system with degrees of freedom (how many degrees of freedom does the entire Internet have?) will change/evolve/move/"be subject to change" as more time evolves the state of the system... in other words, if you can know position (instantaneous state) with certainty, then you can't know velocity (where it's heading and/or its future state and/or that which predicts its future state) with certainty!
Sort of like you can know the instantaneous state of the Stock Market and its history... but no one can exactly predict its future (it has many, many degrees of freedom, all of which are subject to change in various unpredictable and bizarre ways!)
Which brings us back to #7:
>"7. If we fail to connect to a well-known remote host, then we must be offline."
We might be offline... but then again, we might not be! (Ping, ICMP, UDP, Telnet and Gopher anyone?)
But then again, we might be!
The Internet's online/offline status (is it really off if it is off? Is it really on if its on?) -- is much like some modern relationships, that is, "It's complicated!" :-)
The Internet is a Black Box!
It's Schrodinger's Internet!
You know, "if a TCP packet travelling at 99.44% of the speed of light on a westbound train track meets a UDP packet travelling at 99.43% of the speed of light on an eastbound train track, then when do they meet?"
You know, "solve for x..."
You know, "assume that the speed of light is constant and that quantum effects are not present!" :-)
A common problem is points which aren't really falsehoods, but from which people frequently draw false conclusions.
E.g. if you ask whether TCP is reliable, especially outside the context of a CS paper, the answer is yes. That is, iff you take a reasonable definition of reliable (one which doesn't expect literally impossible things) and a reasonable interpretation of mostly. Just listing it as a falsehood fails to point out that there are two potential issues with your understanding, while creating the risk that someone with expertise in that sub-field of IT ends up thinking TCP is quite unreliable when it isn't. I mean, the most common usage of the word reliable is a gradient, with its meaning in a yes/no question being a short form of "reliable _enough_". Furthermore, for most use-cases the "unreliable" aspect of TCP isn't even the main relevant misunderstanding people can have with "TCP is mostly reliable" (though for some use cases it is).
The main troublesome misinterpretation is what mostly means. I.e. if you were to give it a rigorous definition, it would be something like "if sampling typical devices used in typical situations across some target audience, then for most target audiences (weighted by audience relevance) most of the sampled devices will, in a sufficiently large long-term moving average, be reliable enough".
What that mainly means:
- even if it's mostly reliable, there will be devices for which it is reliable, devices for which it is unreliable, and everything in between
- similarly, even if it's mostly reliable for a given device, that isn't necessarily the case all the time
- nor does it say anything about the pattern of when the "mostly" doesn't apply, i.e. TCP might be mostly reliable on a device except for 30s every Sunday at 3am, and that would still count as "mostly"
- there are use-cases where unreliability is much more common
- there are audiences for which unreliability is much more common
etc.
Similarly for points 5 and 6 about TCP packets: they are definitely a thing, and there is no falsehood there. The falsehood is in believing you can reliably control them, or that your OS or some middleware isn't messing with them (e.g. splitting/combining/rewriting). So in some situations it's best to pretend there are none, but in other situations you have to care, and this might differ for different parts of the same protocol. So points 5 and 6 make sense, but don't point in a helpful direction.
To be clear, that doesn't mean lists are bad, or that this list is particularly bad, but I wish they had more references/details, even if short and compact, and more clearly separated things.
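As a concrete illustration of the splitting/combining point above (my own sketch, not from the comment): TCP hands the application a byte stream, so two `Write` calls on one side may arrive as a single `Read` on the other, or one `Write` may be split across several reads. Any message boundaries have to be imposed by the application itself, for example with a length prefix.

```go
package framing

import (
	"encoding/binary"
	"io"
	"net"
)

// writeFrame sends a 4-byte big-endian length prefix followed by the payload.
func writeFrame(c net.Conn, payload []byte) error {
	if err := binary.Write(c, binary.BigEndian, uint32(len(payload))); err != nil {
		return err
	}
	_, err := c.Write(payload)
	return err
}

// readFrame reconstructs one message from the byte stream. It never assumes
// that one Write on the sender maps to one Read here: the kernel and any
// middleboxes are free to split or coalesce TCP segments, so io.ReadFull
// keeps reading until the whole frame has arrived.
func readFrame(c net.Conn) ([]byte, error) {
	var length uint32
	if err := binary.Read(c, binary.BigEndian, &length); err != nil {
		return nil, err
	}
	buf := make([]byte, length)
	if _, err := io.ReadFull(c, buf); err != nil {
		return nil, err
	}
	return buf, nil
}
```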