Adventures in Probability
The author reflects on their limited exposure to statistics, emphasizing the importance of probability, the exponential distribution in queueing theory, and the CoDel algorithm for effective queue management.
Read original articleThe article discusses the author's reflections on their limited exposure to statistics and probability during school, particularly highlighting a frustrating experience with a left-handed professor. The author emphasizes the importance of understanding probability and statistics, especially in relation to managing uncertainty. They delve into the concept of the exponential distribution, which frequently appears in various fields such as queueing theory and performance modeling. The author explains how the exponential distribution is foundational for simulating events, particularly in queue management systems. They introduce the CoDel algorithm for queue management and describe how to simulate Poisson point processes (PPPs) for modeling event arrivals and completions. The author notes the unique memoryless property of the exponential distribution, which allows for certain simplifications in modeling. They express a desire to communicate the significance of these concepts more effectively, reflecting on their own learning journey and the challenges faced in grasping these statistical principles.
- The author regrets not taking more statistics classes in school.
- The exponential distribution is crucial in various fields, including queueing theory.
- The CoDel algorithm is introduced as a method for queue management.
- The memoryless property of the exponential distribution is highlighted as unique and significant.
- The author reflects on their learning process and the challenges of understanding these concepts.
Related
Everyday is a Birthday A Journey to a classic problem through Math and Rust
This article explores the classic probability problem of ensuring every day of the year is represented by a birthday in a group of people, using mathematical concepts like the Coupon Collector Problem and Inclusion-Exclusion Principle. It calculates needing around 2364 students, adjusting to 2669 with leap days, and discusses the gamma function for probability calculation.
Do not confuse a random variable with its distribution
In probability theory, a random variable and its distribution are distinct concepts. The random variable depends on random outcomes, while the distribution shows variation patterns. Changing variables or probabilities alters distributions. Examples clarify shared distributions with different outcomes, crucial for interpreting probabilities and simulations.
Visualizing Algorithms
Visualization enhances understanding of algorithms, with sampling being crucial in graphics. Bridson’s Poisson-disc sampling improves distribution, while the Fisher-Yates shuffle efficiently rearranges elements, aiding education and comprehension.
How networking affects distributed systems
The article outlines challenges in distributed systems, including network unreliability, inconsistency, probabilistic outcomes, and bandwidth limitations, emphasizing the need for robust design and failure management strategies.
Sampling with SQL
Tom Moertel's blog post discusses SQL sampling algorithms, particularly the A-ES algorithm for weighted sampling without replacement, emphasizing numerical stability and performance optimizations in column-oriented storage formats.
- Many commenters express frustration with traditional teaching methods in statistics, noting a lack of intuitive understanding.
- There are discussions about the complexities of modeling arrival and completion rates in Poisson processes, highlighting dependencies that challenge the validity of certain models.
- Some commenters suggest that neural networks could potentially outperform traditional algorithms by capturing subtle patterns in real-world data.
- Several individuals share personal experiences with research or projects related to probabilistic models and queue management.
- Overall, there is a call for more engaging and intuitive approaches to teaching these mathematical concepts.
So many of my college math classes had some version of this professor who took a fascinating subject like linear algebra, statistics or algorithms and made it into a slog. The fact that most stats is taught by getting students to just memorize random ideas rather than building up a holistic and intuitive view really is a travesty.
Also makes sense why so many people, even though they took stats in college, hav e such a poor understanding of probability.
Incidentally, this also goes for the determinant of a matrix. It's got a lot of neat and desirable properties, and it turns out to be the only thing that does. When it was finally taught to me this way, those weird algorithms we use to compute this seemingly-arbitrary number finally made sense. (And, in fact, this is the easiest way to prove that all those algorithms have to be computing the same seemingly-arbitrary number. Because the algorithms preserve the properties that define The Determinant, and The Determinant is the unique thing that preserves all of those properties, so must those algorithms all be computing The Determinant, no matter how different they might look.)
So I can vouch that this style of explanation really does work, at least for people like me.
Splitting a Poisson process into two lower rate processes is a neat trick. Even better is that you can do the same to convert a Poisson process into one with a variable rate, provided that rate is lower than the original (original may be variable as well).
And the fact that the partial sums of a bunch of exponential distributions results in the same distribution of values as picking Poisson(lambda * time) values uniformly at random is pure magic.
Either mathematically or just some python dice rolls.
Really good. Same ethos as fast.ai courses.
Finally that you can add tbe /3 and /5 to get a /8 distribution makes intuitive semsw to me. / means lambda.
This is because if you have people arriving to a train station you could split them by eye colour and there is no reason a particular eye colour cause there to be dependencies. (Assuming spherical cows: families arriving excepted. Assume it is downtown rush hour)
For example, customers arrive at the grocery store in clusters due to traffic lights, schools getting out, etc. Even without direct signals, a NN could potentially pickup on these "rules" given other inputs, e.g. time of day, weather, etc.
?
Edit: I remembered by stats class. merging poisson distributions requires Independent processes. since start and completion are dependent, its not mathematically valid.
but I am an engineer and it works, so whatever
Related
Everyday is a Birthday A Journey to a classic problem through Math and Rust
This article explores the classic probability problem of ensuring every day of the year is represented by a birthday in a group of people, using mathematical concepts like the Coupon Collector Problem and Inclusion-Exclusion Principle. It calculates needing around 2364 students, adjusting to 2669 with leap days, and discusses the gamma function for probability calculation.
Do not confuse a random variable with its distribution
In probability theory, a random variable and its distribution are distinct concepts. The random variable depends on random outcomes, while the distribution shows variation patterns. Changing variables or probabilities alters distributions. Examples clarify shared distributions with different outcomes, crucial for interpreting probabilities and simulations.
Visualizing Algorithms
Visualization enhances understanding of algorithms, with sampling being crucial in graphics. Bridson’s Poisson-disc sampling improves distribution, while the Fisher-Yates shuffle efficiently rearranges elements, aiding education and comprehension.
How networking affects distributed systems
The article outlines challenges in distributed systems, including network unreliability, inconsistency, probabilistic outcomes, and bandwidth limitations, emphasizing the need for robust design and failure management strategies.
Sampling with SQL
Tom Moertel's blog post discusses SQL sampling algorithms, particularly the A-ES algorithm for weighted sampling without replacement, emphasizing numerical stability and performance optimizations in column-oriented storage formats.