Median filtering: naive algorithm, histogram-based, binary tree, and more (2022)
The blog post explains median filtering in image analysis, covering techniques like percentile filters. It explores algorithms for computing median filters and compares their efficiency based on kernel size and complexity.
Read original article

The blog post discusses median filtering in image analysis. It explains how the median filter works by replacing each pixel with the median value of its neighborhood in the input image, and it covers related percentile and rank filters and their applications in denoising. Several algorithms for computing the median filter are explored, including a naive algorithm, a histogram-based algorithm, and a binary tree algorithm, and their efficiency is compared as a function of kernel size and complexity. Additionally, it introduces a constant-time algorithm proposed by Perreault and Hébert for 8-bit images and square kernels. The post concludes with timing comparisons of the different implementations, showcasing the computational efficiency of each method. Overall, the blog provides a comprehensive overview of median filtering techniques and their computational implications in image processing.
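The histogram-based algorithm mentioned above (Huang's classic sliding-histogram approach) can be sketched as follows. This is an illustrative version, not the post's own code; the function names `median_filter_histogram` and `_median_from_hist` are invented here, and the sketch assumes an 8-bit grayscale image:

```python
import numpy as np

def median_filter_histogram(img, k):
    """Sliding-histogram median filter for 8-bit images.

    img: 2D uint8 array; k: odd kernel side length.
    As the kernel slides one pixel to the right, the 256-bin
    histogram is updated by subtracting the leaving column and
    adding the entering one, instead of being rebuilt from scratch.
    """
    r = k // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    out = np.empty_like(img)
    half = (k * k) // 2  # 0-indexed rank of the median element

    for y in range(h):
        # Build the histogram for the first window of this row.
        hist = np.zeros(256, dtype=np.int32)
        for v in padded[y:y + k, 0:k].ravel():
            hist[v] += 1
        out[y, 0] = _median_from_hist(hist, half)
        for x in range(1, w):
            # Slide right: remove the leaving column, add the entering one.
            for v in padded[y:y + k, x - 1]:
                hist[v] -= 1
            for v in padded[y:y + k, x + k - 1]:
                hist[v] += 1
            out[y, x] = _median_from_hist(hist, half)
    return out

def _median_from_hist(hist, half):
    # Walk the histogram until the cumulative count passes the median rank.
    acc = 0
    for value in range(256):
        acc += hist[value]
        if acc > half:
            return value
    return 255
```

Each horizontal step costs O(k) histogram updates plus an O(256) scan, versus O(k² log k) for the naive sort-the-window approach.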
Related
Beating NumPy's matrix multiplication in 150 lines of C code
Aman Salykov's blog delves into high-performance matrix multiplication in C, surpassing NumPy with OpenBLAS on AMD Ryzen 7700 CPU. Scalable, portable code with OpenMP, targeting Intel Core and AMD Zen CPUs. Discusses BLAS, CPU performance limits, and hints at GPU optimization.
Beating NumPy's matrix multiplication in 150 lines of C code
Aman Salykov's blog explores high-performance matrix multiplication in C, surpassing NumPy with OpenBLAS on AMD Ryzen 7700 CPU. Scalable, portable code optimized for modern CPUs with FMA3 and AVX instructions, parallelized with OpenMP for scalability and performance. Discusses matrix multiplication's significance in neural networks, BLAS libraries' role, CPU performance limits, and optimizing implementations without low-level assembly. Mentions fast matrix multiplication tutorials and upcoming GPU optimization post.
Beating NumPy matrix multiplication in 150 lines of C
Aman Salykov's blog explores high-performance matrix multiplication in C, surpassing NumPy with OpenBLAS on AMD Ryzen 7700 CPU. Scalable, portable code optimized for modern CPUs with OpenMP directives for parallelization. Discusses BLAS libraries, CPU performance limits, and matrix multiplication optimization.
C++ Design Patterns for Low-Latency Applications
The article delves into C++ design patterns for low-latency applications, emphasizing optimizations for high-frequency trading. Techniques include cache prewarming, constexpr usage, loop unrolling, and hotpath/coldpath separation. It also covers comparisons, datatypes, lock-free programming, and memory access optimizations. Importance of code optimization is underscored.
Memory and ILP handling in 2D convolutions
A 2D convolution operation extracts image features using filters, converting signals to tensors. Cross-correlation is used for symmetric signals. Memory optimization and SIMD instructions enhance efficiency in processing MNIST images.
Never saw the binary tree approach. And this article, written in the summer of 2022, missed the 2D wavelet approach published later that year.
https://cgenglab.github.io/en/publication/sigga22_wmatrix_me....
There is also no way to split up the median computation
What does this mean here? It seems like we could have a rolling window by adding and subtracting pixels along the way. I've coded this before, although it's not O(1) like the algorithm described at the end.
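The rolling-window idea the comment describes, pushed one level further, is essentially what the constant-time Perreault–Hébert algorithm does: keep a 256-bin histogram per image column and update the kernel histogram by subtracting the leaving column's histogram and adding the entering one, so the per-pixel cost no longer grows with k. A rough, unoptimized sketch under the assumption of an 8-bit input (the function name is invented here):

```python
import numpy as np

def median_filter_const_time(img, k):
    """Sketch of the column-histogram (Perreault–Hébert style) idea
    for 8-bit images. Per pixel: two column-histogram updates plus a
    fixed 256-bin median scan, independent of kernel size k.
    """
    r = k // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    pw = padded.shape[1]
    half = (k * k) // 2  # 0-indexed rank of the median element
    out = np.empty_like(img)

    # One histogram per padded column, covering the first k rows.
    col_hist = np.zeros((pw, 256), dtype=np.int32)
    for x in range(pw):
        for v in padded[0:k, x]:
            col_hist[x, v] += 1

    for y in range(h):
        if y > 0:
            # Slide every column histogram down by one row.
            for x in range(pw):
                col_hist[x, padded[y - 1, x]] -= 1
                col_hist[x, padded[y + k - 1, x]] += 1
        # Kernel histogram for the first window of this row.
        hist = col_hist[0:k].sum(axis=0)
        out[y, 0] = np.searchsorted(np.cumsum(hist), half + 1)
        for x in range(1, w):
            # O(1) in k: swap one whole column histogram for another.
            hist -= col_hist[x - 1]
            hist += col_hist[x + k - 1]
            out[y, x] = np.searchsorted(np.cumsum(hist), half + 1)
    return out
```

The 256-bin scan is a constant factor, which is why the original paper restricts itself to 8-bit images; higher bit depths would need tiered histograms.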