Show HN: Dut, a fast Linux disk usage calculator
Codeberg.org hosts "dut," a disk usage calculator for Linux. It accurately counts hard links, offers customizable output, and outperforms similar tools in speed and efficiency, making it valuable for Linux users.
The website Codeberg.org hosts a disk usage calculator for Linux called "dut." This tool accurately counts hard links and provides ASCII output compatible with Linux terminals. Users can customize the output format by adjusting command-line arguments. The calculator displays the size of entries on the disk, accounting for shared space due to hard links. It offers options to limit rows and depth shown. The tool is a single C source file and can be compiled with a C11 compiler. Benchmark tests show that "dut" performs well, especially after Linux disk caches are populated. It outperforms other programs like du from coreutils, dua, pdu, dust, and gdu in various scenarios. The benchmarks demonstrate the tool's speed and efficiency in calculating disk usage, making it a valuable resource for Linux users.
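The hard-link accounting mentioned above can be illustrated with a minimal Python sketch. This is not dut's implementation (dut is a C program). It only shows the general idea of counting each (device, inode) pair once. The function name `disk_usage` and its `dedupe_hard_links` parameter are hypothetical, chosen for illustration.

```python
import os
import stat


def disk_usage(root, dedupe_hard_links=True):
    """Return allocated bytes under root (hypothetical helper, not dut's code).

    With dedupe_hard_links=True, each (device, inode) pair is counted once,
    so a file reachable through several hard links is not double counted.
    """
    seen = set()      # (st_dev, st_ino) pairs already counted
    total = 0
    stack = [root]
    while stack:
        path = stack.pop()
        st = os.lstat(path)               # lstat: never follow symlinks
        key = (st.st_dev, st.st_ino)
        if dedupe_hard_links:
            if key in seen:
                continue                  # another name for an inode we counted
            seen.add(key)
        total += st.st_blocks * 512       # st_blocks is in 512-byte units
        if stat.S_ISDIR(st.st_mode):
            with os.scandir(path) as entries:
                stack.extend(entry.path for entry in entries)
    return total
```

Calling `disk_usage(path, dedupe_hard_links=False)` gives the naive total, so the difference between the two calls is the space that hard links would otherwise cause to be counted more than once.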
Related
Htop explained – everything you can see in htop on Linux (2019)
This article explains htop, a Linux system monitoring tool. It covers uptime, load average, processes, memory usage, and more. It details htop's display, load averages, process IDs, procfs, and process tree structure. Practical examples are provided for system analysis.
Diff-pdf: tool to visually compare two PDFs
The GitHub repository offers "diff-pdf," a tool for visually comparing PDF files. Users can highlight differences in an enhanced PDF or use a GUI. Precompiled versions are available for various systems, with installation instructions.
Background of Linux's "file-max" and "nr_open" limits on file descriptors (2021)
The Unix background of Linux's 'file-max' and 'nr_open' kernel limits on file descriptors dates back to early Unix implementations like V7. These limits, set during kernel compilation, evolved to control resource allocation efficiently.
My List of CLI Gems
The article discusses various CLI gems for package management, categorized into sections like Utilities, Git tools, and more. Highlighted gems include fzf, bat, lazygit, tmux, and dua-cli for different functionalities.
Show HN: Xcapture-BPF – like Linux top, but with Xray vision
0x.tools simplifies Linux application performance analysis without requiring upgrades or heavy frameworks. It offers thread monitoring, CPU usage tracking, system call analysis, and kernel wait location identification. The xcapture-bpf tool enhances performance data visualization through eBPF. Installation guides are available for RHEL 8.1 and Ubuntu 24.04.
Maybe there could be an iterative breadth-first approach, where first you quickly identify and discard the small unimportant items, passing over anything that can't be counted quickly. Then with what's left you identify the smallest of those and discard, and then with what's left the smallest of those, and repeat and repeat. Each pass through, you get a higher resolution picture of which directories and files are using the most space, and you just wait until you have the level of detail you need, but you get to see the tally as it happens across the board. Does this exist?
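One way to picture the progressive part of this idea is a breadth-first scan that reports a running tally per top-level entry after each depth level. The Python sketch below is only an illustration under that assumption: the name `progressive_usage` is hypothetical, it does not discard small items between passes, and it does not deduplicate hard links.

```python
import os
import stat


def progressive_usage(root):
    """Yield (depth, totals) after each directory level is scanned.

    totals maps each top-level entry name to the allocated bytes found so far,
    so every yielded snapshot is a more detailed picture than the previous one.
    """
    totals = {}
    # Each work item is (path, name of the top-level entry it is charged to).
    with os.scandir(root) as entries:
        level = [(entry.path, entry.name) for entry in entries]
    depth = 1
    while level:
        next_level = []
        for path, top in level:
            st = os.lstat(path)
            totals[top] = totals.get(top, 0) + st.st_blocks * 512
            if stat.S_ISDIR(st.st_mode):
                try:
                    with os.scandir(path) as entries:
                        next_level.extend((e.path, top) for e in entries)
                except OSError:
                    pass  # unreadable directory: skip it, keep the tally going
        yield depth, dict(totals)  # copy, so earlier snapshots stay unchanged
        level = next_level
        depth += 1
```

A caller could stop iterating as soon as the ranking of top-level entries looks stable, for example `for depth, totals in progressive_usage("."): print(depth, totals)`.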
The best disk usage UI I ever saw was this one: https://www.trishtech.com/2013/10/scanner-display-hard-disk-... The inner circle is the top level directories, and each ring outwards is one level deeper in the directory hierarchy. You would mouse over large subdirectories to see what they were, or double click to drill down into a subdirectory. Download it and try it - it is quite spectacularly useful on Windows (although I'm not sure how well it handles terabyte-size drives - I haven't used Windows for a long time).
Hard to do a circular graph in a terminal...
It is very similar to a flame graph? Perhaps look at how flame graphs are drawn by other terminal performance tools.
I have the suspicion that some file systems store stat info next to the getdents entries.
Thus cache locality would kick in if you stat a file after receiving it via getdents (and counterintuitively, smaller getdents buffers make it faster then). Also in such cases it would be important to not sort combined getdents outputs before starting (which would destroy the locality again).
I found such a situation with CephFS but don't know what the layout is for common local file systems.
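To experiment with this hypothesis, one could stat the entries of a single directory either in the order the OS returns them or after sorting by name, and compare timings. The Python sketch below assumes a hypothetical helper `stat_all`. Python's `os.scandir` does not expose the getdents buffer size, so that part of the idea cannot be tested this way, and any timing difference depends on the file system and on what is already cached.

```python
import os
import time


def stat_all(dirpath, sort_first=False):
    """lstat every entry of one directory.

    Returns (total_allocated_bytes, seconds_spent_in_lstat).
    With sort_first=True the names are sorted before statting, which throws
    away the order in which the directory listing delivered them.
    """
    with os.scandir(dirpath) as entries:
        names = [entry.name for entry in entries]  # order as returned by the OS
    if sort_first:
        names.sort()
    start = time.perf_counter()
    total = 0
    for name in names:
        total += os.lstat(os.path.join(dirpath, name)).st_blocks * 512
    return total, time.perf_counter() - start
```

For a meaningful comparison, each variant would need to run against a cold cache, since the second run over the same directory will mostly hit cached inodes.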
ETA: Apparently the value in /proc/sys/vm/vfs_cache_pressure makes a huge difference. With the default of 100, my dentry and inode caches never grow large enough to contain the ~15M entries in my homedir. Dentry slabs get reclaimed to stay < 1% of system RAM, while the xfs_inode slab cache grows to the correct size. The threads in dut are pointless in this case because the access to the xfs inodes serializes.
If I set this kernel param to 15, then the caches grow to accommodate the tens of millions of inodes in my homedir. Ultimately the slab caches occupy 20GB of RAM! When the caches are working the threading in dut is moderately effective, job finishes in 5s with 200% CPU time.
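For anyone who wants to check the setting before benchmarking, the value can be read from procfs. A minimal Python sketch follows. The helper name `read_vfs_cache_pressure` is made up for illustration, and the `path` parameter exists only so the function can be pointed at a different file.

```python
def read_vfs_cache_pressure(path="/proc/sys/vm/vfs_cache_pressure"):
    """Return the current vm.vfs_cache_pressure value as an integer."""
    with open(path) as f:
        return int(f.read().strip())
```

Changing the value requires root, for example with `sysctl -w vm.vfs_cache_pressure=15`. Lower values make the kernel more inclined to keep dentry and inode caches, at the cost of the memory they occupy.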
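#!/bin/sh
# Show the size of each entry one level deep, largest first, in human-readable units.
du -k --max-depth=1 "$@" | sort -nr | awk '
BEGIN {
    # du -k reports sizes in units of 1024 bytes, so the smallest unit is KB.
    split("KB,MB,GB,TB", Units, ",");
}
{
    u = 1;
    # Scale the size down until it fits the current unit (stop at TB).
    while ($1 >= 1024 && u < 4) {
        $1 = $1 / 1024;
        u += 1
    }
    $1 = sprintf("%.1f %s", $1, Units[u]);
    print $0;
}
'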
alias duwim='du --apparent-size -c -s -B1048576 * | sort -g'
It produces a similar output, showing a list of directories and their sizes under the current dir. The name "duwim" stands for "du what I mean". It came naturally after I dabbled for quite a while to figure out how to make du do what I mean.
Hi, how about a flame graph? I have always wanted to display the file hierarchy in a flame-graph-like format.
- previous discussion: https://x.com/laixintao/status/1744012609983295816
- my work on displaying flame graphs in the terminal: https://github.com/laixintao/flameshow
I gave `dut` a try, but I'm confused by its output. For example:
3.2G 0B |- .pyenv
3.4G 0B | /- toolchains
3.4G 0B |- .rustup
4.0G 0B | |- <censored>
4.4G 0B | /- <censored>
9.2G 0B |- Work
3.7G 0B | /- flash
3.8G 0B | /- <censored>
16G 4.0K |- Downloads
5.1G 0B | |- <censored>
5.2G 0B | /- <censored>
16G 0B |- Projects
3.2G 42M | /- <censored>
17G 183M |- src
17G 0B | /- <censored>
17G 0B |- Videos
3.7G 0B | /- Videos
28G 0B |- Music
6.9G 0B | |- tmp
3.4G 0B | | /- tmp
8.8G 0B | |- go
3.6G 0B | | /- .versions
3.9G 0B | | |- go
8.5G 0B | | | /- dir
8.5G 0B | | | /- vfs
8.5G 0B | | | /- storage
8.5G 0B | | /- containers
15G 140M | /- share
34G 183M /- .local
161G 0B .
- I expected the output to be sorted by the first column, yet some items are clearly out of order. I don't use hard links much, so I wouldn't expect this to be because of shared data.
- The tree rendering is very confusing. Some directories are several levels deep, but in this output they're all jumbled, so it's not clear where they exist on disk. Showing the full path with the `-p` option, and removing indentation with `-i 0` somewhat helps, but I would almost remove tree rendering entirely.
0 - https://linux.die.net/man/3/fts_read
1 - https://github.com/ttkb-oss/dedup/blob/6a906db5a940df71deb4f...
I wasn't aware that there was a rewrite of ncdu in Zig. That link is a nice read.
One comment: I find the benchmark results really cumbersome to read. Why don't you make a graph (e.g. a barplot) that would make the results obvious at a quick glance? I'm a strong believer in presenting numerical data graphically whenever possible; it avoids many mistakes and misunderstandings.
#!/bin/bash
du -hs * .??* 2> /dev/null | sort -h | tail -22
One small surprise came when I had a symlink to a directory and referred to it with a trailing "/": dut doesn't follow the link in order to scan the real directory. That is, I have this symlink:
ln -s /big/disk/dev ~/dev
then ./dut ~/dev/
returns zero size while du -sh ~/dev/
returns the full size. I'm not sure how widespread this convention of resolving symlinks to their target directories when named with a trailing "/" is, but it's one my fingers have memorized.
In any case, this is another tool for my toolbox. Thank you for sharing it.
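The trailing-slash convention described above comes from pathname resolution in the kernel: a path ending in "/" forces the final component to be resolved as a directory, so even `lstat` sees the target rather than the link. The Python sketch below demonstrates only that behaviour on Linux. It makes no claim about how dut processes its arguments, and the helper names `describe` and `demo` are hypothetical.

```python
import os
import stat
import tempfile


def describe(path):
    """Classify path using lstat, which does not follow a final symlink."""
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


def demo():
    """Create a directory plus a symlink to it, then classify both spellings."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "real")
        os.mkdir(target)
        link = os.path.join(tmp, "dev")
        os.symlink(target, link)
        # Same link, without and with a trailing slash.
        return describe(link), describe(link + "/")
```

On Linux, `demo()` is expected to return `("symlink", "directory")`. A tool that strips the trailing slash before calling `lstat`, or that never follows symlinks given on the command line, would therefore see only the link itself.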
Well I can just try :)
Console: Rust: cargo install diskonaut; Python: pip install ohmu. GUI: gdmap. Windows: WinDirStat. Mac: GrandPerspective (I seem to recall).