June 24th, 2024

Copy-on-Write Performance and Debugging

The article discusses Copy-on-Write (CoW) linking in Dev Drive for Windows systems, enhancing performance during repo builds. CoW benefits C# projects, with upcoming Windows updates enabling CoW by default for faster builds.

The article discusses the performance and debugging aspects of Copy-on-Write (CoW) linking in the context of Dev Drive and Windows operating systems. Dev Drive, introduced in Windows 11 and soon to be part of Windows Server 2025, utilizes CoW linking to enhance performance during repo builds. The analysis reveals varying performance improvements across different types of codebases, with C# projects benefiting significantly from CoW linking. The article also provides insights on identifying CoW links using fsutil commands and offers guidance on using tools like ProcMon and Microsoft Performance Recorder with Dev Drive. Additionally, it addresses potential issues like managing leaked CoW references and outlines steps to fix them. The upcoming 24H2 Windows release will have CoW enabled by default for Dev Drive, promising faster builds, especially for C# projects. The article recommends integrating the CopyOnWrite SDK into MSBuild repos and creating a Dev Drive partition for improved build performance.

Related

Is 2024 the year of Windows on the Desktop?

In 2024, the author reviews Windows 11, highlighting challenges like limited hardware support, lack of installation control, manual driver search, slow updates, and UI lag. They compare favorably to Linux distributions.

Arm64EC – Build and port apps for native performance on Arm

Arm64EC is a new ABI for Windows 11 on Arm devices, offering native performance benefits and compatibility with x64 code. Developers can enhance app performance by transitioning incrementally and rebuilding dependencies. Specific tools help identify Arm64EC binaries and guide the transition process for Win32 apps.

Software Engineering Practices (2022)

Gergely Orosz sparked a Twitter discussion on software engineering practices. Simon Willison elaborated on key practices in a blog post, emphasizing documentation, test data creation, database migrations, templates, code formatting, environment setup automation, and preview environments. Willison highlights the productivity and quality benefits of investing in these practices and recommends tools like Docker, Gitpod, and Codespaces for implementation.

Optimizing the Roc parser/compiler with data-oriented design

The blog post explores optimizing a parser/compiler with data-oriented design (DoD), comparing Array of Structs and Struct of Arrays for improved performance through memory efficiency and cache utilization. Restructuring data in the Roc compiler showcases enhanced efficiency and performance gains.

Windows File Explorer will be more powerful with version control and 7z

Microsoft is updating File Explorer with Git integration for version control and native support for the 7-zip and TAR compression formats. The changes, announced at Microsoft Build, aim to improve project management and file organization for users.

10 comments
By @bhouston - 4 months
I had to read an earlier blog post to figure out what it was:

https://devblogs.microsoft.com/engineering-at-microsoft/dev-...

"Copy-on-write (CoW) linking, also known as block cloning in the Windows API documentation, avoids fully copying a file by creating a metadata reference to the original data on-disk. CoW links are like hardlinks but are safe to write to, as the filesystem lazily copies the original data into the link as needed when opened for append or random-access write. With a CoW link you save disk space and time since the link consists of a small amount of metadata and they write fast."
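
The behaviour described in the quote can be illustrated with a toy model. This is only a sketch under stated assumptions: `BlockStore` and `CowFile` are hypothetical names invented for illustration, blocks are whole byte strings, and nothing here reflects how ReFS actually stores metadata. It shows only the general idea that a clone shares blocks by reference and gets a private copy of a block on first write.

```python
class BlockStore:
    """Toy block store (hypothetical): blocks are shared via reference counts."""

    def __init__(self):
        self.blocks = {}   # block id -> bytes
        self.refs = {}     # block id -> number of files referencing it
        self.next_id = 0

    def alloc(self, data):
        """Store new data in a fresh block with a single reference."""
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = data
        self.refs[bid] = 1
        return bid

    def share(self, bid):
        """Add one more reference to an existing block (no data copied)."""
        self.refs[bid] += 1
        return bid

    def release(self, bid):
        """Drop one reference; free the block when nobody uses it."""
        self.refs[bid] -= 1
        if self.refs[bid] == 0:
            del self.blocks[bid], self.refs[bid]


class CowFile:
    """Toy file (hypothetical): an ordered list of block ids in a BlockStore."""

    def __init__(self, store, chunks=()):
        self.store = store
        self.block_ids = [store.alloc(c) for c in chunks]

    def clone(self):
        """Create a CoW link: only metadata (block ids) is duplicated."""
        twin = CowFile(self.store)
        twin.block_ids = [self.store.share(b) for b in self.block_ids]
        return twin

    def write_block(self, index, data):
        """Overwrite one block, un-sharing it first if another file uses it."""
        old = self.block_ids[index]
        if self.store.refs[old] > 1:
            # Shared block: leave the original intact, write to a private one.
            self.store.release(old)
            self.block_ids[index] = self.store.alloc(data)
        else:
            # Sole owner: safe to overwrite in place.
            self.store.blocks[old] = data

    def read(self):
        return b"".join(self.store.blocks[b] for b in self.block_ids)


store = BlockStore()
original = CowFile(store, [b"AAAA", b"BBBB"])
link = original.clone()          # no block data copied yet
link.write_block(1, b"CCCC")     # only this block becomes private to the link
```

After the write, the two files still share their first block, and the original's contents are unchanged; that is the sense in which a CoW link is "safe to write to" where a hardlink is not.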

It seems there is a macOS implementation: https://github.com/dotnet/runtime/pull/79243

But it seems that this is .NET specific and not something that would speed up other build systems? It is unclear whether this can apply to build technologies other than .NET. Can it speed up TypeScript/JavaScript builds? Can it speed up Rust builds? Also, what are the speedups on other platforms like macOS and Linux?

Is this something that all build systems and all OSes would benefit from?

I guess this blog post for me raises more questions than it answers.

By @justinlloyd - 4 months
Have been running ReFS on a drive on my Windows 10 workstation for about three years, and have been using a Dev Drive equivalent on Windows 10 for the past two months. Our Unreal Engine project is quite large, 600+GB straight from the P4 depot before building. I need to keep a few separate workspaces around: one for current development work, one for swarm reviews, and one for "let me test out a thing that might break", because as we know, branching in Perforce can be quite painful, especially on large depots. At one point I needed to have dozens of workspaces synced to specific changelists whilst we hunted down a bug in one of our levels.

ReFS, with block de-duplication and LZ4 compression, has reduced the per-workspace footprint to around 10% of what it was previously. Decreased build times by around 5% and decreased archive, stage and package times by about 80% by deploying MSBuild SDK CopyOnWrite. I also moved the DDC onto the VHDX where the project resides, which has further reduced the footprint of the project.

Windows 11 canary channel (still in canary I think) has a modified Win32 that supports CoW in CopyFileEx. You can get similar gains by other means on Win10 and Win11 by using ReFS CoW-aware utilities.

Have used XFS, BTRFS, APFS and others extensively over the years, so I am glad that Windows is finally getting in on the action.

By @tedunangst - 4 months
> We ran into this problem on one machine that had run continuous CoW builds for weeks under a prerelease CoW-in-Win32 implementation, so we don’t expect this to appear in the wild very often.

That's not exactly confidence inspiring.

By @Joker_vD - 4 months
You know, I've always been kinda amused that something very simple like "cat a b >c" or even "fa = open("a", O_APPEND | O_WRONLY); fb = open("b", O_RDONLY); sendfile(fa, fb, NULL, 0x7ffff000);" has neither a user-visible specialized API nor under-the-hood speedups in the FS implementations. It's just gluing two files together; it's got to be a very popular operation, about as popular as "prepend the contents of file A to file B". But you can't do it in place, which is kinda annoying when you have to preserve the existing files' attributes.
By @mgerdts - 4 months
I think they use a lot of extra words to say that ReFS will support the equivalent of cp --reflink.
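
For readers unfamiliar with `cp --reflink`, the Linux side of that comparison can be sketched as follows. This is a minimal, Linux-only illustration, not the Windows or ReFS mechanism: `clone_or_copy` is a hypothetical helper name, the `FICLONE` request number is taken from the Linux `<linux/fs.h>` header, and the clone only succeeds on filesystems that support reflinks (Btrfs, or XFS with reflink enabled, for example). On anything else the sketch falls back to an ordinary full copy.

```python
import fcntl   # Unix-only module; this sketch assumes Linux
import shutil

# ioctl request number for FICLONE, as defined in <linux/fs.h>.
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Try a reflink clone of src to dst; fall back to a full copy.

    Returns "reflink" if the filesystem shared the blocks, else "copy".
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # Not supported here (or src and dst are on different
            # filesystems): fall through to a regular copy.
            pass
    shutil.copyfile(src, dst)
    return "copy"
```

Either way the destination ends up with the same contents; the difference is whether the data blocks were duplicated on disk or merely referenced.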
By @forrestthewoods - 4 months
Oh man. My dream Git Successor combines a Virtual File System with a Copy-on-Write cache to allow repos to trivially commit all their dependencies including compiler toolchains.

Windows having CoW makes my far fetched dream a possibility.

By @Rakshith - 4 months
Does anyone know if there is a way to convert the entire Windows installation and all files into a Dev Drive format, without data loss or corruption?
By @tester756 - 4 months
>Dev Drive was released

I tried that Dev Drive thing and I haven't seen a perf improvement when building C++ code, sadly.

By @42lux - 4 months
And with all that said WSL2 still buffers file transfers in RAM...
By @whalesalad - 4 months
Crazy to see Microsoft talking about performance like they have any expertise in the matter.