September 28th, 2024

Ask HN: Is distributed systems research slowing down?

Distributed systems research thrived from the late 1990s to early 2010s, producing key systems. Interest has since waned, yet distributed systems remain vital for AI and consensus technologies.

In the late 1990s to early 2010s, distributed systems research was prolific, producing significant theoretical and practical advancements, including systems like Chord, Google BigTable, Spanner, and Facebook's Cassandra and TAO. This period was characterized by a strong focus on building systems capable of handling web-scale and mobile applications. However, in recent times, the perception of distributed systems has shifted from being seen as a frontier for exploration to merely an implementation detail. The current landscape appears less innovative, possibly due to a lack of visibility into ongoing research or a change in focus within the community. Despite this, distributed systems remain crucial, especially in the context of artificial intelligence and distributed consensus technologies, such as those used in cryptocurrency. The author seeks guidance on current significant work in distributed systems that supports AI and distributed consensus, indicating a need for updated keywords or resources to better navigate this evolving field.

- Distributed systems research was highly active from the late 1990s to early 2010s.

- Key systems developed during this time include Chord, Google BigTable, and Facebook's Cassandra.

- The perception of distributed systems has shifted to being seen as implementation details rather than areas of innovation.

- There is a growing relevance of distributed systems in AI and distributed consensus technologies.

- The author is looking for current significant research in distributed systems to better understand the field today.

Related

6 comments
By @warner25 - 2 months
Regarding AI and distributed systems, this might not be what you have in mind, but take a look at federated learning. I'm currently a computer science PhD candidate at a small school, and a couple of our graduates in the past year worked on the fundamentals and applications of federated learning.
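The core aggregation step of federated learning, federated averaging (FedAvg), is easy to sketch: each client trains on its own data, and a central server combines the resulting model weights, weighted by each client's local dataset size. A toy illustration in plain JavaScript (the client objects and weight vectors here are hypothetical, not from any particular framework):

```javascript
// FedAvg aggregation: average each client's model weights,
// weighted by that client's local dataset size.
// `clients` is a hypothetical array of { numSamples, weights } objects.
function fedAvg(clients) {
  const total = clients.reduce((n, c) => n + c.numSamples, 0);
  const dim = clients[0].weights.length;
  const global = new Array(dim).fill(0);
  for (const c of clients) {
    const share = c.numSamples / total; // this client's weighting
    for (let i = 0; i < dim; i++) {
      global[i] += share * c.weights[i];
    }
  }
  return global;
}
```

The distributed-systems interest is in everything around this step: stragglers, partial participation, and clients that are unreliable or adversarial.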

I came into this PhD program thinking that I wanted to work on stuff like the distributed databases that you listed, or the stuff they're built on like clock synchronization. I did my master's degree in 2017-2018 and I was fascinated by an "advanced databases" class that covered these things. Unfortunately, nobody in my department works on such things, and I agree with you that I don't hear much about that area anymore.

By @JSDevOps - 2 months
No, AI has nothing to do with this. The late 1990s to early 2010s were a massive period for distributed systems research, but that surge was driven by advancements in computing infrastructure and the growing needs of industry giants—not AI. From theoretical frameworks like Chord to practical implementations like BigTable, Spanner, Cassandra, and TAO, the focus was on solving complex problems in scalability, consistency, and fault tolerance. The efforts to properly implement Paxos were part of that journey. AI had little to do with the sheer volume of breakthroughs that came from this time; it was all about improving distributed systems, not artificial intelligence. Not everything needs to loop back to AI.
By @musicale - 2 months
Everything is a distributed system now, and system designs are changing in important ways. As you indicate, we are currently seeing a redesign of basically all computer systems, from mobile to datacenter/cloud/HPC, to better support AI workloads. I expect there are still many opportunities for distributed systems research.
By @austin-cheney - 2 months
If anybody needs a practical launching point to study distributed systems maybe I can help.

I have a browser/Node.js application that I worked on for several years: a browser-based peer-to-peer file system sharing tool. The application is massive in size, but it's easy to tinker with, the build takes about 8 seconds, and the test automation is decent enough for anybody interested to get started. The idea is to invert the transmission model of the web: instead of anonymous connections toward a server, you get trusted connections between owned computers, and even restricted connections to different users (and their various devices), all in real time using sockets and streams and a tiered distribution model. https://github.com/prettydiff/share-file-systems

I am currently working on a web server application that allows spinning up many different web servers quickly, plus a monitoring dashboard that shows the various servers, their connected sockets, port management, and various other things. This all started with the idea that you can proxy and redirect anything super efficiently; a proxy in Node.js is as simple as:

    proxy.pipe(socket);
    socket.pipe(proxy);
By @fragmede - 2 months
I'm sure the next generation of supercomputers (e.g. anything upcoming with Nvidia's GB200 architecture, Tesla's Dojo, etc.) will have their own share of challenges that need to be solved, at their specific scale.
By @ldjkfkdsjnv - 2 months
It's a solved problem: unless you're at FAANG scale, you can just use cloud solutions to build distributed systems. Use their NoSQL DBs, queues, serverless functions, etc.