October 18th, 2024

Using static websites for tiny archives

The author advocates for using static websites to organize personal digital archives, emphasizing simplicity, keyword tagging, and effective file management, while promoting this method for broader archival applications.


The article discusses the author's approach to digital preservation by creating static websites for organizing personal archives. The author emphasizes the importance of intentionality in retaining digital files, opting to keep only those that are meaningful. Each collection, such as scanned documents, screenshots, and bookmarks, is represented by a separate website designed to enhance browsing and metadata display. The simplicity of this method, which avoids complex systems and dependencies, allows for easy access and long-term usability. The author highlights the limitations of traditional file organization methods, such as hierarchical folders, and prefers the flexibility of keyword tagging. By converting folders into mini-websites, the author can simplify file management while maintaining the ability to search and categorize files effectively. The process is described as low-tech and manageable, with a focus on small collections, which encourages thoughtful curation of digital content. The author also notes the potential for this method to be applied in larger archival contexts, promoting the idea of static websites as a viable tool for digital preservation. The article concludes with the author's commitment to gradually transition existing files into this new system, appreciating the low maintenance and ease of use that HTML provides for personal archiving.

- The author creates static websites to organize personal digital archives.

- Each collection is represented by a unique website designed for easy browsing and metadata display.

- The method emphasizes simplicity, avoiding complex systems and dependencies.

- The author prefers keyword tagging over traditional hierarchical folder organization.

- This approach is seen as beneficial for both personal and larger archival contexts.

Surfing the (Human-Made) Internet

The internet's evolution prompts a return to its human side, advocating for personal sites, niche content, and self-hosted platforms. Strategies include exploring blogrolls, creating link directories, and using alternative search engines. Embrace decentralized social media and RSS feeds for enriched online experiences.

Archiving and Syndicating Mastodon Posts

The article details archiving Mastodon posts to a personal website using the PESOS model, emphasizing online presence, automation, and content organization through a custom tool developed in Go.

Digital Tools I Wish Existed (2019)

The article highlights challenges in managing digital content, emphasizing the need for effective tools for organization, retrieval, and integration, advocating for a more interconnected approach to personal data management.

The race to save our online lives from a digital dark age

Concerns over digital data preservation grow as vast information is created daily, with organizations like the Internet Archive working to save at-risk content and prevent a potential "digital dark age."

Rediscovering the Small Web (2020)

The modern web is dominated by corporations, stifling personal expression. The author advocates for rediscovering smaller, independent websites, emphasizing their importance for creativity and individual interests in a commercialized landscape.

29 comments

By @egeozcan - 3 months

I copy the images in my clipboard and save them in an HTML file to have single-file galleries:

https://gist.github.com/egeozcan/b27e11a7e776972d18603222fa5...

Live:

https://gistpreview.github.io/?b27e11a7e776972d18603222fa523...

Selecting via file-picker works too. Dragging usually does not. When all works, images are inserted inline as blobs.

After adding images, if you save the page (literally File -> Save), the blobs are saved together. Don't want some part when saving (for example, you want to remove some images)? Inspect element, remove it, save the page.

Throw the page on some server or just double-click it on your computer/mobile.
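For readers who would rather build such a single-file gallery offline, here is a minimal Python sketch. It is not the gist linked above: that one works in the browser with pasted blobs, while this sketch swaps in Base64 `data:` URIs written by a script. The function name `build_single_file_gallery` and its parameters are made up for illustration.

```python
import base64
import html
import mimetypes
from pathlib import Path


def build_single_file_gallery(image_paths, title="Gallery"):
    """Return one HTML document with every image inlined as a data: URI."""
    figures = []
    for path in map(Path, image_paths):
        # Guess the MIME type from the file extension; fall back to a generic type.
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        payload = base64.b64encode(path.read_bytes()).decode("ascii")
        name = html.escape(path.name)
        figures.append(
            f'<figure><img src="data:{mime};base64,{payload}" alt="{name}">'
            f"<figcaption>{name}</figcaption></figure>"
        )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        "<body>\n" + "\n".join(figures) + "\n</body></html>\n"
    )
```

Writing the returned string to a file such as `gallery.html` gives a page that opens with a double click and needs no server, at the cost of a larger file than the original images.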

By @meonkeys - 3 months

Lots of folks mentioning Markdown in the comments. +1 to that. Plain text FTW. I think a lot about my own data hoarding / archiving, and plain text is such a key part of that. Very future-proof.

Ever since WordPerfect I've preferred more deterministic, lightly-formatted documents with some way to see formatting characters directly. Markdown is brilliant, basically a DSL (domain-specific language) for HTML.

The key to plain text is tooling! A couple Markdown tools I haven't seen mentioned here yet (even though they've come up on HN before) are:

https://addons.mozilla.org/en-US/firefox/addon/markdown-view... - pretty-render Markdown right in the browser

https://casual-effects.com/markdeep/ - standalone web-friendly Markdown formatter with many features

By @Rhapso - 3 months

I convert content to Markdown plus relevant images and then store them in an Obsidian vault. I self-sync it with Syncthing. It has quickly become a rather effective zettelkasten memory prosthetic on my laptop and phone.

I also use Google/Facebook takeouts, reformat the results, and store+index all my human-facing correspondence in there. Text is cheap and I avoid most images. It's still under 200 MB and instantly searchable with a nice UI, and as a bunch of Markdown files it is easily portable.

By @stared - 3 months

For personal use, I rely on Obsidian in a similar way—whenever I want to keep something (like an FB post I might want to share later), I save it along with the source link. External services can disappear anytime, so local data has the dual advantage of being owned by us and easily searchable.

I also wrote a script to convert Kindle highlights into Markdown files. If anyone’s interested, I'd be happy to polish it a bit and share.

For public-facing content, the Static Site Generator ecosystem keeps improving. I started with Jekyll (since it's the GitHub default), moved through Gridsome, and eventually landed on Nuxt 3 Content, which feels like the sweet spot for me. If I were starting now, I might have chosen Astro.

In any case, the barrier to entry has never been lower. We can host sites for free on GitHub, and if custom styling is needed, AI models are incredibly helpful with CSS.

Markdown is like JavaScript for text formatting. Despite its quirks, it just works.
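The Kindle-highlights-to-Markdown script mentioned above was not shared, so the following is only a rough Python sketch of what such a conversion could look like. It assumes the input is the text of a Kindle `My Clippings.txt` file in which entries are separated by a line of ten `=` characters, the first line of each entry is the book title, the second is a metadata line, and the rest is the highlight. The function name is hypothetical.

```python
from collections import defaultdict

# Assumed entry separator in "My Clippings.txt".
SEPARATOR = "=========="


def clippings_to_markdown(raw_text):
    """Group highlights by book title and return {title: markdown_text}."""
    by_book = defaultdict(list)
    for entry in raw_text.split(SEPARATOR):
        lines = [line.strip() for line in entry.strip().splitlines()]
        lines = [line for line in lines if line]
        if len(lines) < 3:
            continue  # skip empty chunks and entries that carry no text
        title = lines[0].lstrip("\ufeff")  # drop a byte-order mark if present
        text = " ".join(lines[2:])  # lines[1] is the metadata line
        by_book[title].append(text)
    return {
        title: f"# {title}\n\n" + "\n".join(f"- {h}" for h in highlights) + "\n"
        for title, highlights in by_book.items()
    }
```

Each value in the returned dictionary can be written to its own `.md` file, which fits the one-note-per-book layout that tools like Obsidian handle well.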

By @G_o_D - 3 months

I've been doing this for 15 years. I make portable HTML with embedded images, MP3s and more, so that I don't need any special software for viewing. Nowadays I just carry it in the cloud or on my phone, and all you need is a browser, on any device and any OS. With MP3s embedded in the HTML (yes, the file size may grow large), I don't need a special music player or app, just a browser.

Nowadays, along with HTML, I try to archive using the MHTML format instead of embedding manually.

Run a simple HTTP server and start browsing the archives.

For images, what I do is:

---> Store all images in a folder

---> Open the localhost server

---> Open the folder in the browser

---> Using JavaScript, convert the links to <img> tags with src=link

---> Once the browser has fetched and displayed all the images, Save As, and I have an MHTML archive with everything embedded

Or a simple bash script can be used to create the HTML with img tags and links to the folder.

Or you can manually template an MHTML file.

But I let my browser do the heavy work; why go manual?

Also, instead of Base64 embedding, embedding the binary images directly in the MHTML is considerably more efficient and consumes less memory.

E.g. I have 15 images: MHTML (binary encoding) -> 4 MB; MHTML (Base64 encoding) -> 5 MB

Another method I use is to run python -m http.server in any folder.

Or on Linux: tree -H http://localhost:8000 (set the recursion depth as needed).

Then open the folder link from the server, or the HTML that tree created, in the browser.

In a terminal, execute wget -rkpN -e robots=off http://localhost:8000

It will recreate the folder with an index.html for you to browse, so you no longer need a server for viewing.

Same as the exports from Google, Twitter or YouTube.
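The size gap between binary and Base64 embedding reported in this comment matches how Base64 works: every 3 input bytes become 4 output characters, so the encoded form is about a third larger before any line breaks are added. A small Python sketch to check that ratio (the helper name `base64_overhead` is made up for illustration):

```python
import base64
import os


def base64_overhead(n_bytes):
    """Return (raw_size, encoded_size) for n_bytes of random data."""
    raw = os.urandom(n_bytes)
    encoded = base64.b64encode(raw)
    return len(raw), len(encoded)


raw_size, encoded_size = base64_overhead(3_000_000)
# prints "raw: 3000000 bytes, Base64: 4000000 bytes, ratio: 1.333"
print(f"raw: {raw_size} bytes, Base64: {encoded_size} bytes, "
      f"ratio: {encoded_size / raw_size:.3f}")
```

A 4 MB binary payload would therefore come out somewhat above 5 MB once Base64-encoded, roughly in line with the rounded figures quoted above.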

By @pomdtr - 3 months

I had similar thoughts, and built myself a little framework for this: https://www.smallweb.run

The key feature it adds compared to your own setup is mapping subfolders to subdomains (+ dynamic websites, but you don't seem interested in that).

ex: ~/smallweb/example => https://example.localhost

We have a little discord community at https://discord.smallweb.run if anyone is interested.

By @zirkuswurstikus - 3 months

Personally, I prefer VimWiki for taking notes during my work. It is a place to mix ideas, small pieces of documentation and snippets of things I found on the web.

Since most of the time I want to store articles, tutorials or nifty tricks, I like to save the entire web page. For this task, my favorite tool is SingleFile[1]. With SingleFile you can save a web page with its images embedded. You can also add annotations and cut away annoying ads etc. Besides, it supports saving a distraction-free copy of the page. I can highly recommend taking a look.

[1]https://github.com/gildas-lormeau/SingleFile

By @ericyd - 3 months

I always find posts like this fascinating. I love the direction of going low tech and maintainable, but I have never once found myself spending significant time looking through old work. Photos are the one exception, but I've always been fine just scrolling through my personal timeline of date-sorted photos. I used to spend more time on this sort of thing when I was younger, and then at some point I just realized I'm never actually looking at it. I'd be curious to know some of the reasons people are frequently revisiting work from years ago?

By @justusthane - 3 months

For myself at least, there's no way I'd stick with this over the long run given the overhead of hand-editing an HTML file (however quick and simple) every time I needed to add an item to a collection.

Seems like an ideal use for a very simple DIY static-site generator. Write it in Bash or Perl and it will be future-proofed forever.
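As a rough illustration of how small such a generator can be, here is a sketch in Python rather than the Bash or Perl suggested above. It assumes each item is a dictionary with hypothetical `path`, `title` and `tags` keys, and it groups items by keyword tag in the spirit of the article's tagging approach. None of these names come from the original post.

```python
import html
from collections import defaultdict


def render_collection(title, items):
    """Render a list of {'path', 'title', 'tags'} dicts as one HTML page."""
    by_tag = defaultdict(list)
    for item in items:
        # Items without tags are listed under a catch-all heading.
        for tag in item.get("tags") or ["untagged"]:
            by_tag[tag].append(item)

    parts = [
        "<!DOCTYPE html>",
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>',
        f"<body><h1>{html.escape(title)}</h1>",
    ]
    for tag in sorted(by_tag):
        parts.append(f'<h2 id="tag-{html.escape(tag)}">{html.escape(tag)}</h2>')
        parts.append("<ul>")
        for item in by_tag[tag]:
            href = html.escape(item["path"])
            label = html.escape(item["title"])
            parts.append(f'  <li><a href="{href}">{label}</a></li>')
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts) + "\n"
```

With this shape, adding an item to a collection means appending one entry to a data file and re-running the script, rather than hand-editing the HTML.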

By @lloeki - 3 months

> Using a static website like this isn’t new – my inspiration was Twitter’s account export, which gives you a mini-website you can browse locally. I’ve seen several other social media platforms that give you a website as a human-friendly way to browse your data.

I've read somewhere that Telegram exports work this way, you get a bunch of raw files somehow organised with directories and browsable by themselves, with a tiny local static website to browse them more conveniently.

So different from the last such mass export I used: Google Takeout, which produces a dumb dump of cryptic xml and raw files named in some nonsensical (to the user) scheme. To this day I'm not even sure I got all the data I asked for before deleting it cloudside.

By @massimoto - 3 months

I recently wrote a static site generator from AnyBox's local database, since they currently only allow for backups via iCloud which is locked down on my work laptop. I was surprised by the peace of mind it gave me to have a nice, 100% portable version of my vast bookmark/website archives.

By @corinroyal - 3 months

This excites me. Imagine someone not overcomplicating web tech. I've been thinking of having web sites render as epubs so we don't have to have a sysadmin on call 24/7 just so I can read.

By @lazylizard - 3 months

https://linux.die.net/man/1/tree

will list your directory tree as an HTML file. Helpful?

By @mathnmusic - 3 months

Strict hierarchies are indeed too rigid. What about using a tag-based file manager like TagSpaces (which is free and open-source)?

By @ejddhbrbrrnrn - 3 months

Markdown files can be a magic low-effort way to get this. Even less fancy. Just stick an md file in there and it is easy to link to stuff. Open it in VS Code. You can go full zettelkasten, but you can also just drop some notes around.

By @itohihiyt - 3 months

Why not use a wiki? Zim desktop is text-based and local-first. It doesn't handle videos, but everything else is handled. Search is good and you get the other benefits of a wiki. No mobile client that I'm aware of.

By @miragecraft - 3 months

I'm doing the same except with the convenience of HTML includes.

https://miragecraft.com/projects/x-include

By @crtasm - 3 months

>folders require you to use hierarchical organisation, and everything has to be stored in exactly one place.

You can make aliases/shortcuts to files on macOS, can't you?

By @chrisweekly - 3 months

Awesome post. I'm inspired to take a similar approach. Related tangent: https://sive.rs/ti

is author/entrepreneur Derek Sivers' script for reproducing his bare-bones, low-overhead, long-term "Tech Independence" stack.

By @RadiozRadioz - 3 months

> folders require you to use hierarchical organisation

I find symlinks work for this, which is what I do. I have big directories with the raw pictures dumped from my devices, then categorized directories linking to them.
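A minimal sketch of that symlink layout in Python, assuming a hypothetical mapping from file names to category names. The function name and the directory names are illustrative, not taken from the comment.

```python
import os
from pathlib import Path


def link_into_categories(raw_dir, categories_dir, assignments):
    """Create categories_dir/<category>/<name> symlinks pointing into raw_dir.

    assignments maps a file name in raw_dir to a list of category names,
    so one file can appear under several categories without being copied.
    """
    raw_dir = Path(raw_dir).resolve()
    created = []
    for name, categories in assignments.items():
        target = raw_dir / name
        for category in categories:
            folder = Path(categories_dir) / category
            folder.mkdir(parents=True, exist_ok=True)
            link = folder / name
            # Leave existing links or files alone so the script can be re-run.
            if not link.is_symlink() and not link.exists():
                os.symlink(target, link)
                created.append(link)
    return created
```

For example, `link_into_categories("raw", "by-topic", {"IMG_0001.jpg": ["holiday", "family"]})` would make the same photo reachable from two category folders. On Windows, creating symlinks may require extra privileges.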

By @mediumsmart - 3 months

Thank you for posting. I have had the same experience looking for a good way to organize files etc. I tested this just now and asked the oracle to write me a bash script that finds all images whose names start with Screenshot and lists them in an HTML file that lays them out in a grid at 200px width, where a click fills the screen and a second click dismisses. Such a good way to get an overview. I'm going to implement that across the HD.
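The generated bash script is not shown, so here is a comparable sketch in Python under a few assumptions: the image extensions in `IMAGE_SUFFIXES`, the function names, and the page title are all made up for illustration. It also simplifies the behaviour: each thumbnail links to the full-size file instead of toggling a fill-screen view on click.

```python
import html
from pathlib import Path
from urllib.parse import quote

# Assumed set of extensions that count as images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def find_screenshots(root, prefix="Screenshot"):
    """Return image files under root whose names start with prefix, sorted."""
    return sorted(
        p for p in Path(root).rglob(f"{prefix}*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def render_grid(root, paths, width=200):
    """Return an HTML page showing each image as a fixed-width thumbnail."""
    cells = []
    for path in paths:
        rel = path.relative_to(root).as_posix()
        src = quote(rel)  # screenshot names often contain spaces
        cells.append(
            f'<a href="{src}"><img src="{src}" width="{width}" '
            f'alt="{html.escape(path.name)}" loading="lazy"></a>'
        )
    style = f"display:grid;grid-template-columns:repeat(auto-fill,{width}px);gap:8px"
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Screenshots</title></head>\n"
        f'<body><div style="{style}">\n' + "\n".join(cells) + "\n</div></body></html>\n"
    )
```

Saving the output of `render_grid(root, find_screenshots(root))` as an HTML file inside `root` keeps the relative links valid, so the overview page can be opened straight from disk.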

By @freitzzz - 3 months

Really nice idea! As a data hoarder myself, I think I will follow this as a way to remind myself of the things I truly should archive :)

By @smugglerFlynn - 3 months

Just thought it would be cool to have a personal "data lifeboat", similar to Twitter export, for exporting Instagram

By @nyc111 - 3 months

I use org-mode export to HTML and then ftp that to the server. Is the OP doing the same without the org-mode?

By @GrumpyNl - 3 months

Why the html files? You can just expose your directory structure on the web, no need for html.

By @thenoblesunfish - 3 months

Glad to see my own instincts here. Filesystems, text files, plain HTML, fun, long-lasting.

By @lovegrenoble - 3 months

nice idea

By @zoobab - 3 months

PDF is better for archiving, but what about videos?

HTML really sucks for archiving.

By @noja - 3 months

https://www.filestash.app/
