Archiving WebMemex

November 24, 2024

With no movement in this project for several years, I figured it’s time to consider it over, archive this blog, dehydrate its content before bitrot consumes it. But first, I’ll give an overview of my WebMemex-related efforts of the past years for any curious visitor or future archeologist.

The mindmapping browser

Freshly graduated in 2016, I tried to create a mindmap-like read/write web browser for growing one’s digital memory; a memex-ish tool based on world-wide web technology.

This first demo was a browser that was itself built as a web-app, (ab)using <iframe> elements. It relies on a proxy to avoid cross-origin requests as well as to insert a script into each page which captures link clicks inside the frame (because clicked links should open in a new node, creating a path).

By browsing you would build a graph of visited web pages, connected by the links you followed. You could also create links between any pages, to add your own connections. And you could create notes, which could be regarded as tiny, user-writable web pages themselves, and could thus be attached to any web pages by their links.

The most similar tool out there was The Brain. Jerry Michalski, who publishes many years of accumulated thoughts and findings as Jerry’s Brain, had convinced me that a directed graph is a powerful structure for organising thoughts: it allows building hierarchies while also allowing multicategorisation. So I started with this; other organisation methods could still be added later.
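To make the directed-graph idea concrete, here is a minimal TypeScript sketch of pages and notes as nodes, with links as directed edges. All names (`MemexNode`, `MemexGraph`, and so on) are illustrative assumptions, not taken from the actual WebMemex code. The point it shows is multicategorisation: a node can have more than one parent.

```typescript
// Hypothetical data model, for illustration only.
type NodeId = string;

interface MemexNode {
  id: NodeId;
  kind: "page" | "note";
  // A URL for a page, free text for a note.
  content: string;
}

class MemexGraph {
  private nodes = new Map<NodeId, MemexNode>();
  // Outgoing links: source id -> set of target ids.
  private links = new Map<NodeId, Set<NodeId>>();

  addNode(node: MemexNode): void {
    this.nodes.set(node.id, node);
    if (!this.links.has(node.id)) {
      this.links.set(node.id, new Set<NodeId>());
    }
  }

  // Add a directed link; both endpoints must already exist.
  addLink(from: NodeId, to: NodeId): void {
    const targets = this.links.get(from);
    if (!targets || !this.nodes.has(to)) {
      throw new Error(`Unknown node in link ${from} -> ${to}`);
    }
    targets.add(to);
  }

  // Nodes that `id` links to.
  childrenOf(id: NodeId): NodeId[] {
    const targets = this.links.get(id);
    return targets ? Array.from(targets) : [];
  }

  // Nodes that link to `id`; more than one parent is allowed.
  parentsOf(id: NodeId): NodeId[] {
    const parents: NodeId[] = [];
    this.links.forEach((targets, source) => {
      if (targets.has(id)) parents.push(source);
    });
    return parents;
  }
}

// Usage: one page filed under two different notes.
const graph = new MemexGraph();
graph.addNode({ id: "hypertext", kind: "note", content: "Hypertext history" });
graph.addNode({ id: "archiving", kind: "note", content: "Web archiving" });
graph.addNode({ id: "some-page", kind: "page", content: "https://example.org/article" });
graph.addLink("hypertext", "some-page");
graph.addLink("archiving", "some-page");
```

A strict tree would force the page under a single heading; here `parentsOf("some-page")` returns both notes, while following links downwards still gives a hierarchy.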

The demo elicited encouraging responses, and hopefully inspired some people; e.g. Mozilla’s browser.html experiment reportedly borrowed some ideas. While being merely a proof of concept, the WebMemex was somewhat functional already. I used it intensively for a while myself (until I lost my data multiple times), some friends gave it a try, and for a short period hypertext-inventor Ted Nelson was the most enthusiastic adopter (until presumably giving up on it too).

Given the limitations of the iframe-based approach, and as building a new browser was beyond my abilities, in order to make something practical I decided to make a browser extension to incrementally add the desired features to existing browsers: see the first WebMemex blog post. I got a small grant from SIDN fonds for this work.

Snapshotting pages

When the browser becomes a knowledge management tool, accumulating web pages and notes to function as your personal memory extension, storing only the addresses of web pages is not sufficient — the web appeared too ephemeral. Web pages behave rather like living beings that are not easily stored, applications that often rely on their back-ends to remain operational. To treat web pages as pages again, as documents that you can store on your digital shelf, I needed a module for snapshotting web pages; not finding any, I started to build it myself. This became the freeze-dry javascript module, which packs the whole snapshotting logic into one simple but highly configurable function.

To keep focus, the WebMemex browser extension got stripped down to a simple web page snapshotting tool based on freeze-dry, with full-text search through one’s snapshotted pages. After a brief collaboration, a more feature-rich fork of the extension by Oliver Sauter and his team developed into WorldBrain’s Memex.

With a grant from the Prototype Fund in 2018, I improved freeze-dry and created a proof of concept for personal web archives: your collection of snapshots is your archive, which can be queried via the Memento protocol like any other web archive. To prototype this I made a small Nextcloud app.
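As a rough illustration of what querying a personal archive via Memento involves: a client asks for a URL as it was around some moment (the `Accept-Datetime` request header), and the archive answers with the best-matching snapshot, stating its capture time in a `Memento-Datetime` response header. The TypeScript sketch below assumes "best match" means closest in time, and uses a hypothetical `Snapshot` record; it is not the code of the Nextcloud app.

```typescript
// Hypothetical record describing one stored snapshot.
interface Snapshot {
  url: string;      // original address of the page
  capturedAt: Date; // moment the snapshot was taken
  file: string;     // where the snapshot is stored (illustrative)
}

// Pick the snapshot of `url` captured closest to the requested datetime.
// Returns undefined if the archive holds no snapshot of that URL.
function selectMemento(
  snapshots: Snapshot[],
  url: string,
  acceptDatetime: Date,
): Snapshot | undefined {
  let best: Snapshot | undefined;
  let bestDistance = Infinity;
  for (const snapshot of snapshots) {
    if (snapshot.url !== url) continue;
    const distance = Math.abs(
      snapshot.capturedAt.getTime() - acceptDatetime.getTime(),
    );
    if (distance < bestDistance) {
      best = snapshot;
      bestDistance = distance;
    }
  }
  return best;
}

// Value for the Memento-Datetime response header (an HTTP-date in GMT).
function mementoDatetimeHeader(snapshot: Snapshot): string {
  return snapshot.capturedAt.toUTCString();
}

// Usage with made-up data.
const archive: Snapshot[] = [
  { url: "https://example.org/a", capturedAt: new Date("2018-01-10T00:00:00Z"), file: "a-1.html" },
  { url: "https://example.org/a", capturedAt: new Date("2018-06-01T00:00:00Z"), file: "a-2.html" },
  { url: "https://example.org/b", capturedAt: new Date("2018-03-01T00:00:00Z"), file: "b-1.html" },
];
const chosen = selectMemento(archive, "https://example.org/a", new Date("2018-03-01T00:00:00Z"));
```

In this example `chosen` is the January snapshot, since it lies nearer to the requested date than the June one.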

Web annotation

An important goal of the WebMemex project was to enable highlighting and annotating the content of pages; e.g. linking between corroborating or conflicting statements. Creating links should not be limited to the publisher of a page; browsers lack a highlighter and pencil.

To ensure that annotations made in one tool are also readable in another, we need standard formats for them; both to be able to share annotations between people, and to avoid content dying along with the software that made it. The W3C Web Annotation Data Model was an attempt in this direction, but the standards were both broad and vague and got little uptake.
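For a sense of what such a standard format looks like, here is a TypeScript sketch that builds a simple annotation following the Web Annotation Data Model: a plain-text comment attached to a quoted phrase on a page, located with a `TextQuoteSelector`. The function name, the example identifier and the URLs are assumptions for illustration; the data model allows many more body and selector types than this sketch covers.

```typescript
// A small subset of the Web Annotation Data Model.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;   // the quoted text itself
  prefix?: string; // text just before the quote, to disambiguate
  suffix?: string; // text just after the quote
}

interface QuoteAnnotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  created: string;
  body: { type: "TextualBody"; value: string; format: "text/plain" };
  target: { source: string; selector: TextQuoteSelector };
}

// Hypothetical helper: attach a comment to a quote on a page.
function createQuoteAnnotation(
  id: string,
  pageUrl: string,
  quote: string,
  comment: string,
  created: Date,
): QuoteAnnotation {
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id,
    type: "Annotation",
    created: created.toISOString(),
    body: { type: "TextualBody", value: comment, format: "text/plain" },
    target: {
      source: pageUrl,
      selector: { type: "TextQuoteSelector", exact: quote },
    },
  };
}

// Usage with made-up values; JSON.stringify gives the shareable form.
const annotation = createQuoteAnnotation(
  "https://example.org/annotations/1",
  "https://example.org/article",
  "a directed graph is a powerful structure",
  "Compare with the tree-based outliners.",
  new Date("2021-05-01T12:00:00Z"),
);
const serialised = JSON.stringify(annotation, null, 2);
```

Because the result is plain JSON-LD, any other tool that understands the data model could, in principle, read it back and re-anchor the comment to the same phrase.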

After earlier work with web annotations (e.g. an internship at Hypothes.is in 2014), I made some efforts on web annotation again in 2020–2022, with an NGI0 grant from NLnet Foundation (and then I ended up working at NLnet). Instead of building annotation support directly into the WebMemex browser extension, I wanted to work on more widely reusable tooling for the web annotation standards. I contributed to the modules of the ever-incubating Apache Annotator, and created a proof of concept specification and implementation for standards-based ‘annotation feeds’, Web Annotation Discovery. This 3-minute screencast conveys the idea.

Relatedly, in 2022 the work on URL Fragment Text Directives (first called ‘scroll-to-text’) by the Chromium team caught my eye. I had made the highly similar quoteurl back in 2016, and a Precise links browser extension in 2017, with the hope that links to arbitrary content would some day become a standard. While I’m not fond of a dominant browser effectively changing the nature of URLs with little input from others, this particular change seemed helpful. I made a line-by-line implementation of the spec, text-fragments-ts, and tried to improve the spec at the edges, hoping that it will finally allow web links to refer to arbitrary phrases within a document.
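To illustrate what such a link looks like: a text directive is appended to a URL as `#:~:text=[prefix-,]start[,end][,-suffix]`, where each part is percent-encoded. The TypeScript sketch below only constructs such a URL; it is a simplified illustration with hypothetical names, not the API of text-fragments-ts, and it discards any fragment the URL already had.

```typescript
// The parts of one text directive; only `start` is required.
interface TextDirective {
  start: string;   // the phrase to link to (or the start of a range)
  end?: string;    // end of the range, if linking to a longer passage
  prefix?: string; // context before the match
  suffix?: string; // context after the match
}

// Percent-encode one part. encodeURIComponent already escapes "," and "&",
// but leaves "-" alone, which the directive syntax uses as a delimiter.
function encodeDirectivePart(text: string): string {
  return encodeURIComponent(text).replace(/-/g, "%2D");
}

// Build a URL pointing at the given text within the page.
// Simplification: an existing fragment on `pageUrl` is dropped.
function toTextFragmentUrl(pageUrl: string, directive: TextDirective): string {
  const parts: string[] = [];
  if (directive.prefix !== undefined) {
    parts.push(encodeDirectivePart(directive.prefix) + "-");
  }
  parts.push(encodeDirectivePart(directive.start));
  if (directive.end !== undefined) {
    parts.push(encodeDirectivePart(directive.end));
  }
  if (directive.suffix !== undefined) {
    parts.push("-" + encodeDirectivePart(directive.suffix));
  }
  const base = pageUrl.split("#")[0];
  return `${base}#:~:text=${parts.join(",")}`;
}

// Usage with a made-up page address.
const simpleLink = toTextFragmentUrl("https://example.org/page", {
  start: "digital memory",
});
const contextLink = toTextFragmentUrl("https://example.org/page#intro", {
  prefix: "read-write",
  start: "web, browser",
});
```

The prefix and suffix parts exist because a phrase may occur several times on a page; they let the link say which occurrence is meant without becoming part of the highlighted text.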

Now what

The aspirations behind the WebMemex are ever present, and will perhaps revive some day in some form, but this website can be turned into past tense now. Most likely I will not attempt to build an all-in-one WebMemex knowledge management tool, though time permitting I might maintain and improve some technical enablers, such as freeze-dry.

Over the last few years, I have come to realise that building the user-facing application, while difficult by itself, is not the primary challenge; the more important goal is to grow an ecosystem of standards and modules, and to establish the concepts, that together power a wide range of interoperating knowledge management tools.

It helps to see your whole computer as your memex, and every application is part of it; highlighting, annotating or linking a fragment of text or audio should be as ubiquitous as copy-paste. And by cross-referencing between people’s knowledge bases, these then form a web of memexes — somewhat like the original idea behind the world wide web, which then went another way.

Now to archiving this blog. Instead of running the specific software that currently powers this blog (the Ghost CMS, in this case), I will turn the website into static HTML files along with their subresources (images, stylesheets, scripts), which can then be hosted by any simple web server. (Wiser people would have used a static website generator from the beginning.) I could simply run wget --recursive, or take this opportunity to put the WebMemex into action...

— Gerben