Das eigene Webarchiv

April 08, 2018

tl;dr: back to work, with support from the Prototype Fund, to spend 6 months on personal web archiving.

It has been quiet around this project for a long time: I was distracted by other things, ran out of funds, and needed a viable plan to move things forward. This week the project gains momentum again, as it will be supported by the Prototype Fund for the next six months under the project title "Das eigene Webarchiv".

As the title suggests (eigen = own), the focus will be, even more than before, on the ability to archive web pages for personal use. As in the original idea of the memex, such use would ideally include adding notes and links among things you have read, thus organising them by your associations; this is still the long-term vision. But organising items one cannot own, annotating pages that may change or disappear... it all feels like building on quicksand. So let's first build this foundation, in order to work around the web's inability to retain the documents we care about.

This sub-mission feels partly like a continuation of the work done so far, and partly like a new project building on the lessons learnt from it. Either way, the plan for the coming months is to work on technologies and tools for web archiving, and to combine their features in a browser extension that enables you to…

  1. store a web page as you are visiting it. The core task here is improving freeze-dry, so that it snapshots a page as faithfully as possible and bundles it with its dependencies (e.g. images and stylesheets).

  2. browse pages from your archive. Whenever you visit a webpage, you can choose to see previously saved versions; you could even choose to browse offline-first by default.
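To make "bundling a page with its dependencies" a bit more concrete, here is a minimal sketch in Python of one possible approach: rewriting image references into data URLs, so the snapshot no longer depends on the live web. This is only an illustration, not how freeze-dry is actually implemented; the names `inline_images` and `to_data_url` are made up for this example, and a real tool would use a proper HTML parser and also handle stylesheets, fonts, frames and so on.

```python
import base64
import re


def to_data_url(content: bytes, mime_type: str) -> str:
    """Encode a resource's bytes as a self-contained data: URL."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Simplification: matches only double-quoted src attributes on <img> tags.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def inline_images(html: str, resources: dict) -> str:
    """Replace each <img src> with a data URL when the resource is known.

    `resources` maps a URL, exactly as written in the page, to a tuple of
    (content bytes, MIME type). This is a hypothetical structure, assumed
    to have been filled while the page was loading.
    """
    def replace(match):
        url = match.group(2)
        if url not in resources:
            return match.group(0)  # leave unknown resources untouched
        content, mime_type = resources[url]
        return match.group(1) + to_data_url(content, mime_type) + match.group(3)

    return IMG_SRC.sub(replace, html)


page = '<p><img alt="x" src="a.png"></p><img src="missing.png">'
snapshot = inline_images(page, {"a.png": (b"hi", "image/png")})
print(snapshot)
```

The first image ends up embedded in the snapshot itself, while the second, for which nothing was captured, keeps pointing at its original location.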

An explicit goal is to avoid creating a silo that locks archived pages up inside your browser. Rather, the idea is to be part of an ecosystem of composable tools, so you could access or even edit pages using other applications, thereby making them truly yours.

To this end, pages will be stored on a web server of your choice; possibly just running on your own computer and only accessible by you, but nevertheless speaking webby protocols (exactly which protocols is still to be determined). Part of the project plan is to experiment with this architecture to add social features that will enable you to…

  1. share saved pages with others. By storing your archive on an internet-connected web server (aka "the cloud"), you can easily make archived pages, or a whole archive, available to others. You can snapshot a page and give me the link to your snapshot.

  2. browse using other people's archives. Besides your own archive, you could query the archives shared by others for their versions of pages, using the Memento protocol. Your friends' archives, your own archive, and big ones like the Internet Archive; each will just be another repository of documents too precious to lose, queryable in the same manner.
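As a rough illustration of what "queryable in the same manner" could mean: in the Memento protocol (RFC 7089), a client asks an archive for a page as it was around a given time by sending an `Accept-Datetime` header, and archives describe their snapshots in `Link` headers with `rel="memento"` and a `datetime` attribute. The Python sketch below does no networking; it only builds that request header and picks, from the `Link` headers of several archives, the snapshot closest to the desired time. The archive hostnames and function names are made up for the example, and the header parsing is deliberately simplistic.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Simplified Link header parsing: <uri> followed by ;name="value" pairs.
LINK = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM = re.compile(r'([\w-]+)="([^"]*)"')


def timegate_headers(when: datetime) -> dict:
    """Request headers asking for a page as it was around `when` (timezone-aware)."""
    return {"Accept-Datetime": format_datetime(when.astimezone(timezone.utc), usegmt=True)}


def parse_mementos(link_header: str) -> list:
    """Return (datetime, uri) pairs for every link whose rel includes "memento"."""
    mementos = []
    for uri, raw_params in LINK.findall(link_header):
        params = dict(PARAM.findall(raw_params))
        if "memento" in params.get("rel", "").split() and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), uri))
    return mementos


def closest_memento(link_headers: list, when: datetime):
    """Pick the snapshot nearest to `when` across the responses of all archives."""
    candidates = [m for header in link_headers for m in parse_mementos(header)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: abs(m[0] - when))[1]


# Hypothetical Link headers, as two different archives might return them.
friend = ('<http://example.org/page>; rel="original", '
          '<http://archive-a.example/2017/page>; rel="first memento"; '
          'datetime="Sun, 01 Jan 2017 00:00:00 GMT"')
own = ('<http://archive-b.example/2018/page>; rel="memento"; '
       'datetime="Sun, 01 Apr 2018 00:00:00 GMT"')

when = datetime(2018, 4, 8, 12, 0, tzinfo=timezone.utc)
print(timegate_headers(when))
print(closest_memento([friend, own], when))
```

Because every archive answers in the same format, a friend's archive, your own, and a large public one can all be treated as interchangeable sources by the same few lines of code.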

These four features will form the core work of the coming months. Of course, there will be more features to make this a convenient tool: comparing different versions, full-text search through the archive(s), and perhaps more. How exactly things will work is something to find out as we go!