---
title: "Introducing Matrix Highlight"
date: 2021-12-13T16:49:42-08:00
tags: ["Matrix", "Project", "Matrix Highlight"]
---
I wanted to briefly introduce a project that I've been working on in my spare time over the past couple of months.
It's called __Matrix Highlight__. That's a working title, but it does exactly what the name claims:
this little project is a browser extension to annotate the web, using [Matrix](https://matrix.org) as a communication and storage protocol.
My goal with this project is a __decentralized, federated,
collaborative annotation system for web pages and documents (that can be self-hosted).__
See the image below for a quick sneak peek at what it looks like.
{{< figure src="mhl_many.png" caption="Text randomly highlighted with Matrix Highlight" class="fullwide" >}}
### Project Goals
#### Decentralized
Quite literally, the word "decentralized" lies in opposition to "centralized". In a centralized application, the data,
or computation, or anything really, is controlled by a single entity. An example of this might be Google Docs: Google
is in charge of Docs, and no one else. You log in through Google, and Google manages your various documents and edits
to them authoritatively.
I don't think that's a good idea. Although convenient, this kind of arrangement shifts control out of your hands
as a user, and into the hands of the entity running your software. Google has the power, if they wanted, to ban you from
using Google Docs, or to manage your content in ways you don't like. They can (and do) impose storage limits. You are,
in a sense, at their mercy. Furthermore, in the (admittedly unlikely) case that Google goes down, Google Docs goes down
for everyone. There's a single point of failure. While it's hard to imagine Google itself having any real trouble (although
the recent AWS outage proves that such a thing is possible), most services aren't Google.
A decentralized application does not suffer from these problems.
The failure of a single server in a decentralized system does not bring it down for everyone. Furthermore, users have
the ability to switch between different servers or providers if one becomes abusive (or simply stops existing). Users
have more choices, and more control.
__For Matrix Highlight specifically__, this means not having to rely on one specific group or company
for storing and managing your annotations or notes.
#### Federated
Decentralization by itself does not make for useful software. There might very well be multiple servers providing access
to a particular piece of software. However, there's no guarantee that users of one such server can meaningfully interact
with users of another server. Microsoft's Office 365 has collaborative document editing, and so does Google Docs. However,
users of the two services cannot collaborate with _each other_.
In a federated system, the various providers establish a way of working together. The [Fediverse](https://en.wikipedia.org/wiki/Fediverse)
is a big example of this. Users of various [Mastodon](https://joinmastodon.org/) servers can see each other's messages and posts,
despite residing on servers with differing rules and administration. Users of Matrix can send messages between servers, with only
one account.
__For Matrix Highlight__, this means that users who choose to use different servers or providers are still able to collaboratively highlight
and annotate pages together.
#### Self-Hosted
Self-hosting is the practice of running the software you use yourself. This puts you in charge of your data,
instead of _any_ other entity, however trustworthy. A popular self-hosted solution is [Nextcloud](https://nextcloud.com/), which
may be used, among other things, as a Google Drive replacement that you run on your own server. With Nextcloud, your files
are completely under your own management, rather than that of some other person or company elsewhere.
__For Matrix Highlight__, this means that users can choose to run all the necessary software themselves, and thus remain in complete
control of their annotations and other data.
### What it Looks Like
First of all, you can watch a little demo video I recorded here:
{{< youtube Q3h5A0DsE1s >}}
You already got a little taste of Matrix Highlight in the opening screenshot. However, I'd like to show you some more of what I have
so far. The most important aspect of the tool is the ability to annotate web pages. The tool can be brought up on any page;
I typically test it on my blog, but that's just because it's convenient. Selecting some text brings up a little highlighting tooltip:
{{< figure src="mhl_tooltip.png" caption="A Matrix Highlight tooltip appearing over one of the sidenotes in a different article." >}}
Selecting one of the colors in the tooltip creates a new highlight of the text you had selected:
{{< figure src="mhl_highlight.png" caption="The result of clicking a color in the previous screenshot." >}}
Annotations applied in this way show up in every active instance of a Matrix Highlight page, including instances shared with other users.
{{< figure src="mhl_multi.png" caption="Two Chrome windows with the same annotations." class="fullwide" >}}
Highlights created by users can also be browsed as a list:
{{< figure src="mhl_quotelist.png" caption="A list of highlights from another page." class="medium" >}}
Highlights are stored in Matrix rooms. Since Matrix rooms are effectively chat rooms, they are built for being shared with other users.
Thus, it is very simple to give another user access to the current list of highlights.
{{< figure src="mhl_userlist.png" caption="A list of users for a particular page." class="medium" >}}
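To make "highlights are stored in Matrix rooms" a little more concrete, here is a minimal sketch of how a highlight could be sent into a room as a custom event. It is not taken from Matrix Highlight's code. The event type `com.example.highlight`, the `Highlight` fields, and the function name are all made up for illustration. The only real piece is the endpoint, which is the Matrix client-server "send event" route. The function only builds the request and does not send it.

```typescript
// Hypothetical event type; the post does not say what Matrix Highlight uses.
const HIGHLIGHT_EVENT_TYPE = "com.example.highlight";

// Hypothetical shape of a highlight; these field names are assumptions.
interface Highlight {
  pageUrl: string;
  text: string;
  color: string;
}

interface MatrixRequest {
  method: "PUT";
  url: string;
  body: string;
}

// Builds (but does not send) a request for the Matrix client-server endpoint
// PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}
function buildSendHighlightRequest(
  homeserver: string,
  roomId: string,
  txnId: string,
  highlight: Highlight
): MatrixRequest {
  const path = [
    "_matrix/client/v3/rooms",
    encodeURIComponent(roomId),
    "send",
    encodeURIComponent(HIGHLIGHT_EVENT_TYPE),
    encodeURIComponent(txnId),
  ].join("/");
  return {
    method: "PUT",
    // Drop trailing slashes so the URL does not contain "//".
    url: `${homeserver.replace(/\/+$/, "")}/${path}`,
    // The event content is the JSON-encoded highlight.
    body: JSON.stringify(highlight),
  };
}
```

A real client would send this with an `Authorization: Bearer <access token>` header and use a fresh transaction ID per event. Because the event lives in an ordinary room, inviting another user to that room is all it takes to share the highlights.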
This also means that a single page can have multiple
independent sets of highlights, allowing you to organize them however you like. For instance, if you're proofreading a page of your own,
you may have a highlight set (Matrix room) for every editing pass. The rooms can be switched at a moment's notice:
{{< figure src="mhl_roomlist.png" caption="A list of rooms for a particular page." class="medium" >}}
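The "one room per highlight set" idea can be sketched as a small data structure. This is only an illustration under assumed names (`StoredHighlight`, `HighlightSet`, `PageHighlights`); the post does not describe the extension's actual data model.

```typescript
// All names here are hypothetical, chosen for illustration only.
interface StoredHighlight {
  text: string;
  color: string;
}

// One highlight set corresponds to one Matrix room.
interface HighlightSet {
  roomId: string;
  name: string;
  highlights: StoredHighlight[];
}

// Tracks every highlight set for a single page and which one is active.
class PageHighlights {
  private sets = new Map<string, HighlightSet>();
  private activeRoomId: string | null = null;

  // Registers a new set; the first set added becomes the active one.
  addSet(roomId: string, name: string): void {
    this.sets.set(roomId, { roomId, name, highlights: [] });
    if (this.activeRoomId === null) {
      this.activeRoomId = roomId;
    }
  }

  // Switches the active set, e.g. when the user picks another room.
  switchTo(roomId: string): void {
    if (!this.sets.has(roomId)) {
      throw new Error(`unknown room: ${roomId}`);
    }
    this.activeRoomId = roomId;
  }

  // Adds a highlight to the active set only.
  addHighlight(highlight: StoredHighlight): void {
    const set = this.activeRoomId === null ? undefined : this.sets.get(this.activeRoomId);
    if (set === undefined) {
      throw new Error("no highlight set selected");
    }
    set.highlights.push(highlight);
  }

  // Returns a copy of the highlights in the active set.
  visibleHighlights(): StoredHighlight[] {
    const set = this.activeRoomId === null ? undefined : this.sets.get(this.activeRoomId);
    return set === undefined ? [] : [...set.highlights];
  }
}
```

With one set per editing pass, switching rooms swaps which highlights are visible without touching the others.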
### Current and Planned Features
The following are the current and planned features for Matrix Highlight:
* __Current__: Create and send website annotations over Matrix.
* __Current__: Store data in a decentralized and federated manner.
* __Current__: Share highlights with other users, including those on other servers.
* __Current__: Group annotations together and create multiple annotation groups.
* __Planned__: Use Matrix's End-to-End encryption to ensure the secure transmission and storage of highlight data.
* __Planned__: Leverage the new [`m.thread` MSC](https://github.com/matrix-org/matrix-doc/blob/gsouquet/threading-via-relations/proposals/3440-threading-via-relations.md) to allow users to comment on and discuss
highlights.
* __Planned__: Use something like [ArchiveBox](https://archivebox.io/) to cache the current version of a website and prevent annotations from breaking.
* __Planned__: Highlight PDFs in addition to web pages.
### Project Status and Conclusion
For the moment, I'm refraining from publishing the project's source or output extensions. This is a hobby project, and I don't want to share
something half-baked with the world. However, I fully intend to share the code for the project as soon as I think it's ready (which would probably
be when I feel perfectly comfortable using it for my own needs).