Hello! Welcome back to my project’s dev log. This one, like the last, will be more technical, but instead of discussing using SQLite on the frontend, I’m going to talk about CRDTs and go into some thoughts I have about the libraries available for them in Clojure.
First, some updates. I’ve sadly been a bit slower than I would like progress-wise, but I’ve made quite a lot of good progress in both essential features and nice-to-haves in the past couple of months:
hugsql-cljs
.I plan to go into detail about each of these in another dev log or two, but for now let’s get onto the main topic of this post: where’s the nice collaborative text library in Clojure?
First, I’ll cover my bases; here’s a quick primer. As the name implies, collaborative text is when two or more people work on some text, like a document, together, and where they might be making changes at the same time, which necessitates a “conflict resolution” algorithm. Think Google Docs or Notion. It’s typically done with one of two algorithms: Operational Transformation (OT) or Conflict-free Replicated Data Type (CRDT), which both have various kinds of tradeoffs that change the environment they work in and ultimately what feature set they allow.
It’s also worth noting that collaboration in a real web app doesn’t only deal with text — for example you might collaborate on more varied kinds of data, like Miro with its whiteboards — but my application doesn’t need to worry about this aspect in a direct way due to the Operations architecture, and in general from what I’ve understood collaborative text tends to be a more complex and interesting problem than something like collaborative map
s.
You can conceptualize OT as a way to take a set of insert or delete operations between two peers and figure out what changes to those operations allow all the peers to arrive at the same end result. It’s pretty straightforward to grep the basic mechanics.
For example, imagine you’re working on a blog post with a friend, and trying to think of the intro. You might start with this:
For each of those inserts, you record the cursor position and the character being inserted, e.g. insert(0, "H")
. The same would apply for a delete, except instead of the inserted character you’d record how many characters were deleted, e.g. delete(5, 2)
.
Back to you and your friend’s blog post. At the same time, your friend considers “Hey” to be a better intro, and changes the “Hi” to “Hey”:
Okay cool, seems to work. But there’s a problem: look at the indices above, and you’ll notice that the original indices won’t work anymore since “Hey” is one more character than “Hi”, and so all of the subsequent operations need to be changed to reflect this, hence the term transformation.
Doing this transformation is tricky, in large part because you don’t have enough information to accurately figure out what to do with just the position, and doing this transformation either in a peer-to-peer setting and/or with fast performance in mind is even trickier (apparently algorithms with decent performance in a peer-to-peer setting have only been around for a few years as of 2025), so you typically see it with two peers: a client and a server, and the server handles all of the transformation before sending it to other peers. This is why if you’ve ever used Google Docs you may have noticed a tiny bit of lag in some cases: it’s because those operations need to first go to Google’s servers to be transformed.
An alternative approach is to use a Conflict-free Replicated Data Type, or CRDT. It works similarly to an OT system, but does a bit more work when inserting a character. Rather than recording just the position, each character gets its own globally unique ID and those IDs are used in place of the position, plus some additional metadata. Using CRDTs for the second example above, we arrive at something like this:
In these operations we see some additional information compared to an OT operation:
insert(
// A replica ID. In reality you would use a random ID, not a username/user ID,
// since the same user could be on multiple computers
"mike",
// A sequence number, a monotonically increasing counter of the number of operations
// this replica has performed
6,
// A parent ID, which corresponds to the character to the left of this one
["mike", 5],
// The actual character
"t"
)
This additional metadata is enough to coordinate merging two or more changesets relatively quickly and in a peer-to-peer situation, and without manual conflict resolution like you might do with Git, which means that it’s much more suitable for local-first or offline-capable software.
However, it’s not without its downsides:
There are probably even more downsides than these, but you get the idea: a CRDT is a powerful data structure for collaboration, and even with these downsides it’s what I would recommend many people to start with when building something collaborative (but, go for an existing library), and this is the kind of data structure that I’m looking to use too.
Even before getting into the libraries, I want to cover a little bit about why I’m specifically looking for a Clojure CRDT library.
One way you can design this kind of system is to just trust the client to send valid data, and I would imagine that a lot of systems are built in this way, because it means you only need to figure out a client side library, of which many good ones exist (Yjs, Automerge). In a lot of cases I imagine it works well since the frontend can take all the responsibility for dealing with collaboration here, and the backend, if one even exists, can just model the data as an arbitrary blob.
This doesn’t work if you have more complex requirements for storage, or if you want a cached value that isn’t just the individual CRDT operations. I can imagine a theoretical system where you’d want to model a document as a single big map CRDT, with a key for the text, another for styles, and another for comments, and you want to have a role in that system that allows for commenting but not editing the text, and where you would just want direct CRDT operations rather than something more complex like event sourcing. Obviously there are different ways to approach it, but regardless of the approach the backend would need to understand the CRDT operation directly, possibly even performing the CRDT operation, which means now we have more than just the frontend needing to understand the CRDT.
In my case I simply want to check that the incoming operation is sane, to prevent some obviously malicious operation from being persisted, but the principle still stands: I want to be able to perform a CRDT operation on both the frontend and backend, which means I need a shared algorithm, ideally a shared library, and I need it to be in Clojure for obvious reasons.
Unfortunately, the landscape of Clojure CRDTs isn’t ideal for me. I found three libraries, but they all have issues and/or don’t meet certain requirements of my setup:
tx-index
on top of the Replica ID and vector clock) and is something I don’t want as I already have transaction logic in the Operations system.While I was researching these libraries, I was concurrently reading up about how collaborative text algorithms generally work (partly so I can explain it here!), and found Joseph Gentle. He’s done a lot of work in the area, for example helping work on Google Wave, creating the previously mentioned ShareJS, and since then has both published academic work and created implementations of high performance CRDTs. He really knows his stuff, and very helpfully has made a series of reference implementations and Youtube videos where he talks about building a CRDT from scratch. I didn’t originally plan to build a CRDT, partly because I had perceived them as very complex things that you should basically never implement yourself, and while I still believe this in the general, after watching his videos I was given the confidence that it’s not so complex as to make it untenable for me to do it for this project, and so that’s what I did: I created an implementation of one of the newest collaborative text algorithms, Eg-walker, in Clojure for my project.
Without going into too many of the details — if for no other reason that this post is long enough and Joseph explains this stuff better than I could — the Eg-Walker algorithm is essentially the best of the OT and CRDT worlds, such that you can store operations similar to OT (just the position and character of the change), you don’t need to store tombstones and eat that storage cost of that, and you get guaranteed conflict-free merging of two documents like in a CRDT (with minimized interleaving too). For my project specifically it’s also nice that it fits well into the Operations system too. I imagine that in the future modern collaborative text systems will use such an algorithm by default, or something along these lines, since its tradeoffs are reduced compared to OT and CRDT.
As of the date of publishing this dev log, my implementation isn’t ready for open sourcing (I want to add more tests, and generally clean it up), but I plan to do so and I’ll update this post when I do. I’m looking forward to sharing what I’ve learned and implemented to the community so that they can get the benefits of a powerful collaborative text library too. Stay tuned (follow me on Github).◾️
P.S. I’ve started a new job at Tower! On the positive side, I’m enjoying it so far and learning a lot (I’ve never programmed in Go or Rust before), the team is welcoming, and to be honest it’s kind of nice to be back to a regular schedule of working, but on the negative side, that means I’m going to have less time to focus on this project. This does not mean I’m putting the project on ice like it was back in 2024, but it does mean that work on it and updates in the form of these logs will be less frequent than they already were. I still strive to keep a cadence of monthly/bi-monthly logs, but not sure how realistic it is.
If you’re reading this: 1. thanks! 2. I’ll add an RSS feed at some point so you don’t have to manually check.