This Week at Zed Industries: #14
August 4th, 2023
It's been another busy week at Zed. Our search is getting a major new feature, WebAssembly extensions are getting off the ground, and we're getting close to shipping our new collaboration channels feature.
Kyle
This week was mostly focused on reworking the Search UI to incorporate the new Semantic Search feature we've been working on. Piotr and I have got the UI to a state that feels pretty smooth and intuitive, and we are hoping it will be an improvement over the previous UI. Beyond this, there were a few small pieces of exploration focused on the future of AI at Zed and on expanding our retrieval engine to incorporate reranking models.
Piotr
This week I was reworking the search UI together with Kyle. It will most likely be a part of next week's Preview. In the following week I plan to apply the UI changes to the buffer search as well.
Joseph
This week, I've built some charts to show which file types are most commonly opened by users who have stopped using Zed, to try to get more data points on what languages people are trying to use in Zed. I brainstormed a bit with Julia on what additional charts could be built that would help to paint a better picture of what our churned users are looking for. Outside of community work and some charting, I landed a few small PRs into Zed; one adds convert to {upper,lower} case commands to the editor (thanks for the assistance, Julia). Next week, I plan to continue building charts for churned users and I might explore building a local tool to aggregate user feedback.
Nathan & Antonio
The most interesting thing Antonio and I focused on this week was synchronization between two CRDB repositories. Our initial version relied on exchanging vector timestamps which contained a monotonically increasing counter for each replica id. We used this to quickly determine which operations the client was missing from the server, and vice versa. Because these timestamps grow with every replica id, this approach meant we would need to centrally assign replica ids and also maintain state on a given replica to track its id, which comes at the cost of operational complexity.
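To make the vector timestamp idea concrete, here is a minimal sketch of how one side could work out which operations its peer is missing. The `Operation`, `VectorTimestamp`, and `missing_from_peer` names are hypothetical and only for illustration; they are not CRDB's actual types.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; not CRDB's actual data model.
type ReplicaId = u32;

#[derive(Clone, Debug, PartialEq)]
struct Operation {
    replica_id: ReplicaId,
    /// Per-replica counter, starting at 1 and increasing by one per operation.
    seq: u64,
}

/// A vector timestamp: the highest counter observed for each replica id.
/// Note that it holds one entry per replica id it has ever seen.
#[derive(Clone, Debug, Default)]
struct VectorTimestamp(HashMap<ReplicaId, u64>);

impl VectorTimestamp {
    /// Record that `op` has been observed.
    fn observe(&mut self, op: &Operation) {
        let entry = self.0.entry(op.replica_id).or_insert(0);
        *entry = (*entry).max(op.seq);
    }

    /// Whether an operation with this replica id and counter is already covered.
    fn has_seen(&self, op: &Operation) -> bool {
        self.0.get(&op.replica_id).copied().unwrap_or(0) >= op.seq
    }
}

/// Operations in `local_ops` that a peer summarized by `peer_version` has not seen.
fn missing_from_peer(local_ops: &[Operation], peer_version: &VectorTimestamp) -> Vec<Operation> {
    local_ops
        .iter()
        .filter(|op| !peer_version.has_seen(op))
        .cloned()
        .collect()
}
```

Each side sends its `VectorTimestamp` and the other replies with `missing_from_peer(...)`. The map gains an entry for every replica id, which is the growth problem described above.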
So we decided to invest a bit more at the theoretical level and find a synchronization strategy that doesn't depend on vector timestamps. Ultimately, the state of the repository is derived from a set of operations that are ordered by Lamport timestamp, so that new operations tend to cluster on the right side of the operation history.
After exploring a couple of different approaches, we're arriving at a solution that involves determining the length of a shared prefix of the operation sequence between client and server. We use the Bromberg associative hash function along with our generic B-tree data structure to compute a digest for any range of operations in O(log(n)) time. When the client initiates a sync, they send digests covering several ranges. Starting with the largest range sent by the client, the server looks at successively smaller ranges until it finds a digest that matches. This is the common prefix. The difference in size between the matching range and the next matching range determines our potential overshoot... The common prefix is potentially larger, but we don't have enough resolution to know.
If the potential error is smaller than our maximum, we can just send all the operations following the shared prefix. Some will already be on the client, but this is okay. If the potential error exceeds our maximum, we can perform an additional round trip to refine the answer. It's all about achieving a balance between the latency of round trips and the cost of sending redundant operations.
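The prefix-matching step can be sketched roughly as follows. This is a simplified illustration under several assumptions, not our implementation: operations are plain `u64` ids already in Lamport order, the digest is a stand-in associative hash built from 2x2 matrices modulo a prime (not the Bromberg function), prefixes are hashed with a linear fold rather than B-tree summaries, and the probe spacing in `probe_lengths` is just one plausible choice.

```rust
/// Stand-in associative digest: a 2x2 matrix modulo a prime. Matrix
/// multiplication is associative, so digests of adjacent ranges compose.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Digest([u64; 4]);

const P: u64 = 1_000_000_007;

impl Digest {
    const IDENTITY: Digest = Digest([1, 0, 0, 1]);

    /// Hypothetical encoding of one operation id as a determinant-1 matrix.
    fn of_op(op_id: u64) -> Digest {
        let x = op_id % P;
        Digest([1, x, 1, (x + 1) % P])
    }

    /// Order-sensitive, associative combination (matrix product mod P).
    fn combine(self, other: Digest) -> Digest {
        let [a, b, c, d] = self.0;
        let [e, f, g, h] = other.0;
        Digest([
            (a * e + b * g) % P,
            (a * f + b * h) % P,
            (c * e + d * g) % P,
            (c * f + d * h) % P,
        ])
    }
}

/// Digest of the first `len` operations (linear here; a tree of cached
/// summaries would make this logarithmic).
fn prefix_digest(ops: &[u64], len: usize) -> Digest {
    ops[..len]
        .iter()
        .fold(Digest::IDENTITY, |acc, &op| acc.combine(Digest::of_op(op)))
}

/// Prefix lengths to probe: n, n-1, n-2, n-4, ... and finally 0, so resolution
/// is finest near the end of the history where new operations cluster.
fn probe_lengths(n: usize) -> Vec<usize> {
    let mut lengths = vec![n];
    let mut step = 1;
    while step < n {
        lengths.push(n - step);
        step *= 2;
    }
    if n > 0 {
        lengths.push(0);
    }
    lengths
}

/// What the client sends: (prefix length, digest) pairs, largest first.
fn client_probes(client_ops: &[u64]) -> Vec<(usize, Digest)> {
    probe_lengths(client_ops.len())
        .into_iter()
        .map(|len| (len, prefix_digest(client_ops, len)))
        .collect()
}

/// Server side: returns (known shared prefix length, potential error), where
/// the true shared prefix may be up to `potential error` operations longer.
fn find_shared_prefix(server_ops: &[u64], probes: &[(usize, Digest)]) -> (usize, usize) {
    let mut next_larger: Option<usize> = None;
    for &(len, digest) in probes {
        if len <= server_ops.len() && prefix_digest(server_ops, len) == digest {
            let error = next_larger.map_or(0, |larger| larger.saturating_sub(len + 1));
            return (len, error);
        }
        next_larger = Some(len);
    }
    (0, 0)
}
```

With the returned pair `(shared, error)`, the server would either send `server_ops[shared..]` right away or, if `error` exceeds the chosen maximum, ask for a finer set of probes between `shared` and `shared + error`.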
I also worked on expressing UI layout in GPUI, and it's really starting to come together. Here's an excerpt from a test I'm working on:
A lot of frameworks use macro DSLs, but I think there are complexity trade-offs. I really like the idea of expressing UI in pure Rust, and so far, I think this function chaining approach is reasonable.
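As a generic illustration of what function chaining for layout can look like in pure Rust, here is a small sketch. It is not GPUI's API: `Element`, `div`, `width`, `height`, `padding`, `child`, and `sidebar` are all hypothetical names chosen for this example.

```rust
/// Hypothetical element type for illustration; not GPUI's API.
#[derive(Debug, Default, PartialEq)]
struct Element {
    width: Option<f32>,
    height: Option<f32>,
    padding: f32,
    children: Vec<Element>,
}

/// Start a new, empty element.
fn div() -> Element {
    Element::default()
}

impl Element {
    // Each method takes `self` by value and returns it, so calls can be chained.
    fn width(mut self, width: f32) -> Self {
        self.width = Some(width);
        self
    }

    fn height(mut self, height: f32) -> Self {
        self.height = Some(height);
        self
    }

    fn padding(mut self, padding: f32) -> Self {
        self.padding = padding;
        self
    }

    fn child(mut self, child: Element) -> Self {
        self.children.push(child);
        self
    }
}

/// A layout expressed as ordinary function calls, with no macro DSL involved.
fn sidebar() -> Element {
    div()
        .width(240.0)
        .padding(8.0)
        .child(div().height(32.0))
        .child(div().height(400.0))
}
```

Because the layout is just Rust expressions, regular tooling (type checking, autocomplete, refactoring) applies to it without any macro-specific support.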
Max
This week, Mikayla and I have continued work on channels, Zed's new feature for structuring collaboration. We have a UI for managing channels and memberships that's now mostly feature-complete, but ugly. Next week, we'll finish up styling it, and hopefully ship it to the preview channel.
Mikayla
This week, we built out all the core UI and interactions for our Channels feature! Max and I have had a lot of fun working on this feature, and we're excited to see how quickly it's coming together. As we get close to an internal MVP, I thought I'd talk a little bit about how and why we've built this feature.
Fundamentally, channels are a way for people to organize where and what they're working on. Just like with our actual work, channels can be organized hierarchically into a rough selection of topics. We're imagining something like:
- #zed
- #livestreaming
- #public-channels
- #channels-core
- #search-and-replace
- #design
- #wasm
Clicking on any of these channels immediately starts a Zed call and broadcasts your presence in the channel to anyone else who's joined. Kinda like a Slack huddle, but with a little more structure and no text chat (yet).
But our work isn't purely hierarchical: it can't always be cleanly separated into individual topics, and we often collaborate with other teams and projects. To accommodate this reality, we've modeled our channels data structure as a DAG (directed acyclic graph), allowing you to link things together however it makes sense. As an example, we use LiveKit to power our microphone and screen sharing, so you could imagine a shared channel between our two organizations:
#zed
- > #livestreaming
- #search-and-replace
- #design
- #livekit
- #screen-sharing
- #web-rtc
Or you could have cross-cutting concerns for a specific feature:
#zed
- > #livestreaming
- #search-and-replace
- #replace-design
- #design
- #replace-design
- > #livekit
Or you could organize your channels into teams and projects:
- #crdb-industries
- #design
- #alpha
- #lamport-timestamps
- #branches
- #repository-synchronization
- #november
- #gpui-layout
- #branches
- #repository-synchronization
Still finding the tree too restrictive? Make another channel and organize them however you want!
- #my-freelance-design-business
- > #crdb-industries/#design
- > #zed/#design
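The linking shown in these examples can be sketched as a small graph type. This is a minimal illustration under assumptions, not our actual schema: `ChannelGraph`, `link`, and `reaches` are hypothetical names, and channels are reduced to numeric ids. The one property it demonstrates is that a channel may sit under several parents while links that would introduce a cycle are refused, which keeps the structure a DAG.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical id type for illustration.
type ChannelId = u32;

#[derive(Default)]
struct ChannelGraph {
    /// For each channel, the channels linked directly beneath it.
    children: HashMap<ChannelId, HashSet<ChannelId>>,
}

impl ChannelGraph {
    /// Whether `to` can be reached from `from` by following child links.
    /// A channel always reaches itself.
    fn reaches(&self, from: ChannelId, to: ChannelId) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(current) = stack.pop() {
            if current == to {
                return true;
            }
            if seen.insert(current) {
                if let Some(kids) = self.children.get(&current) {
                    stack.extend(kids.iter().copied());
                }
            }
        }
        false
    }

    /// Link `child` beneath `parent`. A channel may have many parents, but a
    /// link is refused if `parent` is already reachable from `child`, since
    /// that would create a cycle.
    fn link(&mut self, parent: ChannelId, child: ChannelId) -> Result<(), &'static str> {
        if self.reaches(child, parent) {
            return Err("link would create a cycle");
        }
        self.children.entry(parent).or_default().insert(child);
        Ok(())
    }
}
```

For example, a `#design` channel can be linked under both `#zed` and `#my-freelance-design-business`, but trying to link `#zed` under `#design` afterwards returns an error.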
What's powering this flexibility is that channels have two fundamental properties: permissions for any channel cascade down to all sub-channels with subtractive interference, and it is impossible to move 'up' the DAG without being invited by someone who has the permissions to do so. This means that, say, if we were to start using these channels to publish livestreams, we could create a #public channel for that:
#zed
- > #livestreaming
- > #search-and-replace
- > #design
- > #livekit
- #public
- #linux-support
- #windows-support
In this #public channel we could designate community members as moderators or generate a read-only link to it. And we could do all of this without ever having to worry that the people in the #public channel could get access to anything at Zed.
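A rough sketch of the cascading rule, under simplifying assumptions: membership is a plain yes/no per channel (no roles, and the subtractive part is not modeled), and `Channels`, `link`, `invite`, and `can_access` are hypothetical names. Access is checked by walking from a channel up through its parents, so an invitation to a channel covers everything beneath it and nothing above it.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical id types for illustration.
type ChannelId = &'static str;
type UserId = &'static str;

#[derive(Default)]
struct Channels {
    /// For each channel, the channels it is linked beneath.
    parents: HashMap<ChannelId, Vec<ChannelId>>,
    /// Users invited directly to each channel.
    members: HashMap<ChannelId, HashSet<UserId>>,
}

impl Channels {
    /// Place `child` beneath `parent` (cycle checks omitted for brevity).
    fn link(&mut self, parent: ChannelId, child: ChannelId) {
        self.parents.entry(child).or_default().push(parent);
    }

    /// Invite `user` directly to `channel`.
    fn invite(&mut self, channel: ChannelId, user: UserId) {
        self.members.entry(channel).or_default().insert(user);
    }

    /// A user can access a channel if they were invited to it or to any of
    /// its ancestors. Membership never grants access to an ancestor.
    fn can_access(&self, user: UserId, channel: ChannelId) -> bool {
        let mut stack = vec![channel];
        let mut seen = HashSet::new();
        while let Some(current) = stack.pop() {
            if !seen.insert(current) {
                continue;
            }
            let is_member = self
                .members
                .get(&current)
                .map_or(false, |members| members.contains(&user));
            if is_member {
                return true;
            }
            if let Some(parents) = self.parents.get(&current) {
                stack.extend(parents.iter().copied());
            }
        }
        false
    }
}
```

In this model, someone invited to `#public` can reach `#linux-support` and `#windows-support`, while `can_access` for `#zed` stays false until a person with the right permissions invites them there.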
Next week, I'll talk about some of the technical details for how we implemented this feature.
(Note that all of what I've said is subject to change; we're still experimenting with what feels right and what works best :D)
Julia
Something important that we've been wanting to do for a while is to give users the ability to add custom language support to Zed. We've been putting this off for a while because of the complexity involved, but this week I got the go-ahead to start working on it. We're starting with just adding custom language servers for now, as that will let us get a bunch of WASM things figured out before we add extensible tree-sitter grammar support. This means I've gotten to read and experiment with a bunch of cool WebAssembly stuff this week. There's still a bunch of work to do but I'm excited about where we're headed!