597b4 - 2y
If you notice your likes registering more slowly and new notes taking longer to appear in your feed, it's not NostrGram, it's the relays. They're lagging right now from the traffic. I'm currently seeing an average of 10-15 new events per second going into the db. Given that most relays seem to run on fairly light hardware, that will bog them down. (Yet another reason to support paid relays with good hardware!)
7ef7d - 2y
Has anyone done an analysis of relay architecture? I guess it’s early days. But some of these implementations just don’t look like they’d scale from the outset.
I'm not sure, but I doubt anybody in the nostr space thought it would take off the way it has, so I imagine that kind of analysis is coming sooner rather than later.
I’m very new to Nostr but I can completely understand that. It could be some time before well-built relays emerge. It seems to me that at the moment relays hit a limit and there are only two solutions: 1) limit clients with a paywall, or 2) add more relays. Ideally a single relay would scale horizontally? Which doesn’t appear to be happening at the moment?
Are any relays sharding?
I haven't looked at the architecture, but as far as I know they aren't.
aff9a - 2y
I don’t know of any linearly scalable relays yet. Is that what you’re looking for? The largest relays have <10,000 concurrent users at this point.
Relay availability is not an issue; there are plenty right now, so users can technically flock elsewhere whenever a given relay stops functioning. That experience might work for niche communities but is obviously not ideal for any global social network experience. At bigger scales most relays will bow out because the economics aren’t there. I’m already running a relay at a complete loss every month. I have a lot of distributed systems experience and it’d be fun to build out nostr, but I couldn’t afford to operate a big cluster, which is what you’d need to build it right. An alternative model would be some kind of sharded peer arrangement where smaller nodes work together to support global load, but I need to think about how to make that possible and how to identify bad (flaky or censoring etc) peers.
Luigi @Luigi - 2y
I’m impressed by how reliable your relay is, though. I’ve been connected to it since the very beginning and I believe I have never seen it down. Thanks!
Thanks! I’ve optimized a lot but I still have work to do in certain areas. It’s hung in there, but to be fair I’m a mid-tier relay, i.e. at about 500 concurrent users, whereas the biggest are an order of magnitude over that from what I gather. I’ve load tested my relay at 10,000 simultaneous active firehose queries and it dies well on modest 2 vcpu hardware. I would like to think I could serve up to a few hundred thousand users on a single basic machine eventually. (But I’ll need to monetize to afford the operational costs at that point.)
“does well” not “dies well” lol 🔪
Elasticsearch would be a perfect fit for nostr’s query and access patterns, and it’s linearly scalable. I’m surprised no one has made an ES-based relay yet.
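As a rough illustration of the fit: a minimal sketch of translating a NIP-01-style filter into an Elasticsearch query with the official @elastic/elasticsearch JS client. The index name and field mapping here are my assumptions; as far as I know no ES-based relay exists yet.

```typescript
// Hypothetical sketch: mapping a NIP-01 filter onto an ES bool query.
// "nostr-events" is an assumed index name, not an existing relay's schema.
import { Client } from "@elastic/elasticsearch";

const es = new Client({ node: "http://localhost:9200" });

interface NostrFilter {
  authors?: string[];
  kinds?: number[];
  since?: number; // unix seconds
  until?: number;
  limit?: number;
}

async function queryEvents(filter: NostrFilter) {
  const must: object[] = [];
  if (filter.authors) must.push({ terms: { pubkey: filter.authors } });
  if (filter.kinds) must.push({ terms: { kind: filter.kinds } });
  if (filter.since != null || filter.until != null) {
    must.push({ range: { created_at: { gte: filter.since, lte: filter.until } } });
  }
  const res = await es.search({
    index: "nostr-events", // assumed index name
    query: { bool: { must } },
    sort: [{ created_at: "desc" }],
    size: filter.limit ?? 100,
  });
  return res.hits.hits.map((h) => h._source);
}
```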
e54b8 - 2y
If a single basic machine can serve 100k+ users, it means Nostr can be as undefeatable as Bitcoin. Appreciate your contribution, sir!🎖️
That sharded peer model is an interesting idea.
Interesting, I will look into it. I’m working on a basic relay just to understand it better, although it’s slow going as I’m simultaneously learning the Gleam language (built on the Erlang BEAM). I can’t say at this stage I can intuit where the scalability issues are.
An early dumb version that could go a long way might look something like this. A relay could publish that it serves a given shard of users. For example, my relay could declare that it serves npubs starting with “aa” through “az”, and other relays could do the same. Clients would then just have to tell a user if their relay set doesn’t have enough shard coverage.

Whenever the next order of magnitude of users arrives, relays would want to reshard, and new or underutilized relays would need to redeclare their shards; there would also need to be a way to move users onto the new shards. Maybe some kind of registries would be in order here. In fact, registries could be the key branded thing that clients watch for censorship, switching to new ones as needed. Historical data would have to get shared around for reshards, but there could be people out there who just keep archives, or it could live in ipfs or something.

You’d still need clients to keep users able to understand and identify a bad relay shard that is censoring, underperforming, or even overserving, or a registry that is promoting shards you don’t like. But I wonder how far you could scale nostr globally with a dumb strategy like this? One important point: niche relays should not do anything like this and should find other ways to scale (monetize etc); this would be for the “global” nostr network experience only.
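To make the coverage check concrete, a minimal client-side sketch, assuming relays declare an npub-prefix range somehow. How they publish it (a registry, an event kind) is left open here, so the ShardDecl shape and the lexical prefix ranges (following the “aa”–“az” example above) are hypothetical:

```typescript
// Hypothetical sketch of the coverage check described above.
// ShardDecl is an assumed shape, not an existing NIP.
interface ShardDecl {
  relay: string; // e.g. "wss://relay.example.com" (placeholder)
  from: string;  // first covered npub prefix, e.g. "aa"
  to: string;    // last covered npub prefix, e.g. "az"
}

// All two-character prefixes of the npub body (bech32 charset), sorted
// lexically so they can be compared against the declared ranges.
const BECH32 = "qpzry9x8gf2tvdw0s3jn54khce6mua7l";
const ALL_PREFIXES = [...BECH32]
  .flatMap((a) => [...BECH32].map((b) => a + b))
  .sort();

// Which prefixes does this user's relay set fail to cover? A client
// could warn the user whenever this list is non-empty.
function missingCoverage(relaySet: ShardDecl[]): string[] {
  return ALL_PREFIXES.filter(
    (p) => !relaySet.some((d) => p >= d.from && p <= d.to),
  );
}
```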
Just saw this too #[4]
*Just* made a note about this -> #[4]
d7b76 - 2y
I proposed something similar some time ago, although it was sharding by note hash, not by user. I would get a set of relays to agree to signal that they participate in a DHT protocol, then build a client that uses rendezvous hashing to distribute posts and queries.
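For the curious, rendezvous (highest-random-weight) hashing is simple enough to sketch. A minimal version, with an illustrative relay list and replication factor k:

```typescript
// Sketch of rendezvous hashing over relays. Each note id is served by
// the k relays with the highest hash(relayUrl + noteId) score; every
// client computes the same set, so no coordination is needed.
import { createHash } from "node:crypto";

function score(relay: string, noteId: string): bigint {
  const digest = createHash("sha256").update(relay + noteId).digest();
  // Interpret the first 8 bytes of the digest as an unsigned 64-bit score.
  return digest.readBigUInt64BE(0);
}

// The k relays responsible for storing and serving a given note id.
function relaysFor(noteId: string, relays: string[], k = 3): string[] {
  return [...relays]
    .sort((a, b) => {
      const diff = score(b, noteId) - score(a, noteId);
      return diff > 0n ? 1 : diff < 0n ? -1 : 0;
    })
    .slice(0, k);
}

// Example with placeholder relay URLs:
const relays = ["wss://relay-a.example", "wss://relay-b.example", "wss://relay-c.example", "wss://relay-d.example"];
console.log(relaysFor("deadbeef", relays, 2));
```

Since all clients compute the same top-k set for a given note id, posts and queries for that id land on the same relays, and losing one relay only disturbs 1/n of the keyspace.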
0b39c - 2y
I’ll share other resources as I come across them. This thread is discussing the topic; I like lurking on it 👀👀 #[3]
21a8d - 2y
Following 👀
How would you distribute query load in this model? The common query is for notes from a user’s follows.
This will depend to a significant degree on which relay implementation you are using. If the implementation doesn't use multiple cores, then you won't see much advantage from scaling out vcpus. Also consider which db the relay uses and how it uses it: does it support concurrent writes? Does the relay index subscriptions so it doesn't have to iterate over every subscription each time a new event arrives?

We probably need to improve our relay code. Very modest hardware with little concurrency should be able to support much more than 20 notes per second. My relay runs on the smallest gcp machine (2 vcpus) and median *note* write latencies are <3ms. This is with little optimization on writes, so it should be able to support at least hundreds of writes/sec on basic hardware eventually. Indexing subscriptions should keep the time to dispatch a note to subscriptions small, assuming most subscriptions are not firehose/global feed filters.
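To illustrate the subscription-indexing point, here's a minimal in-memory sketch that buckets subscriptions by author, so a new event is only tested against plausibly matching filters rather than every open subscription. The types and structure are hypothetical, not from any particular relay implementation:

```typescript
// Hypothetical subscription index: bucket subs by author so dispatch
// doesn't iterate over every open subscription for every new event.
interface NostrEvent { id: string; pubkey: string; kind: number; }
interface Filter { authors?: string[]; kinds?: number[]; }
interface Sub { id: string; filter: Filter; send: (e: NostrEvent) => void; }

const byAuthor = new Map<string, Set<Sub>>();
const firehose = new Set<Sub>(); // filters with no author constraint

function addSub(sub: Sub) {
  if (sub.filter.authors?.length) {
    for (const a of sub.filter.authors) {
      if (!byAuthor.has(a)) byAuthor.set(a, new Set());
      byAuthor.get(a)!.add(sub);
    }
  } else {
    firehose.add(sub);
  }
}

function dispatch(event: NostrEvent) {
  // Only firehose subs plus subs indexed on this event's author are candidates.
  const candidates = new Set<Sub>(firehose);
  for (const s of byAuthor.get(event.pubkey) ?? []) candidates.add(s);
  for (const s of candidates) {
    // Final exact check against the remaining filter fields (kinds here).
    if (!s.filter.kinds || s.filter.kinds.includes(event.kind)) {
      s.send(event);
    }
  }
}
```

As long as most filters constrain authors, dispatch cost tracks the number of matching subscriptions rather than the total number of open ones.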
Good question; I misspoke. The queries would still be issued to all the nodes in the shard group, but each would respond with a subset of the total data, rather than every node replying with all of it.
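In other words, a scatter-gather pattern. A minimal client-side sketch, where queryRelay stands in for whatever REQ/EOSE plumbing the client already has (its signature here is an assumption):

```typescript
// Sketch of scatter-gather over a shard group: the same filter goes to
// every node, each returns only its subset, and the client merges and
// de-duplicates by event id.
interface NostrEvent { id: string; created_at: number; }

async function scatterGather(
  shardGroup: string[],
  filter: object,
  queryRelay: (url: string, filter: object) => Promise<NostrEvent[]>,
): Promise<NostrEvent[]> {
  // Fan out; allSettled tolerates flaky shard members.
  const results = await Promise.allSettled(
    shardGroup.map((url) => queryRelay(url, filter)),
  );
  const seen = new Map<string, NostrEvent>();
  for (const r of results) {
    if (r.status === "fulfilled") {
      for (const ev of r.value) seen.set(ev.id, ev); // de-dupe by event id
    }
  }
  // Merge into one reverse-chronological list, as a feed expects.
  return [...seen.values()].sort((a, b) => b.created_at - a.created_at);
}
```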
Cameri @Cameri - 2y
wss://eden.nostr.land is load balanced