discover feed fever dream

the ramblings of a madwoman

%at=2025-06-30T23:21:50.498Z

#author_luna #bluesky #recommendation_algorithms

for the past month i've been working on a project called nagare.

nagare came to be because of an idea i've had for even longer (since 3 or 4 months ago): replicating the behavior of the twitter algorithmic feed (before it went to shit) on a 20 usd/mo vps using bluesky's data, as in, a feed that learns what you like and tries to show you more of it. the idea lived rent free in my mind until i was successfully tricked into writing the whole thing down in my internal notes, and then i decided to implement a basic version of it and put it out for some friends to vibe check.

even though the basic version is just a more complicated likes-from-follows, it works absurdly well for a certain number of users when they compare it against bsky discover.

beyond just likes-from-follows, it currently has:

- heuristic analytics based on interactions
  - lets the feed know what you already read
- a basic scoring system based on what you like and report (through a mod service)
  - basically a very cheap way to make show more/show less that actually works. the reports aren't processed globally across all users, they only apply to your own personalized feed
  - many thanks to mary who gave the report idea
- automatically condensing posts that are from the same thread, posts by the same author, etc
- time decay based on user reading patterns
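
to make the scoring and time decay items a bit more concrete, here's a minimal sketch of how they could fit together. this is not nagare's actual code: the `TokenScorer` class, the +1/-1 signals, and the idea of using the user's typical gap between reading sessions as the decay half-life are all assumptions made for illustration.

```python
from collections import Counter


class TokenScorer:
    """Per-user token weights (hypothetical): likes push tokens up, reports push them down."""

    def __init__(self):
        self.weights = Counter()

    def observe(self, text, signal):
        # signal is assumed to be +1.0 for a like and -1.0 for a report.
        # Tokens are just the lowercased text split on whitespace.
        for token in set(text.lower().split()):
            self.weights[token] += signal

    def score(self, text):
        # Average weight of the post's distinct tokens; unseen tokens count as 0.
        tokens = set(text.lower().split())
        if not tokens:
            return 0.0
        return sum(self.weights[t] for t in tokens) / len(tokens)


def decayed(score, age_hours, half_life_hours):
    """Exponential decay: the score halves every `half_life_hours`.

    The half-life is an assumed stand-in for "user reading patterns", e.g. the
    typical gap between two reading sessions of this user.
    """
    return score * 0.5 ** (age_hours / half_life_hours)


scorer = TokenScorer()
scorer.observe("cute cat pictures", +1.0)   # the user liked this post
scorer.observe("crypto giveaway", -1.0)     # the user reported this post

# a fresh post about cats ranks above an old one, and both rank above crypto
fresh = decayed(scorer.score("cat pictures today"), age_hours=1, half_life_hours=12)
stale = decayed(scorer.score("cat pictures today"), age_hours=48, half_life_hours=12)
spam = decayed(scorer.score("crypto news"), age_hours=1, half_life_hours=12)
```

because the weights live per user, a report only ever moves that one user's scores, which matches the "not processed globally" behavior described above.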

the thing i've been working on for the last week and a half is a way to expand follow sets, because of an issue i found in testing that's better described by a graph:

in this graph, i plot the number of followers a user has against the number of problems they're seeing in their basic likes-from-follows feed. at the low end of the spectrum (10 to 100 followers), the problems are shaped like "not enough activity, feed becomes stale". at the higher follow counts (1000+) the problems are shaped differently, mostly related to there being so much activity that i don't think i'd be able to afford the storage for that type of tester.

follow expansion was a basic algorithm i worked on that fetches likes (via the listRecords XRPC, thanks @bad-example.com!) and uses them to expand the follow set. it backfired: low quality posts ended up in everyone's feeds (since, for example, i just "virtually followed" all the posts from those users, without any filtering). it took me a long time to revert that change (did it yesterday, on the 29th) and i'm now thinking about what to do with the feed next.
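
for reference, this is roughly what fetching likes through `com.atproto.repo.listRecords` looks like, plus one possible filter on top. it's a sketch under assumptions, not what nagare did: the helper names, the `pds_host` parameter, and the "at least `min_likers` of your follows liked this author" rule are all hypothetical, and i'm assuming the expansion walks the likes of the accounts you follow.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter


def http_get_json(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def liked_post_uris(pds_host, did, max_pages=3, get_json=http_get_json):
    """Page through a repo's app.bsky.feed.like records and return the liked post URIs."""
    uris, cursor = [], None
    for _ in range(max_pages):
        params = {"repo": did, "collection": "app.bsky.feed.like", "limit": 100}
        if cursor:
            params["cursor"] = cursor
        url = f"{pds_host}/xrpc/com.atproto.repo.listRecords?" + urllib.parse.urlencode(params)
        page = get_json(url)
        for record in page.get("records", []):
            uris.append(record["value"]["subject"]["uri"])
        cursor = page.get("cursor")
        if not cursor:
            break
    return uris


def author_of(post_uri):
    # at://<author did>/app.bsky.feed.post/<rkey>
    return post_uri.removeprefix("at://").split("/", 1)[0]


def expansion_candidates(like_uris_by_follow, already_followed, min_likers=2):
    """Hypothetical filter: keep authors liked by at least `min_likers` distinct follows."""
    likers = Counter()
    for uris in like_uris_by_follow.values():
        # Count each follow once per author, no matter how many posts they liked.
        for author in {author_of(uri) for uri in uris}:
            likers[author] += 1
    return [
        author
        for author, count in likers.most_common()
        if count >= min_likers and author not in already_followed
    ]
```

`get_json` is injectable so the paging logic can be exercised without hitting a real PDS. the `min_likers` threshold is only one cheap answer to the "no filtering" problem, and whether it actually keeps low quality posts out is untested.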

what's next?

there are a lot of behaviors i could build in the future but can't right now, because i treat this as a side project that shouldn't burn too much money. i have a full time job.

it basically leads me to three options with the current project:

- call it quits, finish the project as-is, rebrand as "Likes From Follows" and expose that to the world
  - it wouldn't even fully work for me!! i follow 20 accounts!
- add more manual work on more obscure feed sources (which is what i'm doing, or trying to do). some example ideas:
  - "oh let me add a score for someone's images" (without being aware of the content of the image, which breaks when a user posts 2 different kinds of images and the reader only cares about one of them, etc. edge cases happen so much)
  - the current scoring system, which is backed by splitting a post's text on whitespace lol
  - follow expansion (for users with small follow sets): how to decide which users to expose based on the graph?
- attempt to get big data and see what i could do with it by myself. examples of what i could do:
  - get embeddings for all posts and research ways to cluster the data
    - "replicate" cluster/clusterv2 from discover?
    - create user embeddings to find other possible users and what they like?
    - could i do this right now with, say, two days of firehose data?
  - scoring models that take the embedding of a post and other signals to emit a 0-1 score value, learned from like data (unknown: what is the negative data? could it be posts from a user's follows that weren't liked? don't know!)
  - very much bitter lesson related.
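
as a toy version of the "embedding in, 0-1 score out" idea, here's a tiny logistic regression in plain python. everything in it is an assumption: the 2-dimensional "embeddings" are made up, real ones would come from some embedding model, and the negatives are taken to be posts from follows that weren't liked, which is exactly the open question above and not a settled answer.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function, always returns a value in (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_like_model(examples, dims, epochs=200, lr=0.5, seed=0):
    """Fit weights with plain SGD on log loss.

    examples: list of (feature_vector, label), label 1 = liked,
    label 0 = assumed negative (shown via follows but not liked).
    """
    rng = random.Random(seed)
    weights, bias = [0.0] * dims, 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            grad = p - y  # derivative of log loss w.r.t. the pre-sigmoid value
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def predict(model, x):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# made-up 2d "embeddings": first axis ~ stuff the user likes, second ~ stuff they skip
examples = [
    ([1.0, 0.1], 1),
    ([0.9, 0.0], 1),
    ([0.1, 1.0], 0),
    ([0.0, 0.8], 0),
]
model = train_like_model(examples, dims=2)
liked_like = predict(model, [1.0, 0.0])
skipped_like = predict(model, [0.0, 1.0])
```

"other signals" (post age, author, whether it has images) would just be extra entries in the feature vector. the real cost in option 3 is computing embeddings for the whole firehose, not this part.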

option number 3 has a direct cost attached to it, which is my argument for peeking at possible funding for the project. it's a project that may or may not succeed at its job of providing a good alternative to discover, but i can at least say i have something that works right now:

if anyone is interested, contact me