Hey everyone,

This isn’t an announcement, I just wanted people’s thoughts on this.

I think everyone knows searching the fediverse could be better. Googling doesn’t work too well, etc. So I wanted to do my part and help out.

Indexing all posts, etc. is quite a lot to handle, so I wanted to start small and focus on video search. I’ve started indexing videos from PeerTube and other video websites (even YouTube, though that could be removed to focus on independent sites).

I know PeerTube has their own search engine for videos. I will be reaching out to them. Compared to theirs, I’m planning for my site to have other video sources and be easier to use.

So that leads to feedback from you guys.

  • What do you think about indexing videos posted on the fediverse and other independent platforms?
  • Are there similar services?
  • Am I just wasting my time?

As the fediverse is almost exclusively run by volunteers that are paying server bills and being admins, I could see some larger instances not taking kindly to this, especially depending on how much stress it would be putting on some already at capacity servers.

Ideally, OP’s crawlers will just come from their own instance that other instance owners can defederate from if they want to opt out.

Yeah that would be the case.

That’s a good idea. Listen to public data being broadcast out, then you aren’t worrying people with scraping or anything. It would only be from go-live onward, but you would just be listening to the protocol.

How much bandwidth do you suppose a crawler would use? I’d guess very little

It will be very little if not downloading full html pages.
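For a rough sense of scale, here is a back-of-the-envelope sketch of that difference. The per-item sizes and the video count are made-up assumptions for illustration, not measurements from any real instance, and `estimate_transfer_mb` is a hypothetical helper.

```python
def estimate_transfer_mb(num_videos: int, bytes_per_item: int) -> float:
    """Total transfer in megabytes (1 MB = 1,000,000 bytes) for one full pass."""
    return num_videos * bytes_per_item / 1_000_000


# Illustrative, assumed sizes. Real numbers will vary per site.
METADATA_BYTES = 2_000      # assumed size of one video's metadata record
HTML_PAGE_BYTES = 200_000   # assumed size of one full HTML page

videos = 10_000             # assumed number of videos on the instances crawled

metadata_mb = estimate_transfer_mb(videos, METADATA_BYTES)
html_mb = estimate_transfer_mb(videos, HTML_PAGE_BYTES)

print(f"metadata only: {metadata_mb:.0f} MB, full HTML: {html_mb:.0f} MB")
```

Under these assumptions, fetching only metadata is a hundred times lighter than fetching full pages, and that cost is spread across every instance being crawled.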

I was thinking more in terms of resources (number of spider threads X posts/communities/users being indexed) that would be now dedicated to a bot, not so much network traffic that is probably tiny if not downloading images.

Right, it would be an initial hit, but if the bot was properly built it wouldn’t need to do full reindexing very often. I’m no expert, but I think it could be done in a way that there is no noticeable spike in traffic or anything.
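One way that could look, as a minimal sketch rather than how any existing crawler works: space out requests per host so no single instance sees a burst. `PoliteScheduler`, the 5-second interval, and the `video.example` URLs are all hypothetical names and values chosen for illustration.

```python
import time
from urllib.parse import urlparse


class PoliteScheduler:
    """Tracks, per host, the earliest time the next request is allowed."""

    def __init__(self, min_interval_s: float = 5.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s  # assumed gap between requests
        self.clock = clock                    # injectable so it can be tested
        self._next_allowed = {}               # host -> earliest next request

    def wait_needed(self, url: str) -> float:
        """Seconds to wait before this URL's host may be contacted again."""
        host = urlparse(url).netloc
        now = self.clock()
        return max(0.0, self._next_allowed.get(host, now) - now)

    def record_request(self, url: str) -> None:
        """Call right after a request so the host gets its quiet period."""
        host = urlparse(url).netloc
        self._next_allowed[host] = self.clock() + self.min_interval_s


scheduler = PoliteScheduler(min_interval_s=5.0)
url = "https://video.example/some-video"  # placeholder URL, not a real site

delay = scheduler.wait_needed(url)  # first contact with a host needs no wait
scheduler.record_request(url)
```

A real crawler would sleep for `wait_needed(url)` seconds before each fetch. Because the limit is per host, a small instance is contacted no more often than the interval allows, however many instances are crawled in total.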
