I use DuckDuckGo, but I realised these big(ish) search engines give me all the commercialised results. DuckDuckGo has been going downhill for years, though not as fast as Google or Bing have.
I want to have a search engine that gives me all the small blogs and personal sites.
Does something like this exist?
Don’t know if this fits your criteria, but I’ve been using Gruble a lot recently. You can personalise the look and language in the settings, plus it’s open source.
For info: That’s (just) a SearXNG instance. That’s a metasearch engine, getting results from Google etc and proxying and aggregating them for you.
the link should be: https://gruble.de/. But as stated it’s “just” a SearXNG instance. See the full list: https://searx.space/
I’m intrigued. The search results are more akin to how they used to be 25 years ago on the internet that I loved
https://search.marginalia.nu is definitely something I’ll be exploring going forward!
Replying under the top comment, but this really applies to all of these: how do these search engines determine what counts as a personal site? For example, I had procrastinated for years on finally spinning up a static, barren HTML blog. The infamous Lucidity AI post introduced me to Mataroa and I got over the hump and started writing. Would that get indexed?
Does it just crawl through webrings?
Mojeek
Thanks for the rec, I’ll give Mojeek a try for a while. So far the results seem better than Brave (which I didn’t seriously consider using regularly anyway) but I miss the bang options (!w, !yt, etc.) that DDG has.
Our Search Choices might be of use here; it’s a different implementation but similar: https://blog.mojeek.com/2022/02/search-choices-enable-freedom-to-seek.html
our Search Choices might be of use here
Thanks, I think that’s a valuable option! It’s probably not what I was looking for, though. As I understand it, a “bang” is just a way to use the search on a specific website, a nice little hack to speed up searches on commonly used sites (e.g., Wikipedia, YouTube, BBC). I can probably get used to going straight to those sites, but it was the feature that got me using DDG at first and broke my reliance on Google.
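For anyone curious what a bang boils down to, here is a minimal sketch in Python, assuming a tiny hand-written table. The BANGS table, the URL templates and the resolve_bang name are my own illustration, not how DDG actually implements it:

```python
from urllib.parse import quote_plus

# Hypothetical bang table: each shortcut maps to a site-search URL template.
# These templates are assumptions for illustration only.
BANGS = {
    "!w": "https://en.wikipedia.org/wiki/Special:Search?search={}",
    "!yt": "https://www.youtube.com/results?search_query={}",
}
# Fallback when the query contains no known bang.
DEFAULT = "https://duckduckgo.com/?q={}"


def resolve_bang(query: str) -> str:
    """Return the URL a bang-style query would be redirected to."""
    words = query.split()
    for i, word in enumerate(words):
        template = BANGS.get(word.lower())
        if template is not None:
            # Drop the bang itself and search the target site for the rest.
            rest = " ".join(words[:i] + words[i + 1:])
            return template.format(quote_plus(rest))
    return DEFAULT.format(quote_plus(query))


if __name__ == "__main__":
    print(resolve_bang("!w small web"))
    print(resolve_bang("marginalia search"))
```

So the search engine never runs the query itself; it only rewrites it into a redirect to the target site's own search page.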
You’re looking for Kagi.com
Not only does it give better-quality search results on “the big web”, you can also choose to search specific parts, like blogs.
Best part - it’s completely ad and spam free. You pay for it with actual money instead of with your data.
Why not run a SearXNG instance and help everyone instead? Y’know, Kagi is pretty expensive and they are also getting into AI shit.
I’m hoping that, just as Proton does good free stuff using the money I pay them (Visionary account), Kagi does or will do the same. The Internet as a whole needs to stop being ad-supported.
I refuse to believe Proton when they do advertisements lol. They are also being pretty suspicious by ignoring XMR support despite people requesting it for years. If they had ever considered it even a bit, their new shit Proton Wallet wouldn’t only allow you to store bitcoin, which we all know has nothing that protects your privacy.
The Internet as a whole needs to stop being ad-supported.
I’m with you to an extent but it also makes me consider what my online experience would have been if I needed money to do anything online. The internet was a huge part of my childhood and I definitely didn’t have money to spend on it.
We barely had enough to get internet when I was ~10yrs old and it was much later when we got something better than dial up.
I’ve signed up for the €5 a month subscription at kagi and I’ve never used my whole quota.
Granted I expect it’s overly expensive if you live in a developing country like Eritrea or the United States
5 euros a month for 300 searches. Definitely not worth it. I live in Germany.
Can you expand on how running your own SearXNG helps others? Does it contribute to some shared index or something?
SearXNG is a metasearch engine, which means it gets the search results from other search engines (Google, Bing, Qwant, etc.) and shows them to you. It acts as a proxy, thus hiding the user’s IP. This means Google can’t target ads based on your IP and also can’t build a profile of you.
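To make the “aggregating” part concrete, here is a rough sketch in Python of how a metasearch engine could merge ranked lists from several upstream engines. This is not SearXNG’s actual algorithm; the engine names, the normalize and merge_results helpers, and the reciprocal-rank scoring are all assumptions for illustration. In a real instance the lists would be fetched by the server, so the upstream engines only ever see the instance’s IP:

```python
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Build a comparison key: lowercase host without 'www.', path without trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(per_engine: Dict[str, List[str]]) -> List[str]:
    """Merge ranked URL lists; a URL scores 1/rank for each engine that returned it."""
    scores = defaultdict(float)
    first_seen = {}
    for urls in per_engine.values():
        for rank, url in enumerate(urls, start=1):
            key = normalize(url)
            scores[key] += 1.0 / rank
            first_seen.setdefault(key, url)  # keep the first spelling we saw
    # Highest score first; ties keep the order in which the URLs were first seen.
    ranked = sorted(scores, key=lambda k: -scores[k])
    return [first_seen[k] for k in ranked]


if __name__ == "__main__":
    # Hypothetical upstream responses, hard-coded instead of fetched.
    upstream = {
        "engine_a": ["https://example.org/blog/", "https://example.com/shop"],
        "engine_b": ["https://www.example.org/blog", "https://example.net/notes"],
    }
    for url in merge_results(upstream):
        print(url)
```

A page that several engines agree on floats to the top, and duplicates collapse into a single entry.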
Teclis - Includes search results from Marginalia, free to use at the moment. This search index has been closed down in the past due to abuse.
Kagi, which created Teclis, is a paid search engine (a metasearch engine, to be more precise) that also incorporates these search results into its normal searches. I warmly recommend giving Kagi a try, it’s great, I’ve been enjoying it a lot.
–
Other options I can recommend: you could always try to host your own search engine if you have a list of small-web sites in mind or don’t mind spending some effort collecting such a list. I personally host Yacy [github link] (and SearXNG to interface with Yacy and several other self-hosted indexes/search engines, such as Kiwix wikis). Crawling and indexing your own search results is surprisingly light on resources, and can be run on your personal machine in the background.
Not just a meta search engine though - they do have their own index as well.
https://help.kagi.com/kagi/search-details/search-sources.html
I tried running yacy for a while but it just ran for a bit less than a day then ran out of memory and crashed, over and over. Tried to figure out the problem, but it’s niche enough that I couldn’t get anywhere googling the issue.
This is a bit off-topic, but did you try to increase the JVM limits inside Yacy’s administration panel?
Spoilering to hide wall of text related to this topic.
This setting is located on the /Performance_p.html page, where you can, for example, give the Java runtime more memory. The same page also has other settings related to RAM, such as how much memory Yacy must leave unused for the system. (These settings exist so that people who run Yacy on their personal machines can have guaranteed resources for more important stuff.)
Another thing that would reduce memory usage is limiting the concurrency of the crawler, for example. There are quite a lot of tunable settings that can affect memory usage. I would also recommend hitting up one of the Yacy forums, which are a good place to ask questions. The Matrix channel (and IRC) are a bit dead, but there are a couple of people there, including myself!
Also, there are new docs written by the community, they might help as well! https://yacy.net/docs/ https://yacy.net/operation/performance/