I got into the self-hosting scene this year when I wanted to start up my own website run on old recycled thinkpad. A lot of time was spent learning about ufw, reverse proxies, header security hardening, fail2ban.
Despite all that I still had a problem with bots knocking on my ports spamming my logs. I tried some hackery getting fail2ban to read caddy logs but that didnt work for me. I nearly considered giving up and going with cloudflare like half the internet does. But my stubbornness for open source self hosting and the recent cloudflare outages this year have encouraged trying alternatives.

Coinciding with that has been an increase in exposure to seeing this thing in the places I frequent like codeberg. This is Anubis, a proxy type firewall that forces the browser client to do a proof-of-work security check and some other nice clever things to stop bots from knocking. I got interested and started thinking about beefing up security.
I’m here to tell you to try it if you have a public facing site and want to break away from cloudflare It was VERY easy to install and configure with caddyfile on a debian distro with systemctl. In an hour its filtered multiple bots and so far it seems the knocks have slowed down.
My botspam woes have seemingly been seriously mitigated if not completely eradicated. I’m very happy with tonights little security upgrade project that took no more than an hour of my time to install and read through documentation. Current chain is caddy reverse proxy -> points to Anubis -> points to services
Good place to start for install is here


Anubis is an elegant solution to the ai bot scraper issue, I just wish the solution to everything wasn’t just spending compute everywhere. In a world where we need to rethink our energy consumption and generation, even on clients, this is a stupid use of computing power.
It also doesn’t function without JavaScript. If you’re security or privacy conscious chances are not zero that you have JS disabled, in which case this presents a roadblock.
On the flip side of things, if you are a creator and you’d prefer to not make use of JS (there’s dozens of us) then forcing people to go through a JS “security check” feels kind of shit. The alternative is to just take the hammering, and that feels just as bad.
No hate on Anubis. Quite the opposite, really. It just sucks that we need it.
Theres a compute option that doesnt require javascript. The responsibility lays on site owners to properly configure IMO, though you can make the argument its not default I guess.
https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
From docs on Meta Refresh Method
Meta Refresh (No JavaScript)
The
metarefreshchallenge sends a browser a much simpler challenge that makes it refresh the page after a set period of time. This enables clients to pass challenges without executing JavaScript.To use it in your Anubis configuration:
# Generic catchall rule - name: generic-browser user_agent_regex: >- Mozilla|Opera action: CHALLENGE challenge: difficulty: 1 # Number of seconds to wait before refreshing the page algorithm: metarefresh # Specify a non-JS challenge methodThis is not enabled by default while this method is tested and its false positive rate is ascertained. Many modern scrapers use headless Google Chrome, so this will have a much higher false positive rate.
This is news to me! Thanks for enlightening me!
Yeah I actually use the noscript extension and i refuse to just whitelist certain sites unless I’m very certain I trust them.
I run into Anubis checks all the time and while I appreciate the software, having to consistently temporarily whitelist these sites does get cumbersome at times. I hope they make this noJS implementation the default soon.
I feel comfortable hating on Anubis for this. The compute cost per validation is vanishingly small to someone with the existing budget to run a cloud scraping farm, it’s just another cost of doing business.
The cost to actual users though, particularly to lower income segments who may not have compute power to spare, is annoyingly large. There are plenty of complaints out there about Anubis being painfully slow on old or underpowered devices.
Some of us do actually prefer to use the internet minus JS, too.
Plus the minor irritation of having anime catgirls suddenly be a part of my daily browsing.
What would you propose as an alternative?
There’s a caddy config out there that works as well as Anubis without the catgirls and mining: https://fxgn.dev/blog/anubis/
That blog post is fundamentally misunderstanding what Anubis actually does.
No numbers, no testimonials, or even anecdotes… “It works, trust me bro” is not exactly convincing.
Not having catgirls is def a con
I’m with you here. I come from an older time on the Internet. I’m not much of a creator, but I do have websites, and unlike many self-hosters I think, in the spirit of the internet, they should be open to the public as a matter of principle, not cowering away for my own private use behind some encrypted VPN. I want it to be shared. Sometimes that means taking a hammering. It’s fine. It’s nothing that’s going to end the world if it goes down or goes away, and I try not to make a habit of being so irritating that anyone would have much legitimate reason to target me.
I don’t like any of these sort of protections that put the burden onto legitimate users. I get that’s the reality we live in, but I reject that reality, and substitute my own. I understand that some people need to be able to block that sort of traffic to be able to limit and justify the very real costs of providing services for free on the Internet and Anubis does its job for that. But I’m not one of those people. It has yet to cost me a cent above what I have already decided to pay, and until it does, I have the freedom to adhere to my principles on this.
To paraphrase another great movie: Why should any legitimate user be inconvenienced when the bots are the ones who suck. I refuse to punish the wrong party.
deleted by creator
Scarcity is what powers this type of challenge: you have to prove you spent a certain amount of electricity in exchange for access to the site, and because electricity isn’t free, this imposes a dollar cost on bots.
You could skip the detour through hashes/electricity and do something with a proof-of-stake cryptocurrency, and just pay for access. The site owner actually gets compensated instead of burning dead dinosaurs.
Obviously there are practical roadblocks to this today that a JavaScript proof-of-work challenge doesn’t face, but longer term…
The cost here only really impacts regular users, too. The type of users you actually want to block have budgets which easily allow for the compute needed anyways.
I think maybe they wouldn’t if they are trying to scale their operations to scanning through millions of sites and your site is just one of them
Yeah, exactly. A regular user isn’t going to notice an extra few cents on their electricity bill (boiling water costs more), but a data centre certainly will when you scale up.