Y
yacybot
YaCy decentralized search engine web crawler
About this crawler
yacybot is a web crawler identified by the regular-expression pattern yacybot in the User-Agent request header. It is categorised as search-engine. Use the regex above to detect, log, allow, or block yacybot traffic in your web server, CDN edge rules, or robots.txt.
Block-rate · top 25k sites
0.33%
Technical details
- Name
- yacybot
- Pattern
yacybot- Tags
- search-engine
- rDNS suffixes
.yacy.net,.yacybot.net- Instances
- 47 known sample(s)
rDNS verification (FCrDNS)
Verify a request is genuinely yacybot with forward-confirmed reverse DNS: the client IP's PTR record must end in one of the suffixes below and a forward A/AAAA lookup of that hostname must return the same IP. UA strings alone are spoofable; FCrDNS is not.
.yacy.net.yacybot.net
Sample User-Agent strings
yacybot (/global; amd64 FreeBSD 10.3-RELEASE; java 1.8.0_77; GMT/en) http://yacy.net/bot.html
yacybot (/global; amd64 FreeBSD 10.3-RELEASE-p7; java 1.7.0_95; GMT/en) http://yacy.net/bot.html
yacybot (-global; amd64 FreeBSD 9.2-RELEASE-p10; java 1.7.0_65; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 2.6.32-042stab093.4; java 1.7.0_65; Etc/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 2.6.32-042stab094.8; java 1.7.0_79; America/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 2.6.32-042stab108.8; java 1.7.0_91; America/en) http://yacy.net/bot.html
yacybot (-global; amd64 Linux 2.6.32-042stab111.11; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
yacybot (-global; amd64 Linux 2.6.32-042stab116.1; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 2.6.32-573.3.1.el6.x86_64; java 1.7.0_85; Europe/en) http://yacy.net/bot.html
yacybot (-global; amd64 Linux 3.10.0-229.4.2.el7.x86_64; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
yacybot (-global; amd64 Linux 3.10.0-229.4.2.el7.x86_64; java 1.8.0_45; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.10.0-229.7.2.el7.x86_64; java 1.8.0_45; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.10.0-327.22.2.el7.x86_64; java 1.7.0_101; Etc/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.11.10-21-desktop; java 1.7.0_51; America/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.12.1; java 1.7.0_65; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.13.0-042stab093.4; java 1.7.0_79; Europe/de) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.13.0-042stab093.4; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.13.0-45-generic; java 1.7.0_75; Europe/en) http://yacy.net/bot.html
yacybot (-global; amd64 Linux 3.13.0-61-generic; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
yacybot (/global; amd64 Linux 3.13.0-74-generic; java 1.7.0_91; Europe/en) http://yacy.net/bot.html
+ 27 more samples in crawlers.json
Block this crawler
robots.txt — disallow yacybot:
User-agent: yacybot
Disallow: /
Apache .htaccess — return 403:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} yacybot [NC]
RewriteRule .* - [F,L]
Nginx — return 403 inside a server block:
if ($http_user_agent ~* "yacybot") {
return 403;
}
← back to all crawlers