sistrix Crawler
SISTRIX SEO tool web crawler for analysis
About this crawler
sistrix Crawler is a web crawler identified by the regular expression (sistrix|SISTRIX) [cC]rawler, matched against the User-Agent request header. It is categorised as an SEO crawler. Use this pattern to detect, log, allow, or block sistrix Crawler traffic in your web server or CDN edge rules. Note that robots.txt matches on a user-agent name rather than a regular expression.
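As an illustration of the detection step, here is a minimal Python sketch that applies the pattern from this page to a User-Agent string. The function name is_sistrix_crawler is hypothetical and chosen only for this example. The match is case-sensitive, exactly as the pattern is written, so it is stricter than the case-insensitive Apache and Nginx rules further down.

```python
import re

# Pattern copied verbatim from this page; compiled once for reuse.
SISTRIX_UA = re.compile(r"(sistrix|SISTRIX) [cC]rawler")


def is_sistrix_crawler(user_agent):
    """Return True if the User-Agent header matches the sistrix Crawler pattern.

    Hypothetical helper for illustration. A match only shows what the client
    claims to be; it does not prove the request is genuine.
    """
    return SISTRIX_UA.search(user_agent or "") is not None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)"
    print(is_sistrix_crawler(sample))  # True
```

A match like this is suitable for logging or coarse filtering; because the header is client-supplied, it should not be the only check before granting a crawler special treatment.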
Block-rate · top 25k sites
0.065%
Technical details
- Name: sistrix Crawler
- Pattern: (sistrix|SISTRIX) [cC]rawler
- Tags: seo
- Reference: https://www.sistrix.com/tutorials/crawling-errors-in-the-optimizer/
- Added: 2011/08/02
- rDNS suffixes: .sistrix.com, .sistrix.net
- Instances: 1 known sample(s)
rDNS verification (FCrDNS)
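Verify that a request genuinely comes from sistrix Crawler with forward-confirmed reverse DNS (FCrDNS): the client IP's PTR record must end in one of the suffixes below, and a forward A/AAAA lookup of that hostname must return the same IP. User-Agent strings are trivially spoofed. Passing FCrDNS requires control of both the reverse DNS zone for the IP and the forward zone for the hostname, so it is much harder to fake.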
.sistrix.com
.sistrix.net
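The FCrDNS check described above can be sketched in Python with the standard library's socket module. This is a minimal sketch, not a hardened implementation: the function names verify_fcrdns and hostname_matches are hypothetical, the lookups are blocking, and results are not cached. The reverse_lookup and forward_lookup parameters are an assumption added so the logic can be exercised without live DNS.

```python
import ipaddress
import socket

# rDNS suffixes listed on this page.
RDNS_SUFFIXES = (".sistrix.com", ".sistrix.net")


def hostname_matches(hostname, suffixes=RDNS_SUFFIXES):
    """True if the PTR hostname ends in one of the allowed suffixes."""
    host = hostname.rstrip(".").lower()
    return host.endswith(tuple(suffixes))


def verify_fcrdns(ip, suffixes=RDNS_SUFFIXES,
                  reverse_lookup=socket.gethostbyaddr,
                  forward_lookup=socket.getaddrinfo):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    1. Reverse-resolve the client IP to a hostname (PTR).
    2. Require the hostname to end in an allowed suffix.
    3. Forward-resolve the hostname (A/AAAA) and require the original IP
       to be among the results.
    """
    try:
        client = ipaddress.ip_address(ip)
        hostname, _aliases, _addrs = reverse_lookup(ip)
    except (ValueError, OSError):
        return False  # malformed IP, or no PTR record

    if not hostname_matches(hostname, suffixes):
        return False

    try:
        infos = forward_lookup(hostname, None)
    except OSError:
        return False  # hostname does not resolve

    for info in infos:
        # info[4] is the sockaddr tuple; its first item is the address string.
        address = info[4][0].split("%", 1)[0]  # drop any IPv6 scope id
        try:
            if ipaddress.ip_address(address) == client:
                return True
        except ValueError:
            continue
    return False
```

Calling verify_fcrdns(client_ip) with the defaults performs real DNS queries, so in a request path it is usually worth caching the result per IP and running the check only for requests whose User-Agent already matches the pattern.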
Sample User-Agent strings
Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
Block this crawler
robots.txt — disallow sistrix Crawler:
User-agent: sistrix Crawler
Disallow: /
Apache .htaccess — return 403:
RewriteEngine On
# The pattern contains a space, so it must be quoted to be read as one argument
RewriteCond %{HTTP_USER_AGENT} "(sistrix|SISTRIX) [cC]rawler" [NC]
RewriteRule .* - [F,L]
Nginx — return 403 inside a server block:
if ($http_user_agent ~* "(sistrix|SISTRIX) [cC]rawler") {
    return 403;
}