SpiderLing


Linguistic research crawler building language corpora

About this crawler

SpiderLing is a web crawler identified by matching the regular-expression pattern SpiderLing against the User-Agent request header. It is categorised as academic. Use this pattern to detect, log, allow, or block SpiderLing traffic in your web server or CDN edge rules. In robots.txt, use SpiderLing as a plain user-agent token, since robots.txt matches crawler names rather than regular expressions.

Block-rate · top 25k sites

0.064%
latest snapshot
2026-05-29
matched key: SpiderLing

Technical details

Name: SpiderLing
Pattern: SpiderLing
Tags: academic
Reference: https://nlp.fi.muni.cz/projects/biwec/
Added: 2026/04/17
Instances: 1 known sample

Sample User-Agent strings

Mozilla/5.0 (compatible; SpiderLing (a SPIDER for LINGustic research); +http://nlp.fi.muni.cz/projects/biwec/)
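
As a minimal sketch of how the pattern could be applied in application code, the Python snippet below tests a User-Agent value against the SpiderLing pattern. The function name is_spiderling is hypothetical, and the case-insensitive flag is an assumption chosen to mirror the [NC] and ~* flags used in the server rules on this page.

```python
import re

# Pattern from this page; IGNORECASE is an assumption that mirrors
# the case-insensitive matching used in the Apache and Nginx rules.
SPIDERLING_RE = re.compile(r"SpiderLing", re.IGNORECASE)


def is_spiderling(user_agent):
    """Return True if the User-Agent value matches the SpiderLing pattern."""
    # Treat a missing header (None) the same as an empty string.
    return bool(SPIDERLING_RE.search(user_agent or ""))


# Sample User-Agent string taken verbatim from this page.
sample = (
    "Mozilla/5.0 (compatible; SpiderLing (a SPIDER for LINGustic research); "
    "+http://nlp.fi.muni.cz/projects/biwec/)"
)

if __name__ == "__main__":
    print(is_spiderling(sample))
    print(is_spiderling("Mozilla/5.0 (X11; Linux x86_64)"))
```

The same check can feed a log filter or a metrics counter before any decision to block is made.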

Block this crawler

robots.txt — disallow SpiderLing:

User-agent: SpiderLing
Disallow: /
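
One way to sanity-check the rule before deploying it is to feed it to Python's standard urllib.robotparser module. This is only a sketch: the example.com URL and the agent name SomeOtherBot are placeholders, and the standard-library parser may not behave exactly as SpiderLing's own robots.txt handling does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule from this page, one directive per line.
rules = [
    "User-agent: SpiderLing",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# Placeholder URL, used only for illustration.
url = "https://example.com/some/page"

spiderling_allowed = parser.can_fetch("SpiderLing", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)  # hypothetical agent name

if __name__ == "__main__":
    print("SpiderLing allowed:", spiderling_allowed)
    print("SomeOtherBot allowed:", other_allowed)
```

Because the rule names only SpiderLing, other crawlers are unaffected by it. Note that robots.txt is advisory and relies on the crawler choosing to honour it.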

Apache .htaccess — return 403:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} SpiderLing [NC]
RewriteRule .* - [F,L]

Nginx — return 403 inside a server block:

if ($http_user_agent ~* "SpiderLing") {
    return 403;
}
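
Where the web server configuration cannot be changed, a similar 403 response could be produced at the application layer. The sketch below assumes a Python WSGI application; the class name BlockSpiderLing and the demo_app function are hypothetical and serve only to illustrate the idea.

```python
import re

# Same pattern as the server rules, matched case-insensitively.
SPIDERLING_RE = re.compile(r"SpiderLing", re.IGNORECASE)


class BlockSpiderLing:
    """WSGI middleware (hypothetical) that answers SpiderLing requests with 403."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the User-Agent header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if SPIDERLING_RE.search(user_agent):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # All other traffic passes through to the wrapped application.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


app = BlockSpiderLing(demo_app)
```

Blocking in the web server or at the CDN edge is usually cheaper, since the request never reaches the application.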