Library Of Congress Web Archiving
US Library of Congress web preservation archiving crawler
About this crawler
Library Of Congress Web Archiving is a web crawler identified by matching the regular-expression pattern "Library Of Congress Web Archiving" against the User-Agent request header. It is categorised as an archiver. Use this pattern to detect, log, allow, or block the crawler's traffic in your web server, CDN edge rules, or robots.txt.
Block-rate · top 25k sites
0.065%
Technical details
- Name: Library Of Congress Web Archiving
- Pattern: Library Of Congress Web Archiving
- Tags: archiver
- Reference: https://www.loc.gov/programs/web-archiving/about-this-program/
- Added: 2026/04/26
- Instances: 1 known sample(s)
Sample User-Agent strings
Library Of Congress Web Archiving
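As a rough illustration of how the pattern can be applied outside a web server, the sketch below checks a User-Agent string in Python, for example when filtering access logs. The names LOC_PATTERN and is_loc_archiver are hypothetical and not part of any published tool. The case-insensitive match is an assumption chosen to mirror the [NC] and ~* flags in the server rules on this page.

```python
import re

# Pattern copied from the "Pattern" field. re.escape treats it as a literal
# string; IGNORECASE is an assumption that mirrors the [NC] / ~* server flags.
LOC_PATTERN = re.compile(
    re.escape("Library Of Congress Web Archiving"), re.IGNORECASE
)


def is_loc_archiver(user_agent):
    """Return True if the User-Agent value matches the crawler pattern.

    A missing (None) or empty header is treated as "no match".
    """
    return bool(user_agent) and LOC_PATTERN.search(user_agent) is not None
```

Calling is_loc_archiver with the sample string above returns True, while an ordinary browser User-Agent returns False.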
Block this crawler
robots.txt — disallow Library Of Congress Web Archiving:
User-agent: Library Of Congress Web Archiving
Disallow: /
Apache .htaccess — return 403:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "Library Of Congress Web Archiving" [NC]
RewriteRule .* - [F,L]
Nginx — return 403 inside a server block:
if ($http_user_agent ~* "Library Of Congress Web Archiving") {
    return 403;
}
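If the block has to happen in application code rather than in Apache or Nginx, the same check can be sketched as a small WSGI middleware in Python. This is a minimal sketch under stated assumptions: block_loc_archiver and LOC_PATTERN are hypothetical names, the application is assumed to be a standard WSGI callable, and the response body is arbitrary.

```python
import re

# Same pattern as the server rules, matched case-insensitively (assumption).
LOC_PATTERN = re.compile(r"Library Of Congress Web Archiving", re.IGNORECASE)


def block_loc_archiver(app):
    """Wrap a WSGI app so requests from the matching User-Agent get a 403."""

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if LOC_PATTERN.search(user_agent):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Any other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing application is then a single assignment, such as app = block_loc_archiver(app), where app is whatever WSGI callable the site already serves.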