Library Of Congress Web Archiving


US Library of Congress web preservation archiving crawler

About this crawler

Library Of Congress Web Archiving is a web crawler identified by the regular-expression pattern "Library Of Congress Web Archiving" in the User-Agent request header. It is categorised as an archiver. Use this pattern to detect, log, allow, or block its traffic in your web server or CDN edge rules. In robots.txt, use the name as the User-agent value instead, since that field is matched as a name rather than as a regular expression.

Block-rate · top 25k sites

0.065% (latest snapshot: 2026-06-04)
matched key: Embed PHP Library
[Chart: block rate from 2026-05-01 to 2026-06-04; axis mark at 0.11%]

Technical details

Name: Library Of Congress Web Archiving
Pattern: Library Of Congress Web Archiving
Tags: archiver
Reference: https://www.loc.gov/programs/web-archiving/about-this-program/
Added: 2026/04/26
Instances: 1 known sample

Sample User-Agent strings

Library Of Congress Web Archiving
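
As a rough illustration of how the pattern above might be applied in application code or a log-processing script, here is a minimal Python sketch. The function name is_loc_archiver is hypothetical, and case-insensitive matching is an assumption chosen to mirror the [NC] and ~* flags in the server rules on this page.

```python
import re

# Pattern copied from the "Pattern" field of this entry.
# Case-insensitive matching is an assumption, mirroring the server rules.
LOC_ARCHIVER = re.compile(r"Library Of Congress Web Archiving", re.IGNORECASE)


def is_loc_archiver(user_agent):
    """Return True if the User-Agent header value matches the crawler pattern.

    A missing or empty header is treated as "no match".
    """
    return bool(user_agent) and LOC_ARCHIVER.search(user_agent) is not None


if __name__ == "__main__":
    for ua in ("Library Of Congress Web Archiving", "Mozilla/5.0"):
        print(ua, "->", is_loc_archiver(ua))
```

Because re.search looks for the pattern anywhere in the string, a longer User-Agent that merely contains the name would also match.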

Block this crawler

robots.txt — disallow Library Of Congress Web Archiving:

User-agent: Library Of Congress Web Archiving
Disallow: /

Apache .htaccess — return 403:

RewriteEngine On
# The pattern contains spaces, so it must be quoted to be read as one argument.
RewriteCond %{HTTP_USER_AGENT} "Library Of Congress Web Archiving" [NC]
RewriteRule .* - [F,L]

Nginx — return 403 inside a server block:

if ($http_user_agent ~* "Library Of Congress Web Archiving") {
    return 403;
}
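
Where editing the web server configuration is not an option, the same 403 rule could be approximated at the application layer. The following is a minimal sketch for a Python WSGI application, not a recommended or official method; the names block_loc_archiver and app are hypothetical, and the response body is an arbitrary choice.

```python
import re

# Same pattern as the server rules above, matched case-insensitively.
LOC_ARCHIVER = re.compile(r"Library Of Congress Web Archiving", re.IGNORECASE)


def block_loc_archiver(app):
    """Wrap a WSGI application so that matching requests receive a 403."""

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if LOC_ARCHIVER.search(user_agent):
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # All other requests pass through to the wrapped application unchanged.
        return app(environ, start_response)

    return middleware
```

The wrapper would be applied once at startup, for example application = block_loc_archiver(application). Filtering at the server or CDN edge is usually cheaper, since blocked requests never reach the application.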