Library Of Congress Web Archiving


Web preservation and archiving crawler of the US Library of Congress

About this crawler

Library Of Congress Web Archiving is a web crawler, categorised as an archiver. It is identified by matching the pattern Library Of Congress Web Archiving against the User-Agent request header. Use the pattern listed under Technical details to detect, log, allow, or block its traffic in your web server or CDN edge rules. In robots.txt, which does not take regular expressions for user agents, refer to the crawler by the same name.

Block rate (top 25k sites): 0.065%
Latest snapshot: 2026-06-04
Matched key: Embed PHP Library

Technical details

Name: Library Of Congress Web Archiving
Pattern: Library Of Congress Web Archiving
Tags: archiver
Reference: https://www.loc.gov/programs/web-archiving/about-this-program/
Added: 2026-04-26
Instances: 1 known sample(s)

Sample User-Agent strings

Library Of Congress Web Archiving
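For detection or logging in application code, the pattern can be checked against the User-Agent header directly. The following is a minimal sketch in Python, not an official detection method: the function name `is_loc_archiver` is hypothetical, and the case-insensitive match is an assumption that mirrors the `[NC]` and `~*` flags used in the server rules on this page.

```python
import re

# Pattern copied from the "Technical details" entry on this page.
# re.escape keeps it literal; IGNORECASE is an assumption mirroring
# the case-insensitive server rules ([NC] in Apache, ~* in Nginx).
LOC_PATTERN = re.compile(
    re.escape("Library Of Congress Web Archiving"), re.IGNORECASE
)


def is_loc_archiver(user_agent):
    """Return True when the User-Agent value contains the pattern.

    Hypothetical helper: a missing or empty header counts as no match.
    """
    return bool(user_agent) and LOC_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    for ua in ("Library Of Congress Web Archiving", "Mozilla/5.0", None):
        print(repr(ua), is_loc_archiver(ua))
```

A substring search is used rather than a full match, so the check still succeeds if the crawler name appears inside a longer User-Agent string.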

Block this crawler

robots.txt — disallow Library Of Congress Web Archiving:

User-agent: Library Of Congress Web Archiving
Disallow: /

Apache .htaccess — return 403:

RewriteEngine On
# The pattern contains spaces, so it must be quoted to be read as one argument.
RewriteCond %{HTTP_USER_AGENT} "Library Of Congress Web Archiving" [NC]
RewriteRule .* - [F,L]

Nginx — return 403 inside a server block:

if ($http_user_agent ~* "Library Of Congress Web Archiving") {
    return 403;
}
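Where the block has to live in the application rather than in the web server, the same rule can be approximated with a small WSGI middleware. This is only a sketch under stated assumptions: `block_loc_archiver` is a hypothetical name, the app is assumed to be served through a standard WSGI server, and the case-insensitive match mirrors the Apache and Nginx rules above.

```python
import re

# Same pattern as the server rules; IGNORECASE mirrors [NC] and ~*.
LOC_PATTERN = re.compile(r"Library Of Congress Web Archiving", re.IGNORECASE)


def block_loc_archiver(app):
    """Wrap a WSGI app and answer 403 when the User-Agent matches.

    Hypothetical helper for illustration; all other requests are
    passed through to the wrapped app unchanged.
    """
    def middleware(environ, start_response):
        # WSGI exposes the User-Agent header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if LOC_PATTERN.search(user_agent):
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)

    return middleware
```

Usage would be a single wrapping call such as `application = block_loc_archiver(application)`. Blocking at the web server or CDN is usually cheaper, since the request never reaches the application.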
