Releases: bottomless-archive-project/document-location-database

2021 - July/August

18 Oct 10:35
b8362c0

This release is a collection of 240 million URLs with the following file extensions:
pdf, doc, docx, ppt, pptx, xls, xlsx, rtf, mobi, epub
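
For readers who want to check or re-filter URLs against this extension list, here is a minimal sketch. It is not the project's actual extraction code; the function name `has_listed_extension` and the choice to match case-insensitively on the URL path (ignoring the query string) are assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Extensions named in this release.
EXTENSIONS = {"pdf", "doc", "docx", "ppt", "pptx", "xls", "xlsx", "rtf", "mobi", "epub"}


def has_listed_extension(url: str) -> bool:
    """Return True if the URL's path ends in one of the listed extensions.

    Assumption: matching is case-insensitive and looks only at the path,
    so query strings and fragments are ignored.
    """
    path = urlparse(url).path
    _, ext = posixpath.splitext(path)
    # splitext returns the extension with its leading dot, e.g. ".pdf".
    return ext[1:].lower() in EXTENSIONS


if __name__ == "__main__":
    for candidate in (
        "https://example.com/files/report.PDF?download=1",
        "https://example.com/index.html",
    ):
        print(candidate, has_listed_extension(candidate))
```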

The URL list was acquired by crawling Common Crawl's July and August 2021 datasets.

To merge and uncompress the files, use 7-Zip.