A script to bulk download files from Wikimedia Commons.
Download the script directly: https://git.sr.ht/~nytpu/commons-downloader/blob/master/commons-downloader
Or clone the repo:
git clone https://git.sr.ht/~nytpu/commons-downloader
Then run the script from where it is, or symlink it into a directory on your PATH.
Run commons-downloader -h at any time for an overview.
The main options are -c and -s. -c will download all matches in a category, and -s will download all matches for a search. They can be combined: the downloaded files are deduplicated, so overlap between the two result sets is not an issue.
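The deduplication mentioned above can be pictured as an order-preserving merge of the category results and the search results. The following is a minimal Python sketch of that idea, not the script's actual implementation; the function name merge_results and the example URLs are hypothetical.

```python
def merge_results(category_urls, search_urls):
    """Merge two URL lists, dropping duplicates while keeping first-seen order."""
    seen = set()
    merged = []
    for url in list(category_urls) + list(search_urls):
        if url not in seen:
            seen.add(url)
            merged.append(url)
    return merged


if __name__ == "__main__":
    # Hypothetical URLs; "b.jpg" appears in both result sets.
    category = ["https://example.org/a.jpg", "https://example.org/b.jpg"]
    search = ["https://example.org/b.jpg", "https://example.org/c.jpg"]
    for url in merge_results(category, search):
        print(url)
```

A file that shows up in both the category and the search is kept only once, which is why combining -c and -s does not cause duplicate downloads.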
-r <URL list file> will resume a download given a list of URLs, and is mutually exclusive with -c and -s. At least one of -c, -s, or -r must be passed.
-q <add'l query> flags can be added when using -s to add additional queries to a search. -q has no effect if -s is not also passed.
commons-downloader -s -q Q173651 -q "African Wild Dog" Lycaon pictus
is equivalent to the search
"Lycaon pictus" OR "Q173651" OR "African Wild Dog"
-o <out directory> will download all files to the given directory, creating it if necessary. The current directory is the default if -o is not passed.
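To illustrate how the main argument and any -q flags combine into the single search shown in the Lycaon pictus example, here is a minimal Python sketch. It only reproduces the documented query string; build_search_query is a hypothetical helper, and the script itself may assemble its query differently.

```python
def build_search_query(main_query, extra_queries=()):
    """Quote each term and join the terms with OR, main query first."""
    terms = [main_query, *extra_queries]
    return " OR ".join(f'"{term}"' for term in terms)


if __name__ == "__main__":
    # Mirrors: commons-downloader -s -q Q173651 -q "African Wild Dog" Lycaon pictus
    print(build_search_query("Lycaon pictus", ["Q173651", "African Wild Dog"]))
```

Each term is quoted as a whole, so multi-word queries such as "African Wild Dog" are treated as one phrase.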
The mandatory argument is a category. If only -s is passed it can be an arbitrary search query, but if -c is passed then it must be an official category. A category can be verified by visiting https://commons.wikimedia.org/wiki/Category:<category_name>. You can often find a new category by going to the bottom of a Wikipedia page and looking for a box that says:
Wikimedia Commons has media related to: <article name> (category)
You can then click the (category) link to find the Wikimedia Commons category.
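If you want to check a category name before passing it to -c, the verification URL described above can be built programmatically. This is a small Python sketch under the assumption that spaces in the category name map to underscores, as is usual for MediaWiki page titles; category_url is a hypothetical helper and is not part of the script.

```python
from urllib.parse import quote


def category_url(category_name):
    """Return the Wikimedia Commons page URL for a category name."""
    # Assumption: MediaWiki page titles use underscores in place of spaces.
    title = category_name.strip().replace(" ", "_")
    # Percent-encode anything that is not safe in a URL path.
    return "https://commons.wikimedia.org/wiki/Category:" + quote(title)


if __name__ == "__main__":
    print(category_url("Panthera uncia"))
```

Opening the printed URL in a browser shows whether the category exists.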
Download all files in the Panthera uncia category and all results for the search
"Panthera uncia" OR "Q30197" OR "snow leopard" OR "Uncia uncia"
into the snep/ subdirectory of the current folder:
commons-downloader -cs -o snep -q Q30197 -q "snow leopard" -q "Uncia uncia" Panthera uncia
If the download in the previous command was interrupted, it could be resumed with:
commons-downloader -o snep -r snep/_URLS.txt
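One way to picture resuming from a URL list is to skip every URL whose file already exists in the output directory. The Python sketch below shows that idea only. It assumes the list holds one URL per line and that each file is saved under the last path segment of its URL; neither assumption is confirmed by this README, and pending_urls is a hypothetical helper rather than the script's own logic.

```python
import os
from urllib.parse import unquote, urlsplit


def pending_urls(url_lines, out_dir):
    """Return the URLs whose target file is not yet present in out_dir."""
    pending = []
    for line in url_lines:
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        # Assumption: the local filename is the URL's last path segment.
        filename = unquote(os.path.basename(urlsplit(url).path))
        if not os.path.exists(os.path.join(out_dir, filename)):
            pending.append(url)
    return pending


if __name__ == "__main__":
    # Hypothetical URL list; with an empty or missing "snep" directory
    # both URLs are reported as still pending.
    urls = ["https://example.org/a.jpg", "https://example.org/b.jpg"]
    for url in pending_urls(urls, "snep"):
        print(url)
```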
The upstream URL of this project is https://git.sr.ht/~nytpu/commons-downloader. Send suggestions, bugs, patches, and other contributions to ~firstname.lastname@example.org. For help sending a patch through email, see https://git-send-email.io. You can browse the list archives at https://lists.sr.ht/~nytpu/public-inbox.
Written in 2021 by nytpu <alex [at] nytpu.com>
To the extent possible under law, the author(s) have dedicated all copyright and related and neighboring rights to this software to the public domain worldwide. This software is distributed without any warranty.