Content-triggered make rules

code
Makefile
hacks
Trigger a rule when a file hash changes, not just modification time
Author

Chris Markiewicz

Published

26 February 2023

While working on a project that retrieves content from an API, it occurred to me that it would be preferable to make a small request first, one that indicates whether the larger request needs to be made at all.

The basic pattern turns out to be as follows:

# Cheap fetch that will indicate whether the main target will have changed
file.stamp: FORCE
	curl $(URL1) > file.stamp

# Main target, could trigger additional work locally or remotely
file: file.stamp.md5
	curl $(URL2) > file

# Generate an .md5 file, but only write if the hash changes
%.md5: %
	$(if $(filter-out $(shell cat $@ 2>/dev/null),$(shell md5sum $*)), md5sum $* > $@)

# No prerequisites or recipe: anything that depends on FORCE is always rebuilt
FORCE:

Though I’m using API calls for an example, nothing in this is really specific to URL fetches.

Here our main target, file, is expensive to generate, or updating it might cascade through the entire build. We only need to rerun the file rule when the contents of file.stamp change, but we have to do some work (the cheap fetch) to find out whether they have. Therefore file.stamp depends on FORCE, a target with no prerequisites or recipe that behaves like a phony target, which ensures the file.stamp recipe always runs. The third rule then produces file.stamp.md5, but writes it only if the checksum has changed, so its modification time advances only when the content does. Because file depends on file.stamp.md5 rather than file.stamp, the expensive recipe is skipped whenever the content is unchanged.
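The write-only-on-change step performed by the %.md5 rule can be sketched in plain shell, which may make the make-function version easier to read. This is a minimal sketch, not the rule itself: the function name update_md5 and the file demo.stamp are hypothetical, and it compares the whole md5sum output line rather than filtering words as $(filter-out) does.

```shell
#!/bin/sh
# update_md5 FILE: refresh FILE.md5 only when FILE's checksum differs from
# the one already recorded. Returns 0 if FILE.md5 was written, 1 if left alone.
# (Hypothetical helper, named here for illustration only.)
update_md5() {
    new=$(md5sum "$1")                  # "<hash>  <path>"
    old=$(cat "$1.md5" 2>/dev/null)     # empty if no checksum recorded yet
    if [ "$new" != "$old" ]; then
        printf '%s\n' "$new" > "$1.md5"
        return 0
    fi
    return 1
}

# Demonstration in a scratch directory.
scratch=$(mktemp -d)
echo 'v1' > "$scratch/demo.stamp"
update_md5 "$scratch/demo.stamp" && echo "first run: written"
echo 'v1' > "$scratch/demo.stamp"       # rewritten, same content
update_md5 "$scratch/demo.stamp" || echo "second run: unchanged"
echo 'v2' > "$scratch/demo.stamp"       # content actually changes
update_md5 "$scratch/demo.stamp" && echo "third run: written"
```

On the second call the stamp file has a newer modification time but the same content, so the .md5 file is not touched. That is the property the main target relies on.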

To make it concrete, my specific use case is retrieving Zenodo records where my ORCID is listed as a creator. The Zenodo API has a curious bug where asking for more records than match the query simply retrieves additional records, so I first need to run a short query to find out how many there are. While I am at it, I can check the latest modification time.

Here is a simplified version of my actual Makefile:

ORCID=0000-0002-6533-164X
BASEQUERY="https://zenodo.org/api/records/?q=creators.orcid:$(ORCID)"

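# Full download; the record count is the second element of the array in zenodo.stamp
zenodo.bib: zenodo.stamp.md5
	$(eval SIZE=$(shell jq .[1] zenodo.stamp ))
	curl -sSL -H 'Accept: application/x-bibtex' "$(BASEQUERY)&size=$(SIZE)" -o zenodo.bib

# Cheap query: latest update time and total number of hits
zenodo.stamp: FORCE
	curl -sSL "$(BASEQUERY)&size=1&sort=-publication_date" | \
	jq "[.hits.hits[0].updated, .hits.total]" > zenodo.stamp

# Generate an .md5 file, but only write if the hash changes
%.md5: %
	$(if $(filter-out $(shell cat $@ 2>/dev/null),$(shell md5sum $*)), md5sum $* > $@)

FORCE: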

Here it is working:

$ make zenodo.bib
curl -sSL ""https://zenodo.org/api/records/?q=creators.orcid:0000-0002-6533-164X"&size=1&sort=-publication_date" | \
jq "[.hits.hits[0].updated, .hits.total]" > zenodo.stamp
md5sum zenodo.stamp > zenodo.stamp.md5
curl -sSL -H 'Accept: application/x-bibtex' ""https://zenodo.org/api/records/?q=creators.orcid:0000-0002-6533-164X"&size=25" -o zenodo.bib
$ cat zenodo.stamp
[
  "2023-02-19T02:26:51.701065+00:00",
  25
]
$ cat zenodo.stamp.md5 
b851094530e336c4df14d0807a28c596  zenodo.stamp

Re-running:

$ make zenodo.bib     
curl -sSL ""https://zenodo.org/api/records/?q=creators.orcid:0000-0002-6533-164X"&size=1&sort=-publication_date" | \
jq "[.hits.hits[0].updated, .hits.total]" > zenodo.stamp
$ ls -l zenodo.stamp*
-rw-rw-r-- 1 chris chris 47 Feb 26 22:00 zenodo.stamp
-rw-rw-r-- 1 chris chris 47 Feb 26 21:59 zenodo.stamp.md5

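Note that the zenodo.stamp file is newer than zenodo.stamp.md5: the stamp was fetched again, but its checksum had not changed, so the .md5 file was left alone and zenodo.bib was not downloaded a second time.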
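To see why those timestamps lead to no second download, here is a rough shell model of the comparison make performs: a target is rebuilt when it is missing or older than a prerequisite. It is a simplified sketch, not make's actual algorithm. The function name is_out_of_date, the demo.* file names, and the timestamp given to demo.bib are assumptions for illustration.

```shell
#!/bin/sh
# is_out_of_date TARGET PREREQ: succeed if TARGET is missing or older than PREREQ.
# (Hypothetical helper; a simplified model of make's timestamp check.)
is_out_of_date() {
    [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

scratch=$(mktemp -d)
# Recreate the ordering from the listing with explicit timestamps.
# The time assigned to demo.bib is assumed; the listing does not show it.
touch -t 202302262159.00 "$scratch/demo.stamp.md5"
touch -t 202302262159.30 "$scratch/demo.bib"
touch -t 202302262200.00 "$scratch/demo.stamp"

if is_out_of_date "$scratch/demo.stamp.md5" "$scratch/demo.stamp"; then
    echo "md5 rule is considered (and writes nothing if the hash matches)"
fi
if is_out_of_date "$scratch/demo.bib" "$scratch/demo.stamp.md5"; then
    echo "demo.bib would be downloaded again"
else
    echo "demo.bib is up to date"
fi
```

The stamp being newer than its .md5 file only means the checksum rule is evaluated on every run. The expensive target compares itself against the .md5 file, whose timestamp has not moved.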
