tl;dr: If you’re on Arch, not all hope is lost when trying to deal with PDFs.
pdfunite is out there for combining PDFs, Firefox is surprisingly helpful since it uses
pdftk is there if you’re down with downloading the dependencies,
convert is available for paring down scanned images, and ultimately, any software you can run on Ubuntu can run on Arch with a little
In my case I had to sign some pages of a PDF, then return the whole thing with the signed pages included. After a few seconds of thinking, the obvious answer was to split apart the original PDF, print and sign the pages I needed to sign, scan them, then combine the old PDF (minus the unsigned versions of those pages) with the newly signed pages. Hacky, yes, and if I were on a different platform I might have been able to sign the PDF with my mouse very easily, but let’s not think about that too much.
Firefox is great for this – you can actually use it to open and view the PDF, and then save only certain pages to a new PDF by using the print-to-pdf option.
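If you would rather stay on the command line, a rough alternative (not what I actually did) is pdfseparate, which ships in the same poppler tools as pdfunite. The sketch below assumes pdfseparate is installed; the function name extract_pages and the file names are made up for illustration.

```shell
# extract_pages FIRST LAST INPUT.pdf PREFIX
# Writes one single-page PDF per page in the range:
# PREFIX-FIRST.pdf ... PREFIX-LAST.pdf
extract_pages() {
  first="$1"
  last="$2"
  input="$3"
  prefix="$4"
  # pdfseparate replaces %d in the output pattern with the page number
  pdfseparate -f "$first" -l "$last" "$input" "${prefix}-%d.pdf"
}

# Example (hypothetical file names):
#   extract_pages 3 4 contract.pdf to-sign
```

Unlike print-to-PDF, this produces one file per page, so the pieces need to be joined back together afterwards.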
I came across pdfunite in a relevant Stack Overflow post, and it turns out it’s a super easy command-line tool for putting two PDFs together.
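For the record, pdfunite takes the input files in order, followed by the output file as the last argument. Here is a small sketch of how the pieces could be stitched back together; the wrapper name unite_pdfs and the file names are placeholders, not the ones I used.

```shell
# unite_pdfs OUTPUT.pdf INPUT1.pdf INPUT2.pdf [...]
# Thin wrapper that puts the output name first so it is harder to
# accidentally pass an input file as the destination.
unite_pdfs() {
  output="$1"
  shift
  if [ "$#" -lt 2 ]; then
    echo "usage: unite_pdfs OUTPUT.pdf INPUT1.pdf INPUT2.pdf [...]" >&2
    return 1
  fi
  # pdfunite expects: source files in page order, destination last
  pdfunite "$@" "$output"
}

# Example (hypothetical file names):
#   unite_pdfs final.pdf unsigned-part.pdf signed-pages.pdf
```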
Rotating the pages was necessary because, of course, when scanning the signed pages I scanned them in the wrong way round. Surprisingly, it’s not all that easy to rotate the pages (or a single page) of a PDF… Going down this rabbit hole ultimately led to a program called pdftk that seemed to be especially good. Unfortunately, I wasn’t interested in downloading the large list of dependencies that pdftk would bring into my system.
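However pdftk ends up being available, the rotation itself is a one-liner. The sketch below is an assumption about a typical invocation rather than the exact command I ran: it rotates every page of a document, and it uses the spelled-out direction keywords (north, east, south, west) that newer pdftk releases accept. Older releases used single letters instead, so check the man page of whatever version you have. The name rotate_pdf is made up.

```shell
# rotate_pdf INPUT.pdf OUTPUT.pdf [DIRECTION]
# DIRECTION is one of north, east, south, west (default: east,
# i.e. pages end up rotated 90 degrees clockwise).
rotate_pdf() {
  input="$1"
  output="$2"
  direction="${3:-east}"
  # "1-end" selects every page; the direction is appended to the range
  pdftk "$input" cat "1-end${direction}" output "$output"
}

# Example (hypothetical file names):
#   rotate_pdf scanned.pdf scanned-upright.pdf south
```

To rotate a single page, the same cat syntax takes several ranges, for example 1 2east 3-end to turn only page 2.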
If only there was a way to isolate pdftk when running it… Maybe I could even use a distribution that’s better suited/mentioned in all the guides I see…
A great solution to using pdftk without dirtying my own system too much is to run pdftk in a Docker container!
The first container I found, aultman-pdftk, seemed super usable, but I couldn’t get it to work properly with the entrypoint. This is probably just me not being used to running Docker containers as one-off commands, but I’m sure a smart reader can figure it out.
Since I was in a do-anything-to-make-it-work kind of mood, what I ended up actually doing was:
1. Start an Ubuntu container (docker run -it ubuntu /bin/bash)
2. Install pdftk inside it (apt-get update && apt-get install pdftk)
3. Use docker cp to move the files that I actually wanted to work on into the container and back
NOTE: Again, you should probably just figure out how to use some pdftk container as a command and pipe the input to it or whatever… I just didn’t feel like reading through the Docker docs to figure out what I was doing wrong with the command-line syntax.
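In hindsight, one way to skip the docker cp shuffle is to bind-mount the current directory into a throwaway container and run pdftk there in one go. This is only a sketch of that idea, not what I did at the time: the function name pdftk_docker and the /work mount point are invented, and it assumes the stock ubuntu image can install a pdftk package with apt-get, as it could for me.

```shell
# pdftk_docker ARGS...
# Runs pdftk inside a temporary Ubuntu container, with the current
# directory mounted at /work so input and output files are shared.
pdftk_docker() {
  # The trailing "sh" fills $0 inside the container shell, so the
  # caller's arguments arrive there intact as "$@".
  docker run --rm \
    -v "$PWD":/work -w /work \
    ubuntu \
    sh -c 'apt-get update -qq && apt-get install -y -qq pdftk >/dev/null && pdftk "$@"' \
    sh "$@"
}

# Example (hypothetical file names):
#   pdftk_docker scanned.pdf cat 1-endeast output rotated.pdf
```

This reinstalls pdftk on every call, which is slow, and the output files will most likely be owned by root. Baking the install into a small image of your own would fix the first problem.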
I found that when I finished creating the PDFs they were gigantic. Since my PDFs were basically just scans, I was able to use convert (part of ImageMagick) to pare down their size, after finding a relevant Ask Ubuntu question.
The actual command I used was:
convert -density 200x200 -quality 60 -compress jpeg input.pdf output.pdf
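Roughly speaking, -density sets the resolution (in DPI) at which each page is rasterized, while -quality and -compress jpeg control how hard the resulting page images are squeezed. Since every page becomes an image, this only makes sense for scans. To see whether a given setting actually helped, here is a small sketch that wraps the same command and prints the before and after sizes; the function name shrink_scan is made up.

```shell
# shrink_scan INPUT.pdf OUTPUT.pdf
# Re-encodes a scanned PDF with the settings above and reports
# the file sizes before and after.
shrink_scan() {
  input="$1"
  output="$2"
  convert -density 200x200 -quality 60 -compress jpeg "$input" "$output" || return 1
  # $(( ... )) strips the padding some wc implementations add
  before=$(( $(wc -c < "$input") ))
  after=$(( $(wc -c < "$output") ))
  echo "$input: $before bytes -> $output: $after bytes"
}

# Example (hypothetical file names):
#   shrink_scan signed-full.pdf signed-small.pdf
```

If the text gets too blurry, raising -density or -quality trades size back for legibility.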