Spelunking a PDF

I’m using Hazel to rename and file PDFs based on their content. Unfortunately, my CONTENTS CONTAIN MATCH Rules are not firing and I need to step through exactly what Hazel is “seeing” in order to debug them.

How do I view the content of a PDF in the same way that Hazel sees it?

But people here might have some ideas for spelunking through the nether regions of a PDF.

Tough one. Years ago I tried over and over to develop an Automator script to rename PDF files I’d downloaded by title, and it turns out that, unless the title is in the metadata, it can’t be done. How does one know? One can use Get Info and look at ‘Name and Extension’ to see if there’s a title. If there is, you are in luck. If not, I couldn’t find a way. If you find a better way, I can’t wait to use it. Best, Patrick
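For anyone who would rather check the title metadata from a script than from Get Info, here is a rough Python sketch. It assumes the third-party pypdf package is installed (that is my choice, not something Hazel or Automator uses), and the helper names pdf_title and safe_filename are made up for illustration. It only finds a title if the PDF's document info actually contains one, so it has the same limitation described above.

```python
import re


def pdf_title(path):
    """Return the Title stored in the PDF's document info, or None if absent."""
    # Assumption: pypdf is installed (pip install pypdf). Imported lazily so
    # the filename helper below works without it.
    from pypdf import PdfReader

    meta = PdfReader(str(path)).metadata
    title = meta.title if meta else None
    return title.strip() if title and title.strip() else None


def safe_filename(title, max_len=80):
    """Turn a title into a filename: swap path separators, tidy whitespace."""
    cleaned = re.sub(r"[/:\\]", "-", title)         # characters unsafe in names
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse runs of whitespace
    return cleaned[:max_len].rstrip() + ".pdf"
```

If pdf_title returns None for a file, the title simply isn't in the metadata, and you would have to fall back on the text content instead.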

I’ve never used Automator so I can’t comment on how to use it to peer into PDF files and rename or move such a file based on its content.

However, I did a few searches and found that Automator has an “Extract PDF Text” action. Perhaps after extracting the text, you can search it and then rename and move the PDF based on conditions.
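The extract-then-search idea can also be sketched outside Automator. The Python below is only an illustration under a few assumptions: it uses the third-party pypdf package to pull the text layer, the function names and the rule list are hypothetical, and what pypdf extracts may well differ from what Hazel sees. Printing the repr of the text is the useful part for debugging, because it exposes stray line breaks and odd spacing that can stop a match from firing.

```python
import re


def extract_text(path):
    """Return the text layer of every page, joined with newlines."""
    # Assumption: pypdf is installed (pip install pypdf). A scanned PDF with
    # no OCR text layer will come back empty.
    from pypdf import PdfReader

    reader = PdfReader(str(path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def show_raw(text, limit=500):
    """Return the start of the text with newlines and tabs made visible."""
    return repr(text[:limit])


def first_match(text, rules):
    """Return the destination of the first rule whose regex is found in text.

    rules is a list of (pattern, destination) pairs, checked in order.
    """
    for pattern, destination in rules:
        if re.search(pattern, text, re.IGNORECASE):
            return destination
    return None
```

A possible use: run print(show_raw(extract_text("some.pdf"))) on a file that isn't matching, then compare the raw text against the pattern you expected to match. Writing the pattern with \s+ rather than a literal space tolerates a line break in the middle of a phrase.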

Here are links to some web pages that demonstrate uses of this command:

As for Hazel …

Hazel can examine the attributes as well as content of files, including PDFs, and take a variety of actions based on what it finds.

You can find the answer to my question on Hazel’s support board.

