Merge searchable PDFs

Maurice van Peursem
I recently bought the Plustek OpticSlim 1180 A3 scanner. Its Mac driver
is a bit clunky, but it works. It can produce 'searchable PDFs': you
still see the scanned picture in the PDF viewer, but you can select
(parts of) the text, and you can search your scans with Spotlight. The
OCR works quite well; I was pleasantly surprised. There are a few
errors, but that is not really a problem, since you're reading the scan
and not the result of the OCR, and you can at least find text in your
scans!

However, the Mac software produces one PDF file per page. It should
be possible to scan more than one page into one PDF file, but that
doesn't work. So I'm left with a number of single-page PDF files
that I want to merge into one PDF file. I've tried a few programs
that are supposed to be able to do that (Preview, PDFSuite, PDF
Reader), but they all apparently use the same method, which
destroys the text layer of the PDF. The scans themselves still look
normal, but if you try to select text you get garbage that looks
like random Unicode characters.

Has anyone here had that problem, and found a solution?

Maurice
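
One free route that might be worth trying is a short script. The sketch below is a minimal example and has not been tested with this scanner's output. It assumes the third-party Python library pypdf is installed (pip install pypdf) and that the per-page files have numbered names such as page1.pdf, page2.pdf (hypothetical names). pypdf copies page objects as they are instead of re-rendering them, so the OCR text layer should normally survive, but check that on a couple of pages first.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that puts 'page2.pdf' before 'page10.pdf'."""
    # re.split with a capturing group alternates text, digits, text, ...
    parts = re.split(r"(\d+)", Path(path).name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def merge_pdfs(page_files, output_file):
    """Append every page of each input PDF to one output PDF, in natural order."""
    # Assumption: pypdf (third-party) is installed. It is imported here so the
    # sorting helper above can be used without it.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for page_file in sorted(page_files, key=natural_key):
        writer.append(str(page_file))  # copies the pages without re-rendering
    with open(output_file, "wb") as out:
        writer.write(out)
    writer.close()
```

Usage would be something like merge_pdfs(Path("scans").glob("*.pdf"), "merged.pdf"), where "scans" is a placeholder for the folder the scanner software saves into. Afterwards, try selecting text in merged.pdf to confirm the text layer is intact.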


____________TidBITS Talk Participation Guidelines____________
Post only when you have something substantive to contribute.
Be polite and constructive, and comment on posts, not people.
Quote sparingly, if at all. We all read the previous message.
Start threads with a new message to [hidden email].
Read archives at: http://tidbits.com/pipermail/tidbits-talk/
Unsubscribe at: http://tidbits.com/mailman/options/tidbits-talk
____Mailing List Manners: http://tidbits.com/series/1141 ____

Re: Merge searchable PDFs

"John Turner the Bear😎"
I have had a lot of success with PDF Expert by Readdle, from the App Store.

Bear😎

Re: Merge searchable PDFs

Maurice van Peursem
Yes, PDF Expert worked, but paying $60 for a program just to
merge a few PDF files is too expensive for me. Thanks for
the tip, though.

Maurice

Re: Merge searchable PDFs

John Burt
In reply to this post by Maurice van Peursem
The Mac app Image Capture in 10.9.5 has a checkbox labeled "Combine Into Single Document." I used it once with a Canon scanner to combine a many-page newsletter into a single PDF document, but I didn't try searching the result.

John

On Sat, Jul 1, 2017 at 5:45 PM, Maurice van Peursem <[hidden email]> wrote:

Has anyone here had that problem, and found a solution?

Maurice


Re: Merge searchable PDFs

Rodney
In reply to this post by Maurice van Peursem
On Jul 2, 2017, at 02:45, Maurice van Peursem <[hidden email]> wrote:

Has anyone here had that problem, and found a solution?

Just a thought: can you scan without the OCR? If so, perhaps you could do the merge using Preview or whatever, then OCR the resulting multi-page PDF.



Re: Merge searchable PDFs

Maurice van Peursem
>On Jul 2, 2017, at 02:45, Maurice van Peursem <[hidden email]> wrote:
>
>>Has anyone here had that problem, and found a solution?
>
>Just a thought, can you scan without the OCR? If so, perhaps you could do the merge using Preview, whatever, then OCR the resulting multi-page PDF.

Unfortunately, the OCR only works as part of a scan, undoubtedly to make sure that anyone using the software has bought the scanner (the software can be downloaded for free from their website).

Maurice



Re: Merge searchable PDFs

"John Turner the Bear😎"
There is a Mac app that runs OCR on PDFs and converts them to Word: PDF to Word OCR.app 5.1.2. I use it all the time.

Bear😎

Re: Merge searchable PDFs

Rodney
In reply to this post by Maurice van Peursem

On Jul 2, 2017, at 14:16, Maurice van Peursem <[hidden email]> wrote:

Unfortunately, the OCR only works with a scan, undoubtedly to make sure you have bought the scanner when using the software (the software can be downloaded freely from their website).

That scanner isn’t the only source of OCR software. I think that there are web sites that’ll do it for free (assuming the content isn’t confidential).

If you’d only tried one app, and it was unable to preserve the OCR, then I might be inclined to blame the app. However, since you’ve had the same problem with multiple apps, my suspicion is that the problem lies in the way the scanner software is embedding the OCR text, not in the apps you’ve used to stitch the PDFs together. If that’s the case, then the only way you’re going to do this is by doing the OCR separately after you’ve combined the PDFs.
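
Doing the OCR separately after combining could look roughly like the sketch below. It assumes the free command-line tool OCRmyPDF is installed, which nobody in this thread has mentioned or tested, and the file names are placeholders. Its --force-ocr option redoes the OCR on pages that already carry a text layer, which would likely be needed if the merged file came from the scanner's searchable-PDF mode.

```python
import subprocess


def ocr_command(input_pdf, output_pdf, language="eng", force=False):
    """Build an OCRmyPDF command line that adds a text layer to a PDF."""
    cmd = ["ocrmypdf", "-l", language]
    if force:
        # Rasterise each page and OCR it again, replacing any existing text layer.
        cmd.append("--force-ocr")
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


def run_ocr(input_pdf, output_pdf, **options):
    """Run OCRmyPDF; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(ocr_command(input_pdf, output_pdf, **options), check=True)
```

For example, run_ocr("merged.pdf", "searchable.pdf", force=True) would write a new file and leave merged.pdf untouched, so nothing is lost if the result is no better.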


