So, I just got this email:
Hi Dan,
I was searching for a study guide for a friend of mine who wants to get his Tech class license. I came across the PDF version of your study guide, downloaded it and started going through it.
I don’t know if it is just me (as I am blind and use a screen reader), but I can not believe how many misspellings there are in this document. For example, the word “find” is always missing the letter “i”. Was this done on purpose? It would make me hesitant to recommend the purchase of the physical book.
Just curious.
Thanks,
John
No one has ever mentioned this to me before, so I downloaded a fresh copy from my website. At a glance, all the words look correct, but searching tells a different story. When I searched for “find,” there were three results, all spelled correctly. When I searched for “fnd,” however, there were 14 results, and each of them appears in the text as the complete word “find.” See the screenshot below.
This happens in both the browser’s PDF viewer and the Mac Preview app, so I’m guessing that the problem is with the PDF file, not the programs.
I’m not sure what to make of this, except perhaps that when I created the PDF (I use LibreOffice on a Mac), some words got mangled in the export process, so that the fellow’s screen reader and the PDF viewers’ search functions aren’t seeing some characters even though they are visible on the page. I don’t know what screen reader he’s using, but I’ve sent a reply asking for that information. I’ve also asked what other words he’s seeing as misspelled.
Have any of you ever seen something like this? Can you think of anything that I might do to fix my PDF files?
Chris KD8VHL says
I can’t speak to what to do to fix this, but a few notes:
– In a PDF file, the image that’s displayed and the text content accessible to, e.g., screen readers are different things. You can have a page that looks like text but is totally unreadable for screen readers, and from which users can’t select and copy text.
– It looks like the words in question are those that contain ligatures ( https://en.wikipedia.org/wiki/Orthographic_ligature )
– Quick Googling for “libreoffice ligatures” reveals that LibreOffice may implement ligatures by auto-replacing these character combinations with single characters, which presumably the PDF export process then mangles: https://ask.libreoffice.org/en/question/216867/how-can-i-globally-disable-ligatures/
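To make the ligature point concrete, here is a minimal Python sketch of what a single-character ligature looks like to software. It assumes the PDF's text has already been extracted into a string by some other tool; the `sample` sentence and the `find_ligatures` helper are made up for illustration and are not taken from Dan's file.

```python
import unicodedata

# Latin ligatures in Unicode's "Alphabetic Presentation Forms" block:
# U+FB00 (ff) through U+FB06 (st).
LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}

def find_ligatures(text):
    """Return (index, character, Unicode name) for each ligature code point in text."""
    return [
        (i, ch, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in LIGATURES
    ]

# Hypothetical extracted text: "\ufb01" is the single-character "fi" ligature.
sample = "You can \ufb01nd the answer here."

print("find" in sample)        # a plain search for "find" does not match
print(find_ligatures(sample))  # reports where the ligature character sits
```

On screen the sample reads as "find", but the string holds one ligature character where a search expects the two letters "f" and "i", which is why a plain search misses it.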
Hope that helps point you in a useful direction!
Dan KB6NU says
Thanks. I think that’s it!
Karl Heinz - K5KHK says
I was just about to jump in and then saw Chris’ reply. He is right, this is faulty ligature handling when creating the PDF file. If you want help fixing this, send me an email – de K5KHK
Dan KB6NU says
After doing some googling, I found a work-around. Instead of having LibreOffice export the PDF, you print the file, but instead of sending it to the printer, you save it as a PDF. (This is something that you can do on a Mac. I’m not sure if you can also do this on Windows.) When the PDF is created this way, no ligatures are created.
I like this approach better than messing around with auto-correct, and it keeps the ligatures for the print version. I still think that the PDF viewers and screen readers should be smart enough to handle ligatures properly, though.
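As a rough illustration of what "handling ligatures properly" could mean for a search function, here is a minimal Python sketch that folds ligature characters back into plain letters with Unicode NFKC normalization before comparing. The `fold` and `count_matches` helpers are hypothetical names, not part of any PDF viewer or screen reader. It only helps when the text layer really contains a ligature character such as U+FB01; if the export dropped the "i" altogether, as the "fnd" search hits suggest may have happened here, nothing at the search end can bring it back.

```python
import unicodedata

def fold(text):
    """Expand compatibility characters (including ligatures) and ignore case."""
    # NFKC maps U+FB01 (the "fi" ligature) to the two letters "f" + "i".
    return unicodedata.normalize("NFKC", text).casefold()

def count_matches(haystack, needle):
    """Count occurrences of needle in haystack after folding both sides."""
    return fold(haystack).count(fold(needle))

# Hypothetical text layer: the first "find" is stored with the ligature character.
text_layer = "Try to \ufb01nd it. Find it again."

print(text_layer.count("find"))           # plain search
print(count_matches(text_layer, "find"))  # search after folding
```

With this sample, the plain search sees neither word (one is hidden by the ligature, the other by the capital F), while the folded search counts both.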
DAVID J SAWICKI - WA3DS says
Look for Microsoft Print to PDF under Print Settings.