My name is Philipp C. Heckel and I write about nerdy things.
This site moved here recently from blog.philippheckel.com!

Monthly Archives / August 2009


  • Aug 09 / 2009
Linux, Office

Extract text from PDF files

Adobe’s Portable Document Format (PDF) has become very popular in recent years and is the number one format for easy document exchange. It comes with great features such as embeddable images and multimedia, but it also has some rather unpleasant properties. The so-called Security Features represent a simple Digital Rights Management (DRM) system that lets PDF authors restrict how a file can be used. Using the DRM system, authors can allow or deny actions such as printing a file, commenting, or copying content.

Even though this is a good idea in some situations, most of the time it’s just annoying: collecting ideas for a seminar paper or a thesis, for instance, is almost impossible without being able to copy and paste certain paragraphs from the PDF.
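
To make the allow/deny settings described above a bit more concrete, here is a minimal sketch of how they are stored. In a PDF protected with the standard security handler, the restrictions are kept as bit flags in a single integer (the /P entry of the encryption dictionary). The function name decode_permissions and the dictionary PERMISSION_BITS are my own hypothetical names for illustration, and the sketch only covers the four basic flags (printing, modifying, copying, commenting), not the full set defined in the specification.

```python
# Bit positions follow the PDF specification's numbering, where the
# low-order bit is bit 1. Only the four basic flags are covered here.
PERMISSION_BITS = {
    "print": 3,     # print the document
    "modify": 4,    # change the document's contents
    "copy": 5,      # copy or extract text and graphics
    "annotate": 6,  # add or change comments and form fields
}


def decode_permissions(p):
    """Decode the signed 32-bit /P value into allow/deny booleans.

    Hypothetical helper for illustration; a set bit means the action
    is allowed, a cleared bit means it is denied.
    """
    flags = p & 0xFFFFFFFF  # view the signed value as unsigned 32-bit
    return {
        name: bool(flags & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


if __name__ == "__main__":
    # Example value: printing allowed, everything else denied.
    for action, allowed in decode_permissions(-1852).items():
        print(action, "allowed" if allowed else "denied")
```

The value is read from an already parsed file here; locating the /P entry in a real PDF is left out on purpose, since that depends on the library or tool used.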
