Showing posts with label pdf. Show all posts

Sunday, November 24, 2013

Extracting images from PDF and removing whitespace.

You need two tools:
  1. ImageMagick (convert) - http://www.imagemagick.org/script/binary-releases.php
  2. pdfimages - http://en.wikipedia.org/wiki/Pdfimages
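Run them in this order: pdfimages first extracts the embedded images as PPM files, then convert trims the near-uniform border from each one (-fuzz 7% sets the color tolerance used by -trim) and writes numbered JPEGs:
  pdfimages ../../20120101.pdf 20120101
  convert *.ppm -fuzz 7% -trim ../20120101-%d.jpg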

Monday, March 18, 2013

Cropping Margins of an Existing Template PDF with PDFTK

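The key to re-centering or changing the margins is the /MediaBox entry: we pan the page window over the original template. This has to be done ON THE UNCOMPRESSED PDF, so that the entry appears as plain text that sed can match. The original media box is [0 0 612 792]; each box below pans it by 70 points:
  • Move the page content up: [0 -70 612 722]
  • Move the page content down: [0 70 612 862]
  • Move the page content left: [70 0 682 792]
  • Move the page content right: [-70 0 542 792]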
  pdftk S89E.pdf output S89E.unc.pdf uncompress
  sed 's/MediaBox \[0 0 612 792\]/MediaBox \[-70 0 542 792\]/g' < S89E.unc.pdf > S89.resized.pdf
Examples
  sed 's/MediaBox \[0 0 612 792\]/MediaBox \[0 70 612 862\]/g' < S89E.unc.pdf > S89.resized.pdf
  sed 's/MediaBox \[0 0 612 792\]/MediaBox \[0 -70 612 722\]/g' < S89E.unc.pdf > S89.resized.pdf
  sed 's/MediaBox \[0 0 612 792\]/MediaBox \[70 0 682 792\]/g' < S89E.unc.pdf > S89.resized.pdf
  sed 's/MediaBox \[0 0 612 792\]/MediaBox \[-70 0 542 792\]/g' < S89E.unc.pdf > S89.resized.pdf
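
The four boxes listed earlier all follow one rule: to shift the page content by dx points to the right and dy points up, subtract dx from both x values and dy from both y values of the media box. Here is a minimal sketch of that arithmetic, assuming the 612 x 792 page used in this post; the function name shifted_mediabox is made up for illustration and is not part of pdftk or sed.

```shell
#!/bin/sh
# Print the MediaBox that shifts the page content by dx points to the
# right and dy points up. Negative values shift left / down.
# Assumes the original box is [0 0 612 792], as in the post.
shifted_mediabox() {
    dx=$1
    dy=$2
    printf '[%d %d %d %d]\n' \
        "$((0 - dx))" "$((0 - dy))" "$((612 - dx))" "$((792 - dy))"
}

shifted_mediabox 0 70     # up:    [0 -70 612 722]
shifted_mediabox 0 -70    # down:  [0 70 612 862]
shifted_mediabox -70 0    # left:  [70 0 682 792]
shifted_mediabox 70 0     # right: [-70 0 542 792]

# Try the substitution on a sample line before touching a real file.
new_box=$(shifted_mediabox 70 0)
printf '/MediaBox [0 0 612 792]\n' |
    sed "s/MediaBox \[0 0 612 792\]/MediaBox $new_box/g"
```

The last command prints /MediaBox [-70 0 542 792], which is the same replacement as the hand-written sed line for moving right.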

Wednesday, December 28, 2011

PDFTK: find PDFs and collate them with a one-liner

Here is a one-liner that merges PDFs scattered across a sub-directory structure. It prints the pdftk command rather than running it, so you can review it first. (Prerequisite: a working installation of pdftk.)
$ find . -type f -name '*.pdf' |
      perl -ne 'chomp; push @c, $_;    # collect every path found
          END { my $n = 65;            # 65 is ASCII "A"; each file gets the next letter as its handle
                foreach (@c) {
                    $x .= sprintf " %c=\"%s\"", $n, $_;    # A="./dir/file.pdf"
                    $y .= sprintf " %c", $n;               # A B C ...
                    $n++;
                }
                print "\npdftk $x shuffle $y output collated.pdf;\n"
          }'
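
Because the handles start at "A" (character code 65) and go up by one per file, the one-liner runs out of letters after 26 files. The sketch below builds the same command line in plain shell and stops with an error at that point. It is an illustration, not the original author's method: the function name build_shuffle_cmd and the sample paths are made up.

```shell
#!/bin/sh
# Read one PDF path per line on stdin and print a pdftk "shuffle" command.
# Handles are assigned A, B, C, ... so this sketch refuses more than 26 files.
build_shuffle_cmd() {
    n=65          # ASCII code of "A"
    inputs=""
    handles=""
    while IFS= read -r file; do
        if [ "$n" -gt 90 ]; then          # 90 is ASCII "Z"
            echo "more than 26 input files" >&2
            return 1
        fi
        # Turn the character code into a letter via an octal escape.
        handle=$(printf "\\$(printf '%03o' "$n")")
        inputs="$inputs $handle=\"$file\""
        handles="$handles $handle"
        n=$((n + 1))
    done
    printf 'pdftk%s shuffle%s output collated.pdf\n' "$inputs" "$handles"
}

# Hypothetical sample paths; in practice pipe in the find command instead.
printf '%s\n' ./a/one.pdf ./b/two.pdf | build_shuffle_cmd
```

With the two sample paths this prints: pdftk A="./a/one.pdf" B="./b/two.pdf" shuffle A B output collated.pdf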

Wednesday, January 19, 2011

PDF::Reuse and Wide character in compress

If your $content string contains wide characters (code points above 255), encode it to UTF-8 bytes before handing it to the PDF::Reuse module.

use Encode;
use PDF::Reuse;
# encode the string to UTF-8 bytes to avoid
# "Wide character in compress at /usr/lib/perl5/site_perl/5.10/PDF/Reuse.pm line" 
$content = encode( "utf8", $content );