Modifying PDF files with PHP

Last week, a friend of mine asked me to help him with a programming problem that he had been wrestling with for some time. The problem sounds simple:

  1. Take a PDF file
  2. Write something at the footer of each page of that file

And this had to be done with PHP.

Although there are several libraries available in PHP for dealing with PDF files, none seemed able to modify the contents of an existing PDF file. Their manuals and tutorials are full of examples of how to create PDFs on the fly. After spending a few fruitless hours trying to get the much-recommended PDFLib installed on my Mac and working with MAMP, I painfully realized that this library is for commercial use only. The free version leaves a horrible watermark of their site address on the generated PDF documents.

My search for a solution took me to FPDF, an open-source library for PDF file generation in PHP. In their FAQ section, I found a link to an extension of the library named FPDI. This one was seemingly capable of ‘manipulating’ PDF files in an ad hoc fashion. It extracts the contents of each page in the file, uses it as a template, lets you put text and shapes on the template, and then outputs the modified file. Excited, I got into coding and, after an hour of labor, finally achieved my goal! Thank God for creating open source!

Enough talk, now let’s get our hands dirty!

First we need to have the following libraries downloaded and unzipped. They are just packages of PHP scripts that you require/include in your own script. No need to deal with .dll/.so extensions.

  1. FPDF
  2. FPDI
  3. FPDF_TPL

Keep them in the same directory as your script, or in the include path. The following code snippet gives a basic idea of how to get started:

require_once('fpdf/fpdf.php');
require_once('fpdi/fpdi.php');

$pdf = new FPDI();
//Turn off automatic page breaks so the cell near the bottom edge stays on this page
$pdf->SetAutoPageBreak(false);
$pdf->AddPage();

//Set the source PDF file
$pagecount = $pdf->setSourceFile("my_existing_pdf.pdf");

//Import the first page of the file
$tpl = $pdf->importPage(1);
//Use this page as template
$pdf->useTemplate($tpl);

//Print Hello World at the bottom of the page

//Go to 1.5 cm from bottom
$pdf->SetY(-15);
//Select Arial italic 8
$pdf->SetFont('Arial','I',8);
//Print centered cell with a text in it
$pdf->Cell(0, 10, "Hello World", 0, 0, 'C');

//Write the result to a new file on disk
$pdf->Output("my_modified_pdf.pdf", "F");

The above code takes a PDF file “my_existing_pdf.pdf”, and writes its first page to a new file “my_modified_pdf.pdf” with “Hello World” printed at the bottom centre of the page.
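To cover every page rather than only the first, the same calls can be wrapped in a loop. What follows is a minimal sketch, not a definitive implementation. It assumes FPDI 1.x (the FPDI class loaded from fpdi/fpdi.php, as above) and assumes that getTemplateSize() returns an array with 'w' and 'h' keys. The functions orientationFor() and stampPages() are hypothetical helpers written for this illustration; they are not part of FPDF or FPDI.

```php
<?php
// Hypothetical helper: choose FPDF's orientation flag from a page's width and height.
function orientationFor($width, $height)
{
    return $height >= $width ? 'P' : 'L';
}

// Hypothetical helper: copy every page of $source into $target and print
// $texts[<page number>] (if set) centered near the bottom of that page.
function stampPages($source, $target, array $texts)
{
    // Loaded here so that orientationFor() can be used without the libraries.
    require_once('fpdf/fpdf.php');
    require_once('fpdi/fpdi.php');

    $pdf = new FPDI();
    // Without this, a cell this close to the bottom edge triggers a page break.
    $pdf->SetAutoPageBreak(false);

    $pageCount = $pdf->setSourceFile($source);
    for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
        $tpl  = $pdf->importPage($pageNo);
        $size = $pdf->getTemplateSize($tpl); // assumed keys: 'w' and 'h'

        // Create the new page with the size and orientation of the imported one.
        $pdf->AddPage(
            orientationFor($size['w'], $size['h']),
            array($size['w'], $size['h'])
        );
        $pdf->useTemplate($tpl);

        if (isset($texts[$pageNo])) {
            $pdf->SetY(-15);                 // 1.5 cm from the bottom
            $pdf->SetFont('Arial', 'I', 8);
            $pdf->Cell(0, 10, $texts[$pageNo], 0, 0, 'C');
        }
    }

    $pdf->Output($target, 'F');
    return $pageCount;
}
```

A call such as stampPages('my_existing_pdf.pdf', 'my_modified_pdf.pdf', array(1 => 'X', 2 => 'Y')) would then print a different text on each of the first two pages and copy any remaining pages unchanged.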

That’s it! To achieve my goal, which I outlined at the start of this post, I extended the FPDI class and overrode the Footer() method to print a customized footer on each page.
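A subclass along those lines might look like the sketch below. It is an illustration under assumptions, not the exact code I used: the class name FooterPdf and the property footerText are hypothetical, FPDI 1.x is assumed as in the snippet above, and every page is assumed to fit the default A4 portrait size. FPDF calls Footer() itself when a page is closed, so the footer lands on top of the imported template.

```php
<?php
require_once('fpdf/fpdf.php');
require_once('fpdi/fpdi.php');

// Hypothetical subclass: prints the same footer text on every page.
class FooterPdf extends FPDI
{
    // Text to print in the footer (hypothetical property, set by the caller).
    public $footerText = '';

    // FPDF calls this automatically at the end of each page.
    function Footer()
    {
        $this->SetY(-15);                 // 1.5 cm from the bottom
        $this->SetFont('Arial', 'I', 8);  // Arial italic 8
        $this->Cell(0, 10, $this->footerText, 0, 0, 'C');
    }
}

$pdf = new FooterPdf();
$pdf->footerText = 'Hello World';

// Import each page of the source file onto a new page of the output.
$pageCount = $pdf->setSourceFile('my_existing_pdf.pdf');
for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
    $tpl = $pdf->importPage($pageNo);
    $pdf->AddPage();
    $pdf->useTemplate($tpl);
}

$pdf->Output('my_modified_pdf.pdf', 'F');
```

Because the text is drawn inside Footer(), no page break is triggered even though the cell sits inside the bottom margin.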

I only wish that the PHP online manual did NOT have an entire section dedicated to PDFLib, a non-free, commercial library, and instead pointed to free ones such as FPDF or TCPDF. It could have saved me hours.


26 thoughts on “Modifying PDF files with PHP”

  1. Thanks for the great intro! Just the information I needed.

    As a note to anyone trying out the example, you’ll have to replace $this-> with $pdf->, since $this-> isn’t referencing to any objects in the example.

  2. I concur about using FPDF/FPDI. Used it to auto populate a PDF form by simply overlaying the text. Fantastic lib and strongly recommend it.

  3. Sounds glorious, BUT it appears that to move beyond the footer into the body of the PDF, one will need to know the coordinate position of the data?

  4. Pingback: Blog stats 2010 « Glorified Geek

  5. Pingback: Watermarking PDF using PHP :O | Meja Duniaku

  6. To work with landscape and portrait correctly, I am using this:

    $specs = $pdf->getTemplateSize($tpl);
    $pdf->addPage($specs['h'] > $specs['w'] ? 'P' : 'L');

    I am having some trouble putting the text at the end of the page. In general, it is put on a new page when I call setY with a value less than “-30”.

    I tried to use $pdf->setMargin(0,0) without success.

  7. Thanks for this article. I am trying to use it in my project, but I got this error:

    Fatal error: Uncaught exception ‘InvalidArgumentException’ with message ‘Invalid page number!’ in /var/www/helpdesk/tools/pdf/pdf_manipulation/fpdi/fpdi_pdf_parser.php:104 Stack trace: #0 /var/www/helpdesk/tools/pdf/pdf_manipulation/fpdi/fpdi.php(192): fpdi_pdf_parser->setPageNo(NULL) #1 /var/www/helpdesk/tools/pdf/pdf.php(13): FPDI->importPage(NULL) #2 {main} thrown in /var/www/helpdesk/tools/pdf/pdf_manipulation/fpdi/fpdi_pdf_parser.php on line 104

    Please help.

  8. Thanks for the wonderful example. But in the case of multi-page documents, whatever I write on the first page is repeated on all pages, whereas I want to write X on page 1 and Y on page 2.
    How can it be done?

  9. This looks pretty cool. Thanks for sharing. Can you tell whether it’s possible to inject input fields into a PDF with that approach? I would need to make a PDF document “editable” on the server side so that it’s possible to enter data on the client side.

  10. Yea, this is a good tutorial, well written. But I would’ve loved to see a tutorial on actually editing the contents of a pdf file such as text.

    The only useful piece of info I’ve found came from a stackoverflow page, describing the same process you made your post about, but they also added the following:

    “If you’re trying to replace inline content, such as a “[placeholder string],” it gets much more complicated. While it’s technically possible to do, you’re likely to mess up the layout of the page.

    A PDF document is comprised of a set of primitive drawing operations: line here, image here, text chunk there, etc. It does not contain any information about the layout intent of those primitives.”
    src: http://stackoverflow.com/a/7455

  11. Hi, this is very useful for editing PDF files. We need help with one more thing.
    Could you please provide a function or class for reading the PDF contents and highlighting the required data while editing? It would be very helpful if you could suggest any ways to do this.
    Thanks in advance!

  12. Fantastic! Helped me a lot! Just three things I needed to realize before it worked:

    -The code itself does not work if you do not change the “$i” from “$tpl = $pdf->importPage($i);” for a “1” for example. I guess the “$i” is to iterate between different pages or something but as long as it is not declared in the code it just doesn’t work.

    -I needed to change another thing before it worked: the “&” from “$pdf =& new FPDI();” needs to be erased.

    -The fpdf_tpl.php is already included in fpdi.php; it is even called from fpdi.php with a “require_once('fpdf_tpl.php');” on line 12.

    After I figured out these three things, which took me a long time as I just learned PHP today, everything worked well: you run the script from your browser after you have saved the PDF to be converted (with the name “my_existing_pdf.pdf”) in the same directory as the script, and a new PDF appears in this same directory under the name “my_modified_pdf.pdf”.

    Thanks a lot! 🙂
