Java Example Split PDF File Using iText Tutorial

In this tutorial, we will discuss how to split a single PDF document into multiple PDF files in Java (with iText) using a simple example. We earlier discussed how to merge multiple PDF files into a single PDF document using Java code. This guide does the opposite. Once you read through this section, you will find that separating an input file into multiple chunks is much easier than merging.

To get started with splitting a PDF document, I have an input PDF file of two pages, and I would like to programmatically split it into two documents, each containing one page. Later on, we will also discuss how to split PDF files based on the size of the PDF documents, with suitable examples. At a high level, bursting a PDF document (in PDF terminology) involves using a PdfReader object to read each page and a PdfCopy object to copy that page into a new Document. While doing this, we have to perform a few additional steps, such as generating file names dynamically and getting the number of pages in the PDF document. Again, this tutorial is for beginners in iText / Java. Follow the step-by-step procedure below to slice a PDF file:

Step-1: Begin by creating a PdfReader object, which takes the input PDF file that will be broken into multiple files. We also declare Document and PdfCopy variables for use during the splitting operation, and store the number of pages in the input document in an integer variable. Refer to the code below for how to do this:
          PdfReader Split_PDF_Document = new PdfReader("CombinedPDFDocument.pdf");
          Document document; /* To be used Dynamically to construct Individual PDFs */
          PdfCopy copy; /* To import pages from Source Document */
          /* Get Number of Pages in Source Document */
          int number_of_pages = Split_PDF_Document.getNumberOfPages();
Step-2: Once you have the number of pages, the rest is simple. Loop over the pages and create a new Document object inside the loop on each pass. Generate a file name dynamically and use the same getImportedPage method to copy the contents into the new file (one page at a time for this example). Here is a sample of how to do this in iText:
          /* Page numbers in iText start at 1, so loop from 1 to number_of_pages */
          for (int i = 1; i <= number_of_pages; i++) {
                  document = new Document();
                  String FileName = "File" + i + ".pdf"; /* Split File Name, Dynamically generated */
                  /* Instantiate PdfCopy object for new Document */
                  copy = new PdfCopy(document, new FileOutputStream(FileName));
                  document.open();
                  /* Add page to new document from source file */
                  copy.addPage(copy.getImportedPage(Split_PDF_Document, i));
                  /* Close the file */
                  document.close();
          }
          /* Release the source document once all pages are copied */
          Split_PDF_Document.close();
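One thing to watch with a plain counter in the file name: once there are ten or more parts, File10.pdf sorts before File2.pdf in most file listings. A minimal sketch of a zero-padded alternative follows. It uses only the JDK, and the class and method names (SplitFileNames, nameFor) are hypothetical helpers for illustration, not part of iText.

```java
import java.util.Locale;

final class SplitFileNames {
    /** Builds a name such as File007.pdf, padded to the width of the largest page number. */
    static String nameFor(int pageNumber, int totalPages) {
        int width = String.valueOf(totalPages).length();
        // Locale.ROOT keeps the digits plain ASCII regardless of the system locale
        return String.format(Locale.ROOT, "File%0" + width + "d.pdf", pageNumber);
    }

    public static void main(String[] args) {
        int totalPages = 12; // assumed page count, for illustration only
        for (int page = 1; page <= totalPages; page++) {
            System.out.println(nameFor(page, totalPages));
        }
    }
}
```

In the loop above, the FileName assignment could call such a helper with i and number_of_pages instead of concatenating the counter directly.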
And that is all you need to do to split a PDF document in Java. When you run this code on an input document of 100 pages, you will get 100 individual PDF files, because the loop writes every page to its own file. Does that look like too many? You can always tweak the loop so that each split document contains more than one page, which reduces the number of partitioned files. We will examine this when we see some advanced examples for dividing a PDF file. The complete working code is provided below:
import java.io.*;
import com.itextpdf.text.*;
import com.itextpdf.text.pdf.*;
public class SplitPDF {
    public static void main(String[] args) {
        try {
            /* Open the source document that will be split */
            PdfReader Split_PDF_Document = new PdfReader("CombinedPDFDocument.pdf");
            Document document;
            PdfCopy copy;
            int number_of_pages = Split_PDF_Document.getNumberOfPages();
            /* Page numbers in iText start at 1 */
            for (int i = 1; i <= number_of_pages; i++) {
                document = new Document();
                String FileName = "File" + i + ".pdf";
                copy = new PdfCopy(document, new FileOutputStream(FileName));
                document.open();
                /* Copy page i of the source into the new file */
                copy.addPage(copy.getImportedPage(Split_PDF_Document, i));
                document.close();
            }
            /* Release the source document */
            Split_PDF_Document.close();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}
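Coming back to the earlier remark about putting more than one page in each output file: the only new piece of logic is working out which pages belong to which file. The sketch below computes those page ranges with plain Java and no iText calls. The class name PageChunks, the method ranges, and the numbers in main are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PageChunks {
    /** Returns inclusive, 1-based {firstPage, lastPage} pairs covering pages 1..totalPages. */
    static List<int[]> ranges(int totalPages, int pagesPerFile) {
        if (totalPages < 0 || pagesPerFile < 1) {
            throw new IllegalArgumentException("totalPages must be >= 0 and pagesPerFile >= 1");
        }
        List<int[]> result = new ArrayList<>();
        for (int first = 1; first <= totalPages; first += pagesPerFile) {
            // The last chunk may be shorter than pagesPerFile
            int last = Math.min(first + pagesPerFile - 1, totalPages);
            result.add(new int[] {first, last});
        }
        return result;
    }

    public static void main(String[] args) {
        // 100 pages in chunks of 30 (both numbers are illustrative)
        for (int[] r : ranges(100, 30)) {
            System.out.println("pages " + r[0] + " to " + r[1]);
        }
    }
}
```

With 100 pages and 30 pages per file this yields four ranges, the last one covering pages 91 to 100. In the splitting code, each range would get its own Document and PdfCopy, with copy.addPage(copy.getImportedPage(...)) called for every page from first to last before document.close().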
When you run the SplitPDF example, make sure the input file is in the directory you run the program from (a relative name such as CombinedPDFDocument.pdf is resolved against the working directory), or change the path to suit your setup.
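If the program cannot find the input, it helps to print where a relative file name actually points. The following is a small sketch using only java.nio.file. The class name InputCheck and the method resolveInput are hypothetical, and the file name is the one used in this tutorial.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class InputCheck {
    /** Resolves a file name against the current working directory. */
    static Path resolveInput(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        Path input = resolveInput("CombinedPDFDocument.pdf");
        if (Files.isRegularFile(input)) {
            System.out.println("Found input: " + input);
        } else {
            System.out.println("Input not found, expected at: " + input);
        }
    }
}
```

Running this from the same directory as SplitPDF shows the full path that a relative file name maps to, which makes a misplaced input file easy to spot.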
If you are looking for an example that splits PDF documents based on file size in Java, refer to the link in this alert.

2 comments:

  1. can there be any way to split pdf by hiding watermark for show when display on the screen.

  2. are there any other libraries available other than iText to split pdf documents?
