Split PDF by Size Example PeopleCode iText Tutorial

In our previous post, we explained how to split a PDF document by file size in Java using the iText library. In this example, we do the same thing in PeopleSoft: a PeopleCode program invokes the iText API through JavaObjects and slices a PDF file into parts of roughly equal size.

To run this example, the iText JAR file must be available on your application server or process scheduler classpath. I have placed version 2.1.7 of the iText JAR in my application server classpath directory. If you are placing this JAR for the first time, bounce your server once so that PeopleCode picks up the new file. My input PDF is around 1.7 MB, and through PeopleCode I will partition it into multiple parts of around 500 KB each. The PeopleCode steps to split a PDF file by file size are given below.

Step-1: We begin by creating a JavaObject with CreateJavaObject that reads the input file to be split into multiple parts of roughly equal size. This is a PdfReader object [com.lowagie.text.pdf.PdfReader]. We also get the number of pages in the input file using the getNumberOfPages method. The PeopleCode is provided below:
Local JavaObject &obj_Split_PDF_Size = CreateJavaObject("com.lowagie.text.pdf.PdfReader", "Split_PDF_Size.pdf");
Local number &NumberofPages = &obj_Split_PDF_Size.getNumberOfPages();

Step-2: We declare the working variables in PeopleCode: a string to hold the dynamic file name, an integer to hold the file size in bytes, an integer counter used to number the output files, and a float to hold the file size in kilobytes. The declaration statements are given below:
Local string &splitfilename;
Local integer &filesize, &filecounter;
Local float &FileSizeinKb = 0;

Step-3: We now run a For loop over the pages of the input file. Inside the loop, we extract a page from the input PDF using the getImportedPage method and copy it to the new PDF file using the addPage method of the PdfCopy object [com.lowagie.text.pdf.PdfCopy]. After adding a page, we estimate the size of the new PDF file using the getCurrentDocumentSize method. This returns the size in bytes, which we convert to kilobytes and compare against the threshold (495 KB here) inside the loop. If the threshold is exceeded, or if the page just imported is the last page of the input PDF, we close the Document object and reset the file size variable to zero so that the next iteration creates a new file. The PeopleCode is provided below:
For &i = 1 To &NumberofPages
   If &FileSizeinKb = 0 Then
      Local JavaObject &Obj_SplitPDFDocument_l = CreateJavaObject("com.lowagie.text.Document");
      &filecounter = &filecounter + 1;
      &splitfilename = "PeopleSoft_File" | &filecounter | ".pdf";
      /* False = overwrite rather than append, so a file left over from an earlier run is replaced */
      Local JavaObject &Obj_PdfCopyObject_l = CreateJavaObject("com.lowagie.text.pdf.PdfCopy", &Obj_SplitPDFDocument_l, CreateJavaObject("java.io.FileOutputStream", &splitfilename, False));
      &Obj_SplitPDFDocument_l.open();
   End-If;
   &Obj_PdfCopyObject_l.addPage(&Obj_PdfCopyObject_l.getImportedPage(&obj_Split_PDF_Size, &i));
   &filesize = &Obj_PdfCopyObject_l.getCurrentDocumentSize();
   &FileSizeinKb = &filesize / 1024;
   If &FileSizeinKb > 495 Or
         &i = &NumberofPages Then
      &Obj_SplitPDFDocument_l.close();
      &FileSizeinKb = 0;
   End-If;
End-For;
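
The grouping logic in this loop can be tried out without iText or PeopleSoft. The following is a minimal Java sketch, not the author's program: the class name, the method name and the per-page sizes are made up for illustration, and the running size is modelled as a simple sum of page sizes, whereas the real program reads it from getCurrentDocumentSize.

```java
import java.util.ArrayList;
import java.util.List;

class SplitBySizePlan {

    /**
     * Groups page numbers into output files the same way the loop above does:
     * keep adding pages, and close the current file once its running size
     * passes the threshold or the last page has been added.
     * Page sizes are assumed to be positive, hypothetical values in KB.
     */
    public static List<List<Integer>> planChunks(double[] pageSizesKb, double thresholdKb) {
        List<List<Integer>> chunks = new ArrayList<>();
        List<Integer> current = null;   // pages of the file being built
        double currentSizeKb = 0;       // 0 means "no file is open"

        for (int page = 1; page <= pageSizesKb.length; page++) {
            if (currentSizeKb == 0) {
                current = new ArrayList<>();   // start a new output file
                chunks.add(current);
            }
            current.add(page);
            currentSizeKb += pageSizesKb[page - 1];

            // The check runs after the page is added, as in the PeopleCode loop
            if (currentSizeKb > thresholdKb || page == pageSizesKb.length) {
                currentSizeKb = 0;             // "close" the file and reset
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        double[] sizes = {200, 200, 200, 150, 400, 100, 300};
        System.out.println(planChunks(sizes, 495)); // prints [[1, 2, 3], [4, 5], [6, 7]]
    }
}
```

Because the size is checked only after a page has been added, each output file can overshoot the threshold by up to one page. That is one reason to set the threshold a little below the target size.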

The complete PeopleCode example to split a PDF file by size using the Java iText library is provided below. Each step is commented so that readers can easily understand and customize the logic to suit their needs:

/* Read input PDF that needs to be split */
Local JavaObject &obj_Split_PDF_Size = CreateJavaObject("com.lowagie.text.pdf.PdfReader", "Split_PDF_Size.pdf");
/* Get the number of pages in the PDF file */
Local number &NumberofPages = &obj_Split_PDF_Size.getNumberOfPages();
/* This variable holds the name of the new PDF files that will be created during this bursting */
Local string &splitfilename;
/* &filesize holds the size in bytes; &filecounter provides a running sequence number for the output file names */
Local integer &filesize, &filecounter;
/* Variable to get filesize in kilobytes */
Local float &FileSizeinKb = 0;
/* Run this loop for all the pages in the input file */
For &i = 1 To &NumberofPages
   If &FileSizeinKb = 0 Then /* Start a new output file: first page, or the previous file was just closed */
      /* Create Document object for new PDF file */
      Local JavaObject &Obj_SplitPDFDocument_l = CreateJavaObject("com.lowagie.text.Document");
      /* Update counter for dynamic file name */
      &filecounter = &filecounter + 1;
      /* generate unique file name */
      &splitfilename = "PeopleSoft_File" | &filecounter | ".pdf";
      /* Create PdfCopy object from the Document object and an output stream for the new file. This object is
      required to import pages from the original file. False = overwrite rather than append, so a file left
      over from an earlier run is replaced */
      Local JavaObject &Obj_PdfCopyObject_l = CreateJavaObject("com.lowagie.text.pdf.PdfCopy", &Obj_SplitPDFDocument_l, CreateJavaObject("java.io.FileOutputStream", &splitfilename, False));
      /* Open Document object to add pages */
      &Obj_SplitPDFDocument_l.open();
   End-If;
   /* Add page to new PDF document */
   &Obj_PdfCopyObject_l.addPage(&Obj_PdfCopyObject_l.getImportedPage(&obj_Split_PDF_Size, &i));
   /* Estimate new PDF size after adding current page */
   &filesize = &Obj_PdfCopyObject_l.getCurrentDocumentSize();
   /* Convert the size from bytes to kilobytes */
   &FileSizeinKb = &filesize / 1024;
   /* 495 KB is the threshold. Close the file when the threshold is exceeded or when the page just added is
   the last page. Closing the Document object writes the end-of-file information; resetting the size variable
   makes the next page start a new file */
   If &FileSizeinKb > 495 Or
         &i = &NumberofPages Then
      &Obj_SplitPDFDocument_l.close();
      &FileSizeinKb = 0;
   End-If;
End-For;
/* Release the input file once all pages have been copied */
&obj_Split_PDF_Size.close();

I noted that PeopleCode allows the Document and PdfCopy objects to be declared inside the For loop and still be used after the If block. The Java version requires them to be declared outside the loop and given an initial value. Otherwise the Java code does not compile and fails with the error "variable might not have been initialized", even though the program logic always assigns them before use.
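
The Java compile error can be reproduced without iText. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, and a StringBuilder stands in for the Document and PdfCopy objects. It shows the declaration placed outside the loop with an initial value of null, which is one way to satisfy the compiler.

```java
class DefiniteAssignmentDemo {

    /** Counts how many "files" get opened; StringBuilder stands in for Document/PdfCopy. */
    public static int countOpenedFiles(int pages, int pagesPerFile) {
        // Declared outside the loop and initialized to null so the compiler
        // can prove it has a value before the append(...) call below.
        // Writing just "StringBuilder current;" here makes javac reject that
        // call with "variable current might not have been initialized".
        StringBuilder current = null;
        int opened = 0;
        int inCurrent = 0;   // 0 means "no file is open", like &FileSizeinKb = 0

        for (int page = 1; page <= pages; page++) {
            if (inCurrent == 0) {
                current = new StringBuilder();  // "open" a new file
                opened++;
            }
            current.append(page).append(' ');   // "add" the page
            inCurrent++;
            if (inCurrent == pagesPerFile || page == pages) {
                inCurrent = 0;                  // "close" the file
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        System.out.println(countOpenedFiles(7, 3)); // prints 3
    }
}
```

The compiler does not follow the run-time logic that guarantees the assignment on the first iteration, so it needs the explicit initial value. The PeopleCode listing needs no such change.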

The PeopleCode program produced four PDF files as output: three of around 500 KB and one of around 250 KB. The example works!
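
As a rough sanity check on the four output files, the expected number of parts can be estimated from the sizes quoted in this post (an input of about 1.7 MB and a target of about 500 KB per part). The sketch below is illustrative only and its names are hypothetical. It ignores the per-file overhead and any shared resources, such as fonts, that are copied into more than one part, so real output sizes will differ somewhat.

```java
class PartCountEstimate {

    /** Rough number of output files: total size divided by target size, rounded up. */
    public static int estimateParts(double totalKb, double targetKb) {
        return (int) Math.ceil(totalKb / targetKb);
    }

    /** Rough size in KB of the final, partially filled file. */
    public static double estimateLastPartKb(double totalKb, double targetKb) {
        int parts = estimateParts(totalKb, targetKb);
        return totalKb - (parts - 1) * targetKb;
    }

    public static void main(String[] args) {
        double totalKb = 1.7 * 1024;   // about 1.7 MB of input, as in this post
        double targetKb = 500;         // about 500 KB per part
        System.out.println(estimateParts(totalKb, targetKb));                 // prints 4
        System.out.println(Math.round(estimateLastPartKb(totalKb, targetKb))); // prints 241
    }
}
```

The estimate of four parts with a last part of roughly 240 KB is in line with the files actually produced.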
