Java iText Convert Metadata to XMP Format Example Tutorial

We have been discussing various approaches to creating metadata inside a PDF document for a while now. In this post, we will take an input PDF that stores its metadata in the info dictionary and update it so that the metadata is converted to an XMP stream. We will use the PdfStamper class of the iText Java API to do this conversion. For other fundamental tasks, such as inserting or updating metadata in PDF files, you may wish to refer to the related posts section at the bottom of this post. The step-by-step guide to converting an existing PDF document's metadata into XMP stream format is provided below:

Step-1: We begin this tutorial by creating PdfReader and PdfStamper objects. We also use the getInfo method of the PdfReader object to read the existing metadata from the info dictionary of the PDF file.
          PdfReader ReadInputPDF;
          ReadInputPDF = new PdfReader("sample.pdf");
          PdfStamper Write_Metadata_XMP =new PdfStamper(ReadInputPDF, new FileOutputStream("XMP_Metadata_Convert.pdf"));
          HashMap<String, String> hMap = ReadInputPDF.getInfo();          
Step-2: We now create an XmpWriter (com.itextpdf.text.xml.xmp.XmpWriter) object and pass a ByteArrayOutputStream and the hash map obtained in step 1 to its constructor. Closing the writer completes the XMP packet in the byte array output stream, so it must happen before the bytes are read in the next step. A code example to do this is provided below:
          ByteArrayOutputStream byte_array_XMP = new ByteArrayOutputStream();
          XmpWriter new_XMP_Writer = new XmpWriter(byte_array_XMP, hMap);
          new_XMP_Writer.close();
Step-3: Now, we have to write this XMP stream to the PDF file. To do this, we invoke the setXmpMetadata method of the PdfStamper class and pass it the contents of the byte array output stream, obtained with toByteArray. Closing the stamper then writes the output PDF. The code is provided below:
          Write_Metadata_XMP.setXmpMetadata(byte_array_XMP.toByteArray());
          Write_Metadata_XMP.close();
The complete code for this example is provided below. This code shows how to convert the info dictionary metadata to an XMP stream and embed it inside a PDF document. To check that the conversion worked, you can read the metadata back from the updated PDF using the methods from the earlier tutorials (refer to the related posts below).
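import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.util.HashMap;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.xml.xmp.XmpWriter;
public class ConvertXMPMetadata{
     public static void main(String[] args){
        try {
          // Open the input PDF and create a stamper that writes the updated copy
          PdfReader ReadInputPDF;
          ReadInputPDF = new PdfReader("sample.pdf");
          PdfStamper Write_Metadata_XMP = new PdfStamper(ReadInputPDF, new FileOutputStream("XMP_Metadata_Convert.pdf"));
          // Read the existing metadata from the info dictionary
          HashMap<String, String> hMap = ReadInputPDF.getInfo();
          // Serialize the info entries as an XMP packet in memory
          ByteArrayOutputStream byte_array_XMP = new ByteArrayOutputStream();
          XmpWriter new_XMP_Writer = new XmpWriter(byte_array_XMP, hMap);
          // Close the writer first so the XMP packet is complete
          new_XMP_Writer.close();
          // Attach the XMP bytes; closing the stamper writes the output file
          Write_Metadata_XMP.setXmpMetadata(byte_array_XMP.toByteArray());
          Write_Metadata_XMP.close();
          ReadInputPDF.close();
          }
        catch (Exception e)
        {
            System.out.println(e);
        }
    }
}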
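As a quick sanity check on the output, the sketch below scans the raw bytes of the generated file for the `<?xpacket begin=` header that opens every XMP packet. It uses only the standard library and assumes the metadata stream is stored uncompressed and unencrypted, which is the usual convention for XMP but is not guaranteed for every PDF. The class name XmpPacketScan and the method indexOfXmpPacket are made up for this illustration; reading the packet through iText (PdfReader has a getMetadata method that returns the XMP bytes) is the more robust route.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of iText.
class XmpPacketScan {

    // Every XMP packet begins with this processing instruction.
    private static final byte[] MARKER =
            "<?xpacket begin=".getBytes(StandardCharsets.US_ASCII);

    // Returns the offset of the first XMP packet header in data, or -1 if none is found.
    static int indexOfXmpPacket(byte[] data) {
        outer:
        for (int i = 0; i + MARKER.length <= data.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (data[i + j] != MARKER[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Default file name is the output of the tutorial; another path can be passed as an argument.
        String path = args.length > 0 ? args[0] : "XMP_Metadata_Convert.pdf";
        byte[] pdf = Files.readAllBytes(Paths.get(path));
        int offset = indexOfXmpPacket(pdf);
        System.out.println(offset >= 0
                ? "XMP packet found at byte offset " + offset
                : "No uncompressed XMP packet found");
    }
}
```

A result of -1 only means no plain-text packet header was found; a compressed or encrypted metadata stream would also produce it, so treat this as a rough check rather than proof.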
I noticed that the PDF document with embedded XMP metadata is considerably larger than the original document that used only the info dictionary. I will post an update on this analysis in the next post.
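To put a number on that size difference for your own files, a minimal sketch like the following compares the two file sizes using only the standard library. The class name PdfSizeCompare and the method sizeDelta are hypothetical, and the file names are the ones used in this tutorial.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of iText.
class PdfSizeCompare {

    // Returns how many bytes larger (positive) or smaller (negative) 'after' is than 'before'.
    static long sizeDelta(Path before, Path after) throws IOException {
        return Files.size(after) - Files.size(before);
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the tutorial above.
        Path input = Paths.get("sample.pdf");
        Path output = Paths.get("XMP_Metadata_Convert.pdf");
        System.out.println("Input size:  " + Files.size(input) + " bytes");
        System.out.println("Output size: " + Files.size(output) + " bytes");
        System.out.println("Difference:  " + sizeDelta(input, output) + " bytes");
    }
}
```

Keep in mind that PdfStamper rewrites the whole file, so the difference reflects more than the XMP stream alone and is only a rough indication.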

3 comments:

  3. Hello,
    Is it possible to load the contents of an external XMP file into a PDF using iText? The goal is to have the metadata coming from the XMP file embedded in the PDF (PDF/A1-A format).
    Kind regards,
    Maarten