Class PDFStreamEngine

java.lang.Object
org.apache.pdfbox.contentstream.PDFStreamEngine
Direct Known Subclasses:
PDFGraphicsStreamEngine, PDFMarkedContentExtractor, PDFTextStripper

public abstract class PDFStreamEngine extends Object
Processes a PDF content stream and executes the operators it contains. Provides a callback interface for clients that want to react to the contents of the stream.
Author:
Ben Litchfield
  • Constructor Details

    • PDFStreamEngine

      protected PDFStreamEngine()
      Creates a new PDFStreamEngine.
  • Method Details

    • registerOperatorProcessor

      @Deprecated public void registerOperatorProcessor(String operator, OperatorProcessor op)
      Deprecated. Use addOperator(OperatorProcessor) instead.
      Register a custom operator processor with the engine.
      Parameters:
      operator - The operator as a string.
      op - Processor instance.
    • addOperator

      public final void addOperator(OperatorProcessor op)
      Adds an operator processor to the engine.
      Parameters:
      op - operator processor
    • processPage

      public void processPage(PDPage page) throws IOException
      Initializes the engine and processes the content stream of the given page.
      Parameters:
      page - the page to process
      Throws:
      IOException - if there is an error accessing the stream
    • showTransparencyGroup

      public void showTransparencyGroup(PDTransparencyGroup form) throws IOException
      Shows a transparency group from the content stream.
      Parameters:
      form - transparency group (form) XObject
      Throws:
      IOException - if the transparency group cannot be processed
    • showForm

      public void showForm(PDFormXObject form) throws IOException
      Shows a form from the content stream.
      Parameters:
      form - form XObject
      Throws:
      IOException - if the form cannot be processed
    • processSoftMask

      protected void processSoftMask(PDTransparencyGroup group) throws IOException
      Processes a soft mask transparency group stream.
      Parameters:
      group - the transparency group.
      Throws:
      IOException - if the soft mask transparency group cannot be processed
    • processTransparencyGroup

      protected void processTransparencyGroup(PDTransparencyGroup group) throws IOException
      Processes a transparency group stream.
      Parameters:
      group - the transparency group.
      Throws:
      IOException - if the transparency group cannot be processed
    • processType3Stream

      protected void processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) throws IOException
      Processes a Type 3 character stream.
      Parameters:
      charProc - Type 3 character procedure
      textRenderingMatrix - the Text Rendering Matrix
      Throws:
      IOException - if there is an error reading or parsing the character content stream.
    • processAnnotation

      protected void processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) throws IOException
      Process the given annotation with the specified appearance stream.
      Parameters:
      annotation - The annotation containing the appearance stream to process.
      appearance - The appearance stream to process.
      Throws:
      IOException - If there is an error reading or parsing the appearance content stream.
    • processTilingPattern

      protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) throws IOException
      Process the given tiling pattern.
      Parameters:
      tilingPattern - the tiling pattern
      color - color to use, if this is an uncoloured pattern, otherwise null.
      colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
      Throws:
      IOException - if there is an error reading or parsing the tiling pattern content stream.
    • processTilingPattern

      protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) throws IOException
      Process the given tiling pattern. Allows the pattern matrix to be overridden for custom rendering.
      Parameters:
      tilingPattern - the tiling pattern
      color - color to use, if this is an uncoloured pattern, otherwise null.
      colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
      patternMatrix - the pattern matrix, may be overridden for custom rendering.
      Throws:
      IOException - if there is an error reading or parsing the tiling pattern content stream.
    • showAnnotation

      public void showAnnotation(PDAnnotation annotation) throws IOException
      Shows the given annotation.
      Parameters:
      annotation - An annotation on the current page.
      Throws:
      IOException - If an error occurred reading the annotation
    • getAppearance

      public PDAppearanceStream getAppearance(PDAnnotation annotation)
      Returns the appearance stream to process for the given annotation. May be used to render a specific appearance such as "hover".
      Parameters:
      annotation - The current annotation.
      Returns:
      The stream to process.
    • processChildStream

      protected void processChildStream(PDContentStream contentStream, PDPage page) throws IOException
      Process a child stream of the given page. Cannot be used with processPage(PDPage).
      Parameters:
      contentStream - the child content stream
      page - the current page
      Throws:
      IOException - if there is an exception while processing the stream
    • beginText

      public void beginText() throws IOException
      Called when the BT operator is encountered. This method is intended for overriding in subclasses; the default implementation does nothing.
      Throws:
      IOException - if there was an error processing the text
    • endText

      public void endText() throws IOException
      Called when the ET operator is encountered. This method is intended for overriding in subclasses; the default implementation does nothing.
      Throws:
      IOException - if there was an error processing the text
    • showTextString

      public void showTextString(byte[] string) throws IOException
      Called when a string of text is to be shown.
      Parameters:
      string - the encoded text
      Throws:
      IOException - if there was an error showing the text
    • showTextStrings

      public void showTextStrings(COSArray array) throws IOException
      Called when a string of text with spacing adjustments is to be shown.
      Parameters:
      array - array of encoded text strings and adjustments
      Throws:
      IOException - if there was an error showing the text
    • applyTextAdjustment

      protected void applyTextAdjustment(float tx, float ty) throws IOException
      Applies a text position adjustment from the TJ operator. May be overridden in subclasses.
      Parameters:
      tx - x-translation
      ty - y-translation
      Throws:
      IOException - if something went wrong
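      The relationship between showTextStrings, showText and applyTextAdjustment can be sketched without PDFBox. The class below is a stand-alone illustration, not the library's implementation: TjDemo, its events list, and the use of a plain List<Object> in place of a COSArray are assumptions made for this sketch. It covers horizontal writing only, where the PDF specification defines the adjustment as tx = -(number / 1000) * fontSize * horizontalScaling.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone sketch of TJ array handling; not PDFBox code. */
class TjDemo {
    /** Records the callbacks in the order they fire (hypothetical helper). */
    final List<String> events = new ArrayList<>();

    /**
     * Walks a TJ-style array: byte strings are shown as text, numbers
     * become a position adjustment in text space (horizontal writing only).
     */
    void showTextStrings(List<Object> array, float fontSize, float horizontalScaling) {
        for (Object element : array) {
            if (element instanceof Number) {
                float tj = ((Number) element).floatValue();
                // Numbers are in thousandths of a unit of text space and
                // move the position backwards, hence the negation.
                float tx = -tj / 1000f * fontSize * horizontalScaling;
                applyTextAdjustment(tx, 0f);
            } else if (element instanceof byte[]) {
                showText((byte[]) element);
            }
        }
    }

    /** Hook for subclasses; this sketch only records the call. */
    void applyTextAdjustment(float tx, float ty) {
        events.add("adjust " + tx);
    }

    /** Hook for subclasses; this sketch only records the string length. */
    void showText(byte[] string) {
        events.add("text " + string.length);
    }
}
```

      With a 12 pt font and no horizontal scaling, an array element of -250 moves the text position forward by 3 units before the next string is shown.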
    • showText

      protected void showText(byte[] string) throws IOException
      Process text from the PDF Stream. You should override this method if you want to perform an action when encoded text is being processed.
      Parameters:
      string - the encoded text
      Throws:
      IOException - if there is an error processing the string
    • showGlyph

      protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      unicode - the Unicode text for this glyph, or null if the PDF does not provide it
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • showGlyph

      protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • showFontGlyph

      protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      unicode - the Unicode text for this glyph, or null if the PDF does not provide it
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • showFontGlyph

      protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • showType3Glyph

      protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      unicode - the Unicode text for this glyph, or null if the PDF does not provide it
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • showType3Glyph

      protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) throws IOException
      Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
      Parameters:
      textRenderingMatrix - the current text rendering matrix, Trm
      font - the current font
      code - internal PDF character code for the glyph
      displacement - the displacement (i.e. advance) of the glyph in text space
      Throws:
      IOException - if the glyph cannot be processed
    • beginMarkedContentSequence

      public void beginMarkedContentSequence(COSName tag, COSDictionary properties)
      Called when a marked content group begins.
      Parameters:
      tag - indicates the role or significance of the sequence
      properties - optional properties
    • endMarkedContentSequence

      public void endMarkedContentSequence()
      Called when a marked content group ends.
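      Because marked content groups can nest, a subclass that overrides these two callbacks usually needs to remember which groups are open. The class below is a stand-alone sketch of that bookkeeping, not PDFBox code: MarkedContentTracker, currentTag and depth are hypothetical names, and a plain String stands in for the COSName tag.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Stand-alone sketch of marked content nesting; not PDFBox code. */
class MarkedContentTracker {
    /** Tags of the groups that are currently open, innermost first. */
    private final Deque<String> open = new ArrayDeque<>();

    /** Mirrors the begin callback: remember the newly opened group. */
    void beginMarkedContentSequence(String tag) {
        open.push(tag);
    }

    /** Mirrors the end callback: close the innermost group, if any. */
    void endMarkedContentSequence() {
        if (!open.isEmpty()) {
            open.pop(); // tolerate unbalanced end markers in damaged files
        }
    }

    /** Returns the innermost open tag, or null outside any group. */
    String currentTag() {
        return open.peek();
    }

    /** Returns how many groups are currently open. */
    int depth() {
        return open.size();
    }
}
```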
    • processOperator

      public void processOperator(String operation, List<COSBase> arguments) throws IOException
      Handles an operation from the content stream, given the operator name.
      Parameters:
      operation - The operation to perform.
      arguments - The list of arguments.
      Throws:
      IOException - If there is an error processing the operation.
    • processOperator

      protected void processOperator(Operator operator, List<COSBase> operands) throws IOException
      Handles an operation from the content stream, given the parsed operator.
      Parameters:
      operator - The operation to perform.
      operands - The list of arguments.
      Throws:
      IOException - If there is an error processing the operation.
    • unsupportedOperator

      protected void unsupportedOperator(Operator operator, List<COSBase> operands) throws IOException
      Called when an unsupported operator is encountered.
      Parameters:
      operator - The unknown operator.
      operands - The list of operands.
      Throws:
      IOException - if something went wrong
    • operatorException

      protected void operatorException(Operator operator, List<COSBase> operands, IOException e) throws IOException
      Called when an exception is thrown by an operator.
      Parameters:
      operator - The operator that threw the exception.
      operands - The list of operands.
      e - the thrown exception.
      Throws:
      IOException - if something went wrong
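      The way addOperator, processOperator, unsupportedOperator and operatorException fit together can be illustrated with a small dispatch table. This is a stand-alone sketch under stated assumptions, not the library's implementation: MiniEngine, its Handler interface and the log list are hypothetical, and a List<Object> stands in for List<COSBase>.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone sketch of operator dispatch; not PDFBox code. */
class MiniEngine {
    /** Hypothetical stand-in for an operator processor. */
    interface Handler {
        String name();
        void process(List<Object> operands) throws IOException;
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    /** Records what happened to each operator (hypothetical helper). */
    final List<String> log = new ArrayList<>();

    /** Registers a handler under the operator name it reports. */
    void addOperator(Handler handler) {
        handlers.put(handler.name(), handler);
    }

    /** Looks up the handler; missing or failing ones go to the hooks. */
    void processOperator(String name, List<Object> operands) throws IOException {
        Handler handler = handlers.get(name);
        if (handler == null) {
            unsupportedOperator(name, operands);
            return;
        }
        try {
            handler.process(operands);
            log.add("ran:" + name);
        } catch (IOException e) {
            operatorException(name, operands, e);
        }
    }

    /** Hook: this sketch records the operator and carries on. */
    void unsupportedOperator(String name, List<Object> operands) throws IOException {
        log.add("unsupported:" + name);
    }

    /** Hook: this sketch swallows the error; a subclass could rethrow. */
    void operatorException(String name, List<Object> operands, IOException e)
            throws IOException {
        log.add("failed:" + name);
    }
}
```

      Routing failures through overridable hooks lets one subclass tolerate damaged content while another aborts on the first error.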
    • saveGraphicsState

      public void saveGraphicsState()
      Pushes the current graphics state to the stack.
    • restoreGraphicsState

      public void restoreGraphicsState()
      Pops the current graphics state from the stack.
    • saveGraphicsStack

      protected final Deque<PDGraphicsState> saveGraphicsStack()
      Saves the entire graphics stack.
      Returns:
      the saved graphics state stack.
    • restoreGraphicsStack

      protected final void restoreGraphicsStack(Deque<PDGraphicsState> snapshot)
      Restores the entire graphics stack.
      Parameters:
      snapshot - the graphics state stack to be restored.
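      The difference between saving a single graphics state and saving the entire stack can be sketched with a plain Deque. The class below is a stand-alone illustration, not PDFBox code: StateStack and its one-field State are hypothetical, and starting the fresh stack from a copy of the current state is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Stand-alone sketch of a graphics state stack; not PDFBox code. */
class StateStack {
    /** Hypothetical, minimal graphics state with a single property. */
    static final class State {
        float lineWidth = 1f;

        State copy() {
            State s = new State();
            s.lineWidth = lineWidth;
            return s;
        }
    }

    private Deque<State> stack = new ArrayDeque<>();

    StateStack() {
        stack.push(new State());
    }

    State current() {
        return stack.peek();
    }

    int size() {
        return stack.size();
    }

    /** Like the q operator: push a copy so later changes can be undone. */
    void save() {
        stack.push(current().copy());
    }

    /** Like the Q operator: drop the current state. */
    void restore() {
        stack.pop();
    }

    /** Hands the whole stack to the caller and starts a fresh one. */
    Deque<State> saveStack() {
        Deque<State> saved = stack;
        stack = new ArrayDeque<>();
        stack.push(saved.peek().copy());
        return saved;
    }

    /** Puts a previously saved stack back in place. */
    void restoreStack(Deque<State> snapshot) {
        stack = snapshot;
    }
}
```

      Swapping out the whole stack is useful around a nested stream: unbalanced save and restore operators inside it cannot disturb the states of the enclosing stream.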
    • getGraphicsStackSize

      public int getGraphicsStackSize()
      Returns:
      the size of the graphics stack.
    • getGraphicsState

      public PDGraphicsState getGraphicsState()
      Returns:
      the current graphics state.
    • getTextLineMatrix

      public Matrix getTextLineMatrix()
      Returns:
      the text line matrix.
    • setTextLineMatrix

      public void setTextLineMatrix(Matrix value)
      Parameters:
      value - The textLineMatrix to set.
    • getTextMatrix

      public Matrix getTextMatrix()
      Returns:
      the text matrix.
    • setTextMatrix

      public void setTextMatrix(Matrix value)
      Parameters:
      value - The textMatrix to set.
    • setLineDashPattern

      public void setLineDashPattern(COSArray array, int phase)
      Parameters:
      array - dash array
      phase - dash phase
    • getResources

      public PDResources getResources()
      Returns:
      the stream's resources. This is mainly to be used by the OperatorProcessor classes.
    • getCurrentPage

      public PDPage getCurrentPage()
      Returns:
      the current page.
    • getInitialMatrix

      public Matrix getInitialMatrix()
      Gets the stream's initial matrix.
      Returns:
      the initial matrix.
    • transformedPoint

      public Point2D.Float transformedPoint(float x, float y)
      Transforms a point using the CTM.
      Parameters:
      x - x-coordinate of the point to be transformed.
      y - y-coordinate of the point to be transformed.
      Returns:
      the transformed point.
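      Transforming a point by the CTM is an ordinary affine transform. The stand-alone sketch below uses java.awt.geom.AffineTransform in place of the engine's current transformation matrix; CtmDemo and the explicit ctm parameter are assumptions made for this illustration, since the real method reads the matrix from the current graphics state.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Stand-alone sketch of a CTM point transform; not PDFBox code. */
class CtmDemo {
    /** Maps (x, y) from user space through the given matrix. */
    static Point2D.Float transformedPoint(AffineTransform ctm, float x, float y) {
        Point2D.Float result = new Point2D.Float();
        ctm.transform(new Point2D.Float(x, y), result);
        return result;
    }
}
```

      For a matrix that scales by 2 and then translates by (10, 20), the point (1, 1) maps to (12, 22).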
    • transformWidth

      protected float transformWidth(float width)
      Transforms a width using the CTM.
      Parameters:
      width - the width value to be transformed.
      Returns:
      the transformed width value.
    • getLevel

      public int getLevel()
      Gets the current level. This can be used to decide whether a recursion has gone too deep and an operation should be skipped to avoid a stack overflow.
      Returns:
      the current level.
    • increaseLevel

      public void increaseLevel()
      Increase the level. Call this before running a potentially recursive operation.
    • decreaseLevel

      public void decreaseLevel()
      Decrease the level. Call this after running a potentially recursive operation. A message is logged if the level drops below 0, which indicates that the calls to increaseLevel() and decreaseLevel() are not balanced. To keep them balanced, decrease the level in a "finally" block.
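      The intended pairing of getLevel, increaseLevel and decreaseLevel can be sketched as a recursion guard. The class below is a stand-alone illustration, not PDFBox code: LevelGuard, processNested and the MAX_LEVEL value of 50 are hypothetical choices for this sketch.

```java
/** Stand-alone sketch of a recursion-depth guard; not PDFBox code. */
class LevelGuard {
    /** Hypothetical limit chosen for this sketch only. */
    static final int MAX_LEVEL = 50;

    private int level = 0;

    int getLevel() {
        return level;
    }

    void increaseLevel() {
        level++;
    }

    void decreaseLevel() {
        level--;
    }

    /**
     * Simulates a stream that refers to a nested stream "nesting" more
     * times and returns how many streams were actually processed.
     */
    int processNested(int nesting) {
        if (getLevel() > MAX_LEVEL) {
            return 0; // too deep: skip instead of risking a stack overflow
        }
        increaseLevel();
        try {
            return 1 + (nesting > 0 ? processNested(nesting - 1) : 0);
        } finally {
            decreaseLevel(); // runs even if processing throws
        }
    }
}
```

      Because the decrement sits in a finally block, the level returns to its starting value whether the nested processing completes, is skipped, or fails.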