CIB pdf toolbox 2 technical documentation

1. Introduction

CIB pdf toolbox has been in development since around the year 2000. In 2017 a second major version was developed, written completely from scratch, into which we put all the expertise gained from long-term development of PDF processing software and modern C++ programming. It proved to be a very fast, reliable and safe PDF processor that is also easy to maintain, modular and suitable for all the PDF use cases we know of.

CIB pdf toolbox 2 also introduced a new interface, which is completely Unicode-aware and always JSON-based for complex structures.

Therefore the interface of CIB pdf toolbox 2 is in places similar to the old one, but not completely backwards compatible.

2. API Documentation

CIB pdf toolbox 2 has a so-called job-based interface. Normally you process a use case by:

Creating a JOB-Handle
Setting several properties on this JOB-Handle to configure the JOB
Executing the job
Optionally getting some output properties to read results of the JOB
Deleting the JOB-Handle
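
The steps above can be sketched as follows. This is a minimal illustration, not the library's own sample code: the handle type is assumed to be an opaque pointer, and the entry points are passed in as a table of function pointers (the struct CibPdf2Api and the function copyPdf are hypothetical names), as one might fill it after loading the library dynamically. Only the function signatures are taken from the sections below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: the real handle type comes from the library header; an opaque
// pointer stands in for it here.
using CibPdfJobHandle = void*;

// Hypothetical table of entry points with the documented signatures.
// On Windows the pointer types would additionally need the stdcall convention.
struct CibPdf2Api {
    uint32_t (*jobCreate)(CibPdfJobHandle*);
    uint32_t (*jobFree)(CibPdfJobHandle*);
    uint32_t (*jobExecute)(CibPdfJobHandle);
    uint32_t (*setPropertyUtf8Simple)(CibPdfJobHandle, const char*, const char*);
};

// Reads `input` and writes it to `output`: create, configure, execute, free.
// Returns 0 on success, otherwise the first error code that occurred.
inline uint32_t copyPdf(const CibPdf2Api& api,
                        const std::string& input,
                        const std::string& output) {
    assert(api.jobCreate && api.jobFree && api.jobExecute &&
           api.setPropertyUtf8Simple);

    CibPdfJobHandle job = nullptr;
    uint32_t rc = api.jobCreate(&job);
    if (rc != 0) return rc;

    rc = api.setPropertyUtf8Simple(job, "InputFilename", input.c_str());
    if (rc == 0) {
        rc = api.setPropertyUtf8Simple(job, "OutputFilename", output.c_str());
    }
    if (rc == 0) {
        rc = api.jobExecute(job);
    }

    // The handle is freed even if one of the earlier steps failed.
    const uint32_t freeRc = api.jobFree(&job);
    return rc != 0 ? rc : freeRc;
}
```

Passing the entry points as a table keeps the sketch independent of how the library is linked; with a linked import library the same sequence would call the functions directly.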

The core functions exposed by the library are described below.
For higher-level use there are convenience wrapper functions, which are described in the Testset Documentation section.

On Windows all interface functions follow the stdcall calling convention.

2.1. CibPdf2JobCreate

uint32_t CibPdf2JobCreate(CibPdfJobHandle* job);

By this method a new job handle will be created.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle*	Job	Handle of the job

The function returns 0 on success, otherwise an error code.

2.2. CibPdf2JobFree

uint32_t CibPdf2JobFree(CibPdfJobHandle* job);

This function frees a job created by CibPdf2JobCreate. All resources bound to
the job are freed as well.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle*	Job	Handle of the job

The function returns 0 on success, otherwise an error code.

2.3. CibPdf2JobExecute

uint32_t CibPdf2JobExecute(CibPdfJobHandle job);

By this method a job will be executed. The job handle must be created by CibPdf2JobCreate
before calling this function.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job

The function returns 0 on success, otherwise an error code.

2.4. CibPdf2JobSetPropertyW

uint32_t CibPdf2JobSetPropertyW(CibPdfJobHandle job, const wchar_t* name, const wchar_t* value, size_t length);

By this function you can configure a job with properties.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const wchar_t*	name	Name of the job-property (UNICODE)
const wchar_t*	value	Value of the job-property (UNICODE)
size_t	length	Length of value

The function returns 0 on success, otherwise an error code.

2.5. CibPdf2JobSetPropertyUtf8

uint32_t CibPdf2JobSetPropertyUtf8(CibPdfJobHandle job, const char* name, const char* value, size_t length);

By this function you can configure a job with properties.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const char*	name	Name of the job-property (in UTF-8)
const char*	value	Value of the job-property (in UTF-8)
size_t	length	Length of value

The function returns 0 on success, otherwise an error code.

2.6. CibPdf2JobSetPropertyWSimple

uint32_t CibPdf2JobSetPropertyWSimple(CibPdfJobHandle job, const wchar_t* name, const wchar_t* value);

Same as CibPdf2JobSetPropertyW, but the length of value is determined by a 0 terminator.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const wchar_t*	name	Name of the job-property (UNICODE)
const wchar_t*	value	Value of the job-property (UNICODE)

The function returns 0 on success, otherwise an error code.

2.7. CibPdf2JobSetPropertyUtf8Simple

uint32_t CibPdf2JobSetPropertyUtf8Simple(CibPdfJobHandle job, const char* name, const char* value);

Same as CibPdf2JobSetPropertyUtf8, but the length of value is determined by a 0 terminator.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const char*	name	Name of the job-property (in UTF-8)
const char*	value	Value of the job-property (in UTF-8)

The function returns 0 on success, otherwise an error code.

2.8. CibPdf2JobGetPropertyW

uint32_t CibPdf2JobGetPropertyW(CibPdfJobHandle job, const wchar_t* name, wchar_t* value, size_t* maxLength);

This function retrieves a property after the job has been executed. The buffer passed in
value must be large enough to hold the data, otherwise an error is returned.
After the call, maxLength always contains the required length.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const wchar_t*	name	Name of the job-property (UNICODE)
wchar_t*	value	Value of the job-property (UNICODE)
size_t*	maxLength	As Input the provided length of the value buffer. After executing it contains the length of the buffer, which was really used.

The function returns 0 on success, otherwise an error code.

2.9. CibPdf2JobGetPropertyUtf8

uint32_t CibPdf2JobGetPropertyUtf8(CibPdfJobHandle job, const char* name, char* value, size_t* maxLength);

This function retrieves a property after the job has been executed. The buffer passed in
value must be large enough to hold the data, otherwise an error is returned.
After the call, maxLength always contains the required length.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
const char*	name	Name of the job-property (in UTF-8)
char*	value	Value of the job-property (in UTF-8)
size_t*	maxLength	As Input the provided length of the value buffer. After executing it contains the length of the buffer, which was really used.

The function returns 0 on success, otherwise an error code.
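
Because maxLength reports the required length even when the buffer was too small, a caller can retry once with a larger buffer. The helper below is a sketch of that pattern under stated assumptions: readUtf8 and Utf8Getter are hypothetical names, the getter stands for CibPdf2JobGetPropertyUtf8 with job and name already bound, and since the text does not say whether the reported length includes the 0 terminator, the result is cut at the first 0 byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Same shape as CibPdf2JobGetPropertyUtf8 with job and name already bound:
// (buffer, in/out length) -> error code, 0 meaning success.
using Utf8Getter = std::function<uint32_t(char*, size_t*)>;

// Returns true and fills `out` on success. Retries once with the length the
// first call reported when the initial buffer was too small.
inline bool readUtf8(const Utf8Getter& getter, std::string& out,
                     size_t initialSize = 256) {
    assert(initialSize > 0);
    std::string buffer(initialSize, '\0');
    size_t length = buffer.size();

    uint32_t rc = getter(&buffer[0], &length);
    if (rc != 0 && length > buffer.size()) {
        buffer.assign(length, '\0');          // grow to the reported length
        rc = getter(&buffer[0], &length);
    }
    if (rc != 0) return false;

    // Assumption: the reported length may or may not count the terminator,
    // so clamp it and cut the result at the first 0 byte.
    if (length > buffer.size()) length = buffer.size();
    buffer.resize(length);
    const size_t nul = buffer.find('\0');
    if (nul != std::string::npos) buffer.resize(nul);

    out = buffer;
    return true;
}
```

With the real library, the getter could be a lambda that forwards to CibPdf2JobGetPropertyUtf8 with a captured job handle and property name.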

2.10. CibPdf2GetVersion

uint32_t CibPdf2GetVersion();

Returns the version of the CIB pdf toolbox 2 as uint32_t representation.

2.11. CibPdf2GetVersionText

uint32_t CibPdf2GetVersionText(char* value, size_t maxLength);

Retrieves the version of the CIB pdf toolbox 2 as a text string.

Parameter:

Type	Variable	Meaning
char*	value	Buffer to which the version string is written
size_t	maxLength	Length of the provided buffer

The function returns 0 on success, otherwise an error code.

2.12. CibPdf2JobGetErrorUtf8

uint32_t CibPdf2JobGetErrorUtf8(CibPdfJobHandle job, char* value, size_t* maxLength);

By this function you can retrieve a text description of the most recent error, which occurred in
a job.

Parameter:

Type	Variable	Meaning
CibPdfJobHandle	Job	Handle of the job
char*	value	Value of the error (in UTF-8)
size_t*	maxLength	As Input the provided length of the value buffer. After executing it contains the length of the buffer, which was really used.

The function returns 0 on success, otherwise an error code.

3. Testset Documentation

The CIB pdf toolbox 2 PDF package can be used through high-level functions, which are
wrappers around those described in the API Documentation section.

A testset, where several examples of these functions can be found, is provided along with the
package for testing purposes.

For processing anything inside the library you need to create a job. To configure the job you
need to set properties. And to execute the job you need to call execute on it. Afterwards you
can get information about what was done inside the job by retrieving properties.

The library is thread-safe: you can call it from different threads with different jobs, but each
job should be used from only one thread.

3.1. Creating/Freeing a Job

As simple as declaring a variable. Depending on the library preferred, it can be created as the two options shown below:

CibPdfToolbox2Job job;

The job is freed when its destructor runs, as usual.

3.2. Setting Property

Properties and their possible values are described in the Usecases section. To set a property:

job.setProperty(CibProperty::property, value);

where value is a wstring.

3.3. Setting Subproperty

job.setSubProperty(CibProperty::property, CibProperty::subproperty, value);

Note that in the Usecases section, subproperties are written as Property.Subproperty.

3.4. Executing a Job

job.execute();

3.5. Getting Property

Properties can be retrieved after a job is executed.

job.getPropertyAsStringW(CibProperty::property);

3.6. Getting Last Error

To retrieve a text description of the most recent error that occurred in a job:

job.getErrorText();

4. Usecases

4.1. Setting License

Setting license information

Property	Meaning	Type
LicenseCompany	Setting the company of the license.	Set
LicenseKey	Setting the key of the license.	Set
LicenseFile	If a LicenseFile is provided, it should be copied into the c++/bin64 or c++/bin32 folder along with the binaries. LicenseCompany and LicenseKey are not to be changed.	Set

4.2. Loading and Merging PDF-Documents

Inside a JOB you can load PDF-documents in several ways.

Property	Meaning	Type
InputFilename	Input files as a ;-separated list of filenames. If a filename contains a ;, use \ as the escape character. Before each filename you can specify which pages of the PDF document should be used. Example: InputFilename={1-3};filename1;{Odd};filename2;#;12345678#664 The result is a merged PDF with pages 1-3 of filename1, all odd pages of filename2 and all pages of filename3, which is located at memory address 12345678 and has a length of 664 bytes. Syntax: InputFilename ::= <filenameorpages> [";" <filenameorpages>]... filenameorpages ::= [<page description> ";"] (<filename> | <memory blocks> | <memory delimiter>) page description ::= "{" <pages> ["," <pages>]... "}" pages ::= "All" | "Even" | "Odd" | "First" | "Last" | "NoFirst" | "NoLast" | <number> | (<start number> "-" <end number>) memory blocks ::= <memory block> [<memory delimiter> <memory block>]... memory block ::= <address> <memory delimiter> <length> Note: if InputFilename contains a page description for page selection, all following operations work on the page count and page selection of the document resulting from the merge. Note: A special case for filename is "EMPTY:Width,Height": it generates an empty single-page PDF with the specified page width and height in mm. Example: InputFilename=mydoc.pdf;EMPTY:210,297 In this case, the document mydoc.pdf is merged with a single-page empty PDF.	Set
InputMemoryAddress	Contains pairs of memory address and length of the input documents, separated by ;.	Set
InputFilenamePattern	Contains a pattern of input files, e.g. *.pdf. All PDFs with this pattern will be merged together.	Set
WorkSpace	One or more workspace directories, separated by ;, in which input files are searched for.	Set
MergePdfAConform	A flag indicating that, if several input documents are merged into one, the merge is performed in conformance with the PDF/A specification. This means that if all input documents conform to PDF/A, the output document also conforms to PDF/A.	Set
OutlinesDeleteExisting	A flag that indicates, that PDF-outlines should be deleted.	Set
PartialDocumentAlignment	Every partial document is expanded to an even pagecount. See also InsertEmptyPageOmitLast.	Set
InsertEmptyPageOmitLast	Configures the PartialDocumentAlignment behaviour so that no optional empty page is appended to the last document.	Set
FormfieldNamePrefix	A ;-separated string that defines a form field prefix for each input document in a merge.	Set
GenericFormfieldNamePrefix	A flag indicating that all input documents should get an automatically generated form field prefix, so that the form fields of the input documents are always unique and are not combined in a PDF merge.	Set
PartialDocumentAlignment	Adds an empty page between merged documents, if the input documents have an uneven amount of pages.	Set
JoinHistory	Sets a marker inside the PDF at the point where each document that took part in the join ends.	Set
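
The InputFilename value described in the table above can be assembled programmatically. The sketch below is illustrative only: InputEntry, escapeFilename and buildInputFilename are hypothetical helpers, and it assumes the escape rule means placing a backslash directly before each ; inside a filename. Whether a literal backslash in a filename needs escaping itself is not stated, so it is left unchanged. Memory blocks are not covered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One input document: an optional page description (without braces,
// e.g. "1-3" or "Odd"; empty means no page selection) and a filename.
struct InputEntry {
    std::string pages;
    std::string filename;
};

// Assumption: a ';' inside a filename is escaped by a preceding backslash.
inline std::string escapeFilename(const std::string& filename) {
    std::string result;
    for (char c : filename) {
        if (c == ';') result += '\\';
        result += c;
    }
    return result;
}

// Joins all entries into a single InputFilename property value.
inline std::string buildInputFilename(const std::vector<InputEntry>& entries) {
    std::string value;
    for (const InputEntry& entry : entries) {
        assert(!entry.filename.empty());
        if (!value.empty()) value += ';';
        if (!entry.pages.empty()) value += "{" + entry.pages + "};";
        value += escapeFilename(entry.filename);
    }
    return value;
}
```

The resulting string would then be passed as the value of the InputFilename property.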

4.3. Creating PDF-Documents from Image formats

The InputFilename property accepts not only PDF documents but also images of type JPEG, BMP, TIFF, JPEG2000 and PNG. These image formats are automatically converted to a PDF document.

Property	Meaning	Type
InputImageScaling	Specifies the dimensions of the pages on which the images are placed, as Width;Height in pt or mm, or as a resolution in DPI. Example: InputImageScaling=100pt;200pt Width of 100pt and height of 200pt. InputImageScaling=100mm;200mm Width of 100mm and height of 200mm. InputImageScaling=;200pt Arbitrary width and fixed height of 200pt. InputImageScaling=100dpi The resolution of the image will be 100 dpi; height and width are calculated dynamically from it. Instead of setting a DPI or fixed size, you can also select a predefined size, such as the ISO page formats A1-A9.	Set
InputImageMargin	Sets a margin in pt to be used for the images. Default is 0. Example: InputImageMargin=10pt Sets all margins to 10pt. InputImageMargin=10pt;20pt Sets the left and right margins to 10pt, top and bottom to 20pt. InputImageMargin=10pt;20pt;30pt;40pt Sets the left margin to 10pt, top margin to 20pt, right margin to 30pt and bottom margin to 40pt.	Set
InputImageBackgroundColor	Defines the background color of the page. Example: InputImageBackgroundColor=#FF0000 Sets red as background color.	Set
InputImageScalingMax	Defining a maximum size of a created image in pt	Set
InputImageEmbedStyle	Either embed or fit. Fit means that the image spans the whole page. Embed means that the image is embedded into a page with a fixed page size, such as A4.	Set

4.4. Content Modify (Adding text / shapes / Barcodes / Images)

You can add text to the document, modify text, add shapes as images and more:

Property	Meaning	Type
TextOverlay	TextOverlay is a JSON array of JSON objects describing new text strings added to the PDF. The keys in each JSON object can be: Degree: rotation from 0 - 360 degrees. FontColor: the color in HEX. FontName: name of the font. Please also set the property FontWorkSpace if you want to set a directory in which to look up the font. FontSize: font size. FontStyle: which style of the font should be used: normal, bold, italic or bolditalic. Opacity: 0 is transparent, 100 is fully visible. PageSelection: a page selection on which the text should be added. By default it is added on all pages. Position: where the text should be added. You can specify this as an absolute position or as a predefined location. When you specify an absolute position, you need to supply X and Y from the top left corner, e.g. Position: "10;20". This is an absolute position in PDF points (which can be positioned relatively, see PageWidth and PageHeight). If you want to use mm, set e.g. Position: "10mm;20mm". If you specify a negative Y coordinate, the position is anchored at the bottom border instead of the top. If you specify a negative X coordinate, the position is anchored at the right page border. The predefined locations are: TopLeft, TopCenter, TopRight, MiddleLeft, MiddleCenter, MiddleRight, BottomLeft, BottomCenter, BottomRight. PageWidth: a page width to which the coordinates of an absolutely positioned rectangle refer. If not specified, the page width in points is used. PageHeight: a page height to which the coordinates of an absolutely positioned rectangle refer. If not specified, the page height in points is used. Text: the text to add. It can contain newlines, which are interpreted as new text lines. The text can contain magic keywords that are replaced: "<NumPages>": the number of pages in the document. "<Page>": the current page number. "<DateTime:format>": a date-time string defined by a format string with the following fields: yyyy: year (e.g. 2015); yy: two-digit year (e.g. 15); MMMM: full month name (e.g. December); MMM: abbreviated month name (e.g. Dec); MM: month number with leading zero (e.g. 04); dddd: full name of the day (Monday, Tuesday, etc.); ddd: abbreviated name of the day (Mon, Tues, Wed, etc.); dd: day of the month as a number from 01 through 31; HH: 24-hour clock hour with a leading zero (e.g. 22); mm: minutes with a leading zero; ss: seconds with a leading zero. Type: type of shape, e.g. Rectangle. ZLevel: the z-level at which the text is added. -1 means in the background, 1 in the foreground. Example: TextOverlay=[{"FontColor": "#FF0000", "FontName": "Arial", "Degree": -45, "Text": "Demo", "FontSize": 100, "ZLevel": 1, "Opacity": 30, "Position": "MiddleCenter", "FontStyle": "Normal"}] Creates a new text with content "Demo" above the existing PDF content with font color red, font Arial, font size 100 and normal font style. The position is the middle of the page and the text is rotated by -45 degrees.	Set
TextReplace	Replaces existing text in a PDF. The existing font attributes are used for the new text. If the new text contains glyphs that the old font does not provide, a new font is embedded. You can either specify a region (<SpecifiedRegion>) in which text should be replaced, or give text strings (<TextSearch>) that should be replaced. The value is a JSON array of either <SpecifiedRegion> or <TextSearch> entries. A <SpecifiedRegion> is a JSON object with the keys: PageWidth (optional): a page width as JSON number to which the coordinates of an absolutely positioned rectangle refer. If not specified, the page width in points is used. PageHeight (optional): a page height as JSON number to which the coordinates of an absolutely positioned rectangle refer. If not specified, the page height in points is used. PageIndex: a JSON number for the page index. Default is 0. Replace: a JSON array of regions as JSON objects in which text should be replaced on this page. To define the region in which the text is replaced, you need to specify either a fixed region by X, Y, Width, Height or a dynamic region found by specifying SearchText. Each of those JSON objects has: Text: the new text. SearchText: a text that is searched for and replaced. SearchHitIndexes: defines which hits of the search defined by SearchText should be replaced. Please specify a ;-separated list of indexes. Default is all hits. X: X coordinate from top left as JSON string in pt, mm, or cm. Y: Y coordinate from top left as JSON string in pt, mm, or cm. Width: width as JSON string in pt, mm, or cm. Height: height as JSON string in pt, mm, or cm. Fixed: a JSON boolean that defines whether the text to the right of the replaced text should be moved to fit the new text width or not. Default: true. A <TextSearch> is a JSON object with the keys: Text: the new text. SearchText: the old text. PageSelection: a page selection, default all. Fixed: a JSON boolean that defines whether the text to the right of the replaced text should be moved to fit the new text width or not. Default: true.	Set
DrawShape	DrawShape is a JSON array of JSON objects describing new shapes added to the PDF. The keys in each JSON object can be: PageSelection: the pages on which this shape should be added. Color: the color in HEX. Filename: source of an image. Format: format of a barcode; supported are DataMatrix, QR, Aztec, PDF417, Code39, Code93, Code128, ITF, Codabar. Position: position of the top left corner, in mm or pt. For the type Line, an array of positions separated by semicolons, e.g.: "Position": "19mm;27mm;11mm;30mm". Width: width of the shape in mm or pt. Height: height of the shape in mm or pt. LineCap: line cap style (0-2). LineJoin: line join style (0-2). BorderThickness: thickness of the border, if the shape has a border. Rotation: rotation of an image or barcode. Supported values are 90, 180, 270. Text: content of a barcode in UTF-8. ITF and Codabar accept only the text allowed for their types. Type: type of shape, e.g. Rectangle, Line, Image, Barcode. Example: DrawShape=[{"PageSelection": "1", "Color": "#DDFF0000", "Position": "19mm;27mm", "Width": "26mm", "Height": "6mm", "Type": "Rectangle", "BorderThickness": "3pt"}] Creates a new rectangle shape with a border width of 3 pt, color red and transparency #DD on page 1 at position 19 mm, 27 mm (from top left), with a width of 26 mm and a height of 6 mm. DrawShape=[{'PageSelection': '1', 'Type': 'Barcode', 'Format': 'DataMatrix', 'Text': 'Example Barcode', 'Position': '10mm;30mm','Width': '20mm', 'Height': '20mm'}] Creates a new barcode of type DataMatrix on page 1.	Set
AnnotationAdd	AnnotationAdd is a JSON array of JSON objects describing new Rich Media annotations added to the PDF. The keys in each JSON object can be: - PageIndex: the page on which this Rich Media annotation should be added. - Filename: path of the video file to be embedded. - ThumbnailPath: thumbnail of the Rich Media annotation. - Position: position of the top left corner, in mm or pt, separated by a semicolon. - Width: width of the annotation in points or mm. - Height: height of the annotation in points or mm. - ActivationCondition: PV or XA. - DeactivationCondition: PI or XD, according to the PDF specification. Example: [ {"PageIndex":5, "Filename":"D:\\\\media1.mp4", "ThumbnailPath":"D:\\\\image46.png", "Position":"508.4622pt;32.85032pt", "Width":"198.0682pt", "Height":"356.5226pt", "ActivationCondition":"PV", "DeactivationCondition":"PI"}]	Set

4.5. Rendering PDF documents

You can render a PDF document to one or more image types at the same time.

Property	Meaning	Type
Render	“1” activates rendering of the document.	Set
Render.ImageType	A ;-separated list of image output types. Currently Bmp, Jpg, Png and Tiff are supported.	Set
Render.ImageScaling	Setting an image dpi for rendering. Default is 150dpi.	Set
JpegQuality	Sets the JPEG quality, if images are saved as Jpeg or Jpeg2000. Default is 80.	Set
PngCompressLevel	Setting the PNG compression level (1-10) for png images	Set
Render.Bounds	Setting bounds of which part of the page should be rendered. Bounds is a tuple of 4 numbers, separated by a semicolon. It is X;Y;width;height. The numbers can be set in point, mm or cm, depending on the suffix (pt, cm and mm). The anchor of the coordinates is top left. By default the whole page is rendered and this can be kept unset.	Set
Render.DPIAware	Render.DPIAware=0 : No dpi information is written into the images. (Default) Render.DPIAware=1 : Use internal calculated dpi for the images Render.DPIAware=150 : Predefine DPI from outside. Using in this case 150.	Set
Render.MaxSize	Sets a MaxWidth and MaxHeight in pixels that the rendered image should not exceed. Example: "100;100" => No image with a width or height larger than 100 px is rendered. This is independent of the DPI set by ImageScaling. Default is "0;0", which disables this check.	Set
RenderOutputFilenamePattern	Defines a pattern for the output filenames. This is a sprintf-style pattern, which should include a placeholder for the number of the rendered page and optionally one for a string, which will be replaced by the original filename.	Set
Render.OutputType	Set "memory" or "file" to have the output saved to memory or to a file. "file" is the default.	Set
RenderPageSelection	Defining a page selection, which pages should be rendered.	Set
Render.Formfields	Defines, if the content of formfields should be rendered or not. Default is 1.	Set
Render.AllowFontFallback	Defines whether text that can only be rendered with a fallback font, because the original font is not available, should be rendered or not. Default is 1.	Set
Render.Annotations	Defines, if the content of annotations (other than formfields) should be rendered or not. Default is 1.	Set
Render.Signatures	Defines, if the content of signatures should be rendered or not. Default is 1.	Set
Render.TiffSinglePage	Defines, if tiff images should be generated as single-page images. Default is 0, so that multi-page tiff is generated by default	Set
Render.TextRenderMode	Configures which classes of text are rendered. By default all text is rendered. RenderSymbolic: Only symbolic text is rendered. RenderNonSearchable: Only non-searchable text is rendered. RenderNoText: No digital text is rendered.	Set

4.6. Reading and Writing simple PDF information

You can read and write simple document information through the following properties.

Property	Meaning	Type
CenterWindow	A flag specifying whether to position the document’s window in the center of the screen. Default value: false.	Get/Set
Encrypted	“1” or “0” depending if the document is encrypted	Get
Direction	The predominant reading order for text: L2R: Left to right. R2L: Right to left (including vertical writing systems, such as Chinese, Japanese, and Korean). This entry has no direct effect on the document’s contents or page numbering but may be used to determine the relative positioning of pages when displayed side by side or printed n-up. Default value: L2R.	Get/Set
DisplayDocTitle	A flag specifying whether the window’s title bar should display the document title taken from the Title entry of the document information dictionary. If false, the title bar should instead display the name of the PDF file containing the document.	Get/Set
DocInfo	A JSON description of all available DocInfo.* properties.	Get
DocInfo.*	Setting or reading of custom metadata properties.	Get/Set
DocInfo.Author	Information about the author of the document	Get/Set
DocInfo.CompressProfile	Information about which compression profile was used, if a compression task was executed on the document before.	Get
DocInfo.Creator	Information about the creating software of the PDF document.	Get/Set
DocInfo.CreationDate	Information about the creation date of the PDF document.	Get/Set
DocInfo.Keywords	The Keywords, which are assigned to this PDF document.	Get/Set
DocInfo.ModDate	The last modification date of the PDF document	Get/Set
DocInfo.Producer	Retrieving Information about the producer of the PDF document.	Get/Set
DocInfo.Subject	Information about the subject of the PDF document.	Get/Set
DocInfo.Title	Information about the title of the PDF document.	Get/Set
FitWindow	A flag specifying whether to resize the document’s window to fit the size of the first displayed page. Default value: false.	Get/Set
HasJavascript	“1” or “0” depending if the document contains JavaScript	Get
HideMenubar	A flag specifying whether to hide the conforming reader’s menu bar when the document is active. Default value: false.	Get/Set
HideToolbar	A flag specifying whether to hide the conforming reader’s tool bars when the document is active. Default value: false.	Get/Set
HideWindowUI	A flag specifying whether to hide user interface elements in the document’s window (such as scroll bars and navigation controls), leaving only the document’s contents displayed. Default value: false.	Get/Set
ID	The two ID values of the PDF, separated by a ;.	Get
ImageInfo	Retrieving a description about images inside a PDF in JSON format.	Get
NonFullScreenPageMode	The document’s page mode, specifying how to display the document on exiting full-screen mode: UseNone: Neither document outline nor thumbnail images visible UseOutlines: Document outline visible UseThumbs: Thumbnail images visible UseOC: Optional content group panel visible This entry is meaningful only if the value of the PageMode entry in the Catalog dictionary is FullScreen; it shall be ignored otherwise. Default value: UseNone.	Get/Set
PageCount	The amount of pages the document has.	Get
PageInfo.<num>	The page dimensions of the requested page index <num>, where 0 represents the first page.	Get
PageLayout	A name object specifying the page layout to be used when the document is opened: SinglePage: Display one page at a time. OneColumn: Display the pages in one column. TwoColumnLeft: Display the pages in two columns, with odd-numbered pages on the left. TwoColumnRight: Display the pages in two columns, with odd-numbered pages on the right. TwoPageLeft: (PDF 1.5) Display the pages two at a time, with odd-numbered pages on the left. TwoPageRight: (PDF 1.5) Display the pages two at a time, with odd-numbered pages on the right. Default value: SinglePage.	Get/Set
PageMode	A name object specifying how the document shall be displayed when opened: UseNone: Neither document outline nor thumbnail images visible UseOutlines: Document outline visible UseThumbs: Thumbnail images visible FullScreen: Full-screen mode, with no menu bar, window controls, or any other window visible UseOC: Optional content group panel visible UseAttachments: Attachments panel visible Default value: UseNone.	Get/Set
PdfVersion	The pdf version of the document, can be following strings: 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.7EL3 (pdf version 1.7 with extension level 3) 1.7EL8 (pdf version 1.7 with extension level 8) 2.0	Get/Set
MinPdfVersion	Minimum output PDF version. If the input pdf has a version higher than this value, the existing version is kept, otherwise increased to the set value.	Set
PdfVersionInfo	A JSON description about the PDF version, which includes also information about what compliance versions the PDF complies to, e.g. PDF/A or PDF/UA.	Get

4.7. Writing PDF documents

Writing a document to a file

Property	Meaning	Type
OutputFilename	Filename to which the pdf should be written to.	Set

Writing a document to memory

Property	Meaning	Type
MemoryOutputCallback	Address of a callback function which receives the PDF data. The address has to be written as a string that contains the address as a normal number. The callback function needs to have the signature: int MemoryCallback(const uint8_t* output, size_t length, void* userdata, int error); output contains some data of the PDF file to write. length defines the length of the data. userdata is the object defined by MemoryOutputUserdata. error indicates whether there was an error during execution, otherwise it is 0. The return value needs to be 1 if the callback wants to report success, otherwise 0.	Set
MemoryOutputUserdata	Address of a userdata object which is used inside the callback. The address has to be written as a string that contains the address as a normal number.	Set
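
A callback matching the documented signature might look like the sketch below, which collects the written PDF in a std::vector passed as userdata. The names collectPdf, addressToString and callbackToString are hypothetical. Two points are assumptions rather than documented behaviour: that "normal number" means a decimal integer, and that the callback should report failure when error is non-zero. On Windows the callback may additionally need a specific calling convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Matches the documented signature. Appends each chunk to the
// std::vector<uint8_t> that was registered as userdata.
inline int collectPdf(const uint8_t* output, size_t length,
                      void* userdata, int error) {
    if (error != 0 || userdata == nullptr) return 0;   // report failure
    auto* sink = static_cast<std::vector<uint8_t>*>(userdata);
    if (output != nullptr && length > 0) {
        sink->insert(sink->end(), output, output + length);
    }
    return 1;                                          // report success
}

// Assumption: the address is expected as a decimal number in a string.
inline std::string addressToString(const void* address) {
    return std::to_string(reinterpret_cast<std::uintptr_t>(address));
}

using MemoryCallback = int (*)(const uint8_t*, size_t, void*, int);

// Same encoding for the function pointer itself.
inline std::string callbackToString(MemoryCallback callback) {
    assert(callback != nullptr);
    return std::to_string(reinterpret_cast<std::uintptr_t>(callback));
}
```

The two strings would be set as the values of MemoryOutputCallback and MemoryOutputUserdata before the job is executed; the vector has to outlive the execution.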

Normally a PDF is saved with the best saving options that the PdfVersion of the document allows. This means that object streams and xref streams are used with a PdfVersion greater than or equal to 1.5. This behavior can be overridden by setting the property WritingMode:

Property	Meaning	Type
DontOverwriteProducer	Don’t write the CIB pdfModule default Producer into the Metadata.	Set
WritingMode	ObjectAndXrefStream: use object and xref streams XrefStream: use only xref streams Xref: use only standard xref Best: use the best algorithm for the used PDF version (default)	Set
IncrementalUpdate	0: Deactivated (default). 1: Always activated. Auto: Activated only when signed signature fields already exist in the PDF document. An incremental update to a PDF means that the content of the input document is left unchanged and all changes that were made are written together in an update entry at the end of the document. This means, for example, that signed signatures in an existing PDF document are not broken, but you can still add more content or modify the PDF document. A (good) PDF viewer then shows that the PDF consists of different versions and lets you switch between them.	Set

4.8. Retrieving progress and event information

Property	Meaning	Type
EventCallback	Address of a callback function that receives information about events. The address must be passed as a string that contains the address as a plain number. The callback function must have the signature: int EventCallback(uint32_t eventType, const void* eventData, void* userdata); eventType: identifies the event. Currently only 0 is defined, which signals a progress event. eventData: for eventType=0, a 0-terminated string that contains a JSON description of the current progress. userdata: the object defined by EventCallbackUserdata. The callback must return 1 to report success, otherwise 0.	Set
EventCallbackUserdata	Address of a userdata object that is passed to the callback. The address must be passed as a string that contains the address as a plain number.	Set
Progress	Returns a JSON description of the progress of the currently running job. This property can be read from a thread other than the one in which the execute call is running. The JSON description consists of the keys “AmountSteps” and “CurrentStep”: AmountSteps is the total number of larger internal steps, and CurrentStep is the number of steps already completed.	Get
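
The following C sketch shows how an event callback might read the two progress keys. It is only an illustration under stated assumptions: ProgressState and ReadJsonInt are hypothetical names, the event JSON is assumed to be a flat object that carries the same AmountSteps and CurrentStep keys as the Progress property with integer values, and the tiny key lookup is not a general JSON parser. The callback always returns 1, because this section does not say how the library reacts to a return value of 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userdata object that receives the latest progress values. */
typedef struct {
    long amountSteps;
    long currentStep;
} ProgressState;

/* Reads the integer value of a key from a flat JSON object.
   Returns 1 if the key was found and followed by a number, otherwise 0. */
int ReadJsonInt(const char *json, const char *key, long *value)
{
    assert(json != NULL && key != NULL && value != NULL);
    char pattern[64];
    int n = snprintf(pattern, sizeof pattern, "\"%s\"", key);
    if (n < 0 || (size_t)n >= sizeof pattern) {
        return 0;
    }
    const char *pos = strstr(json, pattern);
    if (pos == NULL) {
        return 0;
    }
    pos = strchr(pos + n, ':');
    if (pos == NULL) {
        return 0;
    }
    char *end = NULL;
    long parsed = strtol(pos + 1, &end, 10);
    if (end == pos + 1) {
        return 0; /* no digits after the colon */
    }
    *value = parsed;
    return 1;
}

/* Callback with the documented signature. Updates the state for progress
   events (eventType 0) and ignores everything else. */
int EventCallback(uint32_t eventType, const void *eventData, void *userdata)
{
    ProgressState *state = (ProgressState *)userdata;
    long amount = 0;
    long current = 0;
    if (eventType == 0 && eventData != NULL && state != NULL
        && ReadJsonInt((const char *)eventData, "AmountSteps", &amount)
        && ReadJsonInt((const char *)eventData, "CurrentStep", &current)) {
        state->amountSteps = amount;
        state->currentStep = current;
    }
    return 1;
}
```

In production code a real JSON parser is the safer choice; the string search is used here only to keep the sketch free of dependencies.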

4.9. Rotating PDF documents

Property	Meaning	Type
PageRotation	A ;-separated string of page selection definitions and degrees. Example: PageRotation={All};90;{3};180 rotates all pages by 90 degrees, except page 3, which is rotated by 180 degrees. Valid degree values are 0, 90, 180 and 270. Valid page selection definitions are described under the property InputFilename.	Set
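
As an illustration, the value of PageRotation can be assembled from a list of selection/degree pairs. The sketch below is hypothetical helper code, not part of the library: the Rotation type and BuildPageRotation are names introduced here, and only the two selection forms from the example above ({All} and a single page number) are used.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pair of a page selection (without braces) and a rotation. */
typedef struct {
    const char *selection; /* e.g. "All" or "3" */
    int degrees;           /* 0, 90, 180 or 270 */
} Rotation;

/* Builds the ;-separated PageRotation value.
   Returns 1 on success, 0 on an invalid degree value or a too small buffer. */
int BuildPageRotation(const Rotation *items, size_t count, char *out, size_t size)
{
    assert(items != NULL && out != NULL && size > 0);
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; ++i) {
        int d = items[i].degrees;
        if (d != 0 && d != 90 && d != 180 && d != 270) {
            return 0;
        }
        int n = snprintf(out + used, size - used, "%s{%s};%d",
                         i == 0 ? "" : ";", items[i].selection, d);
        if (n < 0 || (size_t)n >= size - used) {
            return 0;
        }
        used += (size_t)n;
    }
    return 1;
}
```

Called with the pairs ("All", 90) and ("3", 180), the helper produces the string from the example, {All};90;{3};180.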

4.10. Encryption

Writing a document with encryption

Property	Meaning	Type
OutputOwnerPassword	Setting the owner password. This means, that the document gets encrypted, but unless you also specify a user password, anyone can view the document without a password entry, but the security options apply.	Set
OutputUserPassword	Setting the user password. When this is set, someone who opens the document will need to enter either a user or owner password to open the document.	Set
EncryptEnablePrinting	A flag indicating if a user has the right to print the document.	Get/Set
EncryptEnableClipboard	A flag indicating if a user has the right to copy content to the clipboard.	Get/Set
EncryptEnableForms	A flag indicating if a user has the right to fill out form fields.	Get/Set
EncryptEnableAssembling	A flag indicating if a user has the right to assemble the document (rotate, delete, insert,…).	Get/Set
EncryptEnableNotes	A flag indicating if a user has the right to add or modify text annotations.	Get/Set
EncryptEnableModifying	A flag indicating if a user has the right to do other modifying operations on the document.	Get/Set
EncryptEnableExtract	A flag indicating if a user has the right to extract text and graphics (in support of accessibility to users with disabilities or for other purposes).	Get/Set
PdfVersion	The encryption algorithm is determined by the output PDF version: the best encryption standard supported by that version is always used. For the strongest encryption, always use PDF version 1.7EL8 or 2.0. Both use AES-256 with the fixes to the hashing algorithm that are defined in those two PDF versions. PDF 2.0 deprecated all encryption variants used before.	Get/Set

Opening a document with encryption

If you know whether the provided password is a user or an owner password, specify EncryptUserPassword or EncryptOwnerPassword. Otherwise specify EncryptDocumentPassword; in that case the library checks internally whether the password is a user or an owner password.

Property	Meaning	Type
EncryptUserPassword	The user password to open a document.	Set
EncryptOwnerPassword	The owner password to open a document.	Set
EncryptDocumentPassword	The user or owner password to open a document.	Set
PdfEncryptionOwnerAuthorized	After opening the document, this flag tells you whether you have owner or user access.	Get
EncryptEnable*	Reading the EncryptEnable* properties (see section above) tells you which security rights the document has.	Get

Removing Encryption from an input document

Property	Meaning	Type
RemoveEncryption	A flag indicating that encryption should be removed. You must authorize first to be able to remove encryption. When you write the document afterwards, it will not contain any encryption.	Set

4.11. Signing

Verifying that a certificate password is correct

In addition to all signing parameters of the signing use case, set the following property:

Property	Meaning	Type
SignPdfVerifyCertificatePassword	Only verify that the provided certificate password is valid.	Set

Signing PDF document

Property	Meaning	Type
SignPdf	Activates Signing	Set
CertificateFilename	Sets the filename of the certificate	Set
CertificatePassword	Sets the password of the certificate	Set
SignLocation	Optional: the location where the document is signed.	Set
SignContactInfo	Optional: contact information about the signer, such as a telephone number.	Set
SignPdfFormfield	The name of the signature form field that should be signed. If not supplied, a new invisible signature form field is created automatically.	Set
SignPdfImage	If SignPdfFormfield names an existing visible signature form field, this property specifies the visual appearance of the signed form field by supplying an image.	Set
SignReason	Optional: the reason why the document was signed.	Set

The use of a timestamp server is recommended. You can activate and configure it by these properties:

Property	Meaning	Type
Timestamp	Activates the use of a timestamp server.	Set
TimestampServer	The URI of the timestamp server to use. If omitted, internally configured, freely accessible timestamp servers are contacted.	Set
TimestampServerUsername	Optional: a username needed to access the timestamp server.	Set
TimestampServerPassword	Optional: a password needed to access the timestamp server.	Set
TimestampDontUseFallbackServers	Disables the internal fallback servers if the server provided in TimestampServer cannot be reached.	Set
ProxyHost	The address of the proxy server.	Set
ProxyPort	The port of the proxy server.	Set
ProxyUsername	The optional username for a proxy server.	Set
ProxyPassword	The optional password for a proxy server.	Set

4.12. Handling embedded files

Embedded files can be added, deleted or extracted from PDF documents. Additionally the info about them can be extracted.

Extracting info about embedded files

Property	Meaning	Type
EmbeddedFilesInfo	Returns a description of the included embedded files in JSON format.	Get

Importing embedded files

Property	Meaning	Type
EmbeddedFilesAdd	A JSON array of JSON objects that define which files should be imported. One JSON object can have the following keys: File: input filename; UsedFilename: filename to be used inside the PDF; Description: description of the attachment inside the PDF; Relationship: a PDF/A-3 or PDF 2.0 compliant relationship of the attachment to the PDF document; Mimetype: MIME type of the attachment; Memory: memory input. All keys except File or Memory are optional. When Memory is used, UsedFilename is also mandatory. Standard relationship types are: Source: shall be used if this file specification is the original source material for the associated content; Data: shall be used if this file specification represents information used to derive a visual presentation, such as for a table or a graph; Alternative: shall be used if this file specification is an alternative representation of content, for example audio; Supplement: shall be used if this file specification represents a supplemental representation of the original source or data that may be more easily consumable (e.g., a MathML version of an equation); EncryptedPayload: shall be used if this file specification is an encrypted payload document that should be displayed to the user if the PDF processor has the cryptographic filter needed to decrypt the document; FormData: shall be used if this file specification is the data associated with the AcroForm of the PDF document; Unspecified: shall be used when the relationship is not known or cannot be described using one of the other values.	Set
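
A value for EmbeddedFilesAdd could be assembled as in the following C sketch. The helper names (IsStandardRelationship, EscapeJson, BuildEmbeddedFilesAdd) are hypothetical and not part of the library. The sketch writes a single object with the keys File, UsedFilename and Relationship, escapes only quotes and backslashes, and accepts only the standard relationship types listed above; whether the library accepts other relationship values is not stated here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standard relationship types as listed in the property description. */
static const char *const kRelationships[] = {
    "Source", "Data", "Alternative", "Supplement",
    "EncryptedPayload", "FormData", "Unspecified"
};

/* Returns 1 if name is one of the standard relationship types. */
int IsStandardRelationship(const char *name)
{
    size_t count = sizeof kRelationships / sizeof kRelationships[0];
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(name, kRelationships[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Copies text to out and escapes quotes and backslashes for use in a JSON
   string. Control characters are not handled. Returns 1 on success. */
static int EscapeJson(const char *text, char *out, size_t size)
{
    size_t used = 0;
    if (size == 0) {
        return 0;
    }
    for (; *text != '\0'; ++text) {
        if (*text == '"' || *text == '\\') {
            if (used + 1 >= size) {
                return 0;
            }
            out[used++] = '\\';
        }
        if (used + 1 >= size) {
            return 0;
        }
        out[used++] = *text;
    }
    out[used] = '\0';
    return 1;
}

/* Builds a JSON array with one import object. Returns 1 on success. */
int BuildEmbeddedFilesAdd(const char *file, const char *usedFilename,
                          const char *relationship, char *out, size_t size)
{
    char fileEscaped[1024];
    char nameEscaped[512];
    assert(file != NULL && usedFilename != NULL && relationship != NULL && out != NULL);
    if (!IsStandardRelationship(relationship)
        || !EscapeJson(file, fileEscaped, sizeof fileEscaped)
        || !EscapeJson(usedFilename, nameEscaped, sizeof nameEscaped)) {
        return 0;
    }
    int n = snprintf(out, size,
                     "[{\"File\":\"%s\",\"UsedFilename\":\"%s\",\"Relationship\":\"%s\"}]",
                     fileEscaped, nameEscaped, relationship);
    return n > 0 && (size_t)n < size;
}
```

Escaping matters in particular for Windows paths, whose backslashes would otherwise produce invalid JSON.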

Exporting embedded files

Property	Meaning	Type
EmbeddedFilesExtract	A JSON array of JSON objects that define which files should be exported. One JSON object has the following keys: File: the filename as specified in the PDF; OutputFilename: where the file should be saved.	Set

Deleting embedded files

Property	Meaning	Type
EmbeddedFilesDelete	A JSON array of JSON objects that define which files should be deleted. One JSON object has the following key: File: the filename as specified in the PDF.	Set

4.13. Compression of documents

Property	Meaning	Type
Compress	A flag indicating that compression should be activated.	Set
CompressQuality	Defines a compression profile, which sets many other compression settings. Available profile names are: lossless: uses only lossless image encoding, largest file size; highest: already uses lossy image encoding, but at the highest image quality; higher: higher image quality, larger file size; normal: normal image quality, normal file size; low: low image quality, small file size.	Set
Threads	The number of threads that are allowed to work on compression. The default is the number of available CPUs minus 1. To disable threading, set this property to 0 or 1.	Set
RemovePieceInfo	A flag that enables removing proprietary data of PDF creators from the PDF document. This is the default for all compression profiles.	Set
RemoveThumbs	A flag that enables removing page thumbnail images from the PDF document. This is the default for all compression profiles.	Set
OptimizePages	A flag that enables optimizing the internal page tree of the PDF document. This is the default for all compression profiles.	Set
RemoveAlternateImages	A flag that enables removing alternate images. This is the default for all compression profiles.	Set
RemoveSpiderInfo	A flag that enables removing spider info. This is the default for all compression profiles.	Set
OptimizeContent	A flag that enables optimizing the content streams of all pages and removing unneeded commands. This is the default for all compression profiles.	Set
OptimizeStreams	A flag that enables optimizing streams in general: each stream is checked to see whether resaving it with tuned saving parameters reduces its size losslessly.	Set
Cancel	Compression can take a long time. Set the property Cancel=1 to indicate that the running compression should be canceled and stopped. This property is thread-safe and can be set from different threads.	Set

Instead of Compress and CompressQuality, the re-encoding of the images can also be fine-tuned. This fine-tuning is normally not recommended.

Property	Meaning	Type
ReEncodeImages	A flag indicating that the document images should be re-encoded.	Set
ReEncodeImagesOptions	A JSON object that configures the image re-encoding with the following keys: MRCProfile: <profile>, same profiles as in CompressQuality, which define base rules for re-encoding; TargetColorImages: can be Jpeg2000, Jpeg, MRC or empty; TargetColorIndexedImages: can be Jpeg2000, Jpeg, MRC or empty; TargetMonochromeImages: can be jbig2 or empty; MRCTargetDPI: <number>, the target DPI at which the MRC decomposition is done, which should work best between 200 and 400 dpi; JBig2Combine: <boolean>, indicates that all created JBIG2 images should try to combine found symbols in a common global section, which normally reduces the size; Jpeg2000Quality: <number>, the quality value for created JPEG2000 images; JpegQuality: <number>, the quality value for created JPEG images; Jpeg2000QualityBackground: <number>, JPEG2000 quality for MRC background images; Jpeg2000QualityForeground: <number>, JPEG2000 quality for MRC foreground images; MRCIterativeJpeg2000: <boolean>, activates an iterative approach to get the best-sized JPEG2000 images; MRCWriteLayers: <boolean>, writes layer information so that the MRC layers can be switched on and off in a layer-enabled viewer; MRCMinHeight: <number>, the minimum image height in pixels from which MRC is attempted; MRCMinWidth: <number>, the minimum image width in pixels from which MRC is attempted; MRCMinDpi: <number>, the minimum image dpi from which MRC is attempted; MRCRatioForeground: <number>, the ratio by which the dimensions of the foreground are shrunk compared to the original image; MRCRatioBackground: <number>, the ratio by which the dimensions of the background are shrunk compared to the original image.	Set

4.14. Rasterization of PDF documents

Rasterization of a PDF document means that each page is rendered to one image and the page content of the document is replaced by that image. This can give good compression results in special cases, or simplify PDF documents by avoiding rare features in the content of a page.

Property	Meaning	Type
Rasterize	A flag indicating that the document should be rasterized.	Set
RasterizeOptions	A JSON object that configures the rasterize command with the following keys: RasterMinImageCount: <number>, rasterize only when more than <number> images are found on a page; RasterMinPathPaintings: <number>, rasterize only when more than <number> path painting operations are found on a page; RasterMaxText: <number>, rasterize only when not more than <number> text characters are found on a page.	Set

4.15. Applying image filters on images in PDF

CIB pdfModule can make use of the image filters, which CIB image toolbox provides. You need both libraries to use this feature.

Property	Meaning	Type
ImageFilter	A flag indicating that all images (except masking images) should be processed by an image filter provided by CIB image toolbox.	Set
ImageFilterPageSelection	A page selection that defines which pages should be processed.	Set
CibImageToolboxFilter.*	* is a placeholder for all CIB image toolbox properties that configure the filter to be applied. Simple examples: applying a Local Otsu binarizer to all images: CibImageToolboxFilter.OperationName=LocalOtsuBinarizer; applying an invert filter to all images: CibImageToolboxFilter.OperationName=InvertFilter	Set

4.16. Form fields

Property	Meaning	Type
FormEditor	A JSON object with commands that define what should be done with PDF form fields. Currently only the command “Create” is supported. The value of Create must be a JSON array of form field definitions. A form field definition has at least the keys “Type”, “Name” and “Annotation”: Type: can be Signature or Text; Name: the name of the new form field; Annotation: a JSON array of rectangle definitions that specify where this form field should be created on the page, e.g. { "Rect": [300, 600, 200, 45], "Page" : 1 }. Text fields can additionally have the keys: MaxLen: the maximum number of characters that can be entered; Flags: a JSON array of combinable flags for this field (Multiline, Password, Comb, ReadOnly, FileSelect, DoNotScroll, RichText); Value: the content of the text field.	Set
NeedAppearences	A flag indicating whether NeedAppearences is active or not.	Get/Set
RegenerateFormFieldAppearences	A flag indicating that the form field appearances should be regenerated from the form field values.	Set
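
For the FormEditor property, a “Create” command for a single signature field could be built as in this C sketch. BuildCreateSignatureField is a hypothetical helper and not part of the library. The four Rect numbers are passed through unchanged, because this section does not define whether they mean x, y, width and height or two corner points. Names that contain quotes or backslashes are rejected instead of escaped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds a FormEditor value that creates one signature form field with a
   single annotation rectangle. Returns 1 on success, 0 otherwise. */
int BuildCreateSignatureField(const char *name, const int rect[4], int page,
                              char *out, size_t size)
{
    assert(name != NULL && rect != NULL && out != NULL);
    /* Keep the sketch simple: refuse names that would need JSON escaping. */
    if (strpbrk(name, "\"\\") != NULL) {
        return 0;
    }
    int n = snprintf(out, size,
                     "{\"Create\":[{\"Type\":\"Signature\",\"Name\":\"%s\","
                     "\"Annotation\":[{\"Rect\":[%d,%d,%d,%d],\"Page\":%d}]}]}",
                     name, rect[0], rect[1], rect[2], rect[3], page);
    return n > 0 && (size_t)n < size;
}
```

The resulting string is what would be set as the value of FormEditor; the field name could then be passed to SignPdfFormfield when signing.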

4.17. Extracting or removing images

All images inside a PDF can be extracted with the ExtractImages property.

Property	Meaning	Type
ExtractImages	The path to which the images of a PDF document should be saved. Example: D:\path\output.png	Set
ExtractImagesCallback	A callback that delivers the images instead of the path defined in ExtractImages.	Set
ExtractImagesUserdata	The userdata object for ExtractImagesCallback.	Set
ExtractImagesCombineMasks	Inside a PDF, an image and its mask image are stored as two separate images. When this property is set, the image and the mask image are output as one combined image in PNG format with an alpha channel.	Set
RemoveImages	Removes images from the document.	Set

4.18. Handling XFA documents

Property	Meaning	Type
HasXFA	A flag indicating that the PDF document contains an XFA document.	Get
XFAExtract	A flag indicating that an XFA document should be extracted from the PDF document.	Set
XFAOutputFilename	The filename to which the XFA document should be saved. The default is the existing input filename with the extension .xfa.	Set

4.19. PDF overlays

You can merge the content of several PDF pages from different PDF documents into one. This lets you, for example, add stationery to a PDF document. Merging is done with the PageContentMerge property.

Property	Meaning	Type
PageContentMerge	A JSON array of JSON objects that define merge operations. A merge operation is a JSON object with the following keys: File: the file from which the content is merged (mergesource-Document); Password: the password of the file, if it is encrypted; InsideBackground: a JSON Boolean flag indicating whether the content should be placed in the background (default) or in the foreground; Repitition: the following repetition types are defined: All: the first page of the mergesource-Document is merged on all pages of the PDF document; Identitiy: each page of the mergesource-Document is merged on the page with the same page number of the PDF document; First: the first page of the mergesource-Document is merged on the first page of the PDF document. Example: First;All: the first page of the mergesource-Document is merged on the first page of the document, and the second page of the mergesource-Document is merged on all other pages of the PDF document.	Set

4.20. Importing Text

Note: you need modifying rights to import text into a PDF.

Property	Meaning	Type
HocrInputData	HOCR data to be merged into the PDF. It can be provided as direct data in XML form, as a filename (where the page indexes inside the HOCR file must match the pages of the PDF document), or as a list of ;-separated page descriptions and HOCR files, e.g. “{3};file1.hocr;{4};file2.hocr”.	Set
FormatSearchablePdfShowText	A flag indicating that the text should be imported as visible text. The default is invisible.	Set
FormatSearchablePdfCreateLayer	Can be applied instead of FormatSearchablePdfShowText. PDF layers are then generated, in which the imported text can be switched on or off.	Set
FormatSearchablePdfLayerOpacity	The opacity of the layer: a number between 0 (transparent) and 100 (completely opaque). The default is 50.	Set
FormatSearchablePdfLayerTitle	The layer title, which should be shown in the user interface. Default: “Visualize text”	Set
HocrStartIndex	A number indicating the offset to which the first HOCR page refers. The default is 0.	Set
TextMark	The tag with which the imported text is marked. Default: CIB_HOCR	Set

4.21. Exporting Text

Options for exporting text from a PDF document

Property	Meaning	Type
TextExtraction	A flag indicating that text should be extracted from a PDF. By default, only visible text is extracted and saved in Utf16 format. To change this behavior, use the additional options TextFormattingOptions and TextSelectionFilter.	Set
FillTextOutput	A flag indicating whether the extracted text is kept in memory (1) or not (0).	Set
TextOutputFilename	The filename of the text file to which the extracted text should be saved.	Set
TextFormattingOptions	Optional: specifies the output format (Utf8, Utf16, Hocr) and enables additional word repositioning. The options are specified as a JSON object. The following formatting options are set by default: TextFormattingOptions={"OutputFormats":["txt"], "OutputResolution":72, "Options":{"EnableWordSorting":false, "SeparateTextBlocks":false}} So if TextFormattingOptions is not set explicitly, the text is saved to the output file in Utf16 format and the word order is the same as in the PDF stream. The following output formats are currently supported: 1. txt: text is saved to the output file in Utf16 format; 2. utf8txt: text is saved to the output file in Utf8 format; 3. Hocr: text is saved to the output file in HOCR format. If EnableWordSorting is set to true, the words in the output file are reordered according to their coordinates in the PDF document. The option “OutputResolution” affects only HOCR output: it specifies the resolution of the processed pages that is used to calculate the positions and sizes of all bounding boxes. The default resolution for PDF documents is 72 dpi.	Set
TextSelectionFilter	Optional: filters the exported text by its visibility (visible/invisible) within a PDF document and by special content markers (tags). The options are specified as a JSON object. Currently only filtering by predefined text groups is supported. Example: TextSelectionFilter = {"groups": ["any_visible", "cibocr_invisible", "others_invisible", …]} The following groups may be set in any combination within the groups array: 1. any_visible: any visible text, both within marked and within unmarked content; 2. any_invisible: any invisible text, both within marked and within unmarked content; 3. simple_invisible: invisible text within unmarked content; 4. cibocr_invisible: invisible text within content marked with the CIB_HOCR tag; 5. others_invisible: invisible text within content marked with tags other than CIB_HOCR; 6. marked_invisible: invisible text within content marked with the tag specified in the TextMark property (CIB_HOCR is the default). Note: the text group any_invisible is a composite group: it includes all groups with the suffix _invisible. So to extract all text from a PDF, set the groups array to {"groups": ["any_visible", "any_invisible"]}	Set

4.22. Tracing

You can easily create a trace file by setting this property in your job:

Property	Meaning	Type
TraceFilename	The filename to which the trace should be written. Traces should be activated only for analyzing problems, not in general pdf processing, because they can become very big.	Set

5. General

5.1. Notice

CIB software GmbH reserves all ownership rights to the software offered
and to the accompanying documentation. Use of the software and of the
accompanying user manual is subject to the license agreement on which the software is based.
Providing and downloading this document and the software does not in itself
transfer any rights of use or reproduction.

No part of this manual may be reproduced or otherwise exploited in any form
without the written permission of CIB software GmbH. Editing the documentation,
in particular translating it, is likewise not permitted without the permission of
CIB software GmbH. The content of this manual is also protected by copyright
when it is not delivered with software that contains an
end user license agreement.

CIB pdf brewer, CIB coSys, CIB webdesk, CIB workbench, CIB dialog, CIB merge, CIB view,
CIB format, CIB print, CIB pdf toolbox, CIB pdfModule, CIB image toolbox are either
registered trademarks or trademarks of CIB software GmbH.

Windows is a registered trademark of Microsoft Corporation.

Solaris and Java are trademarks or registered trademarks of Oracle and its
subsidiaries.

All other brand and product names are trademarks or registered trademarks of
their respective owners.

The content of this manual has been prepared with the greatest care. However, the information in this
manual does not constitute a warranty of product characteristics.
CIB software GmbH is liable only to the extent of its terms of sale and delivery and
assumes no responsibility for technical inaccuracies and/or omissions.

CIB software GmbH is liable neither for technical or typographical errors and defects in
this manual nor for damages that are attributable directly or indirectly to the delivery, performance and
use of this material.

The information in this manual is subject to change without notice.

Should discrepancies related to the statements in this overview arise during use,
we would be very grateful for your feedback:
CIB software GmbH
Elektrastraße 6a
81925 München
E-Mail: support@cib.de
Tel.: 49 (0)89 / 1 43 60 - 111
Fax: 49 (0)89 / 1 43 60 – 100
Or on the Internet:
- Youtube: https://www.youtube.com/user/CIBSoftwareGmbH
- Twitter: https://twitter.com/CIBsoftwareGmbH
5.2. Support

E-Mail: support@cib.de

Tel.: 49 (0)89 / 1 43 60 - 111

Fax: 49 (0)89 / 1 43 60 – 100

5.3. Licensing

This document doesn’t provide any information about how to license this software. Please
contact CIB support or CIB sales department for further information.

5.4. Content of the delivered package

CIB pdf toolbox 2 is delivered as binaries: as DLLs (Windows), shared libraries (Unix / Linux) or a WebAssembly file (.wasm).

Component	Files
CIB pdf toolbox 2	CibPdf2_64/32.dll or libcibpdf_2(32/64).so/.a: Main library
	cibjbig264/32.dll or libcibjbig2ux(64).so/.a: Handling JBIG2 images
	cibpdf2.h: Main header file for the API definition
	Java/….: Java API
	doc/…: Documentation, version history and used open source licenses
Dependent libraries on Linux/Unix	libgcc_s.so, libstdc++.so.6
Dependent libraries on AIX	libgcc_s.a, libstdc++.a

There are also code examples included.

6. Error codes

These are general CIB error codes. Not all of them can be returned by CIB pdfModule. Some are only generated by other CIB modules.

Return value	Meaning
20000	General error occurred
20001	HOCR reading error
20002	General error during reading an input text
20003	HOCR writing error
20004	General error during executing the image toolbox
20005	General runtime error during
20006	General error during reading pdf COS objects
20007	General other error during parsing a PDF document
20008	License validation failed
20009	PDF validation error
20010	IO Writing error
20011	Password was not accepted for an encrypted input document
20012	Unsupported feature called
20013	An error occurred during execution of a PDF function.
20014	IO Reading error
20015	General error regarding the job handle passed to the library.
20016	Error during loading dependent library
20017	Error during reading CIB pdf brewer settings
20018	Error during a CIB pdf brewer document conversion
20019	Error for setting a property of the CIB pdf brewer
20020	Error during parsing the content of a property value
20021	Reading error of a ZUGFeRD XML
20022	CIB updator generic error
20023	CIB ai generic error
20024	General IO error
20025	General printing error
20026	Error during executing TextOverlays
20027	CIB pdf brewer API error
20028	Error indicating that a user cancel was invoked by a callback.
20029	Error during loading CIB pdf brewer UI library
20030	Error during reading jpeg files
20031	General encryption error
20032	CIB image toolbox error indicating that an image cannot be segmented into several layers for MRC compression because a given component limit was reached
20033	Generic error during PDF semantics
20034	Error during processing pdf attachments
20035	Indicates that a segmentation for the purpose of MRC compression placed all content on the background
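
Because every interface function returns 0 on success and one of the codes above otherwise, a caller will usually want to turn a return value into a readable message. The following is only a minimal sketch of such a lookup. The helper names cib_error_text and cib_is_documented_error are hypothetical and not part of cibpdf2.h, and only a few codes from the table are mapped for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not part of cibpdf2.h): maps a few of the return
   values from the table above to short descriptions. */
const char *cib_error_text(uint32_t code)
{
    switch (code) {
    case 0:     return "success";
    case 20008: return "license could not be validated";
    case 20010: return "IO writing error";
    case 20011: return "password not accepted for encrypted input document";
    case 20014: return "IO reading error";
    case 20028: return "user cancel invoked by a callback";
    default:    return "error code not mapped in this sketch";
    }
}

/* Hypothetical helper: returns nonzero if the code lies in the
   20000..20035 range listed in the table above. */
int cib_is_documented_error(uint32_t code)
{
    return code >= 20000u && code <= 20035u;
}
```

A caller could pass the return value of CibPdf2JobExecute to cib_error_text for logging. For the detailed message of a failed job, the library's own CibPdf2JobGetErrorUtf8 function (section 2.12) remains the authoritative source.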

Table of contents

1. Introduction

2. API Documentation

2.1. CibPdf2JobCreate

2.2. CibPdf2JobFree

2.3. CibPdf2JobExecute

2.4. CibPdf2JobSetPropertyW

2.5. CibPdf2JobSetPropertyUtf8

2.6. CibPdf2JobSetPropertyWSimple

2.7. CibPdf2JobSetPropertyUtf8Simple

2.8. CibPdf2JobGetPropertyW

2.9. CibPdf2JobGetPropertyUtf8

2.10. CibPdf2GetVersion

2.11. CibPdf2GetVersionText

2.12. CibPdf2JobGetErrorUtf8

3. Testset Documentation

3.1. Creating/Freeing a Job

3.2. Setting Property

3.3. Setting Subproperty

3.4. Executing a Job

3.5. Getting Property

3.6. Getting Last Error

4. Usecases

4.1. Setting License

4.2. Loading and Merging PDF-Documents

4.3. Creating PDF-Documents from Image formats

4.4. Content Modify (Adding text / shapes / Barcodes / Images)

4.5. Rendering PDF documents

4.6. Reading and Writing simple PDF information

4.7. Writing PDF documents

4.8. Retrieving progress and event information

4.9. Rotating PDF documents

4.10. Encryption

4.11. Signing

4.12. Handling embedded files

4.13. Compression of documents

4.14. Rasterization of PDF documents

4.15. Applying image filters on images in PDF

4.16. Form fields

4.17. Extracting or removing images

4.18. Handling XFA documents

4.19. PDF overlays

4.20. Importing Text

4.21. Exporting Text

4.22. Tracing

5. General

5.1. Note

5.2. Support

5.3. Licensing

5.4. Content of the delivered package

6. Error codes