CIB pdf toolbox 2 technical documentation
Table of contents
- 1. Introduction
- 2. API Documentation
- 2.1. CibPdf2JobCreate
- 2.2. CibPdf2JobFree
- 2.3. CibPdf2JobExecute
- 2.4. CibPdf2JobSetPropertyW
- 2.5. CibPdf2JobSetPropertyUtf8
- 2.6. CibPdf2JobSetPropertyWSimple
- 2.7. CibPdf2JobSetPropertyUtf8Simple
- 2.8. CibPdf2JobGetPropertyW
- 2.9. CibPdf2JobGetPropertyUtf8
- 2.10. CibPdf2GetVersion
- 2.11. CibPdf2GetVersionText
- 2.12. CibPdf2JobGetErrorUtf8
- 3. Testset Documentation
- 4. Usecases
- 4.1. Setting License
- 4.2. Loading and Merging PDF-Documents
- 4.3. Creating PDF-Documents from Image formats
- 4.4. Content Modify (Adding text / shapes / Barcodes / Images)
- 4.5. Rendering PDF documents
- 4.6. Reading and Writing simple PDF information
- 4.7. Writing PDF documents
- 4.8. Retrieving progress and event information
- 4.9. Rotating PDF documents
- 4.10. Encryption
- 4.11. Signing
- 4.12. Handling embedded files
- 4.13. Compression of documents
- 4.14. Rasterization of PDF documents
- 4.15. Applying image filters on images in PDF
- 4.16. Formular fields
- 4.17. Extracting or removing images
- 4.18. Handling XFA documents
- 4.19. PDF overlays
- 4.20. Importing Text
- 4.21. Exporting Text
- 4.22. Tracing
- 5. General
- 6. Error codes
1. Introduction
CIB pdf toolbox has been under development since around the year 2000. In 2017 a second major version was written completely from scratch, incorporating all the expertise we gained from long-term development of PDF processing and modern C++ programming. It has proved to be a very fast, reliable and safe PDF processor, which is also easy to maintain, modular and suitable for all PDF usecases we know of.
CIB pdf toolbox 2 also introduced a new interface, which is completely Unicode-aware and always JSON-based for complex structures.
Therefore the interface of CIB pdf toolbox 2 is in places similar to the old one, but not completely backwards compatible.
2. API Documentation
CIB pdf toolbox 2 has a so-called job-based interface. Normally you process a usecase by:
- Creating a JOB-Handle
- Setting several properties on this JOB-Handle to configure the JOB
- Executing the job
- Optionally getting some output properties to read results of the JOB
- Deleting the JOB-Handle
The core functions exposed by the library are described below.
At a higher level there are convenience wrapper functions, which are described in the Testset Documentation section.
On Windows all interface functions follow the stdcall calling convention.
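The five steps listed above can be sketched as follows. This is only an illustration: the real header and the definition of CibPdfJobHandle come from the vendor package, so the handle type and the four function bodies here are stand-ins written so that the sketch compiles on its own. The helper name mergeToFile is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// --- Stand-ins so the sketch compiles without the library. ---
// ASSUMPTION: the real handle type is opaque and defined by the vendor header.
struct FakeJob { std::map<std::string, std::string> props; bool executed = false; };
typedef FakeJob* CibPdfJobHandle;

uint32_t CibPdf2JobCreate(CibPdfJobHandle* job) { *job = new FakeJob(); return 0; }
uint32_t CibPdf2JobFree(CibPdfJobHandle* job) { delete *job; *job = nullptr; return 0; }
uint32_t CibPdf2JobSetPropertyUtf8Simple(CibPdfJobHandle job, const char* name, const char* value) {
    job->props[name] = value;
    return 0;
}
uint32_t CibPdf2JobExecute(CibPdfJobHandle job) { job->executed = true; return 0; }
// --- End of stand-ins. ---

// Runs the documented steps and returns the first non-zero error code, or 0.
uint32_t mergeToFile(const char* inputs, const char* output) {
    assert(inputs != nullptr && output != nullptr);
    CibPdfJobHandle job = nullptr;
    uint32_t rc = CibPdf2JobCreate(&job);                                 // 1. create the handle
    if (rc != 0) return rc;
    rc = CibPdf2JobSetPropertyUtf8Simple(job, "InputFilename", inputs);   // 2. configure the job
    if (rc == 0) rc = CibPdf2JobSetPropertyUtf8Simple(job, "OutputFilename", output);
    if (rc == 0) rc = CibPdf2JobExecute(job);                             // 3. execute
    // 4. output properties could be read here (optional)
    const uint32_t freeRc = CibPdf2JobFree(&job);                         // 5. always free the handle
    return rc != 0 ? rc : freeRc;
}
```

Freeing the handle on every path, including after a failed property call, keeps the sketch leak-free; whether the library requires this is not stated in the text.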
2.1. CibPdf2JobCreate
uint32_t CibPdf2JobCreate(CibPdfJobHandle* job);
This function creates a new job handle.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle* |
job |
Receives the handle of the newly created job |
The function returns 0 on success, otherwise an error code.
2.2. CibPdf2JobFree
uint32_t CibPdf2JobFree(CibPdfJobHandle* job);
This function frees a job created by CibPdf2JobCreate. All resources bound to the job are freed as well.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle* |
job |
Handle of the job |
The function returns 0 on success, otherwise an error code.
2.3. CibPdf2JobExecute
uint32_t CibPdf2JobExecute(CibPdfJobHandle job);
This function executes a job. The job handle must have been created by CibPdf2JobCreate before calling this function.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
The function returns 0 on success, otherwise an error code.
2.4. CibPdf2JobSetPropertyW
uint32_t CibPdf2JobSetPropertyW(CibPdfJobHandle job, const wchar_t* name, const wchar_t* value, size_t length);
This function sets a property to configure a job.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const wchar_t* |
name |
Name of the job-property (UNICODE) |
const wchar_t* |
value |
Value of the job-property (UNICODE) |
size_t |
length |
Length of value |
The function returns 0 on success, otherwise an error code.
2.5. CibPdf2JobSetPropertyUtf8
uint32_t CibPdf2JobSetPropertyUtf8(CibPdfJobHandle job, const char* name, const char* value, size_t length);
This function sets a property to configure a job.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const char* |
name |
Name of the job-property (in UTF-8) |
const char* |
value |
Value of the job-property (in UTF-8) |
size_t |
length |
Length of value |
The function returns 0 on success, otherwise an error code.
2.6. CibPdf2JobSetPropertyWSimple
uint32_t CibPdf2JobSetPropertyWSimple(CibPdfJobHandle job, const wchar_t* name, const wchar_t* value);
Same as CibPdf2JobSetPropertyW, but the length of value is determined by a terminating 0.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const wchar_t* |
name |
Name of the job-property (UNICODE) |
const wchar_t* |
value |
Value of the job-property (UNICODE) |
The function returns 0 on success, otherwise an error code.
2.7. CibPdf2JobSetPropertyUtf8Simple
uint32_t CibPdf2JobSetPropertyUtf8Simple(CibPdfJobHandle job, const char* name, const char* value);
Same as CibPdf2JobSetPropertyUtf8, but the length of value is determined by a terminating 0.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const char* |
name |
Name of the job-property (in UTF-8) |
const char* |
value |
Value of the job-property (in UTF-8) |
The function returns 0 on success, otherwise an error code.
2.8. CibPdf2JobGetPropertyW
uint32_t CibPdf2JobGetPropertyW(CibPdfJobHandle job, const wchar_t* name, wchar_t* value, size_t* maxLength);
This function retrieves a property after the job has been executed. The buffer provided in value must be large enough to hold the data, otherwise an error is returned. After the call, maxLength always contains the needed length.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const wchar_t* |
name |
Name of the job-property (UNICODE) |
wchar_t* |
value |
Buffer that receives the value of the job-property (UNICODE) |
size_t* |
maxLength |
As input, the provided length of the value buffer. After the call, the needed length. |
The function returns 0 on success, otherwise an error code.
2.9. CibPdf2JobGetPropertyUtf8
uint32_t CibPdf2JobGetPropertyUtf8(CibPdfJobHandle job, const char* name, char* value, size_t* maxLength);
This function retrieves a property after the job has been executed. The buffer provided in value must be large enough to hold the data, otherwise an error is returned. After the call, maxLength always contains the needed length.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
const char* |
name |
Name of the job-property (in UTF-8) |
char* |
value |
Buffer that receives the value of the job-property (in UTF-8) |
size_t* |
maxLength |
As input, the provided length of the value buffer. After the call, the needed length. |
The function returns 0 on success, otherwise an error code.
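Because maxLength reports the needed length even when the buffer was too small, a caller can retry once with a larger buffer. The sketch below shows that pattern against a generic getter callable instead of the library itself, so it can be compiled in isolation. The names Utf8Getter and readProperty are hypothetical, and it is an assumption that the reported length may or may not include a terminating 0, which is why one spare byte is reserved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Wraps a call such as CibPdf2JobGetPropertyUtf8(job, name, value, maxLength)
// with job and name already bound by the caller.
using Utf8Getter = std::function<uint32_t(char* value, size_t* maxLength)>;

// Tries a small buffer first. If the call fails and reports a larger needed
// length, retries once with a buffer of that size plus one spare byte.
// Returns the error code of the last call; out is only set on success.
uint32_t readProperty(const Utf8Getter& getter, std::string& out) {
    assert(getter);
    std::string buffer(16, '\0');
    size_t length = buffer.size();
    uint32_t rc = getter(&buffer[0], &length);
    if (rc != 0 && length > buffer.size()) {
        buffer.assign(length + 1, '\0');   // ASSUMPTION: spare byte for a terminator
        length = buffer.size();
        rc = getter(&buffer[0], &length);
    }
    if (rc == 0) out.assign(buffer.c_str());  // stop at the first 0 byte
    return rc;
}
```

With the real library, the getter would typically be a lambda that captures the job handle and the property name and forwards value and maxLength unchanged.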
2.10. CibPdf2GetVersion
uint32_t CibPdf2GetVersion();
Returns the version of CIB pdf toolbox 2 as a uint32_t representation.
2.11. CibPdf2GetVersionText
uint32_t CibPdf2GetVersionText(char* value, size_t maxLength);
Retrieves the version of the CIB pdf toolbox 2 as a text string.
Parameter:
Type |
Variable |
Meaning |
char* |
value |
Buffer to which the version string is written |
size_t |
maxLength |
Length of the provided buffer. |
The function returns 0 on success, otherwise an error code.
2.12. CibPdf2JobGetErrorUtf8
uint32_t CibPdf2JobGetErrorUtf8(CibPdfJobHandle job, char* value, size_t* maxLength);
This function retrieves a text description of the most recent error that occurred in a job.
Parameter:
Type |
Variable |
Meaning |
CibPdfJobHandle |
job |
Handle of the job |
char* |
value |
Buffer that receives the text description of the error (in UTF-8) |
size_t* |
maxLength |
As input, the provided length of the value buffer. After the call, the needed length. |
The function returns 0 on success, otherwise an error code.
3. Testset Documentation
The CIB pdf toolbox 2 PDF package can be used through high-level functions, which are wrappers around those described in the API Documentation.
A testset containing several examples of these functions is provided along with the package for testing purposes.
To process anything inside the library you need to create a job. To configure the job you set properties, and to run it you call execute on it. Afterwards you can find out what was done inside the job by retrieving properties.
The library is thread-safe, so you can call it from different threads with different jobs. However, each job should be used in only one thread.
3.1. Creating/Freeing a Job
Creating a job is as simple as declaring a variable. Depending on the preferred library, it can be created as shown below:
CibPdfToolbox2Job job;
The job is freed as usual when its destructor is called.
3.2. Setting Property
Properties and their possible values are described in the Usecases section. To set a property:
job.setProperty(CibProperty::property, value);
where value is a wstring.
3.3. Setting Subproperty
job.setSubProperty(CibProperty::property, CibProperty::subproperty, value);
Note that in the Usecases section, subproperties are referred to as Property.Subproperty.
3.4. Executing a Job
job.execute();
3.5. Getting Property
Properties can be retrieved after a job is executed.
job.getPropertyAsStringW(CibProperty::property);
3.6. Getting Last Error
To retrieve a text description of the most recent error that occurred in a job:
job.getErrorText();
4. Usecases
4.1. Setting License
Setting license information
Property |
Meaning |
Type |
LicenseCompany |
Setting the company of the license. |
Set |
LicenseKey |
Setting the key of the license. |
Set |
LicenseFile |
If a LicenseFile is provided, it should be copied into the c++/bin64 or c++/bin32 folder along with the binaries. LicenseCompany and LicenseKey must not be changed. |
Set |
4.2. Loading and Merging PDF-Documents
Inside a JOB you can load PDF-documents in several ways.
Property |
Meaning |
Type |
InputFilename |
Input files as a ;-separated list of filenames. If a filename contains a ;, use \ as the escape character. Before each filename you can specify which pages of the PDF document should be used.
Example: InputFilename={1-3};filename1;{Odd};filename2;#;12345678#664
The result would be a merged PDF with pages 1-3 of filename1, all odd pages of filename2 and all pages of filename3, which is located at memory address 12345678 and has a length of 664 bytes.
Syntax:
InputFilename ::= <filenameorpages> [";" <filenameorpages>]...
filenameorpages ::= [<page description> ";"] (<filename> | <memory blocks> | <memory delimiter>)
page description ::= "{" <pages> ["," <pages>]... "}"
pages ::= "All" | "Even" | "Odd" | "First" | "Last" | "NoFirst" | "NoLast" | <number> | (<start number> "-" <end number>)
memory blocks ::= <memory block> [<memory delimiter> <memory block>]...
memory block ::= <address> <memory delimiter> <length>
Example: In this case, document mydoc.pdf is merged with a single-page empty PDF. |
Set |
InputMemoryAddress |
Contains pairs of memory addresses and lengths of the input documents, separated by ;. |
Set |
InputFilenamePattern |
Contains a pattern of input files, e.g. *.pdf. All PDFs matching this pattern will be merged together. |
Set |
WorkSpace |
You can set one or several workspace directories, separated by ;, in which input files are searched for. |
Set |
MergePdfAConform |
A flag indicating that, if several input documents are merged into one, the merge conforms to the PDF/A specification. This means that if all input documents conform to PDF/A, the output document also conforms to PDF/A. |
Set |
OutlinesDeleteExisting |
A flag indicating that PDF outlines should be deleted. |
Set |
PartialDocumentAlignment |
Every partial document is expanded to an even pagecount. See |
Set |
InsertEmptyPageOmitLast |
This configures the PartialDocumentAlignment behaviour so that no optional empty page is appended for the last document. |
Set |
FormfieldNamePrefix |
A ;-separated string defining a prefix for formfields, which is assigned to each input document in a merge. |
Set |
GenericFormfieldNamePrefix |
A flag indicating that all input documents should get an automatically generated form field prefix, so that form fields of the input documents are always unique and are not combined in a PDF merge. |
Set |
PartialDocumentAlignment |
Adds an empty page between merged documents, if the input |
Set |
JoinHistory |
Sets a marker inside the PDF, where each document |
Set |
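The InputFilename syntax above can be assembled programmatically. The following sketch builds a value from (page description, filename) pairs and escapes every ; inside a filename with \. The function names are hypothetical, memory blocks are not covered, and it is an assumption that only ; needs escaping, since the text does not say how a literal \ is treated.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Escapes every ';' in a filename with a preceding '\'.
std::string escapeFilename(const std::string& filename) {
    std::string escaped;
    for (char c : filename) {
        if (c == ';') escaped += '\\';
        escaped += c;
    }
    return escaped;
}

// Each entry is (page description, filename). An empty page description is
// omitted, so the whole document is used.
std::string buildInputFilename(const std::vector<std::pair<std::string, std::string>>& inputs) {
    std::string value;
    for (const auto& input : inputs) {
        assert(!input.second.empty());
        if (!value.empty()) value += ';';
        if (!input.first.empty()) value += "{" + input.first + "};";
        value += escapeFilename(input.second);
    }
    return value;
}
```

For the pairs ("1-3", "filename1") and ("Odd", "filename2") this yields {1-3};filename1;{Odd};filename2, matching the first part of the example in the table.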
4.3. Creating PDF-Documents from Image formats
In the InputFilename property you can load not only PDF documents but also images of type JPEG, BMP, TIFF, JPEG2000 and PNG. These images are automatically converted to a PDF document.
Property |
Meaning |
Type |
InputImageScaling |
You can specify the dimensions of the pages on which the images will be placed.
Examples:
InputImageScaling=100pt;200pt
InputImageScaling=100mm;200mm
InputImageScaling=;200pt
InputImageScaling=100dpi
Instead of setting a DPI or fixed size, you can select also a |
Set |
InputImageMargin |
Setting a margin in pt, which should be used for the images. Default is 0.
Examples:
InputImageMargin=10pt
InputImageMargin=10pt;20pt
InputImageMargin=10pt;20pt;30pt;40pt |
Set |
InputImageBackgroundColor |
Defines the background color of the page. Example: InputImageBackgroundColor=#FF0000 |
Set |
InputImageScalingMax |
Defining a maximum size of a created image in pt |
Set |
InputImageEmbedStyle |
Either embed or fit. Fit means, that the image is completely |
Set |
4.4. Content Modify (Adding text / shapes / Barcodes / Images)
You can add text to the document, modify text, add shapes as images and more:
Property |
Meaning |
Type |
TextOverlay |
TextOverlay is a JSON-Array of JSON-Objects describing new text strings added to the PDF. The keys in the JSON-Object can be:
- yyyy: Year (e.g. 2015)
- ss: Seconds with leading zero
Example: TextOverlay=[{"FontColor": "#FF0000", "FontName": "Arial", "Degree": -45, "Text": "Demo", "FontSize": 100, "ZLevel": 1, "Opacity": 30, "Position": "MiddleCenter", "FontStyle": "Normal"}]
Creates a new text with content "Demo" above the existing PDF content with font color red, font Arial, font size 100 and normal font style. The text is positioned in the middle of the page and rotated by -45 degrees. |
Set |
TextReplace |
By this you can replace existing text in a PDF. For the new text the existing font attributes are used. If the new text contains glyphs which the old font doesn’t provide, a new font is embedded. You can either specify a region (<SpecifiedRegion>) in which text should be replaced, or give text strings (<TextSearch>) which should be replaced. The value is a JSON-array of either a <SpecifiedRegion> or a <TextSearch>. A <SpecifiedRegion> is a JSON-object with keys:
A <TextSearch> is a JSON-object with keys:
|
Set |
DrawShape |
DrawShape is a JSON-Array of JSON-Objects describing new shapes added to the PDF. The keys in the JSON-object can be:
Example: DrawShape=[{"PageSelection": "1", "Color": "#DDFF0000", "Position": "19mm;27mm", "Width": "26mm", "Height": "6mm", "Type": "Rectangle", "BorderThickness": "3pt"}]
Creates a new rectangle shape with a border width of 3 pt, color red with transparency #DD, on page 1 at position 19 mm, 27 mm (from top left), with width 26 mm and height 6 mm.
DrawShape=[{'PageSelection': '1', 'Type': 'Barcode', 'Format': Creates a new Barcode of Type DataMatrix on Page 1. |
Set |
AnnotationAdd |
AnnotationAdd is a JSON-Array of JSON-Objects describing new Rich Media annotations added to the PDF. The keys in the JSON-object can be:
- PageIndex: on which pages this Rich Media annotation should be added
Example: [ {"PageIndex":5, "Filename":"D:\\\\media1.mp4", |
Set |
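Since properties such as TextOverlay take a JSON string, the value usually has to be composed in code. The sketch below produces the TextOverlay example from the table with a configurable text, font size and rotation. The function names are hypothetical, the remaining keys are fixed to the values of the documented example, and the escaping only handles quotation marks and backslashes.

```cpp
#include <cassert>
#include <string>

// Minimal JSON string escaping: only '"' and '\' are handled.
std::string jsonEscape(const std::string& text) {
    std::string escaped;
    for (char c : text) {
        if (c == '"' || c == '\\') escaped += '\\';
        escaped += c;
    }
    return escaped;
}

// Builds a one-element TextOverlay array using the keys of the documented example.
std::string makeTextOverlay(const std::string& text, int fontSize, int degree) {
    assert(fontSize > 0);
    return std::string("[{")
        + "\"FontColor\": \"#FF0000\", "
        + "\"FontName\": \"Arial\", "
        + "\"Degree\": " + std::to_string(degree) + ", "
        + "\"Text\": \"" + jsonEscape(text) + "\", "
        + "\"FontSize\": " + std::to_string(fontSize) + ", "
        + "\"ZLevel\": 1, "
        + "\"Opacity\": 30, "
        + "\"Position\": \"MiddleCenter\", "
        + "\"FontStyle\": \"Normal\""
        + "}]";
}
```

The resulting string would then be passed as the value of the TextOverlay property. For production use a real JSON library is the safer choice, because it also handles control characters and non-ASCII text.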
4.5. Rendering PDF documents
You can render a PDF document to one or more image types at the same time.
Property |
Meaning |
Type |
Render |
“1” activates rendering of the document. |
Set |
Render.ImageType |
A ;-separated list of image output types. Currently Bmp, Jpg, |
Set |
Render.ImageScaling |
Setting an image dpi for rendering. Default is 150dpi. |
Set |
JpegQuality |
Sets the JPEG quality if images are saved as Jpeg or Jpeg2000. Default is 80. |
Set |
PngCompressLevel |
Setting the PNG compression level (1-10) for png images |
Set |
Render.Bounds |
Setting bounds of which part of the page should be rendered. |
Set |
Render.DPIAware |
Render.DPIAware=0 : No dpi information is written into the |
Set |
Render.MaxSize |
Setting a MaxWidth and MaxHeight in pixel, where the |
Set |
RenderOutputFilenamePattern |
Defines a pattern for the output filenames. This is a sprintf-type pattern, which should include a placeholder for the number of the rendered page and optionally one for a string, which will be replaced by the original filename. |
Set |
Render.OutputType |
Set “memory” or “file”, if you want to have the output saved to |
Set |
RenderPageSelection |
Defining a page selection, which pages should be rendered. |
Set |
Render.Formfields |
Defines, if the content of formfields should be rendered or not. |
Set |
Render.AllowFontFall |
Defines, if the content of text should be rendered or not, Default is 1. |
Set |
Render.Annotations |
Defines whether the content of annotations (other than formfields) should be rendered or not. Default is 1. |
Set |
Render.Signatures |
Defines, if the content of signatures should be rendered or |
Set |
Render.TiffSinglePage |
Defines, if tiff images should be generated as single-page |
Set |
Render.TextRenderM |
By this you can configure which classes of text are rendered. By |
Set |
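RenderOutputFilenamePattern is described as a sprintf-type pattern. As a rough illustration of how such a pattern expands, the sketch below fills a single integer placeholder with a page number using std::snprintf. The function name and the pattern page_%03d.png are hypothetical; which placeholders the library accepts, in which order, and whether page numbers start at 0 or 1 is not stated in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Expands a sprintf-style pattern that contains exactly one integer placeholder.
// Returns an empty string if formatting fails or the result does not fit.
std::string expandPagePattern(const char* pattern, int pageNumber) {
    assert(pattern != nullptr);
    char buffer[256];
    const int written = std::snprintf(buffer, sizeof(buffer), pattern, pageNumber);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof(buffer)) {
        return std::string();
    }
    return std::string(buffer);
}
```

A zero-padded placeholder such as %03d keeps the generated files in page order when they are sorted by name.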
4.6. Reading and Writing simple PDF information
Property |
Meaning |
Type |
CenterWindow |
A flag specifying whether to position the document’s window in the center of the screen. Default value: false. |
Get/Set |
Encrypted |
“1” or “0” depending if the document is encrypted |
Get |
Direction |
The predominant reading order for text:
This entry has no direct effect on the document’s contents or page numbering but may be used to determine the relative positioning of pages when displayed side by side or printed n-up. Default value: L2R. |
Get/Set |
DisplayDocTitle |
A flag specifying whether the window’s title bar should display the document title taken from the Title entry of the document information dictionary. If false, the title bar should instead display the name of the PDF file containing the document. |
Get/Set |
DocInfo |
A JSON description of all available DocInfo.* properties. |
Get |
DocInfo.* |
Setting or reading of custom metadata properties. |
Get/Set |
DocInfo.Author |
Information about the author of the document |
Get/Set |
DocInfo.CompressProfile |
Information about which compression profile was used, after a compression task was executed on the document before. |
Get |
DocInfo.Creator |
Information about the creating software of the PDF document. |
Get/Set |
DocInfo.CreationDate |
Information about the creation date of the PDF document. |
Get/Set |
DocInfo.Keywords |
The Keywords, which are assigned to this PDF document. |
Get/Set |
DocInfo.ModDate |
The last modification date of the PDF document |
Get/Set |
DocInfo.Producer |
Retrieving Information about the producer of the PDF document. |
Get/Set |
DocInfo.Subject |
Information about the subject of the PDF document. |
Get/Set |
DocInfo.Title |
Information about the title of the PDF document. |
Get/Set |
FitWindow |
A flag specifying whether to resize the document’s window to fit the size of the first displayed page. Default value: false. |
Get/Set |
HasJavascript |
“1” or “0” depending if the document contains JavaScript |
Get |
HideMenubar |
A flag specifying whether to hide the conforming reader’s menu bar when the document is active. Default value: false. |
Get/Set |
HideToolbar |
A flag specifying whether to hide the conforming reader’s tool bars when the document is active. Default value: false. |
Get/Set |
HideWindowUI |
A flag specifying whether to hide user interface elements in the document’s window (such as scroll bars and navigation controls), leaving only the document’s contents displayed. Default value: false. |
Get/Set |
ID |
The two ID values of the PDF, separated by a ;. |
Get |
ImageInfo |
Retrieving a description about images inside a PDF in JSON format. |
Get |
NonFullScreenPageMode |
The document’s page mode, specifying how to display the document on exiting full-screen mode:
This entry is meaningful only if the value of the PageMode entry in the Catalog dictionary is FullScreen; it shall be ignored otherwise. Default value: UseNone. |
Get/Set |
PageCount |
The number of pages the document has. |
Get |
PageInfo.<num> |
The page dimensions of the requested page index <num>, where 0 represents the first page. |
Get |
PageLayout |
A name object specifying the page layout that shall be used when the document is opened:
Default value: SinglePage. |
Get/Set |
PageMode |
A name object specifying how the document shall be displayed when opened:
Default value: UseNone. |
Get/Set |
PdfVersion |
The pdf version of the document, can be following strings:
|
Get/Set |
MinPdfVersion |
Minimum output PDF version. If the input pdf has a version higher than this value, the existing version is kept, otherwise increased to the set value. |
Set |
PdfVersionInfo |
A JSON description about the PDF version, which includes also information about what compliance versions the PDF complies to, e.g. PDF/A or PDF/UA. |
Get |
4.7. Writing PDF documents
Writing a document to a file
Property |
Meaning |
Type |
OutputFilename |
Filename to which the pdf should be written to. |
Set |
Writing a document to memory
Property |
Meaning |
Type |
MemoryOutputCallback |
Address of a callback function which receives the PDF data. The address has to be written as a string containing the address as a plain number. The callback function needs to have the signature: int MemoryCallback(const uint8_t* output, size_t length, void* userdata, int error);
The return value needs to be 1 if the callback wants to report success, otherwise 0. |
Set |
MemoryOutputUserdata |
Address of a userdata object which is used inside the callback. The address has to be written as a string, which contains the address as normal number. |
Set |
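A callback with the documented signature, together with the address-to-string conversion, could look like the sketch below. It collects the output in a std::vector<uint8_t> passed through the userdata pointer. Several points are assumptions: that a non-zero error argument signals a failure, that the data may arrive in more than one chunk, and that "normal number" means a decimal number. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Matches the documented signature. Appends each chunk to the vector passed
// via userdata and returns 1 on success, 0 on failure.
int MemoryCallback(const uint8_t* output, size_t length, void* userdata, int error) {
    if (error != 0 || userdata == nullptr) return 0;   // ASSUMPTION: non-zero means error
    assert(output != nullptr || length == 0);
    auto* sink = static_cast<std::vector<uint8_t>*>(userdata);
    sink->insert(sink->end(), output, output + length);
    return 1;
}

// Formats an object address as decimal text (ASSUMPTION: base 10 is expected).
std::string addressToString(const void* address) {
    return std::to_string(reinterpret_cast<std::uintptr_t>(address));
}

// Same for the callback itself; function pointers need their own overload.
std::string callbackToString(int (*callback)(const uint8_t*, size_t, void*, int)) {
    return std::to_string(reinterpret_cast<std::uintptr_t>(callback));
}
```

The two strings would be set as MemoryOutputCallback and MemoryOutputUserdata. The vector must stay alive until the job has finished executing, because the library only holds its address.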
Normally a PDF is saved with the best saving options which the PdfVersion of the document allows. This means that object streams and xref streams are used with a PdfVersion greater than or equal to 1.5. This behavior can be overridden by setting the property WritingMode:
Property |
Meaning |
Type |
DontOverwriteProducer |
Don’t write the CIB pdfModule default Producer into the Metadata. |
Set |
WritingMode |
|
Set |
IncrementalUpdate |
An incremental update to a PDF means that the content of the input document is left unchanged and all changes are written together in an update entry at the end of the document. As a result, existing signatures in a PDF document are not broken, yet you can still add content or modify the document. A (good) PDF viewer then shows that the PDF consists of different versions and lets you switch between them. |
Set |
4.8. Retrieving progress and event information
Property |
Meaning |
Type |
EventCallback |
Address of a callback function which receives information about events. The address has to be written as a string containing the address as a plain number. The callback function needs to have the signature: int EventCallback(uint32_t eventType, const void* eventData, void* userdata);
The return value needs to be 1 if the callback wants to report success, otherwise 0. |
Set |
EventCallbackUserdata |
Address of a userdata object which is used inside the callback. The address has to be written as a string, which contains the address as normal number. |
Set |
Progress |
Returns a JSON description of the progress of the currently running JOB. This property can be retrieved from a different thread than the one in which execute is running. The JSON description consists of the keys "AmountSteps" and "CurrentStep". AmountSteps defines how many larger internal steps there are in total, and CurrentStep defines how many steps have already been done. |
Get |
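The two keys of the Progress description are enough to compute a percentage. The sketch below extracts both values with a deliberately small key lookup and is not a general JSON parser. It assumes that the values are plain JSON numbers, which the text does not confirm; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the integer that follows "key": in a flat JSON object,
// or -1 if the key is missing.
long readJsonInt(const std::string& json, const std::string& key) {
    assert(!key.empty());
    const std::string quoted = "\"" + key + "\"";
    std::string::size_type pos = json.find(quoted);
    if (pos == std::string::npos) return -1;
    pos = json.find(':', pos + quoted.size());
    if (pos == std::string::npos) return -1;
    return std::strtol(json.c_str() + pos + 1, nullptr, 10);  // skips leading spaces
}

// Returns the progress in percent, or -1 if it cannot be determined.
int progressPercent(const std::string& progressJson) {
    const long total = readJsonInt(progressJson, "AmountSteps");
    const long done = readJsonInt(progressJson, "CurrentStep");
    if (total <= 0 || done < 0) return -1;
    return static_cast<int>(done * 100 / total);
}
```

A monitoring thread could poll the Progress property of the running job at intervals and pass each result to progressPercent.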
4.9. Rotating PDF documents
Property |
Meaning |
Type |
PageRotation |
A ;-separated string of page selection definitions and degrees.
Example: PageRotation={All};90;{3};180
All pages are rotated by 90 degrees, but page 3 is rotated by 180 degrees. Valid degree values are 0, 90, 180 and 270. Valid page selection definitions can be looked up at property InputFilename. |
Set |
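A PageRotation value can be composed from pairs of page selection and degrees. The sketch below also rejects degree values other than the four listed as valid. The function names are hypothetical, and returning an empty string on invalid input is a choice made for this sketch only.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Only the four documented degree values are accepted.
bool isValidDegree(int degree) {
    return degree == 0 || degree == 90 || degree == 180 || degree == 270;
}

// Each entry is (page selection, degrees). Returns an empty string if any
// degree value is not valid.
std::string buildPageRotation(const std::vector<std::pair<std::string, int>>& rotations) {
    std::string value;
    for (const auto& rotation : rotations) {
        if (!isValidDegree(rotation.second)) return std::string();
        if (!value.empty()) value += ';';
        value += "{" + rotation.first + "};" + std::to_string(rotation.second);
    }
    return value;
}
```

The pairs ("All", 90) and ("3", 180) reproduce the documented example {All};90;{3};180.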
4.10. Encryption
Writing a document with encryption
Property |
Meaning |
Type |
OutputOwnerPassword |
Sets the owner password. The document is then encrypted. Unless you also specify a user password, anyone can view the document without entering a password, but the security options apply. |
Set |
OutputUserPassword |
Setting the user password. When this is set, someone who opens the document will need to enter either a user or owner password to open the document. |
Set |
EncryptEnablePrinting |
A flag indicating if a user has the right to print the document. |
Get/Set |
EncryptEnableClipboard |
A flag indicating if a user has the right to copy content to the clipboard. |
Get/Set |
EncryptEnableForms |
A flag indicating if a user has the right to fill out form fields. |
Get/Set |
EncryptEnableAssembling |
A flag indicating if a user has the right to assemble the document (rotate, delete, insert,…). |
Get/Set |
EncryptEnableNotes |
A flag indicating if a user has the right to add or modify text annotations. |
Get/Set |
EncryptEnableModifying |
A flag indicating if a user has the right to do other modifying operations on the document. |
Get/Set |
EncryptEnableExtract |
A flag indicating if a user has the right to extract text and graphics (in support of accessibility to users with disabilities or for other purposes). |
Get/Set |
PdfVersion |
The encryption algorithm is defined by the output PDF version. The best encryption standard supported by that PDF version is always used. To get the strongest encryption, always use PDF version 1.7EL8 or 2.0. This is AES-256 with fixes for the hashing algorithm, as defined in those two PDF versions. PDF 2.0 deprecated all encryption variants used before. |
Get/Set |
Opening a document with encryption
If you know whether the provided password is a user or an owner password, you can specify EncryptUserPassword or EncryptOwnerPassword; otherwise specify EncryptDocumentPassword. With EncryptDocumentPassword it is checked internally whether the password is a user or an owner password.
Property |
Meaning |
Type |
EncryptUserPassword |
The user password to open a document. |
Set |
EncryptOwnerPassword |
The owner password to open a document. |
Set |
EncryptDocumentPassword |
The user or owner password to open a document. |
Set |
PdfEncryptionOwnerAuthorized |
After opening the document you can check by this flag, if you have owner or user access. |
Get |
EncryptEnable* |
By reading the EncryptEnable* properties (see section above) you can read, which security rights the document has. |
Get |
Removing Encryption from an input document
Property |
Meaning |
Type |
RemoveEncryption |
A flag, that indicates, that encryption should be removed. You need to authorize at first to be able to remove encryption. When you write the document afterwards, it will not contain any encryption. |
Set |
4.11. Signing
Verify, that a Certificate Password is correct
In addition to all signing parameters of the signing usecase, set the property:
Property |
Meaning |
Type |
SignPdfVerifyCertificatePassword |
Just verify, that the provided certificate password is valid. |
Set |
Signing PDF document
Property |
Meaning |
Type |
SignPdf |
Activates Signing |
Set |
CertificateFilename |
Sets the filename of the certificate |
Set |
CertificatePassword |
Sets the password of the certificate |
Set |
SignLocation |
Optionally setting a location, where the document is signed. |
Set |
SignContactInfo |
Optionally setting a contact info as a telephone number about who signed the document. |
Set |
SignPdfFormfield |
By this you can assign the name of a signature formfield, which should be signed. If not supplied a new invisible signature formfield is automatically created. |
Set |
SignPdfImage |
If you supply with SignPdfFormfield an existing visible signature formfield which should be signed, you can specify the visual appearance of the signed formfield by supplying an image. |
Set |
SignReason |
Optionally setting a reason why the document was signed. |
Set |
The use of a timestamp server is recommended. You can activate and configure it by these properties:
Timestamp |
Activating usage of a timestamp server. |
Set |
TimestampServer |
If a timestamp server should be used, sets the URI of the timestamp server. If omitted, internally configured, freely accessible timestamp servers are contacted. |
Set |
TimestampServerUsername |
Optionally setting of a username, which is needed to access the timestamp server. |
Set |
TimestampServerPassword |
Optionally setting of a password, which is needed to access the timestamp server |
Set |
TimestampDontUseFallbackServers |
Disable internal fallback servers, if provided TimeStampServer cannot be reached |
Set |
ProxyHost |
The address of the proxy server. |
Set |
ProxyPort |
The port of the proxy server. |
Set |
ProxyUsername |
The optional username for a proxy server. |
Set |
ProxyPassword |
The optional password for a proxy server. |
Set |
4.12. Handling embedded files
Embedded files can be added, deleted or extracted from PDF documents. Additionally the info about them can be extracted.
Extracting info about embedded files
Property |
Meaning |
Type |
EmbeddedFilesInfo |
Returns a description of the included embedded files in JSON format. |
Get |
Importing embedded files
Property |
Meaning |
Type |
EmbeddedFilesAdd |
A JSON-array of JSON objects, which define, what files should be imported. One JSON object looks like this:
Except for File or Memory, all keys are optional. In the Memory usecase, UsedFilename is also mandatory. |
Set |
Exporting embedded files
Property |
Meaning |
Type |
EmbeddedFilesExtract |
A JSON-array of JSON objects, which define, what files should be exported. One JSON object looks like this:
|
Set |
Deleting embedded files
Property |
Meaning |
Type |
EmbeddedFilesDelete |
A JSON-array of JSON objects, which define, what files should be deleted. One JSON object looks like this:
|
Set |
4.13. Compression of documents
Property |
Meaning |
Type |
Compress |
A flag indicating, that compression should be activated. |
Set |
CompressQuality |
Defining a compression profile, which sets a lot of other settings regarding compression. Available profile names are:
|
Set |
Threads |
The number of threads allowed to work on compression. The default is the number of available CPUs minus 1. To disable threading, set this property to 0 or 1. |
Set |
RemovePieceInfo |
A flag indicating that proprietary data of PDF creators should be removed from the PDF document. This is the default for all compression profiles. |
Set |
RemoveThumbs |
A flag indicating that page thumbnail images should be removed from the PDF document. This is the default for all compression profiles. |
Set |
OptimizePages |
A flag indicating that the internal page tree of the PDF document should be optimized. This is the default for all compression profiles. |
Set |
RemoveAlternateImages |
A flag indicating that alternate images should be removed. This is the default for all compression profiles. |
Set |
RemoveSpiderInfo |
A flag indicating that spider info should be removed. This is the default for all compression profiles. |
Set |
OptimizeContent |
A flag indicating that the content streams of all pages should be optimized and unneeded commands removed. This is the default for all compression profiles. |
Set |
OptimizeStreams |
A flag indicating that streams in general should be checked to see whether they can be reduced losslessly by re-saving them with tuned saving parameters. |
Set |
Cancel |
Compression can take a long time. Set the property Cancel=1 to indicate that the running compression should be canceled. This property is thread-safe and can be set from a different thread. |
Set |
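The documented default for Threads (available CPUs minus 1) and the rule that 0 or 1 disables threading can be illustrated as follows. This is only a sketch of the arithmetic under the assumption that "available CPUs" corresponds to what the operating system reports; the function names are hypothetical and the library may determine the value differently.

```python
import os


def default_compression_threads() -> int:
    """Return the documented default: number of available CPUs minus 1."""
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return cpus - 1


def threading_enabled(threads: int) -> bool:
    """Per the description above, a value of 0 or 1 disables threading."""
    return threads > 1


threads = default_compression_threads()
```

On a single-CPU machine this default evaluates to 0, which, per the table, means that threading is disabled.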
Instead of Compress and CompressQuality, the re-encoding of images can also be fine-tuned directly. This fine-tuning is normally not recommended.
Property |
Meaning |
Type |
ReEncodeImages |
A flag indicating that the document images should be re-encoded. |
Set |
ReEncodeImagesOptions |
A JSON object that configures the image re-encoding with the following keys:
|
Set |
4.14. Rasterization of PDF documents
Rasterization of a PDF document means that each page is rendered to one image and the page content of the document is replaced by that image. This can be useful for achieving good compression results in special cases, or for simplifying PDF documents by avoiding rare features in the page content.
Property |
Meaning |
Type |
Rasterize |
A flag indicating that the document should be rasterized. |
Set |
RasterizeOptions |
A JSON object that configures the rasterize command with the following keys:
|
Set |
4.15. Applying image filters on images in PDF
CIB pdfModule can use the image filters that CIB image toolbox provides. Both libraries are required for this feature.
Property |
Meaning |
Type |
ImageFilter |
A flag indicating that all images (except masking images) should be processed by an image filter provided by CIB image toolbox. |
Set |
ImageFilterPageSelection |
Defines the selection of pages that should be processed. |
Set |
CibImageToolboxFilter.* |
* is a placeholder for any CIB image toolbox property that configures the filter to be applied. Simple examples are:
|
Set |
4.16. Form fields
Property |
Meaning |
Type |
FormEditor |
FormEditor is a JSON object with commands that define what should be done with PDF form fields. Currently only the command “Create” is supported. The value of Create should be a JSON array of form field definitions. A form field definition has at least the keys “Type”, “Name” and “Annotation”:
Additionally, text fields can have the keys:
|
Set |
NeedAppearences |
A flag indicating whether NeedAppearences is active. |
Get/Set |
RegenerateFormFieldAppearences |
A flag indicating that the form field appearances should be regenerated from the form field values. |
Set |
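The structure described for FormEditor (a JSON object whose Create key holds an array of form field definitions, each with at least Type, Name and Annotation) can be assembled as sketched below. The helper name is hypothetical, and the concrete values used for Type and Annotation are placeholders only, because the allowed values are not listed in this section.

```python
import json

# Keys that every form field definition must contain, as stated above.
REQUIRED_KEYS = ("Type", "Name", "Annotation")


def build_form_editor(field_definitions: list) -> str:
    """Wrap form field definitions in a 'Create' command and serialize it."""
    for definition in field_definitions:
        missing = [key for key in REQUIRED_KEYS if key not in definition]
        if missing:
            raise ValueError(f"form field definition lacks keys: {missing}")
    return json.dumps({"Create": field_definitions})


# Placeholder values: "Text" and the empty Annotation object are assumptions.
form_editor_value = build_form_editor(
    [{"Type": "Text", "Name": "customer_name", "Annotation": {}}]
)
```

The resulting string would be set as the value of the FormEditor property.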
4.17. Extracting or removing images
All images inside a PDF can be extracted with the ExtractImages property.
Property |
Meaning |
Type |
ExtractImages |
The path to which the images of a PDF document should be saved. Example: D:\path\output.png |
Set |
ExtractImagesCallback |
A callback that is used to deliver the images instead of the path defined in ExtractImages. |
Set |
ExtractImagesUserdata |
User data for the ExtractImagesCallback callback. |
Set |
ExtractImagesCombineMasks |
Inside a PDF, images and their mask images are stored as two separate images. When this property is set, the image and its mask image are output as one combined image in PNG format with an alpha channel. |
Set |
RemoveImages |
Removes images from the document. |
Set |
4.18. Handling XFA documents
XFA is a document format that Adobe© included in PDF. With PDF 2.0, XFA was marked as deprecated.
Property |
Meaning |
Type |
HasXFA |
A flag indicating that the PDF document contains an XFA document. |
Get |
XFAExtract |
A flag indicating that an XFA document should be extracted from the PDF document. |
Set |
XFAOutputFilename |
The filename to which the XFA document should be saved. The default is the existing input filename with the extension .xfa. |
Set |
4.19. PDF overlays
You can merge the content of several PDF pages from different PDF documents into one. This way you can, for example, add stationery to a PDF document. Merging is done with the PageContentMerge property.
Property |
Meaning |
Type |
PageContentMerge |
A JSON array of JSON objects that define merge operations. A merge operation is a JSON object with the following keys:
Example: First;All: The first page of the merge source document is merged onto the first page of the document, and the second page of the merge source document is merged onto all other pages of the PDF document. |
Set |
4.20. Importing Text
Note: you need modifying rights to import text into a PDF.
Property |
Meaning |
Type |
HocrInputData |
HOCR data to be merged into the PDF. It can be provided directly as data in XML form, as a filename (in which case the pageIndex values inside the HOCR file need to match the pages of the PDF document), or as a ;-separated list of page descriptions and HOCR files, e.g. “{3};file1.hocr;{4};file2.hocr” |
Set |
FormatSearchablePdfShowText |
A flag indicating that the text should be imported as visible text. The default is invisible. |
Set |
FormatSearchablePdfCreateLayer |
Can be applied instead of FormatSearchablePdfShowText. PDF layers are then generated, which allow the imported text to be switched on or off. |
Set |
FormatSearchablePdfLayerOpacity |
The opacity of the layer. A number between 0 (transparent) and 100 (completely opaque). Default is 50. |
Set |
FormatSearchablePdfLayerTitle |
The layer title, which should be shown in the user interface. Default: “Visualize text” |
Set |
HocrStartIndex |
A number indicating the offset to which the first HOCR page refers. Default is 0. |
Set |
TextMark |
A string with which the imported text is marked. Default: CIB_HOCR |
Set |
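The list form of HocrInputData alternates page descriptions in braces with HOCR filenames, separated by semicolons. The following sketch builds such a value from pairs; the function name is hypothetical, and the page descriptions are passed through unchanged because their full syntax is not described in this section.

```python
def build_hocr_input_data(page_files: list) -> str:
    """Join (page description, HOCR file) pairs into the ;-separated list form."""
    parts = []
    for page_description, hocr_file in page_files:
        parts.append("{" + page_description + "}")
        parts.append(hocr_file)
    return ";".join(parts)


# Reproduces the example from the table above.
hocr_value = build_hocr_input_data([("3", "file1.hocr"), ("4", "file2.hocr")])
print(hocr_value)  # {3};file1.hocr;{4};file2.hocr
```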
4.21. Exporting Text
Options for exporting text from a PDF document
Property |
Meaning |
Type |
TextExtraction |
A flag indicating that text should be extracted from a PDF. By default, only visible text is extracted and saved in Utf16 format. To change this behavior, use the additional options TextFormattingOptions and TextSelectionFilter. |
Set |
FillTextOutput |
A flag indicating if the extracted text is saved in memory (1) or not (0). |
Set |
TextOutputFilename |
Filename for text file, to which the extracted text should be saved. |
Set |
TextFormattingOptions |
Optional: specifies the output format (Utf8, Utf16, Hocr) and enables additional word repositioning. The options are specified as a JSON object. Example: TextFormattingOptions={"OutputFormats":["txt"], "OutputResolution":72, "Options":{"EnableWordSorting":false, "SeparateTextBlocks":false}} If TextFormattingOptions is not set explicitly, the text is saved to the output file in Utf16 format and the word order is the same as in the PDF stream. The following output formats are currently supported:
If EnableWordSorting is set to true, the words in the output file are reordered according to their coordinates in the PDF document. |
Set |
TextSelectionFilter |
Optional: filters the exported text by its visibility (visible/invisible) within a PDF document and also by special content markers (tags). The following groups may be set in any combination within the groups array: |
Set |
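The TextFormattingOptions value shown in the table is a plain JSON object, so it can be generated instead of typed by hand, which avoids quoting mistakes. A minimal sketch follows, using only the keys that appear in the example above; the function name and its defaults are hypothetical.

```python
import json


def build_text_formatting_options(
    output_formats: list,
    resolution: int = 72,
    word_sorting: bool = False,
    separate_blocks: bool = False,
) -> str:
    """Serialize the options using the keys from the documented example."""
    options = {
        "OutputFormats": list(output_formats),
        "OutputResolution": resolution,
        "Options": {
            "EnableWordSorting": word_sorting,
            "SeparateTextBlocks": separate_blocks,
        },
    }
    return json.dumps(options)


# Same content as the example in the table above.
text_options_value = build_text_formatting_options(["txt"])
```

Passing word_sorting=True requests the coordinate-based reordering described for EnableWordSorting.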
4.22. Tracing
You can easily create a trace file by setting this property in your job:
Property |
Meaning |
Type |
TraceFilename |
The filename to which the trace should be written. Activate tracing only to analyze problems, not during regular PDF processing, because trace files can become very large. |
Set |
5. General
5.1. Notice
© Copyright 2019 CIB software GmbH. All rights reserved.
CIB software GmbH reserves all ownership rights to the software offered
and to the associated documentation. Use of the software and of the
associated user manual is subject to the license agreement on which the
software is based. Providing and downloading this document and the software alone
does not transfer any rights of use or reproduction.
No part of this manual may be reproduced or reused in any form
without the written permission of CIB software GmbH. Editing the documentation,
in particular translating it, is likewise not permitted without the permission of
CIB software GmbH. The content of this manual is also protected by copyright
when it is not delivered with software that contains an
end-user license agreement.
CIB pdf brewer, CIB coSys, CIB webdesk, CIB workbench, CIB dialog, CIB merge, CIB view,
CIB format, CIB print, CIB pdf toolbox, CIB pdfModule, CIB image toolbox are either
registered trademarks or trademarks of CIB software GmbH.
Windows is a registered trademark of Microsoft Corporation.
Solaris and Java are trademarks or registered trademarks of Oracle and its
affiliates.
All other brand and product names are trademarks or registered trademarks of
their respective owners.
The content of this manual has been prepared with the greatest care. However, the information in this
manual does not constitute a guarantee of properties of the product.
CIB software GmbH is liable only within the scope of its terms of sale and delivery and
assumes no responsibility for technical inaccuracies and/or omissions.
CIB software GmbH is liable neither for technical or typographical errors and defects in
this manual nor for damages that result directly or indirectly from the delivery, performance and
use of this material.
The information in this manual is subject to change without notice.
If you notice any discrepancies with the statements in this overview
during use, we would be very grateful if you let us know:
CIB software GmbH
Elektrastraße 6a
81925 München
E-Mail: support@cib.de
Tel.: 49 (0)89 / 1 43 60 - 111
Fax: 49 (0)89 / 1 43 60 – 100
Or on the internet:
- Youtube: https://www.youtube.com/user/CIBSoftwareGmbH
- Twitter: https://twitter.com/CIBsoftwareGmbH
5.2. Support
E-Mail: support@cib.de
Tel.: 49 (0)89 / 1 43 60 - 111
Fax: 49 (0)89 / 1 43 60 – 100
5.3. Licensing
This document doesn’t provide any information about how to license this software. Please
contact CIB support or CIB sales department for further information.
5.4. Content of the delivered package
CIB pdf toolbox 2 is delivered as binaries: DLLs (Windows), shared libraries (Unix / Linux), or a WebAssembly file (.wasm).
Component |
Files |
CIB pdf toolbox 2
|
|
Dependent libraries on Linux/Unix |
|
|
|
Dependent libraries on AIX |
|
|
6. Error codes
These are general CIB error codes. Not all of them can be returned by CIB pdfModule. Some are only generated by other CIB modules.
Return value |
Meaning |
20000 |
General error occurred |
20001 |
HOCR reading error |
20002 |
General error during reading an input text |
20003 |
HOCR writing error |
20004 |
General error during executing the image toolbox |
20005 |
General runtime error during |
20006 |
General error during reading pdf COS objects |
20007 |
General other error during parsing a PDF document |
20008 |
License validation failed |
20009 |
PDF validation error |
20010 |
IO Writing error |
20011 |
Password was not accepted for an encrypted input document |
20012 |
Unsupported feature called |
20013 |
An error occurred during the execution of a PDF function. |
20014 |
IO Reading error |
20015 |
General error regarding the job handle passed to the library. |
20016 |
Error during loading dependent library |
20017 |
Error during reading CIB pdf brewer settings |
20018 |
Error during a CIB pdf brewer document conversion |
20019 |
Error for setting a property of the CIB pdf brewer |
20020 |
Error during parsing the content of a property value |
20021 |
Reading error of a ZUGFeRD XML |
20022 |
CIB updator generic error |
20023 |
CIB ai generic error |
20024 |
General IO error |
20025 |
General printing error |
20026 |
Error during executing TextOverlays |
20027 |
CIB pdf brewer API error |
20028 |
Error indicating that a user cancel was invoked by a callback. |
20029 |
Error during loading CIB pdf brewer UI library |
20030 |
Error during reading jpeg files |
20031 |
General encryption error |
20032 |
CIB image toolbox error indicating that an image cannot be segmented into several layers for MRC compression because a given component limit was reached. |
20033 |
Generic error during PDF semantics |
20034 |
Error during processing pdf attachments |
20035 |
Indicates that a segmentation for the purpose of MRC placed all content on the background |
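For logging or user-facing messages, the return values from this table can be mapped to their meanings. The sketch below covers only a few of the codes listed above and is not an official binding; the dictionary and function names are hypothetical, and how the return value is obtained from the library is left out.

```python
# Excerpt of the table above: return value -> meaning.
CIB_ERROR_MEANINGS = {
    20001: "HOCR reading error",
    20010: "IO Writing error",
    20011: "Password was not accepted for an encrypted input document",
    20014: "IO Reading error",
    20028: "Error indicating that a user cancel was invoked by a callback.",
}


def describe_error(code: int) -> str:
    """Return the documented meaning, or a fallback for codes not in the excerpt."""
    return CIB_ERROR_MEANINGS.get(code, f"Unknown CIB error code {code}")


message = describe_error(20011)
```

Extending the dictionary with the remaining rows of the table gives a complete lookup.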