Layout classification


About this task

Train our AI to understand documents. Layout classification means dividing a page into areas such as titles, headers, footers, paragraphs and other elements. These areas are called entities. If the task asks where addresses or tables are located, for example, mark as many of these entities as you can find. Can't find any? No problem. Not all documents have headlines, for example. Just click the "Element does not exist" button and move on. The AI learns from this too.



How do I complete a task?

  1. Carefully review the document.
  2. Draw a rectangle around each requested entity to select it.
  3. Click "OK" to proceed to the next document.
  4. If you are not sure of the answer, click "Skip".
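
For readers who process the results afterwards, the outcome of these steps can be pictured as one record per document. The following is only a minimal sketch under stated assumptions: the names `Frame`, `Annotation`, `ENTITY_TYPES` and the status strings are hypothetical and are not taken from the annotation tool itself.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical entity labels, derived from the entity list in this guide.
ENTITY_TYPES = {
    "headline", "address", "paragraph", "table",
    "footnote", "header", "footer", "margin_text",
}


@dataclass(frozen=True)
class Frame:
    """One rectangle drawn around an entity (pixel coordinates, assumed)."""
    entity: str
    x0: float
    y0: float
    x1: float
    y1: float

    def __post_init__(self) -> None:
        if self.entity not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity}")
        if not (self.x0 < self.x1 and self.y0 < self.y1):
            raise ValueError("frame must have positive width and height")


@dataclass
class Annotation:
    """Result of one task: the frames drawn, or a skip."""
    document_id: str
    entity: str                      # entity type the task asked for (assumed)
    frames: List[Frame] = field(default_factory=list)
    skipped: bool = False            # True if the annotator clicked "Skip"

    def status(self) -> str:
        if self.skipped:
            return "skipped"
        if not self.frames:
            # Corresponds to the "Element does not exist" button.
            return "element_does_not_exist"
        return "annotated"
```

In this sketch, a document without the requested entity is still a valid result: an empty frame list is recorded as "element_does_not_exist" rather than as an error.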



Which entities can be found in a document?

  • Headlines:

A headline is generally the shortest, most concise designation possible for a text or a section of text. It can also be a subheading or span several lines. Mark a multi-line headline with a single frame. Mark two separate headlines with two separate frames.

         Example:

                Accessibility in digitalisation

In an e-mail, the subject counts as a headline only if it appears in the text body.




  • Addresses: 

An address consists of a personal or company name, street, house number, postal code and town. In a DIN-format letter it is usually located at the top left, where it may also include the return address. Some documents contain several addresses.

         Example:

                CIB software GmbH
                Elektrastraße 6a
                81925 München

A commercial register entry in a footer does not count as an address.




  • Paragraphs: 

A paragraph is a section of continuous text consisting of one or more sentences. It usually has its own context or even its own topic. Once this idea has been fully developed, a new paragraph begins. Paragraphs can be framed individually. Paragraphs stacked directly above one another with approximately the same width can also be framed as a group.

         Example:

Dear Ms. ____,

Thank you very much for the invitation to the job interview and the chance to introduce myself to you once again in person. I found our conversation very pleasant and have rarely experienced such a collegial working atmosphere as with you. When asked about ____ I was a little unsure at first, but now I am sure that I will be able to support you in future projects, especially with my experience as ____.

- Tables and multi-level enumerations are not paragraphs.

- The closing of a letter counts as a paragraph.
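
The grouping rule for paragraphs (stacked directly above one another, approximately the same width) can be sketched in code. This is only an illustration under assumptions: the function names, the 10 % width tolerance and the 20-unit maximum gap are hypothetical choices, not values prescribed by this guide.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) with y growing downward, as in most image coordinates.
Box = Tuple[float, float, float, float]


def similar_width(a: Box, b: Box, tol: float) -> bool:
    """True if the two widths differ by at most `tol` of the larger one."""
    wa, wb = a[2] - a[0], b[2] - b[0]
    return abs(wa - wb) <= tol * max(wa, wb)


def group_stacked(boxes: List[Box], width_tol: float = 0.1,
                  max_gap: float = 20.0) -> List[Box]:
    """Merge vertically adjacent paragraph frames of similar width."""
    groups: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # top to bottom
        if groups:
            prev = groups[-1][-1]
            gap = box[1] - prev[3]                 # vertical distance
            overlaps = min(box[2], prev[2]) > max(box[0], prev[0])
            if (0 <= gap <= max_gap and overlaps
                    and similar_width(prev, box, width_tol)):
                groups[-1].append(box)
                continue
        groups.append([box])
    # Replace each group by the single frame that encloses all its members.
    return [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups
    ]
```

The sketch compares each frame only with the one directly above it in reading order, so it suits single-column pages; multi-column layouts would need a per-column pass first.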




  • Tables:

A table is an arrangement of data structured in rows and columns. Sometimes it has visible borders.

         Example:

       

- Receipts and invoices can contain tables.

       




  • Footnotes:

A footnote is an additional piece of information or a source reference at the bottom of a page or at the end of a text. It is linked to the corresponding place in the continuous text by a superscript number or a special symbol, such as an asterisk or a dagger (†).

        Example:

 

                This part of the text is called continuous text.¹ Footnotes are listed below the continuous text.²

                _____________________

                ¹ Cf. Fischer, Tobias/Luca Hofmann: Fußnoten richtig erstellen und formatieren, Amsterdam, Netherlands: Scribbr-Verlag, 2022, p. 92.

                ² Note from the author: The page area below the continuous text is called the footnote apparatus, annotation apparatus or scientific apparatus.




  • Headers, footers and margin texts:
    • Header: Upper area of a page for company data with logo, title or author details.
    • Footer: Bottom area of a page with page numbers, recurring information, bank details and the like.
    • Margin text: Provides additional information in the margin. In business letters, for example, this is where information about a company's board of directors is often found.

       Example:

                Page 1 of 1




Last modified: Monday, 27 November 2023, 12:03 PM