Docling Document
This is an automatically generated API reference for the DoclingDocument type.
doc
Package for models defined by the Document type.
Classes:
- DoclingDocument – DoclingDocument.
- DocumentOrigin – FileSource.
- DocItem – Base type for any element that carries content, can be a leaf node.
- DocItemLabel – DocItemLabel.
- ProvenanceItem – Provenance information for elements extracted from a textual document.
- GroupItem – GroupItem.
- GroupLabel – GroupLabel.
- NodeItem – NodeItem.
- PageItem – PageItem.
- FloatingItem – FloatingItem.
- TextItem – TextItem.
- TableItem – TableItem.
- TableCell – TableCell.
- TableData – BaseTableData.
- TableCellLabel – TableCellLabel.
- KeyValueItem – KeyValueItem.
- SectionHeaderItem – SectionItem.
- PictureItem – PictureItem.
- ImageRef – ImageRef.
- PictureClassificationClass – PictureClassificationData.
- PictureClassificationData – PictureClassificationData.
- RefItem – RefItem.
- BoundingBox – BoundingBox.
- CoordOrigin – CoordOrigin.
- ImageRefMode – ImageRefMode.
- Size – Size.
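The BoundingBox and CoordOrigin entries listed above appear in the JSON schema as four numeric edges (l, t, r, b) plus a coord_origin that is either TOPLEFT (the default) or BOTTOMLEFT. The sketch below illustrates what switching between the two origins involves. It is a standalone illustration, not the library's implementation: the Box dataclass and the flip_origin helper are hypothetical names, and the page height is assumed to be known by the caller.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in mirroring the BoundingBox schema fields."""
    l: float
    t: float
    r: float
    b: float
    coord_origin: str = "TOPLEFT"  # schema default


def flip_origin(box: Box, page_height: float) -> Box:
    """Return the same box expressed relative to the other vertical origin.

    Horizontal edges are unchanged; vertical edges are mirrored around
    the page height.
    """
    new_origin = "BOTTOMLEFT" if box.coord_origin == "TOPLEFT" else "TOPLEFT"
    return replace(
        box,
        t=page_height - box.t,
        b=page_height - box.b,
        coord_origin=new_origin,
    )


box = Box(l=10, t=20, r=110, b=70)
flipped = flip_origin(box, page_height=800)
print(flipped)
```

Applying flip_origin twice with the same page height returns the original box, which is a quick sanity check when mixing coordinates from sources that use different origins.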
DoclingDocument
Bases: BaseModel
DoclingDocument.
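Every item definition in the JSON schema for this model constrains its self_ref field with the pattern ^#(?:/([\w-]+)(?:/(\d+))?)?$, that is, a "#" optionally followed by a collection name and an optional numeric index. A minimal sketch of how such a reference could be checked and split using only the Python standard library follows. The helper name parse_self_ref and the sample strings such as "#/texts/3" are illustrative assumptions, not part of the documented API.

```python
import re

# Pattern copied from the "self_ref" property in the JSON schema.
SELF_REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_self_ref(ref: str):
    """Split a reference into (collection, index).

    Returns None when the string does not match the schema pattern.
    Either element of the tuple is None when that part is absent.
    """
    match = SELF_REF_PATTERN.fullmatch(ref)
    if match is None:
        return None
    collection, index = match.groups()
    return collection, (int(index) if index is not None else None)


# Illustrative inputs (hypothetical collection names).
for candidate in ["#/texts/3", "#/body", "#", "texts/3"]:
    print(candidate, "->", parse_self_ref(candidate))
```

fullmatch is used so that a trailing newline is rejected, which is stricter than Python's default handling of "$".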
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ChartBar": {
"description": "Represents a bar in a bar chart.\n\nAttributes:\n label (str): The label for the bar.\n values (float): The value associated with the bar.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"values": {
"title": "Values",
"type": "number"
}
},
"required": [
"label",
"values"
],
"title": "ChartBar",
"type": "object"
},
"ChartLine": {
"description": "Represents a line in a line chart.\n\nAttributes:\n label (str): The label for the line.\n values (list[tuple[float, float]]): A list of (x, y) coordinate pairs\n representing the line's data points.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"values": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"type": "array"
},
"title": "Values",
"type": "array"
}
},
"required": [
"label",
"values"
],
"title": "ChartLine",
"type": "object"
},
"ChartPoint": {
"description": "Represents a point in a scatter chart.\n\nAttributes:\n value (Tuple[float, float]): A (x, y) coordinate pair representing a point in a\n chart.",
"properties": {
"value": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"title": "Value",
"type": "array"
}
},
"required": [
"value"
],
"title": "ChartPoint",
"type": "object"
},
"ChartSlice": {
"description": "Represents a slice in a pie chart.\n\nAttributes:\n label (str): The label for the slice.\n value (float): The value represented by the slice.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"value": {
"title": "Value",
"type": "number"
}
},
"required": [
"label",
"value"
],
"title": "ChartSlice",
"type": "object"
},
"ChartStackedBar": {
"description": "Represents a stacked bar in a stacked bar chart.\n\nAttributes:\n label (list[str]): The labels for the stacked bars. Multiple values are stored\n in cases where the chart is \"double stacked,\" meaning bars are stacked both\n horizontally and vertically.\n values (list[tuple[str, int]]): A list of values representing different segments\n of the stacked bar along with their label.",
"properties": {
"label": {
"items": {
"type": "string"
},
"title": "Label",
"type": "array"
},
"values": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "string"
},
{
"type": "integer"
}
],
"type": "array"
},
"title": "Values",
"type": "array"
}
},
"required": [
"label",
"values"
],
"title": "ChartStackedBar",
"type": "object"
},
"CodeItem": {
"additionalProperties": false,
"description": "CodeItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "code",
"default": "code",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"code_language": {
"$ref": "#/$defs/CodeLanguageLabel",
"default": "unknown"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "CodeItem",
"type": "object"
},
"CodeLanguageLabel": {
"description": "CodeLanguageLabel.",
"enum": [
"Ada",
"Awk",
"Bash",
"bc",
"C",
"C#",
"C++",
"CMake",
"COBOL",
"CSS",
"Ceylon",
"Clojure",
"Crystal",
"Cuda",
"Cython",
"D",
"Dart",
"dc",
"Dockerfile",
"DocLang",
"Elixir",
"Erlang",
"FORTRAN",
"Forth",
"Go",
"HTML",
"Haskell",
"Haxe",
"Java",
"JavaScript",
"JSON",
"Julia",
"Kotlin",
"Latex",
"Lisp",
"Lua",
"Matlab",
"MoonScript",
"Nim",
"OCaml",
"ObjectiveC",
"Octave",
"PHP",
"Pascal",
"Perl",
"Prolog",
"Python",
"Racket",
"Ruby",
"Rust",
"SML",
"SQL",
"Scala",
"Scheme",
"Swift",
"Tikz",
"TypeScript",
"unknown",
"VisualBasic",
"XML",
"YAML"
],
"title": "CodeLanguageLabel",
"type": "string"
},
"CodeMetaField": {
"additionalProperties": true,
"description": "Code representation for the respective item.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/CodeLanguageLabel"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"text"
],
"title": "CodeMetaField",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DescriptionAnnotation": {
"description": "DescriptionAnnotation.",
"properties": {
"kind": {
"const": "description",
"default": "description",
"title": "Kind",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
}
},
"required": [
"text",
"provenance"
],
"title": "DescriptionAnnotation",
"type": "object"
},
"DescriptionMetaField": {
"additionalProperties": true,
"description": "Description metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "DescriptionMetaField",
"type": "object"
},
"DocumentOrigin": {
"description": "FileSource.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"binary_hash": {
"maximum": 18446744073709551615,
"minimum": 0,
"title": "Binary Hash",
"type": "integer"
},
"filename": {
"title": "Filename",
"type": "string"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Uri"
}
},
"required": [
"mimetype",
"binary_hash",
"filename"
],
"title": "DocumentOrigin",
"type": "object"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FieldHeadingItem": {
"additionalProperties": false,
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "field_heading",
"default": "field_heading",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"level": {
"default": 1,
"maximum": 100,
"minimum": 1,
"title": "Level",
"type": "integer"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "FieldHeadingItem",
"type": "object"
},
"FieldItem": {
"additionalProperties": false,
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "field_item",
"default": "field_item",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
}
},
"required": [
"self_ref"
],
"title": "FieldItem",
"type": "object"
},
"FieldRegionItem": {
"additionalProperties": false,
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "field_region",
"default": "field_region",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
}
},
"required": [
"self_ref"
],
"title": "FieldRegionItem",
"type": "object"
},
"FieldValueItem": {
"additionalProperties": false,
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "field_value",
"default": "field_value",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"kind": {
"default": "read_only",
"enum": [
"read_only",
"fillable"
],
"title": "Kind",
"type": "string"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "FieldValueItem",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"FloatingMeta": {
"additionalProperties": true,
"description": "Metadata model for floating.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "FloatingMeta",
"type": "object"
},
"FormItem": {
"additionalProperties": false,
"description": "FormItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "form",
"default": "form",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"graph": {
"$ref": "#/$defs/GraphData"
}
},
"required": [
"self_ref",
"graph"
],
"title": "FormItem",
"type": "object"
},
"Formatting": {
"description": "Formatting.",
"properties": {
"bold": {
"default": false,
"title": "Bold",
"type": "boolean"
},
"italic": {
"default": false,
"title": "Italic",
"type": "boolean"
},
"underline": {
"default": false,
"title": "Underline",
"type": "boolean"
},
"strikethrough": {
"default": false,
"title": "Strikethrough",
"type": "boolean"
},
"script": {
"$ref": "#/$defs/Script",
"default": "baseline"
}
},
"title": "Formatting",
"type": "object"
},
"FormulaItem": {
"additionalProperties": false,
"description": "FormulaItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "formula",
"default": "formula",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "FormulaItem",
"type": "object"
},
"GraphCell": {
"description": "GraphCell.",
"properties": {
"label": {
"$ref": "#/$defs/GraphCellLabel"
},
"cell_id": {
"title": "Cell Id",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"orig": {
"title": "Orig",
"type": "string"
},
"prov": {
"anyOf": [
{
"$ref": "#/$defs/ProvenanceItem"
},
{
"type": "null"
}
],
"default": null
},
"item_ref": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"label",
"cell_id",
"text",
"orig"
],
"title": "GraphCell",
"type": "object"
},
"GraphCellLabel": {
"description": "GraphCellLabel.",
"enum": [
"unspecified",
"key",
"value",
"checkbox"
],
"title": "GraphCellLabel",
"type": "string"
},
"GraphData": {
"description": "GraphData.",
"properties": {
"cells": {
"items": {
"$ref": "#/$defs/GraphCell"
},
"title": "Cells",
"type": "array"
},
"links": {
"items": {
"$ref": "#/$defs/GraphLink"
},
"title": "Links",
"type": "array"
}
},
"title": "GraphData",
"type": "object"
},
"GraphLink": {
"description": "GraphLink.",
"properties": {
"label": {
"$ref": "#/$defs/GraphLinkLabel"
},
"source_cell_id": {
"title": "Source Cell Id",
"type": "integer"
},
"target_cell_id": {
"title": "Target Cell Id",
"type": "integer"
}
},
"required": [
"label",
"source_cell_id",
"target_cell_id"
],
"title": "GraphLink",
"type": "object"
},
"GraphLinkLabel": {
"description": "GraphLinkLabel.",
"enum": [
"unspecified",
"to_value",
"to_key",
"to_parent",
"to_child"
],
"title": "GraphLinkLabel",
"type": "string"
},
"GroupItem": {
"additionalProperties": false,
"description": "GroupItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"name": {
"default": "group",
"title": "Name",
"type": "string"
},
"label": {
"$ref": "#/$defs/GroupLabel",
"default": "unspecified"
}
},
"required": [
"self_ref"
],
"title": "GroupItem",
"type": "object"
},
"GroupLabel": {
"description": "GroupLabel.",
"enum": [
"unspecified",
"list",
"ordered_list",
"chapter",
"section",
"sheet",
"slide",
"form_area",
"key_value_area",
"comment_section",
"inline",
"picture_area"
],
"title": "GroupLabel",
"type": "string"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"InlineGroup": {
"additionalProperties": false,
"description": "InlineGroup.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"name": {
"default": "group",
"title": "Name",
"type": "string"
},
"label": {
"const": "inline",
"default": "inline",
"title": "Label",
"type": "string"
}
},
"required": [
"self_ref"
],
"title": "InlineGroup",
"type": "object"
},
"KeyValueItem": {
"additionalProperties": false,
"description": "KeyValueItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "key_value_region",
"default": "key_value_region",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"graph": {
"$ref": "#/$defs/GraphData"
}
},
"required": [
"self_ref",
"graph"
],
"title": "KeyValueItem",
"type": "object"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ListGroup": {
"additionalProperties": false,
"description": "ListGroup.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"name": {
"default": "group",
"title": "Name",
"type": "string"
},
"label": {
"const": "list",
"default": "list",
"title": "Label",
"type": "string"
}
},
"required": [
"self_ref"
],
"title": "ListGroup",
"type": "object"
},
"ListItem": {
"additionalProperties": false,
"description": "SectionItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "list_item",
"default": "list_item",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"enumerated": {
"default": false,
"title": "Enumerated",
"type": "boolean"
},
"marker": {
"default": "-",
"title": "Marker",
"type": "string"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "ListItem",
"type": "object"
},
"MiscAnnotation": {
"description": "MiscAnnotation.",
"properties": {
"kind": {
"const": "misc",
"default": "misc",
"title": "Kind",
"type": "string"
},
"content": {
"additionalProperties": true,
"title": "Content",
"type": "object"
}
},
"required": [
"content"
],
"title": "MiscAnnotation",
"type": "object"
},
"MoleculeMetaField": {
"additionalProperties": true,
"description": "Molecule metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"smi": {
"description": "The SMILES representation of the molecule.",
"title": "Smi",
"type": "string"
}
},
"required": [
"smi"
],
"title": "MoleculeMetaField",
"type": "object"
},
"Orientation": {
"description": "Counter-clockwise rotation of a table on the page, in degrees.\n\nFollows the convention used by PIL/Pillow's ``Image.rotate``: positive\nangles rotate the table counter-clockwise. ``ROT_0`` / ``ROT_180`` keep\nrows running horizontally on the page; ``ROT_90`` / ``ROT_270`` turn\nrows into vertical stripes.",
"enum": [
"rot_0",
"rot_90",
"rot_180",
"rot_270"
],
"title": "Orientation",
"type": "string"
},
"PageItem": {
"description": "PageItem.",
"properties": {
"size": {
"$ref": "#/$defs/Size"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"page_no": {
"title": "Page No",
"type": "integer"
}
},
"required": [
"size",
"page_no"
],
"title": "PageItem",
"type": "object"
},
"PictureBarChartData": {
"description": "Represents data of a bar chart.\n\nAttributes:\n kind (Literal[\"bar_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n bars (list[ChartBar]): A list of bars in the chart.",
"properties": {
"kind": {
"const": "bar_chart_data",
"default": "bar_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"bars": {
"items": {
"$ref": "#/$defs/ChartBar"
},
"title": "Bars",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"bars"
],
"title": "PictureBarChartData",
"type": "object"
},
"PictureClassificationClass": {
"description": "PictureClassificationData.",
"properties": {
"class_name": {
"title": "Class Name",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
}
},
"required": [
"class_name",
"confidence"
],
"title": "PictureClassificationClass",
"type": "object"
},
"PictureClassificationData": {
"description": "PictureClassificationData.",
"properties": {
"kind": {
"const": "classification",
"default": "classification",
"title": "Kind",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
},
"predicted_classes": {
"items": {
"$ref": "#/$defs/PictureClassificationClass"
},
"title": "Predicted Classes",
"type": "array"
}
},
"required": [
"provenance",
"predicted_classes"
],
"title": "PictureClassificationData",
"type": "object"
},
"PictureClassificationMetaField": {
"additionalProperties": true,
"description": "Picture classification metadata field.",
"properties": {
"predictions": {
"items": {
"$ref": "#/$defs/PictureClassificationPrediction"
},
"minItems": 1,
"title": "Predictions",
"type": "array"
}
},
"title": "PictureClassificationMetaField",
"type": "object"
},
"PictureClassificationPrediction": {
"additionalProperties": true,
"description": "Picture classification instance.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"class_name": {
"title": "Class Name",
"type": "string"
}
},
"required": [
"class_name"
],
"title": "PictureClassificationPrediction",
"type": "object"
},
"PictureItem": {
"additionalProperties": false,
"description": "PictureItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/PictureMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"default": "picture",
"enum": [
"picture",
"chart"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"annotations": {
"default": [],
"deprecated": true,
"items": {
"discriminator": {
"mapping": {
"bar_chart_data": "#/$defs/PictureBarChartData",
"classification": "#/$defs/PictureClassificationData",
"description": "#/$defs/DescriptionAnnotation",
"line_chart_data": "#/$defs/PictureLineChartData",
"misc": "#/$defs/MiscAnnotation",
"molecule_data": "#/$defs/PictureMoleculeData",
"pie_chart_data": "#/$defs/PicturePieChartData",
"scatter_chart_data": "#/$defs/PictureScatterChartData",
"stacked_bar_chart_data": "#/$defs/PictureStackedBarChartData",
"tabular_chart_data": "#/$defs/PictureTabularChartData"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/DescriptionAnnotation"
},
{
"$ref": "#/$defs/MiscAnnotation"
},
{
"$ref": "#/$defs/PictureClassificationData"
},
{
"$ref": "#/$defs/PictureMoleculeData"
},
{
"$ref": "#/$defs/PictureTabularChartData"
},
{
"$ref": "#/$defs/PictureLineChartData"
},
{
"$ref": "#/$defs/PictureBarChartData"
},
{
"$ref": "#/$defs/PictureStackedBarChartData"
},
{
"$ref": "#/$defs/PicturePieChartData"
},
{
"$ref": "#/$defs/PictureScatterChartData"
}
]
},
"title": "Annotations",
"type": "array"
}
},
"required": [
"self_ref"
],
"title": "PictureItem",
"type": "object"
},
"PictureLineChartData": {
"description": "Represents data of a line chart.\n\nAttributes:\n kind (Literal[\"line_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n lines (list[ChartLine]): A list of lines in the chart.",
"properties": {
"kind": {
"const": "line_chart_data",
"default": "line_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"lines": {
"items": {
"$ref": "#/$defs/ChartLine"
},
"title": "Lines",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"lines"
],
"title": "PictureLineChartData",
"type": "object"
},
"PictureMeta": {
"additionalProperties": true,
"description": "Metadata model for pictures.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
},
"classification": {
"anyOf": [
{
"$ref": "#/$defs/PictureClassificationMetaField"
},
{
"type": "null"
}
],
"default": null
},
"molecule": {
"anyOf": [
{
"$ref": "#/$defs/MoleculeMetaField"
},
{
"type": "null"
}
],
"default": null
},
"tabular_chart": {
"anyOf": [
{
"$ref": "#/$defs/TabularChartMetaField"
},
{
"type": "null"
}
],
"default": null
},
"code": {
"anyOf": [
{
"$ref": "#/$defs/CodeMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "PictureMeta",
"type": "object"
},
"PictureMoleculeData": {
"description": "PictureMoleculeData.",
"properties": {
"kind": {
"const": "molecule_data",
"default": "molecule_data",
"title": "Kind",
"type": "string"
},
"smi": {
"title": "Smi",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
},
"class_name": {
"title": "Class Name",
"type": "string"
},
"segmentation": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"type": "array"
},
"title": "Segmentation",
"type": "array"
},
"provenance": {
"title": "Provenance",
"type": "string"
}
},
"required": [
"smi",
"confidence",
"class_name",
"segmentation",
"provenance"
],
"title": "PictureMoleculeData",
"type": "object"
},
"PicturePieChartData": {
"description": "Represents data of a pie chart.\n\nAttributes:\n kind (Literal[\"pie_chart_data\"]): The type of the chart.\n slices (list[ChartSlice]): A list of slices in the pie chart.",
"properties": {
"kind": {
"const": "pie_chart_data",
"default": "pie_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"slices": {
"items": {
"$ref": "#/$defs/ChartSlice"
},
"title": "Slices",
"type": "array"
}
},
"required": [
"title",
"slices"
],
"title": "PicturePieChartData",
"type": "object"
},
"PictureScatterChartData": {
"description": "Represents data of a scatter chart.\n\nAttributes:\n kind (Literal[\"scatter_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n points (list[ChartPoint]): A list of points in the scatter chart.",
"properties": {
"kind": {
"const": "scatter_chart_data",
"default": "scatter_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"points": {
"items": {
"$ref": "#/$defs/ChartPoint"
},
"title": "Points",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"points"
],
"title": "PictureScatterChartData",
"type": "object"
},
"PictureStackedBarChartData": {
"description": "Represents data of a stacked bar chart.\n\nAttributes:\n kind (Literal[\"stacked_bar_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n stacked_bars (list[ChartStackedBar]): A list of stacked bars in the chart.",
"properties": {
"kind": {
"const": "stacked_bar_chart_data",
"default": "stacked_bar_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"stacked_bars": {
"items": {
"$ref": "#/$defs/ChartStackedBar"
},
"title": "Stacked Bars",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"stacked_bars"
],
"title": "PictureStackedBarChartData",
"type": "object"
},
"PictureTabularChartData": {
"description": "Base class for picture chart data.\n\nAttributes:\n title (str): The title of the chart.\n chart_data (TableData): Chart data in the table format.",
"properties": {
"kind": {
"const": "tabular_chart_data",
"default": "tabular_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"chart_data": {
"$ref": "#/$defs/TableData"
}
},
"required": [
"title",
"chart_data"
],
"title": "PictureTabularChartData",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"RichTableCell": {
"description": "RichTableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
},
"ref": {
"$ref": "#/$defs/RefItem"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text",
"ref"
],
"title": "RichTableCell",
"type": "object"
},
"Script": {
"description": "Text script position.",
"enum": [
"baseline",
"sub",
"super"
],
"title": "Script",
"type": "string"
},
"SectionHeaderItem": {
"additionalProperties": false,
"description": "SectionItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "section_header",
"default": "section_header",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"level": {
"default": 1,
"maximum": 100,
"minimum": 1,
"title": "Level",
"type": "integer"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "SectionHeaderItem",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TableCell": {
"description": "TableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text"
],
"title": "TableCell",
"type": "object"
},
"TableData": {
"description": "BaseTableData.",
"properties": {
"table_cells": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/RichTableCell"
},
{
"$ref": "#/$defs/TableCell"
}
]
},
"title": "Table Cells",
"type": "array"
},
"num_rows": {
"default": 0,
"title": "Num Rows",
"type": "integer"
},
"num_cols": {
"default": 0,
"title": "Num Cols",
"type": "integer"
},
"orientation": {
"$ref": "#/$defs/Orientation",
"default": "rot_0"
}
},
"title": "TableData",
"type": "object"
},
"TableItem": {
"additionalProperties": false,
"description": "TableItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"default": "table",
"enum": [
"document_index",
"table"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"data": {
"$ref": "#/$defs/TableData"
},
"annotations": {
"default": [],
"deprecated": true,
"items": {
"discriminator": {
"mapping": {
"description": "#/$defs/DescriptionAnnotation",
"misc": "#/$defs/MiscAnnotation"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/DescriptionAnnotation"
},
{
"$ref": "#/$defs/MiscAnnotation"
}
]
},
"title": "Annotations",
"type": "array"
}
},
"required": [
"self_ref",
"data"
],
"title": "TableItem",
"type": "object"
},
"TabularChartMetaField": {
"additionalProperties": true,
"description": "Tabular chart metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"title": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Title"
},
"chart_data": {
"$ref": "#/$defs/TableData"
}
},
"required": [
"chart_data"
],
"title": "TabularChartMetaField",
"type": "object"
},
"TextItem": {
"additionalProperties": false,
"description": "TextItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"enum": [
"caption",
"checkbox_selected",
"checkbox_unselected",
"footnote",
"page_footer",
"page_header",
"paragraph",
"reference",
"text",
"empty_value",
"field_key",
"field_hint",
"marker",
"handwritten_text"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
}
},
"required": [
"self_ref",
"label",
"orig",
"text"
],
"title": "TextItem",
"type": "object"
},
"TitleItem": {
"additionalProperties": false,
"description": "TitleItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "title",
"default": "title",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "TitleItem",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"description": "DoclingDocument.",
"properties": {
"schema_name": {
"const": "DoclingDocument",
"default": "DoclingDocument",
"title": "Schema Name",
"type": "string"
},
"version": {
"default": "1.10.0",
"pattern": "^(?P<major>0|[1-9]\\d*)\\.(?P<minor>0|[1-9]\\d*)\\.(?P<patch>0|[1-9]\\d*)(?:-(?P<prerelease>(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?$",
"title": "Version",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"origin": {
"anyOf": [
{
"$ref": "#/$defs/DocumentOrigin"
},
{
"type": "null"
}
],
"default": null
},
"furniture": {
"$ref": "#/$defs/GroupItem",
"default": {
"self_ref": "#/furniture",
"parent": null,
"children": [],
"content_layer": "furniture",
"meta": null,
"name": "_root_",
"label": "unspecified"
},
"deprecated": true
},
"body": {
"$ref": "#/$defs/GroupItem",
"default": {
"self_ref": "#/body",
"parent": null,
"children": [],
"content_layer": "body",
"meta": null,
"name": "_root_",
"label": "unspecified"
}
},
"groups": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/ListGroup"
},
{
"$ref": "#/$defs/InlineGroup"
},
{
"$ref": "#/$defs/GroupItem"
}
]
},
"title": "Groups",
"type": "array"
},
"texts": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/TitleItem"
},
{
"$ref": "#/$defs/SectionHeaderItem"
},
{
"$ref": "#/$defs/ListItem"
},
{
"$ref": "#/$defs/CodeItem"
},
{
"$ref": "#/$defs/FormulaItem"
},
{
"$ref": "#/$defs/FieldHeadingItem"
},
{
"$ref": "#/$defs/FieldValueItem"
},
{
"$ref": "#/$defs/TextItem"
}
]
},
"title": "Texts",
"type": "array"
},
"pictures": {
"default": [],
"items": {
"$ref": "#/$defs/PictureItem"
},
"title": "Pictures",
"type": "array"
},
"tables": {
"default": [],
"items": {
"$ref": "#/$defs/TableItem"
},
"title": "Tables",
"type": "array"
},
"key_value_items": {
"default": [],
"items": {
"$ref": "#/$defs/KeyValueItem"
},
"title": "Key Value Items",
"type": "array"
},
"form_items": {
"default": [],
"items": {
"$ref": "#/$defs/FormItem"
},
"title": "Form Items",
"type": "array"
},
"field_regions": {
"default": [],
"items": {
"$ref": "#/$defs/FieldRegionItem"
},
"title": "Field Regions",
"type": "array"
},
"field_items": {
"default": [],
"items": {
"$ref": "#/$defs/FieldItem"
},
"title": "Field Items",
"type": "array"
},
"pages": {
"additionalProperties": {
"$ref": "#/$defs/PageItem"
},
"default": {},
"title": "Pages",
"type": "object"
}
},
"required": [
"name"
],
"title": "DoclingDocument",
"type": "object"
}
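Every `self_ref` and `$ref` in the schema above shares the same pattern, `^#(?:/([\w-]+)(?:/(\d+))?)?$`. The following is a minimal sketch of how such a reference could be parsed and looked up in a document loaded as a plain dictionary. The helpers `parse_ref` and `resolve_ref` are hypothetical names introduced for illustration, and the lookup assumes that the first path component names a top-level field (such as `texts` or `body`) and the second is a list index into it.

```python
import re
from typing import Any, Optional

# Pattern copied from the `self_ref` / `$ref` properties of the schema.
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_ref(ref: str) -> tuple[Optional[str], Optional[int]]:
    """Split a reference into (field name, index); either part may be None."""
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"not a valid reference: {ref!r}")
    name, index = match.groups()
    return name, int(index) if index is not None else None


def resolve_ref(doc: dict[str, Any], ref: str) -> Any:
    """Hypothetical helper: look a reference up in a plain-dict document."""
    name, index = parse_ref(ref)
    if name is None:
        return doc  # a bare "#" is taken to mean the document itself
    target = doc[name]
    return target if index is None else target[index]


# A tiny hand-written document fragment, for illustration only.
doc = {
    "name": "example",
    "body": {"self_ref": "#/body", "children": [{"$ref": "#/texts/0"}]},
    "texts": [
        {"self_ref": "#/texts/0", "label": "text", "orig": "Hi", "text": "Hi"}
    ],
}

first_child = doc["body"]["children"][0]["$ref"]
print(parse_ref(first_child))                 # ('texts', 0)
print(resolve_ref(doc, first_child)["text"])  # Hi
```

The sketch works on raw JSON data only; it does not use or replace any lookup logic of the library itself.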
Fields:
-
schema_name (Literal['DoclingDocument'])
-
version (str)
-
name (str)
-
origin (Optional[DocumentOrigin])
-
furniture (GroupItem)
-
body (GroupItem)
-
groups (list[Union[ListGroup, InlineGroup, GroupItem]])
-
texts (list[Union[TitleItem, SectionHeaderItem, ListItem, CodeItem, FormulaItem, FieldHeadingItem, FieldValueItem, TextItem]])
-
pictures (list[PictureItem])
-
tables (list[TableItem])
-
key_value_items (list[KeyValueItem])
-
form_items (list[FormItem])
-
field_regions (list[FieldRegionItem])
-
field_items (list[FieldItem])
-
pages (dict[int, PageItem])
Validators:
-
transform_to_content_layer
-
check_version_is_compatible → version
-
_validate_unique_refs
-
validate_document
-
validate_misplaced_list_items
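The `pages` field is typed as `dict[int, PageItem]`, while the JSON schema models it as an object, and JSON object keys are always strings. The sketch below shows what that means when a serialized document is read with the standard `json` module instead of the model. The JSON payload is a hand-written fragment for illustration, not real converter output.

```python
import json

# Hand-written fragment containing only the fields needed for the example.
raw = (
    '{"name": "example", "pages": {"1": {"size": '
    '{"width": 612.0, "height": 792.0}, "page_no": 1}}}'
)

data = json.loads(raw)

# After json.loads the page keys are strings, so convert them to integers
# to mirror the dict[int, PageItem] typing of the field.
pages = {int(key): page for key, page in data["pages"].items()}

print(list(data["pages"]))        # ['1']
print(pages[1]["size"]["width"])  # 612.0
```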
field_items
field_items: list[FieldItem] = []
field_regions
field_regions: list[FieldRegionItem] = []
form_items
form_items: list[FormItem] = []
name
name: str
schema_name
schema_name: Literal['DoclingDocument'] = 'DoclingDocument'
texts
texts: list[
Union[
TitleItem,
SectionHeaderItem,
ListItem,
CodeItem,
FormulaItem,
FieldHeadingItem,
FieldValueItem,
TextItem,
]
] = []
version
version: str = CURRENT_VERSION
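The schema constrains `version` with a semantic-versioning pattern that uses named groups. As a minimal sketch, the pattern can be reused to split a version string into comparable integers. `parse_version` is a hypothetical helper, and nothing here is meant to describe how `check_version_is_compatible` is implemented.

```python
import re

# Pattern copied from the `version` property of the schema.
VERSION_PATTERN = re.compile(
    r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def parse_version(version: str) -> tuple[int, int, int]:
    """Hypothetical helper: return (major, minor, patch) as integers."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a valid version string: {version!r}")
    return int(match["major"]), int(match["minor"]), int(match["patch"])


print(parse_version("1.10.0"))  # (1, 10, 0)

# Integer tuples order versions numerically; plain strings do not.
print(parse_version("1.10.0") > parse_version("1.9.0"))  # True
print("1.10.0" > "1.9.0")                                # False
```

Comparing the parsed tuples avoids the lexicographic trap shown in the last line, where the string `"1.10.0"` sorts before `"1.9.0"`.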
add_code
add_code(
text: str,
code_language: Optional[CodeLanguageLabel] = None,
orig: Optional[str] = None,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_code.
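:param text: str: :param code_language: Optional[CodeLanguageLabel]: (Default value = None) :param orig: Optional[str]: (Default value = None) :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)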
add_comment
add_comment(
*,
text: str,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
targets: Optional[
list[
Union[DocItem, tuple[DocItem, tuple[int, int]]]
]
] = None,
)
Adds a comment to the document, assigning it to the given targets.
:param text: str: :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param targets: list[Union[DocItem, tuple[DocItem, tuple[int, int]]]]: (Default value = None) Each list element can be either a single DocItem or a tuple of a DocItem and a span range (start_inclusive, end_exclusive).
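The half-open span convention (start_inclusive, end_exclusive) used by targets can be pictured with an ordinary string slice. This is a minimal sketch and not docling-core code: span_text is a hypothetical helper, and it assumes the span indexes characters of the target item's text, which the docstring above does not state explicitly.

```python
def span_text(text: str, span: tuple[int, int]) -> str:
    """Return the characters covered by a (start_inclusive, end_exclusive) span.

    Hypothetical helper for illustration; assumes character offsets.
    """
    start, end = span
    if not 0 <= start <= end <= len(text):
        raise ValueError(f"span {span} is out of range for text of length {len(text)}")
    # A half-open span maps directly onto Python slice semantics.
    return text[start:end]


item_text = "Docling converts documents."
print(span_text(item_text, (0, 7)))
print(span_text(item_text, (8, 16)))
```

With a half-open span, the length of the covered text is simply end minus start, and adjacent spans such as (0, 7) and (7, 8) do not overlap.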
add_document
add_document(
doc: DoclingDocument, parent: Optional[NodeItem] = None
) -> None
Adds the content from the body of a DoclingDocument to this document under a specific parent.
:param doc: DoclingDocument: The document whose content will be added :param parent: Optional[NodeItem]: The parent NodeItem under which new items are added (Default value = None)
:returns: None
add_field_heading
add_field_heading(
text: str,
orig: Optional[str] = None,
level: LevelNumber = 1,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_field_heading.
:param text: str: :param orig: Optional[str]: (Default value = None) :param level: LevelNumber: (Default value = 1) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
add_field_hint
add_field_hint(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_field_hint.
:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
add_field_item
add_field_item(
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> FieldItem
add_field_item.
add_field_key
add_field_key(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_field_key.
:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
add_field_region
add_field_region(
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
) -> FieldRegionItem
add_field_region.
:param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None)
add_field_value
add_field_value(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
kind: Optional[
Literal["read_only", "fillable"]
] = "read_only",
)
add_field_value.
:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param kind: Optional[Literal["read_only", "fillable"]]: (Default value = "read_only")
add_form
add_form(
graph: GraphData,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
)
add_form.
:param graph: GraphData: :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None)
add_formula
add_formula(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_formula.
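:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)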
add_group
add_group(
label: Optional[GroupLabel] = None,
name: Optional[str] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> GroupItem
add_group.
:param label: Optional[GroupLabel]: (Default value = None) :param name: Optional[str]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None)
add_heading
add_heading(
text: str,
orig: Optional[str] = None,
level: LevelNumber = 1,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_heading.
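:param text: str: :param orig: Optional[str]: (Default value = None) :param level: LevelNumber: (Default value = 1) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)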
add_inline_group
add_inline_group(
name: Optional[str] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> InlineGroup
add_inline_group.
add_key_values
add_key_values(
graph: GraphData,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
)
add_key_values.
:param graph: GraphData: :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None)
add_list_group
add_list_group(
name: Optional[str] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> ListGroup
add_list_group.
add_list_item
add_list_item(
text: str,
enumerated: bool = False,
marker: Optional[str] = None,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_list_item.
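:param text: str: :param enumerated: bool: (Default value = False) :param marker: Optional[str]: (Default value = None) :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)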
add_marker
add_marker(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_marker.
:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
add_node_items
add_node_items(
node_items: list[NodeItem],
doc: DoclingDocument,
parent: Optional[NodeItem] = None,
) -> None
Adds multiple NodeItems and their children under a parent in this document.
:param node_items: list[NodeItem]: The NodeItems to be added :param doc: DoclingDocument: The document to which the NodeItems and their children belong :param parent: Optional[NodeItem]: The parent NodeItem under which new items are added (Default value = None)
:returns: None
add_ordered_list
add_ordered_list(
name: Optional[str] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> GroupItem
add_ordered_list.
add_page
add_page.
:param page_no: int: :param size: Size:
add_picture
add_picture(
annotations: Optional[list[PictureDataType]] = None,
image: Optional[ImageRef] = None,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
)
add_picture.
:param annotations: Optional[list[PictureDataType]]: (Default value = None) :param image: Optional[ImageRef]: (Default value = None) :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None)
add_table
add_table(
data: TableData,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
label: DocItemLabel = TABLE,
content_layer: Optional[ContentLayer] = None,
annotations: Optional[list[TableAnnotationType]] = None,
)
add_table.
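:param data: TableData: :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param label: DocItemLabel: (Default value = DocItemLabel.TABLE) :param content_layer: Optional[ContentLayer]: (Default value = None) :param annotations: Optional[list[TableAnnotationType]]: (Default value = None)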
add_table_cell
Add a table cell to the table.
add_text
add_text(
label: DocItemLabel,
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
*,
source: Optional[SourceType] = None,
**kwargs: Any,
)
add_text.
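:param label: DocItemLabel: :param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param source: Optional[SourceType]: (Default value = None) :param kwargs: Any: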
add_title
add_title(
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
)
add_title.
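:param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param parent: Optional[NodeItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)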
add_unordered_list
add_unordered_list(
name: Optional[str] = None,
parent: Optional[NodeItem] = None,
content_layer: Optional[ContentLayer] = None,
) -> GroupItem
add_unordered_list.
append_child_item
Adds an item.
check_version_is_compatible
check_version_is_compatible(v: str) -> str
Check if this document version is compatible with SDK schema version.
concatenate
concatenate(
docs: Sequence[DoclingDocument],
) -> DoclingDocument
Concatenate multiple documents into a single document.
delete_items
delete_items(*, node_items: list[NodeItem]) -> None
Deletes the given NodeItems, together with any children they have.
delete_items_range
delete_items_range(
*,
start: NodeItem,
end: NodeItem,
start_inclusive: bool = True,
end_inclusive: bool = True,
) -> None
Deletes all NodeItems and their children in the range from the start NodeItem to the end NodeItem.
:param start: NodeItem: The starting NodeItem of the range :param end: NodeItem: The ending NodeItem of the range :param start_inclusive: bool: (Default value = True): If True, the start NodeItem will also be deleted :param end_inclusive: bool: (Default value = True): If True, the end NodeItem will also be deleted
:returns: None
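The effect of start_inclusive and end_inclusive can be sketched on a flat list of siblings. This is an illustration only and does not call docling-core: select_range is a hypothetical helper, strings stand in for NodeItems, and it assumes start and end are siblings with start appearing first.

```python
def select_range(
    siblings: list[str],
    start: str,
    end: str,
    start_inclusive: bool = True,
    end_inclusive: bool = True,
) -> list[str]:
    """Return the siblings that a range operation would cover.

    Hypothetical helper; strings stand in for NodeItems.
    """
    i = siblings.index(start)
    j = siblings.index(end)
    if i > j:
        raise ValueError("start must not come after end")
    lo = i if start_inclusive else i + 1   # skip the start item when exclusive
    hi = j + 1 if end_inclusive else j     # stop before the end item when exclusive
    return siblings[lo:hi]


body = ["title", "p1", "p2", "table", "p3"]
print(select_range(body, "p1", "table"))
print(select_range(body, "p1", "table", start_inclusive=False, end_inclusive=False))
```

With both flags left at their default of True, the boundary items are part of the range; setting both to False keeps the boundaries and covers only what lies between them.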
export_to_dict
export_to_dict(
mode: str = "json",
by_alias: bool = True,
exclude_none: bool = True,
coord_precision: Optional[int] = None,
confid_precision: Optional[int] = None,
) -> dict[str, Any]
Export to dict.
export_to_doclang
export_to_doclang() -> str
Export to Doclang.
export_to_doctags
export_to_doctags(
delim: str = "",
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_content: bool = True,
add_page_index: bool = True,
add_table_cell_location: bool = False,
add_table_cell_text: bool = True,
minified: bool = False,
pages: Optional[set[int]] = None,
) -> str
Exports the document content to the DocTags format.
Operates on a slice of the document's body as defined through arguments from_element and to_element; defaulting to the whole document.
:param delim: str: (Default value = "") Deprecated
:param from_element: int: (Default value = 0)
:param to_element: int: (Default value = sys.maxsize)
:param labels: Optional[set[DocItemLabel]]: (Default value = None)
:param xsize: int: (Default value = 500)
:param ysize: int: (Default value = 500)
:param add_location: bool: (Default value = True)
:param add_content: bool: (Default value = True)
:param add_page_index: bool: (Default value = True)
:param add_table_cell_location: bool: (Default value = False) Table-specific flag
:param add_table_cell_text: bool: (Default value = True) Table-specific flag
:param minified: bool: (Default value = False)
:param pages: Optional[set[int]]: (Default value = None)
:returns: The content of the document formatted as a DocTags string.
:rtype: str
export_to_document_tokens
export_to_document_tokens(*args, **kwargs)
Export to DocTags format.
export_to_element_tree
export_to_element_tree() -> str
Export to element tree.
export_to_html
export_to_html(
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
enable_chart_tables: bool = True,
image_mode: ImageRefMode = PLACEHOLDER,
formula_to_mathml: bool = True,
page_no: Optional[int] = None,
html_lang: str = "en",
html_head: str = "null",
included_content_layers: Optional[
set[ContentLayer]
] = None,
split_page_view: bool = False,
include_annotations: bool = True,
) -> str
Serialize to HTML.
export_to_markdown
export_to_markdown(
delim: str = "\n\n",
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
strict_text: bool = False,
escape_html: bool = True,
escape_underscores: bool = True,
image_placeholder: str = "<!-- image -->",
enable_chart_tables: bool = True,
image_mode: ImageRefMode = PLACEHOLDER,
indent: int = 4,
text_width: int = -1,
page_no: Optional[int] = None,
included_content_layers: Optional[
set[ContentLayer]
] = None,
page_break_placeholder: Optional[str] = None,
include_annotations: bool = True,
mark_annotations: bool = False,
compact_tables: bool = False,
traverse_pictures: bool = False,
*,
use_legacy_annotations: Optional[bool] = None,
allowed_meta_names: Optional[set[str]] = None,
blocked_meta_names: Optional[set[str]] = None,
mark_meta: bool = False,
) -> str
Serialize to Markdown.
Operates on a slice of the document's body as defined through arguments from_element and to_element; defaulting to the whole document.
:param delim: Deprecated. :type delim: str = "\n\n"
:param from_element: Body slicing start index (inclusive). (Default value = 0). :type from_element: int = 0
:param to_element: Body slicing stop index (exclusive). (Default value = sys.maxsize). :type to_element: int = sys.maxsize
:param labels: The set of document labels to include in the export. None falls back to the system-defined default. :type labels: Optional[set[DocItemLabel]] = None
:param strict_text: Deprecated. :type strict_text: bool = False
:param escape_html: Whether to escape HTML reserved characters in the text content of the document. (Default value = True). :type escape_html: bool = True
:param escape_underscores: Whether to escape underscores in the text content of the document. (Default value = True). :type escape_underscores: bool = True
:param image_placeholder: The placeholder to include to position images in the markdown. (Default value = "<!-- image -->"). :type image_placeholder: str = "<!-- image -->"
:param image_mode: The mode to use for including images in the markdown. (Default value = ImageRefMode.PLACEHOLDER). :type image_mode: ImageRefMode = ImageRefMode.PLACEHOLDER
:param indent: The indent in spaces of the nested lists. (Default value = 4). :type indent: int = 4
:param included_content_layers: The set of layers to include in the export. None falls back to the system-defined default. :type included_content_layers: Optional[set[ContentLayer]] = None
:param page_break_placeholder: The placeholder to include for marking page breaks. None means no page break placeholder will be used. :type page_break_placeholder: Optional[str] = None
:param include_annotations: Whether to include annotations in the export; only considered if item does not have meta. (Default value = True). :type include_annotations: bool = True
:param mark_annotations: Whether to mark annotations in the export; only considered if item does not have meta. (Default value = False). :type mark_annotations: bool = False
:param compact_tables: Whether to use compact table format without column padding. (Default value = False). :type compact_tables: bool = False
:param traverse_pictures: Whether to traverse into picture items and serialize their text children. Must be set to True for scanned/image-based PDFs processed with full-page OCR, where the layout model places all OCR text as children of a top-level PictureItem. (Default value = False). :type traverse_pictures: bool = False
:param use_legacy_annotations: Deprecated; legacy annotations considered only when meta not present. :type use_legacy_annotations: Optional[bool] = None
:param allowed_meta_names: Meta names to allow; None means all meta names are allowed. :type allowed_meta_names: Optional[set[str]] = None
:param blocked_meta_names: Meta names to block; takes precedence over allowed_meta_names. :type blocked_meta_names: Optional[set[str]] = None
:param mark_meta: Whether to mark meta in the export. :type mark_meta: bool = False
:returns: The exported Markdown representation.
:rtype: str
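The interplay of allowed_meta_names and blocked_meta_names stated above (None allows everything, and the block list wins over the allow list) can be written as a small predicate. This is a sketch of the documented rule only: is_meta_name_exported is a hypothetical helper, not a docling-core function, and the meta names used are placeholders.

```python
from typing import Optional


def is_meta_name_exported(
    name: str,
    allowed_meta_names: Optional[set[str]] = None,
    blocked_meta_names: Optional[set[str]] = None,
) -> bool:
    """Decide whether a meta name passes the allow/block filters.

    Hypothetical helper mirroring the documented precedence.
    """
    # The block list takes precedence over the allow list.
    if blocked_meta_names is not None and name in blocked_meta_names:
        return False
    # None means every meta name is allowed.
    return allowed_meta_names is None or name in allowed_meta_names


print(is_meta_name_exported("summary"))
print(is_meta_name_exported("summary", allowed_meta_names={"summary"}, blocked_meta_names={"summary"}))
```

Checking the block list first keeps the precedence rule in one place, so a name listed in both sets is never exported.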
export_to_text
export_to_text(
delim: str = "\n\n",
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
page_no: Optional[int] = None,
included_content_layers: Optional[
set[ContentLayer]
] = None,
page_break_placeholder: Optional[str] = None,
traverse_pictures: bool = False,
) -> str
Export to plain text.
Produces clean plain text without any Markdown decoration. Heading
markers (#), bold/italic markers, and hyperlink syntax are all
stripped. List bullets (-), ordered list numbers, and table-cell
separators (|) are preserved as they aid readability.
:param delim: Deprecated. :type delim: str = "\n\n"
:param from_element: Body slicing start index (inclusive). (Default value = 0). :type from_element: int = 0
:param to_element: Body slicing stop index (exclusive). (Default value = sys.maxsize). :type to_element: int = sys.maxsize
:param labels: The set of document labels to include in the export. None falls back to the system-defined default. :type labels: Optional[set[DocItemLabel]] = None
:param page_no: If set, only content from this page is exported. :type page_no: Optional[int] = None
:param included_content_layers: The set of layers to include. None falls back to the system-defined default. :type included_content_layers: Optional[set[ContentLayer]] = None
:param page_break_placeholder: String inserted at page boundaries. None means no page-break marker is emitted. :type page_break_placeholder: Optional[str] = None
:param traverse_pictures: Whether to traverse into picture items and include their text children. Must be set to True for scanned/image-based PDFs processed with full-page OCR, where the layout model places all OCR text as children of a top-level PictureItem. (Default value = False). :type traverse_pictures: bool = False
:returns: The exported plain-text representation.
:rtype: str
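The from_element and to_element arguments follow the usual half-open convention: the start index is included, the stop index is not, and the default stop of sys.maxsize selects everything. A minimal sketch, assuming the body can be pictured as a flat list of items; slice_body is a hypothetical helper and does not reflect how docling-core walks the document tree.

```python
import sys


def slice_body(items: list[str], from_element: int = 0, to_element: int = sys.maxsize) -> list[str]:
    """Select items[from_element:to_element], start inclusive and stop exclusive.

    Hypothetical helper; strings stand in for document items.
    """
    return items[from_element:to_element]


body = ["title", "intro", "table", "summary"]
print(slice_body(body))          # defaults select every item
print(slice_body(body, 1, 3))    # items at index 1 and 2 only
```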
export_to_vtt
export_to_vtt(
included_content_layers: set[ContentLayer]
| None = None,
omit_hours_if_zero: bool = False,
omit_voice_end: bool = False,
) -> str
Serializes the Docling document to WebVTT format.
Parameters:
- included_content_layers (set[ContentLayer] | None, default: None) – The content layers to serialize. If omitted, the DEFAULT_CONTENT_LAYERS will be serialized.
- omit_hours_if_zero (bool, default: False) – If True, omit hours when they are 0 in the timings.
- omit_voice_end (bool, default: False) – If True and cue blocks have a WebVTT cue voice span as the only component, omit the voice end tag for brevity.
Returns:
- str – A string representation of the Docling document in WebVTT format.
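WebVTT timestamps may be written as hh:mm:ss.ttt or, when the hour is zero, as mm:ss.ttt. The sketch below shows what an option like omit_hours_if_zero plausibly changes in the cue timings. It is an assumption-laden illustration: format_vtt_timestamp is a hypothetical helper, and the serializer in docling-core may format timings differently.

```python
def format_vtt_timestamp(seconds: float, omit_hours_if_zero: bool = False) -> str:
    """Format a time offset in seconds as a WebVTT timestamp.

    Hypothetical helper, not part of docling-core.
    """
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    if omit_hours_if_zero and hours == 0:
        # Short form: the hours field is dropped entirely.
        return f"{minutes:02d}:{secs:02d}.{ms:03d}"
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


print(format_vtt_timestamp(75.5))
print(format_vtt_timestamp(75.5, omit_hours_if_zero=True))
```

The flag only shortens timestamps below one hour; a value such as 3661.25 seconds keeps its hours field either way.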
extract_items_range
extract_items_range(
*,
start: NodeItem,
end: NodeItem,
start_inclusive: bool = True,
end_inclusive: bool = True,
delete: bool = False,
) -> DoclingDocument
Extracts NodeItems and children in the range from the start NodeItem to the end as a new DoclingDocument.
:param start: NodeItem: The starting NodeItem of the range (must be a direct child of the document body) :param end: NodeItem: The ending NodeItem of the range (must be a direct child of the document body) :param start_inclusive: bool: (Default value = True): If True, the start NodeItem will also be extracted :param end_inclusive: bool: (Default value = True): If True, the end NodeItem will also be extracted :param delete: bool: (Default value = False): If True, extracted items are deleted in the original document
:returns: DoclingDocument: A new document containing the extracted NodeItems and their children
filter
filter(
page_nrs: Optional[set[int]] = None,
) -> DoclingDocument
Create a new document based on the provided filter parameters.
get_visualization
get_visualization(
show_label: bool = True,
show_branch_numbering: bool = False,
viz_mode: Literal[
"reading_order", "key_value"
] = "reading_order",
show_cell_id: bool = False,
) -> dict[Optional[int], Image]
Get visualization of the document as images by page.
:param show_label: Show labels on elements (applies to all visualizers). :type show_label: bool :param show_branch_numbering: Show branch numbering (reading order visualizer only). :type show_branch_numbering: bool :param viz_mode: Which visualizer to use. One of 'reading_order' (default), 'key_value'. :type viz_mode: Literal["reading_order", "key_value"] :param show_cell_id: Show cell IDs (key value visualizer only). :type show_cell_id: bool
:returns: Dictionary mapping page numbers to PIL images. :rtype: dict[Optional[int], PILImage.Image]
insert_code
insert_code(
sibling: NodeItem,
text: str,
code_language: Optional[CodeLanguageLabel] = None,
orig: Optional[str] = None,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> CodeItem
Creates a new CodeItem item and inserts it into the document.
:param sibling: NodeItem: :param text: str: :param code_language: Optional[CodeLanguageLabel]: (Default value = None) :param orig: Optional[str]: (Default value = None) :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param after: bool: (Default value = True)
:returns: CodeItem: The newly created CodeItem item.
insert_document
insert_document(
doc: DoclingDocument,
sibling: NodeItem,
after: bool = True,
) -> None
Inserts the content from the body of a DoclingDocument into this document at a specific position.
:param doc: DoclingDocument: The document whose content will be inserted :param sibling: NodeItem: The NodeItem after/before which the new items will be inserted :param after: bool: If True, insert after the sibling; if False, insert before (Default value = True)
:returns: None
insert_form
insert_form(
sibling: NodeItem,
graph: GraphData,
prov: Optional[ProvenanceItem] = None,
after: bool = True,
) -> FormItem
Creates a new FormItem item and inserts it into the document.
:param sibling: NodeItem: :param graph: GraphData: :param prov: Optional[ProvenanceItem]: (Default value = None) :param after: bool: (Default value = True)
:returns: FormItem: The newly created FormItem item.
insert_formula
insert_formula(
sibling: NodeItem,
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> FormulaItem
Creates a new FormulaItem item and inserts it into the document.
:param sibling: NodeItem: :param text: str: :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param after: bool: (Default value = True)
:returns: FormulaItem: The newly created FormulaItem item.
insert_group
insert_group(
sibling: NodeItem,
label: Optional[GroupLabel] = None,
name: Optional[str] = None,
content_layer: Optional[ContentLayer] = None,
after: bool = True,
) -> GroupItem
Creates a new GroupItem item and inserts it into the document.
:param sibling: NodeItem: :param label: Optional[GroupLabel]: (Default value = None) :param name: Optional[str]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param after: bool: (Default value = True)
:returns: GroupItem: The newly created GroupItem.
insert_heading
insert_heading(
sibling: NodeItem,
text: str,
orig: Optional[str] = None,
level: LevelNumber = 1,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> SectionHeaderItem
Creates a new SectionHeaderItem item and inserts it into the document.
:param sibling: NodeItem: :param text: str: :param orig: Optional[str]: (Default value = None) :param level: LevelNumber: (Default value = 1) :param prov: Optional[ProvenanceItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param after: bool: (Default value = True)
:returns: SectionHeaderItem: The newly created SectionHeaderItem item.
insert_inline_group
insert_inline_group(
sibling: NodeItem,
name: Optional[str] = None,
content_layer: Optional[ContentLayer] = None,
after: bool = True,
) -> InlineGroup
Creates a new InlineGroup item and inserts it into the document.
:param sibling: NodeItem: :param name: Optional[str]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param after: bool: (Default value = True)
:returns: InlineGroup: The newly created InlineGroup item.
insert_item_after_sibling
Inserts an item, given its node_item instance, after other as a sibling.
insert_item_before_sibling
Inserts an item, given its node_item instance, before other as a sibling.
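The insert_* methods in this section all position the new item relative to a sibling, with after choosing the side. The idea can be sketched on a plain list of children. This is a conceptual model only: insert_relative is a hypothetical helper, strings stand in for NodeItems, and the real methods also maintain parent and child references that are not modelled here.

```python
def insert_relative(children: list[str], sibling: str, new_item: str, after: bool = True) -> list[str]:
    """Return a copy of children with new_item placed next to sibling.

    Hypothetical helper; strings stand in for NodeItems.
    """
    idx = children.index(sibling)          # raises ValueError if sibling is absent
    position = idx + 1 if after else idx   # after=True places the item right behind the sibling
    return children[:position] + [new_item] + children[position:]


children = ["heading", "paragraph", "table"]
print(insert_relative(children, "paragraph", "note"))
print(insert_relative(children, "paragraph", "note", after=False))
```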
insert_key_values
insert_key_values(
sibling: NodeItem,
graph: GraphData,
prov: Optional[ProvenanceItem] = None,
after: bool = True,
) -> KeyValueItem
Creates a new KeyValueItem item and inserts it into the document.
:param sibling: NodeItem: :param graph: GraphData: :param prov: Optional[ProvenanceItem]: (Default value = None) :param after: bool: (Default value = True)
:returns: KeyValueItem: The newly created KeyValueItem item.
insert_list_group
insert_list_group(
sibling: NodeItem,
name: Optional[str] = None,
content_layer: Optional[ContentLayer] = None,
after: bool = True,
) -> ListGroup
Creates a new ListGroup item and inserts it into the document.
:param sibling: NodeItem: :param name: Optional[str]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param after: bool: (Default value = True)
:returns: ListGroup: The newly created ListGroup item.
insert_list_item
insert_list_item(
sibling: NodeItem,
text: str,
enumerated: bool = False,
marker: Optional[str] = None,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> ListItem
Creates a new ListItem item and inserts it into the document.
:param sibling: NodeItem: :param text: str: :param enumerated: bool: (Default value = False) :param marker: Optional[str]: (Default value = None) :param orig: Optional[str]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param formatting: Optional[Formatting]: (Default value = None) :param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None) :param after: bool: (Default value = True)
:returns: ListItem: The newly created ListItem item.
insert_node_items
insert_node_items(
sibling: NodeItem,
node_items: list[NodeItem],
doc: DoclingDocument,
after: bool = True,
) -> None
Insert multiple NodeItems and their children at a specific position in the document.
:param sibling: NodeItem: The NodeItem after/before which the new items will be inserted :param node_items: list[NodeItem]: The NodeItems to be inserted :param doc: DoclingDocument: The document to which the NodeItems and their children belong :param after: bool: If True, insert after the sibling; if False, insert before (Default value = True)
:returns: None
insert_picture
insert_picture(
sibling: NodeItem,
annotations: Optional[list[PictureDataType]] = None,
image: Optional[ImageRef] = None,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
after: bool = True,
) -> PictureItem
Creates a new PictureItem item and inserts it into the document.
:param sibling: NodeItem: :param annotations: Optional[list[PictureDataType]]: (Default value = None) :param image: Optional[ImageRef]: (Default value = None) :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param content_layer: Optional[ContentLayer]: (Default value = None) :param after: bool: (Default value = True)
:returns: PictureItem: The newly created PictureItem item.
insert_table
insert_table(
sibling: NodeItem,
data: TableData,
caption: Optional[Union[TextItem, RefItem]] = None,
prov: Optional[ProvenanceItem] = None,
label: DocItemLabel = TABLE,
content_layer: Optional[ContentLayer] = None,
annotations: Optional[list[TableAnnotationType]] = None,
after: bool = True,
) -> TableItem
Creates a new TableItem item and inserts it into the document.
:param sibling: NodeItem: :param data: TableData: :param caption: Optional[Union[TextItem, RefItem]]: (Default value = None) :param prov: Optional[ProvenanceItem]: (Default value = None) :param label: DocItemLabel: (Default value = DocItemLabel.TABLE) :param content_layer: Optional[ContentLayer]: (Default value = None) :param annotations: Optional[list[TableAnnotationType]]: (Default value = None) :param after: bool: (Default value = True)
:returns: TableItem: The newly created TableItem item.
insert_text
insert_text(
sibling: NodeItem,
label: DocItemLabel,
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> TextItem
Creates a new TextItem item and inserts it into the document.
:param sibling: NodeItem:
:param label: DocItemLabel:
:param text: str:
:param orig: Optional[str]: (Default value = None)
:param prov: Optional[ProvenanceItem]: (Default value = None)
:param content_layer: Optional[ContentLayer]: (Default value = None)
:param formatting: Optional[Formatting]: (Default value = None)
:param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
:param after: bool: (Default value = True)
:returns: TextItem: The newly created TextItem item.
insert_title
insert_title(
sibling: NodeItem,
text: str,
orig: Optional[str] = None,
prov: Optional[ProvenanceItem] = None,
content_layer: Optional[ContentLayer] = None,
formatting: Optional[Formatting] = None,
hyperlink: Optional[Union[AnyUrl, Path]] = None,
after: bool = True,
) -> TitleItem
Creates a new TitleItem item and inserts it into the document.
:param sibling: NodeItem:
:param text: str:
:param orig: Optional[str]: (Default value = None)
:param prov: Optional[ProvenanceItem]: (Default value = None)
:param content_layer: Optional[ContentLayer]: (Default value = None)
:param formatting: Optional[Formatting]: (Default value = None)
:param hyperlink: Optional[Union[AnyUrl, Path]]: (Default value = None)
:param after: bool: (Default value = True)
:returns: TitleItem: The newly created TitleItem item.
iterate_items
iterate_items(
root: Optional[NodeItem] = None,
with_groups: bool = False,
traverse_pictures: bool = False,
page_no: Optional[int] = None,
included_content_layers: Optional[
set[ContentLayer]
] = None,
_level: int = 0,
) -> Iterable[tuple[NodeItem, int]]
Iterate elements with level.
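To illustrate what iterating "with level" means, here is a minimal sketch of a depth-first, pre-order walk that yields `(node, level)` pairs. The `Node` class and `iterate_with_level` function are hypothetical stand-ins. The sketch ignores the filtering options (`with_groups`, `page_no`, `included_content_layers`), and the library's exact traversal order is an assumption here.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:  # hypothetical stand-in for NodeItem
    name: str
    children: list["Node"] = field(default_factory=list)


def iterate_with_level(root: Node, _level: int = 0) -> Iterator[tuple[Node, int]]:
    """Yield (node, level) pairs in depth-first, pre-order."""
    yield root, _level
    for child in root.children:
        # Children sit one level below their parent.
        yield from iterate_with_level(child, _level + 1)


tree = Node("body", [Node("section", [Node("paragraph")]), Node("table")])
pairs = [(node.name, level) for node, level in iterate_with_level(tree)]
```

The level can be used, for example, to indent a printed outline of the document.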
load_from_doctags
load_from_doctags(
doctag_document: DocTagsDocument,
document_name: str = "Document",
) -> DoclingDocument
Load Docling document from lists of DocTags and Images.
load_from_json
load_from_json(
filename: Union[str, Path],
) -> DoclingDocument
Load a DoclingDocument from a JSON file.
:param filename: The path of the .json file to load a saved DoclingDocument from.
:type filename: Union[str, Path]
:returns: The loaded DoclingDocument.
:rtype: DoclingDocument
load_from_yaml
load_from_yaml(
filename: Union[str, Path],
) -> DoclingDocument
Load a DoclingDocument from a YAML file.
Parameters:
-
filename(Union[str, Path]) –The filename to load a YAML-serialized DoclingDocument from.
Returns:
-
DoclingDocument(DoclingDocument) –the loaded DoclingDocument
num_pages
num_pages()
Return the number of pages in the document.
print_element_tree
print_element_tree()
Print the element tree of the document.
replace_item
Replace item with new item.
save_as_doclang
save_as_doclang(filename: Union[str, Path]) -> None
Save the document as Doclang.
save_as_doctags
save_as_doctags(
filename: Union[str, Path],
delim: str = "",
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_content: bool = True,
add_page_index: bool = True,
add_table_cell_location: bool = False,
add_table_cell_text: bool = True,
minified: bool = False,
)
Save the document content to DocTags format.
save_as_document_tokens
save_as_document_tokens(*args, **kwargs)
Save the document content to a DocumentToken format.
save_as_html
save_as_html(
filename: Union[str, Path],
artifacts_dir: Optional[Path] = None,
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
image_mode: ImageRefMode = PLACEHOLDER,
formula_to_mathml: bool = True,
page_no: Optional[int] = None,
html_lang: str = "en",
html_head: str = "null",
included_content_layers: Optional[
set[ContentLayer]
] = None,
split_page_view: bool = False,
include_annotations: bool = True,
)
Save to HTML.
save_as_json
save_as_json(
filename: Union[str, Path],
artifacts_dir: Optional[Path] = None,
image_mode: ImageRefMode = EMBEDDED,
indent: int = 2,
coord_precision: Optional[int] = None,
confid_precision: Optional[int] = None,
)
Save as JSON.
save_as_markdown
save_as_markdown(
filename: Union[str, Path],
artifacts_dir: Optional[Path] = None,
delim: str = "\n\n",
from_element: int = 0,
to_element: int = maxsize,
labels: Optional[set[DocItemLabel]] = None,
strict_text: bool = False,
escape_html: bool = True,
escaping_underscores: bool = True,
image_placeholder: str = "<!-- image -->",
image_mode: ImageRefMode = PLACEHOLDER,
indent: int = 4,
text_width: int = -1,
page_no: Optional[int] = None,
included_content_layers: Optional[
set[ContentLayer]
] = None,
page_break_placeholder: Optional[str] = None,
include_annotations: bool = True,
compact_tables: bool = False,
*,
mark_meta: bool = False,
use_legacy_annotations: Optional[bool] = None,
)
Save to markdown.
save_as_vtt
save_as_vtt(
filename: str | Path,
included_content_layers: set[ContentLayer]
| None = None,
omit_hours_if_zero: bool = False,
omit_voice_end: bool = True,
) -> None
Saves the Docling document to a file in WebVTT format.
Parameters:
-
filename(str | Path) –The path to the WebVTT file.
-
included_content_layers(set[ContentLayer] | None, default:None) –The content layers to serialize. If omitted, the DEFAULT_CONTENT_LAYERS will be serialized.
-
omit_hours_if_zero(bool, default:False) –If True, omit hours when they are 0 in the timings.
-
omit_voice_end(bool, default:True) –If True and cue blocks have a WebVTT cue voice span as the only component, omit the voice end tag for brevity.
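The effect of `omit_hours_if_zero` can be sketched with a small timestamp formatter. WebVTT timestamps have the form `hh:mm:ss.ttt`, and the hours component may be left out. The helper `format_vtt_timestamp` below is hypothetical and only illustrates the idea; the library's own formatting code may differ.

```python
def format_vtt_timestamp(seconds: float, omit_hours_if_zero: bool = False) -> str:
    """Format an offset in seconds as a WebVTT timestamp (hypothetical helper)."""
    total_ms = round(seconds * 1000)  # work in integer milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    if omit_hours_if_zero and hours == 0:
        return f"{minutes:02d}:{secs:02d}.{millis:03d}"
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


short_form = format_vtt_timestamp(6.5, omit_hours_if_zero=True)
long_form = format_vtt_timestamp(6.5)
with_hours = format_vtt_timestamp(5370.0, omit_hours_if_zero=True)
```

Cues past the one-hour mark keep their hours component regardless of the flag.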
save_as_yaml
save_as_yaml(
filename: Union[str, Path],
artifacts_dir: Optional[Path] = None,
image_mode: ImageRefMode = EMBEDDED,
default_flow_style: bool = False,
coord_precision: Optional[int] = None,
confid_precision: Optional[int] = None,
)
Save as YAML.
transform_to_content_layer
transform_to_content_layer(data: Any) -> Any
transform_to_content_layer.
validate_document
validate_document() -> Self
validate_document.
validate_misplaced_list_items
validate_misplaced_list_items() -> Self
validate_misplaced_list_items.
DocumentOrigin
Bases: BaseModel
FileSource.
Show JSON schema:
{
"description": "FileSource.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"binary_hash": {
"maximum": 18446744073709551615,
"minimum": 0,
"title": "Binary Hash",
"type": "integer"
},
"filename": {
"title": "Filename",
"type": "string"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Uri"
}
},
"required": [
"mimetype",
"binary_hash",
"filename"
],
"title": "DocumentOrigin",
"type": "object"
}
Fields:
-
mimetype(str) -
binary_hash(Uint64) -
filename(str) -
uri(Optional[AnyUrl])
Validators:
binary_hash
binary_hash: Uint64
filename
filename: str
mimetype
mimetype: str
uri
uri: Optional[AnyUrl] = None
parse_hex_string
parse_hex_string(value)
parse_hex_string.
validate_mimetype
validate_mimetype(v)
validate_mimetype.
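The schema above bounds `binary_hash` to the unsigned 64-bit range, and the `parse_hex_string` validator suggests that hex strings are accepted as input. The following sketch shows one way such a value could be produced and checked. The choice of SHA-256, the truncation to 8 bytes, and both helper names are assumptions for illustration, not the library's actual behavior.

```python
import hashlib

UINT64_MAX = 18446744073709551615  # "maximum" from the schema, i.e. 2**64 - 1


def hash_to_uint64(data: bytes) -> int:
    """Fold a SHA-256 digest into 64 bits (illustrative choice of hash)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big")


def parse_uint64(value) -> int:
    """Accept an int or a hex string and check the Uint64 range."""
    number = int(value, 16) if isinstance(value, str) else int(value)
    if not 0 <= number <= UINT64_MAX:
        raise ValueError(f"{number} does not fit in an unsigned 64-bit integer")
    return number


file_hash = parse_uint64(hash_to_uint64(b"example file contents"))
from_hex = parse_uint64("ff")
```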
DocItem
Bases: NodeItem
Base type for any element that carries content, can be a leaf node.
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DocItemLabel": {
"description": "DocItemLabel.",
"enum": [
"caption",
"chart",
"footnote",
"formula",
"list_item",
"page_footer",
"page_header",
"picture",
"section_header",
"table",
"text",
"title",
"document_index",
"code",
"checkbox_selected",
"checkbox_unselected",
"form",
"key_value_region",
"grading_scale",
"handwritten_text",
"empty_value",
"paragraph",
"reference",
"field_region",
"field_heading",
"field_item",
"field_key",
"field_value",
"field_hint",
"marker"
],
"title": "DocItemLabel",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "Base type for any element that carries content, can be a leaf node.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"$ref": "#/$defs/DocItemLabel"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
}
},
"required": [
"self_ref",
"label"
],
"title": "DocItem",
"type": "object"
}
Fields:
-
self_ref(str) -
parent(Optional[RefItem]) -
children(list[RefItem]) -
content_layer(ContentLayer) -
meta(Optional[BaseMeta]) -
label(DocItemLabel) -
prov(list[ProvenanceItem]) -
source(list[SourceType]) -
comments(list[FineRef])
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
meta
meta: Optional[BaseMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
self_ref
self_ref: str
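The `self_ref` pattern from the schema, `^#(?:/([\w-]+)(?:/(\d+))?)?$`, accepts `#`, `#/name`, and `#/name/index`. A short sketch with Python's `re` module shows how such a reference splits into its parts; `split_ref` is a hypothetical helper.

```python
import re
from typing import Optional

# Pattern copied from the "self_ref" and "$ref" entries of the JSON schema.
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def split_ref(ref: str) -> tuple[Optional[str], Optional[int]]:
    """Split a reference into (collection, index); either part may be None."""
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"not a valid reference: {ref!r}")
    collection, index = match.groups()
    return collection, (int(index) if index is not None else None)


indexed = split_ref("#/texts/12")
named = split_ref("#/body")
root = split_ref("#")
```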
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this DocItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image of this DocItem.
The function returns None if this DocItem has no valid provenance or if a valid image of the page containing this DocItem is not available in doc.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for the BaseCell.
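The `xsize` and `ysize` parameters suggest that coordinates are quantized onto a fixed grid before being written as location tokens. The sketch below is only an illustration of that idea: the `<loc_N>` token spelling, the truncating rounding, the clamping, and the helper name `location_tokens` are all assumptions and are not confirmed by this reference.

```python
def location_tokens(
    l: float, t: float, r: float, b: float,
    page_w: float, page_h: float,
    xsize: int = 500, ysize: int = 500,
) -> str:
    """Quantize a bounding box onto an xsize-by-ysize grid (hypothetical helper)."""

    def quantize(value: float, extent: float, bins: int) -> int:
        # Scale to the grid, truncate, and clamp into [0, bins - 1].
        return min(bins - 1, max(0, int(value / extent * bins)))

    coords = [
        quantize(l, page_w, xsize),
        quantize(t, page_h, ysize),
        quantize(r, page_w, xsize),
        quantize(b, page_h, ysize),
    ]
    return "".join(f"<loc_{c}>" for c in coords)


full_page = location_tokens(0, 0, 612, 792, page_w=612, page_h=792)
lower_right = location_tokens(306, 396, 612, 792, page_w=612, page_h=792)
```

Quantizing makes the tokens independent of the page's physical size, which is presumably why the grid size is configurable.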
DocItemLabel
Bases: str, Enum
DocItemLabel.
Methods:
-
get_color–Return the RGB color associated with a given label.
Attributes:
-
CAPTION– -
CHART– -
CHECKBOX_SELECTED– -
CHECKBOX_UNSELECTED– -
CODE– -
DOCUMENT_INDEX– -
EMPTY_VALUE– -
FIELD_HEADING– -
FIELD_HINT– -
FIELD_ITEM– -
FIELD_KEY– -
FIELD_REGION– -
FIELD_VALUE– -
FOOTNOTE– -
FORM– -
FORMULA– -
GRADING_SCALE– -
HANDWRITTEN_TEXT– -
KEY_VALUE_REGION– -
LIST_ITEM– -
MARKER– -
PAGE_FOOTER– -
PAGE_HEADER– -
PARAGRAPH– -
PICTURE– -
REFERENCE– -
SECTION_HEADER– -
TABLE– -
TEXT– -
TITLE–
CAPTION
CAPTION = 'caption'
CHART
CHART = 'chart'
CHECKBOX_SELECTED
CHECKBOX_SELECTED = 'checkbox_selected'
CHECKBOX_UNSELECTED
CHECKBOX_UNSELECTED = 'checkbox_unselected'
CODE
CODE = 'code'
DOCUMENT_INDEX
DOCUMENT_INDEX = 'document_index'
EMPTY_VALUE
EMPTY_VALUE = 'empty_value'
FIELD_HEADING
FIELD_HEADING = 'field_heading'
FIELD_HINT
FIELD_HINT = 'field_hint'
FIELD_ITEM
FIELD_ITEM = 'field_item'
FIELD_KEY
FIELD_KEY = 'field_key'
FIELD_REGION
FIELD_REGION = 'field_region'
FIELD_VALUE
FIELD_VALUE = 'field_value'
FOOTNOTE
FOOTNOTE = 'footnote'
FORM
FORM = 'form'
FORMULA
FORMULA = 'formula'
GRADING_SCALE
GRADING_SCALE = 'grading_scale'
HANDWRITTEN_TEXT
HANDWRITTEN_TEXT = 'handwritten_text'
KEY_VALUE_REGION
KEY_VALUE_REGION = 'key_value_region'
LIST_ITEM
LIST_ITEM = 'list_item'
MARKER
MARKER = 'marker'
PAGE_FOOTER
PAGE_FOOTER = 'page_footer'
PAGE_HEADER
PAGE_HEADER = 'page_header'
PARAGRAPH
PARAGRAPH = 'paragraph'
PICTURE
PICTURE = 'picture'
REFERENCE
REFERENCE = 'reference'
SECTION_HEADER
SECTION_HEADER = 'section_header'
TABLE
TABLE = 'table'
TEXT
TEXT = 'text'
TITLE
TITLE = 'title'
get_color
get_color(label: DocItemLabel) -> tuple[int, int, int]
Return the RGB color associated with a given label.
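Because `DocItemLabel` derives from both `str` and `Enum`, its members behave like their string values. The sketch below uses a hypothetical three-member subset, `Label`, to show the consequences for lookup, comparison, and JSON serialization with the standard library.

```python
import json
from enum import Enum


class Label(str, Enum):  # hypothetical subset of DocItemLabel
    TABLE = "table"
    TEXT = "text"
    TITLE = "title"


parsed = Label("table")  # look up a member by its serialized value
is_same_member = parsed is Label.TABLE
equals_plain_str = Label.TEXT == "text"  # the str mixin allows direct comparison
serialized = json.dumps({"label": Label.TITLE})  # encoded as a plain string
```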
ProvenanceItem
Bases: BaseModel
Provenance information for elements extracted from a textual document.
A ProvenanceItem object acts as a lightweight pointer back into the original
document for an extracted element. It applies to documents with an explicit
or implicit layout, such as PDF, HTML, docx, or pptx.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
}
},
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
}
Fields:
-
page_no(int) -
bbox(BoundingBox) -
charspan(CharSpan)
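Two details of a provenance record are easy to get wrong: the `coord_origin` of the bounding box and the 0-indexed `charspan`. The sketch below uses a hypothetical `Box` dataclass to flip a box from `TOPLEFT` to `BOTTOMLEFT` coordinates, and it slices a string with a character span. Treating the span as half-open (end exclusive) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:  # hypothetical stand-in for BoundingBox
    l: float
    t: float
    r: float
    b: float
    coord_origin: str = "TOPLEFT"


def to_bottom_left(box: Box, page_height: float) -> Box:
    """Flip the vertical axis so that y grows upward from the page bottom."""
    if box.coord_origin == "BOTTOMLEFT":
        return box
    return Box(box.l, page_height - box.t, box.r, page_height - box.b, "BOTTOMLEFT")


flipped = to_bottom_left(Box(10, 20, 110, 70), page_height=800)

text = "Revenue grew 12% in 2023."
charspan = (0, 7)  # assumed half-open: start inclusive, end exclusive
snippet = text[charspan[0]:charspan[1]]
```

The horizontal coordinates `l` and `r` are unaffected by the change of origin; only `t` and `b` are mirrored against the page height.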
GroupItem
Bases: NodeItem
GroupItem.
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"GroupLabel": {
"description": "GroupLabel.",
"enum": [
"unspecified",
"list",
"ordered_list",
"chapter",
"section",
"sheet",
"slide",
"form_area",
"key_value_area",
"comment_section",
"inline",
"picture_area"
],
"title": "GroupLabel",
"type": "string"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
}
},
"additionalProperties": false,
"description": "GroupItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"name": {
"default": "group",
"title": "Name",
"type": "string"
},
"label": {
"$ref": "#/$defs/GroupLabel",
"default": "unspecified"
}
},
"required": [
"self_ref"
],
"title": "GroupItem",
"type": "object"
}
Fields:
-
self_ref(str) -
parent(Optional[RefItem]) -
children(list[RefItem]) -
content_layer(ContentLayer) -
meta(Optional[BaseMeta]) -
name(str) -
label(GroupLabel)
content_layer
content_layer: ContentLayer = BODY
meta
meta: Optional[BaseMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
name
name: str = 'group'
self_ref
self_ref: str
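Since `parent` and `children` hold `RefItem` pointers rather than nested objects, reading a group's content means resolving each `$ref` against the document. The sketch below does this on a plain dictionary. The top-level keys (`body`, `groups`, `texts`) and the helper `resolve_ref` are assumptions chosen for illustration.

```python
def resolve_ref(doc: dict, ref: str):
    """Follow a '#/collection/index' pointer into a document dict (hypothetical)."""
    if ref == "#":
        return doc
    parts = ref.removeprefix("#/").split("/")
    target = doc[parts[0]]
    if len(parts) == 2:
        target = target[int(parts[1])]  # second segment indexes into a list
    return target


doc = {
    "body": {"self_ref": "#/body", "children": [{"$ref": "#/groups/0"}]},
    "groups": [
        {
            "self_ref": "#/groups/0",
            "name": "group",
            "label": "list",
            "children": [{"$ref": "#/texts/0"}, {"$ref": "#/texts/1"}],
        }
    ],
    "texts": [
        {"self_ref": "#/texts/0", "text": "first"},
        {"self_ref": "#/texts/1", "text": "second"},
    ],
}

group = resolve_ref(doc, doc["body"]["children"][0]["$ref"])
child_texts = [resolve_ref(doc, child["$ref"])["text"] for child in group["children"]]
```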
GroupLabel
Bases: str, Enum
GroupLabel.
Attributes:
-
CHAPTER– -
COMMENT_SECTION– -
FORM_AREA– -
INLINE– -
KEY_VALUE_AREA– -
LIST– -
ORDERED_LIST– -
PICTURE_AREA– -
SECTION– -
SHEET– -
SLIDE– -
UNSPECIFIED–
CHAPTER
CHAPTER = 'chapter'
COMMENT_SECTION
COMMENT_SECTION = 'comment_section'
FORM_AREA
FORM_AREA = 'form_area'
INLINE
INLINE = 'inline'
KEY_VALUE_AREA
KEY_VALUE_AREA = 'key_value_area'
LIST
LIST = 'list'
ORDERED_LIST
ORDERED_LIST = 'ordered_list'
PICTURE_AREA
PICTURE_AREA = 'picture_area'
SECTION
SECTION = 'section'
SHEET
SHEET = 'sheet'
SLIDE
SLIDE = 'slide'
UNSPECIFIED
UNSPECIFIED = 'unspecified'
NodeItem
Bases: BaseModel
NodeItem.
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
}
},
"additionalProperties": false,
"description": "NodeItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"self_ref"
],
"title": "NodeItem",
"type": "object"
}
Fields:
PageItem
Bases: BaseModel
PageItem.
Show JSON schema:
{
"$defs": {
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
}
},
"description": "PageItem.",
"properties": {
"size": {
"$ref": "#/$defs/Size"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"page_no": {
"title": "Page No",
"type": "integer"
}
},
"required": [
"size",
"page_no"
],
"title": "PageItem",
"type": "object"
}
Fields:
FloatingItem
Bases: DocItem
FloatingItem.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DescriptionMetaField": {
"additionalProperties": true,
"description": "Description metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "DescriptionMetaField",
"type": "object"
},
"DocItemLabel": {
"description": "DocItemLabel.",
"enum": [
"caption",
"chart",
"footnote",
"formula",
"list_item",
"page_footer",
"page_header",
"picture",
"section_header",
"table",
"text",
"title",
"document_index",
"code",
"checkbox_selected",
"checkbox_unselected",
"form",
"key_value_region",
"grading_scale",
"handwritten_text",
"empty_value",
"paragraph",
"reference",
"field_region",
"field_heading",
"field_item",
"field_key",
"field_value",
"field_hint",
"marker"
],
"title": "DocItemLabel",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"FloatingMeta": {
"additionalProperties": true,
"description": "Metadata model for floating.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "FloatingMeta",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "FloatingItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"$ref": "#/$defs/DocItemLabel"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"self_ref",
"label"
],
"title": "FloatingItem",
"type": "object"
}
Fields:
- self_ref (str)
- parent (Optional[RefItem])
- children (list[RefItem])
- content_layer (ContentLayer)
- label (DocItemLabel)
- prov (list[ProvenanceItem])
- source (list[SourceType])
- comments (list[FineRef])
- meta (Optional[FloatingMeta])
- captions (list[RefItem])
- references (list[RefItem])
- footnotes (list[RefItem])
- image (Optional[ImageRef])
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
meta
meta: Optional[FloatingMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
self_ref
self_ref: str
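The schema above restricts self_ref (and the $ref of every RefItem) to the pattern ^#(?:/([\w-]+)(?:/(\d+))?)?$. The following is a minimal sketch of what that pattern accepts, using Python's re module. The helper parse_ref is a hypothetical name, not part of the library, and the sketch assumes Python's regex dialect behaves like the JSON Schema one for these inputs.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the JSON schema, with the JSON string escaping removed.
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_ref(ref: str) -> Optional[Tuple[Optional[str], Optional[int]]]:
    """Return (collection, index) for a well-formed reference, else None.

    Hypothetical helper for illustration only.
    """
    match = REF_PATTERN.fullmatch(ref)
    if match is None:
        return None
    collection, index = match.groups()
    return collection, (int(index) if index is not None else None)


print(parse_ref("#/texts/3"))  # collection plus numeric index
print(parse_ref("#/body"))     # collection only
print(parse_ref("#"))          # bare root reference
print(parse_ref("texts/3"))    # rejected: no leading "#"
```

Both path segments are optional, so a bare "#" is valid, as is a collection name without an index.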
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
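According to the TrackSource definition in the schema above, each entry in source carries a kind fixed to "track", required start_time and end_time offsets in seconds, and optional identifier and voice values. The following is a minimal stand-in sketch using a plain dataclass. TrackCue and to_payload are hypothetical names chosen for illustration; they are not the library's classes and perform no validation.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrackCue:
    """Stand-in mirroring the TrackSource schema fields (not the real class)."""

    start_time: float                 # required: start offset in seconds
    end_time: float                   # required: end offset in seconds
    identifier: Optional[str] = None  # optional cue identifier
    voice: Optional[str] = None       # optional speaker name
    kind: str = "track"               # discriminator value fixed by the schema


def to_payload(cue: TrackCue) -> dict:
    """Serialize a cue to a dict shaped like one entry of the `source` array."""
    return asdict(cue)


payload = to_payload(TrackCue(start_time=6.5, end_time=8.2, voice="Mary"))
print(payload)
```

The kind field acts as the discriminator named in the schema, which is why it defaults to "track" here.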
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this DocItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image corresponding to this FloatingItem.
This function returns the PIL image from self.image if one is available. Otherwise, it uses DocItem.get_image to get an image of this FloatingItem.
In particular, when self.image is None, the function returns None if this FloatingItem has no valid provenance or the doc does not contain a valid image for the required page.
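The fallback order described above can be sketched with stand-in types. This is only an illustration of the decision order, not the library's implementation: StubItem, resolve_image, and the page_images mapping are hypothetical, strings stand in for images, and the sketch returns the whole page image as a placeholder rather than an image of the item itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StubItem:
    """Hypothetical stand-in for a FloatingItem."""

    image: Optional[str] = None                          # stands in for self.image
    prov_pages: List[int] = field(default_factory=list)  # page_no per provenance entry


def resolve_image(
    item: StubItem, page_images: Dict[int, str], prov_index: int = 0
) -> Optional[str]:
    # 1. Prefer the item's own image when one is available.
    if item.image is not None:
        return item.image
    # 2. Without a provenance entry at prov_index there is nothing to fall back to.
    if prov_index >= len(item.prov_pages):
        return None
    # 3. Fall back to the page image; None if the document has no image for that page.
    return page_images.get(item.prov_pages[prov_index])


pages = {1: "page-1-image"}
print(resolve_image(StubItem(image="own-image"), pages))
print(resolve_image(StubItem(prov_pages=[1]), pages))
print(resolve_image(StubItem(prov_pages=[2]), pages))
```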
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for this item.
TextItem
Bases: DocItem
TextItem.
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"Formatting": {
"description": "Formatting.",
"properties": {
"bold": {
"default": false,
"title": "Bold",
"type": "boolean"
},
"italic": {
"default": false,
"title": "Italic",
"type": "boolean"
},
"underline": {
"default": false,
"title": "Underline",
"type": "boolean"
},
"strikethrough": {
"default": false,
"title": "Strikethrough",
"type": "boolean"
},
"script": {
"$ref": "#/$defs/Script",
"default": "baseline"
}
},
"title": "Formatting",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"Script": {
"description": "Text script position.",
"enum": [
"baseline",
"sub",
"super"
],
"title": "Script",
"type": "string"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "TextItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"enum": [
"caption",
"checkbox_selected",
"checkbox_unselected",
"footnote",
"page_footer",
"page_header",
"paragraph",
"reference",
"text",
"empty_value",
"field_key",
"field_hint",
"marker",
"handwritten_text"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
}
},
"required": [
"self_ref",
"label",
"orig",
"text"
],
"title": "TextItem",
"type": "object"
}
Fields:
- self_ref (str)
- parent (Optional[RefItem])
- children (list[RefItem])
- content_layer (ContentLayer)
- meta (Optional[BaseMeta])
- prov (list[ProvenanceItem])
- source (list[SourceType])
- comments (list[FineRef])
- label (Literal[CAPTION, CHECKBOX_SELECTED, CHECKBOX_UNSELECTED, FOOTNOTE, PAGE_FOOTER, PAGE_HEADER, PARAGRAPH, REFERENCE, TEXT, EMPTY_VALUE, FIELD_KEY, FIELD_HINT, MARKER, HANDWRITTEN_TEXT])
- orig (str)
- text (str)
- formatting (Optional[Formatting])
- hyperlink (Optional[Union[AnyUrl, Path]])
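The TextItem schema above requires self_ref, label, orig, and text, and limits label to a fixed set of values. The following is a minimal sketch that checks a plain dict against those two constraints only. The function check_text_item is a hypothetical helper; it does not replace the library's own validation and ignores every other field.

```python
# Required keys and allowed labels, copied from the TextItem JSON schema.
REQUIRED_KEYS = ("self_ref", "label", "orig", "text")
ALLOWED_LABELS = {
    "caption", "checkbox_selected", "checkbox_unselected", "footnote",
    "page_footer", "page_header", "paragraph", "reference", "text",
    "empty_value", "field_key", "field_hint", "marker", "handwritten_text",
}


def check_text_item(payload: dict) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = [f"missing: {key}" for key in REQUIRED_KEYS if key not in payload]
    label = payload.get("label")
    if label is not None and label not in ALLOWED_LABELS:
        problems.append(f"invalid label: {label}")
    return problems


item = {
    "self_ref": "#/texts/0",
    "label": "paragraph",
    "orig": "Hello  world",  # text as found in the source
    "text": "Hello world",   # text after any cleanup
}
print(check_text_item(item))
print(check_text_item({"self_ref": "#/texts/0", "label": "table"}))
```

Labels such as "table" or "section_header" are valid DocItemLabel values but are not accepted by TextItem, which is why the second call reports an invalid label.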
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
formatting
formatting: Optional[Formatting] = None
hyperlink
hyperlink: Optional[Union[AnyUrl, Path]] = None
label
label: Literal[
CAPTION,
CHECKBOX_SELECTED,
CHECKBOX_UNSELECTED,
FOOTNOTE,
PAGE_FOOTER,
PAGE_HEADER,
PARAGRAPH,
REFERENCE,
TEXT,
EMPTY_VALUE,
FIELD_KEY,
FIELD_HINT,
MARKER,
HANDWRITTEN_TEXT,
]
meta
meta: Optional[BaseMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
orig
orig: str
self_ref
self_ref: str
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
text
text: str
export_to_doctags
export_to_doctags(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_content: bool = True,
)
Export text element to document tokens format.
:param doc: DoclingDocument
:param new_line: str (Default value = ""). Deprecated.
:param xsize: int (Default value = 500)
:param ysize: int (Default value = 500)
:param add_location: bool (Default value = True)
:param add_content: bool (Default value = True)
export_to_document_tokens
export_to_document_tokens(*args, **kwargs)
Export to DocTags format.
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this DocItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image of this DocItem.
The function returns None if this DocItem has no valid provenance or if a valid image of the page containing this DocItem is not available in doc.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for this item.
TableItem
Bases: FloatingItem
TableItem.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DescriptionAnnotation": {
"description": "DescriptionAnnotation.",
"properties": {
"kind": {
"const": "description",
"default": "description",
"title": "Kind",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
}
},
"required": [
"text",
"provenance"
],
"title": "DescriptionAnnotation",
"type": "object"
},
"DescriptionMetaField": {
"additionalProperties": true,
"description": "Description metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "DescriptionMetaField",
"type": "object"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"FloatingMeta": {
"additionalProperties": true,
"description": "Metadata model for floating.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "FloatingMeta",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"MiscAnnotation": {
"description": "MiscAnnotation.",
"properties": {
"kind": {
"const": "misc",
"default": "misc",
"title": "Kind",
"type": "string"
},
"content": {
"additionalProperties": true,
"title": "Content",
"type": "object"
}
},
"required": [
"content"
],
"title": "MiscAnnotation",
"type": "object"
},
"Orientation": {
"description": "Counter-clockwise rotation of a table on the page, in degrees.\n\nFollows the convention used by PIL/Pillow's ``Image.rotate``: positive\nangles rotate the table counter-clockwise. ``ROT_0`` / ``ROT_180`` keep\nrows running horizontally on the page; ``ROT_90`` / ``ROT_270`` turn\nrows into vertical stripes.",
"enum": [
"rot_0",
"rot_90",
"rot_180",
"rot_270"
],
"title": "Orientation",
"type": "string"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"RichTableCell": {
"description": "RichTableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
},
"ref": {
"$ref": "#/$defs/RefItem"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text",
"ref"
],
"title": "RichTableCell",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TableCell": {
"description": "TableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text"
],
"title": "TableCell",
"type": "object"
},
"TableData": {
"description": "BaseTableData.",
"properties": {
"table_cells": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/RichTableCell"
},
{
"$ref": "#/$defs/TableCell"
}
]
},
"title": "Table Cells",
"type": "array"
},
"num_rows": {
"default": 0,
"title": "Num Rows",
"type": "integer"
},
"num_cols": {
"default": 0,
"title": "Num Cols",
"type": "integer"
},
"orientation": {
"$ref": "#/$defs/Orientation",
"default": "rot_0"
}
},
"title": "TableData",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "TableItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"default": "table",
"enum": [
"document_index",
"table"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"data": {
"$ref": "#/$defs/TableData"
},
"annotations": {
"default": [],
"deprecated": true,
"items": {
"discriminator": {
"mapping": {
"description": "#/$defs/DescriptionAnnotation",
"misc": "#/$defs/MiscAnnotation"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/DescriptionAnnotation"
},
{
"$ref": "#/$defs/MiscAnnotation"
}
]
},
"title": "Annotations",
"type": "array"
}
},
"required": [
"self_ref",
"data"
],
"title": "TableItem",
"type": "object"
}
Fields:
-
self_ref (str)
-
parent (Optional[RefItem])
-
children (list[RefItem])
-
content_layer (ContentLayer)
-
meta (Optional[FloatingMeta])
-
prov (list[ProvenanceItem])
-
source (list[SourceType])
-
comments (list[FineRef])
-
captions (list[RefItem])
-
references (list[RefItem])
-
footnotes (list[RefItem])
-
image (Optional[ImageRef])
-
data (TableData)
-
label (Literal[DOCUMENT_INDEX, TABLE])
-
annotations (list[TableAnnotationType])
Validators:
-
_migrate_annotations_to_meta
annotations
annotations: list[TableAnnotationType] = []
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
meta
meta: Optional[FloatingMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
self_ref
self_ref: str
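The JSON schema above constrains self_ref (and the $ref of RefItem and FineRef) to the pattern `^#(?:/([\w-]+)(?:/(\d+))?)?$`. As a minimal sketch of what that pattern accepts, the snippet below applies it with Python's re module. The helper parse_ref is hypothetical and is not part of docling; only the pattern itself comes from the schema.

```python
import re

# Pattern copied from the "self_ref" entry of the JSON schema
# (JSON escapes each backslash, so "\\w" there is "\w" here).
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_ref(ref: str):
    """Return (collection, index) for a reference string, or None if it does not match.

    Hypothetical helper for illustration only; not a docling function.
    """
    match = REF_PATTERN.match(ref)
    if match is None:
        return None
    collection, index = match.groups()
    return collection, (int(index) if index is not None else None)


print(parse_ref("#/tables/0"))  # ('tables', 0)
print(parse_ref("#/body"))      # ('body', None)
print(parse_ref("#"))           # (None, None)
print(parse_ref("tables/0"))    # None
```

Both the collection name and the numeric index are optional, so a bare "#" is also a valid reference under this pattern.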
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
add_annotation
add_annotation(annotation: TableAnnotationType) -> None
Add an annotation to the table.
export_to_dataframe
export_to_dataframe(
doc: Optional[DoclingDocument] = None,
) -> DataFrame
Export the table as a Pandas DataFrame.
export_to_doctags
export_to_doctags(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_cell_location: bool = True,
add_cell_text: bool = True,
add_caption: bool = True,
)
Export table to document tokens format.
:param doc: "DoclingDocument":
:param new_line: str (Default value = "") Deprecated
:param xsize: int: (Default value = 500)
:param ysize: int: (Default value = 500)
:param add_location: bool: (Default value = True)
:param add_cell_location: bool: (Default value = True)
:param add_cell_text: bool: (Default value = True)
:param add_caption: bool: (Default value = True)
export_to_document_tokens
export_to_document_tokens(*args, **kwargs)
Export to DocTags format.
export_to_html
export_to_html(
doc: Optional[DoclingDocument] = None,
add_caption: bool = True,
) -> str
Export the table as HTML.
export_to_markdown
export_to_markdown(
doc: Optional[DoclingDocument] = None,
) -> str
Export the table as Markdown.
export_to_otsl
export_to_otsl(
doc: DoclingDocument,
add_cell_location: bool = True,
add_cell_text: bool = True,
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
**kwargs: Any,
) -> str
Export the table as OTSL.
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this TableItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image corresponding to this FloatingItem.
This function returns the PIL image from self.image if one is available. Otherwise, it uses DocItem.get_image to get an image of this FloatingItem.
In particular, when self.image is None, the function returns None if this FloatingItem has no valid provenance or the doc does not contain a valid image for the required page.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for this TableItem.
TableCell
Bases: BaseModel
TableCell.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
}
},
"description": "TableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text"
],
"title": "TableCell",
"type": "object"
}
Fields:
-
bbox (Optional[BoundingBox])
-
row_span (int)
-
col_span (int)
-
start_row_offset_idx (int)
-
end_row_offset_idx (int)
-
start_col_offset_idx (int)
-
end_col_offset_idx (int)
-
text (str)
-
column_header (bool)
-
row_header (bool)
-
row_section (bool)
-
fillable (bool)
Validators:
col_span
col_span: int = 1
column_header
column_header: bool = False
end_col_offset_idx
end_col_offset_idx: int
end_row_offset_idx
end_row_offset_idx: int
fillable
fillable: bool = False
row_header
row_header: bool = False
row_section
row_section: bool = False
row_span
row_span: int = 1
start_col_offset_idx
start_col_offset_idx: int
start_row_offset_idx
start_row_offset_idx: int
text
text: str
from_dict_format
from_dict_format(data: Any) -> Any
from_dict_format.
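To illustrate how the offset fields position a cell, here is a minimal sketch that expands cells into a dense grid. It uses a stand-in dataclass named Cell (hypothetical, carrying only a subset of the TableCell fields) rather than the real model, and it assumes that the end offsets are exclusive, which this reference does not state.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Stand-in carrying a subset of the TableCell fields (not the docling class)."""

    text: str
    start_row_offset_idx: int
    end_row_offset_idx: int
    start_col_offset_idx: int
    end_col_offset_idx: int
    column_header: bool = False


def to_grid(cells, num_rows, num_cols):
    """Expand cells into a dense num_rows x num_cols grid of strings.

    Assumption: end offsets are exclusive, so a spanning cell repeats its
    text in every grid position it covers.
    """
    grid = [["" for _ in range(num_cols)] for _ in range(num_rows)]
    for cell in cells:
        for r in range(cell.start_row_offset_idx, cell.end_row_offset_idx):
            for c in range(cell.start_col_offset_idx, cell.end_col_offset_idx):
                grid[r][c] = cell.text
    return grid


cells = [
    Cell("Name", 0, 1, 0, 2, column_header=True),  # spans two columns
    Cell("Ada", 1, 2, 0, 1),
    Cell("Lovelace", 1, 2, 1, 2),
]
print(to_grid(cells, num_rows=2, num_cols=2))
# [['Name', 'Name'], ['Ada', 'Lovelace']]
```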
TableData
Bases: BaseModel
BaseTableData.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"Orientation": {
"description": "Counter-clockwise rotation of a table on the page, in degrees.\n\nFollows the convention used by PIL/Pillow's ``Image.rotate``: positive\nangles rotate the table counter-clockwise. ``ROT_0`` / ``ROT_180`` keep\nrows running horizontally on the page; ``ROT_90`` / ``ROT_270`` turn\nrows into vertical stripes.",
"enum": [
"rot_0",
"rot_90",
"rot_180",
"rot_270"
],
"title": "Orientation",
"type": "string"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"RichTableCell": {
"description": "RichTableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
},
"ref": {
"$ref": "#/$defs/RefItem"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text",
"ref"
],
"title": "RichTableCell",
"type": "object"
},
"TableCell": {
"description": "TableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text"
],
"title": "TableCell",
"type": "object"
}
},
"description": "BaseTableData.",
"properties": {
"table_cells": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/RichTableCell"
},
{
"$ref": "#/$defs/TableCell"
}
]
},
"title": "Table Cells",
"type": "array"
},
"num_rows": {
"default": 0,
"title": "Num Rows",
"type": "integer"
},
"num_cols": {
"default": 0,
"title": "Num Cols",
"type": "integer"
},
"orientation": {
"$ref": "#/$defs/Orientation",
"default": "rot_0"
}
},
"title": "TableData",
"type": "object"
}
Fields:
-
table_cells (list[AnyTableCell])
-
num_rows (int)
-
num_cols (int)
-
orientation (Orientation)
num_cols
num_cols: int = 0
num_rows
num_rows: int = 0
orientation
orientation: Orientation = ROT_0
table_cells
table_cells: list[AnyTableCell] = []
add_row
add_row(row: list[str]) -> None
Add a new row to the table from a list of strings.
:param row: list[str]: A list of strings representing the content of the new row.
:returns: None
add_rows
add_rows(rows: list[list[str]]) -> None
Add multiple new rows to the table from a list of lists of strings.
:param rows: list[list[str]]: A list of lists, where each inner list represents the content of a new row.
:returns: None
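As a rough sketch of what adding rows from lists of strings involves, the snippet below models a table as a plain dictionary with the same three field names as TableData. It is not the library's implementation: the functions are hypothetical stand-ins, end offsets are assumed to be exclusive, and the way num_cols is updated is a guess.

```python
def row_to_cells(row, row_index):
    """Turn a list of strings into one cell dict per column.

    Field names follow the TableCell schema; end offsets are assumed exclusive.
    """
    return [
        {
            "text": text,
            "start_row_offset_idx": row_index,
            "end_row_offset_idx": row_index + 1,
            "start_col_offset_idx": col,
            "end_col_offset_idx": col + 1,
        }
        for col, text in enumerate(row)
    ]


def add_row(table, row):
    """Append one row at the bottom of the table (stand-in, not docling's code)."""
    table["table_cells"].extend(row_to_cells(row, table["num_rows"]))
    table["num_rows"] += 1
    table["num_cols"] = max(table["num_cols"], len(row))


def add_rows(table, rows):
    """Append several rows in order."""
    for row in rows:
        add_row(table, row)


table = {"table_cells": [], "num_rows": 0, "num_cols": 0}
add_rows(table, [["a", "b"], ["c", "d"]])
print(table["num_rows"], table["num_cols"], len(table["table_cells"]))  # 2 2 4
```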
from_regions
from_regions(
table_bbox: BoundingBox,
rows: list[BoundingBox],
cols: list[BoundingBox],
merges: list[BoundingBox],
row_headers: list[BoundingBox] = [],
col_headers: list[BoundingBox] = [],
row_sections: list[BoundingBox] = [],
) -> Self
Converts regions (rows, columns, and merged cells) into a TableData structure.
Adds semantics for the row_headers, col_headers, and row_sections regions.
get_column_bounding_boxes
get_column_bounding_boxes(
*, minimal: bool = True
) -> dict[int, BoundingBox]
Get the bounding box for each column in the table.
Layout follows the table's orientation field: ROT_0 / ROT_180
keep columns running top-to-bottom on the page; ROT_90 / ROT_270
turn columns into horizontal stripes. This affects both the axis along
which span cells extend a column's bbox and, when minimal=False, the
axis equalized across columns.
Parameters:
-
minimal(bool, default:True) –If True (default), returns the minimal bounding box for each column based on its cells. If False, all columns will have a uniform extent perpendicular to the column direction (t/b for ROT_0/ROT_180, l/r for ROT_90/ROT_270).
Returns:
-
dict[int, BoundingBox]–A dictionary mapping column indices to their bounding boxes. Only columns with cells that have bounding boxes are included.
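The difference between minimal=True and minimal=False can be sketched in plain Python. The example below is an illustration under simplifying assumptions, not the library's code: Box and column_boxes are hypothetical stand-ins, the table is at ROT_0 with a TOPLEFT origin, and spanning cells are ignored.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Stand-in for BoundingBox with a TOPLEFT origin (t < b)."""

    l: float
    t: float
    r: float
    b: float


def column_boxes(cells, minimal=True):
    """Compute one box per column from (column_index, Box) pairs.

    Sketch for ROT_0 only; spanning cells and other orientations are not handled.
    """
    boxes = {}
    for col, box in cells:
        if col not in boxes:
            boxes[col] = Box(box.l, box.t, box.r, box.b)
        else:
            cur = boxes[col]
            # Grow the column box to the union of its cells.
            boxes[col] = Box(
                min(cur.l, box.l), min(cur.t, box.t),
                max(cur.r, box.r), max(cur.b, box.b),
            )
    if not minimal and boxes:
        # Equalize the vertical extent (t/b) across all columns.
        top = min(b.t for b in boxes.values())
        bottom = max(b.b for b in boxes.values())
        boxes = {col: Box(b.l, top, b.r, bottom) for col, b in boxes.items()}
    return boxes


cells = [(0, Box(0, 0, 10, 5)), (0, Box(0, 5, 10, 10)), (1, Box(10, 0, 20, 5))]
print(column_boxes(cells))                 # column 1 ends at b=5
print(column_boxes(cells, minimal=False))  # column 1 is stretched to b=10
```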
get_row_bounding_boxes
get_row_bounding_boxes(
*, minimal: bool = True
) -> dict[int, BoundingBox]
Get the bounding box for each row in the table.
Layout follows the table's orientation field: ROT_0 / ROT_180
keep rows running left-to-right on the page; ROT_90 / ROT_270
turn rows into vertical stripes. This affects both the axis along which
span cells extend a row's bbox and, when minimal=False, the axis
equalized across rows.
Parameters:
-
minimal(bool, default:True) –If True (default), returns the minimal bounding box for each row based on its cells. If False, all rows will have a uniform extent perpendicular to the row direction (l/r for ROT_0/ROT_180, t/b for ROT_90/ROT_270).
Returns:
-
dict[int, BoundingBox]–A dictionary mapping row indices to their bounding boxes. Only rows with cells that have bounding boxes are included.
insert_row
insert_row(
row_index: int, row: list[str], after: bool = False
) -> None
Insert a new row from a list of strings before/after a specific index in the table.
:param row_index: int: The index at which to insert the new row. (Starting from 0)
:param row: list[str]: A list of strings representing the content of the new row.
:param after: bool: If True, insert the row after the specified index, otherwise before it. (Default is False)
:returns: None
insert_rows
insert_rows(
row_index: int,
rows: list[list[str]],
after: bool = False,
) -> None
Insert multiple new rows from a list of lists of strings before/after a specific index in the table.
:param row_index: int: The index at which to insert the new rows. (Starting from 0)
:param rows: list[list[str]]: A list of lists, where each inner list represents the content of a new row.
:param after: bool: If True, insert the rows after the specified index, otherwise before it. (Default is False)
:returns: None
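The before/after semantics of row_index can be sketched on a plain list of rows. This is only an illustration of the index arithmetic: the function below is a hypothetical stand-in that returns a new list, whereas the documented methods modify the table and return None.

```python
def insert_rows(table, row_index, rows, after=False):
    """Return a copy of table with rows inserted before or after row_index (0-based)."""
    pos = row_index + 1 if after else row_index
    return table[:pos] + rows + table[pos:]


table = [["a"], ["b"], ["c"]]
print(insert_rows(table, 1, [["x"]]))              # [['a'], ['x'], ['b'], ['c']]
print(insert_rows(table, 1, [["x"]], after=True))  # [['a'], ['b'], ['x'], ['c']]
```

Inserting a single row is the same operation with a one-element list of rows.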
pop_row
pop_row(
doc: Optional[DoclingDocument] = None,
) -> list[TableCell]
Remove and return the last row from the table.
:returns: list[TableCell]: A list of TableCell objects representing the popped row.
remove_row
remove_row(
row_index: int, doc: Optional[DoclingDocument] = None
) -> list[TableCell]
Remove a row from the table by its index.
:param row_index: int: The index of the row to remove. (Starting from 0)
:returns: list[TableCell]: A list of TableCell objects representing the removed row.
remove_rows
remove_rows(
indices: list[int],
doc: Optional[DoclingDocument] = None,
) -> list[list[TableCell]]
Remove rows from the table by their indices.
:param indices: list[int]: A list of indices of the rows to remove. (Starting from 0)
:returns: list[list[TableCell]]: The removed rows, each represented as a list of TableCell objects.
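When several rows are removed by index, each removal shifts the rows below it. One common way to handle this, sketched below on a plain list of rows, is to remove from the highest index down. This is an assumption about a reasonable approach, not a description of the library's implementation; the function is a hypothetical stand-in, and returning the removed rows in ascending index order is a choice made for this example.

```python
def remove_rows(table, indices):
    """Remove rows at the given 0-based indices in place.

    Returns the removed rows in ascending index order.
    """
    removed = {}
    # Pop from the bottom up so earlier removals do not shift later indices.
    for idx in sorted(set(indices), reverse=True):
        removed[idx] = table.pop(idx)
    return [removed[idx] for idx in sorted(removed)]


table = [["a"], ["b"], ["c"], ["d"]]
removed = remove_rows(table, [0, 2])
print(removed)  # [['a'], ['c']]
print(table)    # [['b'], ['d']]
```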
TableCellLabel
Bases: str, Enum
TableCellLabel.
Methods:
-
get_color–Return the RGB color associated with a given label.
Attributes:
-
BODY
-
COLUMN_HEADER
-
ROW_HEADER
-
ROW_SECTION
BODY
BODY = 'body'
COLUMN_HEADER
COLUMN_HEADER = 'col_header'
ROW_HEADER
ROW_HEADER = 'row_header'
ROW_SECTION
ROW_SECTION = 'row_section'
get_color
get_color(label: TableCellLabel) -> tuple[int, int, int]
Return the RGB color associated with a given label.
KeyValueItem
Bases: FloatingItem
KeyValueItem.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DescriptionMetaField": {
"additionalProperties": true,
"description": "Description metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "DescriptionMetaField",
"type": "object"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"FloatingMeta": {
"additionalProperties": true,
"description": "Metadata model for floating.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "FloatingMeta",
"type": "object"
},
"GraphCell": {
"description": "GraphCell.",
"properties": {
"label": {
"$ref": "#/$defs/GraphCellLabel"
},
"cell_id": {
"title": "Cell Id",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"orig": {
"title": "Orig",
"type": "string"
},
"prov": {
"anyOf": [
{
"$ref": "#/$defs/ProvenanceItem"
},
{
"type": "null"
}
],
"default": null
},
"item_ref": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"label",
"cell_id",
"text",
"orig"
],
"title": "GraphCell",
"type": "object"
},
"GraphCellLabel": {
"description": "GraphCellLabel.",
"enum": [
"unspecified",
"key",
"value",
"checkbox"
],
"title": "GraphCellLabel",
"type": "string"
},
"GraphData": {
"description": "GraphData.",
"properties": {
"cells": {
"items": {
"$ref": "#/$defs/GraphCell"
},
"title": "Cells",
"type": "array"
},
"links": {
"items": {
"$ref": "#/$defs/GraphLink"
},
"title": "Links",
"type": "array"
}
},
"title": "GraphData",
"type": "object"
},
"GraphLink": {
"description": "GraphLink.",
"properties": {
"label": {
"$ref": "#/$defs/GraphLinkLabel"
},
"source_cell_id": {
"title": "Source Cell Id",
"type": "integer"
},
"target_cell_id": {
"title": "Target Cell Id",
"type": "integer"
}
},
"required": [
"label",
"source_cell_id",
"target_cell_id"
],
"title": "GraphLink",
"type": "object"
},
"GraphLinkLabel": {
"description": "GraphLinkLabel.",
"enum": [
"unspecified",
"to_value",
"to_key",
"to_parent",
"to_child"
],
"title": "GraphLinkLabel",
"type": "string"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "KeyValueItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/FloatingMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "key_value_region",
"default": "key_value_region",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"graph": {
"$ref": "#/$defs/GraphData"
}
},
"required": [
"self_ref",
"graph"
],
"title": "KeyValueItem",
"type": "object"
}
Fields:
- self_ref (str)
- parent (Optional[RefItem])
- children (list[RefItem])
- content_layer (ContentLayer)
- meta (Optional[FloatingMeta])
- prov (list[ProvenanceItem])
- source (list[SourceType])
- comments (list[FineRef])
- captions (list[RefItem])
- references (list[RefItem])
- footnotes (list[RefItem])
- image (Optional[ImageRef])
- label (Literal[KEY_VALUE_REGION])
- graph (GraphData)
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
graph
graph: GraphData
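As an illustration of the `GraphData` shape given in the schema above (a list of cells plus a list of links between cell ids), the following sketch builds a small key/value graph from plain dictionaries and checks that every link points at an existing cell. It uses only the standard library; the helper name `dangling_links` and the sample values are assumptions made for this example and are not part of the documented API.

```python
# Plain-dict payload mirroring the GraphData schema: "cells" plus "links".
# The sample texts are made up for illustration.
graph = {
    "cells": [
        {"label": "key", "cell_id": 0, "text": "Invoice No", "orig": "Invoice No:"},
        {"label": "value", "cell_id": 1, "text": "12345", "orig": "12345"},
    ],
    "links": [
        {"label": "to_value", "source_cell_id": 0, "target_cell_id": 1},
        {"label": "to_key", "source_cell_id": 1, "target_cell_id": 0},
    ],
}


def dangling_links(graph: dict) -> list:
    """Return the links whose source or target cell_id has no matching cell.

    Hypothetical helper, not a docling function.
    """
    known_ids = {cell["cell_id"] for cell in graph["cells"]}
    return [
        link
        for link in graph["links"]
        if link["source_cell_id"] not in known_ids
        or link["target_cell_id"] not in known_ids
    ]


print(dangling_links(graph))  # prints []
```

A check like this is only a consistency test on the payload; whether the library itself enforces it is not stated in this reference.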
meta
meta: Optional[FloatingMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
self_ref
self_ref: str
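The schema constrains `self_ref` (and the `$ref` of `RefItem` and `FineRef`) with the pattern `^#(?:/([\w-]+)(?:/(\d+))?)?$`. A minimal sketch of what that pattern accepts, using Python's standard `re` module, is shown here; the helper name `parse_ref` is hypothetical and only serves the example.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the "self_ref" / "$ref" entries of the JSON schema.
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_ref(ref: str) -> Tuple[Optional[str], Optional[int]]:
    """Split a reference string into (collection, index).

    Hypothetical helper: either part is None when the reference omits it.
    """
    match = REF_PATTERN.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a valid reference: {ref!r}")
    collection, index = match.groups()
    return collection, int(index) if index is not None else None


print(parse_ref("#/key_value_items/0"))  # prints ('key_value_items', 0)
print(parse_ref("#"))                    # prints (None, None)
```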
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
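To make the `source` entries more concrete, the sketch below checks a plain dictionary against the constraints the `TrackSource` schema lists: `kind` fixed to `"track"`, and `start_time` and `end_time` required and numeric. It is a standard-library illustration under those assumptions, not the library's own validation; `check_track_source` is a hypothetical name.

```python
REQUIRED_NUMBERS = ("start_time", "end_time")


def check_track_source(entry: dict) -> list:
    """Return a list of problems found in a TrackSource-shaped dict.

    Hypothetical helper that mirrors only the constraints visible in the schema.
    """
    problems = []
    # "kind" is a constant with default "track", so it may be omitted.
    if entry.get("kind", "track") != "track":
        problems.append("kind must be 'track'")
    for name in REQUIRED_NUMBERS:
        if name not in entry:
            problems.append(f"missing {name}")
        elif isinstance(entry[name], bool) or not isinstance(entry[name], (int, float)):
            problems.append(f"{name} must be a number")
    return problems


cue = {"kind": "track", "start_time": 11.0, "end_time": 12.0, "voice": "Speaker 1"}
print(check_track_source(cue))                   # prints []
print(check_track_source({"start_time": 6.5}))   # prints ['missing end_time']
```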
export_to_document_tokens
export_to_document_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_content: bool = True,
)
Export the key-value item to the document tokens format.
Parameters:
- doc (DoclingDocument)
- new_line (str, default ""): deprecated
- xsize (int, default 500)
- ysize (int, default 500)
- add_location (bool, default True)
- add_content (bool, default True)
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this DocItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image corresponding to this FloatingItem.
This function returns the PIL image from self.image if one is available. Otherwise, it uses DocItem.get_image to get an image of this FloatingItem.
In particular, when self.image is None, the function returns None if this FloatingItem has no valid provenance or the doc does not contain a valid image for the required page.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for the BaseCell.
SectionHeaderItem
Bases: TextItem
SectionItem.
Show JSON schema:
{
"$defs": {
"BaseMeta": {
"additionalProperties": true,
"description": "Base class for metadata.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "BaseMeta",
"type": "object"
},
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"Formatting": {
"description": "Formatting.",
"properties": {
"bold": {
"default": false,
"title": "Bold",
"type": "boolean"
},
"italic": {
"default": false,
"title": "Italic",
"type": "boolean"
},
"underline": {
"default": false,
"title": "Underline",
"type": "boolean"
},
"strikethrough": {
"default": false,
"title": "Strikethrough",
"type": "boolean"
},
"script": {
"$ref": "#/$defs/Script",
"default": "baseline"
}
},
"title": "Formatting",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicit\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"Script": {
"description": "Text script position.",
"enum": [
"baseline",
"sub",
"super"
],
"title": "Script",
"type": "string"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "SectionItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/BaseMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"const": "section_header",
"default": "section_header",
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"orig": {
"title": "Orig",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"formatting": {
"anyOf": [
{
"$ref": "#/$defs/Formatting"
},
{
"type": "null"
}
],
"default": null
},
"hyperlink": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Hyperlink"
},
"level": {
"default": 1,
"maximum": 100,
"minimum": 1,
"title": "Level",
"type": "integer"
}
},
"required": [
"self_ref",
"orig",
"text"
],
"title": "SectionHeaderItem",
"type": "object"
}
Fields:
- self_ref (str)
- parent (Optional[RefItem])
- children (list[RefItem])
- content_layer (ContentLayer)
- meta (Optional[BaseMeta])
- prov (list[ProvenanceItem])
- source (list[SourceType])
- comments (list[FineRef])
- orig (str)
- text (str)
- formatting (Optional[Formatting])
- hyperlink (Optional[Union[AnyUrl, Path]])
- label (Literal[SECTION_HEADER])
- level (LevelNumber)
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
formatting
formatting: Optional[Formatting] = None
hyperlink
hyperlink: Optional[Union[AnyUrl, Path]] = None
level
level: LevelNumber = 1
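According to the schema above, `level` is an integer between 1 and 100 with a default of 1. As a rough illustration of how a consumer might use it, the sketch below turns section-header-shaped dictionaries into an indented outline. The function name `outline` and the two-space indentation are assumptions of this example, not behavior of the library.

```python
def outline(headers: list) -> list:
    """Render section headers as an indented outline, one line per header.

    Hypothetical helper: each header is a dict with "text" and an optional "level".
    """
    lines = []
    for header in headers:
        level = header.get("level", 1)  # the schema default is 1
        if not isinstance(level, int) or not 1 <= level <= 100:
            raise ValueError(f"level must be an integer in [1, 100], got {level!r}")
        lines.append("  " * (level - 1) + header["text"])
    return lines


headers = [
    {"text": "Introduction", "level": 1},
    {"text": "Scope", "level": 2},
    {"text": "Methods"},  # no level given, so it falls back to 1
]
print("\n".join(outline(headers)))
```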
meta
meta: Optional[BaseMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
orig
orig: str
self_ref
self_ref: str
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
text
text: str
export_to_doctags
export_to_doctags(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_content: bool = True,
)
Export the text element to the document tokens format.
Parameters:
- doc (DoclingDocument)
- new_line (str, default ""): deprecated
- xsize (int, default 500)
- ysize (int, default 500)
- add_location (bool, default True)
- add_content (bool, default True)
export_to_document_tokens
export_to_document_tokens(*args, **kwargs)
Export to DocTags format.
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this DocItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image of this DocItem.
The function returns None if this DocItem has no valid provenance or if a valid image of the page containing this DocItem is not available in doc.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for the BaseCell.
PictureItem
Bases: FloatingItem
PictureItem.
Show JSON schema:
{
"$defs": {
"BoundingBox": {
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
},
"ChartBar": {
"description": "Represents a bar in a bar chart.\n\nAttributes:\n label (str): The label for the bar.\n values (float): The value associated with the bar.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"values": {
"title": "Values",
"type": "number"
}
},
"required": [
"label",
"values"
],
"title": "ChartBar",
"type": "object"
},
"ChartLine": {
"description": "Represents a line in a line chart.\n\nAttributes:\n label (str): The label for the line.\n values (list[tuple[float, float]]): A list of (x, y) coordinate pairs\n representing the line's data points.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"values": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"type": "array"
},
"title": "Values",
"type": "array"
}
},
"required": [
"label",
"values"
],
"title": "ChartLine",
"type": "object"
},
"ChartPoint": {
"description": "Represents a point in a scatter chart.\n\nAttributes:\n value (Tuple[float, float]): A (x, y) coordinate pair representing a point in a\n chart.",
"properties": {
"value": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"title": "Value",
"type": "array"
}
},
"required": [
"value"
],
"title": "ChartPoint",
"type": "object"
},
"ChartSlice": {
"description": "Represents a slice in a pie chart.\n\nAttributes:\n label (str): The label for the slice.\n value (float): The value represented by the slice.",
"properties": {
"label": {
"title": "Label",
"type": "string"
},
"value": {
"title": "Value",
"type": "number"
}
},
"required": [
"label",
"value"
],
"title": "ChartSlice",
"type": "object"
},
"ChartStackedBar": {
"description": "Represents a stacked bar in a stacked bar chart.\n\nAttributes:\n label (list[str]): The labels for the stacked bars. Multiple values are stored\n in cases where the chart is \"double stacked,\" meaning bars are stacked both\n horizontally and vertically.\n values (list[tuple[str, int]]): A list of values representing different segments\n of the stacked bar along with their label.",
"properties": {
"label": {
"items": {
"type": "string"
},
"title": "Label",
"type": "array"
},
"values": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "string"
},
{
"type": "integer"
}
],
"type": "array"
},
"title": "Values",
"type": "array"
}
},
"required": [
"label",
"values"
],
"title": "ChartStackedBar",
"type": "object"
},
"CodeLanguageLabel": {
"description": "CodeLanguageLabel.",
"enum": [
"Ada",
"Awk",
"Bash",
"bc",
"C",
"C#",
"C++",
"CMake",
"COBOL",
"CSS",
"Ceylon",
"Clojure",
"Crystal",
"Cuda",
"Cython",
"D",
"Dart",
"dc",
"Dockerfile",
"DocLang",
"Elixir",
"Erlang",
"FORTRAN",
"Forth",
"Go",
"HTML",
"Haskell",
"Haxe",
"Java",
"JavaScript",
"JSON",
"Julia",
"Kotlin",
"Latex",
"Lisp",
"Lua",
"Matlab",
"MoonScript",
"Nim",
"OCaml",
"ObjectiveC",
"Octave",
"PHP",
"Pascal",
"Perl",
"Prolog",
"Python",
"Racket",
"Ruby",
"Rust",
"SML",
"SQL",
"Scala",
"Scheme",
"Swift",
"Tikz",
"TypeScript",
"unknown",
"VisualBasic",
"XML",
"YAML"
],
"title": "CodeLanguageLabel",
"type": "string"
},
"CodeMetaField": {
"additionalProperties": true,
"description": "Code representation for the respective item.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/CodeLanguageLabel"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"text"
],
"title": "CodeMetaField",
"type": "object"
},
"ContentLayer": {
"description": "ContentLayer.",
"enum": [
"body",
"furniture",
"background",
"invisible",
"notes"
],
"title": "ContentLayer",
"type": "string"
},
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
},
"DescriptionAnnotation": {
"description": "DescriptionAnnotation.",
"properties": {
"kind": {
"const": "description",
"default": "description",
"title": "Kind",
"type": "string"
},
"text": {
"title": "Text",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
}
},
"required": [
"text",
"provenance"
],
"title": "DescriptionAnnotation",
"type": "object"
},
"DescriptionMetaField": {
"additionalProperties": true,
"description": "Description metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "DescriptionMetaField",
"type": "object"
},
"EntitiesMetaField": {
"additionalProperties": true,
"description": "Container for extracted entity mentions.",
"properties": {
"mentions": {
"items": {
"$ref": "#/$defs/EntityMention"
},
"minItems": 1,
"title": "Mentions",
"type": "array"
}
},
"required": [
"mentions"
],
"title": "EntitiesMetaField",
"type": "object"
},
"EntityMention": {
"additionalProperties": true,
"description": "Entity mention extracted from text.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"description": "Normalized text of the entity mention.",
"title": "Text",
"type": "string"
},
"orig": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Exact source text extracted from the original charspan, analogous to TextItem.orig. This may differ from 'text' when the mention has been normalized.",
"title": "Orig"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "Entity type or category.",
"title": "Label"
},
"charspan": {
"anyOf": [
{
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "Character span (0-indexed) of the entity mention in the source text.",
"title": "Charspan"
}
},
"required": [
"text"
],
"title": "EntityMention",
"type": "object"
},
"FineRef": {
"description": "Fine-granular reference item that can capture span range info.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
},
"range": {
"anyOf": [
{
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Range"
}
},
"required": [
"$ref"
],
"title": "FineRef",
"type": "object"
},
"HumanLanguageLabel": {
"description": "Two-letter human language primary subtags using BCP-47 values.",
"enum": [
"aa",
"ab",
"ae",
"af",
"ak",
"am",
"an",
"ar",
"as",
"av",
"ay",
"az",
"ba",
"be",
"bg",
"bh",
"bi",
"bm",
"bn",
"bo",
"br",
"bs",
"ca",
"ce",
"ch",
"co",
"cr",
"cs",
"cu",
"cv",
"cy",
"da",
"de",
"dv",
"dz",
"ee",
"el",
"en",
"eo",
"es",
"et",
"eu",
"fa",
"ff",
"fi",
"fj",
"fo",
"fr",
"fy",
"ga",
"gd",
"gl",
"gn",
"gu",
"gv",
"ha",
"he",
"hi",
"ho",
"hr",
"ht",
"hu",
"hy",
"hz",
"ia",
"id",
"ie",
"ig",
"ii",
"ik",
"io",
"is",
"it",
"iu",
"ja",
"jv",
"ka",
"kg",
"ki",
"kj",
"kk",
"kl",
"km",
"kn",
"ko",
"kr",
"ks",
"ku",
"kv",
"kw",
"ky",
"la",
"lb",
"lg",
"li",
"ln",
"lo",
"lt",
"lu",
"lv",
"mg",
"mh",
"mi",
"mk",
"ml",
"mn",
"mr",
"ms",
"mt",
"my",
"na",
"nb",
"nd",
"ne",
"ng",
"nl",
"nn",
"no",
"nr",
"nv",
"ny",
"oc",
"oj",
"om",
"or",
"os",
"pa",
"pi",
"pl",
"ps",
"pt",
"qu",
"rm",
"rn",
"ro",
"ru",
"rw",
"sa",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sk",
"sl",
"sm",
"sn",
"so",
"sq",
"sr",
"ss",
"st",
"su",
"sv",
"sw",
"ta",
"te",
"tg",
"th",
"ti",
"tk",
"tl",
"tn",
"to",
"tr",
"ts",
"tt",
"tw",
"ty",
"ug",
"uk",
"ur",
"uz",
"ve",
"vi",
"vo",
"wa",
"wo",
"xh",
"yi",
"yo",
"za",
"zh",
"zu"
],
"title": "HumanLanguageLabel",
"type": "string"
},
"ImageRef": {
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
},
"LanguageMetaField": {
"additionalProperties": true,
"description": "Detected human language.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"code": {
"$ref": "#/$defs/HumanLanguageLabel"
}
},
"required": [
"code"
],
"title": "LanguageMetaField",
"type": "object"
},
"MiscAnnotation": {
"description": "MiscAnnotation.",
"properties": {
"kind": {
"const": "misc",
"default": "misc",
"title": "Kind",
"type": "string"
},
"content": {
"additionalProperties": true,
"title": "Content",
"type": "object"
}
},
"required": [
"content"
],
"title": "MiscAnnotation",
"type": "object"
},
"MoleculeMetaField": {
"additionalProperties": true,
"description": "Molecule metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"smi": {
"description": "The SMILES representation of the molecule.",
"title": "Smi",
"type": "string"
}
},
"required": [
"smi"
],
"title": "MoleculeMetaField",
"type": "object"
},
"Orientation": {
"description": "Counter-clockwise rotation of a table on the page, in degrees.\n\nFollows the convention used by PIL/Pillow's ``Image.rotate``: positive\nangles rotate the table counter-clockwise. ``ROT_0`` / ``ROT_180`` keep\nrows running horizontally on the page; ``ROT_90`` / ``ROT_270`` turn\nrows into vertical stripes.",
"enum": [
"rot_0",
"rot_90",
"rot_180",
"rot_270"
],
"title": "Orientation",
"type": "string"
},
"PictureBarChartData": {
"description": "Represents data of a bar chart.\n\nAttributes:\n kind (Literal[\"bar_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n bars (list[ChartBar]): A list of bars in the chart.",
"properties": {
"kind": {
"const": "bar_chart_data",
"default": "bar_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"bars": {
"items": {
"$ref": "#/$defs/ChartBar"
},
"title": "Bars",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"bars"
],
"title": "PictureBarChartData",
"type": "object"
},
"PictureClassificationClass": {
"description": "PictureClassificationData.",
"properties": {
"class_name": {
"title": "Class Name",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
}
},
"required": [
"class_name",
"confidence"
],
"title": "PictureClassificationClass",
"type": "object"
},
"PictureClassificationData": {
"description": "PictureClassificationData.",
"properties": {
"kind": {
"const": "classification",
"default": "classification",
"title": "Kind",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
},
"predicted_classes": {
"items": {
"$ref": "#/$defs/PictureClassificationClass"
},
"title": "Predicted Classes",
"type": "array"
}
},
"required": [
"provenance",
"predicted_classes"
],
"title": "PictureClassificationData",
"type": "object"
},
"PictureClassificationMetaField": {
"additionalProperties": true,
"description": "Picture classification metadata field.",
"properties": {
"predictions": {
"items": {
"$ref": "#/$defs/PictureClassificationPrediction"
},
"minItems": 1,
"title": "Predictions",
"type": "array"
}
},
"title": "PictureClassificationMetaField",
"type": "object"
},
"PictureClassificationPrediction": {
"additionalProperties": true,
"description": "Picture classification instance.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"class_name": {
"title": "Class Name",
"type": "string"
}
},
"required": [
"class_name"
],
"title": "PictureClassificationPrediction",
"type": "object"
},
"PictureLineChartData": {
"description": "Represents data of a line chart.\n\nAttributes:\n kind (Literal[\"line_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n lines (list[ChartLine]): A list of lines in the chart.",
"properties": {
"kind": {
"const": "line_chart_data",
"default": "line_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"lines": {
"items": {
"$ref": "#/$defs/ChartLine"
},
"title": "Lines",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"lines"
],
"title": "PictureLineChartData",
"type": "object"
},
"PictureMeta": {
"additionalProperties": true,
"description": "Metadata model for pictures.",
"properties": {
"summary": {
"anyOf": [
{
"$ref": "#/$defs/SummaryMetaField"
},
{
"type": "null"
}
],
"default": null
},
"language": {
"anyOf": [
{
"$ref": "#/$defs/LanguageMetaField"
},
{
"type": "null"
}
],
"default": null
},
"entities": {
"anyOf": [
{
"$ref": "#/$defs/EntitiesMetaField"
},
{
"type": "null"
}
],
"default": null
},
"description": {
"anyOf": [
{
"$ref": "#/$defs/DescriptionMetaField"
},
{
"type": "null"
}
],
"default": null
},
"classification": {
"anyOf": [
{
"$ref": "#/$defs/PictureClassificationMetaField"
},
{
"type": "null"
}
],
"default": null
},
"molecule": {
"anyOf": [
{
"$ref": "#/$defs/MoleculeMetaField"
},
{
"type": "null"
}
],
"default": null
},
"tabular_chart": {
"anyOf": [
{
"$ref": "#/$defs/TabularChartMetaField"
},
{
"type": "null"
}
],
"default": null
},
"code": {
"anyOf": [
{
"$ref": "#/$defs/CodeMetaField"
},
{
"type": "null"
}
],
"default": null
}
},
"title": "PictureMeta",
"type": "object"
},
"PictureMoleculeData": {
"description": "PictureMoleculeData.",
"properties": {
"kind": {
"const": "molecule_data",
"default": "molecule_data",
"title": "Kind",
"type": "string"
},
"smi": {
"title": "Smi",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
},
"class_name": {
"title": "Class Name",
"type": "string"
},
"segmentation": {
"items": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "number"
},
{
"type": "number"
}
],
"type": "array"
},
"title": "Segmentation",
"type": "array"
},
"provenance": {
"title": "Provenance",
"type": "string"
}
},
"required": [
"smi",
"confidence",
"class_name",
"segmentation",
"provenance"
],
"title": "PictureMoleculeData",
"type": "object"
},
"PicturePieChartData": {
"description": "Represents data of a pie chart.\n\nAttributes:\n kind (Literal[\"pie_chart_data\"]): The type of the chart.\n slices (list[ChartSlice]): A list of slices in the pie chart.",
"properties": {
"kind": {
"const": "pie_chart_data",
"default": "pie_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"slices": {
"items": {
"$ref": "#/$defs/ChartSlice"
},
"title": "Slices",
"type": "array"
}
},
"required": [
"title",
"slices"
],
"title": "PicturePieChartData",
"type": "object"
},
"PictureScatterChartData": {
"description": "Represents data of a scatter chart.\n\nAttributes:\n kind (Literal[\"scatter_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n points (list[ChartPoint]): A list of points in the scatter chart.",
"properties": {
"kind": {
"const": "scatter_chart_data",
"default": "scatter_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"points": {
"items": {
"$ref": "#/$defs/ChartPoint"
},
"title": "Points",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"points"
],
"title": "PictureScatterChartData",
"type": "object"
},
"PictureStackedBarChartData": {
"description": "Represents data of a stacked bar chart.\n\nAttributes:\n kind (Literal[\"stacked_bar_chart_data\"]): The type of the chart.\n x_axis_label (str): The label for the x-axis.\n y_axis_label (str): The label for the y-axis.\n stacked_bars (list[ChartStackedBar]): A list of stacked bars in the chart.",
"properties": {
"kind": {
"const": "stacked_bar_chart_data",
"default": "stacked_bar_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"x_axis_label": {
"title": "X Axis Label",
"type": "string"
},
"y_axis_label": {
"title": "Y Axis Label",
"type": "string"
},
"stacked_bars": {
"items": {
"$ref": "#/$defs/ChartStackedBar"
},
"title": "Stacked Bars",
"type": "array"
}
},
"required": [
"title",
"x_axis_label",
"y_axis_label",
"stacked_bars"
],
"title": "PictureStackedBarChartData",
"type": "object"
},
"PictureTabularChartData": {
"description": "Base class for picture chart data.\n\nAttributes:\n title (str): The title of the chart.\n chart_data (TableData): Chart data in the table format.",
"properties": {
"kind": {
"const": "tabular_chart_data",
"default": "tabular_chart_data",
"title": "Kind",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"chart_data": {
"$ref": "#/$defs/TableData"
}
},
"required": [
"title",
"chart_data"
],
"title": "PictureTabularChartData",
"type": "object"
},
"ProvenanceItem": {
"description": "Provenance information for elements extracted from a textual document.\n\nA `ProvenanceItem` object acts as a lightweight pointer back into the original\ndocument for an extracted element. It applies to documents with an explicity\nor implicit layout, such as PDF, HTML, docx, or pptx.",
"properties": {
"page_no": {
"description": "Page number",
"title": "Page No",
"type": "integer"
},
"bbox": {
"$ref": "#/$defs/BoundingBox",
"description": "Bounding box"
},
"charspan": {
"description": "Character span (0-indexed)",
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"type": "integer"
},
{
"type": "integer"
}
],
"title": "Charspan",
"type": "array"
}
},
"required": [
"page_no",
"bbox",
"charspan"
],
"title": "ProvenanceItem",
"type": "object"
},
"RefItem": {
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
},
"RichTableCell": {
"description": "RichTableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
},
"ref": {
"$ref": "#/$defs/RefItem"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text",
"ref"
],
"title": "RichTableCell",
"type": "object"
},
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
},
"SummaryMetaField": {
"additionalProperties": true,
"description": "Summary data.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"text"
],
"title": "SummaryMetaField",
"type": "object"
},
"TableCell": {
"description": "TableCell.",
"properties": {
"bbox": {
"anyOf": [
{
"$ref": "#/$defs/BoundingBox"
},
{
"type": "null"
}
],
"default": null
},
"row_span": {
"default": 1,
"title": "Row Span",
"type": "integer"
},
"col_span": {
"default": 1,
"title": "Col Span",
"type": "integer"
},
"start_row_offset_idx": {
"title": "Start Row Offset Idx",
"type": "integer"
},
"end_row_offset_idx": {
"title": "End Row Offset Idx",
"type": "integer"
},
"start_col_offset_idx": {
"title": "Start Col Offset Idx",
"type": "integer"
},
"end_col_offset_idx": {
"title": "End Col Offset Idx",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"column_header": {
"default": false,
"title": "Column Header",
"type": "boolean"
},
"row_header": {
"default": false,
"title": "Row Header",
"type": "boolean"
},
"row_section": {
"default": false,
"title": "Row Section",
"type": "boolean"
},
"fillable": {
"default": false,
"title": "Fillable",
"type": "boolean"
}
},
"required": [
"start_row_offset_idx",
"end_row_offset_idx",
"start_col_offset_idx",
"end_col_offset_idx",
"text"
],
"title": "TableCell",
"type": "object"
},
"TableData": {
"description": "BaseTableData.",
"properties": {
"table_cells": {
"default": [],
"items": {
"anyOf": [
{
"$ref": "#/$defs/RichTableCell"
},
{
"$ref": "#/$defs/TableCell"
}
]
},
"title": "Table Cells",
"type": "array"
},
"num_rows": {
"default": 0,
"title": "Num Rows",
"type": "integer"
},
"num_cols": {
"default": 0,
"title": "Num Cols",
"type": "integer"
},
"orientation": {
"$ref": "#/$defs/Orientation",
"default": "rot_0"
}
},
"title": "TableData",
"type": "object"
},
"TabularChartMetaField": {
"additionalProperties": true,
"description": "Tabular chart metadata field.",
"properties": {
"confidence": {
"anyOf": [
{
"maximum": 1,
"minimum": 0,
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"description": "The confidence of the prediction.",
"examples": [
0.9,
0.42
],
"title": "Confidence"
},
"created_by": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The origin of the prediction.",
"examples": [
"ibm-granite/granite-docling-258M"
],
"title": "Created By"
},
"title": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Title"
},
"chart_data": {
"$ref": "#/$defs/TableData"
}
},
"required": [
"chart_data"
],
"title": "TabularChartMetaField",
"type": "object"
},
"TrackSource": {
"description": "Source metadata for a cue extracted from a media track.\n\nA `TrackSource` instance identifies a cue in a media track (audio, video, subtitles, screen-recording captions,\netc.). A *cue* here refers to any discrete segment that was pulled out of the original asset, e.g., a subtitle\nblock, an audio clip, or a timed marker in a screen-recording.",
"properties": {
"kind": {
"const": "track",
"default": "track",
"description": "Identifies this type of source.",
"title": "Kind",
"type": "string"
},
"start_time": {
"description": "Start time offset of the track cue in seconds",
"examples": [
11.0,
6.5,
5370.0
],
"title": "Start Time",
"type": "number"
},
"end_time": {
"description": "End time offset of the track cue in seconds",
"examples": [
12.0,
8.2,
5370.1
],
"title": "End Time",
"type": "number"
},
"identifier": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "An identifier of the cue",
"examples": [
"test",
"123",
"b72d946"
],
"title": "Identifier"
},
"voice": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The name of the voice in this track (the speaker)",
"examples": [
"John",
"Mary",
"Speaker 1"
],
"title": "Voice"
}
},
"required": [
"start_time",
"end_time"
],
"title": "TrackSource",
"type": "object"
}
},
"additionalProperties": false,
"description": "PictureItem.",
"properties": {
"self_ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "Self Ref",
"type": "string"
},
"parent": {
"anyOf": [
{
"$ref": "#/$defs/RefItem"
},
{
"type": "null"
}
],
"default": null
},
"children": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Children",
"type": "array"
},
"content_layer": {
"$ref": "#/$defs/ContentLayer",
"default": "body"
},
"meta": {
"anyOf": [
{
"$ref": "#/$defs/PictureMeta"
},
{
"type": "null"
}
],
"default": null
},
"label": {
"default": "picture",
"enum": [
"picture",
"chart"
],
"title": "Label",
"type": "string"
},
"prov": {
"default": [],
"items": {
"$ref": "#/$defs/ProvenanceItem"
},
"title": "Prov",
"type": "array"
},
"source": {
"default": [],
"description": "The provenance of this document item. Currently, it is only used for media track provenance.",
"items": {
"discriminator": {
"mapping": {
"track": "#/$defs/TrackSource"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/TrackSource"
}
]
},
"title": "Source",
"type": "array"
},
"comments": {
"default": [],
"items": {
"$ref": "#/$defs/FineRef"
},
"title": "Comments",
"type": "array"
},
"captions": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Captions",
"type": "array"
},
"references": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "References",
"type": "array"
},
"footnotes": {
"default": [],
"items": {
"$ref": "#/$defs/RefItem"
},
"title": "Footnotes",
"type": "array"
},
"image": {
"anyOf": [
{
"$ref": "#/$defs/ImageRef"
},
{
"type": "null"
}
],
"default": null
},
"annotations": {
"default": [],
"deprecated": true,
"items": {
"discriminator": {
"mapping": {
"bar_chart_data": "#/$defs/PictureBarChartData",
"classification": "#/$defs/PictureClassificationData",
"description": "#/$defs/DescriptionAnnotation",
"line_chart_data": "#/$defs/PictureLineChartData",
"misc": "#/$defs/MiscAnnotation",
"molecule_data": "#/$defs/PictureMoleculeData",
"pie_chart_data": "#/$defs/PicturePieChartData",
"scatter_chart_data": "#/$defs/PictureScatterChartData",
"stacked_bar_chart_data": "#/$defs/PictureStackedBarChartData",
"tabular_chart_data": "#/$defs/PictureTabularChartData"
},
"propertyName": "kind"
},
"oneOf": [
{
"$ref": "#/$defs/DescriptionAnnotation"
},
{
"$ref": "#/$defs/MiscAnnotation"
},
{
"$ref": "#/$defs/PictureClassificationData"
},
{
"$ref": "#/$defs/PictureMoleculeData"
},
{
"$ref": "#/$defs/PictureTabularChartData"
},
{
"$ref": "#/$defs/PictureLineChartData"
},
{
"$ref": "#/$defs/PictureBarChartData"
},
{
"$ref": "#/$defs/PictureStackedBarChartData"
},
{
"$ref": "#/$defs/PicturePieChartData"
},
{
"$ref": "#/$defs/PictureScatterChartData"
}
]
},
"title": "Annotations",
"type": "array"
}
},
"required": [
"self_ref"
],
"title": "PictureItem",
"type": "object"
}
Fields:
-
self_ref(str) -
parent(Optional[RefItem]) -
children(list[RefItem]) -
content_layer(ContentLayer) -
prov(list[ProvenanceItem]) -
source(list[SourceType]) -
comments(list[FineRef]) -
captions(list[RefItem]) -
references(list[RefItem]) -
footnotes(list[RefItem]) -
image(Optional[ImageRef]) -
label(Literal[PICTURE, CHART]) -
meta(Optional[PictureMeta]) -
annotations(list[PictureDataType])
Validators:
-
_migrate_annotations_to_meta
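As a rough illustration of the PictureItem schema, the sketch below checks a PictureItem-shaped dictionary with the standard library only. The helper `check_picture_item` and the sample values are hypothetical and not part of docling. The property names, the `label` enum, and the rejection of unknown keys (`additionalProperties: false`) are taken from the schema.

```python
# Property names copied from the PictureItem JSON schema; nothing here imports docling.
ALLOWED_KEYS = {
    "self_ref", "parent", "children", "content_layer", "meta", "label",
    "prov", "source", "comments", "captions", "references", "footnotes",
    "image", "annotations",
}
ALLOWED_LABELS = {"picture", "chart"}


def check_picture_item(item: dict) -> list[str]:
    """Return the problems found in a PictureItem-shaped dict (hypothetical helper)."""
    problems = []
    # "self_ref" is the only entry in the schema's "required" list.
    if "self_ref" not in item:
        problems.append("missing required field: self_ref")
    # The schema sets additionalProperties to false, so unknown keys are rejected.
    for key in sorted(set(item) - ALLOWED_KEYS):
        problems.append(f"unexpected field: {key}")
    # "label" defaults to "picture" when absent.
    if item.get("label", "picture") not in ALLOWED_LABELS:
        problems.append(f"invalid label: {item['label']}")
    return problems


print(check_picture_item({"self_ref": "#/pictures/0", "label": "chart"}))
print(check_picture_item({"label": "table", "caption": "x"}))
```

This only mirrors the top-level constraints. Nested values such as `prov` or `image` would need their own checks, and in practice the pydantic model performs the full validation.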
annotations
annotations: list[PictureDataType] = []
comments
comments: list[FineRef] = []
content_layer
content_layer: ContentLayer = BODY
meta
meta: Optional[PictureMeta] = None
model_config
model_config = ConfigDict(extra='forbid')
self_ref
self_ref: str
source
source: list[SourceType]
The provenance of this document item. Currently, it is only used for media track provenance.
export_to_doctags
export_to_doctags(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
add_location: bool = True,
add_caption: bool = True,
add_content: bool = True,
)
Export picture to document tokens format.
:param doc: DoclingDocument
:param new_line: str (default value = ""), deprecated
:param xsize: int (default value = 500)
:param ysize: int (default value = 500)
:param add_location: bool (default value = True)
:param add_caption: bool (default value = True)
:param add_content: bool (default value = True), not used at the moment
export_to_document_tokens
export_to_document_tokens(*args, **kwargs)
Export to DocTags format.
export_to_html
export_to_html(
doc: DoclingDocument,
add_caption: bool = True,
image_mode: ImageRefMode = PLACEHOLDER,
) -> str
Export picture to HTML format.
export_to_markdown
export_to_markdown(
doc: DoclingDocument,
add_caption: bool = True,
image_mode: ImageRefMode = EMBEDDED,
image_placeholder: str = "<!-- image -->",
) -> str
Export picture to Markdown format.
get_annotations
get_annotations() -> Sequence[BaseAnnotation]
Get the annotations of this PictureItem.
get_image
get_image(
doc: DoclingDocument, prov_index: int = 0
) -> Optional[Image]
Returns the image corresponding to this FloatingItem.
This function returns the PIL image from self.image if one is available. Otherwise, it uses DocItem.get_image to get an image of this FloatingItem.
In particular, when self.image is None, the function returns None if this FloatingItem has no valid provenance or the doc does not contain a valid image for the required page.
get_location_tokens
get_location_tokens(
doc: DoclingDocument,
new_line: str = "",
xsize: int = 500,
ysize: int = 500,
self_closing: bool = False,
) -> str
Get the location string for this item.
ImageRef
Bases: BaseModel
ImageRef.
Show JSON schema:
{
"$defs": {
"Size": {
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
}
},
"description": "ImageRef.",
"properties": {
"mimetype": {
"title": "Mimetype",
"type": "string"
},
"dpi": {
"title": "Dpi",
"type": "integer"
},
"size": {
"$ref": "#/$defs/Size"
},
"uri": {
"anyOf": [
{
"format": "uri",
"minLength": 1,
"type": "string"
},
{
"format": "path",
"type": "string"
}
],
"title": "Uri"
}
},
"required": [
"mimetype",
"dpi",
"size",
"uri"
],
"title": "ImageRef",
"type": "object"
}
Fields:
-
mimetype(str) -
dpi(int) -
size(Size) -
uri(Union[AnyUrl, Path])
Validators:
-
validate_mimetype
dpi
dpi: int
mimetype
mimetype: str
pil_image
pil_image: Optional[Image]
Return the PIL Image.
uri
uri: Union[AnyUrl, Path]
from_pil
from_pil(image: Image, dpi: int) -> Self
Construct ImageRef from a PIL Image.
validate_mimetype
validate_mimetype(v)
validate_mimetype.
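To make the ImageRef schema more concrete, here is a minimal sketch that builds an ImageRef-shaped dictionary whose `uri` is a base64 data URI. The helper `image_ref_dict` and the sample bytes are hypothetical. Whether `from_pil` encodes images this way is an assumption, not something this reference states.

```python
import base64

# Keys from the "required" list of the ImageRef JSON schema.
REQUIRED_KEYS = {"mimetype", "dpi", "size", "uri"}


def image_ref_dict(raw: bytes, mimetype: str, dpi: int,
                   width: float, height: float) -> dict:
    """Build an ImageRef-shaped dict with the image bytes embedded as a data URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "mimetype": mimetype,
        "dpi": dpi,
        "size": {"width": width, "height": height},
        "uri": f"data:{mimetype};base64,{encoded}",
    }


# The 8-byte PNG signature stands in for real image data.
ref = image_ref_dict(b"\x89PNG\r\n\x1a\n", "image/png", 72, 100.0, 50.0)
print(sorted(ref))
print(ref["uri"])
```

A plain file path or an http(s) URL would also fit the `uri` field, since the schema accepts either a URI or a path.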
PictureClassificationClass
Bases: BaseModel
PictureClassificationData.
Show JSON schema:
{
"description": "PictureClassificationData.",
"properties": {
"class_name": {
"title": "Class Name",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
}
},
"required": [
"class_name",
"confidence"
],
"title": "PictureClassificationClass",
"type": "object"
}
Fields:
-
class_name(str) -
confidence(float)
class_name
class_name: str
confidence
confidence: float
PictureClassificationData
Bases: BaseAnnotation
PictureClassificationData.
Show JSON schema:
{
"$defs": {
"PictureClassificationClass": {
"description": "PictureClassificationData.",
"properties": {
"class_name": {
"title": "Class Name",
"type": "string"
},
"confidence": {
"title": "Confidence",
"type": "number"
}
},
"required": [
"class_name",
"confidence"
],
"title": "PictureClassificationClass",
"type": "object"
}
},
"description": "PictureClassificationData.",
"properties": {
"kind": {
"const": "classification",
"default": "classification",
"title": "Kind",
"type": "string"
},
"provenance": {
"title": "Provenance",
"type": "string"
},
"predicted_classes": {
"items": {
"$ref": "#/$defs/PictureClassificationClass"
},
"title": "Predicted Classes",
"type": "array"
}
},
"required": [
"provenance",
"predicted_classes"
],
"title": "PictureClassificationData",
"type": "object"
}
Fields:
-
kind(Literal['classification']) -
provenance(str) -
predicted_classes(list[PictureClassificationClass])
kind
kind: Literal['classification'] = 'classification'
provenance
provenance: str
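The following sketch shows one way to read a PictureClassificationData payload and pick its most confident class. The payload values (`example-classifier`, the class names, the confidences) and the helper `top_class` are made up for illustration. Only the field names come from the schema.

```python
import json

# A hypothetical payload shaped like the PictureClassificationData schema.
payload = json.loads("""
{
  "kind": "classification",
  "provenance": "example-classifier",
  "predicted_classes": [
    {"class_name": "bar_chart", "confidence": 0.81},
    {"class_name": "pie_chart", "confidence": 0.12}
  ]
}
""")


def top_class(annotation: dict):
    """Return (class_name, confidence) of the most confident class, or None if empty."""
    classes = annotation["predicted_classes"]
    if not classes:
        return None
    best = max(classes, key=lambda c: c["confidence"])
    return best["class_name"], best["confidence"]


print(top_class(payload))
```

The schema does not say whether `predicted_classes` is sorted, so the sketch searches the whole list rather than taking the first entry.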
RefItem
Bases: BaseModel
RefItem.
Show JSON schema:
{
"description": "RefItem.",
"properties": {
"$ref": {
"pattern": "^#(?:/([\\w-]+)(?:/(\\d+))?)?$",
"title": "$Ref",
"type": "string"
}
},
"required": [
"$ref"
],
"title": "RefItem",
"type": "object"
}
Config:
populate_by_name:True
Fields:
-
cref(str)
cref
cref: str
model_config
model_config = ConfigDict(populate_by_name=True)
get_ref
get_ref()
get_ref.
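The `$ref` pattern in the RefItem schema can be exercised directly with Python's `re` module. The sketch below copies the pattern and splits a reference into its two captured groups. The helper `parse_ref` and the names "collection" and "index" for those groups are illustrative labels, not docling terminology.

```python
import re

# Pattern copied from the "$ref" property of the RefItem JSON schema.
REF_PATTERN = re.compile(r"^#(?:/([\w-]+)(?:/(\d+))?)?$")


def parse_ref(cref: str):
    """Return (collection, index) for a matching reference, or None otherwise."""
    match = REF_PATTERN.match(cref)
    if match is None:
        return None
    collection, index = match.group(1), match.group(2)
    return collection, (int(index) if index is not None else None)


print(parse_ref("#/pictures/0"))   # reference with a name and a numeric index
print(parse_ref("#/body"))         # reference with a name only
print(parse_ref("#"))              # bare root reference
print(parse_ref("pictures/0"))     # no leading "#", so it does not match
```

The same pattern is used for `self_ref` on PictureItem and for `$ref` on FineRef, so the helper applies to those fields as well.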
BoundingBox
Bases: BaseModel
BoundingBox.
Show JSON schema:
{
"$defs": {
"CoordOrigin": {
"description": "CoordOrigin.",
"enum": [
"TOPLEFT",
"BOTTOMLEFT"
],
"title": "CoordOrigin",
"type": "string"
}
},
"description": "BoundingBox.",
"properties": {
"l": {
"title": "L",
"type": "number"
},
"t": {
"title": "T",
"type": "number"
},
"r": {
"title": "R",
"type": "number"
},
"b": {
"title": "B",
"type": "number"
},
"coord_origin": {
"$ref": "#/$defs/CoordOrigin",
"default": "TOPLEFT"
}
},
"required": [
"l",
"t",
"r",
"b"
],
"title": "BoundingBox",
"type": "object"
}
Fields:
-
l(float) -
t(float) -
r(float) -
b(float) -
coord_origin(CoordOrigin)
b
b: float
height
height
height.
l
l: float
r
r: float
t
t: float
width
width
width.
area
area() -> float
area.
as_tuple
as_tuple() -> tuple[float, float, float, float]
as_tuple.
enclosing_bbox
enclosing_bbox(boxes: list[BoundingBox]) -> BoundingBox
Create a bounding box that covers all of the given boxes.
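A minimal sketch of the idea behind an enclosing box, using plain `(l, t, r, b)` tuples instead of BoundingBox objects. It assumes every box uses the TOPLEFT origin, so smaller `t` values are higher on the page. How the library handles mixed or BOTTOMLEFT origins is not covered here.

```python
def enclosing_bbox(boxes):
    """Return the smallest (l, t, r, b) tuple covering all boxes (top-left origin)."""
    if not boxes:
        raise ValueError("at least one box is required")
    return (
        min(box[0] for box in boxes),  # leftmost edge
        min(box[1] for box in boxes),  # topmost edge (smallest t)
        max(box[2] for box in boxes),  # rightmost edge
        max(box[3] for box in boxes),  # bottommost edge (largest b)
    )


print(enclosing_bbox([(10, 20, 50, 60), (30, 5, 80, 40)]))
```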
from_tuple
from_tuple(coord: tuple[float, ...], origin: CoordOrigin)
from_tuple.
:param coord: tuple[float, ...]
:param origin: CoordOrigin
get_intersection_bbox
get_intersection_bbox(
other: BoundingBox,
) -> Optional[BoundingBox]
Return the intersection bounding box with another bounding box or None when disjoint.
intersection_area_with
intersection_area_with(other: BoundingBox) -> float
Calculate the intersection area with another bounding box.
intersection_over_self
intersection_over_self(
other: BoundingBox, eps: float = 1e-06
) -> float
intersection_over_self.
intersection_over_union
intersection_over_union(
other: BoundingBox, eps: float = 1e-06
) -> float
intersection_over_union.
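Intersection over union (IoU) is the intersection area divided by the union area of two boxes. The sketch below computes it on plain `(l, t, r, b)` tuples with a top-left origin. Adding `eps` to the denominator to avoid division by zero is an assumption about what the `eps` parameter is for. The helper names are illustrative.

```python
def area(box):
    """Area of an (l, t, r, b) box."""
    l, t, r, b = box
    return abs(r - l) * abs(b - t)


def intersection_area(a, b):
    """Overlap area of two (l, t, r, b) boxes with a top-left origin (t <= b)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0  # the boxes are disjoint or only touch
    return width * height


def intersection_over_union(a, b, eps=1e-6):
    """IoU of two boxes; eps keeps the denominator non-zero for empty boxes."""
    inter = intersection_area(a, b)
    union = area(a) + area(b) - inter
    return inter / (union + eps)


print(intersection_over_union((0, 0, 2, 2), (1, 1, 3, 3)))
print(intersection_over_union((0, 0, 1, 1), (5, 5, 6, 6)))
```

In the first call the overlap is a 1 x 1 square and the union is 4 + 4 - 1 = 7, so the result is close to 1/7.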
is_horizontally_connected
is_horizontally_connected(
elem_i: BoundingBox, elem_j: BoundingBox
) -> bool
is_horizontally_connected.
is_strictly_above
is_strictly_above(
other: BoundingBox, eps: float = 0.001
) -> bool
is_strictly_above.
is_strictly_left_of
is_strictly_left_of(
other: BoundingBox, eps: float = 0.001
) -> bool
is_strictly_left_of.
overlaps_horizontally
overlaps_horizontally(other: BoundingBox) -> bool
Check if two bounding boxes overlap horizontally.
overlaps_vertically
overlaps_vertically(other: BoundingBox) -> bool
Check if two bounding boxes overlap vertically.
overlaps_vertically_with_iou
overlaps_vertically_with_iou(
other: BoundingBox, iou: float
) -> bool
overlaps_y_with_iou.
resize_by_scale
resize_by_scale(x_scale: float, y_scale: float)
resize_by_scale.
scaled
scaled(scale: float)
scaled.
to_bottom_left_origin
to_bottom_left_origin(page_height: float) -> BoundingBox
to_bottom_left_origin.
:param page_height:
to_top_left_origin
to_top_left_origin(page_height: float) -> BoundingBox
to_top_left_origin.
:param page_height:
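Converting between the two coordinate origins amounts to flipping the vertical coordinates against the page height. The sketch below works on plain `(l, t, r, b)` tuples. It assumes the flip is `page_height - y` for both `t` and `b`, which is the usual convention but is not spelled out in this reference. The function names are illustrative.

```python
def to_bottom_left(box, page_height):
    """Convert an (l, t, r, b) box from a top-left to a bottom-left origin."""
    l, t, r, b = box
    return (l, page_height - t, r, page_height - b)


def to_top_left(box, page_height):
    """Convert an (l, t, r, b) box from a bottom-left to a top-left origin."""
    l, t, r, b = box
    return (l, page_height - t, r, page_height - b)


top_left_box = (10.0, 20.0, 110.0, 70.0)
bottom_left_box = to_bottom_left(top_left_box, page_height=800.0)
print(bottom_left_box)
print(to_top_left(bottom_left_box, page_height=800.0))
```

The two functions apply the same arithmetic, since the flip is its own inverse. With a top-left origin `t` is smaller than `b`, and after conversion `t` is larger than `b`.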
union_area_with
union_area_with(other: BoundingBox) -> float
Calculates the union area with another bounding box.
x_overlap_with
x_overlap_with(other: BoundingBox) -> float
Calculates the horizontal overlap with another bounding box.
x_union_with
x_union_with(other: BoundingBox) -> float
Calculates the horizontal union dimension with another bounding box.
y_overlap_with
y_overlap_with(other: BoundingBox) -> float
Calculates the vertical overlap with another bounding box, respecting coordinate origin.
y_union_with
y_union_with(other: BoundingBox) -> float
Calculates the vertical union dimension with another bounding box, respecting coordinate origin.
CoordOrigin
Bases: str, Enum
CoordOrigin.
Attributes:
-
BOTTOMLEFT– -
TOPLEFT–
BOTTOMLEFT
BOTTOMLEFT = 'BOTTOMLEFT'
TOPLEFT
TOPLEFT = 'TOPLEFT'
ImageRefMode
Bases: str, Enum
ImageRefMode.
Attributes:
-
EMBEDDED– -
PLACEHOLDER– -
REFERENCED–
EMBEDDED
EMBEDDED = 'embedded'
PLACEHOLDER
PLACEHOLDER = 'placeholder'
REFERENCED
REFERENCED = 'referenced'
Size
Bases: BaseModel
Size.
Show JSON schema:
{
"description": "Size.",
"properties": {
"width": {
"default": 0.0,
"title": "Width",
"type": "number"
},
"height": {
"default": 0.0,
"title": "Height",
"type": "number"
}
},
"title": "Size",
"type": "object"
}
Fields:
-
width(float) -
height(float)
height
height: float = 0.0
width
width: float = 0.0
as_tuple
as_tuple()
as_tuple.