CLI reference

This page documents Docling's command-line tools. It is generated by scripts/render_cli_reference.py from the live Typer apps; do not edit it by hand.

docling

Usage

docling [OPTIONS] source

Arguments

Name Type Required Description
source text yes PDF files to convert. Can be local file or directory paths, or URLs.

Options

Name Type Default Description
--from docx, pptx, html, image, pdf, asciidoc, md, csv, xlsx, xml_uspto, xml_jats, xml_xbrl, mets_gbs, json_docling, audio, vtt, latex (repeatable) Input formats to accept. Defaults to all supported formats.
--to md, json, yaml, html, html_split_page, text, doctags, vtt (repeatable) Specify output formats. Defaults to Markdown.
--show-layout / --no-show-layout flag false If enabled, the page images will show the bounding boxes of the items.
--headers text HTTP request headers to use when fetching URL input sources, given as a JSON string.
--image-export-mode placeholder, embedded, referenced embedded Image export mode for image-capable document outputs (JSON, YAML, HTML, HTML split-page, and Markdown). Text, DocTags, and WebVTT outputs do not export images. In placeholder mode, only the position of the image is marked in the output. In embedded mode, the image is embedded as a base64-encoded string. In referenced mode, the image is exported in PNG format and referenced from the main exported document.
--pipeline legacy, standard, vlm, asr standard Choose the pipeline to process PDF or image files.
--vlm-model text granite_docling Choose the VLM preset to use with PDF or image files. Available presets: smoldocling, granite_docling, deepseek_ocr, granite_vision, pixtral, got_ocr, phi4, qwen, nanonets_ocr2, gemma_12b, gemma_27b, dolphin, glm_ocr, lightonocr, falcon_ocr
--asr-model whisper_tiny, whisper_small, whisper_medium, whisper_base, whisper_large, whisper_turbo, whisper_tiny_mlx, whisper_small_mlx, whisper_medium_mlx, whisper_base_mlx, whisper_large_mlx, whisper_turbo_mlx, whisper_tiny_native, whisper_small_native, whisper_medium_native, whisper_base_native, whisper_large_native, whisper_turbo_native whisper_tiny Choose the ASR model to use with audio/video files.
--ocr / --no-ocr flag true If enabled, the bitmap content will be processed using OCR.
--force-ocr / --no-force-ocr flag false Replace any existing text with OCR generated text over the full content.
--tables / --no-tables flag true If enabled, the table structure model will be used to extract table information.
--ocr-engine text auto The OCR engine to use. When --allow-external-plugins is not set, the available values are: auto, easyocr, kserve_v2_ocr, ocrmac, rapidocr, tesserocr, tesseract. Use the option --show-external-plugins to see the options allowed with external plugins.
--ocr-lang text Provide a comma-separated list of languages used by the OCR engine. Note that each OCR engine has different values for the language names.
--psm integer Page Segmentation Mode for the OCR engine (0-13).
--pdf-backend pypdfium2, docling_parse, threaded_docling_parse, dlparse_v1, dlparse_v2, dlparse_v4 docling_parse The PDF backend to use.
--pdf-password text Password for protected PDF documents.
--table-mode fast, accurate accurate The mode to use in the table structure model.
--enrich-code / --no-enrich-code flag false Enable the code enrichment model in the pipeline.
--enrich-formula / --no-enrich-formula flag false Enable the formula enrichment model in the pipeline.
--enrich-picture-classes / --no-enrich-picture-classes flag false Enable the picture classification enrichment model in the pipeline.
--enrich-picture-description / --no-enrich-picture-description flag false Enable the picture description model in the pipeline.
--enrich-chart-extraction / --no-enrich-chart-extraction flag false Enable chart data extraction from bar, pie, and line charts.
--artifacts-path path If provided, the location of the model artifacts.
--enable-remote-services / --no-enable-remote-services flag false Must be enabled when using models connecting to remote services.
--allow-external-plugins / --no-allow-external-plugins flag false Must be enabled for loading modules from third-party plugins.
--show-external-plugins / --no-show-external-plugins flag false List the third-party plugins which are available when the option --allow-external-plugins is set.
--abort-on-error / --no-abort-on-error flag false If enabled, the processing will be aborted when the first error is encountered.
--output path . Output directory where results are saved.
--verbose / -v integer 0 Set the verbosity level. -v for info logging, -vv for debug logging.
--debug-visualize-cells / --no-debug-visualize-cells flag false Enable debug output which visualizes the PDF cells.
--debug-visualize-ocr / --no-debug-visualize-ocr flag false Enable debug output which visualizes the OCR cells.
--debug-visualize-layout / --no-debug-visualize-layout flag false Enable debug output which visualizes the layout clusters.
--debug-visualize-tables / --no-debug-visualize-tables flag false Enable debug output which visualizes the table cells.
--version flag Show version information.
--document-timeout float The timeout for processing each document, in seconds.
--num-threads integer 4 Number of threads.
--release-native-memory-every-n-pages integer 128 Release native parser memory after every N decoded pages when using the threaded docling-parse backend.
--device auto, cpu, cuda, mps, xpu auto Accelerator device.
--logo flag Docling logo.
--page-batch-size integer 4 Number of pages processed in one batch.
--profiling / --no-profiling flag false If enabled, it summarizes profiling details for all conversion stages.
--save-profiling / --no-save-profiling flag false If enabled, it saves the profiling summaries to json.
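
The options above can be combined in a single call. The following is a minimal sketch, assuming the command is assembled from Python: it builds a `docling [OPTIONS] source` argument list that repeats `--to` once per output format and serializes `--headers` as a JSON string. The helper name `build_docling_argv`, the URL, the header value, and the output directory are all hypothetical placeholders; only the flag names come from the table.

```python
import json
import shlex


def build_docling_argv(source, to=("md",), output=".", headers=None, extra=()):
    """Assemble a `docling [OPTIONS] source` argument list (hypothetical helper)."""
    argv = ["docling"]
    # --to is repeatable: emit one flag per requested output format.
    for fmt in to:
        argv += ["--to", fmt]
    # --headers expects a JSON string, so serialize the mapping.
    if headers:
        argv += ["--headers", json.dumps(headers)]
    argv += ["--output", str(output)]
    # Any further flags from the options table are passed through unchanged.
    argv += list(extra)
    # Usage places the source after the options.
    argv.append(source)
    return argv


argv = build_docling_argv(
    "https://example.com/paper.pdf",            # placeholder URL
    to=("md", "json"),
    output="out",                               # placeholder directory
    headers={"Authorization": "Bearer TOKEN"},  # placeholder header
    extra=("--no-ocr", "--table-mode", "fast"),
)

# Print a shell-quoted form of the command for inspection.
print(shlex.join(argv))
```

With Docling installed, the list could be handed to `subprocess.run(argv, check=True)`. Building a list rather than a single string keeps the JSON header value intact without manual shell quoting.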

docling-tools

Usage

docling-tools [OPTIONS] COMMAND [ARGS]...

Subcommands

Command Description
docling-tools models

docling-tools models

Usage

docling-tools models [OPTIONS] COMMAND [ARGS]...

Subcommands

Command Description
docling-tools models download
docling-tools models download-hf-repo

docling-tools models download

Usage

docling-tools models download [OPTIONS] [MODELS]:[layout|tableformer|tableformerv2|code_formula|picture_classifier|smolvlm|granitedocling|granitedocling_mlx|smoldocling|smoldocling_mlx|granite_vision|granite_chart_extraction|granite_chart_extraction_v4|rapidocr|easyocr]...

Arguments

Name Type Required Description
MODELS layout, tableformer, tableformerv2, code_formula, picture_classifier, smolvlm, granitedocling, granitedocling_mlx, smoldocling, smoldocling_mlx, granite_vision, granite_chart_extraction, granite_chart_extraction_v4, rapidocr, easyocr no Models to download (default behavior: a predefined set of models will be downloaded).

Options

Name Type Default Description
-o / --output-dir path $HOME/.cache/docling/models The directory to download the models to.
--force / --no-force flag false If true, the download will be forced.
--all flag false If true, all available models will be downloaded (mutually exclusive with passing specific models).
-q / --quiet flag false No extra output is generated, the CLI prints only the directory with the cached models.
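
The argument and option rules above can be checked before the command is run. This is a rough sketch, assuming the call is assembled from Python: it validates model names against the list in the Arguments table and rejects `--all` combined with specific models, as the `--all` description requires. The helper name `build_download_argv` and the `KNOWN_MODELS` constant are hypothetical and simply mirror this page.

```python
# Model names copied from the MODELS argument on this page.
KNOWN_MODELS = {
    "layout", "tableformer", "tableformerv2", "code_formula",
    "picture_classifier", "smolvlm", "granitedocling", "granitedocling_mlx",
    "smoldocling", "smoldocling_mlx", "granite_vision",
    "granite_chart_extraction", "granite_chart_extraction_v4",
    "rapidocr", "easyocr",
}


def build_download_argv(models=(), output_dir=None, all_models=False,
                        force=False, quiet=False):
    """Assemble a `docling-tools models download` argument list (hypothetical helper)."""
    # --all is documented as mutually exclusive with passing specific models.
    if all_models and models:
        raise ValueError("--all cannot be combined with specific models")
    unknown = sorted(set(models) - KNOWN_MODELS)
    if unknown:
        raise ValueError(f"unknown model name(s): {', '.join(unknown)}")

    argv = ["docling-tools", "models", "download"]
    if output_dir is not None:
        argv += ["-o", str(output_dir)]
    if force:
        argv.append("--force")
    if all_models:
        argv.append("--all")
    if quiet:
        argv.append("-q")
    # With no models and no --all, the CLI downloads its predefined set.
    argv += list(models)
    return argv


print(build_download_argv(["layout", "easyocr"], output_dir="models"))
```

The directory given with `-o` could then presumably be passed to `docling --artifacts-path`; that pairing is an assumption here, not something this page states.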

docling-tools models download-hf-repo

Usage

docling-tools models download-hf-repo [OPTIONS] MODELS...

Arguments

Name Type Required Description
MODELS text yes Specific models to download from HuggingFace, identified by their repo id. For example: docling-project/docling-models.

Options

Name Type Default Description
-o / --output-dir path $HOME/.cache/docling/models The directory to download the models to.
--force / --no-force flag false If true, the download will be forced.
-q / --quiet flag false No extra output is generated, the CLI prints only the directory with the cached models.
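
Because `-q` is described as printing only the directory with the cached models, a script can capture that output. The sketch below assumes the command is driven from Python and that `docling-tools` is on the PATH when `download_hf_repos` is called; both helper names are hypothetical, and the repo id is the example from the Arguments table.

```python
import subprocess


def build_hf_repo_argv(repo_ids, output_dir=None, force=False, quiet=False):
    """Assemble a `docling-tools models download-hf-repo` argument list (hypothetical helper)."""
    # MODELS is a required argument, so refuse an empty list early.
    if not repo_ids:
        raise ValueError("at least one repo id is required")
    argv = ["docling-tools", "models", "download-hf-repo"]
    if output_dir is not None:
        argv += ["-o", str(output_dir)]
    if force:
        argv.append("--force")
    if quiet:
        argv.append("-q")
    argv += list(repo_ids)
    return argv


def download_hf_repos(repo_ids, output_dir=None):
    """Run the download quietly and return whatever the CLI printed on stdout."""
    argv = build_hf_repo_argv(repo_ids, output_dir=output_dir, quiet=True)
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    # With -q, the output is expected to be just the models directory.
    return result.stdout.strip()


# Building the argument list has no side effects and needs no installation.
print(build_hf_repo_argv(["docling-project/docling-models"], quiet=True))
```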