Read PDF metadata in Rust

Read title, author, subject, keywords, producer, creator and timestamps from any PDF's document-information dictionary.

```rust
use pdfluent::prelude::*;

fn main() -> Result<()> {
    let doc = PdfDocument::open("report.pdf")?;
    let meta = doc.metadata();

    println!("title:    {:?}", meta.title);
    println!("author:   {:?}", meta.author);
    println!("subject:  {:?}", meta.subject);
    println!("keywords: {:?}", meta.keywords);
    println!("producer: {:?}", meta.producer);
    println!("creator:  {:?}", meta.creator);
    println!("created:  {:?}", meta.creation_date);
    println!("modified: {:?}", meta.modification_date);

    Ok(())
}
```

Install: cargo add pdfluent@1.0.0-beta.5

Step by step

1

Open the PDF

Open the document. Metadata is cached on the PdfDocument and read lazily from the Info dictionary on first access.

```rust
use pdfluent::prelude::*;

let doc = PdfDocument::open("report.pdf")?;
```
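To make "read lazily, then cached" concrete, here is a minimal sketch of that general pattern using `std::cell::OnceCell`. The `Document` and `Info` types are hypothetical stand-ins invented for this illustration; nothing here is taken from PDFluent's internals, which may work quite differently.

```rust
use std::cell::{Cell, OnceCell};

// Hypothetical stand-in types for illustration; not PDFluent's own.
#[derive(Debug)]
struct Info {
    title: Option<String>,
}

struct Document {
    raw_title: Option<String>, // stands in for the unparsed Info dictionary
    info: OnceCell<Info>,      // empty until the first call to info()
    parses: Cell<u32>,         // counts how often the "parse" actually runs
}

impl Document {
    fn new(raw_title: Option<&str>) -> Self {
        Document {
            raw_title: raw_title.map(str::to_owned),
            info: OnceCell::new(),
            parses: Cell::new(0),
        }
    }

    /// Parses on the first call, then hands back the cached value.
    fn info(&self) -> &Info {
        self.info.get_or_init(|| {
            self.parses.set(self.parses.get() + 1);
            Info { title: self.raw_title.clone() }
        })
    }
}

fn main() {
    let doc = Document::new(Some("Q1 report"));
    println!("parses before first read: {}", doc.parses.get());
    println!("title: {:?}", doc.info().title);
    println!("title again: {:?}", doc.info().title);
    println!("parses after two reads: {}", doc.parses.get());
}
```

The practical consequence of this pattern is that opening a document costs nothing for metadata you never read, and repeated reads do not repeat the parse.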
2

Call doc.metadata()

metadata() returns a Metadata struct: a plain snapshot with public fields. Reads return no Result to unwrap; a missing entry surfaces as None, or as an empty Vec in the case of keywords.

```rust
let meta = doc.metadata();
println!("title = {:?}", meta.title);
println!("author = {:?}", meta.author);
```
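Because missing entries arrive as `None` or an empty `Vec` rather than as errors, display code usually just substitutes placeholders. The sketch below shows one way to do that. It uses a hypothetical stand-in `Metadata` struct that mirrors only three of the fields this guide describes, so that it compiles without the SDK; `summarize` is likewise an illustrative helper, not part of PDFluent.

```rust
// Hypothetical stand-in mirroring fields described in this guide;
// the real type comes from pdfluent::prelude.
#[derive(Debug, Default)]
struct Metadata {
    title: Option<String>,
    author: Option<String>,
    keywords: Vec<String>,
}

/// Builds a one-line summary, substituting placeholders for missing entries.
fn summarize(meta: &Metadata) -> String {
    let title = meta.title.as_deref().unwrap_or("(no title)");
    let author = meta.author.as_deref().unwrap_or("(no author)");
    let keywords = if meta.keywords.is_empty() {
        "(no keywords)".to_string()
    } else {
        meta.keywords.join(", ")
    };
    format!("{title} by {author} [{keywords}]")
}

fn main() {
    // A document with an empty Info dictionary.
    println!("{}", summarize(&Metadata::default()));

    // A document with every illustrated field set.
    let full = Metadata {
        title: Some("Q1 report".into()),
        author: Some("Ada".into()),
        keywords: vec!["finance".into(), "2026".into()],
    };
    println!("{}", summarize(&full));
}
```

`as_deref()` turns `Option<String>` into `Option<&str>`, which lets the placeholder be a plain string literal without allocating.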
3

Inspect every standard field

Metadata exposes title, author, subject, keywords (Vec<String>), producer, creator, creation_date and modification_date. Dates are PDF D-format strings: for example, "D:20260421103000+02'00'" means 21 April 2026, 10:30:00 at UTC+02:00. Parse them with your preferred date library if you need a DateTime.

```rust
let meta = doc.metadata();
if let Some(ref t) = meta.title { println!("T: {}", t); }
if let Some(ref a) = meta.author { println!("A: {}", a); }
for k in &meta.keywords { println!("K: {}", k); }
```
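If you would rather not pull in a date library just to read these strings, the D-format is simple enough to parse by hand. The following is a minimal std-only sketch, assuming the usual layout `D:YYYYMMDDHHmmSSOHH'mm'` in which everything after the year is optional. `PdfDate` and `parse_pdf_date` are hypothetical names introduced here, not PDFluent APIs, and the parser only range-checks each field (it does not reject 31 February, for instance).

```rust
#[derive(Debug, PartialEq, Eq)]
struct PdfDate {
    year: u16,
    month: u8,
    day: u8,
    hour: u8,
    minute: u8,
    second: u8,
    /// Offset from UTC in minutes; None when the string carries no zone.
    utc_offset_minutes: Option<i16>,
}

/// Parses a PDF date string such as "D:20260421103000+02'00'".
fn parse_pdf_date(raw: &str) -> Option<PdfDate> {
    let s = raw.trim();
    let s = s.strip_prefix("D:").unwrap_or(s);

    // Split the leading run of digits from the optional zone suffix.
    let digits_len = s.bytes().take_while(|b| b.is_ascii_digit()).count();
    let (digits, zone) = s.split_at(digits_len);
    if !(4..=14).contains(&digits.len()) || digits.len() % 2 != 0 {
        return None;
    }

    let year: u16 = digits[0..4].parse().ok()?;
    // Reads the two-digit field at `start`, or the default if it is absent.
    let field = |start: usize, default: u8| -> Option<u8> {
        match digits.get(start..start + 2) {
            Some(d) => d.parse().ok(),
            None => Some(default),
        }
    };
    let month = field(4, 1)?;
    let day = field(6, 1)?;
    let hour = field(8, 0)?;
    let minute = field(10, 0)?;
    let second = field(12, 0)?;
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    if hour > 23 || minute > 59 || second > 59 {
        return None;
    }

    let utc_offset_minutes = match zone.chars().next() {
        None => None,
        Some('Z') => Some(0),
        Some(sign @ ('+' | '-')) => {
            // Apostrophes are only separators, so keep the digits of HH'mm'.
            let nums: String = zone[1..].chars().filter(|c| c.is_ascii_digit()).collect();
            let hh: i16 = nums.get(0..2)?.parse().ok()?;
            let mm: i16 = match nums.get(2..4) {
                Some(m) => m.parse().ok()?,
                None => 0,
            };
            let total = hh * 60 + mm;
            Some(if sign == '-' { -total } else { total })
        }
        Some(_) => return None,
    };

    Some(PdfDate { year, month, day, hour, minute, second, utc_offset_minutes })
}

fn main() {
    println!("{:?}", parse_pdf_date("D:20260421103000+02'00'"));
    println!("{:?}", parse_pdf_date("D:2026"));
    println!("{:?}", parse_pdf_date("not a date"));
}
```

Returning `Option` keeps the caller's code uniform with the rest of the metadata fields: a malformed date is treated the same way as a missing one. For calendar arithmetic or time-zone conversion, feeding the parsed fields into a dedicated date library remains the safer route.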
4

Bulk-read for a directory of PDFs

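Loop over the directory entries and open each PDF in turn. Each PdfDocument is dropped at the end of its iteration, so memory stays bounded across large batches. Files that fail to open are reported on stderr and skipped. Note that the extension check below is case-sensitive, so a file named report.PDF is not processed.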

```rust
use pdfluent::prelude::*;
use std::fs;

for entry in fs::read_dir("./inbox")? {
    let path = entry?.path();
    if path.extension().map(|e| e == "pdf").unwrap_or(false) {
        match PdfDocument::open(&path) {
            Ok(doc) => {
                let m = doc.metadata();
                println!(
                    "{}: {} — {}",
                    path.display(),
                    m.title.as_deref().unwrap_or("(no title)"),
                    m.author.as_deref().unwrap_or("(no author)"),
                );
            }
            Err(e) => eprintln!("{}: {}", path.display(), e),
        }
    }
}
```
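The extension test in the loop above compares against the exact string "pdf", so files named `SCAN.PDF` or `report.Pdf` are passed over. If your inbox may contain those, a case-insensitive check is a small change. The helper below is a sketch using only the standard library; `has_pdf_extension` is a hypothetical name, not something the SDK provides.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Returns true when the path's extension is "pdf" in any letter case.
fn has_pdf_extension(path: &Path) -> bool {
    path.extension()
        .and_then(OsStr::to_str) // non-UTF-8 extensions cannot be "pdf"
        .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"))
}

fn main() {
    for name in ["report.pdf", "SCAN.PDF", "notes.txt", "archive.pdf.bak", "pdf"] {
        println!("{name}: {}", has_pdf_extension(Path::new(name)));
    }
}
```

In the batch loop, the condition would then become `if has_pdf_extension(&path) { ... }`. The check looks only at the file name, so a mislabeled file will still reach `PdfDocument::open` and be reported through the existing `Err` arm.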

Notes and tips

  • Metadata.title and Metadata.author are Option<String>; a document may have no title or no author set.
  • Metadata.keywords is Vec<String>, parsed from the PDF's /Keywords entry — an empty vector means no keywords.
  • creation_date and modification_date are PDF D-format strings; conversion to chrono::DateTime is an application-side concern.
  • The 1.0 SDK exposes the Info-dictionary surface on Metadata. Full XMP metadata read (structured RDF) is tracked for a later release.

Why PDFluent for this

Pure Rust

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.

Memory safe

Rust's ownership model prevents buffer overflows and use-after-free. No segfaults in PDF parsing.

Runs anywhere

Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.

Frequently asked questions