Read the text content of each page as a plain string or as structured spans with font and position data.

use pdfluent::prelude::*;

fn main() -> Result<()> {
    let doc = PdfDocument::open("document.pdf")?;
    for page in doc.pages() {
        let text = page.text()?;
        println!("--- Page {} ---", page.number());
        println!("{}", text);
    }
    Ok(())
}

Open the PDF. Text extraction is per-page and streams cleanly.

use pdfluent::prelude::*;

let doc = PdfDocument::open("document.pdf")?;

doc.pages() returns an iterator of Page<'_>. Each Page has a text() method that returns Result<String>.

for page in doc.pages() {
    let text = page.text()?;
    // String::len() returns bytes, so count chars explicitly.
    println!("page {}: {} chars", page.number(), text.chars().count());
}

For downstream processing, join the per-page strings with page separators.

let combined: String = doc
    .pages()
    // A page whose extraction fails contributes an empty string.
    .map(|p| p.text().unwrap_or_default())
    .collect::<Vec<_>>()
    .join("\n\n");

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.
Rust's ownership model and bounds checks prevent buffer overflows and use-after-free in safe code, the usual sources of segfaults in PDF parsing.
Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.
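
The page-joining step above uses unwrap_or_default(), which silently turns a failed page into an empty string. If a failed page should abort the whole extraction instead, one option is to collect the per-page results first and propagate the first error. The sketch below uses only the standard library: join_pages is a hypothetical helper, not part of pdfluent, and the vectors in main are stand-in data in place of doc.pages().map(|p| p.text()).

```rust
/// Hypothetical helper: joins per-page extraction results into one string,
/// separated by `sep`. Returns the first error encountered instead of
/// substituting an empty string for the failed page.
fn join_pages<E>(
    pages: impl IntoIterator<Item = Result<String, E>>,
    sep: &str,
) -> Result<String, E> {
    // Collecting into Result<Vec<_>, E> stops at the first Err.
    let texts = pages.into_iter().collect::<Result<Vec<String>, E>>()?;
    Ok(texts.join(sep))
}

fn main() {
    // Stand-in data; with the library this would come from the page iterator.
    let ok: Vec<Result<String, String>> = vec![
        Ok("first page".to_string()),
        Ok("second page".to_string()),
    ];
    let combined = join_pages(ok, "\n\n").expect("all pages readable");
    println!("{combined}");

    let bad: Vec<Result<String, String>> = vec![
        Ok("first page".to_string()),
        Err("page 2 unreadable".to_string()),
    ];
    // The error from page 2 is returned rather than swallowed.
    assert!(join_pages(bad, "\n\n").is_err());
}
```

Whether to skip or fail on a bad page depends on the pipeline: skipping keeps batch jobs running, while failing makes missing text visible to the caller.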