Get each word or character with its x, y, width, and height on the page. Useful for building search, redaction, or document analysis tools.
use pdfluent::prelude::*;

fn main() -> Result<()> {
    let doc = PdfDocument::open("document.pdf")?;
    for block in doc.text_with_layout()? {
        println!(
            "[page {}] [{:.1},{:.1}] {:?}",
            block.page, block.x, block.y, block.text,
        );
    }
    Ok(())
}

Load the PDF.
use pdfluent::prelude::*;
let doc = PdfDocument::open("document.pdf")?;

text_with_layout() returns a Vec<TextBlock> covering the whole document. Each TextBlock carries the text, its 1-based page number, and bounding-box coordinates in PDF points (bottom-left origin).
let blocks = doc.text_with_layout()?;
println!("{} text blocks", blocks.len());

Read block.page, block.x, block.y, block.width, block.height, and block.text.
for block in doc.text_with_layout()? {
    if block.page == 1 {
        println!("[{:.1},{:.1}] {:?}", block.x, block.y, block.text);
    }
}

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.
Rust's ownership model and bounds checks prevent use-after-free and buffer overflows in safe code, so malformed input cannot trigger those memory errors during PDF parsing.
Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.
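
As an illustration of the search and redaction use case mentioned at the top, here is a minimal sketch of what you might do with the layout fields once you have them. It does not call pdfluent: the TextBlock struct below is a hypothetical stand-in built only from the field names listed in this guide, and the field types are assumptions. The Rect type and the find_rects helper are also made up for this example. The sketch assumes that (x, y) is the lower-left corner of the bounding box and that you know the page height in points (792.0, US Letter, is used purely as an example value).

```rust
// Hypothetical stand-in for pdfluent's TextBlock. Only the field names come
// from this guide; the concrete types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct TextBlock {
    text: String,
    page: usize,
    x: f64,
    y: f64,
    width: f64,
    height: f64,
}

// Rectangle in top-left-origin coordinates, the convention most raster and
// UI toolkits use. Hypothetical type for this example.
#[derive(Debug, Clone, PartialEq)]
struct Rect {
    page: usize,
    left: f64,
    top: f64,
    width: f64,
    height: f64,
}

// Collect a rectangle for every block whose text contains `needle`
// (case-insensitive). Assumes block.y is the bottom edge of the box and
// that every page is `page_height` points tall.
fn find_rects(blocks: &[TextBlock], needle: &str, page_height: f64) -> Vec<Rect> {
    let needle = needle.to_lowercase();
    blocks
        .iter()
        .filter(|b| b.text.to_lowercase().contains(&needle))
        .map(|b| Rect {
            page: b.page,
            left: b.x,
            // Flip the y axis: PDF measures up from the bottom edge,
            // the target convention measures down from the top edge.
            top: page_height - (b.y + b.height),
            width: b.width,
            height: b.height,
        })
        .collect()
}

fn main() {
    // Sample data standing in for the result of doc.text_with_layout()?.
    let blocks = vec![
        TextBlock { text: "Invoice 42".to_string(), page: 1, x: 72.0, y: 700.0, width: 100.0, height: 12.0 },
        TextBlock { text: "Total due".to_string(), page: 1, x: 72.0, y: 680.0, width: 80.0, height: 12.0 },
    ];

    for rect in find_rects(&blocks, "invoice", 792.0) {
        println!("{:?}", rect);
    }
}
```

In a real program you would pass the blocks returned by text_with_layout() instead of the sample vector, and take the page height from the document rather than hard-coding it. Matching per block is the simplest option; a phrase that spans two blocks would need the blocks joined in reading order first.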