Text & Search

pdfnova uses PDFium's text extraction for character-level precision. PDFium is the same engine that backs Chrome's "Find in PDF" feature.

Extract Plain Text

const page = doc.getPage(0);
const text = page.getText();
console.log(text); // "Annual Report 2025\nQuarterly revenue grew by..."

Text Spans

Get text with position data for each word/run:

const spans = page.getTextSpans();
for (const span of spans) {
  console.log(span.text, span.x, span.y, span.fontSize);
}

Each TextSpan contains:

Property	Type	Description
`text`	`string`	The text content
`x`	`number`	Left position (PDF points)
`y`	`number`	Bottom position (PDF points)
`width`	`number`	Span width
`height`	`number`	Span height
`fontSize`	`number`	Font size in points
`charIndex`	`number`	Starting character index
`charCount`	`number`	Number of characters
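Because every span carries a position, reading order can be rebuilt from the coordinates. The following is a minimal sketch and not part of pdfnova: `SpanLike` is a local stand-in that copies three of the fields listed above, and `groupIntoLines` and its `tolerance` parameter are hypothetical names. It assumes `y` grows upward from the bottom of the page, as the "Bottom position" description suggests.

```typescript
// Local stand-in for the TextSpan fields used here (assumption: objects
// returned by page.getTextSpans() are structurally compatible).
interface SpanLike {
  text: string;
  x: number;
  y: number;
}

// Hypothetical helper: cluster spans whose y values lie within `tolerance`
// points of the first span in a line, then read lines top-to-bottom and
// spans left-to-right.
function groupIntoLines(spans: SpanLike[], tolerance = 2): string[] {
  // Larger y is higher on the page, so sort descending.
  const sorted = spans.slice().sort((a, b) => b.y - a.y);
  const lines: { y: number; spans: SpanLike[] }[] = [];
  for (const span of sorted) {
    const current = lines[lines.length - 1];
    if (current !== undefined && Math.abs(current.y - span.y) <= tolerance) {
      current.spans.push(span);
    } else {
      lines.push({ y: span.y, spans: [span] });
    }
  }
  return lines.map((line) =>
    line.spans
      .slice()
      .sort((a, b) => a.x - b.x)
      .map((s) => s.text)
      .join(" "),
  );
}

// Made-up sample data standing in for page.getTextSpans().
const sampleSpans: SpanLike[] = [
  { text: "grew", x: 150, y: 680 },
  { text: "Report", x: 130, y: 720.5 },
  { text: "Annual", x: 72, y: 720 },
  { text: "Revenue", x: 72, y: 680 },
];
const textLines = groupIntoLines(sampleSpans);
```

A fixed tolerance is a simplification; documents with mixed font sizes may need a tolerance derived from `fontSize` or `height` instead.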

Character Boxes

For pixel-perfect text selection or highlighting, get individual character bounding boxes:

const boxes = page.getCharBoxes();
for (const box of boxes) {
  console.log(`"${box.char}" at (${box.left}, ${box.bottom}) - (${box.right}, ${box.top})`);
}
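For a selection that covers several characters, the per-character boxes can be merged into one rectangle. This is a sketch under assumptions: `CharBoxLike` mirrors only the fields used in the loop above, and `unionBoxes` is a hypothetical helper, not a pdfnova function. It is meant for a range that stays on a single line; a multi-line selection would need one rectangle per line.

```typescript
// Local stand-in for the fields logged in the loop above.
interface CharBoxLike {
  char: string;
  left: number;
  bottom: number;
  right: number;
  top: number;
}

interface Rect {
  left: number;
  bottom: number;
  right: number;
  top: number;
}

// Hypothetical helper: smallest rectangle enclosing `count` boxes from `start`.
// Returns null when the range selects nothing.
function unionBoxes(boxes: CharBoxLike[], start: number, count: number): Rect | null {
  const slice = boxes.slice(start, start + count);
  if (slice.length === 0) return null;
  const rect: Rect = { left: Infinity, bottom: Infinity, right: -Infinity, top: -Infinity };
  for (const box of slice) {
    rect.left = Math.min(rect.left, box.left);
    rect.bottom = Math.min(rect.bottom, box.bottom);
    rect.right = Math.max(rect.right, box.right);
    rect.top = Math.max(rect.top, box.top);
  }
  return rect;
}

// Made-up sample data standing in for page.getCharBoxes().
const sampleBoxes: CharBoxLike[] = [
  { char: "P", left: 72, bottom: 700, right: 80, top: 712 },
  { char: "D", left: 80, bottom: 700, right: 89, top: 712 },
  { char: "F", left: 89, bottom: 698, right: 96, top: 712 },
];
const selection = unionBoxes(sampleBoxes, 1, 2); // covers "D" and "F"
```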

Text Layer

Build a transparent, selectable text overlay on top of a rendered canvas:

const container = document.getElementById("page-container")!;

// Render the page
const canvas = document.createElement("canvas");
await page.render(canvas, { scale: 2 });
container.appendChild(canvas);

// Build text layer on top
const textLayer = page.createTextLayer(container);
// textLayer is a div with positioned spans matching the rendered text

The text layer uses CSS `position: absolute` spans matched to the rendered scale, enabling native text selection, copy/paste, and accessibility.
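To illustrate the coordinate mapping such an overlay needs, here is a sketch of how one span's PDF-space box could be turned into CSS values. It is not a description of how `createTextLayer` works internally: `spanToCss`, `SpanBox`, and `CssBox` are hypothetical names, and the page height is assumed to be available in PDF points.

```typescript
// Subset of the TextSpan fields needed for placement.
interface SpanBox {
  x: number;
  y: number;
  width: number;
  height: number;
  fontSize: number;
}

interface CssBox {
  left: string;
  top: string;
  width: string;
  height: string;
  fontSize: string;
}

// Hypothetical helper: map a PDF-space box to CSS pixel values.
// PDF y is measured up from the page bottom, while CSS `top` is measured
// down from the container top, so the vertical axis has to be flipped.
function spanToCss(span: SpanBox, pageHeight: number, scale: number): CssBox {
  const top = (pageHeight - (span.y + span.height)) * scale;
  return {
    left: `${span.x * scale}px`,
    top: `${top}px`,
    width: `${span.width * scale}px`,
    height: `${span.height * scale}px`,
    fontSize: `${span.fontSize * scale}px`,
  };
}

// A 12 pt span on a 792 pt tall page, rendered at scale 2.
const css = spanToCss({ x: 72, y: 700, width: 100, height: 12, fontSize: 12 }, 792, 2);
```

The returned values could be assigned to an absolutely positioned element's `style`; the scale passed here has to match the one used for rendering, otherwise the overlay drifts away from the glyphs.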

Full-Text Search

Search a Single Page

const results = page.search("revenue", { caseSensitive: true });
for (const match of results) {
  console.log(`Found "${match.text}" at char index ${match.charIndex}`);
  console.log("Highlight rects:", match.rects);
}

Search the Entire Document

const allResults = doc.search("quarterly revenue", { wholeWord: true });
for (const match of allResults) {
  console.log(`Page ${match.pageIndex + 1}: "${match.text}"`);
}

Search Options

Option	Type	Default	Description
`caseSensitive`	`boolean`	`false`	Match case exactly
`wholeWord`	`boolean`	`false`	Match whole words only
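To make the effect of the two flags concrete, the sketch below applies the same ideas to a plain string. It only approximates the semantics with a regular expression and is not how pdfnova or PDFium perform the search; `findInText` and `SearchOptionsLike` are hypothetical names, and word boundaries here follow JavaScript's `\b` rules, which may differ from the library's.

```typescript
// Mirrors the two documented options; both default to false.
interface SearchOptionsLike {
  caseSensitive?: boolean;
  wholeWord?: boolean;
}

// Hypothetical helper: return the start index of every match in `text`.
function findInText(text: string, query: string, options: SearchOptionsLike = {}): number[] {
  const { caseSensitive = false, wholeWord = false } = options;
  if (query === "") return [];
  // Escape regex metacharacters so the query is matched literally.
  const escaped = query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = wholeWord ? `\\b${escaped}\\b` : escaped;
  const regex = new RegExp(pattern, caseSensitive ? "g" : "gi");
  const indices: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(text)) !== null) {
    indices.push(match.index);
  }
  return indices;
}

const sampleText = "Revenue grew. Quarterly revenue: prerevenue.";
const anyCase = findInText(sampleText, "revenue");
const exactCase = findInText(sampleText, "revenue", { caseSensitive: true });
const wholeOnly = findInText(sampleText, "revenue", { wholeWord: true });
```

In this sample, `caseSensitive` drops the capitalized "Revenue" and `wholeWord` drops the occurrence embedded in "prerevenue".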

Search Result

Each SearchResult contains:

Property	Type	Description
`pageIndex`	`number`	0-based page number
`matchIndex`	`number`	Global match counter
`charIndex`	`number`	Character index in the page text
`charCount`	`number`	Number of characters matched
`rects`	`TextRect[]`	Bounding rectangles for highlighting
`text`	`string`	The matched text
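Document-wide results often need to be bucketed per page, for example to show match counts in a sidebar. A possible sketch follows; `groupByPage` is a hypothetical helper that relies only on the `pageIndex` field described above, and the sample objects are made up.

```typescript
// Hypothetical helper: bucket results by their 0-based pageIndex,
// preserving the original order within each page.
function groupByPage<T extends { pageIndex: number }>(results: T[]): Map<number, T[]> {
  const byPage = new Map<number, T[]>();
  for (const result of results) {
    const bucket = byPage.get(result.pageIndex);
    if (bucket !== undefined) {
      bucket.push(result);
    } else {
      byPage.set(result.pageIndex, [result]);
    }
  }
  return byPage;
}

// Made-up sample data standing in for doc.search(...).
const sampleResults = [
  { pageIndex: 0, matchIndex: 0, text: "revenue" },
  { pageIndex: 2, matchIndex: 1, text: "Revenue" },
  { pageIndex: 0, matchIndex: 2, text: "revenue" },
];
const resultsByPage = groupByPage(sampleResults);
```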

Highlighting Search Results

Use the rects from search results to draw highlights:

const results = doc.search("revenue");
const ctx = canvas.getContext("2d")!;
const scale = 2;

ctx.fillStyle = "rgba(255, 235, 59, 0.4)";
for (const match of results.filter((r) => r.pageIndex === 0)) {
  for (const rect of match.rects) {
    ctx.fillRect(
      rect.left * scale,
      (page.height - rect.top) * scale,
      (rect.right - rect.left) * scale,
      (rect.top - rect.bottom) * scale,
    );
  }
}

Bookmarks / Table of Contents

const outline = doc.outline;
for (const item of outline) {
  console.log(`${item.title} → page ${item.pageIndex + 1}`);
  for (const child of item.children) {
    console.log(`  ${child.title} → page ${child.pageIndex + 1}`);
  }
}
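The loop above prints two levels. If an outline nests deeper, a recursive walk handles any depth. This is a sketch: `OutlineItemLike` copies the three fields used above, and `flattenOutline` and `OutlineRow` are hypothetical names rather than pdfnova API.

```typescript
// Local stand-in for the outline item fields used above.
interface OutlineItemLike {
  title: string;
  pageIndex: number;
  children: OutlineItemLike[];
}

interface OutlineRow {
  title: string;
  page: number; // 1-based, for display
  depth: number; // 0 for top-level entries
}

// Hypothetical helper: depth-first walk producing one row per entry.
function flattenOutline(items: OutlineItemLike[], depth = 0): OutlineRow[] {
  const rows: OutlineRow[] = [];
  for (const item of items) {
    rows.push({ title: item.title, page: item.pageIndex + 1, depth });
    for (const row of flattenOutline(item.children, depth + 1)) {
      rows.push(row);
    }
  }
  return rows;
}

// Made-up sample data standing in for doc.outline.
const sampleOutline: OutlineItemLike[] = [
  {
    title: "Summary",
    pageIndex: 0,
    children: [
      {
        title: "Highlights",
        pageIndex: 1,
        children: [{ title: "Q4", pageIndex: 2, children: [] }],
      },
    ],
  },
  { title: "Appendix", pageIndex: 9, children: [] },
];
const rows = flattenOutline(sampleOutline);
const toc = rows.map((r) => `${"  ".repeat(r.depth)}${r.title} → page ${r.page}`);
```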

Links

Extract hyperlinks from a page:

const links = page.getLinks();
for (const link of links) {
  console.log(`${link.url} on page ${link.pageIndex + 1}`);
}
