Pre-launch — all tools live, free with no limits.

PDF to Excel

Financial reports. Research datasets. Government archives.
Some tables shouldn't leave your device. Extract them in your browser — nothing uploads.

Drop a PDF to extract tables
Max 25 MB per file

About extracting tables from PDFs

What this tool extracts

The tool extracts well-aligned tables, borderless tables (the algorithm uses text geometry, not visual borders), and multi-page table continuations (when column anchors and header signatures match across pages). Each detected table carries a confidence tier (high, medium, or low) so you can judge its reliability before downstream use.
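As a rough illustration of what "text geometry" and "column anchors" can mean in practice, here is a minimal TypeScript sketch. It is not the tool's actual source: the names `TextItem`, `Table`, `buildTable`, and `isContinuation`, the gap-based clustering, and the 3-point tolerance are all assumptions made for the example.

```typescript
// All names below are hypothetical; this is not the tool's actual source.
interface TextItem { text: string; x: number; y: number } // position in PDF points
interface Table { columnAnchors: number[]; rows: string[][] }

// Collapse nearby coordinates into cluster centers: a gap larger than
// `tolerance` between consecutive sorted values starts a new cluster.
function clusterPositions(values: number[], tolerance: number): number[] {
  const sorted = values.slice().sort((a, b) => a - b);
  const centers: number[] = [];
  let group: number[] = [];
  const flush = (): void => {
    if (group.length > 0) {
      centers.push(group.reduce((sum, v) => sum + v, 0) / group.length);
      group = [];
    }
  };
  for (const v of sorted) {
    if (group.length > 0 && v - group[group.length - 1] > tolerance) flush();
    group.push(v);
  }
  flush();
  return centers;
}

// Index of the cluster center closest to `value`.
function nearestIndex(centers: number[], value: number): number {
  let best = 0;
  for (let i = 1; i < centers.length; i++) {
    if (Math.abs(centers[i] - value) < Math.abs(centers[best] - value)) best = i;
  }
  return best;
}

// Infer rows from y clusters and columns from x clusters, then place each item.
function buildTable(items: TextItem[], tolerance = 3): Table {
  // PDF y grows upward, so the largest y is the top row.
  const rowCenters = clusterPositions(items.map((i) => i.y), tolerance).reverse();
  const columnAnchors = clusterPositions(items.map((i) => i.x), tolerance);
  const rows = rowCenters.map(() => columnAnchors.map(() => ""));
  for (const item of items) {
    const r = nearestIndex(rowCenters, item.y);
    const c = nearestIndex(columnAnchors, item.x);
    rows[r][c] = rows[r][c] === "" ? item.text : `${rows[r][c]} ${item.text}`;
  }
  return { columnAnchors, rows };
}

// A table on the next page continues the previous one when the column anchors
// line up and the normalized header rows are identical.
function isContinuation(prev: Table, next: Table, tolerance = 3): boolean {
  const signature = (t: Table): string =>
    (t.rows[0] || []).map((h) => h.trim().toLowerCase()).join("|");
  return (
    prev.columnAnchors.length === next.columnAnchors.length &&
    prev.columnAnchors.every((x, i) => Math.abs(x - next.columnAnchors[i]) <= tolerance) &&
    signature(prev) === signature(next)
  );
}
```

No borders are consulted anywhere: rows and columns come only from where the text sits. A production extractor would also need to handle right-aligned numbers, merged cells, and multi-line cells, which this sketch ignores.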

When to use it

Financial reports, research datasets, scraped public records, government archives, internal data exports: any PDF whose contents are too sensitive to upload to a third-party converter, yet structured enough that retyping the tables by hand would be a waste of time.

Privacy framing

In our review, every PDF-to-Excel competitor we could reach (Smallpdf, iLovePDF, PDFCandy, PDF24, Convertio) either processes content on its servers or uses ambiguous "browser-based" wording that does not make clear where processing happens. Adobe Acrobat puts the feature behind a paywall. We found no other converter that works purely in the browser. PDF to Excel runs entirely on your device via pdfjs and SheetJS, so your financial reports, research datasets, and government archives stay in your browser.

Confidence signals

PDF table extraction is heuristic, not exact: a PDF stores positioned text, not table structure, so rows and columns have to be inferred. PDF to Excel surfaces per-table confidence tiers (high / medium / low) based on column alignment, empty-cell ratio, and row count. None of the competitors we reviewed surfaces confidence; all claim "structural integrity preserved" without acknowledging the lossiness inherent in extracting tables from PDF text positions. Alongside privacy, that transparency is what sets this tool apart.
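To make the three signals concrete, here is a small TypeScript sketch of how a tier could be derived from them. The thresholds, the `TableStats` shape, and the function names are illustrative assumptions; the tool's real cut-offs are not documented here.

```typescript
type Tier = "high" | "medium" | "low";

// Hypothetical summary of one detected table.
interface TableStats {
  alignmentScore: number; // 0..1, share of cells sitting on a column anchor
  emptyCellRatio: number; // 0..1, share of cells with no text
  rowCount: number;
}

// Share of blank cells in a grid of strings; an empty grid counts as fully empty.
function emptyCellRatio(rows: string[][]): number {
  let total = 0;
  let empty = 0;
  for (const row of rows) {
    for (const cell of row) {
      total += 1;
      if (cell.trim() === "") empty += 1;
    }
  }
  return total === 0 ? 1 : empty / total;
}

// Map the three signals to a tier. Thresholds are assumptions for illustration.
function confidenceTier(s: TableStats): Tier {
  if (s.rowCount < 2 || s.alignmentScore < 0.6 || s.emptyCellRatio > 0.5) return "low";
  if (s.rowCount >= 3 && s.alignmentScore >= 0.9 && s.emptyCellRatio <= 0.2) return "high";
  return "medium";
}
```

Under these assumed thresholds, a 12-row table with tight alignment and few blanks lands in the high tier, while a table that is mostly empty cells drops to low regardless of how well it aligns.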

What we don't handle and why

Scanned PDFs need OCR first — route to OCR PDF to add a text layer, then return. Encrypted PDFs need the password removed first — do that in your PDF software, then return. Corrupted PDFs need repair first — route to Repair PDF. Low-confidence tables surface with explicit warnings; manual cleanup or alternative tools remain options for critical data.

After extraction

Before sharing, add a password to the extracted spreadsheet with Protect PDF. This keeps financial and research exports encrypted at rest.

Frequently asked questions

What kinds of PDFs work best?
Well-aligned text PDFs (Excel exports, generated reports, government data). Borderless tables work — the algorithm uses text geometry, not visual borders. Multi-table pages produce one sheet per table. Scanned PDFs need OCR first via the OCR PDF tool.
Why might my table look wrong?
PDF table extraction is heuristic. A PDF stores positioned text, not table structure, so the algorithm clusters text items by position to infer rows and columns. Well-aligned tables produce high-confidence extractions; tables with merged cells, multi-line content, or unusual layouts may produce medium- or low-confidence extractions with visible warnings. For critical data, review low-confidence tables manually or clean them up in Excel or LibreOffice.
Can I extract tables from scanned PDFs?
No. PDF to Excel requires PDFs with selectable text. For scanned (image-only) PDFs, run OCR PDF first to add a text layer, then return to PDF to Excel. We don't bundle OCR in v1 because the existing OCR PDF tool already covers the case at no extra cost.
Can I extract tables from password-protected PDFs?
No. v1 does not decrypt PDFs. Remove the password in your PDF software first, then return to PDF to Excel. We don't proxy your password through pdfmundo.
Are my PDFs uploaded to your servers?
No. All extraction runs in your browser via pdfjs and SheetJS. Financial reports, research datasets, and government archives never leave your device. We have no servers receiving the content.
Should I use XLSX or CSV?
XLSX preserves multi-table structure (one sheet per detected table), header formatting hints, and confidence metadata. Use XLSX for downstream work in Excel or LibreOffice. Use CSV when feeding a data pipeline (pandas, R, or a custom script) that prefers flat text; when a PDF yields several tables, the CSV files are bundled as a ZIP archive.
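For readers curious what the CSV path involves, the following TypeScript sketch turns each detected table into its own CSV text using RFC 4180 style quoting. The function names and the `table-N.csv` naming are assumptions for illustration, not the tool's actual output format, and the ZIP bundling step is left out.

```typescript
// Hypothetical output shape: one named CSV per detected table.
interface CsvFile { name: string; content: string }

// Quote a field only when it contains a comma, a quote, or a line break;
// embedded quotes are doubled, as RFC 4180 describes.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize one table (rows of cells) to CSV text with CRLF line endings.
function tableToCsv(rows: string[][]): string {
  return rows.map((row) => row.map(csvField).join(",")).join("\r\n");
}

// One CSV per table; a real implementation would then add these to a ZIP.
function tablesToCsvFiles(tables: string[][][]): CsvFile[] {
  return tables.map((rows, i) => ({
    name: `table-${i + 1}.csv`,
    content: tableToCsv(rows),
  }));
}
```

Because CSV has no notion of sheets, each table needs its own file, which is why a multi-table extraction ends up as an archive while XLSX can hold everything in one workbook.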
