How to Convert a PDF Form to Excel
By BankStatementReader Team
PDF forms collect data well, but they do not analyze it. The moment you want to sort responses, total a column, or compare entries across submissions, you need that data in a spreadsheet. This guide walks through getting form data out of a PDF and into Excel, and the route you take depends on one thing first: what kind of PDF form you actually have.
First, identify your form type
There are two broad kinds of PDF forms, and they require completely different approaches.
A fillable form (often called an AcroForm or an XFA form) has interactive fields you can click into and type — text boxes, checkboxes, dropdowns, radio buttons. The data is stored as structured field values inside the file, separate from the visual page. This is the better case, because the values can be exported directly.
A flat or printed form has no interactive fields. This includes scanned paper forms, photographed documents, and PDFs that were flattened or "printed to PDF". There are no field values to read. Scans and photos hold only pixels, so you need optical character recognition (OCR) to recover the text. A flattened digital PDF may still carry selectable text, in which case you can copy it without OCR.
A quick test: open the PDF and try to click into a field. If your cursor enters the field and you can edit the value, it is fillable. If clicking does nothing, treat it as a flat form.
Converting a fillable PDF form
Because a fillable form already stores its data as named fields, the most reliable path is to export those field values rather than scrape the visible page.
Export the form data
Most full-featured PDF readers can export the data from a completed form. The export typically produces a structured file (commonly FDF, XFDF, XML, or CSV) that pairs each field name with its value. A CSV export opens directly in Excel. XFDF is an XML format, so XFDF and plain XML exports can usually be loaded with Data → Get Data → From File → From XML, where you choose which field names become columns. FDF is not XML and Excel does not read it directly, so re-export as XFDF or CSV if FDF is all you have.
This approach preserves the field-to-value relationship, so a field named invoice_total becomes a column header and its entry becomes the cell value. That structure is exactly what you want for analysis.
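If you would rather script the conversion than use the import dialog, the field-to-column mapping can be sketched in a few lines of Python. This is a minimal illustration, not a full XFDF reader: it assumes a flat export in the standard XFDF namespace with one value per field, and the function names and sample field names (customer_name, invoice_total) are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by XFDF exports.
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

def xfdf_to_row(xfdf_text: str) -> dict[str, str]:
    """Return {field name: value} for the top-level fields of an XFDF export.

    Assumption: fields are not nested and each has at most one <value>.
    """
    root = ET.fromstring(xfdf_text)
    row = {}
    for field in root.findall("x:fields/x:field", XFDF_NS):
        value = field.find("x:value", XFDF_NS)
        row[field.get("name")] = (value.text or "") if value is not None else ""
    return row

def row_to_csv(row: dict[str, str]) -> str:
    """Write one header line and one data line, ready to open in Excel."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return out.getvalue()

# Hypothetical export from a completed form.
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <fields>
    <field name="customer_name"><value>Jane Doe</value></field>
    <field name="invoice_total"><value>1250.00</value></field>
  </fields>
</xfdf>"""

print(row_to_csv(xfdf_to_row(SAMPLE_XFDF)))
```

Each field name becomes a header and each value lands in the cell beneath it, which is the same shape the Excel import produces.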
Combining many submissions
If you collected the same form from many people, exporting one at a time is tedious. Some PDF tools can merge data from a folder of completed forms into a single spreadsheet, with one row per form and one column per field. When that option is available, it turns dozens of returned forms into a clean table in a single step. If your tool cannot merge, export each form to CSV and stack the rows in Excel, keeping the header row only once.
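When the stack is too tall to paste by hand, the same "one header, one row per form" merge can be scripted. The sketch below assumes each export is a well-formed CSV with a header line; the function name and sample data are illustrative. Forms that lack a column get an empty cell rather than shifting values sideways.

```python
import csv
import io

def stack_csv_exports(csv_texts: list[str]) -> str:
    """Combine per-form CSV exports into one table with a single header row."""
    rows = []
    columns: list[str] = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # Keep columns in first-seen order across all forms.
            for name in row:
                if name not in columns:
                    columns.append(name)
            rows.append(row)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Two hypothetical exports; the second form has an extra field.
form_a = "name,invoice_total\nJane Doe,1250.00\n"
form_b = "name,invoice_total,paid\nSam Lee,310.50,Yes\n"

print(stack_csv_exports([form_a, form_b]))
```

To run it over a folder, read each file's text first, for example with pathlib's `Path.glob("*.csv")` and `read_text()`, and pass the resulting list in.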
Copy and paste as a fallback
For a one-off fillable form, you can sometimes select the text in the fields and paste it into Excel. Results vary: values often land in a single cell, labels and entries merge, and checkbox states may not copy at all. After pasting, use Data → Text to Columns to split the content, then align the rows by hand. This is workable for a single short form but does not scale.
Converting a flat or scanned form
A flat form has no field data to export, so you have to read the characters off the page. For scans and photos, that means OCR.
Run OCR first
OCR converts the image of the form into selectable, searchable text. Many PDF readers include an OCR or "recognize text" function; standalone OCR tools and document scanners offer the same capability. Run it on the whole document so every page gains a text layer.
Accuracy depends heavily on input quality. A clean, straight scan at a reasonable resolution (300 dpi is a common baseline) recognizes far better than a skewed phone photo or a faint fax. Before running OCR, straighten the page and crop out the background. If the recognized text still looks garbled, rescan at higher quality and run OCR again.
Get the recognized text into Excel
Once the form has a text layer, you have a few options. You can copy the recognized text and paste it into Excel, then clean it up with Text to Columns. You can export the OCR output to a spreadsheet format if your tool supports it. Or you can transcribe the recognized values into a prepared template — slower, but it forces you to verify each entry, which matters when OCR misreads characters.
Always check OCR output against the original. Common errors include confusing the letter O with the number 0, the letter l with the number 1, and dropping leading zeros or decimal points in amounts.
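Part of that check can be automated for amount columns. The sketch below is one possible approach, not a substitute for reading the original: it swaps the usual look-alike letters for digits and flags any cell that needed a swap or that does not end in two decimal places. It assumes amounts on the form are always printed with two decimals, and the function name is hypothetical.

```python
import re

# Letters OCR commonly substitutes for digits in numeric fields.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})

# Assumption: every amount on the form is printed with exactly two decimals.
AMOUNT_RE = re.compile(r"-?\d+\.\d{2}")

def check_amount(raw: str) -> tuple[str, bool]:
    """Return (suggested value, needs_review) for a cell that should hold an amount."""
    cleaned = raw.strip().replace(",", "")
    suggested = cleaned.translate(LOOKALIKES)
    # Review anything that was altered or still does not look like an amount.
    needs_review = suggested != cleaned or AMOUNT_RE.fullmatch(suggested) is None
    return suggested, needs_review

for cell in ["1,250.00", "l2O.50", "12050"]:
    print(cell, check_amount(cell))
```

A flagged cell is a prompt to look at the source page, not a correction to accept blindly. Identifiers with leading zeros are better kept as text and compared by eye, since a dropped zero leaves nothing for a pattern to catch.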
Structuring the result into columns
However you extracted the data, the goal is the same: a tidy table where each column is one field and each row is one form or one record.
A few habits keep the result usable:
- Put a clear header in the first row — one label per column, no blank columns between them.
- Keep one type of data per column. Do not mix a date and a description in the same cell.
- Strip stray labels that came along during copy-paste, so cells hold values, not "Name: Jane".
- Set the right cell format for each column — dates as dates, currency as numbers — so Excel can sort and total them correctly.
- For checkbox and yes/no fields, normalize the values (for example, Yes/No or TRUE/FALSE) so you can filter on them.
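If you clean the data in a script before it reaches Excel, the habits above map onto a few small helpers. This is a sketch under stated assumptions: the accepted checkbox spellings, the dollar sign, and the month/day/year date format are guesses about a typical form, and all function names are illustrative.

```python
from datetime import date, datetime
from decimal import Decimal

# Assumed spellings; adjust to whatever your form actually exports.
TRUE_VALUES = {"yes", "y", "true", "on", "checked", "x"}
FALSE_VALUES = {"no", "n", "false", "off", "unchecked", ""}

def strip_label(cell: str, label: str) -> str:
    """Turn 'Name: Jane' into 'Jane'; leave cells without that label unchanged."""
    text = cell.strip()
    prefix = label + ":"
    if text.lower().startswith(prefix.lower()):
        return text[len(prefix):].strip()
    return text

def normalize_checkbox(cell: str) -> bool:
    """Map the many ways a checkbox can be exported onto True/False."""
    text = cell.strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognized checkbox value: {cell!r}")

def parse_amount(cell: str) -> Decimal:
    """Drop the currency symbol and thousands separators so the value is numeric."""
    return Decimal(cell.strip().replace("$", "").replace(",", ""))

def parse_date(cell: str, fmt: str = "%m/%d/%Y") -> date:
    """Parse a date string; the default format is an assumption about the form."""
    return datetime.strptime(cell.strip(), fmt).date()

print(strip_label("Name: Jane", "Name"))
print(normalize_checkbox("Off"), parse_amount("$1,250.00"), parse_date("03/14/2024"))
```

Raising on an unrecognized checkbox value is deliberate: a surprise spelling is better surfaced than silently filed under No.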
Once the data sits in clean columns, Excel's sorting, filtering, and formula tools work as expected, and you can pivot or chart the responses.
When you have many PDFs
The techniques above work well for one form or a small batch. If you regularly convert PDFs to spreadsheets — invoices, statements, recurring reports — a repeatable workflow saves more than the time of any single conversion. For a fuller walkthrough of moving tabular data from PDF to spreadsheet, see how to convert a PDF to Excel.
For one specific high-volume case — bank statements, where the layout is consistent and the columns are predictable — a purpose-built tool can detect the transaction table and export rows directly. You can try that with the bank statement converter.
The short version
Identify the form first. If it is fillable, export the field data and import it into Excel — the structure comes along for free. If it is flat or scanned, run OCR, move the recognized text into a spreadsheet, and verify it against the original. Either way, finish by shaping the data into one column per field so the spreadsheet can do the work the PDF could not.