MetaChat

28 December 2009

Is it possible to translate a PDF into an Excel file?
Like, to scan a paper that was an Excel file into PDF, and then magically translate that back into an Excel file that you can do all Excel things with? Please tell me there's a magical way.
And if not, could someone build a magical way? You only have to give me a small cut of the babillion dollars you make.
posted by ThePinkSuperhero 28 December | 16:49
I really don't think so, and I just tried real quick. The whole magic of PDF is to take a document and make it so it is static and can't be changed. Do an AskMe to make sure? Or could you protect cells in the Excel document so people can only change certain things?
posted by rainbaby 28 December | 16:54
Alas, what it is is that we have paper copies of Excel sheets that we were hoping to magically convert back into Excel documents without, you know, manually typing it all up. This is 2010! And where is my flying car????
posted by ThePinkSuperhero 28 December | 17:00
I have an Export | Word Document option in my Acrobat Pro. Might that be a way to do it?
posted by mrmoonpie 28 December | 17:06
If you can convert it to a Word document, then convert the text into a table, you can import it into Excel. Might be quicker and easier to re-type it, though :-(
posted by dg 28 December | 17:14
Sorry to be a bummer, but even if you could get all the OCR working (and trust that it worked, without verifying everything using the same amount of time as typing), Excel still won't have any clue about data types, formats, etc.

When you print an Excel sheet, you throw away a lot of information.
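
To make the lost information concrete: OCR hands back plain strings, so something has to guess each cell's type again. Here is a rough sketch in Python, assuming a hypothetical helper called `guess_cell`. The rules in it (drop currency symbols and thousands separators, treat a trailing % as a percentage, read parentheses as a negative) are illustrative guesses, not how Excel itself decides types.

```python
def guess_cell(text):
    """Guess a value for one OCR'd cell; fall back to the raw string."""
    raw = text.strip()
    if raw == "":
        return None  # empty cell
    # Assumption: commas are thousands separators and $ is the only currency symbol.
    cleaned = raw.replace(",", "").replace("$", "")
    is_percent = cleaned.endswith("%")
    if is_percent:
        cleaned = cleaned[:-1]
    # Accounting-style negatives: (123.45) means -123.45
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    try:
        number = float(cleaned)
    except ValueError:
        return raw  # not numeric (labels, dates, etc.): keep the text
    if is_percent:
        return number / 100
    if number.is_integer() and "." not in cleaned:
        return int(number)
    return number
```

Anything the rules don't recognize, dates included, stays a string, and formulas are gone regardless, so a pass like this only recovers part of what the printout threw away.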
posted by pompomtom 28 December | 17:20
I haven't looked at this kind of thing for a long time so I don't know what's currently good, but you want something like this. The Wikipedia article on OCR lists a lot of packages that basically do what you want, and some of them are free. 'Basically' means there's still going to be some work involved in getting it into Excel, but that's sort of a separate job from translating the scanned image into characters. It goes without saying that if the Excel file contained formulas to generate the numbers, those can't be recovered from the paper version.
posted by Wolfdog 28 December | 17:26
I meant to link to this, too, though I haven't used it myself.
posted by Wolfdog 28 December | 17:27
Able2Extract and PDFConverter claim to have this capability, as does CometDocs, an online tool.

PDF is a portable document format. It isn't simply an image storage format. PDF documents can retain text information.

Except that I now see you aren't starting with a PDF. Why complicate matters?

"we have paper copies of Excel sheets that we were hoping to magically convert back into Excel documents without, you know, manually typing it all up"

Many professional OCR suites will automatically do what you need (to varying degrees of accuracy depending on equipment, state of the input, etc.). Do you have a scanner? Did it come with any software? Honestly, it's possible to get this working to where the conversion step is almost painless (the getting-working part can be a bear, though).

If you have enough original documents, you can actually hire a service to do the conversion for you, including cleanup.
posted by dhartung 28 December | 18:44
If the printouts are good quality, then this will be fairly easy.

1. Scan into Adobe Acrobat (or other software with OCR).
2. Select the text and paste it into Excel.
3. Go to Data, then Text to Columns, and select an appropriate delimiter (usually a space should be fine).
4. Clean up any stray fields.
5. Get a martini.
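
If the OCR step leaves you with plain text instead, steps 2 and 3 can be roughed out in a script. This is a minimal sketch in Python, assuming the OCR output keeps columns separated by tabs or by two or more spaces; the function names `split_columns` and `ocr_text_to_csv` and the sample data are made up for illustration. Splitting on wider gaps instead of a single space keeps labels like "Office chair" in one cell.

```python
import csv
import io
import re


def split_columns(line):
    """Split one OCR'd line on tabs or runs of two or more whitespace characters."""
    return [field.strip() for field in re.split(r"\s{2,}|\t", line.strip())]


def ocr_text_to_csv(text):
    """Turn OCR'd text into CSV text, one row per non-blank line."""
    rows = [split_columns(line) for line in text.splitlines() if line.strip()]
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for text copied out of an OCR'd scan.
sample = (
    "Item           Qty   Price\n"
    "Office chair   4     129.99\n"
    "Desk lamp      12    18.50\n"
)
print(ocr_text_to_csv(sample))
```

Saving that output to a .csv file gives you something Excel can open directly. Step 4 still applies: a row where OCR squeezed two columns together will come out with the wrong number of fields.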
posted by special-k 28 December | 18:51
I meant to say Acrobat Professional.
posted by special-k 28 December | 18:51