How to extract data from MS Word and Excel files?
Extracted from Web submission (CPAN:OLE:MSchwarz)
Tip provided by Daniel Heiserer
A reliable way of extracting data from MS Word and MS Excel files is the
following. It works on Linux and IRIX. Get the OLE Perl module from CPAN.
Using this module inside Perl you can read MS Word and MS Excel files and write
them out again as clean text, CSV, or even HTML files. The OLE package by
M. Schwartz is a sort of upgrade of the LAOLA package. Read his documentation
which comes with the package. Integrating these tools into Netscape is quite
easy. They are not a plugin, but they work fine as a "helper". All
you have to do is edit "preferences/applications/" and specify, as the
"helpers" for the file types .xls and .doc, the Perl scripts mentioned
above or shell wrappers around them.
The helper extracts the data, writes it into a .txt or
.html file, and then uses Netscape's remote-control feature,
"netscape -remote openURL(file:/tmp/tmp.html)", to bring the result into your
running Netscape session. It is a bit of a hack, but it works and is very general.
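
The helper setup described above could be sketched as a small shell wrapper like
the one below. This is only an illustration, not the tip author's script: the
CONVERTER variable is a hypothetical placeholder for your own OLE-based Perl
script (assumed to take the file name as its argument and print HTML on standard
output), and BROWSER_CMD and the function names are likewise made up for the
example. Only the "netscape -remote openURL(...)" call comes from the tip itself.

```shell
#!/bin/sh
# Sketch of a Netscape "helper" for .doc/.xls files.
# Assumption: CONVERTER names an executable (e.g. your OLE-based Perl script)
# that reads the file given as its argument and prints HTML on stdout.

# Build the argument for "netscape -remote" from a file path.
remote_url() {
    printf 'openURL(file:%s)' "$1"
}

# Convert input file $1 into HTML file $2, then ask the running
# browser session to open the result.
convert_and_show() {
    "${CONVERTER:?set CONVERTER to your conversion script}" "$1" > "$2" || return 1
    "${BROWSER_CMD:-netscape}" -remote "$(remote_url "$2")"
}

# Netscape passes the path of the downloaded document as the first argument.
if [ $# -ge 1 ]; then
    convert_and_show "$1" "${TMPDIR:-/tmp}/tmp.html"
fi
```

Registered as the helper application for .doc and .xls, a wrapper of this kind
would be invoked with the downloaded file as its only argument. A fixed output
name such as tmp.html is overwritten on every call, which is fine for a quick
viewer but not for keeping results.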
Tip recorded : 16-05-1999 21:49:33
HTML page last changed : 27-07-1999 20:12:59