Extract Text From Documents (PDF, DOC, XLS, PPT, etc) - docsToText

Author: bshopcho

Last Update: February 3, 2021

License: MIT

How to use it:

1. To get started, load the JavaScript file docToText.js in the document.

<script src="docToText.js"></script>

2. Create a new DocToText instance.

const docToText = new DocToText();

3. Extract text from a file you specify. Pass the file and its extension.

docToText.extractToText('example.pdf', 'pdf')
.then(function (text) {
  console.log(text)
}).catch(function (error) {
  console.log(error)
});
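
The same call can also be written with async/await. The following is a minimal sketch that assumes only what the example above shows, namely that extractToText returns a promise. The helper name extractOrNull is hypothetical and not part of the library.

```javascript
// Hypothetical helper (not part of the library): resolves to the extracted
// text, or to null if extraction fails.
// `extractor` is any object exposing extractToText(source, ext) that returns
// a promise, such as a DocToText instance.
async function extractOrNull(extractor, source, ext) {
  try {
    return await extractor.extractToText(source, ext);
  } catch (error) {
    console.error('Extraction failed:', error);
    return null;
  }
}
```

It could be called as `const text = await extractOrNull(docToText, 'example.pdf', 'pdf');` inside an async function, with a null check in place of the catch handler.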

4. Extract text from a local file chosen by the user.

// `files` is assumed to be the FileList of an <input type="file"> element
const file = files[0];
const {name} = file;
const ext = name.toLowerCase().substring(name.lastIndexOf('.') + 1);
docToText.extractToText(file, ext)
.then(function (text) {
  console.log(text)
}).catch(function (error) {
  console.log(error)
});
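
Note that the extension expression above returns the whole file name when the name contains no dot, because lastIndexOf then returns -1. A slightly more defensive variant might look like the sketch below. The function name getExtension is hypothetical, and how the library reacts to an empty extension is not documented here, so the caller should check for it before extracting.

```javascript
// Hypothetical helper (not part of the library): returns the lowercase
// extension without the dot, or an empty string when there is none.
function getExtension(name) {
  const dot = name.lastIndexOf('.');
  // dot === -1: no dot at all; dot === 0: dotfile such as ".gitignore";
  // dot at the last position: trailing dot with nothing after it.
  if (dot <= 0 || dot === name.length - 1) {
    return '';
  }
  return name.slice(dot + 1).toLowerCase();
}
```

For example, `getExtension(file.name)` can replace the substring expression, and an empty result can be reported to the user instead of being passed to extractToText.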

5. You can also extract text from multiple files bundled in a zip archive.

docToText.extractZipToText('file.zip')
.then(function (text) {
  console.log(text)
}).catch(function (error) {
  console.log(error)
});

// Or from a local zip file chosen by the user (a separate snippet)
const file = files[0];
const docToText = new DocToText();
docToText.extractZipToText(file)
.then(function (text) {
  console.log(text)
}).catch(function (error) {
  console.log(error)
});
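
If the files are selected individually rather than bundled in a zip, one option is to call extractToText once per file. The sketch below processes the files one at a time and records a result or an error for each. The helper name extractAll and the shape of the returned objects are hypothetical, and it assumes only that extractToText returns a promise as in the earlier examples.

```javascript
// Hypothetical helper (not part of the library): extracts text from each
// file in turn and never rejects; failures are recorded per file.
// `extractor` is any object exposing extractToText(file, ext), such as a
// DocToText instance. `files` is a FileList or an array of File-like
// objects with a `name` property.
async function extractAll(extractor, files) {
  const results = [];
  for (const file of Array.from(files)) {
    const { name } = file;
    // Same extension logic as in the single-file example.
    const ext = name.toLowerCase().substring(name.lastIndexOf('.') + 1);
    try {
      const text = await extractor.extractToText(file, ext);
      results.push({ name, text, error: null });
    } catch (error) {
      results.push({ name, text: null, error });
    }
  }
  return results;
}
```

Sequential processing is chosen here because nothing above states whether one instance can safely run several extractions at once.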