Google Script To Extract Text from PDF

This article shows how to use Google Apps Script to extract text from a PDF document while retaining simple formatting. The PDF can be stored in a Google Drive folder or fetched from a web URL.

The following snippet extracts the text from a PDF named sample.pdf and creates a new Google Doc containing that text in the root folder of your Drive.

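function getTextFromPDF() {
  // Look up the PDF by name and fail clearly if it is not in Drive.
  var files = DriveApp.getFilesByName("sample.pdf");
  if (!files.hasNext()) {
    throw new Error("sample.pdf was not found in Google Drive");
  }
  var blob = files.next().getBlob();

  var resource = {
    title: blob.getName(),
    mimeType: blob.getContentType()
  };

  // Create a Doc file in Google Drive with the text extracted from the PDF.
  // Drive.Files.insert belongs to version 2 of the advanced Drive service.
  var docFile = Drive.Files.insert(resource, blob, {ocr: true, ocrLanguage: "en"});

  // Return the created file so callers can use its ID.
  return docFile;
}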

Here the Google Drive API acts as the OCR engine that extracts the text from the PDF. You must enable the advanced Drive API service before running the script.
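The introduction mentions that the PDF can also come from a web URL, and the snippet above stops once the Doc is created. The following is a minimal sketch of both steps, assuming the advanced Drive service (version 2) is enabled under the name Drive. The function names extractTextFromPdfBlob and getTextFromPdfUrl and the file name download.pdf are made up for illustration and are not part of the original script.

```javascript
/**
 * Runs Drive OCR on a PDF blob and returns the recognized plain text.
 * Assumption: the advanced Drive service (v2) is enabled as `Drive`.
 */
function extractTextFromPdfBlob(blob, language) {
  var resource = {
    title: blob.getName(),
    mimeType: blob.getContentType()
  };

  // ocr: true asks Drive to convert the upload into a Google Doc.
  var docFile = Drive.Files.insert(resource, blob, {
    ocr: true,
    ocrLanguage: language || "en"
  });

  // Open the converted Doc and read its body as plain text.
  var text = DocumentApp.openById(docFile.id).getBody().getText();

  // Move the temporary Doc to the trash; remove this line to keep it.
  DriveApp.getFileById(docFile.id).setTrashed(true);

  return text;
}

/**
 * Hypothetical helper: downloads a PDF from a URL and extracts its text.
 * The URL must be reachable by the script without extra authentication.
 */
function getTextFromPdfUrl(url) {
  var response = UrlFetchApp.fetch(url);
  // Give the blob a name so the converted Doc gets a readable title.
  var blob = response.getBlob().setName("download.pdf");
  return extractTextFromPdfBlob(blob, "en");
}
```

Calling getTextFromPdfUrl with the address of a PDF should return its text as a string, which you can log with Logger.log or write to a sheet. OCR quality depends on the PDF, so treat the returned text as approximate.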

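Steps to enable the API service:
In the script editor, go to Resources > Advanced Google Services and enable the Drive API. In the newer Apps Script editor, click the + next to Services in the left sidebar and add the Drive API there.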
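If the service is not enabled, the script stops with a reference error on Drive, which can be confusing. The following is a small sketch of a guard you could call at the top of your function. The name assertDriveServiceEnabled is hypothetical, and the check for Drive.Files.insert assumes that this method is only exposed by version 2 of the advanced service.

```javascript
/**
 * Hypothetical guard: fails early with a readable message when the
 * advanced Drive service is missing or does not expose Files.insert.
 */
function assertDriveServiceEnabled() {
  // typeof is safe even when the identifier Drive does not exist.
  if (typeof Drive === "undefined") {
    throw new Error("Advanced Drive service is not enabled for this script.");
  }
  // Assumption: Files.insert is only available in Drive API v2.
  if (!Drive.Files || !Drive.Files.insert) {
    throw new Error("Drive.Files.insert is unavailable; this script expects Drive API v2.");
  }
  return true;
}
```

Calling assertDriveServiceEnabled() as the first line of getTextFromPDF would turn a missing service into a clear error message before any file is touched.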

Thanks for reading this article. I hope it helps you extract text from a PDF with Google Script. Please leave a comment if you run into any problem related to the article.