Python and PDF

Here is an example of how to merge the first page of each PDF file in a folder into a single file.

I wrote this script because I needed to print the first page of 200 individual files and didn’t want to open each one manually.

 

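import os
import pyPdf

output = pyPdf.PdfFileWriter()
original = "c:\\original\\folder\\"
opened = []  # input files are kept open until the output has been written

# Only look at the files directly inside the folder, in a predictable order.
for eachfile in sorted(os.listdir(original)):
	ffile = os.path.join(original, eachfile)

	# Match on the extension; a plain "pdf" substring test would also match "pdf_notes.txt".
	if eachfile.lower().endswith(".pdf"):
		stream = open(ffile, "rb")
		opened.append(stream)
		pdf = pyPdf.PdfFileReader(stream)
		output.addPage(pdf.getPage(0))
	print(ffile)

outputStream = open("c:\\out.pdf", "wb")
output.write(outputStream)
outputStream.close()

for stream in opened:
	stream.close()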
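
Choosing which files to process is a step that is easy to get subtly wrong: a substring test for "pdf" would also match a name such as `pdf_notes.txt`. The following is a minimal sketch that isolates that step using only the standard library. The helper name `list_pdfs` is hypothetical and not part of pyPdf, and it assumes that a `.pdf` extension is a good enough test for "this is a PDF".

```python
import os


def list_pdfs(folder):
    """Return sorted full paths of the .pdf files directly inside folder."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Compare the extension case-insensitively and skip subfolders.
        if name.lower().endswith(".pdf") and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

The returned list could then drive the loop that adds page 0 of each file to the writer.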

Merging PDF files in Python with pyPdf

Here is a very simple Python script that merges two PDF files using the pyPdf library.

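import os.path
import pyPdf

pdfOne = "C:\\a.pdf"
pdfTwo = "C:\\b.pdf"

merged = "C:\\c.pdf"

# Only merge when both input files exist.
if os.path.exists(pdfOne) and os.path.exists(pdfTwo):

	output = pyPdf.PdfFileWriter()

	# Copy every page of the first file...
	streamOne = open(pdfOne, "rb")
	readerOne = pyPdf.PdfFileReader(streamOne)
	for page in range(readerOne.getNumPages()):
		output.addPage(readerOne.getPage(page))

	# ...followed by every page of the second file.
	streamTwo = open(pdfTwo, "rb")
	readerTwo = pyPdf.PdfFileReader(streamTwo)
	for page in range(readerTwo.getNumPages()):
		output.addPage(readerTwo.getPage(page))

	outputStream = open(merged, "wb")
	output.write(outputStream)
	outputStream.close()

	# Close the inputs only after the merged file has been written.
	streamOne.close()
	streamTwo.close()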
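
The same loop extends to any number of input files. Below is a rough sketch, not part of the library's API: `merge_pdfs` is a hypothetical helper, and it takes the reader and writer classes as arguments (with pyPdf these would be `pyPdf.PdfFileReader` and `pyPdf.PdfFileWriter`), so it relies only on the four methods already used in the script above: `getNumPages`, `getPage`, `addPage` and `write`. It also assumes the input files should stay open until the output has been written.

```python
def merge_pdfs(paths, merged_path, reader_cls, writer_cls):
    """Append every page of each file in paths, in order, to merged_path."""
    output = writer_cls()
    handles = []
    try:
        for path in paths:
            # Assumption: page data may be read lazily, so each input
            # stays open until the merged file has been written.
            handle = open(path, "rb")
            handles.append(handle)
            reader = reader_cls(handle)
            for page in range(reader.getNumPages()):
                output.addPage(reader.getPage(page))
        with open(merged_path, "wb") as output_stream:
            output.write(output_stream)
    finally:
        # Close every input, even if reading or writing failed.
        for handle in handles:
            handle.close()
```

With pyPdf installed, a call might look like `merge_pdfs(["C:\\a.pdf", "C:\\b.pdf"], "C:\\c.pdf", pyPdf.PdfFileReader, pyPdf.PdfFileWriter)`.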