I am trying to extract the images from a PDF using this code:

    import fitz  # PyMuPDF
    from PIL.ImageEnhance import Color

    doc = fitz.open("P1.pdf")
    for i in range(len(doc)):
        for img in doc.getPageImageList(i):
            xref = img[0]  # the first item of each entry is the image's xref number
            pix = fitz.Pixmap(doc, xref)
            if pix.n < 5:  # this is GRAY or RGB
                pix.writeImage("p%s-%s.png" % (i, xref))
            else:  # CMYK: convert to RGB first
                pix1 = fitz.Pixmap(fitz.csRGB, pix)
                pix1.writeImage("p%s-%s.png" % (i, xref))
                pix1 = None
            pix = None
However, all my images turn out completely black. I assume it's because they are black text on a black background. How can I solve this?
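
One way to test that assumption: if the extracted images are really black text over a transparent background (rather than a truly black one), flattening them onto white should make the text visible. Below is a minimal pure-Python sketch of that compositing step. `flatten_rgba_on_white` is a hypothetical helper of my own, not part of PyMuPDF, and it assumes the pixel data is non-premultiplied RGBA with 4 bytes per pixel, which I have not confirmed for these images.

```python
def flatten_rgba_on_white(rgba: bytes) -> bytes:
    """Composite straight-alpha RGBA pixels over white and return RGB bytes.

    Hypothetical helper for illustration; assumes 4 bytes per pixel (R, G, B, A).
    """
    if len(rgba) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel (RGBA)")
    out = bytearray()
    for i in range(0, len(rgba), 4):
        r, g, b, a = rgba[i:i + 4]
        for channel in (r, g, b):
            # Blend the channel with white (255), weighted by alpha,
            # using rounded integer math in the 0..255 range.
            out.append((channel * a + 255 * (255 - a) + 127) // 255)
    return bytes(out)


# An opaque black pixel (text) followed by a fully transparent black pixel (background)
pixels = bytes([0, 0, 0, 255, 0, 0, 0, 0])
flattened = flatten_rgba_on_white(pixels)
```

If the text pixels stay dark and the background pixels turn white after this step, the "black on black" result would come from the transparency being dropped when the image is saved, not from the image content itself.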