Updated 2022-05-21

I am trying to extract the images from a PDF using this code:

import fitz  # PyMuPDF

doc = fitz.open("P1.pdf")
for i in range(len(doc)):
    for img in doc.get_page_images(i):
        xref = img[0]  # xref number of the image object
        pix = fitz.Pixmap(doc, xref)
        if pix.n - pix.alpha < 4:  # GRAY or RGB: can be saved as PNG directly
            pix.save("p%s-%s.png" % (i, xref))
        else:  # CMYK: convert to RGB first
            pix1 = fitz.Pixmap(fitz.csRGB, pix)
            pix1.save("p%s-%s.png" % (i, xref))
            pix1 = None
        pix = None

However, all my images turn out completely black. I assume it's because they are black text on a black background. How can I solve this?
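One possible cause worth checking, offered as an assumption rather than a diagnosis of this particular file: PDFs often store an image's transparency in a separate soft mask (the second entry of each image tuple), and saving the base image without that mask can produce an all-black picture. The sketch below recombines the two before saving. The function names `is_png_compatible` and `extract_images` are hypothetical helpers made up for illustration, and the approach follows the pattern from PyMuPDF's image recipes.

```python
from pathlib import Path


def is_png_compatible(n: int, alpha: int) -> bool:
    """Return True when a pixmap with n components per pixel
    (alpha included) is GRAY or RGB and can be saved as PNG directly."""
    return n - alpha < 4


def extract_images(pdf_path: str, out_dir: str = ".") -> list:
    """Hypothetical helper: save every image in the PDF as PNG,
    recombining each image with its soft mask when it has one."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    out = Path(out_dir)
    saved = []
    doc = fitz.open(pdf_path)
    for page_index in range(len(doc)):
        for img in doc.get_page_images(page_index):
            xref, smask = img[0], img[1]  # smask is 0 when there is no soft mask
            pix = fitz.Pixmap(doc, xref)
            if smask > 0:
                try:
                    base = fitz.Pixmap(doc.extract_image(xref)["image"])
                    if base.alpha:  # the base must not carry its own alpha
                        base = fitz.Pixmap(base, 0)
                    mask = fitz.Pixmap(doc.extract_image(smask)["image"])
                    pix = fitz.Pixmap(base, mask)  # base image plus mask as alpha
                except Exception:
                    pass  # combining failed: keep the plain pixmap
            if not is_png_compatible(pix.n, pix.alpha):
                pix = fitz.Pixmap(fitz.csRGB, pix)  # e.g. CMYK: convert to RGB
            target = out / f"p{page_index}-{xref}.png"
            pix.save(str(target))
            saved.append(target)
    doc.close()
    return saved
```

Calling `extract_images("P1.pdf")` should write the PNG files next to the script. If the results now have a transparent background, they may still look black in a viewer with a dark backdrop, so it is worth opening one on a white background before concluding that the extraction failed.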
