NAME

epubcssfix -- Make corrections to CSS in .EPUB files for better readability

SYNOPSIS

# Make corrections to example.epub with the default options and write to 
# example_fixed.epub
epubcssfix example.epub

# The same, but chatty
epubcssfix --verbose example.epub

# Put the unmodified example.epub at example_orig.epub and the modified version
# at example.epub
epubcssfix --replace example.epub

# Remove all color: and background-color: attributes from example.epub's CSS
epubcssfix --nocolors example.epub 

# Take 10pt as the baseline font-size which is mapped to 100%. (So 12pt would be
# 120%, 24pt would be 240%, etc.)
epubcssfix --fontsize 10 example.epub

# Remove colors from CSS and append '[removed colors]' to the epub title
epubcssfix --nocolors --title '[removed colors]' example.epub 

# Fix all the EPUB files in a directory and its subdirectories
epubcssfix --recurse '/home/username/Calibre Library'

# Fix all the EPUB files only in the target directory with no recursion
epubcssfix /home/username/ebooks/authorname

# Fix multiple specified EPUB files
epubcssfix example.epub tinyfont.epub nocontrast.epub

# Get brief help on command line options
epubcssfix --help

DESCRIPTION

Many epubs come with unprofessional CSS that will not display correctly on some ebook readers. For instance, the font size may be illegibly small on a mobile device, or the user may have dark mode turned on, but the CSS specifies element colors according to an assumed (but not specified) white background, so there is little or no contrast with the actual black background.

This script will take each listed epub, check whether its CSS has problematic color: or font-size: declarations, and correct or remove them. If it makes any corrections, it will write the corrected epub to <filename>_fixed.epub (unless -r/--replace is specified, or -F/--filesuffix supplies a different suffix).

Default corrections currently consist of supplying a contrasting background color whenever a foreground color is specified without one, altering font sizes specified in pt to percentages of default font size (based on 12pt = 100%), and removing font sizes specified with other absolute units (cm, mm, in, px, pc).

OPTIONS

-h --help

Display a brief help message.

-R --recurse

Recurse through any directories found on the command line.

-v --verbose

Chatter about what files we're examining and what fixes we're making.

-r --replace

Put the original epub at <filename>_orig.epub and the correction at <filename>.epub.

-n --nocolors

Strip out color attributes instead of supplying contrasting background colors.

-d --darkmode

Add a black background color and invert all existing colors.

-F --filesuffix

Add this string to the output filename instead of '_fixed'.

-t --title

Suffix the following string to the epub's title. This can help keep the different versions of an epub distinct in your ebook reader, if you're trying out different options before deciding on a final version to keep.

-f --fontsize

Baseline font size in pt that maps to 100%. E.g. if you set --fontsize 10 then a font-size of 10pt will be mapped to 100%, 12pt will be mapped to 120%, and so on. Default is 12.

DEPENDENCIES

This script uses two CPAN modules: Archive::Zip, to access the files within an epub, and Graphics::ColorUtils, to map CSS/HTML color names to RGB values. It also uses DirwalkCallback, supplied with this script, and Pod::Usage and Getopt::Long, from the standard Perl library.

LIMITATIONS

This script doesn't fully parse the CSS; it just uses regular expressions to check for certain bad patterns I've noticed in epubs (mostly indie ebooks formatted by the author with whatever tools they have on hand). It does not check the CSS for syntax errors, and doesn't fully check that color: values are valid (e.g., color: rgb( 257, 0, 0 ); would slip past it).

CHANGELOG

2023-09-12

Add --darkmode and --filesuffix options.

2023-09-08

Fixed logic errors in the complementary color code. Fixed bug where elements with greyscale foreground colors were getting background colors that barely contrasted with them or not at all. Now, if a color is greyish (defined as having a difference of no more than 32 between its highest and lowest r,g,b values), we average the r,g,b and return white if it's less than 128 and black if it's more than 128. Otherwise, use the (revised and corrected) complementary color function.

AUTHOR

Jim Henry III, http://jimhenry.conlang.org/software/

LICENSE

This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

FUNCTIONS

help()

Print brief help.

complement( rgb array )

Calculate the complementary color and return another rgb array.
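
The documentation above does not say which definition of "complementary" the script uses. As a rough illustration only, here is one common definition (each channel is subtracted from the sum of the largest and smallest channels, which rotates the hue by half a turn). The name rgb_complement is hypothetical and is not the script's own complement().

```perl
use strict;
use warnings;
use List::Util qw(max min);

# Hypothetical sketch: complement of an rgb triple under the
# "max + min - channel" definition (an assumption, see lead-in).
sub rgb_complement {
    my @rgb = @_;
    my $sum = max(@rgb) + min(@rgb);
    return map { $sum - $_ } @rgb;
}
```

Under this definition pure red (255, 0, 0) maps to cyan (0, 255, 255), while a pure grey such as (100, 100, 100) maps to itself, which is why greyish colors need separate handling.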

rgb2hex( rgb array )

Convert from an rgb array to a six-digit HTML/CSS hex color string.

hex2rgb( hex color string )

Convert from a six-digit HTML/CSS hex color string to an rgb array.
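
A minimal sketch of the two conversions, assuming an optional leading '#' and exactly six hex digits. The names hex_to_rgb and rgb_to_hex are hypothetical stand-ins, not the script's actual code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for hex2rgb(): six hex digits to three integers.
# Returns an empty list if the string is not in that form.
sub hex_to_rgb {
    my ($hex) = @_;
    return unless $hex =~ /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i;
    return map { hex($_) } ( $1, $2, $3 );
}

# Hypothetical stand-in for rgb2hex(): three integers (0-255) to '#rrggbb'.
sub rgb_to_hex {
    my @rgb = @_;
    return sprintf '#%02x%02x%02x', @rgb;
}
```

For example, hex_to_rgb('#f1f6fc') gives (241, 246, 252), and rgb_to_hex(241, 246, 252) gives '#f1f6fc' again.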

complementary_color( CSS color string )

Take any of the valid types of CSS color value and return the complementary color (as the same type of CSS color value if possible, otherwise a hex string).

There's a lot of duplicate code between this and css2rgb(). I'm not sure we can fix that unless we give up on mirroring the type of CSS color value the color: had in the background-color: we supply.

css2rgb( CSS color string )

Take any of the valid types of CSS color value and return an rgb array.
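
To illustrate the kind of pattern matching involved, here is a hedged sketch that handles only three forms: #rrggbb, #rgb, and rgb(r, g, b). Named colors (which the script resolves through Graphics::ColorUtils) and other CSS color syntaxes are deliberately left out. The name css_to_rgb is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: parse a few CSS color forms into an rgb list.
# Returns an empty list for anything it does not recognise.
sub css_to_rgb {
    my ($color) = @_;
    $color =~ s/^\s+|\s+$//g;    # trim surrounding whitespace

    if ( $color =~ /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i ) {
        return map { hex($_) } ( $1, $2, $3 );
    }
    if ( $color =~ /^#([0-9a-f])([0-9a-f])([0-9a-f])$/i ) {
        # Three-digit shorthand: each digit is doubled (#fa0 is #ffaa00).
        return map { hex("$_$_") } ( $1, $2, $3 );
    }
    if ( $color =~ /^rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)$/i ) {
        # No range check here, so rgb(257, 0, 0) passes through unchanged.
        return ( $1, $2, $3 );
    }
    return;
}
```

Like the script itself (see LIMITATIONS), this sketch does not reject out-of-range channel values.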

inverse_color( CSS color value )

Take a CSS color value and return an inverse color as a six-digit hex string.

average( list )

Average a list of numbers. Used by contrasting_color().

contrasting_color( CSS color string )

Test whether a CSS color is greyish (all the rgb values are fairly close together if not equal), and return black or white depending on how dark the grey is. Otherwise, return the complementary color. (The complement of a greyish color tends to be equally greyish and too low-contrast.) This is an imperfect algorithm because often a color can be far from greyish and still its complementary color doesn't look good in contrast to it. If that happens with a given epub, it's probably best to run with --nocolors.
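
The greyish test can be sketched from the thresholds given in the CHANGELOG (a spread of no more than 32 between the highest and lowest channel, then a cutoff at 128 on the average). The name grey_contrast is hypothetical, the return values are plain color keywords for simplicity, and treating an average of exactly 128 as "light" is an assumption, since the text above does not cover that case.

```perl
use strict;
use warnings;
use List::Util qw(max min sum);

# Hypothetical sketch: return 'white' or 'black' for a greyish rgb triple,
# or undef if the color is not greyish (the caller would then fall back
# to the complementary color).
sub grey_contrast {
    my @rgb = @_;
    return undef if max(@rgb) - min(@rgb) > 32;    # not greyish
    my $avg = sum(@rgb) / @rgb;
    return $avg < 128 ? 'white' : 'black';         # 128 itself: assumption
}
```

A dark grey such as (40, 40, 50) gets 'white', a light grey such as (200, 210, 220) gets 'black', and a saturated color such as (255, 0, 0) returns undef.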

fix_css_colors( css text, css filename, epub filename )

Takes a string representing the contents of a CSS file or a <style> element, plus filename context for debugging purposes, and returns a corrected version of the CSS or undef if no changes were needed.

Basically, it supplies a contrasting background color for those elements that have only a foreground color.

TODO: account for the rarer case where the background color is specified but the foreground color is not. This is the case with Programming Perl.epub:

tr:nth-of-type(even) { background-color: #f1f6fc; }
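
The main case (a foreground color with no background) might be handled along the following lines. This is a regex-based sketch in the spirit of the description, not the script's actual code: the names fix_block and add_missing_backgrounds are hypothetical, and the choice of background color is delegated to a callback so the example stays self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: given the text between { and } of one CSS rule,
# add a background-color if the rule sets color: but no background.
sub fix_block {
    my ( $body, $pick_background ) = @_;
    return $body if $body =~ /(?<![\w-])background(?:-color)?\s*:/i;
    return $body unless $body =~ /(?<![\w-])color\s*:\s*([^;}]*[^;}\s])/i;
    my $fg = $1;
    my $bg = $pick_background->($fg);
    $body =~ s/[;\s]*$//;    # drop trailing semicolons and whitespace
    return "$body; background-color: $bg; ";
}

# Returns the corrected CSS, or undef if nothing needed changing.
sub add_missing_backgrounds {
    my ( $css, $pick_background ) = @_;
    my $fixed = $css;
    $fixed =~ s/\{([^{}]*)\}/'{' . fix_block($1, $pick_background) . '}'/ge;
    return $fixed eq $css ? undef : $fixed;
}
```

With a callback that always returns '#ffffff', the rule "p { color: #333; }" becomes "p { color: #333; background-color: #ffffff; }", while rules that already set a background, or set no color at all, are left alone.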

force_darkmode( CSS text, CSS filename, EPUB filename )

Set the body { background-color: black } and invert all the color: elements found in CSS.
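
As an illustration, the sketch below handles only six-digit hex values and assumes that "inverse" means subtracting each channel from 255; neither detail is confirmed by the text above. The names invert_hex and darkmode_sketch are hypothetical, and background-color: declarations are left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: '#333333' -> '#cccccc' (255 minus each channel).
sub invert_hex {
    my ($hex) = @_;
    my @rgb = map { 255 - hex($_) } ( $hex =~ /([0-9a-f]{2})/gi );
    return sprintf '#%02x%02x%02x', @rgb;
}

# Hypothetical sketch: prepend a black body background and invert every
# six-digit hex value that appears in a color: declaration.
sub darkmode_sketch {
    my ($css) = @_;
    $css =~ s/(?<![\w-])color\s*:\s*\K(#[0-9a-f]{6})\b/invert_hex($1)/gie;
    return "body { background-color: black }\n" . $css;
}
```

Given "p { color: #333333; }", this returns the body rule followed by "p { color: #cccccc; }".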

remove_css_colors( CSS text, CSS filename, EPUB filename )

Return corrected CSS text with all color: and background-color: declarations removed, or undef if no color: or background-color: declarations were found.
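
A regex along these lines would do the stripping. It is a sketch under assumptions: the name strip_colors is hypothetical, and it deliberately leaves properties such as border-color: alone, which may or may not match what the script does.

```perl
use strict;
use warnings;

# Hypothetical sketch: delete color: and background-color: declarations.
# Returns the modified CSS, or undef if there was nothing to remove.
sub strip_colors {
    my ($css) = @_;
    # The lookbehind stops 'border-color' and similar from matching.
    my $count = ( $css =~ s/(?<![\w-])(?:background-)?color\s*:[^;}]*;?//gi );
    return $count ? $css : undef;
}
```

The undef return mirrors the convention used by the other CSS-fixing functions, so a caller can tell "nothing to do" apart from "changed".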

corrected_font_size( font-size string )

Look for absolute font size units (pt, px, pc, in, cm, mm) and correct them. Adjust pt sizes to percentages and remove any other absolute font sizes. Return the corrected font-size string (which may be empty).
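
The pt-to-percentage arithmetic is simply 100 * size / baseline. The sketch below shows it together with the "remove other absolute units" rule. The name font_size_sketch is hypothetical, and rounding to a whole percentage is an assumption; the text above does not say how fractional results are handled.

```perl
use strict;
use warnings;

# Hypothetical helper, not the script's own corrected_font_size().
# $baseline is the pt size that maps to 100% (12 unless overridden).
sub font_size_sketch {
    my ( $size, $baseline ) = @_;
    $baseline //= 12;
    my $number = qr/\d+(?:\.\d+)?/;

    # pt sizes become a percentage of the baseline, rounded (assumption).
    if ( $size =~ /^\s*($number)\s*pt\s*$/i ) {
        return sprintf '%.0f%%', 100 * $1 / $baseline;
    }

    # Other absolute units are dropped: return an empty string.
    return '' if $size =~ /^\s*$number\s*(?:px|pc|in|cm|mm)\s*$/i;

    # Relative sizes (em, %, keywords) pass through untouched.
    return $size;
}
```

With a baseline of 10, '12pt' becomes '120%' and '24pt' becomes '240%', matching the --fontsize example in the SYNOPSIS; '16px' becomes the empty string and '1.2em' is returned as-is.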

fix_css_fontsize( CSS text, CSS filename, EPUB filename )

Return corrected CSS with bad font-size attributes corrected or removed, or undef if no corrections were needed.

get_css_from_style_element( HTML text )

Search HTML file contents, supplied as a string, for <style> elements and return the contents.

swap_in_corrected_style_element( HTML text, CSS text )

Replace the contents of the first <style> element with the CSS text argument.

N.B. This might not produce the desired results for an epub that has multiple style elements in each .html file.
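
Both operations can be sketched with non-greedy regular expressions. The names get_style_css and swap_first_style are hypothetical; the sketch does no real HTML parsing, so it shares the limitation noted above about multiple <style> elements.

```perl
use strict;
use warnings;

# Hypothetical sketch: return the contents of every <style> element
# (call in list context).
sub get_style_css {
    my ($html) = @_;
    my @contents = ( $html =~ m{<style\b[^>]*>(.*?)</style>}gis );
    return @contents;
}

# Hypothetical sketch: replace the contents of the first <style> element
# only, keeping its opening tag and attributes as they were.
sub swap_first_style {
    my ( $html, $css ) = @_;
    $html =~ s{(<style\b[^>]*>).*?(</style>)}{$1$css$2}is;
    return $html;
}
```

Because the substitution has no /g flag, a second <style> element in the same file would keep its original contents.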

make_corrections_to_css( CSS text, CSS filename, EPUB filename )

Based on command line options about how to fix things, call the appropriate functions to transform CSS. Return a two-element array; the first is the corrected CSS, or undef if there are no changes, and the second is the number of types of changes made (0-2).

add_title_suffix( EPUB zip object, EPUB filename )

Append the --title option string to the <dc:title> element in the .opf file of the target epub.
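
On the text of an .opf file, the title edit could look something like this. The name append_to_title is hypothetical, it works on a plain string rather than an Archive::Zip member, and the single space inserted before the suffix is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sketch: add $suffix to the end of the first <dc:title>.
# Returns the modified .opf text, or undef if no <dc:title> was found.
sub append_to_title {
    my ( $opf, $suffix ) = @_;
    my $n = ( $opf =~ s{(<dc:title\b[^>]*>)(.*?)(</dc:title>)}{$1$2 $suffix$3}s );
    return $n ? $opf : undef;
}
```

For instance, '<dc:title>My Book</dc:title>' with the suffix '[removed colors]' becomes '<dc:title>My Book [removed colors]</dc:title>'. The sketch does not escape XML special characters in the suffix.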

save_modified_zip( EPUB zip object, EPUB filename )

Save the modified EPUB object to disk, either appending '_fixed' (or the --filesuffix string) to the modified file's name, or saving it under the original name and appending '_orig' to the original file's name.
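
The naming rules can be sketched separately from the actual writing. The function output_names and its option keys are hypothetical; it only computes the two filenames and does not touch the disk or Archive::Zip.

```perl
use strict;
use warnings;

# Hypothetical sketch: work out where the backup and the corrected epub go.
# Returns (backup name or undef, output name).
sub output_names {
    my ( $epub, %opt ) = @_;
    my $suffix = $opt{filesuffix} // '_fixed';
    ( my $stem = $epub ) =~ s/\.epub$//i;
    return $opt{replace}
        ? ( "${stem}_orig.epub", $epub )         # --replace: keep a backup
        : ( undef, "$stem$suffix.epub" );        # default: write alongside
}
```

So output_names('example.epub') yields (undef, 'example_fixed.epub'), and with replace => 1 it yields ('example_orig.epub', 'example.epub').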

main function

Process command line options, then iterate over filenames and directories given on the command line. If a directory is found, iterate over it and collect the .epub filenames, appending them to @ARGV; otherwise, process the epub file by checking for .css files, then for .html files, and dealing with the CSS text in both appropriately. Save the resulting file if there were any changes, then repeat for the next file.
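
The directory-scanning step might look like the sketch below. Note that it swaps in the core File::Find module in place of DirwalkCallback, whose interface is not described here, and that the name collect_epubs is hypothetical.

```perl
use strict;
use warnings;
use File::Find ();

# Hypothetical sketch: list the .epub files in $dir, descending into
# subdirectories only when $recurse is true.
sub collect_epubs {
    my ( $dir, $recurse ) = @_;
    my @found;
    if ($recurse) {
        File::Find::find(
            sub { push @found, $File::Find::name if -f $_ && /\.epub$/i },
            $dir
        );
    }
    else {
        opendir my $dh, $dir or die "Can't open $dir: $!";
        @found = map { "$dir/$_" }
                 grep { /\.epub$/i && -f "$dir/$_" } readdir $dh;
        closedir $dh;
    }
    my @sorted = sort @found;
    return @sorted;
}
```

The returned names could then be pushed onto @ARGV, as the description above says the main function does.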