Google frees nifty ML image-compression model… but it’s for JPEG-XL

A new application of machine learning looks both clever and handy, as opposed to the more normal properties of being somewhere between privacy-, copyright-, or life-endangering. But before you get too excited, you can’t have it.

The true cost of ML applications varies. Many are free to use, which means they endanger the paid income of someone somewhere. Speech recognition puts poor people in call centers out of work. “AI” image generators deprive creative artists of their income, and “AI” text generators threaten writers – in those few jobs which survived the web destroying print journalism, anyway.

Applying ML to image compression and decompression seems like a relatively safe use. Adding more smarts to image compression has felt like an inspired idea waiting for its moment ever since Michael Barnsley invented fractal image compression in 1987.

The new attention center model does something different: It uses machine learning to attempt to identify which parts of an image will attract a human’s attention first, so that it can selectively decompress those regions first.

Load the important bits first

If you’re old enough to remember watching GIF images gradually appear, line by line, as they downloaded over a dial-up modem, you will immediately grasp the appeal. But now it’s more about mobile and wireless connections, whose speed varies not only wildly but unpredictably.

The idea is that a low-res version of the whole image appears right at the start, and by the time your visual cortex has decided where to point your pupils, that area of the image is already getting sharpened up. Then, as your attention roams around the picture, the algorithm has already guessed where your eyes will go next and fills in more detail there. Once those parts are fairly sharp, the rest is filled in, the relatively boring bits last of all.
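To make that loading order concrete, here is a minimal sketch in Python. It is not Google's model or the JPEG-XL codec: it assumes the image is split into square tiles and that some predictor has already supplied an attention center, then simply sorts the tiles by distance from that point. The function name `tile_order` and its parameters are hypothetical, invented for this illustration.

```python
from math import hypot


def tile_order(width, height, tile, center):
    """Return tile origins (x, y), nearest to the attention center first.

    Hypothetical helper for illustration only: `center` stands in for
    the point an attention model would predict.
    """
    cx, cy = center
    # Top-left corner of every tile in the image, in raster order.
    tiles = [(x, y)
             for y in range(0, height, tile)
             for x in range(0, width, tile)]

    def distance(origin):
        x, y = origin
        # Midpoint of the tile, allowing for partial tiles at the edges.
        mx = x + min(tile, width - x) / 2
        my = y + min(tile, height - y) / 2
        return hypot(mx - cx, my - cy)

    return sorted(tiles, key=distance)


# A 512x512 image in 256-pixel tiles, attention center near the top right.
order = tile_order(512, 512, 256, center=(400, 100))
print(order)  # [(256, 0), (0, 0), (256, 256), (0, 256)]
```

In this toy version the top-right tile is sharpened first and the bottom-left one last. A real decoder would also have to deliver the low-res preview of the whole image before any of this ordering came into play.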

If it worked well enough, you probably wouldn’t even notice it happened. The illusion would be that a perfectly sharp version appeared right at the start. We recommend playing with this demonstration, so long as you have a Chromium-based browser and you enable its experimental JPEG-XL image renderer: go to chrome://flags, search for jxl and enable it.

The algorithm is described in a post titled “Open sourcing the attention center model” on Google’s open source blog… and therein lies the irony, and the reason that the preceding paragraph used the conditional mood. This feature uses the new JPEG-XL image format – the one that Google said back in October it would remove from future versions of Chrome.

It would be unjustifiably and indefensibly cynical of us to suggest that Google is willing to open source the tech only because the format is to be removed from Chrome 110, so we won’t. ®
