kageru.moe

Abstract

In order to remedy the effects of lossy compression of digital media files, dither is applied to randomize quantization errors and thus avoid or remove distinct patterns which are perceived as unwanted artifacts. This can be used to remove banding artifacts by adding random pixels along banded edges which will create the impression of a smoother gradient. The resulting image will often be more resilient to lossy compression, as the added information is less likely to be omitted by the perceptual coding algorithm of the encoding software.
Wikipedia explains it like this:

High levels of noise are almost always undesirable, but there are cases when a certain amount of noise is useful, for example to prevent discretization artifacts (color banding or posterization). [. . .] Noise added for such purposes is called dither; it improves the image perceptually, though it degrades the signal-to-noise ratio.
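
To make the effect of dither more concrete, here is a minimal sketch in plain Python. It is not part of the script described in this article. It quantizes a synthetic gradient to four levels, once with plain rounding and once with uniform noise added before quantization, and then compares how well the local average of each result matches the original gradient. The level count, sample count, and window size are all arbitrary assumptions chosen to exaggerate the banding.

```python
import math
import random

LEVELS = 4      # hypothetical: very coarse quantization to exaggerate banding
SAMPLES = 1000  # width of the synthetic gradient
WINDOW = 50     # neighbourhood over which the viewer is assumed to average


def quantize(value, dither, rng):
    """Quantize a value in [0, 1] to LEVELS levels, optionally with dither."""
    scaled = value * (LEVELS - 1)
    # Without dither: round to nearest. With dither: add uniform noise that
    # spans one quantization step, then truncate.
    offset = rng.random() if dither else 0.5
    level = min(LEVELS - 1, math.floor(scaled + offset))
    return level / (LEVELS - 1)


def local_error(gradient, quantized):
    """Mean absolute difference between the windowed averages of both signals."""
    errors = []
    for start in range(0, SAMPLES - WINDOW + 1, WINDOW):
        original = sum(gradient[start:start + WINDOW]) / WINDOW
        result = sum(quantized[start:start + WINDOW]) / WINDOW
        errors.append(abs(original - result))
    return sum(errors) / len(errors)


rng = random.Random(0)
gradient = [i / (SAMPLES - 1) for i in range(SAMPLES)]
plain = [quantize(v, False, rng) for v in gradient]
dithered = [quantize(v, True, rng) for v in gradient]

err_plain = local_error(gradient, plain)
err_dithered = local_error(gradient, dithered)
print(f"banded: {err_plain:.4f}  dithered: {err_dithered:.4f}")
```

The plain version consists of four flat bands, while the dithered version is noisy per pixel but much closer to the original gradient once averaged over a small area, which is roughly what the eye does.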

In video encoding, especially regarding anime, this is utilized by debanding filters to improve their effectiveness and to “prepare” the image for encoding. While grain may be beneficial under some circumstances, it is generally perceived as an unwanted artifact, especially in brighter scenes where banding artifacts would be less likely to occur even without dithering. Most Blu-rays released today will already have grain in most scenes which will mask most or even all visual artifacts, but for reasons described here, it may be beneficial to remove this grain.

As mentioned previously, most debanding filters will add grain to the image, but in some cases this grain might be either too weak to mask all artifacts or too faint to survive encoding, in which case the encoding software removes it and the banding resurfaces. In the past, scripts like GrainFactory were written to specifically target dark areas to avoid the aforementioned issues without affecting brighter scenes.

This idea can be further expanded by using a continuous function to determine the grain's strength based on the average brightness of the frame as well as the brightness of every individual pixel. This way, the problems described above can be solved with less grain, especially in brighter areas and bright scenes where the dark areas are less likely to be the focus of the viewer's attention. This improves the perceived quality of the image while simultaneously saving bitrate due to the absence of grain in brighter scenes and areas.

Demonstration and Examples

Since there are two factors that will affect the strength of the grain, we need to analyze the brightness of any given frame before applying any grain. This is achieved by using the PlaneStats function in Vapoursynth. The following clip should illustrate the results. The brightness of the current frame is always displayed in the top left-hand corner. The surprisingly low values in the beginning are caused by the 21:9 black bars. (Don't mind the stuttering in the middle. That's just me being bad)
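
The effect of the black bars on the measured brightness can be sketched in plain Python. PlaneStats reports the average as the mean of all pixel values normalized to the range 0 to 1, and the helper below imitates that for a frame stored as a list of rows. The frame size, the pixel value of 128, and the use of 0 for black (full range instead of limited range) are assumptions made only for this example.

```python
def average_luma(frame, peak=255):
    """Mean pixel value of a 2D list of luma samples, normalized to [0, 1]."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / (count * peak)


WIDTH, HEIGHT = 192, 108  # hypothetical downscaled 16:9 frame
ACTIVE_ROWS = 80          # rows that contain picture; the rest are black bars
BAR_ROWS = (HEIGHT - ACTIVE_ROWS) // 2

picture = [[128] * WIDTH for _ in range(ACTIVE_ROWS)]
bars = [[0] * WIDTH for _ in range(BAR_ROWS)]
letterboxed = bars + picture + bars

luma_picture = average_luma(picture)
luma_letterboxed = average_luma(letterboxed)
print(round(luma_picture, 3), round(luma_letterboxed, 3))  # prints 0.502 0.372
```

The bars pull the average down in proportion to the area they cover, so a mid-grey picture is measured as a noticeably darker frame.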

You can download the video if your browser is not displaying it correctly.

In the dark frames you can see banding artifacts which were created by x264's lossy compression algorithm. Adding grain fixes this issue by adding more randomness to the gradients.

Download

By using the script described above, we are able to remove most of the banding without lowering the crf, increasing aq-strength, or graining other surfaces where it would have decreased the image quality.

Theory and Explanation

The script works by generating a copy of the input clip in advance and graining that copy. For each frame in the input clip, a mask is generated based on the frame's average luma and the individual pixel's value. This mask is then used to apply the grained clip with the calculated opacity. The generated mask for the previously used clip looks like this:

Download
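
Applying the grained clip "with the calculated opacity" is conceptually a per-pixel linear blend between the original and the grained clip. The sketch below shows that idea for a single row of 8-bit pixels in plain Python. The function name and the sample values are made up for illustration, and the exact rounding used by std.MaskedMerge may differ.

```python
def masked_merge(original, grained, mask, peak=255):
    """Blend two rows of pixels; a mask value of peak selects the grained pixel."""
    merged = []
    for a, b, m in zip(original, grained, mask):
        # m / peak is the opacity of the grained pixel at this position.
        merged.append(round(a + (b - a) * m / peak))
    return merged


original = [100, 100, 100]
grained = [110, 90, 110]
mask = [0, 255, 128]  # no grain, full grain, roughly half opacity

print(masked_merge(original, grained, mask))  # prints [100, 90, 105]
```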

The brightness of each pixel is calculated using this polynomial:

z = (1 - (1.124x - 9.466x^2 + 36.624x^3 - 45.47x^4 + 18.188x^5))^(y^2 * 10)

where x is the luma of the current pixel, y is the current frame's average luma, and z is the resulting brightness of the corresponding mask pixel. The factor 10 in the exponent is a parameter called luma_scaling, which will be explained later.
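
To get a feeling for the formula, it can be evaluated directly for a few values. The following plain Python sketch does that. The function name mask_value and the clamping of the base are additions for this example and are not taken from the script.

```python
def mask_value(x, y, luma_scaling=10):
    """Evaluate the mask formula for pixel luma x and frame luma y, both in [0, 1]."""
    poly = (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3
            - 45.47 * x ** 4 + 18.188 * x ** 5)
    # Clamp the base to [0, 1] so floating-point error near x = 1 cannot
    # produce a negative base (an addition for this sketch).
    base = min(1.0, max(0.0, 1 - poly))
    return base ** (y ** 2 * luma_scaling)


for y in (0.1, 0.5, 0.9):
    row = [round(mask_value(x, y), 3) for x in (0.0, 0.25, 0.5, 0.75)]
    print(f"frame luma {y}: {row}")
```

Black pixels (x = 0) always get full opacity, and for any other pixel the opacity drops as the frame gets brighter, because the base is below 1 and the exponent grows with y.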

The polynomial is applied to every pixel and every frame. All luma values are floats between 0 (black) and 1 (white). For performance reasons the precision of the mask is limited to 8 bits, and the frame brightness is rounded to 1000 discrete levels. All lookup tables are generated in advance, significantly reducing the number of necessary calculations.
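
The precomputation can be sketched without NumPy or Vapoursynth. The version below builds one 256-entry table for each of the 1000 frame brightness levels and then looks up a mask value by rounding the frame's average luma to the nearest level. The names build_tables and lookup are hypothetical, and the clamp on the base is again an addition for this sketch.

```python
def mask_value(x, y, luma_scaling=10):
    """Mask formula for pixel luma x and frame luma y, both in [0, 1]."""
    poly = (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3
            - 45.47 * x ** 4 + 18.188 * x ** 5)
    base = min(1.0, max(0.0, 1 - poly))  # clamp added for this sketch
    return base ** (y ** 2 * luma_scaling)


BITS = 8                # mask precision
FRAME_LEVELS = 1000     # discrete frame brightness levels
PEAK = (1 << BITS) - 1  # 255 for 8 bits


def build_tables(luma_scaling=10):
    """Build one lookup table per discrete frame luma level."""
    tables = []
    for level in range(FRAME_LEVELS):
        y = level / FRAME_LEVELS
        table = [round(mask_value(value / (1 << BITS), y, luma_scaling) * PEAK)
                 for value in range(1 << BITS)]
        tables.append(table)
    return tables


def lookup(tables, frame_luma, pixel):
    """Mask value for an 8-bit pixel in a frame with the given average luma."""
    level = round(frame_luma * (FRAME_LEVELS - 1))
    return tables[level][pixel]


tables = build_tables()
print(lookup(tables, 0.1, 128), lookup(tables, 0.9, 128))
```

After the tables are built, generating a mask only requires one table selection per frame and one array lookup per pixel, which is why the per-frame cost stays low.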

Here are a few examples to better understand the masks generated by the aforementioned polynomial.

Generally, the lower a frame's average luma, the more grain is applied even to the brighter areas. This abuses the fact that our eyes are instinctively drawn to the brighter part of any image, making the grain less necessary in images with an overall very high luma.

Plotting the polynomial for all y-values (frame luma) results in the following image (red means more grain and yellow means less or no grain):

More detailed versions can be found here (100 points per axis) or here (400 points per axis).
Now that we have covered the math, I will quickly go over the Vapoursynth script.


import vapoursynth as vs
import numpy as np
import functools

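def adaptive_grain(clip, source=None, strength=0.25, static=True, luma_scaling=10, show_mask=False):
    """Apply grain to clip through a mask based on frame and pixel luma."""

    def fill_lut(y):
        # Build the lookup table for a frame with average luma y:
        # every possible pixel value x is mapped to a mask value z.
        x = np.arange(0, 1, 1 / (1 << src_bits))
        z = (1 - (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3 - 45.47 * x ** 4 + 18.188 * x ** 5)) ** (
            (y ** 2) * luma_scaling) * ((1 << src_bits) - 1)
        z = np.rint(z).astype(int)
        return z.tolist()

    def generate_mask(n, clip):
        # Pick the precomputed table that matches this frame's average luma.
        frameluma = round(clip.get_frame(n).props.PlaneStatsAverage * 999)
        table = lut[int(frameluma)]
        return core.std.Lut(clip, lut=table)

    # Note: newer Vapoursynth releases provide the core as vs.core.
    core = vs.get_core(accept_lowercase=True)
    if source is None:
        source = clip
    if clip.num_frames != source.num_frames:
        raise ValueError('The length of the filtered and unfiltered clips must be equal')
    # The mask is calculated with 8 bit precision for performance reasons.
    source = core.fmtc.bitdepth(source, bits=8)
    src_bits = 8
    clip_bits = clip.format.bits_per_sample

    # Precompute one lookup table for each of the 1000 frame brightness levels.
    lut = [None] * 1000
    for y in np.arange(0, 1, 0.001):
        lut[int(round(y * 1000))] = fill_lut(y)

    # Extract the luma plane and attach its average brightness as a frame property.
    luma = core.std.ShufflePlanes(source, 0, vs.GRAY)
    luma = core.std.PlaneStats(luma)
    grained = core.grain.Add(clip, var=strength, constant=static)

    # Build the mask per frame and match it to the size of the clip.
    mask = core.std.FrameEval(luma, functools.partial(generate_mask, clip=luma))
    mask = core.resize.Bilinear(mask, clip.width, clip.height)

    if src_bits != clip_bits:
        mask = core.fmtc.bitdepth(mask, bits=clip_bits)

    if show_mask:
        return mask

    # Merge the grained clip onto the input using the mask as opacity.
    return core.std.MaskedMerge(clip, grained, mask)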

Thanks to Frechdachs for suggesting the use of std.FrameEval.

In order to adjust for things like black bars, the curves can be manipulated by changing the luma_scaling parameter. Higher values will cause comparatively less grain even in darker scenes, while lower values will increase the opacity of the grain even in brighter scenes.
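
The influence of luma_scaling can be checked by evaluating the formula for one fixed pixel and frame brightness with different values of the parameter. This is a plain Python sketch. The function name mask_value, the clamp, and the chosen brightness values are assumptions for this example.

```python
def mask_value(x, y, luma_scaling):
    """Mask formula for pixel luma x and frame luma y, both in [0, 1]."""
    poly = (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3
            - 45.47 * x ** 4 + 18.188 * x ** 5)
    base = min(1.0, max(0.0, 1 - poly))  # clamp added for this sketch
    return base ** (y ** 2 * luma_scaling)


pixel_luma, frame_luma = 0.3, 0.3  # hypothetical dark pixel in a dark frame
opacities = {s: mask_value(pixel_luma, frame_luma, s) for s in (5, 10, 20)}
for scaling, opacity in opacities.items():
    print(f"luma_scaling={scaling}: grain opacity {opacity:.2f}")
```

Since the base of the power is never above 1, a larger exponent can only lower the result, so raising luma_scaling reduces the grain opacity for every pixel that is not completely black.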

Usage

The script has six parameters, all of which are optional except clip.

Parameter	[type, default]	Explanation
clip	[clip]	The filtered clip that the grain will be applied to.
source	[clip, None]	The clip whose luma is used to generate the mask. Defaults to clip.
strength	[float, 0.25]	Strength of the grain generated by AddGrain.
static	[boolean, True]	Whether to generate static or dynamic grain.
luma_scaling	[float, 10]	This value changes the general grain opacity curve. Lower values will generate more grain, even in brighter scenes, while higher values will generate less, even in dark scenes.
show_mask	[boolean, False]	Whether to return the generated mask instead of the grained clip.

Closing Words

Grain is a type of visual noise that can be used to mask discretization artifacts if used correctly. Too much grain will degrade the perceived quality of a video, while too little grain might be destroyed by the perceptual coding techniques used in many popular video encoders.

The script described in this article aims to apply the optimal amount of grain to all scenes to prevent banding artifacts without having a significant impact on the perceived image quality or the required bitrate. It does this by taking the brightness of the frame as a whole and every single pixel into account and generating an opacity mask based on these values to apply grain to certain areas of each frame. This can be used to supplement or even replace the dither generated by other debanding scripts. The script has a noticeable but not significant impact on encoding performance.

Download

There's probably a much simpler way to do this, but I like this one. fite me m8