MaskTools


Overview

Author: Kurosu
Version: 1.4.2
Download: http://www.avisynth.org/warpenterprises/
Category: Misc plugins
Requirements: YV12 colourspace

Table of contents

I) Disclaimer (don't skip that part, but I don't force you to learn it either)
II) What it is
1) Simple version
2) Description
III) Revisions
IV) Developer's walkthrough
V) Functions description
VI) Some practical uses
1) MSharpen
2) MSoften
3) Rainbow reduction
4) Supersampled fxtoon
5) Warpsharp for dark luma
6) pseudo-deinterlacer (chroma will still be problematic)
7) Non-rectangular overlays
8) Replace backgrounds
9) K-mfToon ;-)
VII) TODO

I) Disclaimer

This plugin is released under the GPL license. You must agree to the terms of 'Copying.txt' before using the plugin or its source code.

You are also advised to use it in a philanthropic state of mind, i.e. not "I'll keep this secret for myself".

Last but not least, only a small part of all possible uses of each filter was tested (maybe 5% - still a couple of hours spent debugging ;-). Therefore, feedback is _very_ welcome (the opposite - lack of feedback - is also true...)

II) What it is

1) Simple version
After processing, you may need to keep only a part of the output. Say you have a clip named smooth that is the result of smoothing (blur() for instance) a clip named source.
Most of the noise from source has disappeared in smooth, but so have details. You could therefore want to keep only the filtered pixels, and discard those where there are big differences in colour or brightness. That's what MSmooth by D. Graft does, for instance. Now consider an image on which you write the pixels from smooth that you want to keep as white pixels, and the other ones, from source, as black pixels. You get what is called a mask. MaskTools deals with the creation, enhancement and manipulation of such masks for each plane of the YV12 colourspace.

2) Description
This Avisynth 2.5 YV12-only plugin offers several functions manipulating clips as masks:
- EdgeMask will build a mask of the edges of a clip, applying thresholds (suitable values will enable or disable them).
- Inflate will 'inflate' the high values in a plane, by putting in the output plane either the average of the 8 neighbours, if it is higher than the original value, or the original value otherwise. The opposite function is called Deflate (dedicated to Phil Katz).
- Expand will 'expand' the high values in a plane, by putting in the output the maximum value of the 3x3 neighbourhood around the input pixel. The opposite function is called Inpand.
- Invert will invert the pixels (i.e. out = 255 - in); this can also be used to apply a 'solarize' effect to the picture.
- Binarize will binarize the input picture depending on a threshold and a command.
- YV12Substract is the same as Subtract, but works in YV12 and *should* be a bit faster (because it is MMX-optimised).
- YV12Layer is the equivalent of most of the original Layer's operations, but in YV12; the alpha channel isn't used, so it might or might not serve the same purpose.
- Logic will perform the most typical logical operations (in fact, the ones provided by MMX mnemonics, though C functions are still available, mainly because of the picture dimension limits).
- FitY2UV/FitY2U/FitY2V resize the Y plane and replace the UV/U/V plane(s) with the result of the resize (you can specify your resizer, even one that isn't built into AviSynth); the opposite functions are FitU2Y and FitV2Y. Fast<Function> is the "fast" (i.e. home-made) version, resizing only one plane.
- OverlayMask will compare 2 clips based on luminance and chrominance thresholds, and output whether pixels are close or not (similar to what ColorKeyMask does).
- MaskedMerge will take 3 clips and apply a weighted merge of the first and second clips according to the mask represented by the third.

In addition, all functions take 3 parameters: Y, U and V (except the FitPlane functions, where the name obviously tells what is processed). Depending on their value, different operations are applied to each plane:
- value=3 will apply the actual process of the filter,
- value=2 will copy the 2nd video plane (if applicable) to the corresponding output plane,
- value=1 will not process it (i.e., most often, leave it with the 1st clip's plane or garbage - check for yourself),
- value=[-255...0] will fill the output plane with -value (i.e. to get grey levels, use U=-128,V=-128)
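As an illustration (a Python sketch of the convention, not the plugin's actual code; `plane_mode` is a hypothetical helper name), the Y/U/V parameter convention boils down to:

```python
def plane_mode(mode, processed, clip2_plane, clip1_plane):
    """Sketch of the Y/U/V parameter convention (hypothetical helper)."""
    if mode == 3:                  # run the filter's actual process
        return processed
    if mode == 2:                  # copy the 2nd clip's plane (if applicable)
        return clip2_plane
    if mode == 1:                  # leave the plane unprocessed
        return clip1_plane
    if -255 <= mode <= 0:          # fill the plane with the constant -mode
        return [-mode] * len(clip1_plane)
    raise ValueError("unsupported mode")
```

For instance, Y=3,U=-128,V=-128 processes luma and fills both chroma planes with 128 (grey).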

A last point is the ability of some functions to process only a part of the frame:
- this behaviour is set by the parameters (offX, offY) (position of the starting point) and (w,h) (width and height of the processed area); filters will adjust those parameters so that the processed area lies inside the 2 pictures
- when a filter (except YV12Layer) uses 2 clips, the 2 clips must have the same dimensions
- in all cases, the picture must be at least MOD8 (sometimes MOD16) for the filter to use MMX functions (i.e. work at full speed)

This was intended for modularity and atomic operations (or as useful as possible), not really for speed. It became both bloated and slow. I'll let you decide whether this statement is totally true, or a bit less... The examples of VI) will most probably run much faster with the original filters.

III) Revisions

1.4.1
- Fixed the dreadful "multiple instances of a filter with different functions needed" bug

1.4.0
- Added an experimental LUT filter. Not tested; to be debugged later.

1.3.0 (private version)
- Made the FitPlane functions usable (still an overload of work when only one plane has to be resized); they were previously undocumented. Therefore, added the FastFitPlane functions (the corresponding FitPlane ones should be useless now, except for the resizer settings)
- Allowed the specification of a processing area for many filters; however, this should not produce any noticeable speed increase.
- Cleaned up YV12Layer (in particular the unusable "Darken"/"Lighten" modes)
- Added OverlayMask, a function that compares 2 clips and outputs a mask of the parts that are identical (slow and far from perfect).

1.2.0 (private version)
- YV12Layer: no more useless RGB32 conversion! It is approximately the same as Arithmetic (except that a third clip is not used), so that one is gone...
- YV12Substract: hey, why only a C version? Masks are really an underused feature of AviSynth |-[

1.1.0 (private version)
- The older inflate/deflate are renamed expand/inpand, while newer functions replace them
- Logic and Arithmetic functions added (they may not produce the expected results, as they are undebugged)
- EdgeMask now takes 4 thresholds (2 for luma and 2 for chroma). They are used for:
. setting a value to 0, or leaving it as is, depending on the first threshold,
. setting a value to 255, or leaving it as is, depending on the second one.

1.0.2 (last version - public project dropped):
- Fixed the shift for EdgeMask using sobel and roberts (misplaced MMX instruction)
- MaskedMerge now works (the mask was cleared before being used... check with MaskedMerge(clip3,clip3), for instance)

1.0.1: Initial release

IV) Developer's walkthrough

Skip to V) if you're not interested in developing the tools available.

The project is a VC++ 6 basic project. Each filter has its own folder which stores the header used by the interface, the source for the function members, the source for processing functions and its header. Let's look at EdgeMask:
- EdgeMask.h is included by the interface to know what the filter 'looks like' (but interface.cpp still holds the definition of the calling conventions and exported functions)
- EM_func.h describes the different processing functions (they should all have the same prototype/parameters):
. Line_MMX and Line_C
. Roberts_MMX and Roberts_C
. Sobel_MMX and Sobel_C
- EM_func.cpp, as all <filter's initials>_func.cpp, stores the implementation of the processing functions, and sometimes their MMX equivalents.
- EdgeMask.cpp implements the class; the constructor selects the appropriate processing function (MMX? C? Roberts? Line? Sobel?) and uses it to fill the generic protected function pointer used in GetFrame

Interface.cpp stores the export function and all of the calling functions (AVSValue ... Create_<filter>).

ChannelMode.cpp defines the channel operating modes. The equivalent of a debugprintf could be added there.

This quick walkthrough probably won't help most developers much, just like the examples of V) for users, but that's the best I've come up with so far. It will of course improve over time, depending on the success of the idea, whose main drawback, speed, will probably keep it scarcely used, if ever. <g>

V) Functions description (Y,U,V parameters described above)

1) EdgeMask(th1=[0..255], th2=[0..255], string):
The component is either Y, U or V; the derivative is the result of the convolution with the differential kernel. MMX code is only used with MOD8 widths (this restriction might not be necessary).
- string: which differential convolution to use:
. "line" will try to spot darker thin (1 pixel-wide) lines in the picture (it has an incorporated anti-noise floor which makes it ignore any variation lower than 3)
. "roberts" will apply a pseudo-Roberts 2x2 kernel:
2 -1
-1 0
. "sobel" will apply a pseudo-Sobel 3x3 kernel:
0 -1 0
-1 0 1
0 1 0
. "hq" uses a Laplacian kernel:
-1 -1 -1
-1 8 -1
-1 -1 -1
. "special" uses this kernel:
-1/4 0 -1/4
0 1 0
-1/4 0 -1/4
. "cartoon" uses the pseudo-Sobel kernel but only keeps components with lower value than their neighbours (i.e. negative values of the derivative)
- th1: threshold under which the component value is set to 0
- th2: threshold under which the component value is set to the derivative value, and at or above which it is set to 255
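In Python terms (a sketch of the per-pixel rule, not the MMX code; I treat a derivative exactly at th2 as going to 255), the two thresholds act like this:

```python
def edge_threshold(derivative, th1, th2):
    """EdgeMask's per-pixel thresholding: 0 below th1, the raw derivative
    between th1 and th2, and 255 from th2 upwards."""
    if derivative < th1:
        return 0
    if derivative < th2:
        return derivative
    return 255
```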

2) Inflate/Deflate(int offX, int offY, int w, int h)
(works in-place, so unprocessed planes contain the original data of the first clip) Replaces the component value by the average of its 8 neighbours if that average is respectively higher (or lower) than the original value. The 4 parameters specify the area to process, as for all other filters with such options. By default, the whole picture is processed.

3) Inpand/Expand(int offX, int offY, int w, int h) (works in-place)
Replaces the component value by the minimum (respectively maximum) value of the 9 pixels in its 3x3 neighbourhood.
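Per pixel, and following the definitions of II) (a sketch, not the MMX implementation), Expand and Inflate act on a 3x3 neighbourhood like this:

```python
def expand(n):
    """Expand: maximum of the 3x3 neighbourhood n (Inpand takes the minimum)."""
    return max(v for row in n for v in row)

def inflate(n):
    """Inflate: average of the 8 neighbours, kept only if it raises the
    centre value (Deflate keeps it only if it lowers the centre)."""
    centre = n[1][1]
    avg = (sum(v for row in n for v in row) - centre) // 8
    return max(centre, avg)
```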

4) Invert(int offX, int offY, int w, int h) (works in-place)
Replaces the pixel's value by 255 minus the pixel's value. Binarize(upper=false) could be seen as (but isn't processed as) Invert().Binarize(upper=true)

5) Binarize(threshold=[0..255], upper=true/false, int offX, int offY, int w, int h) (works in-place)
- threshold is the limit that tells when to output 0 or 255
- upper=true will replace values higher than the threshold by 0, and values lower than or equal to it by 255; upper=false simply does the opposite
- Note: in YV12, the luma range is 16-235 (yay! I like limitations due to some analogue legacy - it doesn't matter much, but it's still a pain to care for), so don't be surprised, and set some of your levels to 17...
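A Python sketch of the rule (MMX details aside):

```python
def binarize(value, threshold, upper=True):
    """Binarize: upper=True maps values above the threshold to 0 and the
    rest to 255; upper=False does the opposite."""
    if upper:
        return 0 if value > threshold else 255
    return 255 if value > threshold else 0
```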

6) input.MaskedMerge(clip clip2, clip clip3, int offX, int offY, int w, int h) (works in-place, mode=2 will copy the plane of clip2 to clip1)
output = [(255-clip3)*input + clip3*clip2]/256
It therefore isn't perfectly weighted/normalized, but will still do the job. One could see it as a simple (no alpha considered) Layer("add")
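The formula above, as integer arithmetic in Python; note that with a mask of 0 the output is 255/256 of the input, which is the imperfect normalization just mentioned:

```python
def masked_merge(src, overlay, mask):
    """Per-pixel MaskedMerge: mask=0 keeps (almost) src, mask=255 gives
    (almost) overlay -- the /256 makes it slightly off either way."""
    return ((255 - mask) * src + mask * overlay) // 256
```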

7) YV12Subtract(clip clip1, clip clip2, int tolerance, int offX, int offY, int w, int h) (works in-place, mode=2 will copy the plane of clip2 to clip1)
Performs exactly the same as Subtract, but uses MMX.
- if mode<0:
if clip1 < clip2, output = 255 + clip1 - clip2
else output = clip1 - clip2
- otherwise
output' = |clip2 - clip1| - tolerance
if output'<0, output = 0
else output = output'
The last mode can be used in many cases to process only the pixels that differ from an image (OverlayMask would be nicer, though)
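Both modes, written out per pixel in Python (a sketch of the formulas above, not the MMX code; I read the "if output'<0" branch as clipping to 0):

```python
def yv12_subtract(a, b, tolerance, mode):
    """Per-pixel YV12Subtract: mode<0 wraps the difference around 255,
    otherwise the absolute difference minus the tolerance, clipped at 0."""
    if mode < 0:
        return 255 + a - b if a < b else a - b
    d = abs(b - a) - tolerance
    return 0 if d < 0 else d
```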

8) YV12Layer(clip clip1, clip clip2, string operator, int level, bool chroma, int offX, int offY, int w, int h)
Same as Layer, except that "Lighten" and "Darken" were removed (the luma of the 2nd clip gave the weighting for each luminance, but it would have to be reused for chrominance, which isn't practical in the current architecture).

9) FitPlane(string resizer) has the following incarnations:
- luma to chroma: FitY2U, FitY2V, FitY2UV
- chroma to luma: FitU2Y, FitV2Y
- chroma to chroma: FitU2V, FitV2U
By this means, you can propagate a mask created on a particular plane to another plane.

10) FastFitPlane: it has the same incarnations as FitPlane, but uses its own resizer (it could replace AviSynth's C ReduceBy2 ;-)

11) OverlayMask(clip clip1, clip clip2, int tY, int tC, int strictness)
If the U and V values of a group of 4 pixels of clip1 (grouped due to YV12's 4:2:0 sampling) differ by less than tC from the chroma of clip2's corresponding pixels,
and if more than strictness pixels out of those 4 have a luma that differs from the corresponding pixels' luma by less than tY,
then the pixels satisfying the 2 conditions above are marked as (255,255,255) (process mask = full, for MaskedMerge)
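Read per 2x2 block (4 luma samples sharing one U/V pair in 4:2:0), the test can be sketched like this in Python; treating "more than strictness" as "at least strictness" is my interpretation:

```python
def overlay_mask_block(luma1, luma2, uv1, uv2, tY, tC, strictness):
    """One 2x2 block: white (255) if the chroma pair is within tC and at
    least `strictness` of the 4 luma samples are within tY."""
    chroma_close = all(abs(a - b) < tC for a, b in zip(uv1, uv2))
    close_lumas = sum(abs(a - b) < tY for a, b in zip(luma1, luma2))
    return 255 if chroma_close and close_lumas >= strictness else 0
```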

VI) Some practical uses (not tested extensively)

These won't produce exactly the same results as the original filters they try to mimic, besides being far slower. Despite the numerous additional functions, there are no new ideas.

Notes:
- I'm too lazy to update the syntax, especially regarding how mode=2 works, and how EdgeMask was updated (it no longer needs a Binarize, for instance)
- Some filters I describe as 'to create' already exist (ImageReader, Levels for clamping, ...).

1) MSharpen
. Build an EdgeMask of clip1, Binarize it, and store the result into clip3
. Apply any sharpening filter to clip1 and store the result into clip2
. return MaskedMerge(clip1,clip2,clip3):
The sharpened edges of clip2 whose mask value is higher than the threshold given to Binarize will replace their original value in clip1. You could also write a filter with a particular look-up table (the best would look like a bell curve), replace Binarize by it, and get a weighted sharpening depending on the edge value: that's the HiQ part in SmartSmoothHiQ

clip2 = clip1.<EdgeEnhancer>(<parameters>)
#U and V planes don't need filtering, Y needs it
#EdgeMask(<...>,"roberts",Y=3,U=-128,V=-128) for greyscale map
clip3 = clip1.EdgeMask(15,60,"roberts",Y=3,U=1,V=1)
return MaskedMerge(clip1,clip2,clip3)

2) MSoften
Replace the EdgeEnhancer by a spatial softener (cascaded blurs? SpatialSoftenMMX?) and use upper=true to select near-flat pixels.

3) Rainbow reduction (as described here: http://forum.doom9.org/showthread.php?s=&threadid=48167)
Warning: this isn't a miracle solution either
clip2 = clip1 soften at maximum (using deen("m2d") or edeen for instance)
#Get luma edgemap and increase edges by inflating
# -> wider areas to be processed
clip3 = clip1.EdgeMask(6,"roberts",Y=3,U=1,V=1).Inflate(Y=3,U=1,V=1)
#Now, use the luma edgemask as a chroma mask
clip3 = YtoUV(clip3,clip3).reduceby2().Binarize(15,upper=false,Y=1,U=3,V=3)
#We have to process pixels' chroma near edges, but keep intact Y plane
return MaskedMerge(clip1,clip2,clip3,Y=1,U=3,V=3)

4) Supersampled fxtoon
Not tested
. Use tweak to darken picture or make a plugin that scales down Y values -> clip2
. Build edge mask, Supersample this mask, Binarize it with a high threshold (clamping sounds better), Inflate it -> clip3
. Apply the darker pixels of clip2 depending on the values of clip3

5) Warpsharp for dark luma
Not tested
. Apply warpsharp -> clip2 (replacement pixels)
. Create a clamping filter or a low-luma bypass filter -> clip3 (mask)

6) pseudo-deinterlacer (chroma will still be problematic)
Not tested
. clip2 = clip1.SeparateFields().SelectEven().<Method>Resize(<parameters>)
. clip3 = clip1.<CombingDetector>(<parameters>)
. return MaskedMerge(clip1,clip2,clip3,Y=3,U=3,V=3)
(chroma even more problematic)

7) Non-rectangular overlays
In fact, this is handled more nicely by layer and mask...
#Simple hack because ImageReader needs an integer fps...
#Most sources are natively in YUY2/YV12
clip = avisource("test.avi").ConvertToYV12().assumefps(fps)
#Load the picture to be overlayed
image = ImageReader("mask.bmp",0,clip.framecount()-1,24,use_DevIl=false)
#Simple way: assume black is transparent 
#Any other colour would be quite more complicated*
mask=image.ConvertToYV12().Binarize(17,upper=false,Y=3)
#We set the luma mask to fit the chroma planes
mask=mask.FitY2UV()
#Now that we have the mask that tells us what we want to keep...
#Replace by image the parts of clip masked by mask!
maskedmerge(clip,image.ConvertToYV12(),mask,Y=3,U=3,V=3)
#*solution: mask=OverlayMask(image,image.BlankClip("$xxxxxx"),1,1)

8) Replace backgrounds
This example would clearly look better in RGB. To avoid the typical problems due to noise or compression, you had better use blurred versions of the clip and picture.
source=avisource("overlay.avi").assumefps(24)
#blur the source
clip=source.blur(1.58).blur(1.58).blur(1.58)
#load the background to replace, captured from the blurred sequence
bgnd=ImageReader("bgnd.ebmp",0,clip.framecount()-1,24,use_DevIl=false)
#load new background
new=ImageReader("new.ebmp",0,clip.framecount()-1,24,use_DevIl=false)
#integrated filter to output the mask = (is clip close to bgnd?)
mask=OverlayMask(clip,bgnd.ConvertToYV12(),10,10)
MaskedMerge(source,new.ConvertToYV12(),mask,Y=3,U=3,V=3)

9) K-mfToon
I need to include more info (original URLs/posts), but for now I think mfToon's original author, mf (mf@onthanet.net), won't react too violently to it while this is still not addressed.
The output of the function inside K-mfToon.avs should be identical to the output of the original mftoon.avs (also included), with twice the speed.
The requirements are:
- For mfToon:
. load the plugins called "MaskTools", "warpsharp", "awarpsharp"

VII) TODO

Nothing; it all depends on feedback