Google Summer of Code GSoC ideas

Frédéric Devernay edited this page Nov 19, 2015 · 12 revisions

This page contains ideas for development projects to be part of Google Summer of Code.

There are two main parts:

  • things that have to be done in the main application (called the host), which require skills in Qt and C++.
  • OpenFX plugins that have to be developed, which require skills in C++, computer graphics and image processing.

Natron host

Github project:

Natron heavily uses Qt, both for the GUI and for thread management. Natron still uses Qt 4, because the Python bindings are not yet available for Qt 5. We will do the transition as soon as they become available.

The Natron code is split into two main parts: Engine and Gui. Engine manages the images, projects (including loading and saving), parameters, interface with OpenFX plugins (via the HostSupport library), and the rendering itself. Gui manages the graphical interfaces to all these things.

Rotoscoping

Curves editing

  • Curves consisting of many samples, such as tracking results, results from external algorithms, or imported ASCII data, should be convertible to piecewise cubic curves for easier editing. Curves are easier to edit if there are no cusp points at the junctions (i.e. keyframes) between cubics, so derivatives should be continuous there. Several methods may be proposed and tested. Multi-dimensional parameters should share the same keyframes for all dimensions for easier editing.
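
As an illustration of one possible method (not necessarily the one Natron should adopt), the sketch below keeps every `step`-th sample as a keyframe and joins keyframes with cubic Hermite segments using Catmull-Rom tangents. Because each keyframe stores a single tangent shared by the segments on both sides, the first derivative is continuous at the junctions. The function names and the fixed-stride keyframe selection are assumptions made for this example; for a multi-dimensional parameter, the same keyframe times would be reused for every dimension.

```python
from bisect import bisect_right


def fit_keyframes(times, values, step):
    """Keep every `step`-th sample (plus the last one) as a keyframe and
    attach a Catmull-Rom tangent (finite-difference slope) to each.
    Needs at least two samples. Keyframe selection by fixed stride is an
    assumption of this sketch, not a recommendation."""
    idx = list(range(0, len(times), step))
    if idx[-1] != len(times) - 1:
        idx.append(len(times) - 1)
    kt = [times[i] for i in idx]
    kv = [values[i] for i in idx]
    slopes = []
    for j in range(len(kt)):
        lo, hi = max(j - 1, 0), min(j + 1, len(kt) - 1)
        slopes.append((kv[hi] - kv[lo]) / (kt[hi] - kt[lo]))
    return kt, kv, slopes


def evaluate(kt, kv, slopes, t):
    """Evaluate the piecewise cubic Hermite curve at time t."""
    j = min(max(bisect_right(kt, t) - 1, 0), len(kt) - 2)
    h = kt[j + 1] - kt[j]
    s = (t - kt[j]) / h
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * kv[j] + h10 * h * slopes[j]
            + h01 * kv[j + 1] + h11 * h * slopes[j + 1])
```

A least-squares fit of the tangents, or an adaptive choice of keyframes based on the fitting error, would be natural alternatives to test.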

Vectorscope, Waveform, Image analysis

  • Natron would benefit from vectorscope and waveform displays to complement the Histogram display.
  • Image analysis and picking can be improved by adding the possibility to draw a profile of the image values along a segment drawn by the user.
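
A minimal sketch of the profile idea, assuming a single-channel image stored as a list of rows and a segment given by two (x, y) points in pixel coordinates. The names `bilinear` and `profile` are hypothetical and only serve this example.

```python
def bilinear(image, x, y):
    """Bilinearly interpolated value at (x, y), clamped to the image bounds."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def profile(image, p0, p1, n):
    """Return n (n >= 2) values sampled at regular steps from p0 to p1."""
    (x0, y0), (x1, y1) = p0, p1
    return [bilinear(image,
                     x0 + (x1 - x0) * i / (n - 1),
                     y0 + (y1 - y0) * i / (n - 1))
            for i in range(n)]
```

The returned list can be plotted directly as a curve, one curve per channel for an RGBA image.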

Color selecting dialog/widget

  • We need a better color selection dialog (and widgets) than the platform-native one. For example, we need to be able to select negative values.

OpenFX plugins

Github projects:

An OpenFX plugin is usually a few hundred lines of code and follows the same global pattern.

There are several documentation resources available:

The missing plugins

This is a list of essential plugins that are missing from Natron. Some may be implementable as pyplugs, others as OpenFX plugins.

Channel

  • Copy (nothing more than a maskable Shuffle)

Color

  • HueCorrect
  • HueShift
  • Sampler (additionally, sampling could be done along a given segment rather than on a horizontal scanline)

Filter

Merge

Convolve

We need an FFT-based generic convolution node, preferably using fftw3.

Two inputs: Source and Filter, with optional mask.
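
To make the principle concrete, here is a pure-Python sketch of FFT-based 2D convolution: both inputs are zero-padded to a common power-of-two size, transformed, multiplied point by point, and transformed back. It uses a small textbook radix-2 FFT only to stay self-contained; an actual plugin would rely on a library such as fftw3, and the function names below are assumptions of this example. Masking is not shown.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (length must be a power of two), unnormalized."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft2(m, invert=False):
    """2D FFT: transform the rows, then the columns."""
    rows = [fft(r, invert) for r in m]
    cols = [fft(list(c), invert) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p


def convolve(source, kernel):
    """Linear convolution of source by kernel, cropped to the source size
    and centered on the kernel (odd-sized kernels assumed)."""
    sh, sw = len(source), len(source[0])
    kh, kw = len(kernel), len(kernel[0])
    H, W = next_pow2(sh + kh - 1), next_pow2(sw + kw - 1)

    def pad(m):
        return [[complex(m[y][x]) if y < len(m) and x < len(m[0]) else 0j
                 for x in range(W)] for y in range(H)]

    S, K = fft2(pad(source)), fft2(pad(kernel))
    product = [[S[y][x] * K[y][x] for x in range(W)] for y in range(H)]
    full = fft2(product, invert=True)
    oy, ox = kh // 2, kw // 2
    return [[full[y + oy][x + ox].real / (H * W) for x in range(sw)]
            for y in range(sh)]
```

Padding to at least source size plus kernel size minus one avoids the wrap-around that a circular convolution would otherwise introduce at the image borders.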

Lens aperture generators

We need basic lens aperture generators for Bokeh effects (circle, blades, with or without chromatic aberration, etc.). They can be combined with Convolve to get a Defocus effect.

The Flare node in Nuke does that, and a lot more.
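
A minimal sketch of an aperture generator, assuming a square single-channel kernel normalized to sum to 1 so that convolving with it preserves overall brightness. With fewer than three blades it produces a disc, otherwise a regular polygon inscribed in that disc. The function name and parameters are hypothetical; there is no anti-aliasing and no chromatic aberration in this example.

```python
import math


def aperture(size, blades=0, rotation=0.0):
    """Return a size x size aperture kernel (list of rows) summing to 1.

    blades < 3 gives a disc; blades >= 3 gives a regular polygon with that
    many sides, rotated by `rotation` radians.
    """
    c = (size - 1) / 2.0
    radius = size / 2.0
    kernel = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            dx, dy = x - c, y - c
            r = math.hypot(dx, dy)
            if blades >= 3:
                sector = 2 * math.pi / blades
                # angle measured from the normal of the nearest polygon edge
                a = (math.atan2(dy, dx) - rotation) % sector - sector / 2
                limit = radius * math.cos(sector / 2) / math.cos(a)
            else:
                limit = radius
            if r <= limit:
                kernel[y][x] = 1.0
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]
```

The resulting kernel can be fed to the Filter input of a convolution node to obtain a defocus effect.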

Zdefocus

The goal of depth-dependent defocus is, given a depth image and an all-in-focus color image, to blur each pixel in the color image depending on its depth. Many algorithms have been proposed in the literature, but some of them produce only coarse approximations of the result. The implementation in Natron should at least correctly handle occlusions by foreground objects. We propose the following method:

  • extract "depth slices" from the image, using the depth image. A depth slice is black and transparent everywhere except where the depth is within some range, where it is the original RGBA data (it has to be premultiplied if the input is unpremultiplied). If the value of stepBlend (see layerBlend below) is not zero, one pixel may belong to two slices, with linear interpolation between slices.
  • the number of blur steps can be fixed, or may be computed automatically: changes in blur magnitude smaller than 10% are generally indiscriminable (Mather and Smith 2002). This makes 2*log(10*3/2)/log(1.1) ≈ 56 depth slices for a maximum size of 10.
  • in-focus slice: do not blur below blur_size=2/3 (corresponds to sigma=0.275 ~= sqrt(1/12), the standard deviation of a uniform distribution over a pixel)
  • blur each slice with the proper blur size (FFT-based convolution is necessary for a proper Bokeh effect). The in-focus slice is not blurred.
  • if "occlusion" is checked: merge-over each slice, from the back to the front. That way, each blurred slice may occlude objects in the back.
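
The slice arithmetic of the method above can be sketched as follows. `slice_count` reproduces the 10% rule (a geometric progression of blur sizes from 2/3 up to the maximum, on both sides of the focal plane), `slice_weights` distributes one pixel over two neighbouring slices, and `over` is the standard "over" operator on premultiplied RGBA. The function names and the exact shape of the blend ramp are assumptions of this example, not a specification.

```python
import math


def slice_count(max_size, min_size=2.0 / 3.0, step=0.10):
    """Number of slices on both sides of the focal plane when consecutive
    blur sizes differ by `step` (10% by default)."""
    per_side = math.log(max_size / min_size) / math.log(1.0 + step)
    return int(2 * per_side)


def slice_weights(u, n, blend=1.0):
    """Weights of a pixel over n slices (n >= 2), given its continuous
    slice coordinate u.

    blend=0 assigns the pixel to the nearest slice; blend=1 interpolates
    linearly between the two neighbouring slices. Intermediate values
    narrow the ramp (an assumption of this sketch)."""
    i0 = min(max(int(math.floor(u)), 0), n - 2)
    f = min(max(u - i0, 0.0), 1.0)
    if blend > 0.0:
        t = min(max((f - 0.5) / blend + 0.5, 0.0), 1.0)
    else:
        t = 1.0 if f >= 0.5 else 0.0
    weights = [0.0] * n
    weights[i0] = 1.0 - t
    weights[i0 + 1] = t
    return weights


def over(front, back):
    """'Over' operator for premultiplied (r, g, b, a) pixels."""
    alpha = front[3]
    return tuple(f + b * (1.0 - alpha) for f, b in zip(front, back))
```

For a maximum blur size of 10, `slice_count(10)` gives 56, matching the figure quoted above. With occlusion enabled, the blurred slices would be combined by applying `over` repeatedly, starting from the farthest slice.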

Input clips:

  • Source
  • Z (Alpha only): Depth (must be aliased)
  • Filter (optional): Filter shape (RGB or Alpha)

Parameters:

  • math: Specifies how the depth channel is used to calculate the distance between the camera and an object. For example, some programs use higher values to denote further away, while in others they mean closer to the camera:
    • direct - The Z value in the depth channel directly controls blur. For example, if |Z-C| is 0.5, then the blur size will be 0.5 times the value of the size control (unless this is bigger than maximum, in which case it will be clamped to maximum).
    • depth - The Z value in the depth channel is the distance between the camera and whatever is in the image at that pixel.
    • far = 0 - The Z value in the depth channel is equal to 1/distance. The values are expected to decrease from large positive values close to the camera to zero at infinity. This is compatible with depth maps generated by Nuke and RenderMan.
    • far = 1 - Near plane = 0, far plane = 1. This is compatible with depth maps generated by OpenGL.
    • -direct - As with the direct mode, the Z value in the depth channel directly controls blur. In other words, each layer is blurred by the same amount as in the direct mode. However, in this mode, the layers are interpreted as being in the opposite order, so a higher depth value places a layer in front of another rather than behind it.
    • -depth - The Z value in the depth channel is -distance in front of the camera. This is the same as depth, but the distances are negative to start with.
    • far = -0 - The Z value in the depth channel is equal to -1/distance. The values are expected to increase from large negative values close to the camera to zero at infinity. This is compatible with depth maps generated by Maya.
    • far = -1 - Near plane = 0, far plane = -1.
  • occlusion (boolean): If checked, layers are processed from back to front, and each layer is merged using the "over" operator. If unchecked, layers are processed by blur size, and the "plus" operator is used.
  • setup_mode "focal plane setup": If checked, in-focus areas are colored in green (g+=0.5), areas farther than focus are in blue, and areas closer than focus in red.
  • center "focus plane (C)" (focusCenter in Shake): The value of Z which is in focus.
  • focal_point: (with an attached host interact) Moving this point sets C to the value at this point.
  • dof "depth of field" (focusRange in Shake): The distance away from the focusCenter, both towards and away from the camera, that remains un-blurred. This should theoretically be 0. Half this value is subtracted from the blur size factor.
  • size: size of blur at infinity (size of blur when |Z-C| = 1 if math=direct). Actual blur size is obtained by multiplying this by the blur size factor (which depends on math).
  • maximum: maximum blur size. Setting this to a large value with autoLayerSpacing slows down rendering.
  • autoLayerSpacing: Compute the number of layers from the maximum blur size value.
  • layers (steps in Shake): The number of steps into which the total range is divided.
  • layerCurve: 0 gives evenly-spaced layers. 1 gives geometric progression between layers. Higher values concentrate layers around the
  • layerBlend (stepBlend in Shake): The mixing of the different layers. 0 is no mixing and good for getting a feel for your step ranges. 1 is complete, linear blending.
  • filter: any of the BlurCImg filters, or disc, or image.
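
How these parameters could combine into a per-pixel blur size is sketched below. Only the "direct" behaviour, the subtraction of half the depth of field, and the clamping to maximum are taken from the descriptions above. The factors used for "depth" and "far = 0" are assumptions chosen so that the blur size at infinity equals size, and the other math modes are left out. The function name is hypothetical.

```python
def blur_size(z, center, size, maximum, dof=0.0, math_mode="direct"):
    """Blur size for a pixel with depth-channel value z.

    center is C (the in-focus Z value), size the blur size at infinity
    (or at |Z-C| = 1 in direct mode), maximum the clamp value.
    """
    if math_mode == "direct":
        factor = abs(z - center)            # stated in the text
    elif math_mode == "depth":
        factor = abs(1.0 - center / z)      # assumption: tends to 1 at infinity
    elif math_mode == "far = 0":
        factor = abs(1.0 - z / center)      # assumption: Z = 1/distance
    else:
        raise ValueError("math mode not covered by this sketch: %r" % math_mode)
    # half the depth of field is subtracted from the blur size factor
    factor = max(factor - dof / 2.0, 0.0)
    return min(size * factor, maximum)
```

For example, in direct mode with C = 1, size = 10 and Z = 1.5, the blur size is 5, or 3 if maximum is set to 3.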

Retiming / Warping / Distortion / Motion blur / Shutter unrolling

Basic method

  • All these functionalities require the same basic tool: given a displacement function D(x,y) defined at each pixel (x,y) in the image, giving the displacement in pixels to apply at that point, compute the displaced image (this is sometimes called "pushing pixels"). Retiming uses the optical flow from and to the next image to compute an intermediate image by blending "pushed" images.
  • Pushing can be implemented easily in OpenGL by mesh warping. Mesa3D can be used for software rendering when OpenGL is not available.
  • Warping uses a function, which can be given by lens distortion, a low-resolution deformation grid, a spline function, etc., to push pixels.
  • Motion blur is obtained by blending the images generated over a range of displacements corresponding to the shutter time. It could work either by generating a fixed number of images over the shutter period and blending them, or by using an adaptive algorithm (see how it is implemented in ofxsTransform3x3.cpp using Metropolis sampling).
  • Lens distortion and conversion from/to polar coordinates are easy to implement once these tools are available.
  • Shutter unrolling is used to correct images taken with a rolling-shutter camera (such as most CMOS-based cameras, including RED). In these images, each line is taken at a different time, causing deformation artifacts. Shutter unrolling consists of using the optical flow to generate a virtual image taken with a global shutter.
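
The basic tool can be illustrated with a deliberately naive sketch: each pixel is splatted onto the nearest destination pixel, which leaves holes and aliasing that mesh warping would avoid. It assumes a single-channel image stored as a list of rows and a displacement field of (dx, dy) pairs in pixels. The function names are hypothetical, and holes are simply left at 0.

```python
def push(image, disp, t=1.0):
    """Forward-warp: move pixel (x, y) by t * disp[y][x], averaging
    the values that land on the same destination pixel."""
    h, w = len(image), len(image[0])
    acc = [[0.0] * w for _ in range(h)]
    cnt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = disp[y][x]
            tx, ty = int(round(x + t * dx)), int(round(y + t * dy))
            if 0 <= tx < w and 0 <= ty < h:
                acc[ty][tx] += image[y][x]
                cnt[ty][tx] += 1
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else 0.0
             for x in range(w)] for y in range(h)]


def retime(a, b, flow_ab, flow_ba, t):
    """Intermediate image at time t in [0, 1] between frames a and b,
    blending a pushed forward and b pushed backward."""
    pa = push(a, flow_ab, t)
    pb = push(b, flow_ba, 1.0 - t)
    return [[(1.0 - t) * va + t * vb for va, vb in zip(ra, rb)]
            for ra, rb in zip(pa, pb)]


def motion_blur(image, disp, samples):
    """Average of `samples` (>= 2) pushed images over the shutter period."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(samples):
        pushed = push(image, disp, i / (samples - 1))
        for y in range(h):
            for x in range(w):
                out[y][x] += pushed[y][x] / samples
    return out
```

Here `retime` corresponds to the blending of "pushed" images described for retiming, and `motion_blur` to the fixed-number-of-images variant of motion blur.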

Pixel pushing may be implemented by drawing a warped mesh. A triangular mesh renderer has to be used (either an existing one with a compatible open-source licence, or one has to be written).

Corresponding Nuke nodes: Oflow, Kronos, Vectorblur, GridWarp, SplineMorph, LensDistortion, RollingShutter...

Advanced method

Moving gradients: a path-based method for plausible image interpolation

PoissonMerge

Another version of the Merge node could be used for Poisson image editing:

Frei0r meta-plugin

frei0r is a minimalist plugin suite for video effects.

Encapsulation in OpenFX plugins should be easy, since the frei0r spec is very simple and close to what OpenFX requires.

G'MIC meta-plugin

Work was started on writing a G'MIC meta-plugin (https://github.com/MrKepzie/openfx-gmic), which should contain hundreds of plugins similar to what is available in GIMP's G'MIC plugin. There is also an interactive application based on G'MIC, called ZArt, which implements a reduced number of effects. Maybe we should start with the plugins implemented by ZArt.

Also check out the work by Tobias Fleisher on bringing G'MIC to After Effects https://discuss.pixls.us/t/gmic-for-adobe-after-effects-and-premiere-pro/452

FFmpeg filters meta-plugin

http://ffmpeg.org/ffmpeg-filters.html

The problem is that parameters for these filters are not well specified, so creating OpenFX plugins for each filter probably requires a lot of manual work.

MLT meta-plugin

MLT is a framework for audio/video editing.

It has its own set of plugins, but many video plugins actually come from frei0r or FFmpeg.

Natural image matting (without green/blue screen)

http://www.wisdom.weizmann.ac.il/~levina/papers/Matting-Levin-Lischinski-Weiss-PAMI.pdf

http://www.wisdom.weizmann.ac.il/~levina/papers/Matting-Levin-Lischinski-Weiss-CVPR06.pdf

Matlab code available: http://people.csail.mit.edu/alevin/matting.tar.gz

Eduardo S. L. Gastal and Manuel M. Oliveira, Shared Sampling for Real-Time Alpha Matting, Eurographics, 2010

KNN matting: http://dingzeyu.li/projects/knn/index.html

For a comparison of matting methods: http://www.alphamatting.com/

Available implementation(s): vfx-matting

This implementation may either use CImg or OpenCV (but may also use neither), and may also use an external library such as CSparse (or any other sparse linear system solver with a compatible licence) for sparse system solving. All pointers to OpenFX documentation and guides are given above on this page. The OpenFX plugin should take two inputs: an image containing the strokes/samples (e.g. red for foreground, green/blue for background, black for unclassified), and the image for which to compute the matte. On output, the alpha channel of the image is set to the matte.

A CImg-based implementation would start from a fork of "openfx-misc", and an OpenCV-based implementation would start from a fork of "openfx-opencv".

See also:

SlitScan

A node that applies a per-pixel time offset. Historical reference: invented by Douglas Trumbull for 2001 (see video). For an example, see this video by Adrien M / Claire B.

The default time map is a linear function of y (as in the original slitscan), which is 0 at the top and 1 at the bottom of the image, but there can be an optional single-channel "TimeMap" input.

This plugin requires many frames to be rendered on input, which may fill up the host cache. However, if more than 4 frames are required on input, Natron renders them on demand rather than pre-caching them (see @39d9960). Time offsets that "fall between" frames should either be mapped to the nearest frame (shutter=0), or averaged over several frames (shutter=1 corresponds to linear interpolation, higher shutter may blend more than two frames).

The parameters are:

  • offset for the TimeMap (default = -0.5)
  • gain for the TimeMap (default = 10)
  • absolute, a boolean indicating that the time map gives absolute frames rather than relative frames
  • frame range, only if TimeMap is connected, because the frame range cannot be guessed without looking at the image (default = -5..5). If "absolute" is checked, this frame range is absolute, else it's relative.
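
The time mapping can be sketched as follows. The formula gain * (value + offset) is an assumption: it is the combination for which the default offset (-0.5) and gain (10) map a time map in 0..1 to the default frame range -5..5. Only shutter=0 and shutter=1 are covered, rows are indexed from the top of the image, and all function names are hypothetical.

```python
import math


def default_time_map(row, height):
    """Linear time map: 0 on the top row, 1 on the bottom row."""
    return row / (height - 1) if height > 1 else 0.0


def frame_offset(time_map_value, offset=-0.5, gain=10.0):
    """Frame offset for a time map value (assumed formula, see above)."""
    return gain * (time_map_value + offset)


def frames_and_weights(current_frame, rel_offset, shutter=0, absolute=False):
    """Source frames to fetch for one pixel, with their blending weights.

    shutter=0 picks the nearest frame, shutter=1 interpolates linearly
    between the two surrounding frames.
    """
    t = rel_offset if absolute else current_frame + rel_offset
    if shutter == 0:
        return [(int(math.floor(t + 0.5)), 1.0)]
    lo = int(math.floor(t))
    f = t - lo
    if f == 0.0:
        return [(lo, 1.0)]
    return [(lo, 1.0 - f), (lo + 1, f)]
```

With the default time map on a 11-row image, the top row reads frame current-5 and the bottom row frame current+5.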

FFmpeg reader/writer enhancements

Audio

We should have an AudioRead-like node to show the audio waveform in the curve editor, and play it when flipbooking.

We should be able to mux an audio track in an output video, either from an audio file, or from another video file. See the ffmpeg.c source in the ffmpeg distribution for hints on how to do it.

Presets

(done) The writer should be simpler to use, with a few useful presets:

  • Create XDCAM HD422 files in .mov or .mxf
  • Create XDCAM IMX/D-10 files in .mov or .mxf
  • Create AVID DNxHD files in .mov
  • Create DVCPROHD files in .mov or .mxf
  • Create ProRes 422 or 4444 files in .mov

References: