Tool plugin: complete face-restoration pipeline (detect + align + restore + blend)
Replicates the Q-engineering GFPGAN ncnn pipeline, using native OpticScript operations for everything except the two ML steps. It detects faces with 3-point landmarks (gfpgan-detect), computes a Procrustes similarity transform in JS, warps each face to 512×512, restores it with GFPGAN, and pastes it back through an eroded, Gaussian-blurred mask blend. Three C++ binaries (align, merge, upscale) are replaced by native engine calls, with a live-adjustable BLEND_FEATHER slider.
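The similarity fit mentioned above can be sketched in plain JavaScript. This is only an illustration of the underlying least-squares math (uniform scale, rotation, translation, no reflection); `fitSimilarity` and `applySimilarity` are hypothetical helper names, the landmark values are made up, and nothing here is claimed to match how the engine's built-in fit is implemented.

```javascript
// Hypothetical helpers, not part of OpticScript.
// Fits x' = a*x - b*y + tx, y' = b*x + a*y + ty in the least-squares sense,
// where a = s*cos(theta) and b = s*sin(theta).
function fitSimilarity(src, dst) {
  const n = src.length;
  const mean = pts =>
    pts.reduce((m, p) => [m[0] + p[0] / n, m[1] + p[1] / n], [0, 0]);
  const [sx, sy] = mean(src);
  const [dx, dy] = mean(dst);

  let dot = 0, cross = 0, norm = 0;
  for (let i = 0; i < n; i++) {
    // Work with centroid-relative coordinates so translation drops out.
    const px = src[i][0] - sx, py = src[i][1] - sy;
    const qx = dst[i][0] - dx, qy = dst[i][1] - dy;
    dot += px * qx + py * qy;    // accumulates a * |p|^2
    cross += px * qy - py * qx;  // accumulates b * |p|^2
    norm += px * px + py * py;
  }
  if (norm === 0) throw new Error("degenerate source points");

  const a = dot / norm, b = cross / norm;
  // Translation maps the source centroid onto the destination centroid.
  return { a, b, tx: dx - (a * sx - b * sy), ty: dy - (b * sx + a * sy) };
}

function applySimilarity(t, [x, y]) {
  return [t.a * x - t.b * y + t.tx, t.b * x + t.a * y + t.ty];
}

// Usage: map made-up landmarks (left eye, right eye, mouth centre)
// onto the 512×512 template used by the script.
const template = [[192, 240], [319, 240], [257, 371]];
const landmarks = [[100, 120], [160, 118], [131, 185]];
const fit = fitSimilarity(landmarks, template);
console.log(landmarks.map(p => applySimilarity(fit, p)));
```

With three landmarks the fit is overdetermined (six equations, four unknowns), so the mapped points land near the template positions rather than exactly on them.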
JavaScript
// Tool plugin demo — full face restoration pipeline orchestrated in JS
// tool_face_restore_full.js
//!INPUT: INPUT
//!OUTPUT: OUTPUT
//!PARAM: THRESHOLD:number=0.35,min=0.05,max=0.9,step=0.05
//!PARAM: BLEND_FEATHER:number=2.0,min=0.2,max=3.0,step=0.1
//!PARAM: UPSCALE_BG:boolean=true
//!PARAM: COLOR_MATCH:boolean=true
//!PARAM: BLEND:enum(soft|tight|cpp-style)=cpp-style
//!SAMPLE fast(UPSCALE_BG=false, COLOR_MATCH=false, BLEND=soft)
//!SAMPLE quality(UPSCALE_BG=true, COLOR_MATCH=true, BLEND=cpp-style, BLEND_FEATHER=2.0)
//!SAMPLE newsprint(UPSCALE_BG=true, COLOR_MATCH=true, BLEND=cpp-style, BLEND_FEATHER=2.5, THRESHOLD=0.3)
// Replicates the Q-engineering GFPGAN pipeline using OpticScript's
// native ops for everything except the two ML steps:
//
// 1. Engine.tool('gfpgan-detect') → faces + 3-point landmarks
// 2. Engine.tool('mlcupscale') → optional 4× background upscale
// 3. similarity-transform fit (Matrix.Mat3) → forward affine (orig→512)
// 4. img.warpPerspective(512, 512) → aligned face
// 5. Engine.tool('gfpgan-restore') → restored 512×512 (+ optional colorMatch)
// 6. inverse affine + warpPerspective → paste back to composite coords
// 7. erode + gaussianBlur mask, then applyMask + blendAt → composite
//
// Result: we replace the C++ `gfpgan_align` + `gfpgan_merge` binaries
// (28 MB total) with ~80 lines of JS, gaining live-tunable feather
// width and a pipeline the user can inspect / fork.
// ── Linear algebra: the built-in Matrix toolbox ────────────────────────────
//
// The orig→512 face-alignment fit and its inverse used to be ~55 lines
// of hand-rolled Procrustes math here. They are now one-liners on the
// built-in `Matrix.Mat3` class:
//
// const fwd = Matrix.Mat3.estimateSimilarity(srcPts, FACE_TEMPLATE);
// const inv = Matrix.Mat3.clone(fwd).invert();
//
// `Mat3` is column-major (gl-matrix); `img.warpPerspective()` accepts a
// `Mat3` directly and transposes to the engine's row-major layout for
// us — so we never touch raw matrix indices.
// ── Reference template (matches gfpgan_align in mlc-ncnn-img2img) ──────────
// Three keypoints in the 512×512 GFPGAN-trained coordinate system:
// - left eye, right eye, mouth-centre
// These exact values come from the Q-engineering reference impl.
const FACE_TEMPLATE = [
  [192, 240], // left eye
  [319, 240], // right eye
  [257, 371], // mouth centre
];
// ── Pipeline ──────────────────────────────────────────────────────────────
const orig = Engine.loadImage(INPUT);
const W = orig.width, H = orig.height;
// 1. Detect faces with 3-point landmarks (always in original coords)
const faces = Engine.tool('gfpgan-detect').applyJSON(orig, {
  threshold: THRESHOLD,
});
Engine.log(`detected ${faces.length} face(s)`);
if (faces.length === 0) {
orig.save(OUTPUT);
orig.free();
} else {
// 2. Optional background upscale via Real-ESRGAN. Two reasons it
// matters:
// (a) GFPGAN always produces 512×512 output. Without upscale,
// that gets compressed back to a small face area —
// downsampling kills most of the added detail.
// (b) The C++ reference pipeline (gfpgan_upscale) ALWAYS
// runs Real-ESRGAN on the BG, which subtly denoises +
// smooths the photo so the GFPGAN face doesn't visually
// stand out as "AI-look-on-raw-photo". Without it you
// get a noticeable boundary around restored faces.
//
// No size gate any more — trust the user. For a 2048×1536
// input, output is 8192×6144 (~50 MP), ~60 MB PNG. Disable
// via UPSCALE_BG=false if memory or time is tight.
let composite, scale = 1, Wout = W, Hout = H;
if (UPSCALE_BG) {
composite = Engine.tool('mlcupscale').apply(orig);
scale = 4;
Wout = W * scale;
Hout = H * scale;
Engine.log(`background upscaled ${W}×${H} → ${Wout}×${Hout} (4×)`);
} else {
composite = orig.clone();
Engine.log(`background kept at ${W}×${H} (no upscale)`);
}
for (let i = 0; i < faces.length; i++) {
const face = faces[i];
// 3. Compute the orig→512 similarity transform. `forward`
// maps a pixel in the ORIGINAL image to its 512×512 aligned
// position; `inverse` is its inverse, mapping aligned
// coords back to original coords.
//
// `warpPerspective` takes a **src→dst** (forward) transform
// — for each src pixel, where it lands in the dst-grid —
// and accepts a `Mat3` directly.
const srcLandmarks = face.pts.map(p => [p.x, p.y]);
const forward = Matrix.Mat3.estimateSimilarity(srcLandmarks, FACE_TEMPLATE);
const inverse = Matrix.Mat3.clone(forward).invert();
// 4. Warp orig → aligned 512×512. src=orig, dst=512×512, so
// we pass the orig→512 forward matrix. Keep this around —
// it doubles as a colour-reference for the COLOR_MATCH
// step in (5b) below.
const aligned = orig.clone().warpPerspective(512, 512, forward);
// 5. Restore via GFPGAN — produces a fresh 512×512 ImageHandle.
const restored = Engine.tool('gfpgan-restore').apply(aligned);
// 5b. Optional Reinhard colour transfer: align the restored
// face's per-channel mean/std to the original aligned
// face's stats. GFPGAN tends to brighten skin + reduce
// colour variance — without this step the restored face
// shows a distinct tonal jump against the surrounding
// photo. Alpha-weighted internally so the
// warpPerspective transparent border doesn't skew the
// stats.
if (COLOR_MATCH) {
restored.colorMatch(aligned);
}
aligned.free();
// 6. Warp restored → composite coords. src=512, dst=Wout×Hout,
// so we need the 512→Wout-coords forward matrix. That is the
// composition (orig→Wout) · (512→orig), where the first
// factor is the uniform background upscale and the second
// is `inverse`. Mat3.multiply does `this = this · b`, so
// `scaleM.multiply(inverse)` gives exactly scaleM · inverse.
const inverseScaled = scale === 1
? inverse
: new Matrix.Mat3().fromScaling([scale, scale]).multiply(inverse);
restored.warpPerspective(Wout, Hout, inverseScaled);
// 7. Build the blend mask. Three strategies, switchable via
// BLEND param so we can compare visually:
//
// "soft" — single erode + small blur. Quickest, but the
// blur tail can leak into the warpPerspective
// "outside-source = transparent black" zone and
// produce a dark fringe.
//
// "tight" — same as soft but with a bigger erode, so the
// blur stays strictly inside the warped quad.
// Smaller darkening risk, slightly less smooth.
//
// "cpp-style" — replicates Q-engineering's two-mask trick
// from merge.cpp:
// 1. hardMask = warped quad ⊖ small erode
// (defines where the restored face is
// allowed to contribute at all)
// 2. RGB is hard-clipped to hardMask via
// applyMask + premultiplyAlpha — so any
// blur-tail outside this boundary sees
// RGB=0 and can't darken the composite
// 3. softMask = warped quad ⊖ (small + big)
// eroded, then Gaussian-blurred. Its
// 0-fade lies strictly INSIDE hardMask,
// so where alpha > 0 RGB is always valid
//
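// Feather scales with the linear face size (√area) in the
// upscaled space, where the area is multiplied by scale².
// With BLEND_FEATHER=2.0, a 200×240 face gives wEdge = 22 px
// at scale 1 and 88 px at scale 4, so the blend looks the
// same whether you upscale or not.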
const faceArea = face.rect.w * face.rect.h * scale * scale;
const wEdge = Math.max(2, Math.round(Math.sqrt(faceArea) / 20 * BLEND_FEATHER));
if (BLEND === "cpp-style") {
// ── C++-style two-mask blend ─────────────────────────────
const warpedQuad = Engine.createColoredImage(512, 512, [1, 1, 1, 1]);
warpedQuad.warpPerspective(Wout, Hout, inverseScaled);
// Hard mask: tiny erode (~2 px) — strips warp anti-aliasing
// but keeps face content full-strength right up to the
// boundary, matching cv::bitwise_and(…, inv_mask_erosion).
const hardMask = warpedQuad.clone().erode(2);
// Soft mask: bigger erode (wEdge*2 further) + blur. The
// gradient zone of softMask lies between the inner-erode
// boundary and the hardMask boundary, never beyond.
const softMask = warpedQuad.clone().erode(2 + wEdge * 2).gaussianBlur(wEdge);
warpedQuad.free();
restored.applyMask(hardMask); // alpha = hardMask (0 outside)
restored.premultiplyAlpha(); // RGB *= alpha → 0 outside hardMask
restored.applyMask(softMask); // re-set alpha to soft gradient
hardMask.free();
softMask.free();
composite.blendAt(restored, px(0, 0), 1.0, BlendMode.Over);
} else {
// ── Original single-mask blend (BLEND="soft" | "tight") ──
const erodeSize = BLEND === "tight" ? wEdge * 4 : wEdge * 2;
const blurSigma = BLEND === "tight"
? Math.max(0.5, wEdge * 0.3)
: Math.max(0.5, wEdge * 0.5);
const mask = Engine.createColoredImage(512, 512, [1, 1, 1, 1]);
mask.warpPerspective(Wout, Hout, inverseScaled);
mask.erode(erodeSize).gaussianBlur(blurSigma);
restored.applyMask(mask);
mask.free();
composite.blendAt(restored, px(0, 0), 1.0, BlendMode.Over);
}
restored.free();
Engine.log(`face ${i + 1}/${faces.length}: area=${faceArea.toFixed(0)} feather=${wEdge}px`);
}
composite.save(OUTPUT);
composite.free();
orig.free();
}
// © 2026 Michael Lechner · mlc OpticScript · https://mlcgo.eu · Elastic License 2.0
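
The colour match in step 5b can be illustrated with a small standalone sketch. It assumes flat RGBA arrays with values in 0..1 and matches the alpha-weighted per-channel mean and standard deviation directly in RGB; `weightedStats` and `matchColor` are hypothetical names, and the engine's own `colorMatch` may work differently (Reinhard's original method, for instance, operates in a decorrelated colour space).

```javascript
// Hypothetical helpers, not part of OpticScript.
// Alpha-weighted mean and standard deviation of one channel (0=R, 1=G, 2=B)
// of a flat RGBA array. Fully transparent pixels contribute nothing.
function weightedStats(rgba, channel) {
  let wSum = 0, mean = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    const w = rgba[i + 3];
    wSum += w;
    mean += w * rgba[i + channel];
  }
  if (wSum === 0) return { mean: 0, std: 0 };
  mean /= wSum;

  let varSum = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    const d = rgba[i + channel] - mean;
    varSum += rgba[i + 3] * d * d;
  }
  return { mean, std: Math.sqrt(varSum / wSum) };
}

// Shift and scale each colour channel of `target` so its weighted
// mean/std match those of `reference`. Alpha is left unchanged.
function matchColor(target, reference) {
  const out = Float64Array.from(target);
  for (let c = 0; c < 3; c++) {
    const t = weightedStats(target, c);
    const r = weightedStats(reference, c);
    // Guard against flat channels, where the std ratio is undefined.
    const gain = t.std > 1e-6 ? r.std / t.std : 1;
    for (let i = 0; i < out.length; i += 4) {
      const v = (target[i + c] - t.mean) * gain + r.mean;
      out[i + c] = Math.min(1, Math.max(0, v)); // clamp to the valid range
    }
  }
  return out;
}

// Usage: two opaque "restored" pixels matched against a reference whose
// third pixel is fully transparent (like a warp border) and is ignored.
const restoredPixels = [0.6, 0.5, 0.5, 1, 0.8, 0.5, 0.5, 1];
const referencePixels = [0.2, 0.5, 0.5, 1, 0.6, 0.5, 0.5, 1, 1.0, 0.0, 0.0, 0];
console.log(Array.from(matchColor(restoredPixels, referencePixels)));
```

Weighting by alpha is what keeps the transparent border left by the warp from skewing the statistics, which is the property the comment in step 5b relies on.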