Compositing Digital Images (和訳)

Thomas Porter          Tom Duff
Computer Graphics Project, Lucasfilm Ltd.

ABSTRACT

Most computer graphics pictures have been computed all at once, so that the rendering program takes care of all computations relating to the overlap of objects.

殆どのコンピュータグラフィックスの画像はこれまで一度にすべて計算されており、オブジェクトの重なりに関する計算はすべてレンダリングプログラムが処理してきた。

There are several applications, however, where elements must be rendered separately, relying on compositing techniques for the anti-aliased accumulation of the full image.
This paper presents the case for four-channel pictures, demonstrating that a matte component can be computed similarly to the color channels. The paper discusses guidelines for the generation of elements and the arithmetic for their arbitrary compositing.

しかしながら、要素を別々にレンダリングしなければならないアプリケーションがいくつかあり、その場合、アンチエイリアスされた完全な画像を蓄積するために合成技術に頼ることになる。
本論文では4チャンネル画像の有用性を論じ、マット成分をカラーチャンネルと同様に計算できることを示す。また、要素の生成のためのガイドラインと、それらを任意に合成するための演算について述べる。

CR Categories and Subject Descriptors:
I.3.3 [Computer Graphics]: Picture/Image Generation -- Display algorithms;
I.3.4 [Computer Graphics]: Graphics Utilities -- Software support;
I.4.1 [Image Processing]: Digitization -- Sampling.
General Terms: Algorithms
Additional Key Words and Phrases: compositing, matte channel, matte algebra, visible surface algorithms, graphics systems

CRカテゴリと主題記述子:

I.3.3 [コンピュータグラフィックス]: 画像/イメージ生成 - 表示アルゴリズム
I.3.4 [コンピュータグラフィックス]: グラフィックスユーティリティ - ソフトウェアサポート
I.4.1 [画像処理]: デジタル化 - サンプリング

一般用語: アルゴリズム
追加キーワードとフレーズ: 合成、マットチャンネル、マット代数、可視表面アルゴリズム、グラフィックスシステム

1. Introduction

Increasingly, we find that a complex three dimensional scene cannot be fully rendered by a single program.
The wealth of literature on rendering polygons and curved surfaces, handling the special cases of fractals and spheres and quadrics and triangles, implementing refinements for texture mapping and bump mapping, noting speed-ups on the basis of coherence or depth complexity in the scene, suggests that multiple programs are necessary.
In fact, reliance on a single program for rendering an entire scene is a poor strategy for minimizing the cost of small modeling errors. Experience has taught us to break down large bodies of source code into separate modules in order to save compilation time. An error in one routine forces only the recompilation of its module and the relatively quick reloading of the entire program. Similarly, small errors in coloration or design in one object should not force the "recompilation" of an entire image.
Separating the image into elements which can be independently rendered saves enormous time.
Each element has an associated matte, coverage information which designates the shape of the element. The compositing of those elements makes use of the mattes to accumulate the final image.
The compositing methodology must not induce aliasing in the image; soft edges of the elements must be honored in computing the final image.
Features should be provided to exploit the full associativity of the compositing process;
this affords flexibility, for example, for the accumulation of several foreground elements into an aggregate foreground which can be examined over different backgrounds.
The compositor should provide facilities for arbitrary dissolves and fades of elements during an animated sequence.
Several highly successful rendering algorithms have worked by reducing their environments to pieces that can be combined in a 2 1/2 dimensional manner, and then overlaying them either front-to-back or back-to-front [3].
Whitted and Weimer's graphics test-bed [6] and Crow's image generation environment [2] are both designed to deal with heterogeneously rendered elements.
Whitted and Weimer's system reduces all objects to horizontal spans which are composited using a Warnock-like algorithm.
In Crow's system a supervisory process decides the order in which to combine images created by independent special-purpose rendering processes.
The imaging system of Warnock and Wyatt [5] incorporates 1-bit mattes.
The Hanna-Barbera cartoon animation system [4] incorporates soft-edge mattes, representing the opacity information in a less convenient manner than that proposed here.
The present paper presents guidelines for rendering elements and introduces the algebra for compositing.

複雑な3次元シーンは単一のプログラムでは完全にレンダリングできない、と我々はますます実感するようになっている。
ポリゴンや曲面のレンダリング、フラクタル・球・二次曲面・三角形といった特殊なケースの扱い、テクスチャマッピングやバンプマッピングの改良、シーンのコヒーレンスや深度の複雑さに基づく高速化など、文献が豊富に存在すること自体が、複数のプログラムが必要であることを示唆している。
実際、シーン全体のレンダリングを単一のプログラムに依存することは、小さなモデリングエラーのコストを最小限に抑えるうえでは貧弱な戦略である。経験から我々は、コンパイル時間を節約するために大きなソースコードを別々のモジュールに分割することを学んできた。あるルーチンのエラーは、そのモジュールの再コンパイルと、プログラム全体の比較的速い再ロードのみを強いる。同様に、1つのオブジェクトの着色やデザインにおける小さな誤りが、画像全体の「再コンパイル」を強いるべきではない。
画像を独立してレンダリングできる要素に分離することは、膨大な時間を節約する。
各要素には、その要素の形状を指定するカバレッジ情報であるマットが関連付けられている。これらの要素の合成では、マットを用いて最終的な画像を蓄積していく。
合成方法は、画像にエイリアシングを引き起こすものであってはならない。要素のソフトエッジは、最終画像を計算する際に尊重されるべきである。
合成プロセスの完全な結合性を利用するための機能を提供する必要がある。
これは、例えば、いくつかの前景要素を、異なる背景にわたって検査することができる集合的な前景に蓄積するための柔軟性を提供する。
コンポジターは、アニメーション化されたシーケンス中に任意のディゾルブとフェードのための機能を提供する必要がある。
非常に成功したレンダリングアルゴリズムのいくつかは、環境を2 1/2次元的に組み合わせられる断片に分解し、それらを前から後ろ、あるいは後ろから前の順に重ね合わせることで機能してきた[3]。
WhittedとWeimerのグラフィックステストベッド[6]とCrowの画像生成環境[2]は、いずれも異種の方法でレンダリングされた要素を扱うように設計されている。
WhittedとWeimerのシステムは、すべてのオブジェクトを水平スパンに分解し、それらをWarnock風のアルゴリズムで合成する。
Crowのシステムでは、監視プロセスは、独立した特殊用途のレンダリングプロセスによって作成された画像を組み合わせる順序を決定する。
WarnockとWyattのイメージングシステム[5]には、1ビットのマットが組み込まれている。
ハンナ・バーベラの漫画アニメーションシステム[4]はソフトエッジのマットを採用しているが、不透明度情報を、ここで提案する方法よりも不便な形で表現している。
本論文では、要素を描画するためのガイドラインを示し、合成の代数を紹介する。

2. The Alpha Channel

A separate component is needed to retain the matte information, the extent of coverage of an element at a pixel.
In a full color rendering of an element, the RGB components retain only the color. In order to place the element over an arbitrary background, a mixing factor is required at every pixel to control the linear interpolation of foreground and background colors.
In general, there is no way to encode this component as part of the color information. For anti-aliasing purposes, this mixing factor needs to be of comparable resolution to the color channels.
Let us call this an alpha channel, and let us treat an alpha of 0 to indicate no coverage, 1 to mean full coverage, with fractions corresponding to partial coverage.

マット情報、すなわちあるピクセルにおける要素のカバレッジの範囲を保持するためには、別個の成分が必要である。
要素のフルカラーレンダリングでは、RGB成分は色のみを保持する。要素を任意の背景上に配置するには、前景色と背景色の線形補間を制御する混合係数が各ピクセルで必要になる。
一般に、この成分をカラー情報の一部として符号化する方法はない。アンチエイリアシングの目的では、この混合係数はカラーチャンネルに匹敵する解像度を持つ必要がある。
これをアルファチャンネルと呼ぶことにしよう。アルファが0のときはカバレッジなし、1のときは完全なカバレッジを意味し、部分的なカバレッジは対応する小数で表すものとする。

In an environment where the compositing of elements is required, we see the need for an alpha channel as an integral part of all pictures.
Because mattes are naturally computed along with the picture, a separate alpha component in the frame buffer is appropriate.
Off-line storage of alpha information along with color works conveniently into run-length encoding schemes because the alpha information tends to abide by the same runs.
What is the meaning of the quadruple (r,g,b,a) at a pixel?
How do we express that a pixel is half covered by a full red object?
One obvious suggestion is to assign (1,0,0,.5) to that pixel: the .5 indicates the coverage and the (1,0,0) is the color.
There are a few reasons to dismiss this proposal, the most severe being that all compositing operations will involve multiplying the 1 in the red channel by the .5 in the alpha channel to compute the red contribution of this object at this pixel.
The desire to avoid this multiplication points up a better solution, storing the pre-multiplied value in the color component, so that (.5,0,0,.5) will indicate a full red object half covering a pixel.
The quadruple (r,g,b,a) indicates that the pixel is covered to extent a by the color (r/a, g/a, b/a).
A quadruple where the alpha component is less than a color component indicates a color outside the [0,1] interval, which is somewhat unusual.
We will see later that luminescent objects can be usefully represented in this way.
For the representation of normal objects, an alpha of 0 at a pixel generally forces the color components to be 0.
Thus the RGB channels record the true colors where alpha is 1, linearly darkened colors for fractional alphas along edges, and black where alpha is 0.
Silhouette edges of RGBA elements thus exhibit their anti-aliased nature when viewed on an RGB monitor.
It is important to distinguish between two key pixel representations:
black = (0,0,0,1);
clear = (0,0,0,0).
The former pixel is an opaque black; the latter pixel is transparent.

要素の合成が必要となる環境では、アルファチャンネルがすべての画像の不可欠な一部として必要になることがわかる。
マットは画像と共に自然に計算されるため、フレームバッファ内に別個のアルファ成分を持つのが適切である。
アルファ情報はカラーと同じランに従う傾向があるため、カラーと共にアルファ情報をオフラインで保存することは、ランレングス符号化方式とも都合よく適合する。
あるピクセルにおける四つ組 (r,g,b,a) は何を意味するのだろうか。

ピクセルの半分が完全な赤のオブジェクトで覆われていることを、どのように表現すればよいだろうか。

明白な提案の1つは、そのピクセルに (1,0,0,.5) を割り当てることである。.5がカバレッジを示し、(1,0,0)が色である。

この提案を却下すべき理由はいくつかある。最も重大なのは、このピクセルにおけるこのオブジェクトの赤の寄与を計算するために、すべての合成操作で赤チャンネルの1にアルファチャンネルの.5を掛けなければならなくなることである。

この乗算を避けたいという要求が、より良い解決法を示している。すなわち、アルファをあらかじめ乗算した値を色成分に格納するのである。これにより (.5,0,0,.5) は、ピクセルの半分を覆う完全な赤のオブジェクトを示すことになる。四つ組 (r,g,b,a) は、そのピクセルが色 (r/a, g/a, b/a) によって割合 a だけ覆われていることを示す。

アルファ成分がカラー成分より小さい四つ組は、[0,1] の区間の外側の色を示すことになり、これはやや特殊である。

発光するオブジェクトをこの方法で有用に表現できることは、後で見ることになる。

通常のオブジェクトの表現では、あるピクセルでアルファが0であれば、一般にカラー成分も0になる。

したがって、RGBチャンネルには、アルファが1の場所では真の色が、エッジに沿った小数のアルファの場所では線形に暗くした色が、アルファが0の場所では黒が記録される。

したがって、RGBA要素のシルエットエッジは、RGBモニタで見るとアンチエイリアス処理された性質を示す。

次の2つの重要なピクセル表現を区別することが重要である。

black = (0,0,0,1);
clear = (0,0,0,0).

前者のピクセルは不透明な黒であり、後者のピクセルは透明である。
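
（訳注）本節で述べられている事前乗算済み (premultiplied) RGBA 表現を整理するための最小限のスケッチを示す。Pixel というクラス名や from_straight などの名前は説明のための仮のものであり、原論文にあるものではない。

```python
from dataclasses import dataclass

@dataclass
class Pixel:
    """事前乗算済み (premultiplied) RGBA ピクセル。
    r, g, b には真の色にアルファを掛けた値を格納する。"""
    r: float
    g: float
    b: float
    a: float  # カバレッジ: 0 = 覆われていない, 1 = 完全に覆われている

    @staticmethod
    def from_straight(R, G, B, a):
        """真の色 (R,G,B) とカバレッジ a から事前乗算済みピクセルを作る。"""
        return Pixel(R * a, G * a, B * a, a)

    def true_color(self):
        """(r/a, g/a, b/a) を復元する。a が 0 に近づくと精度が落ちる点に注意。"""
        if self.a == 0.0:
            return (0.0, 0.0, 0.0)
        return (self.r / self.a, self.g / self.a, self.b / self.a)

# 本文の例: ピクセルの半分を覆う完全な赤のオブジェクト
half_red = Pixel.from_straight(1.0, 0.0, 0.0, 0.5)   # -> (.5, 0, 0, .5)

# 本文で区別される 2 つの重要なピクセル表現
black = Pixel(0.0, 0.0, 0.0, 1.0)   # 不透明な黒
clear = Pixel(0.0, 0.0, 0.0, 0.0)   # 透明
```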

3. RGBA Pictures

If we survey the variety of elements which contribute to a complex animation, we find many complete background images which have an alpha of 1 everywhere.
Among foreground elements, we find that the color components roll off in step with the alpha channel, leaving large areas of transparency.
Mattes, colorless stencils used for controlling the compositing of other elements, have 0 in their RGB components.
Off-line storage of RGBA pictures should therefore provide the natural data compression for handling the RGB pixels of backgrounds, RGBA pixels of foregrounds, and A pixels of mattes.
There are some objections to computing with these RGBA pictures.
Storage of the color components premultiplied by the alpha would seem to unduly quantize the color resolution, especially as alpha approaches 0.
However, because any compositing of the picture will require that multiplication anyway, storage of the product forces only a very minor loss of precision in this regard.
Color extraction, to compute in a different color space for example, becomes more difficult.
We must recover (r/a, g/a, b/a), and once again, as alpha approaches 0, the precision falls off sharply.
For our applications, this has yet to affect us.
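
（訳注）アルファが 0 に近づくと事前乗算済みの色から真の色を復元する精度が急激に落ちる、という本節の指摘を 8 ビット量子化で確かめる小さな実験である。量子化レベルや数値は説明用に仮に選んだものに過ぎない。

```python
def quantize(x, levels=255):
    """[0,1] の値を 8 ビット相当のレベルに量子化する。"""
    return round(x * levels) / levels

alpha = 0.02
true_red = 0.8                        # 真の赤成分
stored = quantize(true_red * alpha)   # 事前乗算して保存: 0.016 -> 4/255
recovered = stored / alpha            # 復元した真の赤成分
print(true_red, recovered)            # 0.8 と約 0.784 -- 小さなアルファでは誤差が目立つ
```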

4. The Algebra of Compositing

Given this standard of RGBA pictures, let us examine how compositing works.
We shall do this by enumerating the complete set of binary compositing operations.
For each of these, we shall present a formula for computing the contribution of each of two input pictures to the output composite at each pixel.
We shall pay particular attention to the output pixels, to see that they remain pre-multiplied by their alpha.

4.1. Assumptions

When blending pictures together, we do not have information about overlap of coverage information within a pixel; all we have is an alpha value.
When we consider the mixing of two pictures at a pixel, we must make some assumption about the interplay of the two alpha values.
In order to examine that interplay, let us first consider the overlap of two semi-transparent elements like haze, then consider the overlap of two opaque, hard-edged elements.

If $\alpha_A$ and $\alpha_B$ represent the opaqueness of semi-transparent objects which fully cover the pixel, the computation is well known.
Each object lets $(1-\alpha)$ of the background through, so that the background shows through only $(1-\alpha_A)(1-\alpha_B)$ of the pixel; $\alpha_A(1-\alpha_B)$ of the background is blocked by object A and passed by object B; $(1-\alpha_A)\alpha_B$ of the background is passed by A and blocked by B.
This leaves $\alpha_A\alpha_B$ of the pixel which we can consider to be blocked by both.
If $\alpha_A$ and $\alpha_B$ represent subpixel areas covered by opaque geometric objects, the overlap of objects within the pixel is quite arbitrary.
We know that object A divides the pixel into two subpixel areas of ratio $\alpha_A : 1-\alpha_A$.
We know that object B divides the pixel into two subpixel areas of ratio $\alpha_B : 1-\alpha_B$. Lacking further information, we make the following assumption: there is nothing special about the shape of the pixel; we expect that object B will divide each of the subpixel areas inside and outside of object A into the same ratio $\alpha_B : 1-\alpha_B$.
The result of the assumption is the same arithmetic as with semi-transparent objects and is summarized in the following table:

(テーブル図)

The assumption is quite good for most mattes, though it can be improved if we know that the coverage seldom overlaps (adjacent segments of a continuous line) or always overlaps (repeated application of a picture).
For ease in presentation throughout this paper, let us make this assumption and consider the alpha values as representing subpixel coverage of opaque objects.
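
（訳注）4.1 節の仮定を確かめる小さな計算例である。2 つのアルファ値が画素を 4 つの部分領域に分割し、その面積の合計が 1 になること、すなわち半透明オブジェクトの場合と同じ算術になることを確認する。変数名と数値は説明用の仮のものである。

```python
# 画素内のカバレッジ aA, aB が作る 4 つの部分領域の面積。
# 本文の仮定 (B は A の内側と外側を同じ比率 aB : 1-aB で分割する) に従う。
aA, aB = 0.6, 0.3

areas = {
    "both":    aA * aB,              # A と B の両方に覆われる部分
    "A only":  aA * (1 - aB),        # A のみに覆われる部分
    "B only":  (1 - aA) * aB,        # B のみに覆われる部分
    "neither": (1 - aA) * (1 - aB),  # どちらにも覆われず背景が見える部分
}

assert abs(sum(areas.values()) - 1.0) < 1e-12  # 4 つの領域で画素全体を覆う
print(areas)
```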

4.2. Compositing Operators

Consider two pictures A and B.
They divide each pixel into the 4 subpixel areas

(テーブル図)

listed in this table along with the choices in each area for contributing to the composite.
In the last area, for example,
because both input pictures exist there, either could survive to the composite. Alternatively, the composite could be clear in that area.
A particular binary compositing operation can be identified as a quadruple indicating the input picture which contributes to the composite in each of the four subpixel areas 0, A, B, AB of the table above.
With three choices where the pictures intersect, two where only one picture exists and one outside the two pictures, there are 3 × 2 × 2 × 1 = 12 distinct compositing operations listed in the table below.
Note that pictures A and B are diagrammed as covering the pixel with triangular wedges whose overlap conforms to the assumption above.

(テーブル図)

Useful operators include A over B, A in B, and A held out by B.
A over B is the placement of foreground A in front of background B.
A in B refers only to that part of A inside picture B. A held out by B, normally shortened to A out B, refers only to that part of A outside picture B.
For completeness, we include the less useful operators A atop B and A xor B. A atop B is the union of A in B and B out A.
Thus, paper atop table includes paper where it is on top of table, and table otherwise;
area beyond the edge of the table is out of the picture.
A xor B is the union of A out B and B out A.
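
（訳注）各 2 項合成演算子は、4 つの部分領域 0, A, B, AB のそれぞれでどちらの入力画像が合成結果に寄与するかを示す四つ組として符号化できる、という本節の説明を辞書として書き下したスケッチである。四つ組の内容は本文の over, in, out, atop, xor の定義から導いたもので、OPERATORS という名前は説明用の仮のものである。

```python
# 各演算子を、部分領域 (0, A, B, AB) での寄与 (None = 透明) の四つ組で表す。
# 例: A over B では A が常に生き残り、B は A の外側でのみ生き残る。
OPERATORS = {
    "clear":    (None, None, None, None),
    "A":        (None, "A",  None, "A"),
    "B":        (None, None, "B",  "B"),
    "A over B": (None, "A",  "B",  "A"),
    "B over A": (None, "A",  "B",  "B"),
    "A in B":   (None, None, None, "A"),
    "B in A":   (None, None, None, "B"),
    "A out B":  (None, "A",  None, None),
    "B out A":  (None, None, "B",  None),
    "A atop B": (None, None, "B",  "A"),
    "B atop A": (None, "A",  None, "B"),
    "A xor B":  (None, "A",  "B",  None),
}
```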

4.3. Compositing Arithmetic

For each of the compositing operations, we would like to compute the contribution of each input picture at each pixel.
This is quite easily solved by recognizing that each input picture survives in the composite pixel only within its own matte.
For each input picture, we are looking for that fraction of its own matte which prevails in the output.
By definition then, the alpha value of the composite, the total area of the pixel covered, can be computed by adding $\alpha_A$ times its fraction $F_A$ to $\alpha_B$ times its fraction $F_B$: $\alpha_O = \alpha_A F_A + \alpha_B F_B$.
The color of the composite can be computed on a component basis by adding the color of the picture A times its fraction to the color of picture B times its fraction.
To see this, let $c_A$, $c_B$, and $c_O$ be some color component of pictures A, B and the composite, and let $C_A$, $C_B$, and $C_O$ be the true color component before pre-multiplication by alpha.
Then we have

(数式) $c_O = \alpha_O C_O$

Now $C_O$ can be computed by averaging contributions made by $C_A$ and $C_B$, so

(数式) $c_O = \alpha_O \cdot \dfrac{\alpha_A F_A C_A + \alpha_B F_B C_B}{\alpha_A F_A + \alpha_B F_B}$

but the denominator is just $\alpha_O$, so

(数式) $c_O = \alpha_A F_A C_A + \alpha_B F_B C_B = \alpha_A F_A \dfrac{c_A}{\alpha_A} + \alpha_B F_B \dfrac{c_B}{\alpha_B} = c_A F_A + c_B F_B \qquad (1)$

Because each of the input colors is pre-multiplied by its alpha, and we are adding contributions from nonoverlapping areas, the sum will be effectively premultiplied by the alpha value of the composite just computed.
The pleasant result that the color channels are handled with the same computation as alpha can be traced back to our decision to store pre-multiplied RGBA quadruples.
Thus the problem is reduced to finding a table of fractions F A and F B which indicate the extent of contribution of A and B, plugging these values into equation 1 for both the color and the alpha components.
By our assumptions above, the fractions are quickly determined by examining the pixel diagram included in the table of operations.
Those fractions are listed in the $F_A$ and $F_B$ columns of the table. For example, in the A over B case, picture A survives everywhere while picture B survives only outside picture A, so the corresponding fractions are 1 and $(1-\alpha_A)$. Substituting into equation 1, we find

(数式) $c_O = c_A \times 1 + c_B \times (1-\alpha_A)$

This is almost the well used linear interpolation of foreground F with background B

(数式) $\hat{B} = F \times \alpha + B \times (1-\alpha)$,

except that our foreground is pre-multiplied by alpha.
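
（訳注）式 (1) に基づく合成計算の最小限のスケッチである。4.2 節の四つ組から 4.1 節の仮定に従って割合 F_A, F_B を導き、色とアルファの全成分に同じ式を適用する。fractions や composite といった関数名、タプルによるピクセル表現は説明用の仮のものである。

```python
def fractions(quad, aA, aB):
    """部分領域 (0, A, B, AB) での寄与を表す四つ組から F_A, F_B を求める。
    F_A は A のマットのうち合成結果に残る割合で、A のみの領域 (A の中の 1-aB)
    と重なり領域 (A の中の aB) のうち A が選ばれている部分を足したもの。"""
    _, in_A, in_B, in_AB = quad
    FA = (1 - aB) * (in_A == "A") + aB * (in_AB == "A")
    FB = (1 - aA) * (in_B == "B") + aA * (in_AB == "B")
    return FA, FB

def composite(ca, cb, quad):
    """式 (1): 各成分について c_O = c_A F_A + c_B F_B を計算する。
    ca, cb は事前乗算済みの (r, g, b, a) タプルで、アルファも同じ式で扱える。"""
    FA, FB = fractions(quad, ca[3], cb[3])
    return tuple(xa * FA + xb * FB for xa, xb in zip(ca, cb))

# A over B: A は常に残り、B は A の外側 (1 - aA) でのみ残る。
OVER = (None, "A", "B", "A")

fg = (0.5, 0.0, 0.0, 0.5)   # 事前乗算済み: ピクセルの半分を覆う赤
bg = (0.0, 0.0, 1.0, 1.0)   # 画素全体を覆う青の背景
print(composite(fg, bg, OVER))   # (0.5, 0.0, 0.5, 1.0)
```

A over B の場合、fractions は F_A = 1, F_B = 1 - aA を返すので、本文で導いた $c_O = c_A \times 1 + c_B \times (1-\alpha_A)$ と一致する。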

4.4. Unary operators

(作成中)

5. Examples

6. Conclusion

7. References

8. Acknowledgment