Convolution

Carlo Tomasi

To introduce the concept of convolution, suppose that we want to determine where in the image there are vertical edges. Since an edge is an abrupt change of image intensity, we might start by computing the derivatives of an image in the horizontal direction.
Derivatives with a large magnitude, either positive or negative, are elements of vertical edges. The partial derivative of a continuous function $F(x, y)$ with respect to the "horizontal" variable $x$ is defined as the local slope of the plot of the function along the $x$ direction or, formally, by the following limit:

$$\frac{\partial F(x, y)}{\partial x} = \lim_{\Delta x \to 0} \frac{F(x + \Delta x, y) - F(x, y)}{\Delta x} .$$

An image from a digitizer is a function of a discrete variable, so we cannot take $\Delta x$ arbitrarily small: the smallest we can go is one pixel. If our unit of measure is the pixel, we have

$$\Delta x = 1$$

and a rather crude approximation to the derivative at an integer position $j = x$, $i = y$ is therefore [1]

$$\left. \frac{\partial F(x, y)}{\partial x} \right|_{x=j,\, y=i} \approx f(i, j + 1) - f(i, j) .$$

Here we assume for simplicity that the origins and axis orientations of the $x, y$ reference system and the $i, j$ system coincide. When we do edge detection, we will see that we can do much better than this as an approximation to the derivative, but this example is good enough for introducing convolution.

Here is a piece of code that computes this approximation along row i in the image:

    for (j = jstart; j <= jend; j++)
        h[i][j] = f[i][j+1] - f[i][j];

Notice, in passing, that the last value of j for which this computation is defined is the next-to-last pixel in the row, so jend must be defined appropriately.
This operation amounts to taking a little two-cell mask g with the values g[0] = 1 and g[1] = -1 in its two entries, placing the mask in turn at every position j along row i, multiplying what is under the mask by the mask entries, and adding the result. In C, we have

    for (j = jstart; j <= jend; j++)
        h[i][j] = g[0]*f[i][j+1] + g[1]*f[i][j];

This adds a little generality, because we can change the values of g without changing the code.

[1] Notice that, to conform with usual notation, the order of variables in the discrete array is switched with respect to that of the corresponding variables in the continuous function: $x, y$ are right and up, respectively, while $i, j$ are down and right. Other conventions are possible, of course.
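As a small illustration of the two-cell mask, the sketch below applies it to one row stored as a one-dimensional array. The function name mask2_row, the use of double, and the zero-based index range are assumptions made for this example and do not come from the notes.

```c
#include <assert.h>
#include <stddef.h>

/* Apply the two-cell mask g to one row f of length n:
 *   h[j] = g[0]*f[j+1] + g[1]*f[j]   for j = 0 .. n-2.
 * Returns the number of outputs written (n - 1). */
size_t mask2_row(const double *f, size_t n, const double g[2], double *h)
{
    assert(n >= 2);          /* need at least one pair of neighbors */
    size_t jend = n - 2;     /* next-to-last pixel, so that f[j+1] exists */
    for (size_t j = 0; j <= jend; j++)
        h[j] = g[0] * f[j + 1] + g[1] * f[j];
    return jend + 1;
}
```

With g = {1, -1} and the row 10, 10, 10, 50, 50, 50, the five outputs are 0, 0, 40, 0, 0: the only large value sits at the step between the third and fourth pixels, which is where the vertical edge crosses this row.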
Since we are generalizing, we might as well allow for several entries in g. For instance, we might in the future switch to a centered approximation to the derivative,

$$\left. \frac{\partial F(x, y)}{\partial x} \right|_{x=j,\, y=i} \approx f(i, j + 1) - f(i, j - 1) .$$

So now we can define for instance g[-1] = 1, g[0] = 0, and g[1] = -1 and write a general-purpose loop in view of possible future changes in our choice of g:

    for (j = jstart; j <= jend; j++) {
        h[i][j] = 0;
        for (b = bstart; b <= bend; b++)
            h[i][j] += g[b]*f[i][j-b];
    }

This is now much more general: it lets us choose which horizontal neighbors to combine and with what weights. But clearly we will soon want to also combine pixels above and below $i, j$, not only on its sides, and for the whole picture, not just one row. This is easily done:

    for (i = istart; i <= iend; i++)
        for (j = jstart; j <= jend; j++) {
            h[i][j] = 0;
            for (a = astart; a <= aend; a++)
                for (b = bstart; b <= bend; b++)
                    h[i][j] += g[a][b]*f[i-a][j-b];
        }

where now g[a][b] is a two-dimensional array.
The part within the braces is a very important operation in signal processing. The two innermost for loops just keep adding values to h[i][j], so we can express that piece of code by the following mathematical expression:

$$h(i, j) = \sum_{a=a_{\mathrm{start}}}^{a_{\mathrm{end}}} \sum_{b=b_{\mathrm{start}}}^{b_{\mathrm{end}}} g(a, b)\, f(i - a, j - b) . \tag{1}$$

This is called a convolution. Convolving a signal with a given mask $g$ is also called filtering that signal with that mask. When referred to image filtering, the mask is also called the point-spread function of the filter.
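Equation (1) can be turned into a self-contained C function along the following lines. This is only a sketch under several assumptions that the notes do not make: images are stored row-major in flat arrays, the mask is centered with half-widths ra and rb (so $a$ runs over $-r_a \ldots r_a$ and $b$ over $-r_b \ldots r_b$), and output pixels whose neighborhood would leave the image are simply set to zero. A plain C array cannot be indexed below zero, so the mask entry $g(a, b)$ is stored at an offset position. The name convolve2d is hypothetical.

```c
#include <assert.h>

/* h(i,j) = sum_{a=-ra..ra} sum_{b=-rb..rb} g(a,b) * f(i-a, j-b)
 * f, h: rows x cols images stored row-major.
 * g: (2*ra+1) x (2*rb+1) mask stored row-major; entry (a,b) lives at
 *    g[(a+ra)*(2*rb+1) + (b+rb)], which emulates negative subscripts.
 * Output pixels whose neighborhood leaves the image are set to 0
 * (a border policy assumed for this sketch). */
void convolve2d(const double *f, int rows, int cols,
                const double *g, int ra, int rb, double *h)
{
    assert(rows > 0 && cols > 0 && ra >= 0 && rb >= 0);
    int gw = 2 * rb + 1;                    /* mask width */
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++) {
            double sum = 0.0;
            int inside = (i - ra >= 0 && i + ra < rows &&
                          j - rb >= 0 && j + rb < cols);
            if (inside)
                for (int a = -ra; a <= ra; a++)
                    for (int b = -rb; b <= rb; b++)
                        sum += g[(a + ra) * gw + (b + rb)]
                             * f[(i - a) * cols + (j - b)];
            h[i * cols + j] = sum;
        }
}
```

Calling it with ra = 0, rb = 1 and the mask values 1, 0, -1 reproduces the centered horizontal difference f(i, j+1) - f(i, j-1) at every pixel that has both horizontal neighbors.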
In fact, if we let

$$f(i, j) = \delta(i, j) = \begin{cases} 1 & \text{if } i = j = 0 \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$

then the image $f$ is a single point (the 1) in a sea of zeros. When the convolution (1) is computed, we obtain

$$h(i, j) = g(i, j) .$$

In words, the single point at the origin is spread into a blob equal to the mask (interpreted as an image).

The choice of subscripts for the entries of g, in both the code and the mathematical expression, seems arbitrary at first. In fact, instead of defining g[-1] = 1, g[0] = 0, g[1] = -1, we could have written, perhaps more naturally, g[-1] = -1, g[0] = 0, g[1] = 1, and in the expressions f[i-a][j-b] and $f(i - a, j - b)$ the minus signs would be replaced by plus signs. In terms of programming, there is no difference between these two options (and others as well).
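The two sign conventions can be tried side by side on a single spike. The one-dimensional sketch below is an illustration only: the name filter1d, the sign parameter, and the choice to treat samples outside the signal as zero are assumptions of this example. The spike is placed in the middle of a short array rather than at index 0, so the output is the mask shifted to that position.

```c
#include <assert.h>

/* 1D mask with entries g(-r..r) stored in garr[0..2r]; signal f of length n.
 * sign = -1 computes sum_b g(b) * f(j-b)  (convolution, as in equation (1));
 * sign = +1 computes sum_b g(b) * f(j+b)  (the alternative with plus signs).
 * Samples of f outside 0..n-1 are treated as zero (an assumption). */
void filter1d(const double *f, int n, const double *garr, int r,
              int sign, double *h)
{
    assert(sign == -1 || sign == 1);
    for (int j = 0; j < n; j++) {
        h[j] = 0.0;
        for (int b = -r; b <= r; b++) {
            int k = j + sign * b;           /* index of the sample of f */
            if (k >= 0 && k < n)
                h[j] += garr[b + r] * f[k];
        }
    }
}
```

For f = 0, 0, 1, 0, 0 and a mask with g(-1) = 1, g(0) = 2, g(1) = 3, the minus-sign version returns 0, 1, 2, 3, 0, a copy of the mask centered on the spike. The plus-sign version returns 0, 3, 2, 1, 0, the same mask mirrored.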
Mathematically, on the other hand, the minus sign is much preferable. The first reason is that $g(i, j)$ can be interpreted, as done above, as a point-spread function. With the other choice of signs the convolution of $f = \delta$ with $g$ would yield a doubly-mirrored image $g(-i, -j)$ of the mask $g$.

Another reason for this choice of signs is that the convolution now looks like the familiar multiplication for polynomials.
In fact, consider two polynomials

$$f(z) = f_0 + f_1 z + \ldots + f_m z^m$$
$$g(z) = g_0 + g_1 z + \ldots + g_n z^n .$$

Then, the sequence of coefficients of the product

$$h(z) = h_0 + h_1 z + \ldots + h_{m+n} z^{m+n}$$

of these polynomials is the (one-variable) convolution of the sequences of their coefficients:

$$h_i = \sum_{a=a_{\mathrm{start}}}^{a_{\mathrm{end}}} g(a)\, f(i - a) , \tag{3}$$

where $g(a) = g_a$ and $f(a) = f_a$. In fact, notice that $g(a)$ multiplies $z^a$ and $f(i - a)$ multiplies $z^{i-a}$, so the power corresponding to $g(a) f(i - a)$ is $z^i$ for all values of $a$, and $h_i$ as defined by equation (3) is the sum of all the products with a term $z^i$, as required by the definition of product between two polynomials. Verify this with an example. Thus, putting a minus sign in the definition (1) of the convolution makes the latter coincide with the product of two polynomials, thereby making the convolution an even deeper and more pervasive concept in mathematics.

The interpretation of the convolution mask $g(i, j)$ as a point-spread function suggests another useful way to look at the operation of filtering. The $\delta$ function defined in (2) is a single spike of unit height at the origin.
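The link between equation (3) and polynomial multiplication can be checked numerically with a sketch like the one below. The function name poly_mul and the explicit computation of the summation limits are assumptions of this example: the notes leave $a_{\mathrm{start}}$ and $a_{\mathrm{end}}$ unspecified, and here they are chosen so that both $g_a$ and $f_{i-a}$ exist.

```c
#include <assert.h>

/* Coefficients of h(z) = f(z) * g(z).
 * f has m+1 coefficients f[0..m], g has n+1 coefficients g[0..n],
 * h receives m+n+1 coefficients: h[i] = sum_a g[a] * f[i-a],
 * where a runs over the indices for which both g[a] and f[i-a] exist. */
void poly_mul(const double *f, int m, const double *g, int n, double *h)
{
    assert(m >= 0 && n >= 0);
    for (int i = 0; i <= m + n; i++) {
        int astart = (i - m > 0) ? i - m : 0;   /* keeps i - a <= m */
        int aend   = (i < n) ? i : n;           /* keeps i - a >= 0 */
        h[i] = 0.0;
        for (int a = astart; a <= aend; a++)
            h[i] += g[a] * f[i - a];
    }
}
```

For $f(z) = 1 + 2z + 3z^2$ and $g(z) = 4 + 5z$, multiplying by hand gives $4 + 13z + 22z^2 + 15z^3$, and the function returns the coefficients 4, 13, 22, 15.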
A generic image $f(i, j)$, on the other hand, can be seen as a whole collection of spikes, one per pixel, whose height equals the image value. In formulas,

$$f(i, j) = \sum_a \sum_b f(a, b)\, \delta(i - a, j - b) ,$$

where the summations range over the entire image. This expression is the convolution of $f$ and $\delta$.
Notice that this is the same as

$$f(i, j) = \sum_a \sum_b f(i - a, j - b)\, \delta(a, b)$$

after the change of summation variables $a \to i - a$, $b \to j - b$, at least if the summation ranges are assumed to be $(-\infty, +\infty)$ [2]. But if the output for the input $\delta(i, j)$ is the point-spread function $g(i, j)$, then the output for the input $\sum_a \sum_b f(a, b)\, \delta(i - a, j - b)$ is a linear combination of point-spread functions, amplified each by one of the pixels in the image.
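The reading of filtering as a sum of shifted, scaled point-spread functions can be compared directly with the sum in equation (1). The one-dimensional sketch below is an illustration under stated assumptions: the names conv_gather and conv_scatter are hypothetical, the mask indices run from 0 to k-1, the signal is taken to be zero outside its array, and the output has n+k-1 samples.

```c
#include <assert.h>

/* Gather form: h[i] = sum_b g[b] * f[i-b], for i = 0 .. n+k-2.
 * Each output sample collects the inputs that fall under the mask. */
void conv_gather(const double *f, int n, const double *g, int k, double *h)
{
    assert(n > 0 && k > 0);
    for (int i = 0; i < n + k - 1; i++) {
        h[i] = 0.0;
        for (int b = 0; b < k; b++)
            if (i - b >= 0 && i - b < n)
                h[i] += g[b] * f[i - b];
    }
}

/* Scatter form: each sample f[a] stamps a copy of g, scaled by f[a]
 * and shifted to position a, onto the output. */
void conv_scatter(const double *f, int n, const double *g, int k, double *h)
{
    assert(n > 0 && k > 0);
    for (int i = 0; i < n + k - 1; i++)
        h[i] = 0.0;
    for (int a = 0; a < n; a++)
        for (int b = 0; b < k; b++)
            h[a + b] += f[a] * g[b];
}
```

For f = 1, 0, 2 and g = 1, 1, both functions produce 1, 1, 2, 2: the first and third samples each leave a two-pixel copy of the mask, scaled by 1 and by 2 respectively.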
This describes, for instance, what happens in a pinhole camera with a pinhole of nonzero radius. In fact, one point in the world spreads into a small disk on the image plane (the point-spread function, literally). Each point in the world draws a little disk onto the image, and the brightness of each disk is proportional to the brightness of the point in the world.
This results in a blurred image. In conclusion, the image formed by a pinhole camera is the convolution of the ideal (sharp) image with a pillow-case function.

The difference between the convolution defined in (1) and what happens in the pinhole camera is that the points in the world are not neatly arranged onto a rectangular grid, as are pixels in an image, but form a continuum.
Fortunately, all the concepts relative to convolution can be extended to continuous functions as well. In analogy with equation (1), we define the convolution between two continuous functions $f(x, y)$ and $g(x, y)$ as the following double integral:

$$h(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(a, b)\, f(x - a, y - b)\, da\, db .$$

The blurred image produced by the pinhole camera is then the convolution of the ideally sharp image $f(x, y)$ with the pillow-case function

$$g(x, y) = \begin{cases} 1 & \text{if } \sqrt{x^2 + y^2} \le r \\ 0 & \text{otherwise,} \end{cases}$$

where $r$ is the radius of the pinhole.

Convolution and filtering are key concepts in image processing, and we will encounter them throughout this class.

[2] Otherwise, they should be modified according to the change of variables.
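To connect the continuous pillow-case function back to the discrete masks used earlier, one possible approach is to sample it at integer offsets. The sketch below does only that. The name disk_mask, the row-major layout, and the half-width parameter R are assumptions of this example, and sampling at pixel centers is just one of several reasonable ways to discretize a disk.

```c
#include <assert.h>

/* Fill a (2*R+1) x (2*R+1) mask, stored row-major, with samples of
 *   g(x,y) = 1 if sqrt(x^2 + y^2) <= r, 0 otherwise,
 * taken at integer offsets x, y in -R..R.
 * Returns the number of entries equal to 1. */
int disk_mask(double r, int R, double *g)
{
    assert(r >= 0.0 && R >= 0);
    int w = 2 * R + 1, count = 0;
    for (int y = -R; y <= R; y++)
        for (int x = -R; x <= R; x++) {
            /* compare squared distances, which avoids calling sqrt */
            int in = (double)(x * x + y * y) <= r * r;
            g[(y + R) * w + (x + R)] = in ? 1.0 : 0.0;
            count += in;
        }
    return count;
}
```

With r = 1 and R = 1 the mask is a plus-shaped pattern of five ones (the center and its four axis neighbors) with zeros in the corners. If the blurred image should keep the overall brightness of the sharp one, the entries could be divided by the returned count. That normalization is an addition of this sketch and is not part of the definition above.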