Wednesday, April 21, 2010

Video stabilization using SIFT




The first video shows the original footage with unwanted camera shake.
The second video shows the stabilized result, with the camera shake removed.




Built-in MATLAB function:
CP2TFORM Infer spatial transformation from control point pairs.
CP2TFORM takes pairs of control points and uses them to infer a
spatial transformation.


CP2TFORM requires a minimum number of control point pairs to infer a
TFORM structure of each TRANSFORMTYPE:

TRANSFORMTYPE                  MINIMUM NUMBER OF PAIRS
-------------                  -----------------------
'nonreflective similarity'     2
'similarity'                   3
'affine'                       3
'projective'                   4


http://www.mathworks.com/access/helpdesk/help/toolbox/images/cp2tform.html

Given a set of points "inp" in the first image and a matching set of
points "outp" in another image, cp2tform estimates the transformation
that maps one set of points onto the other.
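To make the "2 pairs minimum" entry in the table concrete, here is a rough sketch of fitting a nonreflective similarity by least squares in plain MATLAB, without the toolbox. It is not what cp2tform does internally; the point data, the known transform Ttrue, and the unknown vector r are made up for illustration. Each point pair gives two equations in four unknowns, so two pairs are already enough.

```matlab
% Hypothetical data: four points and their images under a known
% rotation (10 degrees), scale (1.2) and translation (5, -3).
inp = [0 0; 100 0; 100 50; 0 50];
theta = 10 * pi / 180;
sc = 1.2 * cos(theta);
ss = 1.2 * sin(theta);
Ttrue = [sc -ss 0; ss sc 0; 5 -3 1];      % row-vector convention
outw = [inp, ones(4,1)] * Ttrue;
outp = outw(:,1:2);

% Each pair gives two equations in the unknowns r = [sc; ss; tx; ty]:
%   u =  sc*x + ss*y + tx
%   v = -ss*x + sc*y + ty
x = inp(:,1);
y = inp(:,2);
n = numel(x);
M = [x,  y, ones(n,1),  zeros(n,1);
     y, -x, zeros(n,1), ones(n,1)];
r = M \ [outp(:,1); outp(:,2)];

% Put the result back into a 3-by-3 matrix and read off the scale,
% the same way the excerpt in this post does.
T = [r(1) -r(2) 0; r(2) r(1) 0; r(3) r(4) 1];
scale = sqrt(T(1,1)^2 + T(1,2)^2);
```

With exact data like this, T should match Ttrue up to rounding and scale should come out as 1.2.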

Code:

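% NOTE: this is an excerpt. The start of the loop and the initialisation
% of T and count are not shown here; T is presumably a 3-by-3 accumulator
% and count the number of transforms that were summed.

% compute the similarity transformation
% ('linear conformal' is the older name for 'nonreflective similarity')
tt = cp2tform(inp, outp, 'linear conformal');
L = tt.tdata.T;
% divide the first two columns by the scale factor
scale = sqrt(L(1,1)^2 + L(1,2)^2);
L(:,1:2) = L(:,1:2)/scale;
% sum the transforms up
T = T + L;
end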

% get the average transformation
T = T/count;

% warping by backprojection.
% inv(T) is the transform from target back to the original image
T = inv(T);

width = size(im, 2);
height = size(im, 1);
% x and y are coordinates on the warped image
[x,y] = meshgrid(1:width, 1:height);
nxy = [x(:), y(:), ones(length(x(:)), 1)] * T;
% newx and newy are the corresponding point coordinates in the original image
newx = reshape(nxy(:,1), height, width);
newy = reshape(nxy(:,2), height, width);
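The excerpt stops once newx and newy are computed. One plausible way to finish the warp is to sample the original image at those coordinates with interp2. The sketch below is an assumption about that last step, not the post's actual code: the test image, the shift-only matrix T, and the fill value 0 are made up, and T here is already the target-to-source transform (the role inv(T) plays above).

```matlab
% Hypothetical test image: a smooth ramp, im(row, col) = col + 10*row.
im = repmat(1:8, 6, 1) + 10 * repmat((1:6)', 1, 8);
% Hypothetical target-to-source transform: shift by 2 in x and 1 in y.
T = [1 0 0; 0 1 0; 2 1 1];   % row-vector convention: [x y 1] * T

width = size(im, 2);
height = size(im, 1);
% x and y are coordinates on the warped image
[x, y] = meshgrid(1:width, 1:height);
nxy = [x(:), y(:), ones(numel(x), 1)] * T;
newx = reshape(nxy(:,1), height, width);
newy = reshape(nxy(:,2), height, width);

% Sample the original image at the back-projected coordinates.
% Pixels that land outside the original image are filled with 0.
warped = interp2(double(im), newx, newy, 'linear', 0);
```

For a colour image the same interp2 call would be applied to each channel separately.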

Math:

For an ‘affine’ transformation, the parametric motion can be described by the following formulas:

u(x,y) = a1*x + a2*y + a3
v(x,y) = a4*x + a5*y + a6

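In this case there are six unknowns, a1 through a6. To solve for them, we write the equations for three point pairs as one matrix system b = A * a:

[ u1 ]   [ x1 y1 1  0  0  0 ]   [ a1 ]
[ u2 ]   [ x2 y2 1  0  0  0 ]   [ a2 ]
[ u3 ] = [ x3 y3 1  0  0  0 ] * [ a3 ]
[ v1 ]   [ 0  0  0  x1 y1 1 ]   [ a4 ]
[ v2 ]   [ 0  0  0  x2 y2 1 ]   [ a5 ]
[ v3 ]   [ 0  0  0  x3 y3 1 ]   [ a6 ]
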
Since u1 through u3 and v1 through v3 can be easily calculated from the 3 pairs of points in the two images, and (x1 through x3, y1 through y3) are already known, a1 through a6 can be solved with the formula a = (A^T * A)^-1 * A^T * b. Here a is the vector of unknowns [a1 ... a6]', A is the matrix built from the point coordinates, b is the stacked vector of u and v values, and (A^T * A)^-1 * A^T is the pseudo-inverse of A.

With these new values, imtransform() can modify the first image to correspond with the second image. xdata and ydata are variables describing the offset between the static image and the transformed image, while trans is the scaled and sheared transformed image.
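As a small illustration of the least-squares step, the sketch below builds the affine system for three made-up points and solves it two ways: with the normal-equations formula and with MATLAB's backslash operator. The point coordinates, the "true" parameters a_true, and the row ordering of A (all u equations first, then all v equations) are assumptions chosen for the example.

```matlab
% Hypothetical data: three non-collinear points and a known affine map.
xy = [10 20; 200 30; 50 180];
a_true = [1.1; 0.2; 5; -0.1; 0.9; -7];
x = xy(:,1);
y = xy(:,2);
u = a_true(1)*x + a_true(2)*y + a_true(3);
v = a_true(4)*x + a_true(5)*y + a_true(6);

% Stack the u equations on top of the v equations.
n = numel(x);
A = [x, y, ones(n,1), zeros(n,3);
     zeros(n,3), x, y, ones(n,1)];
b = [u; v];

% Normal equations, as in the formula above.
a_normal = (A' * A) \ (A' * b);
% Backslash solves the same least-squares problem directly and is
% usually the numerically safer choice.
a = A \ b;
```

With more than three pairs the same code still works, and the result is then the least-squares fit rather than an exact solution.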


For a projective transformation: [up vp wp] = [x y w] T, where

u = up / wp
v = vp / wp.


T is a 3-by-3 matrix, where all nine elements can be different.

T = [ A D G
      B E H
      C F I ]


With w = 1, the above matrix equation is equivalent to these two expressions:

u = (Ax + By + C) / (Gx + Hy + I)
v = (Dx + Ey + F) / (Gx + Hy + I)
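A quick way to check the row-vector convention is to push one point through a matrix by hand. The values in T and the point xy below are made up; only the layout T = [A D G; B E H; C F I] comes from the text above.

```matlab
% Hypothetical matrix in the layout T = [A D G; B E H; C F I].
T = [1   0   0.001;
     0   1   0;
     10  20  1];
xy = [100 50];

% [up vp wp] = [x y w] * T with w = 1
p = [xy, 1] * T;
u = p(1) / p(3);
v = p(2) / p(3);
```

Here p works out to [110 70 1.1], so u is 100 and v is 70/1.1. The third column of T is what makes the division by wp matter; with G = H = 0 and I = 1 the transform reduces to an affine one.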

Summary:

For a projective transformation:

u = (Ax + By + C)/(Gx + Hy + I)
v = (Dx + Ey + F)/(Gx + Hy + I)

Assume I = 1 and multiply both equations by the denominator:

u = [x y 1 0 0 0 -ux -uy] * [A B C D E F G H]'
v = [0 0 0 x y 1 -vx -vy] * [A B C D E F G H]'

With 4 or more correspondence points we can combine the u equations and
the v equations into one linear system to solve for [A B C D E F G H]:

[ u1  ]   [ x1  y1  1   0   0   0   -u1*x1  -u1*y1 ]   [ A ]
[ u2  ]   [ x2  y2  1   0   0   0   -u2*x2  -u2*y2 ]   [ B ]
[ u3  ]   [ x3  y3  1   0   0   0   -u3*x3  -u3*y3 ]   [ C ]
[ u4  ]   [ x4  y4  1   0   0   0   -u4*x4  -u4*y4 ]   [ D ]
[ ... ]   [ ...                                    ]   [ E ]
[ un  ] = [ xn  yn  1   0   0   0   -un*xn  -un*yn ] * [ F ]
[ v1  ]   [ 0   0   0   x1  y1  1   -v1*x1  -v1*y1 ]   [ G ]
[ v2  ]   [ 0   0   0   x2  y2  1   -v2*x2  -v2*y2 ]   [ H ]
[ v3  ]   [ 0   0   0   x3  y3  1   -v3*x3  -v3*y3 ]
[ v4  ]   [ 0   0   0   x4  y4  1   -v4*x4  -v4*y4 ]
[ ... ]   [ ...                                    ]
[ vn  ]   [ 0   0   0   xn  yn  1   -vn*xn  -vn*yn ]

Or rewriting the above matrix equation:
U = X * Tvec, where Tvec = [A B C D E F G H]'
so Tvec = X\U.
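The system above can be tried out on synthetic data. The sketch below generates five made-up correspondences from a known parameter vector, builds X and U as described, and recovers the parameters with the backslash operator. The point coordinates and Tvec_true are assumptions for the example, not values from the videos in this post.

```matlab
% Hypothetical data: five points and a known projective map (I = 1).
xy = [0 0; 100 0; 100 80; 0 80; 50 40];
Tvec_true = [1.05; 0.1; 4; -0.08; 0.95; -6; 1e-4; -2e-4];
x = xy(:,1);
y = xy(:,2);
den = Tvec_true(7)*x + Tvec_true(8)*y + 1;
u = (Tvec_true(1)*x + Tvec_true(2)*y + Tvec_true(3)) ./ den;
v = (Tvec_true(4)*x + Tvec_true(5)*y + Tvec_true(6)) ./ den;

% Build U = X * Tvec: all u equations first, then all v equations.
n = numel(x);
X = [x, y, ones(n,1), zeros(n,3), -u.*x, -u.*y;
     zeros(n,3), x, y, ones(n,1), -v.*x, -v.*y];
U = [u; v];
Tvec = X \ U;

% Reassemble the 3-by-3 matrix T = [A D G; B E H; C F I] with I = 1.
T = [Tvec(1) Tvec(4) Tvec(7);
     Tvec(2) Tvec(5) Tvec(8);
     Tvec(3) Tvec(6) 1];

% Reproject the input points as a sanity check.
p = [xy, ones(n,1)] * T;
u_fit = p(:,1) ./ p(:,3);
v_fit = p(:,2) ./ p(:,3);
```

With real, noisy matches (for example from SIFT) the fit will not be exact, and mismatched points would normally need to be rejected before solving.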

1 comment:

  1. what is the complete code?
    and how did you choose the points?
