Pin-hole camera revisited

Three coordinate systems are involved, namely camera, image and world:
  1. Camera: perspective projection.

    \includegraphics{pinhole.ps}

    $\displaystyle \left[
\begin{array}{c}
x_c \\ y_c \\ f
\end{array}\right]
=
k
\left[
\begin{array}{c}
X_c \\ Y_c \\ Z_c
\end{array}\right]
$

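    where $ k = f/Z_c$ . In homogeneous coordinates, where equality holds up to a non-zero scale factor (here $ k$ ), this can be written as

    $\displaystyle \left[ \begin{array}{c}x_c \\ y_c \\ f \end{array} \right] =
\left[ \begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array} \right]
\left[ \begin{array}{c}X_c \\ Y_c \\ Z_c \\ 1 \end{array} \right]
$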

  2. Image: (intrinsic/internal camera parameters)

    \includegraphics{intrinsic.ps}

    \begin{displaymath}
\begin{array}{ccc}
k_u x_c & = & u - u_0 \\
k_v y_c & = & v_0 - v \\
\end{array}\end{displaymath}

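    where $ k_u$ and $ k_v$ are scale factors in pixels per unit length, and $ (u_0, v_0)$ is the image point at which $ x_c = y_c = 0$ . This can be expressed as

    $\displaystyle \left[ \begin{array}{c}u \\ v \\ 1 \end{array} \right] =
\left[ \begin{array}{ccc}
k_u & 0 & u_0/f \\
0 & -k_v & v_0/f \\
0 & 0 & 1/f
\end{array} \right]
\left[ \begin{array}{c}x_c \\ y_c \\ f \end{array} \right]
=
{\bf C} \left[ \begin{array}{c}x_c \\ y_c \\ f \end{array} \right]
$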

    $ {\bf C}$ is called the camera calibration matrix and it provides the transformation between an image point and a ray in Euclidean 3-space.
  3. World: (extrinsic/external camera parameters)

    \includegraphics{external.ps}

    The Euclidean transformation between the camera and world coordinates is:

    $\displaystyle {\bf X_c} = {\bf R}{\bf X_w} + {\bf T}
$

    and is expressed projectively as:

    $\displaystyle \left[ \begin{array}{c}X_c \\ Y_c \\ Z_c \\ 1 \end{array} \right] =
\left[ \begin{array}{cc}
{\bf R} & {\bf T} \\
{\bf 0}^\top & 1
\end{array}\right]
\left[ \begin{array}{c}X_w \\ Y_w \\ Z_w \\ 1 \end{array} \right]
$

    Finally, concatenating the three matrices, we have (as an equality of homogeneous vectors, that is, up to a non-zero scale factor)

    $\displaystyle \left[ \begin{array}{c}u \\ v \\ 1 \end{array} \right] =
{\bf C}\left[ {\bf R}\mid {\bf T} \right]
\left[ \begin{array}{c}X_w \\ Y_w \\ Z_w \\ 1 \end{array} \right]
$

    which defines the $ 3 \times 4$ projection from Euclidean 3-space to an image:

    $\displaystyle {\bf x} = {\bf P_E}{\bf X} \;\;\;\;\;\; {\bf P_E} = {\bf C}\left[ {\bf R}\mid {\bf T} \right]
$
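
The chain from world point to pixel can be checked numerically. The following is a minimal sketch in plain Python, not part of the original notes. The function names (`calibration_matrix`, `projection_matrix`, `project`) and all numeric values are illustrative assumptions. The matrix ${\bf C}$ is built in the form that acts on $(x_c, y_c, f)$, and the homogeneous result is divided by its third component to remove the scale factor.

```python
def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def mat_mul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def calibration_matrix(f, k_u, k_v, u_0, v_0):
    """C maps (x_c, y_c, f) to (u, v, 1), from k_u x_c = u - u_0
    and k_v y_c = v_0 - v."""
    return [[k_u, 0.0, u_0 / f],
            [0.0, -k_v, v_0 / f],
            [0.0, 0.0, 1.0 / f]]


def projection_matrix(C, R, T):
    """P_E = C [R | T], a 3 x 4 matrix."""
    R_T = [list(R[i]) + [T[i]] for i in range(3)]
    return mat_mul(C, R_T)


def project(P, X_w):
    """Project a world point (X_w, Y_w, Z_w) to pixel coordinates (u, v)."""
    x = mat_vec(P, list(X_w) + [1.0])
    # Homogeneous vectors are equal up to scale, so normalise by the
    # third component.
    return x[0] / x[2], x[1] / x[2]


# Illustrative values only (assumptions, not taken from the notes).
f, k_u, k_v, u_0, v_0 = 2.0, 100.0, 100.0, 320.0, 240.0
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation
T = [0.0, 0.0, 0.0]                                      # no translation

P = projection_matrix(calibration_matrix(f, k_u, k_v, u_0, v_0), R, T)
u, v = project(P, (1.0, 1.0, 4.0))
print(u, v)  # 370.0 190.0
```

With ${\bf R} = {\bf I}$ and ${\bf T} = {\bf 0}$ the world and camera frames coincide, so the result can be verified by hand: $x_c = y_c = f X_c / Z_c = 0.5$, giving $u = k_u x_c + u_0 = 370$ and $v = v_0 - k_v y_c = 190$.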

Subhashis Banerjee 2008-01-20