%! TeX program = lualatex
\documentclass[a4paper,11pt]{article}
% packages
\usepackage{censor}
\StopCensoring
\usepackage{fontspec}
\setmainfont{EB Garamond}
% for tironian et fallback
% % \directlua{luaotfload.add_fallback
% % ("emojifallback",
% % {"Noto Serif:mode=harf"}
% % )}
% % \setmainfont{EB Garamond}[RawFeature={fallback=emojifallback}]
\setmonofont[Scale=MatchLowercase]{DejaVu Sans Mono}
\usepackage[a4paper,left=2cm,right=2cm,top=\dimexpr15mm+1.5\baselineskip,bottom=2cm]{geometry}
\setlength{\parindent}{0pt}
\usepackage{ulem}
\usepackage{gensymb}
\usepackage{fancyhdr} % Headers and footers
\fancyhead[R]{\normalfont \leftmark}
\fancyhead[L]{}
\pagestyle{fancy}
\usepackage{microtype} % Slightly tweak font spacing for aesthetics
\usepackage[english]{babel} % Language hyphenation and typographical rules
\usepackage{xcolor}
\definecolor{linkblue}{RGB}{0, 64, 128}
\usepackage[final, colorlinks = false, urlcolor = linkblue]{hyperref}
% \newcommand{\secref}[1]{\textbf{§~\nameref{#1}}}
\newcommand{\secref}[1]{\textbf{§\ref{#1}~\nameref{#1}}}
\usepackage{multicol}
\usepackage{amsmath}
\usepackage{changepage} % adjust margins on the fly
\usepackage{minted}
\usemintedstyle{algol_nu}
\usepackage{pgfplots}
\pgfplotsset{width=\textwidth,compat=1.9}
\usepackage{caption}
\newenvironment{code}{\captionsetup{type=listing}}{}
\captionsetup[listing]{skip=0pt}
\setlength{\abovecaptionskip}{5pt}
\setlength{\belowcaptionskip}{5pt}
\usepackage[yyyymmdd]{datetime}
\renewcommand{\dateseparator}{--}
\usepackage{enumitem}
\usepackage{titlesec}
\author{Andrew Hayes}
\begin{document}
\begin{titlepage}
\begin{center}
\hrule
\vspace*{0.6cm}
\censor{\huge \textbf{CT404}}
\vspace*{0.6cm}
\hrule
\LARGE
\vspace{0.5cm}
Graphics \& Image Processing
\vspace{0.5cm}
\hrule
\vfill
\centering
\includegraphics[width=\textwidth]{images/cover.png}
\vfill
\hrule
\begin{minipage}{0.495\textwidth}
\vspace{0.4em}
\raggedright
\normalsize
Name: \censor{Andrew Hayes} \\
E-mail: \censor{\href{mailto://a.hayes18@universityofgalway.ie}{\texttt{a.hayes18@universityofgalway.ie}}} \hfill\\
Student ID: \censor{21321503} \hfill
\end{minipage}
\begin{minipage}{0.495\textwidth}
\raggedleft
\vspace*{0.8cm}
\Large
\today
\vspace*{0.6cm}
\end{minipage}
\medskip\hrule
\end{center}
\end{titlepage}
\pagenumbering{roman}
\newpage
\tableofcontents
\newpage
\setcounter{page}{1}
\pagenumbering{arabic}
\section{Introduction}
Textbooks:
\begin{itemize}
\item Main textbook: \textit{Image Processing and Analysis} -- Stan Birchfield (ISBN: 978-1285179520).
\item \textit{Introduction to Computer Graphics} -- David J. Eck. (Available online at \url{https://math.hws.edu/graphicsbook/}).
\item \textit{Computer Graphics: Principles and Practice} -- John F. Hughes et al. (ISBN: 0-321-39952-8).
\item \textit{Computer Vision: Algorithms and Applications} -- Richard Szeliski (ISBN: 978-3-030-34371-2).
\end{itemize}
\textbf{Computer graphics} is the processing \& displaying of images of objects that exist conceptually rather than
physically with emphasis on the generation of an image from a model of the objects, illumination, etc. and the
real-time rendering of images.
Ideas from 2D graphics extend to 3D graphics.
\\\\
\textbf{Digital Image processing/analysis} is the processing \& display of images of real objects, with an emphasis
on the modification and/or analysis of the image in order to automatically or semi-automatically extract useful
information.
Image processing leads to more advanced feature extraction \& pattern recognition techniques for image analysis \&
understanding.
\subsection{Grading}
\begin{itemize}
\item Assignments: 30\%.
\item Final Exam: 70\%.
\end{itemize}
\subsubsection{Reflection on Exams}
``A lot of people give far too little detail in these questions, and/or don't address the discussion
parts -- they just give some high-level definitions and consider it done -- which isn't enough for
final year undergrad, and isn't answering the question.
More is expected in answers than just repeating what's in my slides.
The top performers demonstrate a higher level of understanding and synthesis as well as more
detail about techniques and discussion of what they do on a technical level and how they fit
together''
\subsection{Lecturer Contact Information}
\begin{multicols}{2}
\begin{itemize}
\item Dr. Nazre Batool.
\item \href{mailto://nazre.batool@universityofgalway.ie}{\texttt{nazre.batool@universityofgalway.ie}}
\item Office Hours: Thursdays 16:00 -- 17:00, CSB-2009.
\item Dr. Waqar Shahid Qureshi.
\item \href{mailto://waqarshahid.qureshi@universityofgalway.ie}{\texttt{waqarshahid.qureshi@universityofgalway.ie}}.
\item Office Hours: Thursdays 16:00 -- 17:00, CSB-3001.
\end{itemize}
\end{multicols}
\section{Introduction to 2D Graphics}
\subsection{Digital Images -- Bitmaps}
\textbf{Bitmaps} are grid-based arrays of colour or brightness (greyscale) information.
\textbf{Pixels} (\textit{picture elements}) are the cells of a bitmap.
The \textbf{depth} of a bitmap is the number of bits-per-pixel (bpp).
\subsection{Colour Encoding Schemes}
Colour is most commonly represented using the \textbf{RGB (Red, Green, Blue)} scheme, typically using 24-bit colour
with one 8-bit number representing the level of each colour channel in that pixel.
\\\\
Alternatively, images can also be represented in \textbf{greyscale}, wherein each pixel is represented with one
(typically 8-bit) brightness value (i.e. a shade of grey).
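As a brief, illustrative sketch (not from the lecture material), the Canvas2D API introduced later exposes these per-pixel values directly; assuming an HTML page with a \mintinline{html}{<canvas>} element whose id is \mintinline{javascript}{"canvas"}:
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// read the raw colour values of a single pixel from a canvas
var canvas = document.getElementById("canvas");
var ctx = canvas.getContext("2d");
// getImageData() returns an ImageData object whose .data array holds
// 8-bit R, G, B, A values, i.e. 4 bytes (32 bits) per pixel in memory
var pixel = ctx.getImageData(10, 10, 1, 1).data;
var r = pixel[0], g = pixel[1], b = pixel[2];
// a simple greyscale conversion: average the three colour channels
var grey = Math.round((r + g + b) / 3);
\end{minted}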
\subsection{The Real-Time Graphics Pipeline}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{images/real_time_graphics_pipeline.png}
\caption{The Real-Time Graphics Pipeline}
\end{figure}
\subsection{Graphics Software}
The \textbf{Graphics Processing Unit (GPU)} is a hardware unit included in modern computers to complement the CPU,
designed for digital image processing \& to accelerate computer graphics.
GPUs have internal, rapid-access memory and parallel processors for vertices \& fragments to speed up graphics
rendering.
\\\\
\textbf{OpenGL} is a 2D \& 3D graphics API, in existence since 1992, that is supported by the graphics hardware in most
computing devices today.
\textbf{WebGL} is a web-based implementation of OpenGL for use within web browsers.
\textbf{OpenGL ES}, for embedded systems such as tablets \& mobile phones, also exists.
\\\\
OpenGL was originally a client/server system, with the CPU \& application acting as a client sending commands \& data to the GPU
acting as a server.
This was later replaced by a programmable graphics interface (OpenGL 3.0) in which GPU programs (shaders) are written to be run
directly by the GPU.
OpenGL is being superseded by newer APIs such as Vulkan, Metal, \& Direct3D, and WebGL is being superseded by WebGPU.
\subsection{Graphics Formats}
\textbf{Vector graphics} are images described in terms of co-ordinate drawing operations, e.g. AutoCAD, PowerPoint, Flash,
SVG.
\textbf{SVG (Scalable Vector Graphics)} is an image specified by vectors which are scalable without losing any quality.
\\\\
\textbf{Raster graphics} are images described as pixel-based bitmaps.
File formats such as GIF, PNG, JPEG represent the image by storing colour values for each pixel.
\section{2D Vector Graphics}
\textbf{2D vector graphics} describe drawings as a series of instructions related to a 2-dimensional co-ordinate system.
Any point in this co-ordinate system can be specified using two numbers $(x, y)$:
\begin{itemize}
\item The horizontal component $x$, measuring the distance from the left-hand edge of the screen or window.
\item The vertical component $y$, measuring the distance from the bottom of the screen or window (or sometimes from the
top).
\end{itemize}
\subsection{Transformations}
\subsubsection{2D Translation}
The \textbf{translation} of a point in 2 dimensions is the movement of a point $(x,y)$ to some other point $(x', y')$.
$$
x' = x + a
$$
$$
y' = y + b
$$
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/2d_translation.png}
\caption{2D Translation of a Point}
\end{figure}
\subsubsection{2D Rotation of a \textit{Point}}
The simplest rotation of a point around the origin is given by:
$$
x' = x \cos \theta - y \sin \theta
$$
$$
y' = x \sin \theta + y \cos \theta
$$
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/2d_point_rotation.png}
\caption{2D Rotation of a Point}
\end{figure}
\subsubsection{2D Rotation of an \textit{Object}}
In vector graphics, \textbf{objects} are defined as a series of drawing operations (e.g., straight lines) performed on a set
of vertices.
To rotate a line or a more complex object, we simply apply the point-rotation equations above to the $(x,y)$ co-ordinates of each
vertex.
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{images/2d_object_rotation.png}
\caption{2D Rotation of an Object}
\end{figure}
\subsubsection{Arbitrary 2D Rotation}
In order to rotate around an arbitrary point $(a,b)$, we perform translation, then rotation, then reverse the translation.
$$
x' = a + (x - a) \cos \theta - (y - b) \sin \theta
$$
$$
y' = b + (x - a) \sin \theta + (y - b) \cos \theta
$$
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/2d_arbitrary_rotation.png}
\caption{Arbitrary 2D Rotation}
\end{figure}
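As an illustrative sketch (not from the lecture material), these equations translate directly into code; the function below applies the arbitrary-point rotation to every vertex of an object:
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// rotate each vertex {x, y} by angle theta (radians) about the point (a, b):
// translate so (a, b) is at the origin, rotate, then reverse the translation
function rotateAbout(vertices, theta, a, b) {
    var cos = Math.cos(theta), sin = Math.sin(theta);
    return vertices.map(function (v) {
        return {
            x: a + (v.x - a) * cos - (v.y - b) * sin,
            y: b + (v.x - a) * sin + (v.y - b) * cos
        };
    });
}
\end{minted}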
\subsubsection{Matrix Notation}
\textbf{Matrix notation} is commonly used for vector graphics as more complex operations are often easier in matrix format
and because several operations can be combined easily into one matrix using matrix algebra.
Rotation about $(0,0)$:
$$
\begin{bmatrix}
x' & y'
\end{bmatrix}
=
\begin{bmatrix}
x & y
\end{bmatrix}
\begin{bmatrix}
\cos \theta & \sin \theta \\
-\sin \theta & \cos \theta
\end{bmatrix}
$$
Translation:
$$
\begin{bmatrix}
x' & y' & 1
\end{bmatrix}
=
\begin{bmatrix}
x & y & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
a & b & 1
\end{bmatrix}
$$
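In the same homogeneous form, rotation about $(0,0)$ becomes a $3 \times 3$ matrix, so (for example) a rotation followed by a translation can be combined into a single matrix by multiplying the two together:
$$
\begin{bmatrix}
x' & y' & 1
\end{bmatrix}
=
\begin{bmatrix}
x & y & 1
\end{bmatrix}
\begin{bmatrix}
\cos \theta & \sin \theta & 0 \\
-\sin \theta & \cos \theta & 0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
a & b & 1
\end{bmatrix}
$$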
\subsubsection{Scaling}
\textbf{Scaling} of an object is achieved by considering each of its vertices in turn, multiplying said vertex's $x$ \& $y$
values by the scaling factor.
A scaling factor of 2 will double the size of the object, while a scaling factor of 0.5 will halve it.
It is possible to have different scaling factors for $x$ \& $y$, resulting in a \textbf{stretch}:
$$
x' = x \times s
$$
$$
y' = y \times t
$$
If the object is not centred on the origin, then scaling it will also effect a translation.
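In the same matrix notation, scaling about the origin by factors $s$ \& $t$ is:
$$
\begin{bmatrix}
x' & y' & 1
\end{bmatrix}
=
\begin{bmatrix}
x & y & 1
\end{bmatrix}
\begin{bmatrix}
s & 0 & 0 \\
0 & t & 0 \\
0 & 0 & 1
\end{bmatrix}
$$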
\subsubsection{Order of Transformations}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{images/order_of_transformations.png}
\caption{Order of Transformations}
\end{figure}
\section{2D Raster Graphics}
The raster approach to 2D graphics considers digital images to be grid-based arrays of pixels and operates on the
images at the pixel level.
\subsection{Introduction to HTML5/Canvas}
\textbf{HTML} or HyperText Markup Language is a page-description language used primarily for websites.
\textbf{HTML5} brings major updates \& improvements to the power of client-side web development.
\\\\
A \textbf{canvas} is a 2D raster graphics component in HTML5.
There is also a \textbf{canvas with 3D} (WebGL) which is a 3D graphics component that is more likely to be
hardware-accelerated but is also more complex.
\subsubsection{Canvas: Rendering Contexts}
\mintinline{html}{<canvas>} creates a fixed-size drawing surface that exposes one or more \textbf{rendering contexts}.
The \mintinline{javascript}{getContext()} method returns an object with tools (methods) for drawing.
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var ctx = canvas.getContext("2d");
ctx.fillStyle = "rgb(200,0,0)";
ctx.fillRect (10, 10, 55, 50);
ctx.fillStyle = "rgba(0, 0, 200, 0.5)";
ctx.fillRect (30, 30, 55, 50);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="150" height="150"></canvas>
</body>
</html>
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.2\textwidth]{images/canvas_rendering_contexts.png}
\caption{Rendering of the Above HTML Code}
\end{figure}
\subsubsection{Canvas2D: Primitives}
Canvas2D supports only one primitive shape: the rectangle.
All other shapes must be created by combining one or more \textit{paths}.
Fortunately, there is a collection of path-drawing functions which makes it possible to compose complex shapes.
\begin{minted}[linenos, breaklines, frame=single]{javascript}
function draw(){
var canvas = document.getElementById('canvas');
var ctx = canvas.getContext('2d');
ctx.fillRect(125,25,100,100);
ctx.clearRect(145,45,60,60);
ctx.strokeRect(150,50,50,50);
ctx.beginPath();
ctx.arc(75,75,50,0,Math.PI*2,true); // Outer circle
ctx.moveTo(110,75);
ctx.arc(75,75,35,0,Math.PI,false); // Mouth (clockwise)
ctx.moveTo(65,65);
ctx.arc(60,65,5,0,Math.PI*2,true); // Left eye
ctx.moveTo(95,65);
ctx.arc(90,65,5,0,Math.PI*2,true); // Right eye
ctx.stroke(); // renders the Path that has been built up..
}
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.3\textwidth]{images/canvas2d_primitives.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Canvas2D: \mintinline{javascript}{drawImage()}}
The example below uses an external image as the backdrop of a small line graph:
\begin{minted}[linenos, breaklines, frame=single]{javascript}
function draw() {
var ctx = document.getElementById('canvas').getContext('2d');
var img = new Image();
img.src = 'backdrop.png';
img.onload = function(){
ctx.drawImage(img,0,0);
ctx.beginPath();
ctx.moveTo(30,96);
ctx.lineTo(70,66);
ctx.lineTo(103,76);
ctx.lineTo(170,15);
ctx.stroke();
}
}
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.3\textwidth]{images/canvas2d_drawimage.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Canvas2D: Fill \& Stroke Colours}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var context = canvas.getContext('2d');
// Filled Star
context.lineWidth=3;
context.fillStyle="#CC00FF";
context.strokeStyle="#ffff00"; // NOT lineStyle!
context.beginPath();
context.moveTo(100,50);
context.lineTo(175,200);
context.lineTo(0,100);
context.lineTo(200,100);
context.lineTo(25,200);
context.lineTo(100,50);
context.fill(); // colour the interior
context.stroke(); // draw the lines
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="300" height="300"></canvas>
</body>
</html>
\end{minted}
Colours can be specified by name (\mintinline{javascript}{red}), by a string of the form
\mintinline{javascript}{rgb(r,g,b)}, or by hexadecimal colour codes \mintinline[escapeinside=||]{javascript}{|#|RRGGBB}.
\begin{figure}[H]
\centering
\includegraphics[width=0.3\textwidth]{images/canvas2d_fill_stroke.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Canvas2D: Translations}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var context = canvas.getContext('2d');
context.save(); // save the default (root) co-ord system
context.fillStyle="#CC00FF"; // purple
context.fillRect(100,0,100,100);
// translates from the origin, producing a nested co-ordinate system
context.translate(75,50);
context.fillStyle="#FFFF00"; // yellow
context.fillRect(100,0,100,100);
// transforms further, to produce another nested co-ordinate system
context.translate(75,50);
context.fillStyle="#0000FF"; // blue
context.fillRect(100,0,100,100);
context.restore(); // recover the default (root) co-ordinate system
context.translate(-75,90);
context.fillStyle="#00FF00"; // green
context.fillRect(100,0,100,100);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.3\textwidth]{images/canvas2d_translations.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Canvas2D: Order of Transformations}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var context = canvas.getContext('2d');
context.save(); // save the default (root) co-ord system
context.fillStyle="#CC00FF"; // purple
context.fillRect(0,0,100,100); // positioned with TL corner at 0,0
// translate then rotate
context.translate(100,0);
context.rotate(Math.PI/3);
context.fillStyle="#FF0000"; // red
context.fillRect(0,0,100,100); // positioned with TL corner at 0,0
// recover the root co-ord system
context.restore();
// rotate then translate
context.rotate(Math.PI/3);
context.translate(100,0);
context.fillStyle="#FFFF00"; // yellow
context.fillRect(0,0,100,100); // positioned with TL corner at 0,0
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.2\textwidth]{images/canvas2d_order_of_transformations.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Scaling}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var context = canvas.getContext('2d');
context.fillStyle="#CC00FF"; // purple
context.fillRect(0,0,100,100); // positioned with TL corner at 0,0
context.translate(150,0);
context.scale(2,1.5);
context.fillStyle="#FF0000"; // red
context.fillRect(0,0,100,100); // positioned with TL corner at 0,0
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.2\textwidth]{images/canvas2d_scaling.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\subsubsection{Canvas2D: Programmatic Graphics}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script>
function draw() {
var canvas = document.getElementById("canvas");
var context = canvas.getContext('2d');
context.translate(150,150);
for (i=0;i<15;i++) {
context.fillStyle = "rgb("+(i*255/15)+",0,0)";
context.fillRect(0,0,100,100);
context.rotate(2*Math.PI/15);
}
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\begin{figure}[H]
\centering
\includegraphics[width=0.2\textwidth]{images/canvas2d_programmatic_graphics.png}
\caption{Rendering of the Above JavaScript Code}
\end{figure}
\section{3D Co-Ordinate Systems}
In a 3D co-ordinate system, a point $P$ is referred to by three real numbers (co-ordinates): $(x,y,z)$.
The directions of $x$, $y$, \& $z$ are not universally defined but normally follow the \textbf{right-hand rule}
for axis systems.
In this case, $z$ defines the co-ordinate's distance ``out of'' the monitor and negative $z$ values go ``into''
the monitor.
\subsection{Nested Co-Ordinate Systems}
A \textbf{nested co-ordinate system} is defined as a translation relative to the world co-ordinate system.
For example, $-3.0$ units along the $x$ axis, $2.0$ units along the $y$ axis, and $2.0$ units along the $z$ axis.
\subsection{3D Transformations}
\subsubsection{Translation}
To translate a 3D point, modify each dimension separately:
$$
x' = x + a_1
$$
$$
y' = y + a_2
$$
$$
z' = z + a_3
$$
$$
\begin{bmatrix}
x' & y' & z' & 1
\end{bmatrix}
=
\begin{bmatrix}
x & y & z & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
a_1 & a_2 & a_3 & 1
\end{bmatrix}
$$
\subsubsection{Rotation About Principal Axes}
A \textbf{principal axis} is an imaginary line through the ``center of mass'' of a body around which the body
rotates.
\begin{itemize}
\item Rotation around the $x$-axis is referred to as \textbf{pitch}.
\item Rotation around the $y$-axis is referred to as \textbf{yaw}.
\item Rotation around the $z$-axis is referred to as \textbf{roll}.
\end{itemize}
\textbf{Rotation matrices} define rotations by angle $\alpha$ about the principal axes.
$$
R_x =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos \alpha & \sin \alpha \\
0 & - \sin \alpha & \cos \alpha
\end{bmatrix}
$$
To get new co-ordinates after rotation, multiply the point $\begin{bmatrix} x & y & z \end{bmatrix}$ by the
rotation matrix:
$$
\begin{bmatrix}
x' & y' & z'
\end{bmatrix}
=
\begin{bmatrix}
x & y & z
\end{bmatrix}
R_x
$$
For example, as a point rotates about the $x$-axis, its $x$ component remains unchanged.
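Following the same convention (row vector multiplied on the left), the corresponding rotation matrices about the $y$ \& $z$ axes are:
$$
R_y =
\begin{bmatrix}
\cos \alpha & 0 & -\sin \alpha \\
0 & 1 & 0 \\
\sin \alpha & 0 & \cos \alpha
\end{bmatrix}
\qquad
R_z =
\begin{bmatrix}
\cos \alpha & \sin \alpha & 0 \\
-\sin \alpha & \cos \alpha & 0 \\
0 & 0 & 1
\end{bmatrix}
$$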
\subsubsection{Rotation About Arbitrary Axes}
You can rotate about any axis, not just the principal axes.
You specify a 3D point, and the axis of rotation is defined as the line that joins the origin to this point
(e.g., a toy spinning top will rotate about the $y$-axis, defined as $(0, 1, 0)$).
You must also specify the amount to rotate by; this is measured in radians (e.g., $2\pi$ radians is $360\degree$).
\section{Graphics APIs}
\textbf{Low-level} graphics APIs are libraries of graphics functions that can be accessed from a standard
programming language.
They are typically procedural rather than descriptive, i.e. the programmer calls the graphics functions which
carry out operations immediately.
The programmer also has to write all other application code: interface, etc.
Procedural APIs are typically faster than descriptive ones.
Examples include OpenGL, DirectX, Vulkan, Java Media APIs.
Examples that run in the browser include Canvas2D, WebGL, SVG.
\\\\
\textbf{High-level} graphics APIs are ones in which the programmer describes the required graphics, animations,
interactivity, etc. and doesn't need to deal with how this will be displayed \& updated.
They are typically descriptive rather than procedural, and so are generally slower \& less flexible because they are
usually interpreted and general-purpose rather than task-specific.
Examples include VRML/X3D.
\subsection{Three.js}
\textbf{WebGL (Web Graphics Library)} is a JavaScript API for rendering interactive 2D \& 3D graphics within any
compatible web browser without the use of plug-ins.
WebGL is fully integrated with other web standards, allowing GPU-accelerated usage of physics \& image processing
effects as part of the web page canvas.
\\\\
\textbf{Three.js} is a cross-browser JavaScript library and API used to create \& display animated 3D computer
graphics in a web browser.
Three.js uses WebGL.
\begin{code}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script src="three.js"></script>
<script>
'use strict'
function draw() {
// create renderer attached to HTML Canvas object
var c = document.getElementById("canvas");
var renderer = new THREE.WebGLRenderer({ canvas: c, antialias: true });
// create the scenegraph
var scene = new THREE.Scene();
// create a camera
var fov = 75;
var aspect = 600/600;
var near = 0.1;
var far = 1000;
var camera = new THREE.PerspectiveCamera( fov, aspect, near, far );
camera.position.z = 100;
// add a light to the scene
var light = new THREE.PointLight(0xFFFF00);
light.position.set(10, 30, 25);
scene.add(light);
// add a cube to the scene
var geometry = new THREE.BoxGeometry(20, 20, 20);
var material = new THREE.MeshLambertMaterial({color: 0xfd59d7});
var cube = new THREE.Mesh(geometry, material);
scene.add(cube);
// render the scene as seen by the camera
renderer.render(scene, camera);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\caption{``Hello World'' in Three.js}
\end{code}
In Three.js, a visible object is represented as a \textbf{mesh} and is constructed from a \textit{geometry} \& a
\textit{material}.
\subsubsection{3D Primitives}
Three.js provides a range of primitive geometry as well as the functionality to implement more complex geometry at a lower
level.
See \url{https://threejs.org/manual/?q=prim#en/primitives}.
\begin{code}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script src="three.js"></script>
<script>
'use strict'
var scene;
function addGeometryAtPosition(geometry, x, y, z) {
var material = new THREE.MeshLambertMaterial({color: 0xffffff});
var mesh = new THREE.Mesh(geometry, material);
scene.add(mesh);
mesh.position.set(x,y,z);
}
function draw() {
// create renderer attached to HTML Canvas object
var c = document.getElementById("canvas");
var renderer = new THREE.WebGLRenderer({ canvas: c, antialias: true });
// create the scenegraph (global variable)
scene = new THREE.Scene();
// create a camera
var fov = 75;
var aspect = 400/600;
var near = 0.1;
var far = 1000;
var camera = new THREE.PerspectiveCamera( fov, aspect, near, far );
camera.position.z = 100;
// add a light to the scene
var light = new THREE.PointLight(0xFFFF00);
light.position.set(10, 0, 25);
scene.add(light);
// add a bunch of sample primitives to the scene
// see more here: https://threejsfundamentals.org/threejs/lessons/threejs-primitives.html
// args: width, height, depth
addGeometryAtPosition(new THREE.BoxGeometry(6,4,8), -50, 0, 0);
// args: radius, segments
addGeometryAtPosition(new THREE.CircleBufferGeometry(7, 24), -30, 0, 0);
// args: radius, height, segments
addGeometryAtPosition(new THREE.ConeBufferGeometry(6, 4, 24), -10, 0, 0);
// args: radiusTop, radiusBottom, height, radialSegments
addGeometryAtPosition(new THREE.CylinderBufferGeometry(4, 4, 8, 12), 20, 0, 0);
// arg: radius
// Polyhedrons
// (Dodecahedron is a 12-sided polyhedron, Icosahedron is 20-sided, Octahedron is 8-sided, Tetrahedron is 4-sided)
addGeometryAtPosition(new THREE.DodecahedronBufferGeometry(7), 40, 0, 0);
addGeometryAtPosition(new THREE.IcosahedronBufferGeometry(7), -50, 20, 0);
addGeometryAtPosition(new THREE.OctahedronBufferGeometry(7), -30, 20, 0);
addGeometryAtPosition(new THREE.TetrahedronBufferGeometry(7), -10, 20, 0);
// args: radius, widthSegments, heightSegments
addGeometryAtPosition(new THREE.SphereBufferGeometry(7,12,8), 20, 20, 0);
// args: radius, tubeRadius, radialSegments, tubularSegments
addGeometryAtPosition(new THREE.TorusBufferGeometry(5,2,8,24), 40, 20, 0);
// render the scene as seen by the camera
renderer.render(scene, camera);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\caption{Code Illustrating Some Primitives Provided by Three.js}
\end{code}
\subsubsection{Cameras}
3D graphics API cameras allow you to define:
\begin{itemize}
\item The camera location $(x,y,z)$.
\item The camera orientation (\sout{straight, gay} $x$ rotation, $y$ rotation, $z$ rotation).
\item The \textbf{viewing frustum} (the Field of View (FoV) \& clipping planes).
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/viewing_frustum.png}
\caption{The Viewing Frustum}
\end{figure}
\end{itemize}
In Three.js, the FoV can be set differently in the vertical \& horizontal directions via the first \& second arguments
to the constructor, \mintinline{javascript}{(fov, aspect)}.
Generally speaking, the aspect ratio should match that of the canvas width \& height to avoid the scene appearing to be
stretched.
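A minimal sketch of keeping the two in step (assuming \mintinline{javascript}{canvas} \& \mintinline{javascript}{camera} already exist, as in the earlier examples):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// match the camera's aspect ratio to the canvas to avoid stretching,
// then rebuild the projection matrix so the change takes effect
camera.aspect = canvas.width / canvas.height;
camera.updateProjectionMatrix();
\end{minted}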
\subsubsection{Lighting}
Six different types of lights are available in both Three.js \& WebGL:
\begin{itemize}
\item \textbf{Point lights:} rays emanate in all directions from a 3D point source (e.g., a lightbulb).
\item \textbf{Directional lights:} rays emanate in one direction only from infinitely far away
(similar in effect to rays from the Sun, which is very far away).
\item \textbf{Spotlights:} project a cone of light from a 3D point source aimed at a specific target point.
\item \textbf{Ambient lights:} simulate in a simplified way the lighting of an entire scene due to complex
light/surface interactions -- lights up everything in the scene regardless of position or occlusion.
\item \textbf{Hemisphere lights:} ambient lights that affect the ``ceiling'' or ``floor'' hemisphere of objects
rather than affecting them in their entirety.
\item \textbf{RectAreaLights:} emit rectangular areas of light (e.g., fluorescent light strip).
\end{itemize}
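A brief sketch constructing a few of the above light types (colours, intensities, \& positions are arbitrary illustrative values; \mintinline{javascript}{scene} is assumed to exist as in the earlier examples):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// point light: rays emanate in all directions from a position
var point = new THREE.PointLight(0xFFFFFF);
point.position.set(10, 30, 25);
scene.add(point);
// directional light: parallel rays; only the direction (position towards its target) matters
var sun = new THREE.DirectionalLight(0xFFFFEE, 0.8);
sun.position.set(0, 50, 50);
scene.add(sun);
// ambient light: applied uniformly to everything in the scene
scene.add(new THREE.AmbientLight(0x404040));
// spotlight: a cone of light aimed at a target point
var spot = new THREE.SpotLight(0xFFFFFF);
spot.position.set(0, 40, 0);
scene.add(spot);
\end{minted}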
\begin{code}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script src="three.js"></script>
<script>
'use strict'
function draw() {
// create renderer attached to HTML Canvas object
var c = document.getElementById("canvas");
var renderer = new THREE.WebGLRenderer({ canvas: c, antialias: true });
// create the scenegraph
var scene = new THREE.Scene();
// create a camera
var fov = 75;
var aspect = 600/600;
var near = 0.1;
var far = 1000;
var camera = new THREE.PerspectiveCamera( fov, aspect, near, far );
camera.position.set(0, 10, 30);
// add a light to the scene
var light = new THREE.PointLight(0xFFFFFF);
light.position.set(0, 10, 30);
scene.add(light);
// add a cylinder
// args: radiusTop, radiusBottom, height, radialSegments
var cyl = new THREE.Mesh(
new THREE.CylinderBufferGeometry(1, 1, 10, 12),
new THREE.MeshLambertMaterial({color: 0xAAAAAA}) );
scene.add(cyl);
// clone the cylinder
var cyl2 = cyl.clone();
// modify its rotation by 60 degrees around its z axis
cyl2.rotateOnAxis(new THREE.Vector3(0,0,1), Math.PI/3);
scene.add(cyl2);
// clone the cylinder again
var cyl3 = cyl.clone();
scene.add(cyl3);
// set its rotation directly using "Euler angles", to 120 degrees on z axis
cyl3.rotation.set(0,0,2*Math.PI/3);
// render the scene as seen by the camera
renderer.render(scene, camera);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\caption{Rotation Around a Local Origin in Three.js}
\end{code}
\subsubsection{Nested Co-Ordinates}
\textbf{Nested co-ordinates} help manage complexity as well as promote reusability \& simplify the transformations of
objects composed of multiple primitive shapes.
In Three.js, 3D objects have a \mintinline{javascript}{children} array;
a child can be added to an object using the method \mintinline{javascript}{.add(childObject)}, i.e. nesting the child
object's transform within the parent object.
Objects have a parent in the scene graph so when you set their transforms (translation, rotation) it's relative to that
parent's local co-ordinate system.
\begin{code}
\begin{minted}[linenos, breaklines, frame=single]{html}
<html>
<head>
<script src="three.js"></script>
<script>
'use strict'
function draw() {
// create renderer attached to HTML Canvas object
var c = document.getElementById("canvas");
var renderer = new THREE.WebGLRenderer({ canvas: c, antialias: true });
// create the scenegraph
var scene = new THREE.Scene();
// create a camera
var fov = 75;
var aspect = 600/600;
var near = 0.1;
var far = 1000;
var camera = new THREE.PerspectiveCamera( fov, aspect, near, far );
camera.position.set(0, 1.5, 6);
// add a light to the scene
var light = new THREE.PointLight(0xFFFFFF);
light.position.set(0, 10, 30);
scene.add(light);
// desk lamp base
// args: radiusTop, radiusBottom, height, radialSegments
var base = new THREE.Mesh(
new THREE.CylinderBufferGeometry(1, 1, 0.1, 12),
new THREE.MeshLambertMaterial({color: 0xAAAAAA}) );
scene.add(base);
// desk lamp first arm piece
var arm = new THREE.Mesh(
new THREE.CylinderBufferGeometry(0.1, 0.1, 3, 12),
new THREE.MeshLambertMaterial({color: 0xAAAAAA}) );
// since we want to rotate around a point other than the arm's centre,
// we can create a pivot point as the parent of the arm, position the
// arm relative to that pivot point, and apply rotation on the pivot point
var pivot = new THREE.Object3D();
// centre of rotation we want
// (in world coordinates, since pivot is not yet a child of the base)
pivot.position.set(0, 0, 0);
pivot.add(arm); // pivot is parent of arm
base.add(pivot); // base is parent of pivot
// translate arm relative to its parent, i.e. 'pivot'
arm.position.set(0, 1.5, 0);
// rotate pivot point relative to its parent, i.e. 'base'
pivot.rotateOnAxis(new THREE.Vector3(0,0,1), -Math.PI/6);
// clone a second arm piece (consisting of a pivot with a cylinder as its child)
var pivot2 = pivot.clone();
// add as a child of the 1st pivot
pivot.add(pivot2);
// rotate the 2nd pivot relative to the 1st pivot (since it's nested)
pivot2.rotation.z = Math.PI/3;
// translate the 2nd pivot relative to the 1st pivot
pivot2.position.set(0,3,0);
// TEST: we can rotate the 1st arm piece and the 2nd arm piece should stay correct
pivot.rotateOnAxis(new THREE.Vector3(0,0,1), Math.PI/12);
// TEST: we can also move the base, and everything stays correct
base.position.x -= 3;
// render the scene as seen by the camera
renderer.render(scene, camera);
}
</script>
</head>
<body onload="draw();">
<canvas id="canvas" width="600" height="600"></canvas>
</body>
</html>
\end{minted}
\caption{Partial Desk Lamp with Nested Objects}
\end{code}
The above code creates a correctly set-up hierarchy of nested objects, allowing us to:
\begin{itemize}
\item Translate the base while the two arms remain in the correct relative position.
\item Rotate the first arm while keeping the second arm in the correct position.
\end{itemize}
\subsubsection{Geometry Beyond Primitives}
In Three.js, the term ``low-level geometry'' is used to refer to geometry objects consisting of vertices, faces, \&
normals.
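A minimal sketch of building such geometry directly from vertex data (positions are illustrative; \mintinline{javascript}{scene} is assumed to exist; older Three.js releases use \mintinline{javascript}{addAttribute()} rather than \mintinline{javascript}{setAttribute()}):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// a single triangle defined by three (x, y, z) vertices
var vertices = new Float32Array([
     0,  10, 0,   // top
   -10, -10, 0,   // bottom left
    10, -10, 0    // bottom right
]);
var geometry = new THREE.BufferGeometry();
// three components (x, y, z) per vertex
geometry.setAttribute('position', new THREE.BufferAttribute(vertices, 3));
// compute normals so that lit materials shade the face correctly
geometry.computeVertexNormals();
var triangle = new THREE.Mesh(geometry, new THREE.MeshLambertMaterial({color: 0xAAAAAA}));
scene.add(triangle);
\end{minted}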
\section{Animation \& Interactivity}
\subsection{Handling the Keyboard}
Handling the keyboard involves recognising keypresses and updating the graphics in response.
\begin{code}
\inputminted[linenos, breaklines, frame=single]{html}{../materials/week3/examples/canvasWithKeyboardExample.html}
\caption{Keyboard Handling (Canvas/JavaScript)}
\end{code}
\subsection{Mouse Handling}
\begin{code}
\inputminted[linenos, breaklines, frame=single]{html}{../materials/week3/examples/canvasWithMouseExample.html}
\caption{Mouse Handling (Canvas/JavaScript)}
\end{code}
\subsection{Time-Based Animation}
Time-based animation can be achieved using \mintinline{javascript}{window.setTimeout()}, which schedules a callback that
repaints the canvas at pre-defined intervals.
\begin{code}
\inputminted[linenos, breaklines, frame=single]{html}{../materials/week3/examples/canvasAnimationExample1.html}
\caption{Time-Based Animation with \mintinline{javascript}{window.setTimeout()}}
\end{code}
However, improved smoothness can be achieved using \mintinline{javascript}{window.requestAnimationFrame()}, which schedules the
callback to run at the browser's next repaint/refresh.
\begin{code}
\inputminted[linenos, breaklines, frame=single]{html}{../materials/week3/examples/canvasAnimationExample1_withSmootherAnimation.html}
\caption{Smoother Time-Based Animation with \mintinline{javascript}{window.requestAnimationFrame()}}
\end{code}
\subsection{Raycasting}
\textbf{Raycasting} is a feature offered by 3D graphics APIs which computes a ray from a start position in a specified direction
and identifies the geometry that the ray hits.
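A minimal sketch of mouse picking with the Three.js \mintinline{javascript}{Raycaster} (assuming \mintinline{javascript}{canvas}, \mintinline{javascript}{camera}, \& \mintinline{javascript}{scene} already exist, as in the earlier examples):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
var raycaster = new THREE.Raycaster();
var mouse = new THREE.Vector2();
canvas.addEventListener('mousedown', function (event) {
    // convert the mouse position to normalised device co-ordinates (-1 to +1)
    var rect = canvas.getBoundingClientRect();
    mouse.x =  ((event.clientX - rect.left) / rect.width)  * 2 - 1;
    mouse.y = -((event.clientY - rect.top)  / rect.height) * 2 + 1;
    // compute a ray from the camera through the mouse position
    raycaster.setFromCamera(mouse, camera);
    // find all objects intersected by the ray, nearest first
    var hits = raycaster.intersectObjects(scene.children, true);
    if (hits.length > 0) {
        hits[0].object.material.color.set(0xFF0000); // highlight the nearest hit
    }
});
\end{minted}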
The following example illustrates the use of raycasting/picking and rotation/translation based on mouse selection and mouse
movement.
It also illustrates how nested co-ordinate systems have been used to make the lamp parts behave correctly.
\begin{code}
\inputminted[linenos, breaklines, frame=single]{html}{../materials/week3/examples/Threejs-20-controllable-desk-lamp.html}
\caption{Controllable Desk Lamp}
\end{code}
\subsection{Shading Algorithms}
The colour at any pixel on a polygon is determined by:
\begin{itemize}
\item The characteristics (including colour) of the surface itself.
\item Information about light sources (ambient, directional, parallel, point, or spot) and their positions relative to
the surface.
\item \textit{Diffuse} \& \textit{specular} reflections.
\end{itemize}
Classic shading algorithms include:
\begin{itemize}
\item Flat shading.
\item Smooth shading (Gouraud).
\item Normal Interpolating Shading (Phong).
\end{itemize}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{images/shading_algs.png}
\caption{Different Shading Algorithms}
\end{figure}
\subsubsection{Flat Shading}
\textbf{Flat shading} calculates the shade of each surface and applies it directly; the shade is determined by the cosine of the
angle between the incident light ray and the \textit{surface normal} (a \textbf{surface normal} is a vector perpendicular to the surface).
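In equation form, this is the standard Lambert cosine relationship (stated here for completeness):
$$
I = I_\mathrm{light} \, k_d \cos \theta = I_\mathrm{light} \, k_d \left( \mathbf{N} \cdot \mathbf{L} \right)
$$
where $\mathbf{N}$ is the unit surface normal, $\mathbf{L}$ is the unit vector towards the light, and $k_d$ is the diffuse
reflectance of the surface.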
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{images/flat_shading.png}
\caption{Flat Shading}
\end{figure}
\subsubsection{Smooth (Gouraud) Shading}
\textbf{Smooth (Gouraud) shading} calculates the shade at each vertex, and interpolates (smooths) these shades across the
surfaces.
Vertex normals are calculated by averaging the normals of the connected faces.
Interpolation is often carried out in graphics hardware, making it generally very fast.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/smooth_shading.png}
\caption{Smooth Shading}
\end{figure}
\subsubsection{Normal Interpolating (Phong) Shading}
\textbf{Normal interpolating (Phong) shading} calculates the normal at each vertex and interpolates these normals across the
surfaces.
The lighting, and therefore the shade, at each pixel is calculated individually from its unique (interpolated) surface normal.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/phong_shading.png}
\caption{Normal Interpolating (Phong) Shading}
\end{figure}
\subsection{Shading in Three.js}
In Three.js, \textbf{materials} define how objects will be shaded in the scene.
There are three different shading models to choose from:
\begin{itemize}
\item \mintinline{javascript}{MeshBasicMaterial}: none.
\item \mintinline{javascript}{MeshPhongMaterial} (with \mintinline{javascript}{flatShading = true}): flat shading.
\item \mintinline{javascript}{MeshLambertMaterial}: Gouraud shading.
\end{itemize}
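A brief sketch of selecting these shading models when constructing materials (colour values are arbitrary):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// no lighting calculations: the object is drawn in its plain colour
var basic = new THREE.MeshBasicMaterial({color: 0xfd59d7});
// flat shading: one shade calculated per face
var flat = new THREE.MeshPhongMaterial({color: 0xfd59d7, flatShading: true});
// Gouraud shading: shade calculated per vertex and interpolated
var gouraud = new THREE.MeshLambertMaterial({color: 0xfd59d7});
// per-pixel (Phong) shading from interpolated normals
var phong = new THREE.MeshPhongMaterial({color: 0xfd59d7});
\end{minted}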
\subsection{Shadows in Three.js}
Three.js supports the use of shadows although they are expensive to use.
The scene is redrawn for each shadow-casting light, and finally composed from all the results.
Games sometimes use fake ``blob shadows'' instead of proper shadows or else only let one light cast shadows to save computation.
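A minimal sketch of enabling them (the names \mintinline{javascript}{light}, \mintinline{javascript}{cube}, \& \mintinline{javascript}{floor} are illustrative; \mintinline{javascript}{renderer} is assumed to exist as in the earlier examples):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
// shadow maps are off by default because of their cost
renderer.shadowMap.enabled = true;
light.castShadow = true;    // this light renders a shadow map
cube.castShadow = true;     // this object casts shadows...
floor.receiveShadow = true; // ...and this one has shadows drawn onto it
\end{minted}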
\subsection{Reflectivity of Materials in Three.js}
There are a variety of colour settings in Three.js:
\begin{itemize}
\item \textbf{Diffuse colour} is defined by the colour of the material.
\item \textbf{Specular colour} is the colour of specular highlights (in Phong shading only).
\item \textbf{Shininess} is the strength of specular highlights (in Phong only).
\item \textbf{Emissive colour} is not affected by lighting.
\end{itemize}
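For illustration, these settings map onto Three.js material parameters roughly as follows (the values are arbitrary):
\begin{minted}[linenos, breaklines, frame=single]{javascript}
var material = new THREE.MeshPhongMaterial({
    color: 0x156289,     // diffuse colour
    specular: 0xFFFFFF,  // colour of specular highlights
    shininess: 30,       // strength of specular highlights
    emissive: 0x072534   // colour emitted regardless of lighting
});
\end{minted}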
\end{document}