Section 1.1 Metrics, Norms and Inner Products
The central notion of this course is that of a metric space. This is a concept that allows one to generalise the notion of distance, and as a consequence the notions of convergence, continuity, etc., to an abstract setting and to many examples in a variety of contexts.
Definition 1.1.
Suppose that \(X\) is a set. A function \(d \maps X \times X \to \R\) is called a metric on \(X\) if it has the following properties:
- \(d(x,y) \ge 0\) for all \(x,y \in X\text{,}\)
- \(d(x,y) = 0\) if and only if \(x = y\text{,}\)
- \(d(x,y) = d(y,x)\) for all \(x,y \in X\) (symmetry) and
- \(d(x,z) \le d(x,y) + d(y,z)\) for all \(x,y,z \in X\) (triangle inequality).
The pair \((X,d)\) is then called a metric space.
Sometimes \(d\) is called a ‘distance function’ rather than a ‘metric’. Some authors require that \(X\) is non-empty in the definition of a metric space. We will not do this, but we will also not spend too much time worrying about empty metric spaces because they are not interesting.
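Although this unit is about proofs rather than computation, the four axioms are easy to spot-check numerically. The following Python sketch is purely illustrative (it is not part of the course material, and the helper name `is_metric_on_sample` is made up for this example); passing the check on a finite sample proves nothing, but a failure exhibits a concrete counterexample.

```python
import itertools

def is_metric_on_sample(d, points, tol=1e-12):
    """Check the four metric axioms for d on a finite sample of points.

    Passing this check does not prove d is a metric, but a failure
    gives a concrete counterexample.
    """
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < -tol:                      # non-negativity
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # symmetry
            return False
        if (x == y) != (abs(d(x, y)) <= tol):   # d(x,y) = 0 iff x = y
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:   # triangle inequality
            return False
    return True

# The absolute-value metric on a sample of reals passes the check:
sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
print(is_metric_on_sample(lambda x, y: abs(x - y), sample))    # True
# d(x,y) = (x-y)^2 fails the triangle inequality, e.g. with
# x = -2, y = 0, z = 3.5: 30.25 > 4 + 12.25.
print(is_metric_on_sample(lambda x, y: (x - y) ** 2, sample))  # False
```

The second example shows why the triangle inequality is a genuine restriction: squaring a metric need not give a metric.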
Definition 1.2.
A normed space is a pair \((X, \n\blank)\text{,}\) where \(X\) is a vector space over \(\R\) and \(\n\blank \maps X \to \R\) is a function satisfying the following axioms:
- \(\n x \ge 0\) for all \(x \in X\text{;}\)
- \(\n x = 0\) if and only if \(x = 0\text{;}\)
- \(\n{\alpha x} = \abs\alpha \n x\) for all \(x \in X\) and all \(\alpha \in \R\text{;}\)
- \(\|x+y\| \le \n x+ \n y\) for all \(x, y \in X\) (triangle inequality).
The function \(\n\blank\) is then called a norm on \(X\text{.}\)
Any normed space \((X, \n\blank)\) gives rise to a metric space \((X,d)\) with metric \(d\) defined as follows:
\begin{gather}
d(x, y) = \|x - y\|, \quad x, y \in X\tag{1.1}
\end{gather}
We say the metric \(d\) is induced by the norm \(\n\blank\text{.}\) However, not every metric space arises this way. (We will see examples shortly.) Note that a normed space requires a certain algebraic structure (the structure of a vector space), but a metric space need not satisfy any such conditions. This means that in a normed space, we automatically have the operations of vector addition and multiplication by real numbers (scalars). In a generic metric space, such operations make no sense, unless additional properties are given.
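The extra algebraic structure is visible even in a small numeric sketch (Python, purely illustrative): a metric induced by a norm via (1.1) automatically inherits translation invariance from the vector addition, a statement that cannot even be formulated in a bare metric space.

```python
import math
import random

def norm2(x):
    """Euclidean norm on R^n, one concrete example of a norm."""
    return math.sqrt(sum(t * t for t in x))

def d(x, y):
    """Metric induced by the norm via (1.1): d(x, y) = ||x - y||."""
    return norm2([a - b for a, b in zip(x, y)])

random.seed(0)
x, y, z = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))

# The induced metric inherits translation invariance from the vector
# space structure: d(x + z, y + z) = d(x, y).  A generic metric has no
# analogue of this, since there is no addition on a bare set.
shift = lambda v: [a + b for a, b in zip(v, z)]
print(abs(d(shift(x), shift(y)) - d(x, y)) < 1e-12)  # True
```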
Definition 1.3.
Let \(X\) be a vector space over \(\R\text{.}\) An inner product on \(X\) is a function \(\scp\blank\blank \maps X \times X \to \R\) such that
- \(\scp{x + y}{z} = \scp xz + \scp yz\) for all \(x,y,z \in X\text{;}\)
- \(\scp{\alpha x}{y} = \alpha \scp xy\) for all \(x,y \in X\) and \(\alpha \in \R\text{;}\)
- \(\scp xy = \scp yx\) for all \(x,y \in X\text{;}\)
- \(\scp xx \ge 0\) for all \(x \in X\text{,}\) with equality if and only if \(x = 0\text{.}\)
The pair \((X,\scp\blank\blank)\) is then called an inner product space.
You know from MA20216 (Algebra 2A) that for any inner product space \((X,\scp\blank\blank)\text{,}\) we have an induced norm \(\n\blank\) on \(X\text{,}\) defined by
\begin{gather}
\n x = \sqrt{\scp xx}, \quad x \in X\text{.}\tag{1.2}
\end{gather}
Moreover, we have the following important lemma (which has been proved in MA20216).
Lemma 1.4. Cauchy–Schwarz inequality.
Let \((X, \scp\blank\blank)\) be an inner product space. Then
\begin{equation*}
\left|\scp xy\right| \le \n x \n y
\end{equation*}
for all \(x, y \in X\text{.}\)
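A quick numeric spot-check (Python, purely illustrative and not a proof) of the Cauchy–Schwarz inequality for the Euclidean inner product on \(\R^4\):

```python
import random

def scp(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Induced norm (1.2): ||x|| = sqrt(<x, x>)."""
    return scp(x, x) ** 0.5

# Spot-check |<x,y>| <= ||x|| ||y|| on random vectors; equality holds
# exactly when x and y are linearly dependent.
random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    assert abs(scp(x, y)) <= norm(x) * norm(y) + 1e-9
print("Cauchy-Schwarz held on all samples")
```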
Example 1.5. Euclidean spaces.
Let \(n \in \N\) and define \(\scp\blank\blank \maps \R^n \times \R^n \to \R\) as follows:
\begin{equation*}
\scp xy = \sum_{i = 1}^n x_i y_i, \quad x, y \in \R^n\text{.}
\end{equation*}
This is an inner product (called the Euclidean inner product), which means that \((\R^n, \scp\blank\blank)\) is an inner product space (and also a normed space and a metric space).
Example 1.6. Discrete metric.
Let \(X\) be a set. Define \(d(x,x) = 0\) for every \(x \in X\) and \(d(x,y) = 1\) if \(x \ne y\text{.}\) Then Axioms i, ii and iii in Definition 1.1 are clearly satisfied. Axiom iv is quite easy to verify, too, and so \(d\) is a metric, called the discrete metric, on \(X\text{.}\)
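The verification of Axiom iv can be mechanised on a small set (an illustrative Python sketch, not a substitute for the short case analysis): the right side of the triangle inequality is \(0\) only when \(x = y = z\), in which case the left side is also \(0\); otherwise the right side is at least \(1\).

```python
import itertools

def discrete(x, y):
    """The discrete metric: 0 on the diagonal, 1 otherwise."""
    return 0 if x == y else 1

# Brute-force the triangle inequality on a small set.
X = ["a", "b", "c", "d"]
ok = all(discrete(x, z) <= discrete(x, y) + discrete(y, z)
         for x, y, z in itertools.product(X, repeat=3))
print(ok)  # True
```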
Example 1.7. Supremum norm.
For a given set \(S\text{,}\) consider the space \(B(S)\text{,}\) comprising all bounded functions \(f \maps S \to \R\text{.}\) (A function \(f \maps S \to \R\) is called bounded if there exists a constant \(M \ge 0\) such that \(\abs{f(x)} \le M\) for all \(x \in S\text{.}\)) This is a vector space with vector addition and scalar multiplication defined as follows: for \(f, g \in B(S)\) and \(\alpha \in \R\text{,}\) let
\begin{equation*}
(f + g)(s) = f(s) + g(s), \quad s \in S\text{,}
\end{equation*}
and
\begin{equation*}
(\alpha f)(s) = \alpha f(s), \quad s \in S\text{.}
\end{equation*}
If \(S\ne\varnothing\text{,}\) we may define \(\n\blank_{\sup}\maps B(S)\to\R\) by setting
\begin{equation*}
\n f_{\sup}= \sup_{s \in S} |f(s)|\text{.}
\end{equation*}
It is verified in Exercise 1.1.6 that \(\n\blank_{\sup}\) is a norm, called the supremum norm. Hence \((B(S),\n\blank_{\sup})\) is a normed space.
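When \(S\) is a finite set, the supremum in the definition is a maximum and \(B(S)\) is just \(\R^{\abs S}\) in disguise, so the norm can be computed directly. A minimal Python sketch (illustrative only; the representation of \(f\) as a dictionary is an arbitrary choice for this example):

```python
# For a finite set S, the supremum in the definition of the sup norm is
# a maximum, so B(S) can be identified with R^|S|.
S = ["s1", "s2", "s3"]

def sup_norm(f):
    """||f||_sup = sup over s in S of |f(s)| (a max, since S is finite)."""
    return max(abs(f(s)) for s in S)

# A function f: S -> R, represented here by a dictionary lookup.
f = {"s1": 2.0, "s2": -3.5, "s3": 1.0}.get
print(sup_norm(f))  # 3.5
```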
The triangle inequality gives rise to the following useful inequality.
Lemma 1.8. Reverse triangle inequality.
Let \((X,d)\) be a metric space. Then
\begin{equation*}
|d(x, y) - d(y, z)| \le d(x, z)
\end{equation*}
for all \(x, y, z \in X\text{.}\)
Proof.
Suppose first that \(d(x, y) \ge d(y, z)\text{.}\) Then the desired inequality is equivalent to
\begin{equation*}
d(x, y) - d(y, z) \le d(x, z)\text{,}
\end{equation*}
which is just the triangle inequality rearranged. If \(d(x,y) \lt d(y,z)\text{,}\) the same argument with the roles of \(x\) and \(z\) interchanged gives \(d(y,z) - d(x,y) \le d(z,x) = d(x,z)\text{,}\) using the symmetry of \(d\text{.}\)
Given a metric, normed, or inner product space, we may restrict the metric, norm, or inner product to a subset or vector subspace.
Definition 1.9.
Suppose that \((X,d)\) is a metric space and \(Y \subset X\) is a subset, and let \(d' = d|_{Y \times Y}\) denote the restriction of \(d\) to pairs of points in \(Y\text{.}\) Then \((Y,d')\) is a metric space and is called a metric subspace of \((X,d)\text{.}\) We call \(d'\) the induced metric.
Convention 1.10.
Since the induced metric \(d'\) in Definition 1.9 is just a restriction of \(d\text{,}\) we will often abuse notation and write \((Y,d)\text{.}\) Also, the specific symbol \(d'\) is arbitrary; we could just as well have used a different letter such as \(\rho\text{,}\) for instance, or some other symbol like \(d_Y\) to remind us of the dependence on \(Y\text{.}\) The important thing is not the symbol which is used but instead the terms ‘metric subspace’ and ‘induced metric’. The same goes for the two definitions below.
Definition 1.11.
Suppose that \((X, \n\blank)\) is a normed space and \(Y \subset X\) is a linear subspace, and let \(\n\blank'\) denote the restriction of \(\n\blank\) to points in \(Y\text{.}\) Then \((Y, \n\blank')\) is a normed space and is called a normed subspace of \((X,
\n\blank)\text{.}\)
Definition 1.12.
Suppose that \((X, \scp\blank\blank)\) is an inner product space and \(Y \subset
X\) is a linear subspace, and let \(\scp\blank\blank'\) denote the restriction of \(\scp\blank\blank\) to pairs of points in \(Y\text{.}\) Then \((Y,
\scp\blank\blank')\) is an inner product space and is called an inner product subspace of \((X, \scp\blank\blank)\text{.}\)
Example 1.13. \(C^0([a,b])\).
For \(a \lt b\text{,}\) consider the closed interval \([a,b]\text{.}\) Recall the space \(B([a,b])\) (comprising all bounded functions on \([a,b]\)) and consider the linear subspace \(C^0([a,b])\) comprising all continuous functions \(f \maps [a,b] \to \R\text{.}\) (By the Weierstrass extreme value theorem, any such function is bounded, thus we have the inclusion \(C^0([a,b]) \subset B([a,b])\text{.}\)) The restriction of the supremum norm to \(C^0([a,b])\) gives rise to a normed subspace of \(B([a,b])\text{.}\)
Definition 1.14.
Suppose that \((X, d_X)\) and \((Y, d_Y)\) are metric spaces. Then the metric space consisting of the set \(X \times Y\) and the metric \(d_{X \times Y}\) with
\begin{equation*}
d_{X \times Y}((x_1, y_1), (x_2, y_2))
= \sqrt{(d_X(x_1, x_2))^2 + (d_Y(y_1, y_2))^2}
\end{equation*}
(for \(x_1, x_2 \in X\) and \(y_1, y_2 \in Y\)) is called the product space of \((X,d_X)\) and \((Y,d_Y)\text{.}\)
You are asked in Exercise 1.1.9 to show that \(d_{X \times Y}\) is a metric on \(X \times Y\text{.}\) Analogous product constructions can also be given for normed spaces and inner product spaces. The above is not the only way to equip \(X \times Y\) with a metric, but it arises naturally when you think of a metric induced by an inner product.
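The construction in Definition 1.14 is easy to illustrate concretely (Python sketch, purely illustrative; the helper name `product_metric` is made up for this example). With \(X = Y = \R\) and the absolute-value metric on each factor, the product metric is exactly the Euclidean metric on \(\R^2\).

```python
import math

def product_metric(dX, dY):
    """Build d_{X x Y} from metrics dX and dY as in Definition 1.14."""
    def d(p, q):
        (x1, y1), (x2, y2) = p, q
        return math.sqrt(dX(x1, x2) ** 2 + dY(y1, y2) ** 2)
    return d

# With the absolute-value metric on each copy of R, the product metric
# on R x R is the Euclidean metric on R^2.
d = product_metric(lambda a, b: abs(a - b), lambda a, b: abs(a - b))
print(d((0.0, 0.0), (3.0, 4.0)))  # 5.0
```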
Definition 1.15.
Let \((X,d)\) be a metric space and \(S \subseteq X\text{.}\)
- The diameter of \(S\) is \(\diam S = \sup_{x,y \in S} d(x,y)\text{.}\)
- We say that \(S\) is bounded if \(\diam S \lt \infty\text{.}\)
- If \(S=X\) is bounded, we say that the metric space \((X,d)\) is bounded.
- If \((X,\n\blank)\) is a normed space, we say that \(S\) is bounded if \(\sup_{x \in S} \n x \lt \infty\text{.}\)
Since normed spaces are also metric spaces, we should check that the second and fourth bullet points in Definition 1.15 are equivalent. This is indeed the case; the details are requested in Exercise 1.1.11.
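For a finite set the diameter is again a maximum over pairs, and one half of the equivalence requested in Exercise 1.1.11 (namely \(\diam S \le 2 \sup_{x \in S} \n x\)) can be spot-checked numerically. An illustrative Python sketch, not a proof:

```python
import itertools

def diam(S, d):
    """diam S = sup over pairs of d(x, y); a max for finite S."""
    return max(d(x, y) for x, y in itertools.product(S, repeat=2))

# In a normed space the two notions of boundedness in Definition 1.15
# agree; in particular diam S <= 2 sup ||x|| (cf. Exercise 1.1.11).
S = [(-1.0, 0.0), (2.0, 2.0), (0.0, -3.0)]
norm = lambda x: (x[0] ** 2 + x[1] ** 2) ** 0.5
d = lambda x, y: norm((x[0] - y[0], x[1] - y[1]))
print(diam(S, d) <= 2 * max(norm(x) for x in S))  # True
```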
Convention 1.16.
When the metric \(d\) is clear from context, and as things get complicated, for brevity we will often omit it and say things like “\(X\) is a metric space”. This is especially true for later chapters. In particular,
- If \(X\) is (a subset of) \(\R\) or \(\R^n\text{,}\) then, unless stated otherwise, we will assume that the relevant metric \(d\) is the Euclidean metric, and similarly use \(\n\blank\) and \(\scp\blank\blank\) to denote the Euclidean norm and inner product.
- If \(X\) is (a subset of) a space \(B(S)\) of bounded functions \(S \to \R\text{,}\) then, unless stated otherwise, we will assume the relevant metric \(d\) is the one induced by the supremum norm, \(d(f,g)=\n{f-g}_{\sup}\text{.}\)
- If \(X\) is (a subset of) a product \(Y \times Z\) where \((Y,d_Y)\) and \((Z,d_Z)\) are metric spaces, then, unless stated otherwise, we will assume the relevant metric is the one \(d_{Y \times Z}\) given in Definition 1.14 above.
Similarly,
- If \((X,\n\blank)\) is a normed space, then, unless stated otherwise, we also think of it as a metric space \((X,d)\) with the induced metric \(d\) from (1.1).
You are of course also free to save ink by using these conventions on your problem sheets.
Exercises
1. Iterated triangle inequality.
Let \((X,d)\) be a metric space. Show that, for any finite list \(x_1,\ldots,x_n\) of points in \(X\text{,}\)
\begin{gather*}
d(x_1,x_n) \le d(x_1,x_2) + d(x_2,x_3) + \cdots + d(x_{n-1},x_n) \text{.}
\end{gather*}
Hint.
Repeatedly apply the triangle inequality, for instance using induction.
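As a sanity check (not a proof), the iterated inequality can be verified numerically for the Euclidean metric on \(\R^2\); this Python sketch is purely illustrative:

```python
import math
import random

# Euclidean metric on R^2: the length of a path through intermediate
# points can never be shorter than the direct distance.
d = lambda x, y: math.hypot(x[0] - y[0], x[1] - y[1])

random.seed(3)
pts = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]
path_length = sum(d(p, q) for p, q in zip(pts, pts[1:]))
print(d(pts[0], pts[-1]) <= path_length + 1e-12)  # True
```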
2. Spaces with three points.
Let \(X=\{1,2,3\}\) and let \(a,b,c\gt 0\) be positive real numbers. Define a function \(d \maps X \times X \to \R\) by
\begin{align*}
d(1,1)\amp=d(2,2)=d(3,3)=0,\\
d(1,2)\amp=d(2,1)=a,\\
d(2,3)\amp=d(3,2)=b,\\
d(1,3)\amp=d(3,1)=c\text{.}
\end{align*}
Find a necessary and sufficient condition on \(a,b,c\) for \(d\) to be a metric.
Hint.
The main thing to check is the triangle inequality. Since \(X\) only has three elements, this can be done by brute force.
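Without giving the answer away, the brute force suggested in the hint is easy to mechanise. The following illustrative Python sketch (the helper name is made up) checks the triangle inequality over all 27 triples for given positive \(a, b, c\); positivity and symmetry are built into the table.

```python
import itertools

def is_metric_3pt(a, b, c):
    """Brute-force triangle-inequality check for the three-point space."""
    d = {(1, 1): 0, (2, 2): 0, (3, 3): 0,
         (1, 2): a, (2, 1): a,
         (2, 3): b, (3, 2): b,
         (1, 3): c, (3, 1): c}
    return all(d[(x, z)] <= d[(x, y)] + d[(y, z)]
               for x, y, z in itertools.product([1, 2, 3], repeat=3))

print(is_metric_3pt(1, 1, 1))  # True: this is the discrete metric
print(is_metric_3pt(1, 1, 3))  # False: d(1,3) = 3 > 1 + 1
```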
3. Are these metric spaces?
Do the following definitions of a set \(X\) and a function \(d \maps X
\times X \to \R\) give rise to a metric space \((X,d)\text{?}\)
(a)
Let \(X = \R^2\) and \(d(x, y) = \max\set{n \in \N \cup \{0\}}{n \le
\n{x-y}}\) for \(x, y \in \R^2\) (so \(\n{x-y}\) is rounded down to the next integer).
(b)
Let \(X = \R^2\) and \(d(x,y) = \min\set{n \in \N \cup \{0\}}{n \ge \n{x-y}}\) for \(x, y \in \R^2\) (so \(\n{x-y}\) is rounded up to the next integer).
(c)
Let \(X\) be the set of all finite subsets of \(\N\) and \(d(A,B) =
|A \setminus B| + |B \setminus A|\) for \(A, B \in X\text{,}\) where \(\abs
S\) denotes the number of elements of a set \(S \subset \N\text{.}\)
4. The inner product is not the metric.
Consider the Euclidean inner product space \((\R^2,\scp\blank\blank)\text{,}\) with \(\n\blank\) and \(d\) the induced norm and metric.
(a)
Find a pair of points \(x,y \in \R^2\) with \(\scp xy \ne d(x,y)\text{.}\)
(b)
Find a pair of points \(x,y \in \R^2\) with \(\scp xy = d(x,y)\text{.}\)
5. (PS2) Some norms on \(\R^2\).
We saw in Example 1.5 that the mapping \(\n\blank_2 \maps
\R^2 \to \R\) defined by \(\n x_2 = \sqrt{x_1^2 + x_2^2}\) is a norm on the vector space \(\R^2\text{.}\) In this exercise we consider the alternative norms \(\n\blank_1\) and \(\n\blank_\infty\) defined by
\begin{align*}
\n x_1 \amp = \abs{x_1} + \abs{x_2}, \\
\n x_\infty \amp = \max\{\abs{x_1},\abs{x_2}\}
\end{align*}
for all \(x \in \R^2\text{.}\)
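A numeric spot-check of the chain of inequalities in part (c) can be a useful sanity test before attempting the proofs. This Python sketch is purely illustrative and of course proves nothing:

```python
import random

def norm1(x):
    """||x||_1 = |x_1| + |x_2|."""
    return abs(x[0]) + abs(x[1])

def norm_inf(x):
    """||x||_inf = max(|x_1|, |x_2|)."""
    return max(abs(x[0]), abs(x[1]))

def norm2(x):
    """Euclidean norm from Example 1.5."""
    return (x[0] ** 2 + x[1] ** 2) ** 0.5

# Spot-check the chain from part (c) on random points:
# ||x||_inf <= ||x||_2 <= ||x||_1 <= sqrt(2)||x||_2 <= 2||x||_inf.
random.seed(2)
for _ in range(1000):
    x = (random.uniform(-10, 10), random.uniform(-10, 10))
    chain = [norm_inf(x), norm2(x), norm1(x),
             2 ** 0.5 * norm2(x), 2 * norm_inf(x)]
    assert all(a <= b + 1e-9 for a, b in zip(chain, chain[1:]))
print("chain held on all samples")
```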
(a)
Show that \(\n\blank_1\) satisfies the triangle inequality, and conclude that \((\R^2,\n\blank_1)\) is a normed space. Sketch the set \(\set{x \in \R^2}{\n x_1 = 1}\text{.}\)
Hint.
Use the triangle inequality for \(\abs\blank\) and regroup terms.
Solution.
Let \(x, y \in \R^2\text{.}\) Then using the triangle inequality for \(\R\) we have
\begin{align*}
\n{x+y}_1
\amp = \abs{x_1+y_1} + \abs{x_2+y_2}\\
\amp \le (\abs{x_1} + \abs{y_1}) + (\abs{x_2}+\abs{y_2})\\
\amp = (\abs{x_1} + \abs{x_2}) + (\abs{y_1}+\abs{y_2})\\
\amp = \n x_1 + \n y_1\text{.}
\end{align*}
The remaining three axioms are clear, and so we conclude that \(\n\blank_1\) is a norm on \(\R^2\text{.}\)
The set \(\set{x \in \R^2}{\n x_1 = 1}\) is a diamond with vertices at \((\pm 1,0)\) and \((0,\pm 1)\text{.}\)
Comment.
The official solutions are perhaps a bit glib when they say that the remaining three axioms are clear. Many students quite reasonably showed these as well. Remember that for scalar multiplication the axiom is \(\n{\alpha x}_1 = \abs \alpha \n x_1\text{,}\) with absolute values around the scalar \(\alpha\text{!}\)
(b)
Show that \(\n\blank_\infty\) satisfies the triangle inequality, and conclude that \((\R^2,\n\blank_\infty)\) is a normed space. Sketch the set \(\set{x \in \R^2}{\n x_\infty = 1}\text{.}\)
Hint.
This one is slightly trickier. In addition to the triangle inequality for \(\abs\blank\text{,}\) the obvious inequalities \(\abs{x_1}, \abs{x_2} \le \n
x_\infty\) are also useful. I would advise against breaking into different cases based on whether \(\abs{x_1}\) or \(\abs{x_2}\) is larger and so on.
Solution.
Let \(x, y \in \R^2\text{.}\) Then using the triangle inequality for \(\R\) we have
\begin{align*}
\n{x+y}_\infty
\amp = \max\{\abs{x_1+y_1},\abs{x_2+y_2}\}\\
\amp \le \max\{\abs{x_1}+\abs{y_1},\abs{x_2}+\abs{y_2}\}\\
\amp \le \max\{\n x_\infty+ \n y_\infty,\n x_\infty+\n y_\infty\}\\
\amp = \n x_\infty + \n y_\infty\text{.}
\end{align*}
The remaining three axioms are clear, and so we conclude that \(\n\blank_\infty\) is a norm on \(\R^2\text{.}\)
The set \(\set{x \in \R^2}{\n x_\infty = 1}\) is a square with vertices at the four points \((\pm 1, \pm 1)\text{.}\)
Comment 1.
Several students instead sketched \(\set{x \in \R^2}{\n x_\infty \le 1}\text{,}\) which is a different set.
Comment 2.
The official solutions are perhaps a bit glib when they say that the remaining three axioms are clear. Many students quite reasonably showed these as well. Remember that for scalar multiplication the axiom is \(\n{\alpha x}_\infty = \abs \alpha \n x_\infty\text{,}\) with absolute values around the scalar \(\alpha\text{!}\)
(c)
Show that, for any \(x \in \R^2\text{,}\) we have the four inequalities
\begin{gather}
\n x_\infty \le \n x_2 \le \n x_1 \le \sqrt 2 \n x_2 \le 2 \n x_\infty \text{.}\tag{✶}
\end{gather}
Hint 1.
It may be helpful to look back to your sketches from the previous two parts.
Hint 2.
For the third inequality, interpret \(\n x_1\) as the inner product of the vectors \((\abs{x_1},\abs{x_2})\) and \((1,1)\text{,}\) and use the Cauchy–Schwarz inequality.
Solution.
To see the first inequality, we estimate
\begin{align*}
\n x_\infty
\amp =
\max\{\abs{x_1}, \abs{x_2}\}\\
\amp =
\max\{\sqrt{x_1^2}, \sqrt{x_2^2}\}\\
\amp \le
\max\{\sqrt{x_1^2+x_2^2}, \sqrt{x_2^2+x_1^2}\}\\
\amp = \n x_2\text{.}
\end{align*}
For the second inequality, we estimate the square of the left hand side,
\begin{align*}
\n x_2^2
\amp = \abs{x_1}^2 + \abs{x_2}^2\\
\amp \le \abs{x_1}^2 + \abs{x_2}^2 + 2 \abs{x_1} \abs{x_2}\\
\amp = (\abs{x_1} + \abs{x_2})^2 \\
\amp = \n x_1^2 \text{.}
\end{align*}
(Alternatively, we could write \(x = (x_1,0) + (0,x_2)\) and use the triangle inequality for \(\n\blank_2\text{.}\)) For the third inequality, we follow the hint and write
\begin{equation*}
\n x_1 = \scp{ (\abs{x_1},\abs{x_2})}{ (1,1) }
\end{equation*}
where here \(\scp\blank\blank\) is the usual Euclidean inner product. Applying the Cauchy–Schwarz inequality, we conclude that
\begin{align*}
\n x_1
\amp
\le \n{ (\abs{x_1},\abs{x_2}) }_2 \n{ (1,1) }_2\\
\amp = \sqrt 2 \n x_2 \text{.}
\end{align*}
Finally, for the fourth inequality we simply observe that
\begin{align*}
\n x_2
\amp = \sqrt{\abs{x_1}^2 + \abs{x_2}^2}\\
\amp \le \sqrt{ \n x_\infty^2 + \n x_\infty^2 }\\
\amp \le \sqrt 2 \n x_\infty,
\end{align*}
which yields the fourth inequality after multiplying both sides by \(\sqrt 2\text{.}\)
Comment 1.
Many students implicitly assumed that all vectors \(x \in \R^2\) they encountered had \(x_1 \gt 0\) and \(x_2 \gt 0\) so that they could drop all of the absolute values. This is a very strong assumption to make, and I didn’t see any convincing arguments that it could be made ‘without loss of generality’.
Comment 2.
Several students proved the weaker inequality \(\sqrt 2 \n x_2 \le 2 \sqrt
2 \n x_\infty\) instead of the requested inequality \(\sqrt 2 \n x_2 \le 2 \n
x_\infty\text{.}\)
Comment 3.
The inequalities in (✶) can be summarised in the following diagram showing the ‘nested’ curves \(\{\n x_\infty =
1\}\text{,}\) \(\{\n x_2 = 1\}\text{,}\) \(\{\n x_1 = 1\}\text{,}\) \(\{\n x_2 =
\sqrt 2\}\) and \(\{\n x_\infty = 1/2\}\) in \(\R^2\text{.}\) The corresponding open balls in \((\R^2,\n\blank_1)\text{,}\) \((\R^2,\n\blank_2)\) and \((\R^2,\n\blank_\infty)\) are nested in a similar way.
6. (PS2) Supremum norm.
Let \(S\) be a set, and define \(B(S)\) and \(\n\blank_{\sup} \maps
B(S) \to \R\) as in Example 1.7. (You can take for granted that \(B(S)\) is a vector space.) Show that \(\n\blank_{\sup}\) is a norm on \(B(S)\text{.}\)
Solution.
Clearly \(\n f_{\sup} \ge 0\) for all \(f \in B(S)\) and \(\n
0_{\sup} = 0\text{.}\) If \(\n f_{\sup} = 0\text{,}\) then it follows that \(f(s) =
0\) for every \(s \in S\text{;}\) hence \(f = 0\text{.}\)
For \(f \in B(S)\) and \(\alpha \in \R\text{,}\)
\begin{align*}
\|\alpha f\|_{\sup}
\amp = \sup_{s \in S} |\alpha f(s)| \\
\amp = \sup_{s \in S} \big(\abs \alpha\, \abs{f(s)} \big)\\
\amp = |\alpha| \sup_{s \in S} |f(s)| \\
\amp = |\alpha|\n f_{\sup}\text{.}
\end{align*}
Finally, for \(f, g \in B(S)\text{,}\)
\begin{align*}
\|f + g\|_{\sup} \amp = \sup_{s \in S} |f(s) + g(s)|\\
\amp \le \sup_{s \in S} (|f(s)| + |g(s)|) \\
\amp \le \sup_{s \in S} |f(s)| + \sup_{s \in S} |g(s)| \\
\amp = \n f_{\sup} + \n g_{\sup}\text{,}
\end{align*}
where here we first used the triangle inequality in \(\R\) and then the fact that the supremum of a sum is less than or equal to the sum of the suprema.
Alternatively, for the last part we could have first argued that, for any \(s \in S\text{,}\)
\begin{gather*}
|f(s) + g(s)|
\le |f(s)| + |g(s)|
\le \n f_{\sup} + \n g_{\sup}\text{,}
\end{gather*}
which implies that the right hand side is an upper bound for the set \(\set{\abs{f(s)+g(s)}}{s \in S}\text{,}\) and hence in particular an upper bound on the supremum of that set. Another way to talk about this is to say that we’re “taking the supremum” of the above inequality over \(S\text{,}\) and using the fact that the supremum of a constant is a constant.
Comment 1.
As usually happens, a few students wrote
\begin{equation*}
\sup_{s \in S} (\abs{f(s)} + \abs{g(s)})
=
\sup_{s \in S} \abs{f(s)} + \sup_{s \in S}\abs{g(s)}
\end{equation*}
with equality rather than \(\le\text{.}\) This is not true. For a counterexample, consider \(S=(0,1)\text{,}\) \(f(t)=t\) and \(g(t)=1-t\text{.}\) Then \(\sup_S f + \sup_S g = 1+1=2\text{,}\) but \(\sup_S (f+g) = \sup_S 1 =
1\text{.}\) Also notice that in this example \(f\) and \(g\) do not have maxima over \(S\text{,}\) but only suprema. That is, there is no point \(s \in
S\) where \(f(s) = \sup_S f = 1\text{.}\)
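The counterexample is easy to see numerically as well (Python, purely illustrative): on a fine grid inside \((0,1)\) the suprema of \(f\) and \(g\) are each approached but not attained, while \(f + g\) is identically \(1\).

```python
# Sample S = (0,1) on a fine grid.  sup f and sup g are each 1
# (approached but never attained), while f + g is identically 1.
N = 10_000
grid = [k / N for k in range(1, N)]   # a fine grid inside (0, 1)
f = lambda t: t
g = lambda t: 1.0 - t
sum_of_sups = max(f(t) for t in grid) + max(g(t) for t in grid)
sup_of_sum = max(f(t) + g(t) for t in grid)
print(sum_of_sups)  # close to 2
print(sup_of_sum)   # close to 1
```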
One way to see why the version with \(\le\) is true is to observe that, for all \(s \in S\text{,}\) \(\abs{f(s)} \le \sup_{t \in S} \abs{f(t)}\) and similarly for \(g\text{.}\) Thus, for any \(s \in S\text{,}\) we have
\begin{align*}
\abs{f(s)} + \abs{g(s)}
\amp \le
\sup_{t \in S}\abs{f(t)} + \sup_{t \in S}\abs{g(t)}\text{.}
\end{align*}
Since the right hand side is a constant independent of \(s\text{,}\) taking the supremum of both sides we are left with
\begin{align*}
\sup_{s \in S}(\abs{f(s)} + \abs{g(s)})
\amp \le
\sup_{t \in S}\abs{f(t)} + \sup_{t \in S}\abs{g(t)}\\
\amp =
\sup_{s \in S}\abs{f(s)} + \sup_{s \in S}\abs{g(s)}\text{.}
\end{align*}
Comment 2.
I’m not sure if this actually came up this year, but I’m including it anyway in case it is helpful. What does it mean for a function \(f \in B(S)\) to be the zero function \(f=0\text{?}\) Sometimes I see students simply write “\(f(s)=0\)”, which I read as shorthand for the longer statement “\(f(s)=0\) for all \(s \in S\)”. What does it mean for \(f\) to not be the zero function, though, \(f\ne 0\text{?}\) Surely not that \(f(s) \ne 0\) for all \(s \in S\text{.}\) Rather, logically negating the statement with a ‘for all’ in it, we arrive at “there exists \(s \in S\) such that \(f(s) \ne 0\)”.
Comment 3.
To prove the triangle inequality for \(\n\blank_{\sup}\) we need both the triangle inequality for \(\abs\blank\) and an inequality for the supremum of a sum of two functions. Collapsing these into a single step by writing
\begin{gather*}
\sup_{s \in S} |f(s) + g(s)| \le \sup_{s \in S} |f(s)| + \sup_{s \in S} |g(s)|
\end{gather*}
without further explanation is probably a bit too short for a solution to this particular problem. A similar, but I think more minor, complaint could be made about the leap
\begin{align*}
\sup_{s \in S} |\alpha f(s)|
\amp = |\alpha| \sup_{s \in S} |f(s)| \text{.}
\end{align*}
That being said, now that we have proven that \(\n\blank_{\sup}\) is a norm in this problem, we are free to do both of these manipulations in a single step for the rest of the unit!
7. Calculating supremum norms.
For each of following sets \(S\) and functions \(f \maps S \to \R\text{,}\) calculate the supremum norm \(\n f_{\sup} = \sup_{s \in S} \abs{f(s)}\text{.}\)
(a)
\(S = [0,1]\text{,}\) \(f(s)=s+e^s\text{.}\)
Solution.
We calculate
\begin{align*}
\n f_{\sup}
\amp= \sup_{s \in S} \abs{f(s)}\\
\amp=\sup_{s \in [0,1]} \abs{s+e^s}\\
\amp=\sup_{s \in [0,1]} (s+e^s)\\
\amp=1+e\text{,}
\end{align*}
where here we have used the fact that \(s+e^s\) is positive and increasing on \([0,1]\text{.}\)
(b)
\(S=\R\text{,}\) \(f(s)=(s-1)/(s^2+1)\text{.}\)
Hint.
Don’t forget about the absolute values! Also, you will need to use basic calculus.
Solution.
First, we observe that \(f(s) \ge 0\) for \(s \ge 1\) and \(f(s) \le
0\) for \(s \le 1\text{.}\) Thus
\begin{align*}
\n f_{\sup}
\amp= \sup_{s \in \R} \abs{f(s)}\\
\amp= \max\left\{
\sup_{s \ge 1} f(s),\,
\sup_{s \le 1} (-f(s))\right\}\\
\amp= \max\left\{
\sup_{s \ge 1} f(s),\,
-\inf_{s \le 1} f(s)\right\}\text{.}
\end{align*}
Each of these terms can now be found using basic calculus. Indeed, we easily check that \(f(s) \to 0\) as \(s \to \pm\infty\text{,}\) and that
\begin{equation*}
f'(s) = -\frac{s^2-2s-1}{(s^2+1)^2}
\end{equation*}
so that \(f\) has critical points at \(s_\pm = 1 \pm \sqrt 2\text{.}\) From this we deduce that
\begin{equation*}
\sup_{s \ge 1} f(s)
= f(s_+) = \frac{\sqrt 2}{(\sqrt 2 + 1)^2+1}
\end{equation*}
while
\begin{equation*}
-\inf_{s \le 1} f(s)
= -f(s_-) = \frac{\sqrt 2}{(\sqrt 2 - 1)^2+1}\text{.}
\end{equation*}
Hence
\begin{align*}
\n f_{\sup}
\amp= \max\left\{
\sup_{s \ge 1} f(s),\,
-\inf_{s \le 1} f(s)\right\}\\
\amp= \max\{f(s_+),-f(s_-)\} \\
\amp =\frac{\sqrt 2}{(\sqrt 2 - 1)^2+1}\text{.}
\end{align*}
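As an informal cross-check (Python, purely illustrative), we can maximise \(\abs f\) numerically on a wide grid and compare with the value \(\sqrt 2 / ((\sqrt 2 - 1)^2 + 1)\) found by calculus:

```python
# Maximise |f| numerically on a wide grid as a sanity check on the
# value found by calculus; the maximiser s_- = 1 - sqrt(2) lies well
# inside the grid, and f -> 0 at +/- infinity.
f = lambda s: (s - 1) / (s * s + 1)
grid = [-10 + k / 1000 for k in range(20001)]   # [-10, 10], step 1e-3
numeric = max(abs(f(s)) for s in grid)
exact = 2 ** 0.5 / ((2 ** 0.5 - 1) ** 2 + 1)
print(abs(numeric - exact) < 1e-5)  # True
```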
8. (PS2) Estimating supremum norms.
For each of following sets \(S\) and functions \(f \maps S \to \R\text{,}\) do not calculate the supremum norm \(\n f_{\sup}\text{,}\) but instead find an explicit constant \(C \gt 0\) such that \(\n f_{\sup} \le C\text{.}\)
(a)
\(S = [0,1]\text{,}\) \(f(s)=e^{-s^3} \sin s\text{.}\)
Hint.
Estimate each factor separately.
Solution.
We simply estimate each factor separately,
\begin{align*}
\n f_{\sup}
\amp= \sup_{s \in [0,1]} \abs{e^{-s^3} \sin s}\\
\amp\le
\Big(\sup_{s \in [0,1]} \abs{e^{-s^3}} \Big)
\Big(\sup_{s \in [0,1]} \abs{\sin s} \Big)\\
\amp\le e^{-0^3} \cdot 1 = 1,
\end{align*}
where here we have used the fact that \(e^{-s^3}\) is positive and decreasing on \([0,1]\text{,}\) while \(\abs{\sin s} \le 1\) for all \(s \in
\R\text{.}\)
Comment 1.
There are many different ways to write this argument. One option is to first fix \(s \in S\text{,}\) and then argue that \(\abs{f(s)} \le 1\text{,}\) and then finally take a supremum.
Comment 2.
Estimating the \(\sin s\) term more carefully, one can show the bound \(\n f_{\sup} \le \sin 1 \lt 0.85\text{.}\)
Comment 3.
Note that we need to estimate \(\sup_{s \in S} \abs{f(s)}\) and not \(\sup_{s \in S} f(s)\text{.}\)
(b)
\(S=[-1,2]\text{,}\) \(f(s)=-s-s^2+s^3/2-s^5/100\text{.}\)
Hint.
Estimate each term separately. Remember that the definition of \(\n\blank_{\sup}\) involves not only a supremum but an absolute value.
Solution.
We use the triangle inequality and estimate each term in the sum separately,
\begin{align*}
\n f_{\sup}
\amp= \sup_{s \in [-1,2]}
\left|-s-s^2+\frac 12 s^3- \frac 1{100}s^5\right|\\
\amp\le \sup_{s \in [-1,2]}
\left(\abs s+ \abs s^2+\frac 12 \abs s^3+ \frac 1{100} \abs
s^5\right)\\
\amp\le
\sup_{s \in [-1,2]} \abs s
+ \sup_{s \in [-1,2]} \abs s^2
+ \frac 12 \sup_{s \in [-1,2]} \abs s^3
+ \frac 1{100} \sup_{s \in [-1,2]}\abs s^5\\
\amp=
2
+ 2^2
+ \frac 1{2} 2^3
+ \frac 1{100} 2^5\\
\amp = 10 + \frac 8{25}
\le 11\text{.}
\end{align*}
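An informal numeric check (Python, purely illustrative) confirms both that the bound \(11\) is valid and that it is far from sharp; the grid maximum of \(\abs f\) comes out at roughly \(2.33\).

```python
# Check the crude bound 11 numerically; the actual maximum of |f| on
# [-1, 2] is roughly 2.33, so the bound holds with plenty of room.
f = lambda s: -s - s ** 2 + s ** 3 / 2 - s ** 5 / 100
grid = [-1 + k / 1000 for k in range(3001)]     # [-1, 2], step 1e-3
numeric = max(abs(f(s)) for s in grid)
print(numeric <= 11)  # True: the bound is valid, if far from sharp
```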
Comment 1.
As with the previous part of this question, this argument can be written in many ways. One option is to first fix \(s \in S\text{,}\) argue that \(\abs{f(s)} \le 11\text{,}\) and then finally take a supremum. Another option is to write the various suprema in terms of supremum norms.
Comment 2.
Note that \(\sup_{s \in S} f(s)\) and \(\sup_{s \in S} \abs{f(s)}\) are in general different. Indeed, in this case plotting the function numerically we can see that
\begin{gather*}
\sup_{s \in S} f(s) \approx 0.208 \lt 2.328 \approx \sup_{s \in S} \abs{f(s)} \text{.}
\end{gather*}
On the other hand it is true that
\begin{equation*}
\sup_{s \in S} \abs{f(s) }
= \max\Big\{ \sup_{s \in S} f(s), -\inf_{s \in S} f(s)\Big\}\text{,}
\end{equation*}
and so one option is to estimate both \(\sup_{s \in S} f(s)\) (from above) and \(\inf_{s \in S} f(s)\) (from below).
Comment 3.
The official solution splits \(f\) into four terms and estimates each term separately. Since each term was simple enough, we could calculate their individual supremum norms exactly. This year a student obtained a much sharper bound by instead splitting \(f\) into two terms,
\begin{align*}
\n f_{\sup}
\amp= \sup_{s \in [-1,2]}
\left|-s-s^2+\frac 12 s^3- \frac 1{100}s^5\right|\\
\amp\le
\sup_{s \in [-1,2]}
\left|-s-s^2+\frac 12 s^3\right|
+ \sup_{s \in [-1,2]} \left|\frac 1{100}s^5\right|\\
\amp = \frac{10^{3/2}+26}{27} + \frac{32}{100}
\le 2.455,
\end{align*}
which is only about 5% off of the true value \(\n f_{\sup} \approx 2.328\) (calculated numerically). It is considerably more work than in the official solution, but again one can calculate the first supremum norm on the right hand side exactly using calculus: it is achieved at \(s^*
= (2+\sqrt{10})/3 \in S\text{.}\) Using this same point gives a similarly good lower bound, \(\n f_{\sup} \ge \abs{f(s^*)} \ge 2.285\text{.}\)
It is important to emphasise, though, that this sort of fancy footwork is not at all what the question is looking for. It is just asking for any upper bound on \(\n f_{\sup}\) which we can rigorously prove, not the best one.
9. (PS2) Product spaces.
Let \((X, d_X)\) and \((Y, d_Y)\) be metric spaces, and define \(d_{X
\times Y}
\maps (X \times Y) \times (X \times Y) \to \R\) as in Definition 1.14,
\begin{equation*}
d_{X \times Y}((x_1, y_1), (x_2, y_2))
= \sqrt{(d_X(x_1, x_2))^2 + (d_Y(y_1, y_2))^2}
\end{equation*}
for \(x_1, x_2 \in X\) and \(y_1,y_2 \in Y\text{.}\) Show that \((X \times Y, d_{X
\times Y})\) is a metric space.
Hint 1.
In one step it is helpful to use the Cauchy–Schwarz inequality in \(\R^2\text{,}\) which implies that \(ab + cd \le \sqrt{a^2 + c^2} \sqrt{b^2 + d^2}\) for all \(a, b, c, d \in \R\text{.}\)
Hint 2.
If you are struggling with this problem, one strategy is to break things down into stages:
- First, carefully write down what it would mean for \((X \times Y, d_{X \times Y})\) to be a metric space.
- Then use the definition of \(d_{X \times Y}\) to rephrase the axioms to be proved in terms of the metrics \(d_X\) and \(d_Y\text{.}\)
- Finally, try to prove these rephrased versions.
In my experience marking this question over the years, many students run into problems because they effectively jump straight to this last step but have made serious errors in the first two which prevent this from working.
Solution 1.
It is clear that \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) \ge 0\) for all \(x_1,x_2
\in X\) and \(y_1,y_2 \in Y\text{,}\) and \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) = 0\) if \((x_1,y_1) = (x_2,y_2)\text{.}\) If \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) = 0\text{,}\) it follows that \(d_X(x_1,x_2) = 0\) and \(d_Y(y_1,y_2) = 0\text{,}\) and thus \(x_1 = x_2\) and \(y_1 = y_2\text{.}\)
The symmetry is rather obvious, but let’s write out the details anyway. For all \(x_1,x_2 \in X\) and \(y_1,y_2 \in Y\) we have
\begin{align*}
d_{X \times Y}( (x_2,y_2), (x_1,y_1) )
\amp =
\sqrt{ (d_X(x_2,x_1))^2 + (d_Y(y_2,y_1))^2 }\\
\amp =
\sqrt{ (d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2 }\\
\amp =
d_{X \times Y}( (x_1,y_1), (x_2,y_2) ),
\end{align*}
where here in the second-to-last step we used the fact that \(d_X(x_2,x_1)=d_X(x_1,x_2)\) since \(d_X\) is a metric, and similarly for \(d_Y\text{.}\)
To prove the triangle inequality, consider \((x_1,y_1), (x_2,y_2),
(x_3,y_3)\) in \(X \times Y\text{.}\) Then
\begin{align*}
\amp (d_{X \times Y}((x_1,y_1),(x_3,y_3)))^2 \\
\amp \qquad = (d_X(x_1,x_3))^2 + (d_Y(y_1,y_3))^2 \\
\amp \qquad \le (d_X(x_1,x_2) + d_X(x_2,x_3))^2 + (d_Y(y_1,y_2) + d_Y(y_2,y_3))^2\\
\amp \qquad = (d_X(x_1,x_2))^2 + 2d_X(x_1,x_2)d_X(x_2,x_3) + (d_X(x_2,x_3))^2\\
\amp \qquad \quad + (d_Y(y_1,y_2))^2 + 2d_Y(y_1,y_2) d_Y(y_2,y_3) + (d_Y(y_2,y_3))^2\text{.}
\end{align*}
We use the Cauchy–Schwarz inequality in \(\R^2\) as indicated in the hint with \(a = d_X(x_1,x_2)\text{,}\) \(b = d_X(x_2,x_3)\text{,}\) \(c = d_Y(y_1,y_2)\) and \(d = d_Y(y_2,y_3)\text{.}\) This yields
\begin{align*}
\amp \left(d_{X \times Y}((x_1,y_1),(x_3,y_3))\right)^2\\
\amp \quad \le (d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2
+ (d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2\\
\amp \quad \quad + 2\sqrt{(d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2}
\sqrt{(d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2}\\
\amp \quad = \left(\sqrt{(d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2}
+ \sqrt{(d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2}\right)^2\\
\amp \quad = \left(d_{X \times Y}((x_1,y_1),(x_2,y_2))
+ d_{X \times Y}((x_2,y_2),(x_3,y_3))\right)^2\text{.}
\end{align*}
Now the triangle inequality follows.
Solution 2. Alternative proof of the triangle inequality
Another way to prove the triangle inequality is to notice that
\begin{gather*}
d_{X \times Y}\big((x_1,y_1),(x_2,y_2)\big)
= \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)\right\|
\end{gather*}
where \(\n\blank\) is the usual Euclidean norm on \(\R^2\text{.}\) Thus for any \(x_1,x_2,x_3 \in X\) and \(y_1,y_2,y_3 \in Y\) we have
\begin{align*}
\amp d_{X \times Y}\big((x_1,y_1),(x_3,y_3)\big)\\
\amp\quad
= \left\|\big(d_X(x_1,x_3),d_Y(y_1,y_3)\big)\right\|\\
\amp\quad
\le \left\|\big(d_X(x_1,x_2)+d_X(x_2,x_3),d_Y(y_1,y_2)+d_Y(y_2,y_3)\big)\right\|\\
\amp\quad
= \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)+\big(d_X(x_2,x_3),d_Y(y_2,y_3)\big)\right\|\\
\amp\quad
\le \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)\right\|+\left\|\big(d_X(x_2,x_3),d_Y(y_2,y_3)\big)\right\|\\
\amp\quad
=
d_{X \times Y}\big((x_1,y_1),(x_2,y_2)\big)
+ d_{X \times Y}\big((x_2,y_2),(x_3,y_3)\big)\text{.}
\end{align*}
Here in the first inequality we are using the triangle inequalities for \(d_X\) and \(d_Y\) as well as the fact that \(\n a \le \n b\) whenever \(a,b \in \R^2\) satisfy \(0 \le a_1 \le b_1\) and \(0 \le a_2 \le
b_2\text{.}\) The second inequality is the triangle inequality for the norm \(\n\blank\text{.}\)
Looking at the argument this way, it seems plausible that we could replace the Euclidean norm \(\n\blank\) with either of the norms \(\n\blank_\infty,\n\blank_1\) from Exercise 1.1.5 to get a different metric on \(X \times Y\text{,}\) and this is indeed the case.
Comment 1.
Sometimes students, presumably trying to mimic the structure of the triangle inequality in Definition 1.1, introduce a mysterious third metric space \((Z,d_Z)\) into the problem where instead they should have introduced a third point \((x_3,y_3) \in X \times Y\text{.}\) Since \((Z,d_Z)\) does not appear in the statement of the problem, introducing it should also mean giving a proper definition! And once you start trying to do that you will hopefully realise that this approach doesn’t quite make sense.
Comment 2.
Several students wrote that, e.g., \(d_X(x_1,x_2)=0\) forces \((x_1,x_2)=0\) or \(x_1=x_2=0\text{.}\) Since we are working in a general metric space, \(X\) is a set and not necessarily a vector space, and so these statements are nonsensical! The same goes for \(X \times Y\) and \(d_{X
\times Y}\text{.}\)
10. Normed spaces are unbounded.
Let \((X,\n\blank)\) be a normed space, and suppose that there exists a non-zero element \(x_0 \in X\text{.}\) Show that \(X\) is unbounded.
Hint.
Consider points of the form \(\alpha x_0\) where \(\alpha \in \R\text{.}\)
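Following the hint, here is a minimal numeric illustration (Python, purely illustrative; the concrete choices of norm and \(x_0\) are arbitrary): along the ray \(\alpha x_0\) the norm is \(\abs\alpha \n{x_0}\text{,}\) which eventually exceeds any bound.

```python
import math

# Concretely in (R^2, Euclidean norm): along alpha * x0 the norm is
# |alpha| * ||x0||, which exceeds any given bound for alpha large.
norm = lambda x: math.hypot(x[0], x[1])
x0 = (1.0, 2.0)                      # any non-zero element works
norms = [norm((a * x0[0], a * x0[1])) for a in (1, 10, 100)]
print(norms)                         # ratios roughly 1 : 10 : 100
```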
11. Bounded sets in normed spaces.
Let \((X,\n\blank)\) be a normed space and let \(S \subseteq X\text{.}\) Show that \(\diam S \lt \infty\) if and only if \(\sup_{x \in S} \n x \lt \infty\text{.}\)
Solution.
Suppose that \(R=\sup_{x \in S} \n x \lt \infty\text{.}\) Then for any \(x,y \in S\) we can use the triangle inequality to estimate
\begin{equation*}
d(x,y) = \n{x-y} \le \n x + \n y \le R + R = 2R\text{.}
\end{equation*}
Taking a supremum over \(x,y \in S\) yields \(\diam S \le 2R \lt \infty\text{,}\) and hence that \(S\) is bounded.
Conversely, suppose that \(\diam S \lt \infty\text{,}\) and fix some \(y \in S\text{.}\) Then for any \(x \in S\) we have
\begin{equation*}
\n{x} = \n{x-y+y} \le \n{x-y} + \n y
\le \diam S + \n y\text{.}
\end{equation*}
Taking a supremum over \(x \in S\) yields
\begin{equation*}
\sup_{x \in S} {\n x} \le \diam S + \n y \lt \infty\text{.}
\end{equation*}
Comment.
In fact, a similar result also holds for metric spaces \((X,d)\text{,}\) with essentially the same proof. As \(X\) is no longer necessarily a vector space, the role of \(0\) is instead played by some fixed element \(x_0 \in
X\text{,}\) and \(\n \blank\) is replaced by \(d(\blank,x_0)\text{.}\)