
Appendix E Solutions to exercises

Here we collect all solutions to the exercises which have been posted so far. In the html version these are already accessible via links, and so this is mostly useful for the pdf version.
For reasons I don’t completely understand, the solutions to Problem Sheet 1 are an exception: They don’t show up here in any version, but do show up in the problem sheet itself.

1 Fundamental Concepts
1.1 Metrics, Norms and Inner Products

Exercises

1.1.5. (PS2) Some norms on \(\R^2\).
1.1.5.a
Solution.
Let \(x, y \in \R^2\text{.}\) Then using the triangle inequality for \(\R\) we have
\begin{align*} \n{x+y}_1 \amp = \abs{x_1+y_1} + \abs{x_2+y_2}\\ \amp \le (\abs{x_1} + \abs{y_1}) + (\abs{x_2}+\abs{y_2})\\ \amp = (\abs{x_1} + \abs{x_2}) + (\abs{y_1}+\abs{y_2})\\ \amp = \n x_1 + \n y_1\text{.} \end{align*}
The remaining three axioms are clear, and so we conclude that \(\n\blank_1\) is a norm on \(\R^2\text{.}\)
The set \(\set{x \in \R^2}{\n x_1 = 1}\) is a diamond with vertices at \((\pm 1,0)\) and \((0,\pm 1)\text{.}\)
Comment.
The official solutions are perhaps a bit glib when they say that the remaining three axioms are clear. Many students quite reasonably showed these as well. Remember that for scalar multiplication the axiom is \(\n{\alpha x}_1 = \abs \alpha \n x_1\text{,}\) with absolute values around the scalar \(\alpha\text{!}\)
1.1.5.b
Solution.
Let \(x, y \in \R^2\text{.}\) Then using the triangle inequality for \(\R\) we have
\begin{align*} \n{x+y}_\infty \amp = \max\{\abs{x_1+y_1},\abs{x_2+y_2}\}\\ \amp \le \max\{\abs{x_1}+\abs{y_1},\abs{x_2}+\abs{y_2}\}\\ \amp \le \max\{\n x_\infty+ \n y_\infty,\n x_\infty+\n y_\infty\}\\ \amp = \n x_\infty + \n y_\infty\text{.} \end{align*}
The remaining three axioms are clear, and so we conclude that \(\n\blank_\infty\) is a norm on \(\R^2\text{.}\)
The set \(\set{x \in \R^2}{\n x_\infty = 1}\) is a square with vertices at the four points \((\pm 1, \pm 1)\text{.}\)
Comment 1.
Several students instead sketched \(\set{x \in \R^2}{\n x_\infty \le 1}\text{,}\) which is a different set.
Comment 2.
The official solutions are perhaps a bit glib when they say that the remaining three axioms are clear. Many students quite reasonably showed these as well. Remember that for scalar multiplication the axiom is \(\n{\alpha x}_\infty = \abs \alpha \n x_\infty\text{,}\) with absolute values around the scalar \(\alpha\text{!}\)
1.1.5.c
Solution.
To see the first inequality, we estimate
\begin{align*} \n x_\infty \amp = \max\{\abs{x_1}, \abs{x_2}\}\\ \amp = \max\{\sqrt{x_1^2}, \sqrt{x_2^2}\}\\ \amp \le \max\{\sqrt{x_1^2+x_2^2}, \sqrt{x_2^2+x_1^2}\}\\ \amp = \n x_2\text{.} \end{align*}
For the second inequality, we estimate the square of the left hand side,
\begin{align*} \n x_2^2 \amp = \abs{x_1}^2 + \abs{x_2}^2\\ \amp \le \abs{x_1}^2 + \abs{x_2}^2 + 2 \abs{x_1} \abs{x_2}\\ \amp = (\abs{x_1} + \abs{x_2})^2 \\ \amp = \n x_1^2 \text{.} \end{align*}
(Alternatively, we could write \(x = (x_1,0) + (0,x_2)\) and use the triangle inequality for \(\n\blank_2\text{.}\)) For the third inequality, we follow the hint and write
\begin{equation*} \n x_1 = \scp{ (\abs{x_1},\abs{x_2})}{ (1,1) } \end{equation*}
where here \(\scp\blank\blank\) is the usual Euclidean inner product. Applying the Cauchy–Schwarz inequality, we conclude that
\begin{align*} \n x_1 \amp \le \n{ (\abs{x_1},\abs{x_2}) }_2 \n{ (1,1) }_2\\ \amp = \sqrt 2 \n x_2 \text{.} \end{align*}
Finally, for the fourth inequality we simply observe that
\begin{align*} \n x_2 \amp = \sqrt{\abs{x_1}^2 + \abs{x_2}^2}\\ \amp \le \sqrt{ \n x_\infty^2 + \n x_\infty^2 }\\ \amp = \sqrt 2 \n x_\infty, \end{align*}
which yields the fourth inequality after multiplying both sides by \(\sqrt 2\text{.}\)
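As a quick numerical sanity check (not something the question asks for, and certainly not a proof), one can test the whole chain of inequalities on randomly chosen vectors. A minimal Python sketch:
```python
import random

def norm1(x):    # |x_1| + |x_2|
    return abs(x[0]) + abs(x[1])

def norm2(x):    # Euclidean norm
    return (x[0]**2 + x[1]**2) ** 0.5

def norminf(x):  # max{|x_1|, |x_2|}
    return max(abs(x[0]), abs(x[1]))

eps = 1e-9  # tolerance for floating-point rounding
for _ in range(10_000):
    x = (random.uniform(-10, 10), random.uniform(-10, 10))
    assert norminf(x) <= norm2(x) + eps
    assert norm2(x) <= norm1(x) + eps
    assert norm1(x) <= 2**0.5 * norm2(x) + eps
    assert 2**0.5 * norm2(x) <= 2 * norminf(x) + eps
```
Of course random testing can only ever falsify an inequality, never prove it, but it is a cheap way to catch errors in this sort of computation.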
Comment 1.
Many students implicitly assumed that all vectors \(x \in \R^2\) they encountered had \(x_1 \gt 0\) and \(x_2 \gt 0\) so that they could drop all of the absolute values. This is a very strong assumption to make, and I didn’t see any convincing arguments that it could be made ‘without loss of generality’.
Comment 2.
Several students proved the weaker inequality \(\sqrt 2 \n x_1 \le 2 \sqrt 2 \n x_\infty\) instead of the requested inequality \(\sqrt 2 \n x_1 \le 2 \n x_\infty\text{.}\)
Comment 3.
The inequalities in (✶) can be summarised in the following diagram showing the ‘nested’ curves \(\{\n x_\infty = 1\}\text{,}\) \(\{\n x_2 = 1\}\text{,}\) \(\{\n x_1 = 1\}\text{,}\) \(\{\n x_2 = 1/\sqrt 2\}\) and \(\{\n x_\infty = 1/2\}\) in \(\R^2\text{.}\) The corresponding open balls in \((\R^2,\n\blank_1)\text{,}\) \((\R^2,\n\blank_2)\) and \((\R^2,\n\blank_\infty)\) are nested in a similar way.
[Figure: the nested curves (diamonds, squares, circles) described above.]
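For those who would like to reproduce the diagram themselves, here is a rough matplotlib sketch (the exact styling is unimportant):
```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 400)
circle = np.stack([np.cos(t), np.sin(t)])   # the circle {||x||_2 = 1}

# squares and diamonds as closed polygons
square = np.array([[1, 1], [-1, 1], [-1, -1], [1, -1], [1, 1]]).T  # {||x||_inf = 1}
diamond = np.array([[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0]]).T   # {||x||_1 = 1}

plt.plot(*square, label=r'$\|x\|_\infty = 1$')
plt.plot(*circle, label=r'$\|x\|_2 = 1$')
plt.plot(*diamond, label=r'$\|x\|_1 = 1$')
plt.plot(*(circle / np.sqrt(2)), label=r'$\|x\|_2 = 1/\sqrt{2}$')
plt.plot(*(square / 2), label=r'$\|x\|_\infty = 1/2$')
plt.gca().set_aspect('equal')
plt.legend()
plt.show()
```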
1.1.6. (PS2) Supremum norm.
Solution.
Clearly \(\n f_{\sup} \ge 0\) for all \(f \in B(S)\) and \(\n 0_{\sup} = 0\text{.}\) If \(\n f_{\sup} = 0\text{,}\) then it follows that \(f(s) = 0\) for every \(s \in S\text{;}\) hence \(f = 0\text{.}\)
For \(f \in B(S)\) and \(\alpha \in \R\text{,}\)
\begin{align*} \|\alpha f\|_{\sup} \amp = \sup_{s \in S} |\alpha f(s)| \\ \amp = \sup_{s \in S} \big(\abs \alpha\, \abs{f(s)} \big)\\ \amp = |\alpha| \sup_{s \in S} |f(s)| \\ \amp = |\alpha|\n f_{\sup}\text{.} \end{align*}
Finally, for \(f, g \in B(S)\text{,}\)
\begin{align*} \|f + g\|_{\sup} \amp = \sup_{s \in S} |f(s) + g(s)|\\ \amp \le \sup_{s \in S} (|f(s)| + |g(s)|) \\ \amp \le \sup_{s \in S} |f(s)| + \sup_{s \in S} |g(s)| \\ \amp = \n f_{\sup} + \n g_{\sup}\text{,} \end{align*}
where here we first used the triangle inequality in \(\R\) and then the fact that the supremum of a sum is less than or equal to the sum of the suprema.
Alternatively, for the last part we could have first argued that, for any \(s \in S\text{,}\)
\begin{gather*} |f(s) + g(s)| \le |f(s)| + |g(s)| \le \n f_{\sup} + \n g_{\sup}\text{,} \end{gather*}
which implies that the right hand side is an upper bound for the set \(\set{\abs{f(s)+g(s)}}{s \in S}\text{,}\) and hence in particular an upper bound on the supremum of that set. Another way to talk about this is to say that we’re “taking the supremum” of the above inequality over \(S\text{,}\) and using the fact that the supremum of a constant is a constant.
Comment 1.
As usually happens, a few students wrote
\begin{equation*} \sup_{s \in S} (\abs{f(s)} + \abs{g(s)}) = \sup_{s \in S} \abs{f(s)} + \sup_{s \in S}\abs{g(s)} \end{equation*}
with equality rather than \(\le\text{.}\) This is not true. For a counterexample, consider \(S=(0,1)\text{,}\) \(f(t)=t\) and \(g(t)=1-t\text{.}\) Then \(\sup_S f + \sup_S g = 1+1=2\text{,}\) but \(\sup_S (f+g) = \sup_S 1 = 1\text{.}\) Also notice that in this example \(f\) and \(g\) do not have maxima over \(S\text{,}\) but only suprema. That is, there is no point \(s \in S\) where \(f(s) = \sup_S f = 1\text{.}\)
One way to see why the version with \(\le\) is true is to observe that, for all \(s \in S\text{,}\) \(\abs{f(s)} \le \sup_{t \in S} \abs{f(t)}\) and similarly for \(g\text{.}\) Thus, for any \(s \in S\text{,}\) we have
\begin{align*} \abs{f(s)} + \abs{g(s)} \amp \le \sup_{t \in S}\abs{f(t)} + \sup_{t \in S}\abs{g(t)}\text{.} \end{align*}
Since the right hand side is a constant independent of \(s\text{,}\) taking the supremum of both sides we are left with
\begin{align*} \sup_{s \in S}(\abs{f(s)} + \abs{g(s)}) \amp \le \sup_{t \in S}\abs{f(t)} + \sup_{t \in S}\abs{g(t)}\\ \amp = \sup_{s \in S}\abs{f(s)} + \sup_{s \in S}\abs{g(s)}\text{.} \end{align*}
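To see the counterexample above in action, here is a small numerical illustration, approximating the suprema over \((0,1)\) by maxima over a fine grid:
```python
import numpy as np

s = np.linspace(1e-6, 1 - 1e-6, 100_000)  # a fine grid inside (0,1)
f = s        # f(t) = t
g = 1 - s    # g(t) = 1 - t

print(np.max(f + g))          # ~ 1.0, the supremum of the sum
print(np.max(f) + np.max(g))  # ~ 2.0, the sum of the suprema
```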
Comment 2.
I’m not sure if this actually came up this year, but I’m including it anyway in case it is helpful. What does it mean for a function \(f \in B(S)\) to be the zero function \(f=0\text{?}\) Sometimes I see students simply write “\(f(s)=0\)”, which I read as shorthand for the longer statement “\(f(s)=0\) for all \(s \in S\)”. What does it mean for \(f\) to not be the zero function, though, \(f\ne 0\text{?}\) Surely not that \(f(s) \ne 0\) for all \(s \in S\text{.}\) Rather, logically negating the statement with a ‘for all’ in it, we arrive at “there exists \(s \in S\) such that \(f(s) \ne 0\)”.
Comment 3.
To prove the triangle inequality for \(\n\blank_\infty\) we need both the triangle inequality for \(\abs\blank\) and an inequality for the supremum of a sum of two functions. Collapsing these into a single step by writing
\begin{gather*} \sup_{s \in S} |f(s) + g(s)| \le \sup_{s \in S} |f(s)| + \sup_{s \in S} |g(s)| \end{gather*}
without further explanation is probably a bit too short for a solution to this particular problem. A similar, but I think more minor, complaint could be made about the leap
\begin{align*} \sup_{s \in S} |\alpha f(s)| \amp = |\alpha| \sup_{s \in S} |f(s)| \text{.} \end{align*}
That being said, now that we have proven that \(\n\blank_\infty\) is a norm in this problem, we are free to do both of these manipulations in a single step for the rest of the unit!
1.1.7. Calculating supremum norms.
1.1.7.a
Solution.
We calculate
\begin{align*} \n f_{\sup} \amp= \sup_{s \in S} \abs{f(s)}\\ \amp=\sup_{s \in [0,1]} \abs{s+e^s}\\ \amp=\sup_{s \in [0,1]} (s+e^s)\\ \amp=1+e\text{,} \end{align*}
where here we have used the fact that \(s+e^s\) is positive and increasing on \([0,1]\text{.}\)
1.1.7.b
Solution.
First, we observe that \(f(s) \ge 0\) for \(s \ge 1\) and \(f(s) \le 0\) for \(s \le 1\text{.}\) Thus
\begin{align*} \n f_{\sup} \amp= \sup_{s \in \R} \abs{f(s)}\\ \amp= \max\left\{ \sup_{s \ge 1} f(s),\, \sup_{s \le 1} (-f(s))\right\}\\ \amp= \max\left\{ \sup_{s \ge 1} f(s),\, -\inf_{s \le 1} f(s)\right\}\text{.} \end{align*}
Each of these terms can now be found using basic calculus. Indeed, we easily check that \(f(s) \to 0\) as \(s \to \pm\infty\text{,}\) and that
\begin{equation*} f'(s) = -\frac{s^2-2s-1}{(s^2+1)^2} \end{equation*}
so that \(f\) has critical points at \(s_\pm = 1 \pm \sqrt 2\text{.}\) From this we deduce that
\begin{equation*} \sup_{s \ge 1} f(s) = f(s_+) = \frac{\sqrt 2}{(\sqrt 2 + 1)^2+1} \end{equation*}
while
\begin{equation*} -\inf_{s \le 1} f(s) = -f(s_-) = \frac{\sqrt 2}{(\sqrt 2 - 1)^2+1}\text{.} \end{equation*}
Hence
\begin{align*} \n f_{\sup} \amp= \max\left\{ \sup_{s \ge 1} f(s),\, -\inf_{s \le 1} f(s)\right\}\\ \amp= \max\{f(s_+),-f(s_-)\} \\ \amp =\frac{\sqrt 2}{(\sqrt 2 - 1)^2+1}\text{.} \end{align*}
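The statement of the exercise is not reproduced here, but the derivative and limits above are consistent with \(f(s) = (s-1)/(s^2+1)\text{.}\) Assuming this, we can sanity-check the calculus numerically with a short Python sketch:
```python
import numpy as np

# assuming f(s) = (s - 1)/(s^2 + 1), consistent with f'(s) above
f = lambda s: (s - 1) / (s**2 + 1)

s = np.linspace(-100, 100, 2_000_001)
numerical = np.max(np.abs(f(s)))
exact = np.sqrt(2) / ((np.sqrt(2) - 1)**2 + 1)   # = -f(1 - sqrt(2))
print(numerical, exact)   # both ~ 1.2071
```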
1.1.8. (PS2) Estimating supremum norms.
1.1.8.a
Solution.
We simply estimate each factor separately,
\begin{align*} \n f_{\sup} \amp= \sup_{s \in [0,1]} \abs{e^{-s^3} \sin s}\\ \amp\le \Big(\sup_{s \in [0,1]} \abs{e^{-s^3}} \Big) \Big(\sup_{s \in [0,1]} \abs{\sin s} \Big)\\ \amp\le e^{-0^3} \cdot 1 = 1, \end{align*}
where here we have used the fact that \(e^{-s^3}\) is positive and decreasing on \([0,1]\text{,}\) while \(\abs{\sin s} \le 1\) for all \(s \in \R\text{.}\)
Comment 1.
There are many different ways to write this argument. One option is to first fix \(s \in S\text{,}\) and then argue that \(\abs{f(s)} \le 1\text{,}\) and then finally take a supremum.
Comment 2.
Estimating the \(\sin s\) term more carefully, one can show the bound \(\n f_{\sup} \le \sin 1 \lt 0.85\text{.}\)
Comment 3.
Note that we need to estimate \(\sup_{s \in S} \abs{f(s)}\) and not \(\sup_{s \in S} f(s)\text{.}\)
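As a quick numerical spot-check of the bounds above (a maximum over a finite grid is only ever a lower bound on a supremum, so this is sanity-checking rather than proof):
```python
import numpy as np

s = np.linspace(0, 1, 100_001)
f = np.exp(-s**3) * np.sin(s)

print(np.max(np.abs(f)))   # ~ 0.46, comfortably below both sin(1) < 0.85 and 1
```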
1.1.8.b
Solution.
We use the triangle inequality and estimate each term in the sum separately,
\begin{align*} \n f_{\sup} \amp= \sup_{s \in [-1,2]} \left|-s-s^2+\frac 12 s^3- \frac 1{100}s^5\right|\\ \amp\le \sup_{s \in [-1,2]} \left(\abs s+ \abs s^2+\frac 12 \abs s^3+ \frac 1{100} \abs s^5\right)\\ \amp\le \sup_{s \in [-1,2]} \abs s + \sup_{s \in [-1,2]} \abs s^2 + \frac 12 \sup_{s \in [-1,2]} \abs s^3 + \frac 1{100} \sup_{s \in [-1,2]}\abs s^5\\ \amp= 2 + 2^2 + \frac 1{2} 2^3 + \frac 1{100} 2^5\\ \amp = 10 + \frac 8{25} \le 11\text{.} \end{align*}
Comment 1.
As with the previous part of this question, this argument can be written in many ways. One option is to first fix \(s \in S\text{,}\) argue that \(\abs{f(s)} \le 11\text{,}\) and then finally take a supremum. Another option is to write the various suprema in terms of supremum norms.
Comment 2.
Note that \(\sup_{s \in S} f(s)\) and \(\sup_{s \in S} \abs{f(s)}\) are in general different. Indeed, in this case plotting the function numerically we can see that
\begin{gather*} \sup_{s \in S} f(s) \approx 0.208 \lt 2.328 \approx \sup_{s \in S} \abs{f(s)} \text{.} \end{gather*}
On the other hand it is true that
\begin{equation*} \sup_{s \in S} \abs{f(s) } = \max\Big\{ \sup_{s \in S} f(s), -\inf_{s \in S} f(s)\Big\}\text{,} \end{equation*}
and so one option is to estimate both \(\sup_{s \in S} f(s)\) (from above) and \(\inf_{s \in S} f(s)\) (from below).
Comment 3.
The official solution splits \(f\) into four terms and estimates each term separately. Since each term was simple enough, we could calculate their individual supremum norms exactly. This year a student obtained a much sharper bound by instead splitting \(f\) into two terms,
\begin{align*} \n f_{\sup} \amp= \sup_{s \in [-1,2]} \left|-s-s^2+\frac 12 s^3- \frac 1{100}s^5\right|\\ \amp\le \sup_{s \in [-1,2]} \left|-s-s^2+\frac 12 s^3\right| + \sup_{s \in [-1,2]} \left|\frac 1{100}s^5\right|\\ \amp = \frac{10^{3/2}+26}{27} + \frac{32}{100} \le 2.455, \end{align*}
which is only about 5% off of the true value \(\n f_{\sup} \approx 2.328\) (calculated numerically). It is considerably more work than in the official solution, but again one can calculate the first supremum norm on the right hand side exactly using calculus: it is achieved at \(s^* = (2+\sqrt{10})/3 \in S\text{.}\) Using this same point gives a similarly good lower bound, \(\n f_{\sup} \ge \abs{f(s^*)} \ge 2.285\text{.}\)
It is important to emphasise, though, that this sort of fancy footwork is not at all what the question is looking for. It is just asking for any upper bound on \(\n f_{\sup}\) which we can rigorously prove, not the best one.
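Still, for the curious, the various numbers above can be spot-checked on a grid (again, a grid maximum is only a lower bound on the supremum):
```python
import numpy as np

s = np.linspace(-1, 2, 300_001)
f = -s - s**2 + s**3 / 2 - s**5 / 100

print(np.max(f))           # ~ 0.208
print(np.max(np.abs(f)))   # ~ 2.328, below the rigorous bounds 2.455 and 11
```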
1.1.9. (PS2) Product spaces.
Solution 1.
It is clear that \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) \ge 0\) for all \(x_1,x_2 \in X\) and \(y_1,y_2 \in Y\text{,}\) and \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) = 0\) if \((x_1,y_1) = (x_2,y_2)\text{.}\) If \(d_{X \times Y}((x_1,y_1),(x_2,y_2)) = 0\text{,}\) it follows that \(d_X(x_1,x_2) = 0\) and \(d_Y(y_1,y_2) = 0\text{,}\) and thus \(x_1 = x_2\) and \(y_1 = y_2\text{.}\)
The symmetry is rather obvious, but let’s write out the details anyway. For all \(x_1,x_2 \in X\) and \(y_1,y_2 \in Y\) we have
\begin{align*} d_{X \times Y}( (x_2,y_2), (x_1,y_1) ) \amp = \sqrt{ (d_X(x_2,x_1))^2 + (d_Y(y_2,y_1))^2 }\\ \amp = \sqrt{ (d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2 }\\ \amp = d_{X \times Y}( (x_1,y_1), (x_2,y_2) ), \end{align*}
where here in the second-to-last step we used the fact that \(d_X(x_2,x_1)=d_X(x_1,x_2)\) since \(d_X\) is a metric, and similarly for \(d_Y\text{.}\)
To prove the triangle inequality, consider \((x_1,y_1), (x_2,y_2), (x_3,y_3)\) in \(X \times Y\text{.}\) Then
\begin{align*} \amp (d_{X \times Y}((x_1,y_1),(x_3,y_3)))^2 \\ \amp \qquad = (d_X(x_1,x_3))^2 + (d_Y(y_1,y_3))^2 \\ \amp \qquad \le (d_X(x_1,x_2) + d_X(x_2,x_3))^2 + (d_Y(y_1,y_2) + d_Y(y_2,y_3))^2\\ \amp \qquad = (d_X(x_1,x_2))^2 + 2d_X(x_1,x_2)d_X(x_2,x_3) + (d_X(x_2,x_3))^2\\ \amp \qquad \quad + (d_Y(y_1,y_2))^2 + 2d_Y(y_1,y_2) d_Y(y_2,y_3) + (d_Y(y_2,y_3))^2\text{.} \end{align*}
We use the Cauchy–Schwarz inequality in \(\R^2\) as indicated in the hint with \(a = d_X(x_1,x_2)\text{,}\) \(b = d_X(x_2,x_3)\text{,}\) \(c = d_Y(y_1,y_2)\) and \(d = d_Y(y_2,y_3)\text{.}\) This yields
\begin{align*} \amp \left(d_{X \times Y}((x_1,y_1),(x_3,y_3))\right)^2\\ \amp \quad \le (d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2 + (d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2\\ \amp \quad \quad + 2\sqrt{(d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2} \sqrt{(d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2}\\ \amp \quad = \left(\sqrt{(d_X(x_1,x_2))^2 + (d_Y(y_1,y_2))^2} + \sqrt{(d_X(x_2,x_3))^2 + (d_Y(y_2,y_3))^2}\right)^2\\ \amp \quad = \left(d_{X \times Y}((x_1,y_1),(x_2,y_2)) + d_{X \times Y}((x_2,y_2),(x_3,y_3))\right)^2\text{.} \end{align*}
Now the triangle inequality follows.
Solution 2. Alternative proof of the triangle inequality
Another way to prove the triangle inequality is to notice that
\begin{gather*} d_{X \times Y}\big((x_1,y_1),(x_2,y_2)\big) = \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)\right\| \end{gather*}
where \(\n\blank\) is the usual Euclidean norm on \(\R^2\text{.}\) Thus for any \(x_1,x_2,x_3 \in X\) and \(y_1,y_2,y_3 \in Y\) we have
\begin{align*} \amp d_{X \times Y}\big((x_1,y_1),(x_3,y_3)\big)\\ \amp\quad = \left\|\big(d_X(x_1,x_3),d_Y(y_1,y_3)\big)\right\|\\ \amp\quad \le \left\|\big(d_X(x_1,x_2)+d_X(x_2,x_3),d_Y(y_1,y_2)+d_Y(y_2,y_3)\big)\right\|\\ \amp\quad = \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)+\big(d_X(x_2,x_3),d_Y(y_2,y_3)\big)\right\|\\ \amp\quad \le \left\|\big(d_X(x_1,x_2),d_Y(y_1,y_2)\big)\right\|+\left\|\big(d_X(x_2,x_3),d_Y(y_2,y_3)\big)\right\|\\ \amp\quad = d_{X \times Y}\big((x_1,y_1),(x_2,y_2)\big) + d_{X \times Y}\big((x_2,y_2),(x_3,y_3)\big)\text{.} \end{align*}
Here in the first inequality we are using the triangle inequalities for \(d_X\) and \(d_Y\) as well as the fact that \(\n a \le \n b\) whenever \(a,b \in \R^2\) satisfy \(0 \le a_1 \le b_1\) and \(0 \le a_2 \le b_2\text{.}\) The second inequality is the triangle inequality for the norm \(\n\blank\text{.}\)
Looking at the argument this way, it seems plausible that we could replace the Euclidean norm \(\n\blank\) with either of the norms \(\n\blank_\infty,\n\blank_1\) from Exercise 1.1.5 to get a different metric on \(X \times Y\text{,}\) and this is indeed the case.
Comment 1.
Sometimes students, presumably trying to mimic the structure of the triangle inequality in Definition 1.1, introduce a mysterious third metric space \((Z,d_Z)\) into the problem where instead they should have introduced a third point \((x_3,y_3) \in X \times Y\text{.}\) Since \((Z,d_Z)\) does not appear in the statement of the problem, introducing it should also mean giving a proper definition! And once you start trying to do that you will hopefully realise that this approach doesn’t quite make sense.
Comment 2.
Several students wrote that, e.g., \(d_X(x_1,x_2)=0\) forces \((x_1,x_2)=0\) or \(x_1=x_2=0\text{.}\) Since we are working in a general metric space, \(X\) is a set and not necessarily a vector space, and so these statements are nonsensical! The same goes for \(X \times Y\) and \(d_{X \times Y}\text{.}\)
1.1.11. Bounded sets in normed spaces.
Solution.
Suppose that \(R=\sup_{x \in S} \n x \lt \infty\text{.}\) Then for any \(x,y \in S\) we can use the triangle inequality to estimate
\begin{equation*} d(x,y) = \n{x-y} \le \n x + \n y \le R + R = 2R\text{.} \end{equation*}
Taking a supremum over \(x,y \in S\) yields \(\diam S \le 2R \lt \infty\text{,}\) and hence that \(S\) is bounded.
Conversely, suppose that \(\diam S \lt \infty\text{,}\) and fix some \(y \in S\text{.}\) Then for any \(x \in S\) we have
\begin{equation*} \n{x} = \n{x-y+y} \le \n{x-y} + \n y \le \diam S + \n y\text{.} \end{equation*}
Taking a supremum over \(x \in S\) yields
\begin{equation*} \sup_{x \in S} {\n x} \le \diam S + \n y \lt \infty\text{.} \end{equation*}
Comment.
In fact, a similar result also holds for metric spaces \((X,d)\text{,}\) with essentially the same proof. As \(X\) is no longer necessarily a vector space, the role of \(0\) is instead played by some fixed element \(x_0 \in X\text{,}\) and \(\n \blank\) is replaced by \(d(\blank,x_0)\text{.}\)

1.2 Open and Closed Sets

Exercises

1.2.1. (PS3) Inclusions of balls.
1.2.1.a
Solution.
If \(s + d(x_0, y_0) \le r\text{,}\) then for any \(y \in B_s(y_0)\text{,}\) the triangle inequality implies that
\begin{equation*} d(y, x_0) \le d(y, y_0) + d(y_0, x_0) \lt s + d(y_0, x_0) \le r\text{.} \end{equation*}
Hence \(y \in B_r(x_0)\text{.}\)
Comment.
Don’t forget that in Definition 1.17 the open ball \(B_r(x_0)\) is the set of points \(x \in X\) satisfying the strict inequality \(d(x,x_0) \lt r\text{.}\) A few students argued as if, either some of the time or all of the time, this was instead the non-strict inequality \(d(x,x_0) \le r\text{.}\) Please be careful about this seemingly-minor difference. While in this problem it turns out not to be a huge deal, in other problems it will be the entire point. Best to get in the habit of being careful.
1.2.1.b
Solution.
This is true, e.g., for \(X = \{0\}\text{.}\) If \(x_0 = y_0 = 0\) and \(r, s \gt 0\text{,}\) then \(B_r(x_0) = B_s(y_0) = X\) regardless of the values of \(r\) and \(s\text{.}\)
A slightly less extreme example is to take a bounded metric space, \(x_0=y_0\text{,}\) and \(s \gt r\) both larger than the diameter of \(X\text{.}\) This works for any non-empty discrete metric space, and also, e.g., for \(X=(0,2)\text{,}\) \(x_0=y_0=1\text{,}\) \(s=4\) and \(r=3\text{.}\)
1.2.2. (PS3) Open sets in metric subspaces.
1.2.2.a
Solution.
Let \(y \in Y\) and \(r \gt 0\text{.}\) Using the definitions of open balls and the fact that \((Y,d')\) is a metric subspace, we have
\begin{align*} B_r'(y) \amp = \{ z \in Y : d'(z,y) \lt r \}\\ \amp = \{ z \in Y : d(z,y) \lt r \}\\ \amp = \{ z \in X : d(z,y) \lt r \} \cap Y\\ \amp = B_r(y) \cap Y\text{.} \end{align*}
1.2.2.b
Solution.
Suppose that \(U\) is an open subset of \((X,d)\text{,}\) let \(V = U \cap Y\text{,}\) and fix \(y \in V\text{.}\) To show that \(V\) is open in \((Y,d')\text{,}\) we need to find \(r \gt 0\) such that the open ball \(B_r'(y) \subseteq V\text{.}\) Since \(U\) is open in \((X,d)\text{,}\) we can pick \(r \gt 0\) such that \(B_r(y) \subseteq U\text{.}\) Then by Part a, we have
\begin{gather*} B_r'(y) = B_r(y) \cap Y \subseteq U \cap Y = V \end{gather*}
as desired.
Conversely, suppose that \(V\) is an open subset of \((Y,d')\text{.}\) Then, as in Corollary 1.27, for every \(y \in V\) we can find \(r_y \gt 0\) such that \(B'_{r_y}(y) \subseteq V\text{,}\) and hence by Part a that
\begin{align*} V \amp = \bigcup_{y \in V} B'_{r_y}(y)\\ \amp = \bigcup_{y \in V} \Big( B_{r_y}(y) \cap Y \Big)\\ \amp = \Big(\bigcup_{y \in V} B_{r_y}(y)\Big) \cap Y\\ \amp = U \cap Y \end{align*}
where \(U = \bigcup_{y \in V} B_{r_y}(y)\) is open in \((X,d)\) as it is a union of open balls (in \((X,d)\)).
Comment 1.
Since this exercise involves two different metric spaces \((X,d)\) and \((Y,d')\text{,}\) we need to be clear which space we have in mind when we use metric space terminology like ‘open’. In particular, while in one implication we assume that \(U\) is an open subset of \((X,d)\text{,}\) and \(Y\) is always an open subset of \((Y,d')\text{,}\) this in no way allows us to use Theorem 1.25 (which applies to multiple open sets in the same metric space) to conclude that the intersection is open in \((X,d)\text{.}\)
Comment 2.
Suppose that \(V\) is open in \((Y,d')\text{.}\) Then for all \(y \in V\) there exists an \(r \gt 0\) such that \(B_r'(y) \subseteq V\text{.}\) So far so good. But if we go from this to writing something like
\begin{equation*} V = \bigcup_{y \in V} B_r'(y), \end{equation*}
then we’re being a bit sloppy in a way which could easily cause problems later on. This is because, in the above formula, it sure looks like \(r\) is some constant real number which isn’t going to change as we change \(y \in V\) to get all of the sets that we need to make up this union. This is why the official solutions are careful to introduce the symbol \(r_y\text{,}\) making the dependence of this radius on the centre of the ball clear. Alternatively, we could have introduced a symbol for the entire ball.
Comment 3.
Note that the word ‘union’ is not synonymous with ‘finite union’. In particular, when Corollary 1.27 says ‘union of balls’ it means a set of the form
\begin{equation*} \bigcup_{\lambda \in \Lambda} B_\lambda, \end{equation*}
where each \(B_\lambda\) is a ball and \(\Lambda\) is some index set which could be uncountably infinite. For instance, in \(\R^2\) we could be talking about a union like
\begin{equation*} \bigcup_{x_1 \in \R} B_1((x_1,0)). \end{equation*}
In this problem the cardinality of the set \(\Lambda\) didn’t have much of an impact on the argument we wanted to run, but in other cases it can cause things to be drastically different.
1.2.2.c
Solution.
Since \(X\) is a metric subspace of \(\R\) with the Euclidean metric, we can use Lemma 1.28. The set \((2,3]\) is open since we can write it as \((2,3] = X \cap (2,4)\) where \((2,4)\) is an open subset of \(\R\text{.}\) Similarly the complement in \(X\) of \((0,1]\) is \(X \without (0,1] = (1,3]\text{,}\) which we can write as \(X \cap (1,4)\) where \((1,4)\) is an open subset of \(\R\text{.}\)
Comment 1.
Some students write this metric subspace as the pair \(\big((0,3],\abs\blank\big)\text{.}\) Technically this doesn’t quite work, because \(\abs\blank\) is a norm on the vector space \(\R\text{,}\) but neither a norm nor a metric on \((0,3]\text{.}\) What I assume these students meant is that they are equipping \((0,3]\) with the restriction of the metric on \(\R\) induced by the norm \(\abs\blank\text{.}\) When possible I avoid using this notation in lectures, but it is okay for you to use it on problem sheets and exams, provided it does not confuse you!
Comment 2.
Note that \((0,3] \without (0,1] = (1,3]\text{,}\) not \((2,3]\) as many students claimed.
Comment 3.
Note that being able to write \((2,3]\) as \(S \cap X\) where \(S \subset \R\) is not open proves nothing. For instance here we have \((2,3] = (2,5] \cap X\) where \((2,5] \subset \R\) is not open, but nevertheless \((2,3] \subset X\) is open.

1.3 Convergence and Completeness

Exercises

1.3.1. (PS3) Some basic limits.
1.3.1.a
Solution.
Applying the triangle inequality twice we find, for any \(n \in \N\text{,}\)
\begin{gather*} d(x_0,y_0) \le d(x_0,x_n) + d(x_n,y_n) + d(y_n,y_0), \end{gather*}
which rearranges to
\begin{gather*} d(x_0,y_0) - d(x_n,y_n) \le d(x_0,x_n) + d(y_n,y_0). \end{gather*}
Reversing the roles of \((x_0,y_0)\) and \((x_n,y_n)\) we similarly find
\begin{gather*} d(x_n,y_n) - d(x_0,y_0) \le d(x_n,x_0) + d(y_0,y_n). \end{gather*}
By the symmetry of the metric, the right hand sides of these two inequalities are the same, and so combining them we get
\begin{gather*} \abs{d(x_n,y_n) - d(x_0,y_0)} \le d(x_n,x_0) + d(y_0,y_n) \to 0 \end{gather*}
as \(n \to \infty\text{,}\) which is the desired convergence.
Comment 1.
Several students wrote statements like “\(d(x_n,y_n) \le d(x_0,y_0)\) as \(n \to \infty\)”. Not sure what this means (so don’t use it on an exam!) but my best guess is something like \(\limsup_{n\to\infty} d(x_n,y_n) \le d(x_0,y_0)\text{?}\)
Comment 2.
We could of course also have written out an argument using \(\varepsilon\)s and \(N\)s without too much additional effort, and many students did so. You will never lose marks on an exam for writing out a perfect \(\varepsilon\)-\(N\) proof, but on the other hand you may find that the argument in the official solution, which takes advantage of some basic facts about limits of sequences of real numbers, is a bit easier.
1.3.1.b
Solution.
Since \(\n{x_n} = d(x_n,0)\text{,}\) the limit in i follows from the previous part. For ii, we use the triangle inequality to estimate
\begin{align*} \n{(x_n+y_n) - (x_0+y_0)} \amp = \n{(x_n-x_0) + (y_n - y_0)}\\ \amp \le \n{x_n-x_0} + \n{y_n - y_0}\\ \amp \to 0\text{.} \end{align*}
For iii, we add and subtract \(\alpha_0 x_n\) and use the triangle inequality to estimate
\begin{align*} \n{\alpha_n x_n - \alpha_0 x_0} \amp = \n{\alpha_n x_n - \alpha_0 x_n + \alpha_0 x_n - \alpha_0 x_0}\\ \amp \le \n{(\alpha_n -\alpha_0)x_n} + \n{\alpha_0 (x_n-x_0)}\\ \amp = \abs{ \alpha_n -\alpha_0} \n{x_n} + \abs{\alpha_0} \n{x_n-x_0}\\ \amp \to 0 \cdot \n {x_0} + \abs{\alpha_0} \cdot 0\\ \amp = 0\text{,} \end{align*}
where in the second-to-last step we have used i.
Comment 1.
In the official solution we really did need to use i, in which case it is probably a good idea to mention this (e.g. on an exam). Depending on which term you add and subtract, though, this may not be an issue.
Comment 2.
As with the previous part, it is possible to write out an \(\varepsilon\)-\(N\) argument which accomplishes the same thing. For this one has to think much harder about inequalities, and either avoid dividing by things like \(\n{x_0}\) or \(\abs{\alpha_0}\) which might be zero or treat these as special cases.
1.3.1.c
Solution.
Adding and subtracting \(\scp{x_n}{y_0}\text{,}\) we estimate
\begin{align*} \abs{\scp{x_n}{y_n} - \scp{x_0}{y_0}} \amp = \abs{(\scp{x_n}{y_n} - \scp{x_n}{y_0}) - (\scp{x_0}{y_0} - \scp{x_n}{y_0})}\\ \amp = \abs{\scp{x_n}{y_n-y_0} + \scp{x_n-x_0}{y_0}}\\ \amp \le \abs{\scp{x_n}{y_n-y_0}} + \abs{\scp{x_n-x_0}{y_0}}\\ \amp \le \n{x_n}\n {y_n-y_0} + \n{x_n-x_0}\n{y_0}\\ \amp \to \n {x_0} \cdot 0 + 0 \cdot \n{y_0}\\ \amp = 0\text{,} \end{align*}
where we have used the Cauchy–Schwarz inequality and i from the previous part.
Comment.
Again one can give an \(\varepsilon\)-\(N\) argument instead, but this requires a bit more work.
1.3.2. (PS3) Convergence in product spaces.
Solution.
The convergence \((x_n,y_n) \to (x_0,y_0)\) in \(X \times Y\) is equivalent to the condition
\begin{equation*} \lim_{n \to \infty} \sqrt{(d_X(x_n,x_0))^2 + (d_Y(y_n,y_0))^2} = 0\text{.} \end{equation*}
This in turn is equivalent to
\begin{equation*} \lim_{n \to \infty} d_X(x_n,x_0) = 0 \quad \text{and} \quad \lim_{n \to \infty} d_Y(y_n,y_0) = 0\text{,} \end{equation*}
which means that \(x_n \to x_0\) in \(X\) and \(y_n \to y_0\) in \(Y\text{.}\)
Comment 1.
Here we have used the fact that \((d_X(x_n,x_0))^2 \ge 0\) and similarly for the \(d_Y\) term. This is true just because it is the square of a real number; the additional fact that \(d_X(x_n,x_0) \ge 0\) is not needed.
Comment 2.
The official solution, which I’ve inherited from previous years, is quite short. It assumes that the reader is well-versed in things from Analysis 1, and could write out a more detailed argument (e.g. with \(\varepsilon\)s and \(N\)s) without too much difficulty. Many students wrote out such arguments, which is of course totally fine.
While no one did so this year, it’s interesting to note that some of the inequalities in the last part of Exercise 1.1.5 could also be used here.
1.3.3. (PS4) Cauchy implies bounded.
Solution.
Let \((X,d)\) be a metric space and \((x_n)_{n \in \N}\) a Cauchy sequence in \(X\text{.}\) Then there exists \(N \in \N\) such that \(d(x_m, x_n) \lt 1\) for all \(m, n \ge N\text{.}\) Set
\begin{equation*} M = \max_{m, n \le N} d(x_m, x_n)\text{.} \end{equation*}
Then for any \(m, n \in \N\text{,}\) if \(m, n \le N\text{,}\) then clearly \(d(x_m, x_n) \le M\text{.}\) If \(m, n \gt N\text{,}\) then \(d(x_m, x_n) \lt 1\text{.}\) Moreover, if \(m \le N\) and \(n \gt N\text{,}\) then
\begin{equation*} d(x_m, x_n) \le d(x_m, x_N) + d(x_N, x_n) \le M + 1\text{,} \end{equation*}
and the same inequality also holds for \(n \le N\) and \(m \gt N\text{.}\) Combining all of the cases we conclude that
\begin{equation*} \diam \set{x_n}{n \in \N} \le \max\{1,M,M+1\} = M + 1 \lt \infty \end{equation*}
and hence that the sequence is bounded.
Comment 1.
Recall from Definition 1.15 that, in metric spaces, we have defined boundedness of subsets in terms of diameters.
Comment 2.
Alternatively, we could show that the two sets \(\{x_n : n \le N\}\) and \(\{x_n : n \ge N\}\) were each bounded, and then prove a lemma which says that finite unions of bounded sets are bounded.
Comment 3.
Please take a look at the last part of Exercise B.2, and if you have any questions about this get in touch with me over email or come to office hours. This sort of confusion over what a supremum means would be disastrous on an exam.
In the definitions in this unit the word ‘bounded’ almost always refers to some sort of supremum being finite. The supremum
\begin{equation*} \sup_{n \in \N} n^2 = \sup\{ n^2 : n \in \N\} \end{equation*}
is clearly infinite. At the same time, for each fixed \(n \in \N\text{,}\) \(n^2\) is certainly finite.
When showing that a set \(S\) in a metric space \((X,d)\) is bounded (according to Definition 1.15), we need to show that the supremum
\begin{equation*} \diam S = \sup_{x,y \in S} d(x,y) \lt \infty. \end{equation*}
As in the above example, it is not enough to observe that each individual distance \(d(x,y)\) is finite. Indeed, this is always the case for any set, because \(d \maps X \times X \to \R\) always outputs a (finite) real number.
Comment 4.
As is often the case in this unit, in this question we only know that \((X,d)\) is a metric space. So \(X\) is just a set, which for all we know could be the set of all spaghetti recipes. In particular we cannot assume that \(X=\R\text{,}\) or even that \(X\) is a vector space. We do not know if there is a ‘zero’ in \(X\text{,}\) or if it makes sense to add two elements of \(X\) or to take their absolute value. Making such assumptions can completely change the character of a problem, and is a very fast way to lose most or all of the marks on an exam question.
1.3.4. (PS4) Pairs of Cauchy sequences.
Solution.
Since \(\R\) (with the usual Euclidean metric) is complete, it suffices to show that \((d(x_n,y_n))_{n \in \N}\) is Cauchy. Let \(\varepsilon \gt 0\text{,}\) and pick \(N \in \N\) such that \(n,m \ge N\) implies both \(d(x_n,x_m) \lt \varepsilon/2\) and \(d(y_n,y_m) \lt \varepsilon/2\text{.}\) For \(n,m \ge N\) we then have
\begin{align*} \abs{d(x_n,y_n) - d(x_m,y_m)} \amp\le d(y_n,y_m) + d(x_n,x_m)\\ \amp\lt \frac \varepsilon 2 + \frac \varepsilon 2 = \varepsilon \end{align*}
where the first inequality was shown in our solution to the first part of Exercise 1.3.1.
Comment 1.
Several students wrote \(d(d(x_n,y_n),d(x_m,y_m))\) as a synonym for \(\abs{d(x_n,y_n) - d(x_m,y_m)}\text{,}\) so that, depending on context, \(d\) either meant the metric on \(X\) or the metric on \(\R\text{.}\) While not completely unreasonable, I would still advise against using such a convention in this unit, and to instead give each metric in a problem its own name.
Comment 2.
After establishing that \((d(x_n,y_n))_{n \in \N}\) was Cauchy, several students claimed that “\(d(x_n,y_n) \to d(x_m,y_m)\)”. I cannot see any sensible way to interpret this statement. Is the limit as \(n \to \infty\text{?}\) What is \(m\text{?}\) In general there is no reason that the limit of a convergent sequence would be one of the terms in this sequence.
1.3.6. Limit points.
Solution.
We construct the subsequence step by step as follows. First, set \(r_1 = 1\text{.}\) Observe that the set \(B_{r_1}(x_0) \cap \set{x_n}{n \in \N}\) contains a point other than \(x_0\) by the given condition. So we may choose \(n_1 \in \N\) such that \(d(x_{n_1}, x_0) \lt r_1\text{.}\)
Next, we set \(r_2 = \min\{\frac 12, d(x_1, x_0), \dotsc, d(x_{n_1}, x_0)\}\text{.}\) The set \(B_{r_2}(x_0) \cap \set{x_n}{n \in \N}\) contains a point other than \(x_0\text{,}\) hence we may choose \(n_2 \in \N\) such that \(d(x_{n_2}, x_0) \lt r_2\text{.}\) This implies in particular that \(n_2 \gt n_1\) by the choice of \(r_2\text{.}\)
Set \(r_3 = \min\{\frac{1}{3}, d(x_1, x_0), \dotsc, d(x_{n_2}, x_0)\}\) and find \(n_3 \in \N\) such that \(d(x_{n_3}, x_0) \lt r_3\text{.}\) Then \(n_3 \gt n_2\text{.}\)
Repeat the process indefinitely, which gives rise to an increasing sequence \((n_k)_{k \in \N}\text{.}\) The corresponding subsequence has the property that \(d(x_{n_k}, x_0) \lt r_k \le \frac{1}{k}\text{;}\) hence \(x_0 = \lim_{k \to \infty} x_{n_k}\text{.}\)

1.4 Continuity

Exercises

1.4.1. (PS4) Metric is Lipschitz.
1.4.1.a
Solution.
For \(x, y \in X\text{,}\) we have the inequality
\begin{equation*} |f(x) - f(y)| = |d(x, p) - d(y, p)| \le d(x, y) \end{equation*}
by Lemma 1.8. This means that \(f\) is Lipschitz continuous (with Lipschitz constant \(1\)).
1.4.1.b
Solution.
Let \((x_1, y_1),(x_2,y_2) \in X \times X\text{.}\) Arguing exactly as in the solution to the first part of Exercise 1.3.1, we have
\begin{align*} \abs{d(x_2, y_2) - d(x_1, y_1)} \amp \le d(x_2, x_1) + d(y_2, y_1)\text{.} \end{align*}
As we have the two inequalities
\begin{gather*} d(x_2, x_1) \le d_{X \times X}((x_2, y_2), (x_1, y_1)),\\ d(y_2, y_1) \le d_{X \times X}((x_2, y_2), (x_1, y_1))\text{,} \end{gather*}
this implies that
\begin{equation*} \abs{d(x_2, y_2) - d(x_1, y_1)} \le 2d_{X \times X}((x_2, y_2), (x_1, y_1))\text{,} \end{equation*}
which means that the metric is Lipschitz continuous (with Lipschitz constant \(2\)).
Comment 1.
As happened in previous years, some students used incorrect formulas for \(d_{X \times X}\) which made the problem considerably easier. On an exam this would have been an easy way to lose a lot of marks. Recall from Definition 1.14 that
\begin{equation*} d_{X \times Y}( (x_1,y_1), (x_2,y_2) ) = \sqrt{(d_X(x_1, x_2))^2 + (d_Y(y_1, y_2))^2}\text{.} \end{equation*}
One way to remember this formula is that it gives the correct answer if we work with Euclidean norms on \(\R^2 = \R \times \R\text{.}\) That is, if \(x = (x_1,x_2)\) and \(y=(y_1,y_2)\) are vectors in \(\R^2\text{,}\) then
\begin{equation*} d_{\R^2}(x,y) = d_{\R \times \R}(x,y) = \sqrt{(d_\R(x_1,y_1))^2 + (d_\R(x_2,y_2))^2}\text{.} \end{equation*}
Comment 2.
We could in fact have estimated
\begin{equation*} d(x_1, x_2) + d(y_1, y_2) \le \sqrt 2 \sqrt{ (d(x_1,x_2))^2 + (d(y_1,y_2))^2 }\text{,} \end{equation*}
for instance by applying the third inequality in the last part of Exercise 1.1.5 to the vector \((d(x_1,x_2),d(y_1,y_2)) \in \R^2\text{.}\) This gives a better (i.e. smaller) Lipschitz constant \(L=\sqrt 2\text{.}\)
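As a sanity check of this sharper constant, one can test the inequality on random points, say with \(X = \R^2\) and the Euclidean metric. A minimal Python sketch:
```python
import math
import random

def d(p, q):  # Euclidean metric on R^2
    return math.hypot(p[0] - q[0], p[1] - q[1])

def rand_pt():
    return (random.uniform(-5, 5), random.uniform(-5, 5))

for _ in range(10_000):
    x1, y1, x2, y2 = rand_pt(), rand_pt(), rand_pt(), rand_pt()
    lhs = abs(d(x2, y2) - d(x1, y1))
    rhs = math.sqrt(2) * math.hypot(d(x1, x2), d(y1, y2))  # sqrt(2) * d_{X x X}
    assert lhs <= rhs + 1e-9  # small tolerance for rounding
```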
Comment 3.
Several students estimated inside of absolute values as if the absolute values themselves, and often some of the minus signs inside the absolute values, were not there. As a concrete example, suppose that \(a,b,A,B\) are real numbers with \(0 \le a \le A\) and \(0 \le b \le B\text{.}\) These inequalities in no way imply that \(\abs{a-b} \le \abs{A-B}\text{.}\) If this is surprising, I recommend that you play around with explicit values of \(a,b,A,B\) until you can find a counterexample.
Comment 4.
In some sense the first step of this problem is to translate the sentence “\(d \maps X \times X \to \R\) is Lipschitz continuous” into an inequality to be proven. Several students did not do this correctly. If this is you, my main recommendation is do this and other translations as slowly and carefully as you can stand (maybe on scrap paper) so that you are really sure you are getting the right thing. Once you are more comfortable with the definitions (e.g. from endless practice on problem sheets!) this will get faster.
1.4.2. (PS4) Image of closure under continuous map.
Solution 1.
Suppose that \(y \in f(\overline S)\) and choose \(x \in \overline S\) with \(y = f(x)\text{.}\) We have to show that \(B_r(y) \cap f(S) \ne \varnothing\) for all \(r \gt 0\text{.}\) So fix \(r \gt 0\text{.}\) By the continuity of \(f\text{,}\) there exists \(\delta \gt 0\) such that \(d_Y(f(x), f(x')) \lt r\) for all \(x' \in X\) with \(d_X(x, x') \lt \delta\text{.}\) Moreover, since \(x \in \overline S\text{,}\) there exists \(s \in S\) with \(d_X(x, s) \lt \delta\text{.}\) Therefore, we conclude that \(d_Y(y, f(s)) \lt r\text{,}\) and in particular \(B_r(y) \cap f(S) \ne \varnothing\text{.}\)
Solution 2. Using sequences
Arguments very similar to the proof of Theorem 1.35 show that a point \(x \in X\) lies in the closure \(\overline{S}\) if and only if there is a sequence of points \(\seq xn\) in \(S\) with \(x_n \to x\) as \(n \to \infty\text{.}\) Suppose that we have proved such a lemma.
Then we can offer the following alternative solution. Let \(y \in f(\overline S)\) and choose \(x \in \overline S\) with \(y=f(x)\text{.}\) Then by the above lemma, there exists a sequence \(\seq xn\) in \(S\) with \(x_n \to x\text{.}\) As \(f\) is continuous, we then have \(f(x_n) \to f(x) = y\) (by Theorem 1.46). Since the sequence \((f(x_n))_{n \in \N}\) lies in \(f(S)\text{,}\) applying the lemma a second time (but now in the reverse direction and with \(f(S) \subseteq Y\) playing the role of \(S \subseteq X\)) we conclude that \(y \in \overline{f(S)}\) as desired.
Solution 3. Using Exercise 1.4.3
As in previous years, at least one student gave a more ‘topological’ argument using Exercise 1.4.3, which I am assuming they saw a proof of in Analysis 2A or perhaps the Topology unit. Since \(\overline{f(S)} \supseteq f(S)\text{,}\) we have by basic set theory that
\begin{equation*} S \subseteq f^{-1}(f(S)) \subseteq f^{-1}(\overline{f(S)})\text{.} \end{equation*}
Since \(\overline{f(S)}\) is closed and \(f\) is continuous, the version of Exercise 1.4.3 for closed sets (obtained by taking complements) implies that \(f^{-1}(\overline{f(S)})\) is closed. Thus Theorem 1.23 and the inclusion \(S \subseteq f^{-1}(\overline{f(S)})\) above imply \(f^{-1}(\overline{f(S)}) \supseteq \overline{S}\text{,}\) and hence that \(f(\overline S) \subseteq \overline{f(S)}\) as desired.

1.5 The Space \(C_\bdd(X)\)

Exercises

1.5.1. Uniform limit of continuous maps.
Solution.
Let \(x_0 \in X\) and fix \(\varepsilon \gt 0\text{.}\) Choose \(N \in \N\) such that \(d_Y(f_n(x), f(x)) \lt \varepsilon/3\) for all \(n \ge N\) and all \(x \in X\text{.}\) Furthermore, choose \(\delta \gt 0\) such that
\begin{equation*} d_Y(f_N(x), f_N(x_0)) \lt \frac{\varepsilon}{3} \end{equation*}
for all \(x \in X\) with \(d_X(x, x_0) \lt \delta\text{.}\) Then if \(d_X(x, x_0) \lt \delta\text{,}\) we find that
\begin{align*} d_Y(f(x), f(x_0)) \amp\le d_Y(f(x), f_N(x)) + d_Y(f_N(x), f_N(x_0)) + d_Y(f_N(x_0), f(x_0)) \\ \amp\lt \varepsilon\text{.} \end{align*}
Therefore, the map \(f\) is continuous.
Comment.
In the above solution we first fix \(n=N\) appropriately and then choose \(\delta \gt 0\) based on the continuity of \(f_N\text{.}\) In previous years, many students tried to choose \(\delta\) based on the continuity of some general \(f_n\text{.}\) This works, up to a point, but the value \(\delta=\delta_n\) that you get will in general depend on \(n\text{.}\) This is a bit worrying, since ultimately we want a single value of \(\delta \gt 0\) that works for the limiting function \(f\text{,}\) and if we are very unlucky then perhaps \(\delta_n \to 0\) as \(n \to \infty\text{.}\)

1.6 Functions and higher-order functions

Exercises

1.6.1. (PS4) Compositions of higher-order functions.
1.6.1.a
Solution 1. Outside-in
The domains and codomains of \(\Phi_1,\Phi_2\) line up nicely,
\begin{equation*} 1 \in \R \xrightarrow{\Phi_1} \R^\R \xrightarrow{\Phi_2} \R\text{,} \end{equation*}
with \(1\) lying in the domain of \(\Phi_1\text{.}\) Thus \(\Phi_2(\Phi_1(1))\) is a well-defined element of \(\R\text{,}\) the codomain of \(\Phi_2\text{.}\) To find the value of this real number, we simply manipulate the expressions in brackets using our formulas for \(\Phi_1,\Phi_2\text{:}\)
\begin{align*} \Phi_2(\Phi_1(1)) \amp = \Phi_1(1)(0) + \Phi_1(1)(1)\\ \amp = e^0 + e^1 \\ \amp = 1 + e\text{.} \end{align*}
Here the first line is the trickiest one; you can think of it as plugging \(f = \Phi_1(1)\) into our formula for \(\Phi_2(f)\text{,}\) so that \(f(0)=\Phi_1(1)(0)\) and so on.
Solution 2. Inside-out
The previous solution works “from the outside in”, first using our formula for \(\Phi_2\) and only then using our formula for \(\Phi_1\text{.}\) Many students instead worked “from the inside out”, first dealing with the \(\Phi_1(1)\) and only then applying our formula for \(\Phi_2\text{.}\) One way to accomplish this is with the \(\mapsto\) symbol; see Convention 1.56 above. The fact that \(\Phi_2(\Phi_1(1)) \in \R\) follows as before, but to find the formula we now calculate this way:
\begin{align*} \Phi_2(\Phi_1(1)) \amp = \Phi_2(t \mapsto e^t)\\ \amp = (t \mapsto e^t)(0) + (t \mapsto e^t)(1)\\ \amp = 1 + e\text{.} \end{align*}
If you don’t like the \(\mapsto\) notation, you can always give this function a name. We could say that \(\Phi_1(1)\) is the function \(f \maps \R \to \R\) defined by \(f(t) = e^t\text{,}\) and hence
\begin{align*} \Phi_2(\Phi_1(1)) \amp = \Phi_2(f)\\ \amp = f(0) + f(1)\\ \amp = 1 + e\text{.} \end{align*}
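As an aside, higher-order functions like these translate directly into code, which can be clarifying. Here is a minimal Python sketch (going beyond what the question asks), assuming, consistently with the calculations above, that \(\Phi_1(a)\) is the function \(t \mapsto e^{at}\text{:}\)
```python
import math

Phi1 = lambda a: (lambda t: math.exp(a * t))  # Phi1(a) = (t |-> e^{a t}), an element of R^R
Phi2 = lambda f: f(0) + f(1)                  # Phi2(f) = f(0) + f(1), a real number

print(Phi2(Phi1(1)))   # 3.7182818... = 1 + e
```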
Comment 1.
Especially in more applied units, one is encouraged to view the function \(t \mapsto e^t\) and the expression \(e^t\) as essentially the same object. When talking about mappings between space of functions, this ambiguity in notation can lead to real confusion (and real lost marks on an exam). To avoid this, please do not write things like \(\Phi_1(e^t)\) when what you really mean is \(\Phi_1(t \mapsto e^t)\text{.}\) If you don’t like the \(\mapsto\) notation, then you can always introduce names for the various functions you encounter, as is done in the second official solution. Alternatively, arguing as in the first official solution, we didn’t actually encounter this issue at all!
Comment 2.
While the question asks what type of object \(\Phi_2(\Phi_1(1))\) is, in past years many students instead gave the type of the object \(\Phi_2 \circ \Phi_1\text{,}\) or else \(\Phi_2\) or \(\Phi_1\text{.}\) Note that these are different questions with different answers! Seems to be less of an issue this year.
1.6.1.b
Solution 1. Outside-in
As with the previous part, the domains and codomains of \(\Phi_3,\Phi_1\) line up nicely:
\begin{equation*} 2 \in \R \xrightarrow{\Phi_1} \R^\R \xrightarrow{\Phi_3} \R^\R. \end{equation*}
Thus \(\Phi_3(\Phi_1(2))\) is a well-defined element of \(\R^\R\text{,}\) the codomain of \(\Phi_3\text{.}\) In other words, \(\Phi_3(\Phi_1(2)) \maps \R \to \R\text{.}\) To get a formula for this function, we should input a real number, say \(t \in \R\text{.}\) Using our formulas for \(\Phi_1,\Phi_3\) we find that
\begin{align*} \Phi_3(\Phi_1(2))(t) \amp = \Phi_1(2)(t) \cdot \Phi_1(2)(1+t)\\ \amp = e^{2t} \cdot e^{2(1+t)}\\ \amp = e^{4t+2}\text{.} \end{align*}
Solution 2. Inside-out
Alternatively, we can find the formula for \(\Phi_3(\Phi_1(2))\) working from the inside out. The fact that \(\Phi_3(\Phi_1(2)) \in \R^\R\) is a mapping \(\R \to \R\) follows as in the previous solution, but now we calculate
\begin{align*} \Phi_3(\Phi_1(2))(t) \amp = \Phi_3(s \mapsto e^{2s})(t)\\ \amp = (s \mapsto e^{2s})(t) \cdot (s \mapsto e^{2s})(1+t)\\ \amp = e^{2t} \cdot e^{2(1+t)}\\ \amp = e^{4t+2}\text{.} \end{align*}
To avoid the \(\mapsto\) notation, we can give the intermediate function a name. Let \(f = \Phi_1(2)\) be the function \(\R \to \R\) defined by \(f(t) = e^{2t}\text{.}\) Then
\begin{align*} \Phi_3(\Phi_1(2))(t) \amp = \Phi_3(f)(t)\\ \amp = f(t)f(1+t) \\ \amp = e^{2t} e^{2(t+1)} \\ \amp = e^{4t+2}\text{.} \end{align*}
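Continuing the Python sketch from the previous part (with the same caveat that \(\Phi_1(a) = (t \mapsto e^{at})\) is inferred from the calculations above), we can check the formula \(e^{4t+2}\) at a few sample points:
```python
import math

Phi1 = lambda a: (lambda t: math.exp(a * t))   # Phi1(a) = (t |-> e^{a t})
Phi3 = lambda f: (lambda t: f(t) * f(1 + t))   # Phi3(f) = (t |-> f(t) f(1+t))

g = Phi3(Phi1(2))   # an element of R^R, i.e. a function R -> R
for t in (0.0, 0.5, -1.0):
    print(g(t), math.exp(4 * t + 2))   # the two columns agree
```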
Comment 1.
Many students wrote things like \(\Phi_3(e^{2t}) = e^{4t+2}\text{.}\) Strictly speaking this is incorrect, and you should write something more verbose like \(\Phi_3(s \mapsto e^{2s})(t) = e^{4t+2}\text{,}\) or alternatively give the function \(s \mapsto e^{2s}\) a name as was done in the official solution. I won’t take off marks on exam for using the sloppier notation, but I’ve seen a lot of students on exams who use the sloppier notation get confused about what the mapping in the question even is, and lose a huge number of marks. Whatever notation you use, please be careful and stop to think about what these expressions mean.
Comment 2.
While the question asks what type of object \(\Phi_3(\Phi_1(2))\) is, in past years many students gave the type of the object \(\Phi_3 \circ \Phi_1\text{,}\) or else \(\Phi_3\) or \(\Phi_1\text{.}\) Note that these are different questions with different answers! Seems to be less of an issue this year.
1.6.1.c
Solution 1. Outside-in
Once again, the domains and codomains of \(\Phi_2,\Phi_3\) line up nicely
\begin{equation*} \R^\R \xrightarrow{\Phi_3} \R^\R \xrightarrow{\Phi_2} \R \end{equation*}
so that the composition \(\Phi_2 \circ \Phi_3 \maps \R^\R \to \R\text{.}\) To find a formula for this mapping, we need to feed it an element of \(\R^\R\text{,}\) i.e. a function \(f \maps \R \to \R\text{.}\) Using the formulas for \(\Phi_2,\Phi_3\) we calculate
\begin{align*} (\Phi_2 \circ \Phi_3)(f) \amp = \Phi_2(\Phi_3(f))\\ \amp = \Phi_3(f)(0) + \Phi_3(f)(1)\\ \amp = f(0)f(0+1) + f(1)f(1+1)\\ \amp = f(0) f(1) + f(1) f(2)\text{.} \end{align*}
Solution 2. Inside-out
Alternatively, we can calculate this composition from the inside out. Deducing as before that \(\Phi_2 \circ \Phi_3 \maps \R^\R \to \R\text{,}\) we have that, for \(f \in \R^\R\text{,}\)
\begin{align*} (\Phi_2 \circ \Phi_3)(f) \amp = \Phi_2\big(t \mapsto f(t)f(1+t)\big)\\ \amp = \big(t \mapsto f(t)f(1+t)\big)(0) + \big(t \mapsto f(t)f(1+t)\big)(1)\\ \amp = f(0)f(0+1) + f(1)f(1+1)\\ \amp = f(0) f(1) + f(1) f(2)\text{.} \end{align*}
If you don’t like the \(\mapsto\) notation, you can introduce a name for the intermediate function. Given \(f \in \R^\R\text{,}\) let \(g = \Phi_3(f) \maps \R \to \R\) be the function defined by \(g(s) = f(s)f(1+s)\text{.}\) Then
\begin{align*} (\Phi_2 \circ \Phi_3)(f) \amp = \Phi_2(g)\\ \amp = g(0) + g(1)\\ \amp = f(0)f(0+1) + f(1)f(1+1)\\ \amp = f(0) f(1) + f(1) f(2)\text{.} \end{align*}
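In the same Python sketch as in the previous parts, the composition and the final formula can be compared on a test function:
```python
import math

Phi2 = lambda f: f(0) + f(1)
Phi3 = lambda f: (lambda t: f(t) * f(1 + t))
composition = lambda f: Phi2(Phi3(f))   # Phi2 after Phi3, a map R^R -> R

f = math.sin   # any test function f : R -> R will do
print(composition(f))              # these two numbers...
print(f(0) * f(1) + f(1) * f(2))   # ...agree
```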

1.7 Isometries

Exercises

1.7.2. Evaluation map on \(B(S)\).
1.7.2.a
Solution.
For \(f, g \in B(S)\text{,}\) we have the estimate
\begin{equation*} |\Phi(f) - \Phi(g)| = |f(s_0) - g(s_0)| \le \|f - g\|_{\sup}\text{,} \end{equation*}
which shows that \(\Phi\) is Lipschitz continuous (with Lipschitz constant \(1\)).
1.7.2.b
Solution.
In general, \(\Phi\) is not an isometry. Suppose that there exists another point \(s_1 \in S\) with \(s_1 \ne s_0\) and consider a function \(f \in B(S)\) with \(f(s_0) = 0\) and \(f(s_1) \ne 0\text{.}\) Then \(|\Phi(f)| = 0\text{,}\) while \(\n f_{\sup} \ne 0\text{.}\) So \(\Phi\) does not preserve the distance between \(f\) and \(0\text{.}\)
But if \(S = \{s_0\}\) is a single point, then \(\Phi\) is an isometry. Indeed, any \(f,g \in B(S)\) will satisfy
\begin{equation*} \n{f-g}_{\sup} = \sup_{s \in S} |f(s)-g(s)| = |f(s_0)-g(s_0)| = |\Phi(f)-\Phi(g)|\text{.} \end{equation*}
Comment.
While the counterexample in the official solution is perhaps the simplest, there is nothing wrong with using a more complicated one. In previous years, for instance, several students working together took \(S=\R\) and \(s_0=0\) and considered a pair of trigonometric functions, say \(f=\cos\) and \(g = \sin\text{.}\) They then calculated
\begin{equation*} \abs{\Phi(f)-\Phi(g)} = \abs{1-0} = 1 \end{equation*}
and
\begin{equation*} \n{f-g}_{\sup} = \cdots = \sqrt 2 \text{.} \end{equation*}
One minor comment I will make on this is that, rather than explicitly calculating \(\n{f-g}_{\sup}\text{,}\) it is enough to find a point \(s \in \R\) where \(\abs{f(s)-g(s)} \gt 1\text{,}\) since this will force \(\n{f-g}_{\sup} \gt 1\text{.}\)
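The elided calculation \(\n{f-g}_{\sup} = \sqrt 2\) (the maximum of \(\abs{\cos s - \sin s}\) is attained at \(s = 3\pi/4\text{,}\) for instance) can also be spot-checked numerically:
```python
import numpy as np

s = np.linspace(0, 2 * np.pi, 1_000_001)   # one full period suffices
print(np.max(np.abs(np.cos(s) - np.sin(s))))   # ~ 1.41421 = sqrt(2) > 1
```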

2 Completion of Metric Spaces
2.1 Dense Sets

Exercises

2.1.1. (PS5) Dense sets in product spaces.
Solution.
Suppose that \(S\) is dense in \(X\) and \(T\) is dense in \(Y\text{,}\) and let \((x, y) \in X \times Y\) and \(\varepsilon \gt 0\text{.}\) Then by density there exists \(s \in S\) with \(d_X(s, x) \lt \frac{\varepsilon}{2}\) and \(t \in T\) with \(d_Y(t, y) \lt \frac{\varepsilon}{2}\text{.}\) Hence
\begin{equation*} d_{X \times Y}((s, t), (x, y)) \lt \sqrt{\frac{\varepsilon^2}{4} + \frac{\varepsilon^2}{4}} \lt \varepsilon. \end{equation*}
Conversely, suppose that \(S \times T\) is dense in \(X \times Y\text{,}\) and let \(x \in X\text{,}\) \(y \in Y\) and \(\varepsilon \gt 0\text{.}\) Then there exists \((s, t) \in S \times T\) such that \(d_{X \times Y}((s, t), (x, y)) \lt \varepsilon\text{.}\) This implies in particular that \(d_X(s, x) \lt \varepsilon\) and \(d_Y(t,y) \lt \varepsilon\text{.}\) So \(S\) is dense in \(X\) and \(T\) is dense in \(Y\text{.}\)
Comment 1.
Lemma 2.3 gives us other characterisations of what it means for a set to be dense, one in terms of sequences and another in terms of closures. For the proof using sequences we can take advantage of Exercise 1.3.2.
Comment 2.
Since this is a relatively straightforward question, I used it as an opportunity to be picky about ways of writing arguments which don’t scale well when things get a bit more complicated, e.g. Exercise 2.1.2 below. Many students wrote things like the following:
Since \(S\) is dense in \(X\) and \(T\) is dense in \(Y\text{,}\)
\begin{gather*} \forall x \in X\, \forall \varepsilon \gt 0\, \exists s \in S \st d_X(x,s) \lt \varepsilon/\sqrt 2\\ \forall y \in Y\, \forall \varepsilon \gt 0\, \exists t \in T \st d_Y(y,t) \lt \varepsilon/\sqrt 2\text{.} \end{gather*}
So
\begin{gather*} d_{X \times Y}\big((x,y),(s,t)\big) \lt \sqrt{\frac{\varepsilon^2}2 + \frac{\varepsilon^2}2} = \varepsilon\text{.} \end{gather*}
When writing this way the logical status of \(x,y,s,t,\varepsilon\) in the second line is not entirely clear. Given some \(x,y,\varepsilon\) in the definition of \(S \times T\) being dense in \(X \times Y\text{,}\) the author presumably plans to use the first two lines to choose \(s,t\) appropriately? But they haven’t explicitly said this and so we are forced to guess. In my experience with this unit, longer arguments written in this style eventually become impossible to follow, and students who write this way routinely get themselves confused when attempting more difficult problems. The lecture notes will always avoid this sort of ambiguity by explicitly introducing variables and how they relate to one another:
Let \((x,y) \in X \times Y\) and \(\varepsilon \gt 0\text{.}\) Since \(S\) is dense in \(X\) and \(T\) is dense in \(Y\text{,}\)
\begin{gather*} \exists s \in S \st d_X(x,s) \lt \varepsilon/\sqrt 2\\ \exists t \in T \st d_Y(y,t) \lt \varepsilon/\sqrt 2\text{.} \end{gather*}
So
\begin{gather*} d_{X \times Y}\big((x,y),(s,t)\big) \lt \sqrt{\frac{\varepsilon^2}2 + \frac{\varepsilon^2}2} = \varepsilon\text{.} \end{gather*}
I encourage you in the strongest possible terms to start writing your arguments in something closer to this style if you are not already.
2.1.2. (PS5) Uniform continuity and Cauchy sequences.
Solution.
Fix \(\varepsilon \gt 0\text{.}\) Since \(f\) is uniformly continuous, there exists \(\delta \gt 0\) such that \(d_Y(f(x),f(\tilde x)) \lt \varepsilon\) for all \(x,\tilde x \in X\) with \(d_X(x,\tilde x) \lt \delta\text{.}\) Since \(\seq xn\) is Cauchy, there exists \(N \in \N\) such that \(d_X(x_n,x_m) \lt \delta\) for all \(n,m \ge N\text{.}\) By our above choice of \(\delta\text{,}\) this therefore guarantees that \(d_Y(f(x_n),f(x_m)) \lt \varepsilon\text{.}\)
Comment 1.
In the official solution we really do need to pick \(\delta\) depending on \(\varepsilon\) first, and only then pick \(N\) based on \(\delta\text{.}\) It is worth thinking carefully about why doing things in the reverse order, first picking \(N\) and then picking \(\delta\text{,}\) does not work. Errors of this type can be more easily avoided if you practice good ‘variable hygiene’.
Comment 2.
Copied verbatim from previous years, just as relevant this year:
Many students wrote arguments where the logical status of the symbol \(\delta\) (and sometimes also \(\varepsilon\)) was not clear. For instance, some students seemed to be implying that in the definition of uniform continuity they were free to choose both \(\varepsilon \gt 0\) and \(\delta \gt 0\) arbitrarily, which is not true. (Rather, uniform continuity tells us that, for any given \(\varepsilon \gt 0\text{,}\) there exists a suitable choice of \(\delta \gt 0\text{.}\)) In an advanced analysis unit such as this one, this sort of fuzziness with logic and quantifiers is a very fast way to lose marks on an exam. In my experience, students who write arguments in this way are also very likely to get extremely confused when attempting more difficult problems with multiple parts.
In general, I would discourage you from beginning your arguments by copying down all of the definitions involved, complete with all of their quantifiers and potentially conflicting uses of the same symbols, and then writing down a set of relations between these symbols which seems like it gets you the inequality you want. This is fine as a brainstorming technique on scratch paper, but to see if it actually works (and to get full marks on an exam question) you will likely want to be more explicit about the logical flow of your argument. Here the official solution is a good model: It starts from a fixed \(\varepsilon \gt 0\text{,}\) uses continuity to find a corresponding \(\delta \gt 0\text{,}\) and then uses the definition of a Cauchy sequence to find an \(N \in \N\text{.}\) It does not simply copy out all of these definitions at the beginning, and no symbol is used twice to mean different things. For more on this see Section G.2 and Section G.3.
2.1.3. (PS5) Continuous extensions.
2.1.3.a
Solution.
Let \(\varepsilon \gt 0\) and \(x \in [a,b] \cap \Q\text{.}\) Since \(f\) is continuous, there exists \(\delta \gt 0\) such that \(\abs{f(x)-f(y)} \lt \varepsilon\) for all \(y \in [a,b]\) with \(\abs{x-y} \lt \delta\text{.}\) In particular, for all \(y \in [a,b] \cap \Q\) we have by (✶) that
\begin{equation*} \abs{g(x)-g(y)} = \abs{f(x)-f(y)} \lt \varepsilon\text{.} \end{equation*}
2.1.3.b
Solution.
Let \(\varepsilon \gt 0\text{.}\) Since \(f\) is uniformly continuous, there exists \(\delta \gt 0\) such that \(\abs{f(x)-f(y)} \lt \varepsilon\) for all \(x,y \in [a,b]\) with \(\abs{x-y} \lt \delta\text{.}\) In particular, for all \(x,y \in [a,b] \cap \Q\) with \(\abs{x-y}\lt \delta\) we have by (✶) that
\begin{equation*} \abs{g(x)-g(y)} = \abs{f(x)-f(y)} \lt \varepsilon\text{.} \end{equation*}
Comment.
As you have seen in previous analysis units, and as we will see in more generality in Exercise 4.3.3 of Chapter 4, continuity of \(f \maps [a,b] \to \R\) automatically implies uniform continuity, as the domain \([a,b]\) is a compact metric space.
2.1.3.c
Solution.
First observe that this does not immediately follow from the fact that \(\Q\) is dense in \(\R\text{.}\) Indeed, \(\{\sqrt 2\}\) is a perfectly nice metric subspace of \(\R\text{,}\) but \(\{\sqrt 2\} \cap \Q = \varnothing\) is certainly not a dense subset of \(\{\sqrt 2\}\text{.}\)
Let \(x \in [a,b]\) and \(\varepsilon \gt 0\text{.}\) We seek \(q \in [a,b] \cap \Q\) such that \(\abs{x-q} \lt \varepsilon\text{.}\) If \(x \in (a,b)\text{,}\) then we can use the openness of \((a,b)\) to find \(\varepsilon' \in (0,\varepsilon]\) such that \(B_{\varepsilon'}(x) \subset (a,b)\text{.}\) The density of \(\Q\) in \(\R\) then implies that there exists \(q \in \Q\) such that \(q \in B_{\varepsilon'}(x) \subseteq B_\varepsilon(x)\text{,}\) which gives \(q \in [a,b] \cap \Q\) with \(\abs{x-q} \lt \varepsilon\) as desired.
For \(x=b\) things are a bit trickier, because the \(q \in \Q\) that we get from the density of \(\Q\) in \(\R\) might have \(q \gt b\) and hence \(q \notin [a,b] \cap \Q\text{.}\) To get around this, first choose \(n \in \N\) large enough that \(b-1/n \in (a,b)\) and \(1/n \lt \varepsilon/2\text{.}\) Applying our above argument to \(b-1/n\) then yields \(q \in [a,b] \cap \Q\) with \(\abs{q-(b-1/n)} \lt \varepsilon/2\text{.}\) Applying the triangle inequality we conclude that \(\abs{q-b} \le \abs{q-(b-1/n)} + \abs{1/n} \lt \varepsilon/2 + \varepsilon/2 = \varepsilon\text{,}\) as desired. The argument for \(x=a\) is similar.
Here we have argued directly using Definition 2.1, but we could have also used Lemma 2.3 and given an alternative argument in terms of sequences. In either case, the endpoints \(a,b\) need some additional thought.
Comment 1.
Several students gave alternative solutions using the method from Example 5 in the problems classes. This can also work, but depending on how we set things up we may still have to worry a bit about what happens near the endpoints.
Comment 2.
The official solution uses the fact that \(\Q\) is dense in \(\R\) according to the definition in Definition 2.1. Several students cited the slightly stronger fact that between any two distinct real numbers there exists a rational number. If used appropriately, this can lead to a somewhat streamlined argument.
Comment 3.
Several students simply wrote that
\begin{equation*} \overline{[a,b] \cap \Q} = \overline{[a,b]} \cap \overline \Q = [a,b] \cap \R = [a,b]\text{.} \end{equation*}
One question here is in which metric space the various closures are being taken. In any case, it is not in general true that \(\overline{A \cap B} = \overline A \cap \overline B\text{.}\) Indeed, for \(\{\sqrt 2\}\) and \(\Q\) as subsets of \(\R\) we have
\begin{equation*} \overline{\{\sqrt 2\} \cap \Q} = \overline{\varnothing} = \varnothing \end{equation*}
but
\begin{equation*} \overline{\{\sqrt 2\}} \cap \overline \Q = \{\sqrt 2\} \cap \R = \{\sqrt 2\}\text{.} \end{equation*}
2.1.3.d
Solution.
Since \([a,b] \cap \Q\) is dense in \([a,b]\) by the previous part, and the codomain of \(g\) is the complete space \(\R\text{,}\) we can choose \(f\) to be the unique continuous extension of \(g\) guaranteed by Lemma 2.4.
2.1.3.e
Solution.
Let \(x_0 \in (a,b) \without \Q\text{,}\) and define \(g(x) = (x-x_0)^{-1}\text{.}\) Then \(g\) is continuous on \([a,b] \without \{x_0\}\) and hence in particular it is continuous on \([a,b] \cap \Q \subseteq [a,b] \without \{x_0\}\text{.}\)
Suppose that \(f \maps [a,b] \to \R\) satisfies (✶). To see that \(f\) is not continuous, use density to find a sequence \(\seq qn\) of points in \([a,b] \cap \Q\) converging to \(x_0\text{.}\) Then
\begin{equation*} \lim_{n\to\infty} \abs{f(q_n)} = \lim_{n\to\infty} \abs{g(q_n)} = \lim_{n\to\infty} \frac 1{\abs{q_n-x_0}} = \infty\text{.} \end{equation*}
Since \(f(x_0) \in \R\text{,}\) this in particular means (by Theorem 1.46) that \(f\) cannot be continuous at \(x_0\text{.}\)
See the comments for one of many possible alternative solutions.
Comment 1.
Note that it is not enough to find a function \(g\) which has a discontinuous extension (indeed, all functions \(g\) will have many discontinuous extensions, see Example 6). Instead, we need to find a function \(g\) such that no extension could possibly be continuous.
Comment 2.
There are many different functions \(g\) we could have chosen. For instance, we could pick \(x_0 \in (a,b) \without \Q\) and set
\begin{equation*} g(x) = \begin{cases} 0 \amp x \in [a,x_0) \cap \Q, \\ 1 \amp x \in (x_0,b] \cap \Q. \end{cases} \end{equation*}
Since \(x_0 \notin [a,b] \cap \Q\text{,}\) it is easy to show that \(g\) is continuous. Suppose that \(f \maps [a,b] \to \R\) satisfies (✶). Using the density of \(\Q\text{,}\) we can find sequences \(\seq xn\) and \(\seq{x'}n\) in \([a,b] \cap \Q\) converging to \(x_0\text{,}\) with \(x_n \lt x_0 \le x_n'\) for all \(n \in \N\text{.}\) Then
\begin{equation*} \lim_{n \to \infty} f(x_n) = 0 \ne 1 = \lim_{n \to \infty} f(x_n'), \end{equation*}
and so by Theorem 1.46 \(f\) cannot be continuous at \(x_0\text{.}\)
2.1.4. (PS6) \(\C\) is separable.
2.1.4.a
Solution.
Consider the point \(z=i \in \C\text{.}\) Then for any rational number \(q \in \Q\text{,}\) we have
\begin{equation*} \abs{z-q} = \sqrt{(0-q)^2 + (1-0)^2} = \sqrt{q^2 + 1} \ge 1\text{.} \end{equation*}
Thus, setting \(\varepsilon = 1\text{,}\) there is no \(q \in \Q\) with \(\abs{z-q} \lt \varepsilon\text{.}\)
Comment.
Note that the closure \(\overline \Q\) of \(\Q\) depends very much on which metric space we are thinking of \(\Q\) as a subset of. For instance, if we consider \(\Q\) as a subset of itself, then \(\overline \Q = \Q\text{!}\) So in order to use the last part of Lemma 2.3 here we would have to argue that \(\overline \Q = \R\) as a subset of \(\C\).
2.1.4.b
Solution 1. Direct argument
We know that \(\Q\) is countable, so
\begin{equation*} \Q + i\Q = \set{p + iq}{p, q \in \Q}, \end{equation*}
which can be mapped bijectively to \(\Q \times \Q\text{,}\) is countable as well. We want to show that \(\Q + i\Q\) is dense in \(\C\text{.}\)
Let \(z = x + iy \in \C\) and fix \(\varepsilon \gt 0\text{.}\) Since \(\Q\) is dense in \(\R\text{,}\) we can choose \(p, q \in \Q\) such that \(|x - p| \lt \frac{\varepsilon}{2}\) and \(|y - q| \lt \frac{\varepsilon}{2}\text{.}\) Then
\begin{equation*} |z - (p + iq)| = \sqrt{(x-p)^2 + (y-q)^2} \lt \sqrt{\frac{\varepsilon^2}{4} + \frac{\varepsilon^2}{4}} = \frac{\varepsilon}{\sqrt{2}} \lt \varepsilon. \end{equation*}
Hence \(\Q + i\Q\) is dense in \(\C\text{.}\)
Solution 2. Using isometries
It is straightforward to check that the mapping \(f \maps \R \times \R \to \C\) defined by \(f(x,y) = x+iy\) is bijective and an isometry. The metric spaces \(\C\) and \(\R \times \R\) are therefore isometric, and so it suffices to show that \(\R \times \R\) is separable. As \(\Q\) is dense in \(\R\text{,}\) by Exercise 2.1.1 \(\Q \times \Q\) is dense in \(\R \times \R\text{.}\) As \(\Q\) is countable, \(\Q \times \Q\) is countable, and so the proof is complete.
Comment.
The official solution above using isometries is already relatively terse, but some students went one step further and simply wrote \(\C ≅ \R \times \R\) without further comment or explanation. On an exam this would probably have lost some marks. Some students also never explicitly said that they even had a bijective isometry between \(\R \times \R\) and \(\C\text{,}\) and (at least as far as I could tell) only talked about an isometry with domain \(\Q \times \Q\text{.}\) This alone is not enough.

2.2 Existence of the Completion

Exercises

2.2.1. (PS6) Equivalence relation in the proof of Theorem 2.10.
Solution 1. Argument using Exercise 2.2.2
Since we have already done Exercise 1.3.4 and Exercise 2.2.2, we can give a very short proof. Let \(\seq xn, \seq yn, \seq zn\) be Cauchy sequences in \((X,d)\text{.}\) Then by Exercise 2.2.2 we have
\begin{align*} \lim_{n\to\infty} d(x_n,z_n) \amp \le \lim_{n\to \infty} d(x_n,y_n) + \lim_{n\to\infty} d(y_n,z_n), \end{align*}
where the limits all exist thanks to Exercise 1.3.4. If \(\seq xn \sim \seq yn\) and \(\seq yn \sim \seq zn\text{,}\) then the two limits on the right hand side are both \(0\text{,}\) and so (since it must be non-negative) the limit on the left hand side is also \(0\text{,}\) i.e. \(\seq xn \sim \seq zn\) as desired. Thus \(\sim\) is transitive.
The symmetry of \(\sim\) follows from the symmetry of \(d\text{,}\) while reflexivity follows from the fact that \(d(x,x)=0\) for any \(x \in X\text{.}\) Therefore \(\sim\) is an equivalence relation.
Solution 2. Direct argument
Let \(\seq xn, \seq yn, \seq zn\) be Cauchy sequences in \((X,d)\text{,}\) and suppose that \(\seq xn \sim \seq yn\) and \(\seq yn \sim \seq zn\text{.}\) Using the triangle inequality we have, for any \(n \in \N\text{,}\)
\begin{equation*} d(x_n,z_n) \le d(x_n,y_n) + d(y_n,z_n), \end{equation*}
and hence
\begin{align*} d(x_n,z_n) \amp \le d(x_n,y_n) + d(y_n,z_n)\\ \amp \to 0 + 0 = 0 \end{align*}
as \(n \to \infty\text{,}\) which implies that \(\seq xn \sim \seq zn\text{.}\) Thus \(\sim\) is transitive.
Note that here we can actually get away without using Exercise 1.3.4. The limits on the right hand side are zero by assumption. Since the left hand side is nonnegative, this proves that its limit also exists and is also zero.
The symmetry of \(\sim\) follows from the symmetry of \(d\text{,}\) while reflexivity follows from the fact that \(d(x,x)=0\) for any \(x \in X\text{.}\) Therefore \(\sim\) is an equivalence relation.
Comment 1.
Both of the solutions above explain why the limit \(\lim_{n\to\infty} d(x_n,z_n)\) exists. Simply assuming that it exists, without providing any justification whatsoever, would probably have cost marks on an exam.
Comment 2.
This question ends with ‘Conclude that…’. This means that you should write something about how this conclusion can in fact be reached. Failing to do so on an exam would likely cost marks.
2.2.2. (PS5) Triangle inequality in Theorem 2.10.
Solution.
Using the triangle inequality we have, for any \(n \in \N\text{,}\)
\begin{equation*} d(x_n,z_n) \le d(x_n,y_n) + d(y_n,z_n). \end{equation*}
Taking \(n \to \infty\text{,}\) we deduce that
\begin{align*} \lim_{n\to\infty} d(x_n,z_n) \amp \le \lim_{n\to \infty} \Big(d(x_n,y_n) + d(y_n,z_n)\Big)\\ \amp = \lim_{n\to \infty} d(x_n,y_n) + \lim_{n\to\infty} d(y_n,z_n), \end{align*}
where the limits all exist thanks to Exercise 1.3.4.
Comment 1.
Recall from Analysis 1 that, if \(\seq an\) and \(\seq bn\) are sequences of real numbers, it is not always true that
\begin{equation*} \lim_{n\to\infty} (a_n+b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n\text{.} \end{equation*}
For instance we could have \(a_n = n\) and \(b_n = -n\text{.}\) Then neither \(\seq an\) nor \(\seq bn\) converges (in the sense of Definition 1.29), whereas the limit on the left hand side is \(0\text{.}\)
If this had been an exam question, explaining why the limits exist in the first place would have counted for a non-negligible portion of the total marks.
Comment 2.
Note that there is no reason whatsoever for the Cauchy sequences \(\seq xn, \seq yn, \seq zn\) to be convergent, as we have not assumed that \((X,d)\) is complete. Indeed, this exercise is a step in the proof of Theorem 2.10, which is only interesting in the case where \((X,d)\) is not complete.
2.2.3. (PS5) Completion vs. closure.
Solution.
Thanks to Theorem 1.23, \(\overline S\) is closed in \((X,d)\text{.}\) Since \((X,d)\) is complete, Theorem 1.41 then implies that \((\overline S,d_{\overline S})\) is also complete. It remains to show that there is a subset \(S_0 \subset \overline S\) which is both isometric to \(S\) and dense in the metric space \((\overline S,d_{\overline S})\text{.}\) We simply take \(S_0=S\text{,}\) which is clearly isometric to \(S\text{.}\) To see that \(S\) is dense in \((\overline S,d_{\overline S})\text{,}\) let \(x \in \overline S\) and \(\varepsilon\gt 0\text{.}\) Since \(x \in \overline S\) we know that \(B_{\varepsilon}(x)\cap S \ne \varnothing\text{.}\) It follows that there exists an \(s \in S\) with \(d_{\overline S}(s,x) = d(s,x) \lt \varepsilon\text{.}\)
Comment.
We can slightly shorten the last step of the above argument by appealing to Lemma 2.3, but there is a tiny wrinkle. When we originally took the closure of \(S\text{,}\) we were thinking of it as a subset of \((X,d)\text{.}\) To use Lemma 2.3, we need this to be the same thing as taking the closure of \(S\) as a subset of \((\overline S, d_{\overline S})\text{.}\) Thankfully this turns out to be the case, but in the official solutions we prefer to just give a direct argument and sidestep this whole issue.
If you did want to talk about the closure of \(S\) as a subset of \((X,d)\) and as a subset of \((\overline S, d_{\overline S})\) in the same sentence, probably the least confusing thing to do is to give them different names: “Let \(C \subseteq X\) be the closure of \(S\) as a subset of \((X,d)\text{,}\) and let \(C' \subseteq \overline S\) be the closure of \(S\) as a subset of \((\overline S, d_{\overline S})\)”.

2.4 Normed and Inner Product Spaces

Exercises

2.4.2. (PS6) Scalar multiplication and Cauchy sequences.
2.4.2.a
Solution.
If \(\alpha = 0\text{,}\) then \(\seq{\alpha x}n\) is the constant sequence \((0,0,\ldots)\text{,}\) which is clearly convergent and hence Cauchy. So assume that \(\alpha \ne 0\) and let \(\varepsilon \gt 0\text{.}\) Since \(\seq xn\) is Cauchy, we can choose \(N \in \N\) such that, for all \(n,m \ge N\text{,}\) \(\n{x_n-x_m}_X \lt \varepsilon/\abs \alpha\text{.}\) Then, for all \(n,m \ge N\text{,}\) we have
\begin{align*} \n{\alpha x_n-\alpha x_m}_X \amp = \n{\alpha (x_n- x_m)}_X\\ \amp = \abs \alpha \n{x_n- x_m}_X\\ \amp \lt \abs \alpha \frac \varepsilon{\abs \alpha}\\ \amp = \varepsilon\text{.} \end{align*}
Comment.
In the argument above we want to show that the sequence \((\alpha x_n)_{n\in\N}\) is Cauchy, and so given an \(\varepsilon \gt 0\) we must choose \(N \in \N\) appropriately based on this \(\varepsilon\text{.}\) This logical relationship between \(\varepsilon\) and \(N\) is at the heart of the definition of a Cauchy sequence, and getting it wrong on an exam (or being overly vague) would probably mean losing most or even all of the marks on a question like this one — even though the resulting argument might superficially look quite similar to the official solution. If you are at all confused about this important point, I encourage you in the strongest possible terms to look back at Exercise B.1 from Problem Sheet 1 as well as all of Appendix G, and also to get in touch with me over email or in office hours.
2.4.2.b
Solution.
We simply calculate
\begin{align*} \lim_{n\to \infty} \n{\alpha x'_n - \alpha x_n}_X \amp = \lim_{n\to \infty}\Big( \abs \alpha \n{x'_n - x_n}_X \Big)\\ \amp = \abs \alpha \lim_{n\to \infty} \n{x'_n - x_n}_X \\ \amp = 0\text{.} \end{align*}

3 Some Function Spaces
3.1 \(C^k([a,b])\)

Exercises

3.1.1. (PS6) Calculating and estimating \(C^k\) norms.
3.1.1.a
Solution.
From Exercise 1.1.7 we know that \(\n f_{C^0([0,1])} = \n f_{\sup} = 1+e\text{.}\) Next we calculate
\begin{align*} \n{f'}_{C^0([0,1])} \amp = \sup_{t \in [0,1]} \abs{f'(t)}\\ \amp = \sup_{t \in [0,1]} \abs{1+e^t}\\ \amp = 1+e\text{,} \end{align*}
where we have used that \(t \mapsto 1+e^t\) is positive and increasing on \([0,1]\text{.}\) Similarly we find
\begin{align*} \n{f''}_{C^0([0,1])} \amp = \sup_{t \in [0,1]} \abs{f''(t)}\\ \amp = \sup_{t \in [0,1]} \abs{e^t}\\ \amp = e\text{.} \end{align*}
Putting everything together, we have
\begin{align*} \n f_{C^2([0,1])} \amp = \n f_{C^0([0,1])} + \n{f'}_{C^0([0,1])} + \n{f''}_{C^0([0,1])}\\ \amp = (1+e)+(1+e)+e\\ \amp = 2+3e\text{.} \end{align*}
Comment.
Many students wrote, e.g., \(\n{e^t}_{C^0}\) as a shorthand for the \(C^0([0,1])\) norm of the function \(t \mapsto e^t\text{.}\) This breaks our rules laid down in Section 1.6, because it conflates the expression (or maybe formula) \(e^t\) with the function \(t \mapsto e^t\text{,}\) \([0,1] \to \R\text{.}\) So technically it would be more correct to write something like \(\n{t\mapsto e^t}_{C^0}\text{.}\) In the context of this relatively simple exercise, though, this didn’t seem to cause any real confusion.
On an exam I would not take off marks for this particular abuse of notation in and of itself. But do be warned that in more complicated problems this type of notational abuse can lead to real confusion and many lost marks. Be careful!
3.1.1.b
Solution.
From Exercise 1.1.8 we know that \(\n f_{C^0([-1,2])} = \n f_{\sup} \le 11\text{.}\) Estimating each term in the sum separately, we similarly find
\begin{align*} \n{f'}_{C^0([-1,2])} \amp= \sup_{t \in [-1,2]} \abs{f'(t)}\\ \amp = \sup_{t \in [-1,2]} \left|1+2t-\frac 32 t^2+ \frac 1{20}t^4\right|\\ \amp\le \sup_{t \in [-1,2]} \left(1 + 2\abs t+\frac 32 \abs t^2+ \frac 1{20} \abs t^4\right)\\ \amp\le 1 + 2\sup_{t \in [-1,2]} \abs t + \frac 32 \sup_{t \in [-1,2]} \abs t^2 + \frac 1{20} \sup_{t \in [-1,2]} \abs t^4\\ \amp\le 1 + 2 \cdot 2 + \frac 32 \cdot 2^2 + \frac 1{20} \cdot 2^4\\ \amp = 11 + \frac 4{5} \le 12\text{.} \end{align*}
Thus
\begin{align*} \n f_{C^1([-1,2])} \amp = \n f_{C^0([-1,2])} + \n{f'}_{C^0([-1,2])}\\ \amp \le 11 + 12 = 23\text{.} \end{align*}
Comment 1.
See the first comment for the previous part of this exercise.
Comment 2.
Some students were very brief in their solutions to this part of the exercise, for instance writing
\begin{equation*} \n{f'}_{C^0} \le 1 + 4 + 6 + \tfrac {16}{20} \end{equation*}
without any further explanation. On an exam this would certainly be living dangerously! If you happen to do the estimate exactly as I had expected you to, and get exactly the same sum on the right hand side, then I can at least guess at what your thought process might be. But if you do something else, e.g. if you write something confusing like
\begin{equation*} \n{f'}_{C^0} \le 0 + 8 + 0 - \tfrac 1{20}\text{,} \end{equation*}
then it becomes very hard for me to read your mind, and basically impossible for me to assign you even partial marks.
For this particular problem, I think the best thing to do is just give at least a few intermediate steps. Alternatively, you could try to explain in words what you are doing, e.g. “using the triangle inequality for \(\n\blank_{C^0}\) and estimating each term separately, I get…”.

3.2 Hölder Spaces

Exercises

3.2.2. (PS7) Regularity of \(x \mapsto x^\alpha\).
3.2.2.a
Solution.
We know that \(f\) is continuously differentiable on the open interval \((0,1)\) with \(f'(x)=\alpha x^{\alpha-1}\text{.}\) Since \(\alpha-1 \lt 0\text{,}\) we have \(f'(x) \to \infty\) as \(x \to 0\) with \(x \gt 0\text{.}\)
(In other words, for all \(N \gt 0\) there exists \(\delta \gt 0\) such that \(f'(x) \ge N\) for \(x \in (0,\delta)\text{.}\))
Thus there cannot exist \(g \in C^0([0,1])\) with \(f'(t)=g(t)\) for all \(t \in (0,1)\text{,}\) and so by definition \(f \notin C^1([0,1])\text{.}\)
To give a bit more detail about the key step, suppose for the sake of contradiction that there were such a \(g \in C^0([0,1])\text{.}\) Then \(g\) would be sequentially continuous, and so we would have
\begin{equation*} \R \ni g(0) = \lim_{n \to \infty} g(1/n) = \lim_{n \to \infty} f'(1/n) = \infty\text{,} \end{equation*}
which is a contradiction.
Comment.
Some students wrote things along the lines of “\(g(x)\) is undefined at \(x=0\)” or “\(f'(x)\) is undefined at \(x=0\)” without further explanation. Without more context, these statements do not have a precise meaning. The derivative \(f'\text{,}\) by the usual limit definition, only makes sense on \((0,1)\text{,}\) and so is undefined at \(1\) just as much as it is undefined at \(0\text{.}\) The function \(g\text{,}\) on the other hand, is a hypothetical function \([0,1] \to \R\text{,}\) i.e. a hypothetical function whose values at \(0\) and \(1\) are defined.
For instance, one way to argue would be to first say that, since \(g\) is an extension of \(f'\text{,}\) it must be given by
\begin{equation*} g(x) = \begin{cases} A \amp \text{ if } x=0 \\ \alpha x^{\alpha - 1} \amp \text{ if } x \in (0,1) \\ B \amp \text{ if } x = 1 \end{cases} \end{equation*}
for some choices of \(A,B \in \R\text{.}\) Certainly at this point we could think of \(g(0)=A\) as a well-defined real number. The next step in the argument, as in the official solution, would be to explain why no choice of \(A\) makes \(g\) continuous at \(0\text{.}\)
Several students also seemed to implicitly assume that \(f'\) and \(g\) necessarily have the same ‘formula’ \(x \mapsto \alpha x^{\alpha-1}\text{.}\) This is in general false when speaking of continuous extensions. For instance the function \(x \mapsto \sin(x)/x\) is continuous on \((0,1)\) and has a continuous extension \(g\) to \([0,1]\) with a different, piecewise formula
\begin{equation*} g(x) = \begin{cases} 1 \amp \text{ if } x = 0, \\ \frac{\sin x}x \amp \text{ if } x \ne 0. \end{cases} \end{equation*}
3.2.2.b
Solution.
We estimate
\begin{align*} [f]_{C^{0,\beta}([0,1])} \amp= \sup_{x,y \in [0,1],\, x \ne y} \frac{\abs{f(x)-f(y)}}{\abs{x-y}^\beta}\\ \amp\ge \sup_{x \in (0,1]} \frac{\abs{f(x)-f(0)}}{\abs{x-0}^\beta}\\ \amp= \sup_{x \in (0,1]} \frac{x^\alpha}{x^\beta}\\ \amp=\infty\text{.} \end{align*}
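The last equality deserves a word of justification: since \(\alpha \lt \beta\text{,}\) taking \(x = 1/n\) gives
\begin{equation*} \sup_{x \in (0,1]} \frac{x^\alpha}{x^\beta} = \sup_{x \in (0,1]} x^{\alpha-\beta} \ge n^{\beta-\alpha} \to \infty \end{equation*}
as \(n \to \infty\text{.}\)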
Comment.
Several students, following the hint, assumed that
\begin{align*} [f]_{C^{0,\beta}([0,1])} \amp= \sup_{x,y \in [0,1],\, x \ne y} \frac{\abs{f(x)-f(y)}}{\abs{x-y}^\beta}\\ \amp= \sup_{x \in (0,1]} \frac{\abs{f(x)-f(0)}}{\abs{x-0}^\beta}\text{.} \end{align*}
This is absolutely not true for general functions \(f \in C^{0,\beta}([0,1])\text{.}\) To see why, let’s think about what these suprema mean in more detail. Define
\begin{equation*} S = \set{(x,y)}{x,y \in [0,1],\, x \ne y} \end{equation*}
and
\begin{equation*} T = \set{(x,0)}{x \in (0,1]}\text{.} \end{equation*}
Then \(T \subset S\text{,}\) and so
\begin{align*} [f]_{C^{0,\beta}([0,1])} \amp= \sup_{(x,y) \in S}\frac{\abs{f(x)-f(y)}}{\abs{x-y}^\beta}\\ \amp\ge \sup_{(x,y) \in T}\frac{\abs{f(x)-f(y)}}{\abs{x-y}^\beta}\\ \amp= \sup_{x \in (0,1]} \frac{\abs{f(x)-f(0)}}{\abs{x-0}^\beta}\text{,} \end{align*}
where the inequality comes from the fact \(A \subseteq B\) implies \(\sup A \le \sup B\text{.}\)
3.2.2.c
Solution.
We know that \(f\) is continuous on \([0,1]\text{,}\) and easily calculate
\begin{equation*} \n f_{C^0([0,1])} = \sup_{x \in [0,1]} \abs{x^\alpha} = 1 \text{.} \end{equation*}
To estimate \([f]_{C^{0,\alpha}([0,1])}\text{,}\) we follow the hint. Differentiating, we find that \((t+1)^\alpha - t^\alpha\) is a strictly decreasing function of \(t \ge 0\text{,}\) and hence that
\begin{equation*} (t+1)^\alpha - t^\alpha \le (0+1)^\alpha - 0^\alpha = 1 \quad \text{ for } t \ge 0. \end{equation*}
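To spell out the differentiation step behind this monotonicity claim: for \(t \gt 0\text{,}\)
\begin{equation*} \frac{d}{dt}\Big( (t+1)^\alpha - t^\alpha \Big) = \alpha \big( (t+1)^{\alpha-1} - t^{\alpha-1} \big) \lt 0\text{,} \end{equation*}
since \(\alpha \gt 0\) while \(s \mapsto s^{\alpha-1}\) is strictly decreasing on \((0,\infty)\) thanks to \(\alpha - 1 \lt 0\text{.}\)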
For \(0 \le y \lt x\text{,}\) we then use rules of exponents to find
\begin{align*} \frac{x^\alpha-y^\alpha}{(x-y)^\alpha} \amp= \Big(\frac x{x-y}\Big)^\alpha- \Big(\frac y{x-y}\Big)^\alpha \\ \amp= \Big(1+\frac y{x-y}\Big)^\alpha- \Big(\frac y{x-y}\Big)^\alpha \\ \amp= (t+1)^\alpha - t^\alpha\\ \amp\le 1\text{,} \end{align*}
where \(t = y/(x-y) \ge 0\text{.}\) This allows us to estimate
\begin{align*} [f]_{C^{0, \alpha}([0,1])} \amp= \sup_{x, y \in [0,1], \ x \ne y} \frac{|f(y) - f(x)|}{|x - y|^\alpha}\\ \amp= \sup_{0 \le y \lt x \le 1} \frac{|f(y) - f(x)|}{|x - y|^\alpha}\\ \amp= \sup_{0 \le y \lt x \le 1} \frac{x^\alpha-y^\alpha}{(x-y)^\alpha} \\ \amp\le 1\text{.} \end{align*}
(By considering the supremum over \(x \in [0,1]\) with \(y=0\text{,}\) we can in fact show that it is equal to 1.)
3.2.3. (PS7) \(C^1\) as a subset of \(C^{0,1}\).
3.2.3.a
Solution.
First we must show that \(C^1([a,b])\) is a subset of \(C^{0,1}([a,b])\text{.}\) Let \(f \in C^1([a,b])\text{.}\) Then by definition \(f \in C^0([a,b])\text{,}\) and thus it suffices to show that \([f]_{C^{0,1}([a,b])} \lt \infty\text{.}\) So let \(x, y \in [a,b]\text{.}\) By the mean value theorem, there exists some \(c\) between \(x\) and \(y\) such that
\begin{equation*} \frac{f(x)-f(y)}{x-y} = f'(c)\text{.} \end{equation*}
In particular,
\begin{equation*} \frac{\abs{f(x)-f(y)}}{\abs{x-y}} = \abs{f'(c)} \le \n{f'}_{C^0([a,b])}\text{.} \end{equation*}
Taking a supremum we conclude that \([f]_{C^{0,1}([a,b])} \le \n{f'}_{C^0([a,b])}\text{,}\) which is finite since \(f \in C^1([a,b])\text{,}\) and hence that \(f \in C^{0,1}([a,b])\text{.}\) Thus \(C^1([a,b])\) is a subset of \(C^{0,1}([a,b])\text{.}\) As \(C^1([a,b])\) is closed under the vector space operations, it is therefore a linear subspace of \(C^{0,1}([a,b])\text{.}\)
Comment 1.
A comment from last year: It is not enough to observe that \(\abs{f(x)-f(y)}/\abs{x-y}\) is finite for any fixed \(x \ne y\) and also in the limit as \(y \to x\) — we must show that the supremum of these finite values is also bounded. On the other hand, if we could show, e.g., that \(\abs{f(x)-f(y)}/\abs{x-y}\) extends to a continuous function of \((x,y) \in [a,b]^2\text{,}\) then we could appeal to (a suitable generalisation of) the Weierstrass extreme value theorem.
Comment 2.
In principle the fact that \(C^1([a,b])\) is closed under vector space operations is contained in Theorem 3.2, but of course we skipped that part of the proof. So many students, not unreasonably, decided to give proofs here. In the context of such an argument, it is important to remember that the relevant operations here are vector addition and scalar multiplication. In other words, for \(f,g \in C^1([a,b])\) and \(\alpha \in \R\text{,}\) we must show that \(f+g\) and \(\alpha f\) also lie in \(C^1([a,b])\text{.}\) We do not need to show that the product \(fg\) (i.e. \(t \mapsto f(t)g(t)\)) is an element of \(C^1([a,b])\text{!}\) Linear subspaces which are closed under this additional operation are called algebras, and will appear later in Chapter 5.
3.2.3.b
Solution.
We have already shown that, for \(f \in C^1([a,b])\text{,}\) \([f]_{C^{0,1}([a,b])} \le \n{f'}_{C^0([a,b])}\text{.}\) Recalling the formulas for the \(C^1([a,b])\) and \(C^{0,1}([a,b])\) norms, we will therefore be done if we can show the reverse inequality. For any \(x,y \in [a,b]\) with \(x\ne y\text{,}\) we have
\begin{equation*} \frac{\abs{f(x)-f(y)}}{\abs{x-y}} \le [f]_{C^{0,1}([a,b])} \text{.} \end{equation*}
Thus for any \(x \in (a,b)\) we have
\begin{equation*} \abs{f'(x)} = \lim_{y \to x} \frac{\abs{f(x)-f(y)}}{\abs{x-y}} \le [f]_{C^{0,1}([a,b])} \text{.} \end{equation*}
It follows that
\begin{equation*} \sup_{x \in (a,b)} |f'(x)| \le [f]_{C^{0, 1}([a,b])}. \end{equation*}
Since \(f'\) extends continuously to \([a,b]\text{,}\) this then implies \(\|f'\|_{C^0([a,b])} \le [f]_{C^{0, 1}([a,b])}\) as desired.
3.2.3.c
Solution.
By Part b, for any \(f \in C^1([a,b])\) we have
\begin{align} \n f_{C^{0,1}([a,b])} \amp = \n f_{C^0([a,b])} + [f]_{C^{0,1}([a,b])}\notag\\ \amp = \n f_{C^0([a,b])} + \n{f'}_{C^0([a,b])}\notag\\ \amp = \n f_{C^1([a,b])}\text{,}\tag{✶} \end{align}
as desired. Thus \(\n\blank_{C^1([a,b])}\) is the restriction of \(\n\blank_{C^{0,1}([a,b])}\) to the set \(C^1([a,b])\text{.}\) Combining with Part a, we conclude that \(\big(C^1([a,b]),\n\blank_{C^1([a,b])}\big)\) is a normed subspace of \(\big(C^{0,1}([a,b]),\n\blank_{C^{0,1}([a,b])}\big)\text{.}\)
3.2.3.d
Solution.
Since \(C^1([a,b])\) is complete when equipped with the \(C^1([a,b])\) norm (Theorem 3.2), by Part c it is also complete as a metric subspace of \(C^{0,1}([a,b])\text{.}\) Since \(C^{0,1}([a,b])\) is complete by Theorem 3.5, Theorem 1.41 therefore implies that \(C^1([a,b])\) must be closed as a subset of \(C^{0,1}([a,b])\text{.}\)
Comment.
Note that we cannot apply Theorem 1.41 immediately, without first showing something like (✶). This is because Theorem 1.41 requires not only that \(Y \subseteq X\text{,}\) but that \((Y,d')\) is a metric subspace of \((X,d)\text{,}\) i.e. \(d\) and \(d'\) must agree on \(Y \times Y\text{.}\) But here it is not immediately obvious that \(\n\blank_{C^1([a,b])}\) and \(\n\blank_{C^{0,1}([a,b])}\) agree on \(C^1([a,b])\text{,}\) and so we have to work to prove something like (✶).
3.2.5. (PS8) \(C^{1,\alpha}([a,b])\).
Solution.
Expanding out the definitions, we note that the norm in (3.2) could equivalently be written
\begin{gather} \n f_{C^{1,\alpha}([a,b])} = \n f_{C^0([a,b])} + \n{f'}_{C^{0,\alpha}([a,b])}.\tag{#} \end{gather}
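In more detail, assuming as in the lecture notes that (3.2) reads \(\n f_{C^{1,\alpha}([a,b])} = \n f_{C^1([a,b])} + [f']_{C^{0,\alpha}([a,b])}\text{,}\) we have
\begin{align*} \n f_{C^{1,\alpha}([a,b])} \amp = \n f_{C^0([a,b])} + \n{f'}_{C^0([a,b])} + [f']_{C^{0,\alpha}([a,b])}\\ \amp = \n f_{C^0([a,b])} + \n{f'}_{C^{0,\alpha}([a,b])}\text{,} \end{align*}
where in the last step we have used the definition \(\n{f'}_{C^{0,\alpha}} = \n{f'}_{C^0} + [f']_{C^{0,\alpha}}\text{.}\)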
As in the lecture notes, we will neglect to check that \(C^{1,\alpha}([a,b])\) is a normed space, and focus on completeness.
Let \(\seq fn\) be a Cauchy sequence in \(C^{1,\alpha}([a,b])\text{.}\) We will be done if we can show that this sequence is convergent in this space. By (3.2), \(\seq fn\) is also a Cauchy sequence in \(C^1([a,b])\text{.}\) Since \(C^1([a,b])\) is complete by Theorem 3.2, this sequence therefore converges in \(C^1([a,b])\) to some limit \(f \in C^1([a,b])\text{:}\)
\begin{gather} \n{f_n - f}_{C^1([a,b])} \to 0 \text{ as } n \to \infty \text{.}\tag{✶} \end{gather}
In particular, this means that
\begin{gather} \n{f'_n - f'}_{C^0([a,b])} \to 0 \text{ as } n \to \infty\text{.}\tag{†} \end{gather}
Looking at (#) we also notice that the sequence \(\seq {f'}n\) is Cauchy in \(C^{0,\alpha}([a,b])\text{.}\) Thus by Theorem 3.5 it is convergent in this space, with some limit \(g \in C^{0,\alpha}([a,b])\text{:}\)
\begin{gather} \n{f'_n - g}_{C^{0,\alpha}([a,b])} \to 0 \text{ as } n \to \infty \text{.}\tag{✶✶} \end{gather}
In particular, this means that
\begin{gather} \n{f'_n - g}_{C^0([a,b])} \to 0 \text{ as } n \to \infty\text{.}\tag{††} \end{gather}
Comparing (†) and (††), the uniqueness of limits (Theorem 1.34) in \(C^0([a,b])\) means that \(g=f'\text{.}\)
Therefore we have \(f' = g \in C^{0,\alpha}([a,b])\text{,}\) and hence that \(f \in C^{1,\alpha}([a,b])\text{.}\) Finally, combining (✶) and (✶✶) and using (#), we have
\begin{align*} \n{f_n - f}_{C^{1,\alpha}([a,b])} \amp = \n{f_n - f}_{C^0([a,b])} + \n{f_n' - f'}_{C^{0,\alpha}([a,b])} \\ \amp = \n{f_n - f}_{C^0([a,b])} + \n{f_n' - g}_{C^{0,\alpha}([a,b])} \\ \amp \to 0 \end{align*}
as \(n \to \infty\text{.}\)
Comment 1.
Suppose that \((X,d_X)\) and \((Y,d_Y)\) are metric spaces, and that, as sets, \(Y \subseteq X\text{.}\) If this is all we know, then we cannot say anything at all about the relationship between Cauchy sequences in \((X,d_X)\) and in \((Y,d_Y)\text{.}\) Indeed, without information about how \(d_X\) and \(d_Y\) compare, we can say very little at all about how these two spaces are related.
If, as in the above problem, though, we happen to know that \(d_X(y_1,y_2) \le d_Y(y_1,y_2)\) for all \(y_1,y_2 \in Y\text{,}\) then we can show that Cauchy sequences in \((Y,d_Y)\) are also Cauchy in \((X,d_X)\text{.}\)
Comment 2.
The above proof is slightly slick, which is not always a good thing in terms of understanding. Another perfectly valid argument involves repeating most of the key steps in the proof of Theorem 3.5.
Comment 3.
In the official solution, we first argue that \(\seq fn\) converges in \(C^1([a,b])\) while \(\seq{f'}n\) converges in \(C^{0,\alpha}([a,b])\text{.}\) Call the first limit \(f\) and the second limit \(g\text{.}\) While it turns out that \(f'=g\text{,}\) this is not immediately obvious, and requires justification. For this reason, it is a particularly bad idea to call the second limit \(f'\) from the get-go. The official solution uses the relationships between the various norms and the uniqueness of limits to establish \(f'=g\text{,}\) but this is not the only way to argue. For instance, it is possible that in Analysis 2A you were taught a theorem which is relevant here. One can also argue very directly, as in the proof of Theorem 3.2.
3.2.6. (PS7) Inclusions between Hölder spaces.
Solution.
Let \(x, y \in [a,b]\) with \(x \ne y\text{.}\) Then
\begin{align*} \abs{f(x)-f(y)} \amp \le [f]_{C^{0,\beta}([a,b])} \abs{x-y}^\beta \\ \amp \le [f]_{C^{0,\beta}([a,b])} \abs{x-y}^{\beta-\alpha} \abs{x-y}^\alpha\\ \amp \le [f]_{C^{0,\beta}([a,b])} \abs{b-a}^{\beta-\alpha} \abs{x-y}^\alpha\text{.} \end{align*}
Dividing by \(\abs{x-y}^\alpha\) and taking a supremum yields the desired estimate for \([f]_{C^{0,\alpha}([a,b])}\text{.}\) In particular, \([f]_{C^{0,\alpha}([a,b])} \lt \infty\text{,}\) and so \(f \in C^{0,\alpha}([a,b])\text{.}\) Since \(f \in C^{0,\beta}([a,b])\) was arbitrary, we conclude that \(C^{0,\beta}([a,b]) \subseteq C^{0,\alpha}([a,b])\text{.}\)
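Written out, the estimate obtained this way is
\begin{equation*} [f]_{C^{0,\alpha}([a,b])} \le \abs{b-a}^{\beta-\alpha} [f]_{C^{0,\beta}([a,b])}\text{,} \end{equation*}
which is finite precisely because \(f \in C^{0,\beta}([a,b])\text{.}\)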
Comment 1.
If, on an exam, you write
\begin{gather*} \sup_{x,y \in [a,b],\ x \ne y} \frac{ \abs{f(x)-f(y)}}{\abs{x-y}^\beta} = \sup \left(\frac{ \abs{f(x)-f(y)}}{\abs{x-y}^\alpha} \abs{x-y}^{\alpha-\beta} \right) \le \sup \cdots, \end{gather*}
then I will assume that the second two suprema are over the same values of \(x\) and \(y\text{.}\) But if you just write
\begin{gather*} \sup \frac{ \abs{f(x)-f(y)}}{\abs{x-y}^\beta} = \sup \left(\frac{ \abs{f(x)-f(y)}}{\abs{x-y}^\alpha} \abs{x-y}^{\alpha-\beta} \right) \le \sup \cdots, \end{gather*}
and never indicate what set your supremum is being taken over in any way, then you will likely lose marks.
Comment 2.
This question ends with the phrase ‘and conclude that…’. This means that you should write something about how this conclusion can in fact be reached. Failing to do so on an exam would likely cost marks.
Comment 3.
While this question generally went quite well this year, there were a handful of confusing solutions which involved statements like
\begin{gather*} [f]_{C^{0,\alpha}} = \frac{\abs{f(x)-f(y)}}{\abs{x-y}^\alpha} = \frac{\abs{f(x)-f(y)}}{\abs{x-y}^{\beta}} \abs{x-y}^{\beta-\alpha} = [f]_{C^{0,\beta}} \abs{x-y}^\alpha \end{gather*}
where the supremum in the definition of the Hölder seminorm has disappeared, making everything an equality, and the logical status of \(x,y\) becomes deeply unclear. Such solutions would have received few (if any) marks on an exam.

3.3 Integration Review

Exercises

3.3.1. (PS6) Integrals involving piecewise-linear functions.
3.3.1.a
Solution.
Figure 3.1. The function \(g_c\) in Exercise 3.3.1.
We calculate
\begin{align*} \int_{-1}^1 g_c(t)\, dt \amp = \int_{-1}^0 g_c(t)\, dt + \int_0^c g_c(t)\, dt + \int_c^1 g_c(t)\, dt\\ \amp = \int_{-1}^0 0\, dt + \int_0^c \frac tc\, dt + \int_c^1 1\, dt\\ \amp = \frac c2 + (1-c)\\ \amp = 1 - \frac c2\text{.} \end{align*}
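If any of the individual integrals here are unclear, they can be spelled out via antiderivatives; for instance,
\begin{equation*} \int_0^c \frac tc\, dt = \left[ \frac{t^2}{2c} \right]_{t=0}^{t=c} = \frac c2\text{.} \end{equation*}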
3.3.1.b
Solution.
Like \(g_c\) and \(g_d\text{,}\) the function \(f\) is also piecewise linear, just with more pieces. Going through the various cases we find
\begin{equation*} f(t) = \begin{cases} 0 \amp \text{ if } t \in [-1,0] \\ t/c - t/d \amp \text{ if } t \in [0,c] \\ 1 - t/d \amp \text{ if } t \in [c,d] \\ 0 \amp \text{ if } t \in [d,1]. \end{cases} \end{equation*}
In particular, \(f(t)=0\) for \(t \notin [0,d]\text{.}\) Since \(0 \lt c \le d\text{,}\) \(f\) is increasing on \([0,c]\) and decreasing on \([c,d]\text{,}\) as shown in the following graph.
Graph of the function \(f\text{.}\)
From the monotonicity, or just from looking at the graph, we see that
\begin{equation*} 0 \le f(t) \le 1 - \frac cd \le 1 \end{equation*}
for all \(t \in [-1,1]\text{,}\) as desired. Alternatively, for the upper bound we could have estimated
\begin{equation*} \sup_{t \in [-1,1]} f(t) \le \sup_{t \in [-1,1]} g_c(t) - \inf_{t \in [-1,1]} g_d(t) = 1 - 0 = 1\text{.} \end{equation*}
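Either way, the constant \(1 - c/d\) in the first bound is just the value of \(f\) at its peak \(t = c\text{,}\) where the two linear pieces meet:
\begin{equation*} f(c) = \frac cc - \frac cd = 1 - \frac cd\text{.} \end{equation*}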
3.3.1.c
Solution.
We have shown in Part b that \(f(t) \ge 0\) for all \(t \in [-1,1]\text{,}\) and so the inequalities on the left hand side follow from Theorem 3.7. To prove the inequalities on the right hand side, we split up the integral and use the upper bound in Part b,
\begin{align*} \int_{-1}^1 f(t)\, dt \amp = \int_{-1}^0 f(t)\, dt + \int_0^d f(t)\, dt + \int_d^1 f(t)\, dt \\ \amp = 0 + \int_0^d f(t)\, dt + 0\\ \amp \le d \sup_{t \in [0,d]} f(t)\\ \amp \le d\text{.} \end{align*}
Similarly we estimate
\begin{align*} \int_{-1}^1 (f(t))^2\, dt \amp = \int_0^d (f(t))^2\, dt \\ \amp \le d \sup_{t \in [0,d]} (f(t))^2\\ \amp \le d\text{.} \end{align*}
Comment 1.
While the question (and hint) both ask you to estimate these integrals rather than calculate them exactly, at least a few students still calculated them explicitly. I strongly recommend getting used to doing estimates as in the official solution. This can be much faster, works in many situations where the integral cannot be calculated exactly, and is less prone to calculation errors.
For what it’s worth, the explicit values of the integrals here are
\begin{align*} \int_{-1}^1 f(t)\, dt \amp= \frac{d-c}2,\\ \int_{-1}^1 (f(t))^2\, dt \amp= \frac{(d-c)^2}{3d}\text{.} \end{align*}
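The first of these values can be checked quickly using linearity of the integral together with Part a of this exercise:
\begin{align*} \int_{-1}^1 f(t)\, dt \amp = \int_{-1}^1 g_c(t)\, dt - \int_{-1}^1 g_d(t)\, dt\\ \amp = \Big(1 - \frac c2\Big) - \Big(1 - \frac d2\Big)\\ \amp = \frac{d-c}2\text{.} \end{align*}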
Comment 2.
For all \(t \in [-1,1]\text{,}\) \(0 \le f(t) \le 1\) implies \((f(t))^2 \le f(t)\text{.}\) Thus Theorem 3.7 gives the estimate
\begin{equation*} \int_{-1}^1 (f(t))^2\, dt \le \int_{-1}^1 f(t)\, dt\text{.} \end{equation*}
This provides an alternative way of estimating the integral on the left hand side.

3.4 \(L^2([a,b])\)

Exercises

3.4.2. (PS7) Calculating and estimating \(L^2\) norms.
3.4.2.a
Solution.
We calculate
\begin{align*} \n{g_c}_{L^2([-1,1])}^2 \amp = \int_{-1}^1 (g_c(t))^2\, dt\\ \amp = \int_{-1}^0 0\, dt + \int_0^c \left( \frac tc \right)^2\, dt + \int_c^1 1\, dt\\ \amp = \frac c3 + (1-c)\\ \amp = 1 - \frac {2c}3 \text{,} \end{align*}
so that taking square roots gives
\begin{equation*} \n{g_c}_{L^2([-1,1])} = \sqrt{1-\frac {2c}3}\text{.} \end{equation*}
3.4.2.b
Solution.
Let \(f = g_c - g_d\text{.}\) Then by Part c of Exercise 3.3.1 we have
\begin{equation*} \n f_{L^2([-1,1])}^2 = \int_{-1}^1 (f(t))^2\, dt \le d\text{.} \end{equation*}
The desired inequality follows by taking square roots.
3.4.3. (PS8) \(L^2\) convergence is not pointwise convergence.
3.4.3.a
Solution.
For \(t \in [0,1)\) we have \(t^n \to 0\) as \(n \to \infty\text{,}\) while for \(t=1\) we have \(t^n = 1\) for all \(n\in \N\text{.}\) Thus, for any \(t \in [0,1]\text{,}\) we have
\begin{equation*} f_n(t) \to f(t) = \begin{cases} 0 \amp \text{if } 0 \le t \lt 1 \\ 1 \amp \text{if } t = 1 \end{cases}\text{.} \end{equation*}
Comment.
One or two students wrote something like
\begin{equation*} f = \begin{cases} 0 \amp \text{if } 0 \le t \lt 1 \\ 1 \amp \text{if } t = 1 \end{cases}\text{.} \end{equation*}
If we’re being very picky about notation for functions, this isn’t quite correct. The left hand side is a function, while the right hand side is a formula involving some variable \(t\) which hasn’t been properly introduced. One way to avoid this ambiguity is to use the symbol \(\mapsto\text{,}\) as in
\begin{equation*} f \maps t \mapsto \begin{cases} 0 \amp \text{if } 0 \le t \lt 1 \\ 1 \amp \text{if } t = 1 \end{cases}\text{,} \end{equation*}
or even
\begin{equation*} f = \left( t \mapsto \begin{cases} 0 \amp \text{if } 0 \le t \lt 1 \\ 1 \amp \text{if } t = 1 \end{cases}\right)\text{.} \end{equation*}
Alternatively, we could give up on writing things down in a single line of symbols, and say that \(f \maps [0,1] \to \R\) is the function defined by
\begin{equation*} f(t) = \begin{cases} 0 \amp \text{if } 0 \le t \lt 1 \\ 1 \amp \text{if } t = 1 \end{cases}\text{.} \end{equation*}
As the problem statement here already tells us that we’re looking for a function \(f \maps [0,1] \to \R\text{,}\) the official solutions here skip that part and just give the formula.
3.4.3.b
Solution.
We simply calculate
\begin{align*} \n{f_n-0}_{L^2([0,1])}^2 \amp = \int_0^1 (t^n)^2\, dt\\ \amp = \frac 1{2n+1}\\ \amp \to 0 \end{align*}
as \(n \to \infty\text{.}\)
Comment.
Note that the pointwise convergence \(f_n \to f\) alone is not enough to justify commuting limits and integrals to get
\begin{gather*} \lim_{n\to\infty} \int_0^1 f_n(t)\, dt = \int_0^1 \lim_{n\to\infty} f_n(t)\, dt = \int_0^1 f(t)\, dt = 0\text{.} \end{gather*}
Indeed, it is possible to find an explicit sequence \(\seq gn\) in \(C^0([0,1])\) with \(g_n \to 0\) pointwise but \(\int_0^1 g_n(t)\, dt = 1\) for all \(n \in \N\text{.}\)
From Analysis 1 and 2 you know that uniform convergence is enough to interchange limits and integrals (over bounded intervals).
3.4.3.c
Solution.
Limits are only unique once we have fixed which metric space we are talking about. In Part b, we showed that \(\seq fn\) was convergent in the metric space \(L^2([0,1])\text{.}\) In Part a, on the other hand, we were talking about a very different notion of pointwise convergence.
Comment 1.
An interesting (but non-examinable) fact is that there is no metric \(d\) on the set \(C^0([a,b])\) for which convergence in \(\big(C^0([a,b]),d\big)\) is the same thing as pointwise convergence. In this sense pointwise convergence is really quite different than convergence in the spaces we studied in Chapter 3.
Comment 2.
Note that pointwise convergence is very much not the same thing as convergence in \(C^0([0,1])\text{.}\) Indeed, the latter is the same as uniform convergence, and a substantial portion of Analysis 2A is devoted to the differences between these two concepts.

4 Compact Sets
4.1 Sequential Compactness

Exercises

4.1.1. (PS8) Compactness and minimisation.
4.1.1.a
Solution.
If
\begin{equation*} \inf_{x \in X} f(x) \gt -\infty\text{,} \end{equation*}
then this follows from Exercise B.2. Indeed, the definition of the infimum implies that for every \(n \in \N\text{,}\) there exists a point \(x_n \in X\) such that
\begin{equation*} f(x_n) - \frac{1}{n} \le \inf_{x \in X} f(x) \le f(x_n). \end{equation*}
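Since \(1/n \to 0\text{,}\) the squeeze theorem applied to these inequalities gives
\begin{equation*} \lim_{n \to \infty} f(x_n) = \inf_{x \in X} f(x)\text{.} \end{equation*}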
Thus we obtain a sequence \((x_n)_{n \in \N}\) with the required property. If
\begin{equation*} \inf_{x \in X} f(x) = -\infty, \end{equation*}
then for any \(n \in \N\) there exists \(x_n \in X\) with \(f(x_n) \le -n\text{.}\) Then
\begin{equation*} \lim_{n \to \infty} f(x_n) = -\infty \end{equation*}
as well. (Note that \(\inf_{x \in X} f(x) = + \infty\) is not possible as \(X \ne \varnothing\text{.}\))
Comment.
Note that the question explicitly mentions the possibility that \(\inf_{x \in X} f(x) = - \infty\text{.}\) Many students did not treat this case; on an exam this would have cost marks.
4.1.1.b
Solution.
Let \((x_n)_{n \in \N}\) be a minimising sequence. If \((X,d)\) is sequentially compact, then there exists a convergent subsequence \((x_{n_k})_{k \in \N}\text{.}\) Let \(x_0\) be its limit. If \(f\) is continuous, then it follows (from Theorem 1.34 and Theorem 1.46) that
\begin{equation*} f(x_0) = \lim_{k \to \infty} f(x_{n_k}) = \lim_{n \to \infty} f(x_n) = \inf_{x \in X} f(x). \end{equation*}
Comment.
It’s easy to find counterexamples if we weaken the requirement that \(X\) is compact. For instance, the function \(x \mapsto x\) does not attain its minimum over either \(X = (0,1)\) or \(X = \R\text{.}\)

4.2 Total Boundedness

Exercises

4.2.1. Total boundedness for subsets.
Solution.
Suppose that the metric subspace \((Y,d')\) is totally bounded and let \(r \gt 0\text{.}\) Then we can find finitely many points \(y_1,\ldots,y_N \in Y\) such that
\begin{gather} Y \subseteq \bigcup_{n=1}^N B_r'(y_n) \subseteq \bigcup_{n=1}^N B_r(y_n)\text{,}\tag{✶} \end{gather}
where here \(B_r'(y_n)\) denotes the open ball in \((Y,d')\) and in the last step we have used Part a of Exercise 1.2.2. Thus \(Y\) is totally bounded as a subset of \(X\text{.}\)
Conversely, suppose that \(Y\) is totally bounded as a subset of \(X\text{.}\) Then for every \(r \gt 0\) we can find \(y_1,\ldots,y_N \in Y\) such that
\begin{gather} Y \subseteq \bigcup_{n=1}^N B_r(y_n),\tag{✶✶} \end{gather}
holds, where here \(B_r(y)\) denotes the open ball in \((X,d)\) with radius \(r\) and centre \(y\text{.}\) Intersecting both sides with \(Y\) and using Part a of Exercise 1.2.2 again we find
\begin{equation*} Y \subseteq \bigcup_{n=1}^N B_r(y_n) \cap Y = \bigcup_{n=1}^N B_r'(y_n)\text{.} \end{equation*}
Thus the metric subspace \((Y,d')\) is totally bounded.

4.3 Compactness

Exercises

4.3.1. (PS9) Direct proof of non-compactness.
Solution.
Consider the open cover \(\set{B_r(0)}{r \in (0,1)}\text{.}\) Clearly this is a collection of open sets, and it is a cover since any \(x \in B_1(0)\) also lies in \(B_r(0)\) for all \(r \in (\n x,1)\text{.}\) Let \(\{B_{r_1}(0),\ldots,B_{r_N}(0)\}\) be a finite subcollection, and set \(R = \max\{r_1,\ldots,r_N\} \lt 1\text{.}\) Then
\begin{equation*} \bigcup_{n=1}^N B_{r_n}(0) = B_R(0) \not \supseteq B_1(0), \end{equation*}
and so this cannot be a subcover.
Comment 1.
There are of course many different open covers that one can use for this problem. For instance, many students this year used something like \(\{B_{1-1/n}(0) : n \in \N \without \{1\}\}\text{,}\) which is a (countably infinite) subcover of the cover in the official solutions.
Comment 2.
On exams I am inclined to accept without proof statements about sets which are both correct and relatively straightforward. For instance, in the context of this problem (i.e. in \(\R^2\) with the Euclidean metric) I would accept \(\bigcup_{r \in (0,1)} B_r(0) = B_1(0)\) and \(B_{1/2}(0) \not \supseteq B_1(0)\) without proof. On the other hand, I would not accept a statement like ‘clearly there are no finite subcovers’ without some further justification.
4.3.2. Compactness and metric subspaces.
Solution.
Suppose that \(Y \subseteq X\) is compact, and let \(\set{U_\lambda}{\lambda \in \Lambda}\) be a collection of open subsets of \(X\) with \(Y \subseteq \bigcup_{\lambda \in \Lambda} U_\lambda\text{.}\) By Exercise 1.2.2, the sets \(V_\lambda = U_\lambda \cap Y\) are open in \((Y,d_Y)\text{,}\) and intersecting the above inclusion with \(Y\) yields \(Y = \bigcup_{\lambda \in \Lambda} V_\lambda\text{.}\) Since \((Y,d_Y)\) is compact, we can therefore find a finite subset \(M \subseteq \Lambda\) such that
\begin{equation*} Y = \bigcup_{\mu \in M} V_\mu \subseteq \bigcup_{\mu \in M} U_\mu \end{equation*}
as desired.
Conversely, suppose that for every collection \(\set{U_\lambda}{\lambda \in \Lambda}\) of open subsets of \(X\) with \(Y \subseteq \bigcup_{\lambda \in \Lambda} U_\lambda\text{,}\) there exists a finite subset \(M \subseteq \Lambda\) such that \(Y \subseteq \bigcup_{\mu \in M} U_\mu\text{.}\) To see that \((Y,d_Y)\) is compact, let \(\set{V_\lambda}{\lambda \in \Lambda}\) be a collection of open sets in \(Y\) with \(Y = \bigcup_{\lambda \in \Lambda} V_\lambda\text{.}\) By Exercise 1.2.2, we can find open sets \(U_\lambda\) in \(X\) such that \(V_\lambda = U_\lambda \cap Y\text{.}\) In particular, \(V_\lambda \subseteq U_\lambda\text{,}\) and so we have \(Y \subseteq \bigcup_{\lambda \in \Lambda} U_\lambda\text{.}\) By assumption, we can then find a finite subset \(M \subseteq \Lambda\) such that \(Y \subseteq \bigcup_{\mu \in M} U_\mu\text{.}\) Intersecting both sides with \(Y\) we conclude that \(Y = \bigcup_{\mu \in M} V_\mu\) as desired.
4.3.3. (PS9) Continuity on compact spaces.
Solution 1. Using covers and subcovers
Let \(\varepsilon \gt 0\text{.}\) Since \(f\) is continuous, for every point \(x \in X\) we have a number \(\delta(x) \gt 0\) such that, for all \(x' \in X\text{,}\) the inequality \(d_X(x, x') \lt \delta(x)\) implies \(d_Y(f(x), f(x')) \lt \frac{\varepsilon}{2}\text{.}\)
The collection of balls \(\set{B_{\delta(x)/2}(x)}{x \in X}\) is an open cover of \(X\text{.}\) If \(X\) is compact, then there exist \(x_1, \dotsc, x_N\) such that
\begin{equation*} X = \bigcup_{n=1}^N B_{\delta(x_n)/2}(x_n). \end{equation*}
Set \(\delta = \frac 12 \min\{\delta(x_1), \dotsc, \delta(x_N)\}\text{.}\)
Now suppose that we have two points \(a, b \in X\) with \(d_X(a,b) \lt \delta\text{.}\) Then there exists \(n \in \{1, \dotsc, N\}\) such that \(a \in B_{\delta(x_n)/2}(x_n)\text{,}\) i.e. \(d_X(a, x_n) \lt \delta(x_n)/2 \lt \delta(x_n)\text{.}\) Moreover \(d_X(b, x_n) \le d_X(a,b) + d_X(a, x_n) \lt \delta + \delta(x_n)/2 \le \delta(x_n)\text{,}\) since \(\delta \le \delta(x_n)/2\text{.}\) Hence by the choice of \(\delta(x_n)\text{,}\)
\begin{equation*} d_Y(f(x_n), f(a)) \lt \frac{\varepsilon}{2} \quad \text{and} \quad d_Y(f(x_n), f(b)) \lt \frac{\varepsilon}{2}. \end{equation*}
The triangle inequality then implies that
\begin{equation*} d_Y(f(a), f(b)) \le d_Y(f(x_n), f(a)) + d_Y(f(x_n), f(b)) \lt \varepsilon. \end{equation*}
Solution 2. Using sequential compactness
Suppose for the sake of contradiction that \(f\) is not uniformly continuous. Then there exists \(\varepsilon \gt 0\) such that, for all \(n \in \N\text{,}\) there exists \(x_n,y_n \in X\) with \(d_X(x_n,y_n) \lt 1/n\) but \(d_Y(f(x_n),f(y_n)) \ge \varepsilon\text{.}\)
Since \(X\) is compact, Theorem 4.22 implies it is sequentially compact, and so the sequence \(\seq xn\) has a subsequence \(\subseq xnk\) converging to a limit \(x_0 \in X\text{.}\) Unfortunately, there is no guarantee that the associated subsequence \(\subseq ynk\) of \(\seq yn\) converges. But we can apply sequential compactness to \(\subseq ynk\) to find a further subsequence \(\subsubseq ynk\ell\) which does converge, say to some \(y_0 \in X\text{.}\) We note (by Theorem 1.34) that \(\subsubseq xnk\ell\) still converges to \(x_0\text{.}\)
By the triangle inequality
\begin{gather*} d_X(x_0,y_0) \le d_X(x_0,x_{n_{k_\ell}}) + d_X(x_{n_{k_\ell}},y_{n_{k_\ell}}) + d_X(y_{n_{k_\ell}},y_0) \to 0 \end{gather*}
as \(\ell \to \infty\text{,}\) where the first and last terms on the right hand side converge to zero since \(x_{n_{k_\ell}} \to x_0\) and \(y_{n_{k_\ell}} \to y_0\text{,}\) while the middle term is bounded by \(1/n_{k_\ell}\text{,}\) which also converges to zero as \(\ell \to \infty\text{.}\) Thus \(d_X(x_0,y_0) = 0\) and hence \(x_0 = y_0\text{.}\)
By the sequential continuity of \(f\) and the sequential continuity of \(d_Y \maps Y \times Y \to \R\) from Exercise 1.3.1, we have
\begin{gather*} \lim_{\ell \to \infty} d_Y\big(f(x_{n_{k_\ell}}),f(y_{n_{k_\ell}})\big) = d_Y(f(x_0),f(y_0)) = 0 \end{gather*}
as \(x_0 = y_0\text{.}\) On the other hand by our choice of \(\seq xn\) and \(\seq yn\) we have \(d_Y\big(f(x_{n_{k_\ell}}),f(y_{n_{k_\ell}})\big) \ge \varepsilon\) for all \(\ell\text{,}\) and so the above limit must also be \(\ge \varepsilon\text{,}\) which is a contradiction.
Comment.
Note that in the official solution just having \(a\) and \(b\) lie in the same ball \(B_{\delta(x_n)/2}(x_n)\) is not enough to immediately estimate \(d_Y(f(a),f(b))\text{.}\) This is because \(\delta(x_n)\) is defined using the continuity of \(f\) at \(x_n\) and not \(a\) or \(b\text{.}\) However, we can get what we need with a bit of extra work using the triangle inequality.
4.3.4. (PS9) Images of compact sets.
4.3.4.a
Solution.
Suppose \((X,d_X)\) is sequentially compact. We want to show that \(f(X)\) is sequentially compact (as a subset of the metric space \((Y,d_Y)\)). So let \(\seq yn\) be a sequence in \(f(X)\text{.}\) Then for each \(n \in \N\) there exists \(x_n \in X\) with \(f(x_n) = y_n\text{,}\) giving rise to a sequence \(\seq xn\) in \(X\text{.}\) As \((X,d_X)\) is sequentially compact, we can find a subsequence \(\subseq xnk\) which converges to some limit \(x_0 \in X\text{.}\) Since \(f\) is continuous, Theorem 1.46 then implies that
\begin{equation*} y_{n_k} = f(x_{n_k}) \to f(x_0) \in f(X) \end{equation*}
as \(k \to \infty\text{.}\) Thus \(\subseq ynk\) is a subsequence of \(\seq yn\) which converges (as a sequence in \(Y\)) to a limit in \(f(X)\text{.}\) We conclude that \(f(X)\) is sequentially compact.
Comment.
Some students gave arguments involving \(\varepsilon\)s, \(\delta\)s and \(N\)s, essentially repeating part of the proof of Theorem 1.46. Citing Theorem 1.46 directly saves some ink and mental energy; one could also simply say “by the sequential continuity of \(f\)”.
4.3.4.b
Solution.
Suppose \(U \subseteq Y\) is open in \(Y\text{.}\) Let \(x_0 \in f^{-1}(U)\text{.}\) Then \(f(x_0) \in U\text{,}\) and, since \(U\) is open, there exists \(\varepsilon \gt 0\) such that \(B_\varepsilon(f(x_0)) \subseteq U\text{.}\) Since \(f\) is continuous at \(x_0\) we know that there exists \(\delta \gt 0\) such that for every \(x \in X\text{,}\) \(d_X(x,x_0)\lt\delta\) implies \(d_Y(f(x_0),f(x))\lt\varepsilon\text{.}\) Hence, for every \(x \in B_\delta(x_0)\) we have \(d_Y(f(x_0),f(x))\lt\varepsilon\text{,}\) that is \(f(x) \in B_\varepsilon(f(x_0)) \subseteq U\text{.}\) It follows that \(B_\delta(x_0) \subseteq f^{-1}(U)\text{.}\) Since \(x_0 \in f^{-1}(U)\) was arbitrary, we conclude \(f^{-1}(U)\) is open.
4.3.4.c
Solution.
Suppose \((X,d_X)\) is compact, and let \(\set{U_{\lambda}}{\lambda\in\Lambda}\) be an open cover of the metric subspace \(f(X)\) (thought of as a subset of \((Y,d_Y)\)). Then by Part b, the collection \(\{f^{-1}(U_{\lambda}):\lambda \in \Lambda\}\) is an open cover of \(X\text{.}\) As \(X\) is compact we find a finite subcollection \(\{f^{-1}(U_{1}), \ldots ,f^{-1}(U_{N})\}\) with \(X \subseteq \bigcup_{k = 1}^Nf^{-1}(U_{k})\text{.}\) We claim \(\{ U_{1} ,\ldots, U_{N} \}\) is a finite cover of \(f(X)\text{.}\) To see this, let \(y \in f(X)\text{.}\) Then there is \(x \in X\) such that \(f(x) = y\text{.}\) Since \(X \subseteq \bigcup_{k = 1}^Nf^{-1}(U_{k})\) we have \(x \in f^{-1}(U_{k})\) for some \(k \in \{1,\ldots,N\}\) and hence \(y = f(x) \in U_{k}\text{.}\) We conclude \(f(X) \subseteq \bigcup_{k = 1}^N U_{k} \text{,}\) and so \(\{ U_{1} ,\ldots, U_{N} \}\) is a finite subcollection of \(\{U_{\lambda}:\lambda \in \Lambda\}\) which covers \(f(X)\text{.}\) Hence \(f(X)\) is compact.
4.3.5. (PS8) A closed and bounded set which is not compact.
4.3.5.a
Solution.
Since \(\abs{\sin x} \le 1\) for all \(x \in \R\text{,}\) we clearly have \(\n{f}_{C^0([0,\pi])} \le 1\) for any \(f \in F\text{.}\) Thus \(\sup_{f \in F} \n f_{C^0([0,\pi])} \le 1\) is finite, and so \(F\) is bounded; see Definition 1.15 and Exercise 1.1.11. Alternatively, the triangle inequality gives
\begin{gather*} \diam F = \sup_{f,g \in F} \n{f-g}_{C^0([0,\pi])} \le 1 + 1 = 2 \lt \infty\text{.} \end{gather*}
Comment.
The set \(F\) is introduced as a subset of the metric space \(C^0([0,\pi])\text{.}\) Thus the term bounded should be interpreted as in Definition 1.15.
Other quick comments:
  • It is not enough for every function in \(f \in F\) to have \(\n f_{C^0([0,\pi])} \lt \infty\text{.}\)
  • On an exam it would also not be sufficient to say something like ‘this follows from some properties of \(\sin\)’.
  • Remember that \(\n f_{C^0([0,\pi])} = \sup_{t \in [0,\pi]} \abs{f(t)}\) involves absolute values. It is not enough to estimate \(\sup_{t \in [0,\pi]} f(t)\text{.}\)
  • The notation \(\sup F\) for subsets \(F\) of \(C^0([0,\pi])\) has not been defined, nor is there a standard definition I am aware of.
4.3.5.b
Solution.
Let \(n,m \in \N\) with \(n \lt m\text{,}\) and set \(t = 2^{-n-1}\pi \in [0,\pi]\text{.}\) Then
\begin{equation*} f_n(t) = \sin \tfrac \pi2 = 1 \end{equation*}
while
\begin{equation*} f_m(t) = \sin 2^{m-n-1}\pi = 0 \end{equation*}
since \(2^{m-n-1} \in \Z\text{.}\) Thus
\begin{equation*} \n{f_n-f_m}_{C^0([0,\pi])} \ge \abs{f_n(t)-f_m(t)} = 1 \end{equation*}
as desired.
Comment.
Note that it is not the case that \(\n{f_n-f_m}_{C^0([0,\pi])} = 1\) for \(n \ne m\text{.}\) Indeed, we have
\begin{equation*} \n{f_1-f_2}_{C^0([0,\pi])} \ge \abs{f_1(3\pi/8)-f_2(3\pi/8)} = 1 + \frac 1{\sqrt 2}\text{.} \end{equation*}
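Indeed, writing \(f_n(t) = \sin(2^n t)\) (the formula consistent with the calculations in Part b), we have
\begin{equation*} f_1(3\pi/8) = \sin \frac{3\pi}4 = \frac 1{\sqrt 2} \quad \text{and} \quad f_2(3\pi/8) = \sin \frac{3\pi}2 = -1\text{,} \end{equation*}
and the difference of these two values has absolute value \(1 + 1/\sqrt 2\text{.}\)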
4.3.5.c
Solution.
Suppose for the sake of contradiction that \((f_{n_k})_{k \in \N}\) is a convergent subsequence. Then in particular it is Cauchy, and so there exists \(K \in \N\) such that \(\n{f_{n_k}-f_{n_\ell}}_{C^0([0,\pi])} \lt 1\) for all \(k,\ell \ge K\text{.}\) But by (✶) this can only happen if \(n_k = n_\ell\text{,}\) i.e. if \(k=\ell\text{,}\) and so this is a contradiction.
Comment.
Note that the fact that \(\seq fn\) is not Cauchy is not enough on its own. For instance the alternating sequence \((0,1,0,1,\ldots)\) in \(\R\) is not Cauchy, but it has many convergent subsequences.
4.3.5.d
Solution 1. \(F\) as a subset
We will show that \(F\) is not a totally bounded subset of \(C^0([0,\pi])\text{.}\) Fix \(r \in (0,1]\text{.}\) Then (✶) shows that, in \(C^0([0,\pi])\text{,}\) the ball \(B_r(f)\) satisfies \(B_r(f) \cap F = \{f\}\) for all \(f \in F\text{.}\) Thus, since \(F\) is infinite, it cannot possibly be contained in a finite union of balls of this form (by the same argument as in Example 4.18).
Solution 2. \(F\) as a metric subspace
We will show that \(F\) is not totally bounded as a metric subspace of \(C^0([0,\pi])\text{.}\) Fix \(r \in (0,1]\text{.}\) Then (✶) shows that, in the metric subspace \(F\) of \(C^0([0,\pi])\text{,}\) \(B_r(f) = \{f\}\) for all \(f \in F\text{.}\) Thus, since \(F\) is infinite, it cannot possibly be a finite union of balls of this form (by the same argument as in Example 4.18).
4.3.5.e
Solution 1. \(F\) as a subset
Here we are considering \(F\) as a subset of the metric space \(C^0([0,\pi])\text{.}\) Clearly \(\set{B_1(f)}{f \in F}\) is an open cover of \(F\text{.}\) By the previous part, for any \(g_1,\ldots,g_N \in F\) the subcollection \(\{B_1(g_1),\ldots,B_1(g_N)\}\) must have
\begin{equation*} F \cap \Big(B_1(g_1)\cup \cdots \cup B_1(g_N)\Big) = \{g_1,\ldots,g_N\}. \end{equation*}
In particular, as \(F\) is infinite, this subcollection does not cover \(F\text{.}\)
Solution 2. \(F\) as a subspace
Here we are considering \(F\) as a metric subspace of \(C^0([0,\pi])\text{.}\) By our above argument (and Lemma 1.21), the set \(\{f\}\) is open in \(F\) for all \(f \in F\text{.}\) Thus \(\set{\{f\}}{f \in F}\) is an open cover of \(F\text{.}\) As \(F\) is infinite, this cover has no finite subcover.
4.3.5.f
Solution.
Suppose that \(\seq gn\) is a sequence in \(F\) which converges, as a sequence in \(C^0([0,\pi])\text{,}\) to some \(g \in C^0([0,\pi])\text{.}\) Then \(\seq gn\) is also Cauchy, and so there exists \(N \in \N\) such that \(\n{g_n-g_m}_{C^0([0,\pi])} \lt 1\) for all \(n,m \ge N\text{.}\) By (✶), this can only happen if \(g_n=g_m\text{,}\) i.e. if \(g_n=g_N\) for all \(n \ge N\text{.}\) In particular, this means that
\begin{equation*} g = \lim_{n \to \infty} g_n = \lim_{n \to \infty} g_N = g_N \in F\text{.} \end{equation*}
Applying Theorem 1.35, we conclude that \(F\) is closed.

4.4 Relative Compactness

Exercises

4.4.1. (PS9) Relative compactness and subsequences.
Solution.
First suppose that Part i holds, and let \(\seq sn\) be a sequence in \(S\text{.}\) Then \(\seq sn\) is also a sequence in \(\overline S\text{,}\) which is compact and hence sequentially compact. Thus there exists a subsequence \(\subseq snk\) which converges to some \(x \in \overline S \subseteq X\text{.}\)
Now suppose that Part ii holds. To show that \(S\) is relatively compact, it suffices to show that \(\overline S\) is sequentially compact. So let \(\seq xn\) be a sequence in \(\overline S\text{.}\) We must find a subsequence \(\subseq xnk\) which converges to some \(x_0 \in \overline S\text{.}\) Using the definition of the closure, for each \(n \in \N\) we can find \(s_n \in S\) with \(d(s_n,x_n) \lt 1/n\text{.}\) By Part ii, the sequence \(\seq sn\) has a subsequence \(\subseq snk\) which converges to some \(x_0 \in X\text{,}\) and by Theorem 1.35 we have \(x_0 \in \overline S\text{.}\) Finally, by the triangle inequality
\begin{equation*} d(x_{n_k},x_0) \le d(x_{n_k},s_{n_k}) + d(s_{n_k},x_0) \to 0 \end{equation*}
as \(k \to \infty\text{,}\) so that \(x_{n_k} \to x_0\) as desired.
Comment.
Note that in order to show that \(\overline S\) is sequentially compact, we must show that any sequence in \(\overline S\) has a subsequence converging to a limit lying in the same set \(\overline S\text{.}\)

4.5 The Arzelà–Ascoli Theorem

Exercises

4.5.1. (PS9) Finite sets are equicontinuous.
Solution.
Let \(x_0 \in X\) and \(\varepsilon \gt 0\text{.}\) For each \(n \in \{1,\ldots,N\}\text{,}\) \(f_n\) is continuous at \(x_0\text{,}\) and so there exists a \(\delta_n \gt 0\) such that, for all \(x \in X\text{,}\)
\begin{equation} d(x,x_0) \lt \delta_n \implies \abs{f_n(x)-f_n(x_0)} \lt \varepsilon\text{.}\tag{4.2} \end{equation}
Set \(\delta = \min\{\delta_1,\ldots,\delta_N\} \gt 0\text{.}\) Then for any \(f = f_n \in F\) and \(x \in X\) with \(d(x,x_0) \lt \delta \le \delta_n\text{,}\) we can apply (4.2) to get \(\abs{f_n(x)-f_n(x_0)} \lt \varepsilon \text{.}\)
Comment.
In the argument above, we wanted \(d(x,x_0) \lt \delta\) to allow us to apply (4.2). Thus we wanted \(d(x,x_0) \lt \delta\) to imply, for any \(n \in \{1,\ldots,N\}\text{,}\) that \(d(x,x_0) \lt \delta_n\text{.}\) Clearly this holds if \(\delta \le \delta_n\) for every \(n\text{,}\) which we can ensure by setting \(\delta=\min\{\delta_1,\ldots,\delta_N\}\text{.}\) If we used \(\max\{\delta_1,\ldots,\delta_N\}\) instead, as several students tried to, then this wouldn’t work. In retrospect, these students would probably have benefited from writing out this part of their argument in more detail, to see where the choice of \(\delta\) really matters.
4.5.2. (PS10) Bounded subsets of \(C^{0,\alpha}\) relatively compact in \(C^0\).
4.5.2.a
Solution.
Since \(C^0([a,b])\) and \(C^{0,\alpha}([a,b])\) are normed spaces, we use the definition of bounded from the last bullet point in Definition 1.15. Let \(f \in S\text{.}\) Then
\begin{align*} \n{f}_{C^{0,\alpha}([a,b])} \amp = \n{f}_{C^0([a,b])} + [f]_{C^{0,\alpha}([a,b])}\\ \amp \ge \n{f}_{C^0([a,b])}\text{.} \end{align*}
Taking a supremum, we conclude that
\begin{gather*} \sup_{f \in S} \n{f}_{C^0([a,b])} \le \sup_{f \in S} \n{f}_{C^{0,\alpha}([a,b])} \lt \infty \end{gather*}
where the second supremum is finite by the assumption that \(S\) is bounded in \(C^{0,\alpha}([a,b])\text{.}\)
4.5.2.b
Solution.
Since \(S\) is bounded in \(C^{0,\alpha}([a,b])\text{,}\) there exists \(M \gt 0\) such that \(\n f_{C^{0,\alpha}([a,b])} \le M\) for all \(f \in S\text{.}\)
For any \(x,x_0 \in [a,b]\) and \(f \in S\text{,}\) we can estimate
\begin{align} \abs{f(x_0) - f(x)} \amp \le [f]_{C^{0,\alpha}([a,b])} \abs{x - x_0}^\alpha \notag\\ \amp \le M \abs{x -x_0}^\alpha\text{.}\tag{✶} \end{align}
To see that this implies equicontinuity, fix \(x_0 \in [a,b]\) and \(\varepsilon \gt 0\text{,}\) and set
\begin{equation*} \delta = \Big(\frac \varepsilon M\Big)^{1/\alpha}\text{.} \end{equation*}
Then for any \(x \in [a,b]\) with \(\abs{x-x_0} \lt \delta\text{,}\) and for any \(f \in S\text{,}\) (✶) implies
\begin{equation*} \abs{f(x_0) - f(x)} \le M \abs{x -x_0}^\alpha \lt \varepsilon\text{.} \end{equation*}
Comment.
When showing that a set \(S\) is equicontinuous, your main goal is to convince your reader that your choice of \(\delta\) does not depend on which \(f \in S\) you chose. Otherwise you are simply showing that every function \(f \in S\) is continuous, which is a much weaker statement; see the discussion below Definition 4.28.
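The role of the uniform constant can also be seen concretely. In the following Python sketch (not part of the official solution) we take the hypothetical family \(f_c(x) = c x^\alpha\) on \([0,1]\) with \(\abs c \le M\text{,}\) which satisfies (✶), and check that the single \(\delta = (\varepsilon/M)^{1/\alpha}\) works for every member of the family at once.

# A sketch, not part of the official solution: one delta for a whole
# (hypothetical) family f_c(x) = c * x**alpha with |c| <= M.
alpha, M, eps = 0.5, 3.0, 0.01
delta = (eps / M) ** (1 / alpha)
x0 = 0.0                           # the worst point for x**alpha on [0, 1]
x = x0 + 0.999 * delta             # an arbitrary x with |x - x0| < delta
for c in [-3.0, -1.0, 0.5, 2.0, 3.0]:   # sample members f_c with |c| <= M
    assert abs(c * x**alpha - c * x0**alpha) < eps
print("delta =", delta, "works for every f_c at once")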
4.5.2.c
Solution.
Relative compactness now follows at once from the previous two parts and Arzelà–Ascoli. By Exercise 4.4.1, this means that any sequence \(\seq fn\) in \(S\) has a subsequence \(\subseq fnk\) which converges in \(C^0([a,b])\text{.}\) That is, there exists \(f_0 \in C^0([a,b])\) such that \(\n{f_{n_k}-f_0}_{C^0([a,b])} \to 0\) as \(k \to \infty\text{.}\)
4.5.3. (PS10) A set which is not equicontinuous.
4.5.3.a
Solution.
Following the hint we take \(x_0=0\) and \(\varepsilon = 1/2\text{.}\) For any \(\delta \gt 0\text{,}\) we can choose \(n \in \N\) large enough that
\begin{equation*} x = \frac{\pi}{2^{n+1}} \in (0,\delta)\text{;} \end{equation*}
clearly we can do this, since the left hand side is positive and tends to zero as \(n \to \infty\text{.}\)
For this \(x\) we then have
\begin{equation*} \abs{f_n(x)-f_n(0)} = \abs{1-0} = 1 \gt \tfrac 12 = \varepsilon \text{.} \end{equation*}
Thus \(F\) is not equicontinuous at \(x_0\text{.}\)
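To get a concrete feel for this failure of equicontinuity, the following Python sketch (not part of the official solution, and again assuming \(f_n(t) = \sin(2^n t)\)) produces, for a given \(\delta\text{,}\) the witnesses \(n\) and \(x\) from the argument above.

import math

# A sketch, not part of the official solution: for each delta, find n and
# x in (0, delta) with |f_n(x) - f_n(0)| = 1, where f_n(t) = sin(2^n t).
def witness(delta):
    n = 1
    while math.pi / 2**(n + 1) >= delta:
        n += 1
    x = math.pi / 2**(n + 1)
    return n, x, abs(math.sin(2**n * x) - math.sin(0.0))

for delta in [1.0, 0.1, 0.001]:
    print(delta, witness(delta))   # the jump is always exactly 1

However small \(\delta\) is, some member of \(F\) jumps by \(1\) within \((0,\delta)\text{.}\)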
4.5.3.b
Solution.
We know from Exercise 4.3.5 that \(F\) is closed and bounded, but not compact. This then implies that \(F\) is not relatively compact either, as \(\overline F = F\text{.}\) Applying Theorem 4.29, the only possibility is that \(F\) is not equicontinuous.
Comment.
Since we know so much about this set \(F\) from Exercise 4.3.5, and we have so many results in Chapter 4, there are many different ways to argue here. For instance, one can use Theorem 4.27.
4.5.4. (PS10) Compactness and indefinite integrals.
Solution.
By Corollary 4.31, it suffices to show that \(\Phi(F)\) is bounded in \(C^{0,1}([a,b])\text{.}\) As \(F\) is bounded in \(C^0([a,b])\text{,}\) there exists a constant \(M \gt 0\) such that \(\n f_{C^0} \le M\) for all \(f \in F\text{.}\) Now let \(g \in \Phi(F)\) be arbitrary, and pick \(f \in F\) such that \(\Phi(f)=g\text{.}\) Then for any \(x \in [a,b]\) we can estimate
\begin{align*} \abs{g(x)} \amp = \left| \int_a^x f(t)\, dt \right|\\ \amp \le \int_a^x \abs{f(t)}\, dt \\ \amp \le \int_a^x M\, dt \\ \amp = (x-a) M\\ \amp \le (b-a) M\text{.} \end{align*}
Thus \(\n g_{C^0([a,b])} \le (b-a)M\text{.}\) To estimate \([g]_{C^{0,1}([a,b])}\text{,}\) pick \(x,y \in [a,b]\) with \(x \ne y\text{,}\) and assume without loss of generality that \(x \lt y\text{.}\) Then
\begin{align*} \abs{g(y)-g(x)} \amp = \left| \int_a^y f(t)\, dt - \int_a^x f(t)\, dt \right|\\ \amp = \left| \int_x^y f(t)\, dt \right|\\ \amp \le \int_x^y \abs{f(t)}\, dt \\ \amp \le \int_x^y M\, dt \\ \amp = \abs{x-y} M\text{.} \end{align*}
Dividing by \(\abs{x-y}\) and taking a supremum, we obtain \([g]_{C^{0,1}([a,b])} \le M\text{.}\) Putting everything together, we have that
\begin{equation*} \n g_{C^{0,1}([a,b])} = \n g_{C^0([a,b])} + [g]_{C^{0,1}([a,b])} \le (b-a+1)M \end{equation*}
and hence, since \(g \in \Phi(F)\) was arbitrary, that \(\Phi(F)\) is bounded (with diameter \(\le 2(b-a+1)M\)).
Comment 1.
Note that in Corollary 4.31 we need \(S\) to be bounded as a subset of \(C^{0,\alpha}([a,b])\text{,}\) i.e. in the sense of Definition 1.15. This means
  1. \(S \subset C^{0,\alpha}([a,b])\text{;}\) and
  2. \(\sup_{f \in S} \n {f}_{C^{0,\alpha}([a,b])} \lt \infty\text{.}\)
Comment 2.
Note carefully the difference between the statement “for any \(f \in S\text{,}\) \(\n f_{C^{0,1}} \lt \infty\)” and the statement “\(\sup_{f \in S} \n f_{C^{0,1}} \lt \infty\)”. Getting these two statements confused on an exam (or writing in a way where I cannot tell which you mean) is extremely dangerous. For exactly the same issue in a simpler context, see the last part of Exercise B.2 on Problem Sheet 1.
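Returning to the solution itself, the Lipschitz bound \([g]_{C^{0,1}([a,b])} \le M\) is easy to observe numerically. Here is a Python sketch (not part of the official solution) with the sample choice \(f(t) = \cos(7t)\) on \([0,1]\text{,}\) for which we may take \(M = 1\text{.}\)

import math

# A sketch, not part of the official solution: approximate g = Phi(f) by
# cumulative trapezoidal sums and measure its largest difference quotient.
a, b, N = 0.0, 1.0, 10_000
f = lambda t: math.cos(7 * t)      # sample f with ||f||_{C^0} <= M = 1
ts = [a + (b - a) * k / N for k in range(N + 1)]
g = [0.0]
for k in range(N):
    g.append(g[-1] + 0.5 * (f(ts[k]) + f(ts[k + 1])) * (b - a) / N)
lip = max(
    abs(g[j] - g[i]) / (ts[j] - ts[i])
    for i in range(0, N + 1, 100)
    for j in range(i + 100, N + 1, 100)
)
print("observed Lipschitz constant:", lip)   # at most M = 1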

5 Uniform Approximation
5.1 The Weierstrass Approximation Theorem

Exercises

5.1.1. (PS10) \(C^0([a,b])\) is separable.
5.1.1.a
Solution 1. Argument using sequences
Following the hint, let \(p \in P\) with
\begin{equation*} p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_M x^M \end{equation*}
for some \(M \in \N\) and coefficients \(\alpha_0,\ldots,\alpha_M \in \R\text{.}\) As \(\Q\) is dense in \(\R\text{,}\) we can pick a sequence \(\seq qn\) in \(Q\) of polynomials of the form
\begin{gather*} q_n(x) = \beta_0^{(n)} + \beta_1^{(n)} x + \cdots + \beta_M^{(n)} x^M \end{gather*}
whose rational coefficients converge to the coefficients of \(p\text{,}\) i.e. \(\beta_m^{(n)} \to \alpha_m\) as \(n \to \infty\) for all \(m \in \{0,\ldots,M\}\text{.}\) Since the functions \(x \mapsto x^m\) all lie in \(C^0([a,b])\text{,}\) the convergence \(q_n \to p\) in \(C^0([a,b])\) now follows by repeatedly applying the second part of Exercise 1.3.1 (e.g. using induction on \(M\)).
Alternatively, we just emulate the official solution to Exercise 1.3.1 and use the triangle inequality to estimate
\begin{align*} \n{q_n-p}_{C^0([a,b])} \amp\le \abs{\beta_0^{(n)}-\alpha_0} \n{x^0}_{C^0([a,b])} + \cdots + \abs{\beta_M^{(n)}-\alpha_M} \n{x^M}_{C^0([a,b])}\\ \amp\to 0 \end{align*}
as \(n \to \infty\text{,}\) where here we have abused notation and written \(x^0,\ldots,x^M\) for the functions \(x \mapsto x^0, \ldots, x \mapsto x^M\text{.}\)
Solution 2. Direct argument
Let \(p \in P\) with
\begin{equation*} p(x) = \alpha_n x^n + \cdots + \alpha_1 x + \alpha_0\text{.} \end{equation*}
Let \(\varepsilon \gt 0\text{.}\) Define \(c = \max\{\abs a, \abs b\}\) and choose \(\beta_0, \ldots, \beta_n \in \Q\) with
\begin{equation*} |\alpha_k - \beta_k| \lt \frac{\varepsilon}{(n + 1) c^k}. \end{equation*}
Set
\begin{equation*} q(x) = \beta_n x^n + \cdots + \beta_1 x + \beta_0. \end{equation*}
Then
\begin{equation*} \begin{split} \sup_{x \in [a,b]} |p(x) - q(x)| \amp \le |\alpha_n - \beta_n| c^n + \cdots + |\alpha_1 - \beta_1| c + |\alpha_0 - \beta_0| \\ \amp \lt \frac{\varepsilon}{n + 1} + \cdots + \frac{\varepsilon}{n + 1} = \varepsilon. \end{split} \end{equation*}
That is, we have shown that \(\|p - q\|_{C^0([a,b])} \lt \varepsilon\text{,}\) which proves that \(Q\) is dense in \(P\text{.}\)
Comment.
On the one hand the solution above using sequences is easier than the direct argument: we don’t have to mess around quite as much with inequalities or estimate the \(C^0([a,b])\) norm of \(x \mapsto x^m\text{.}\) On the other hand, dealing with sequences does make things more complicated in other ways, for instance in terms of the notation.
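To see the direct argument in action, here is a Python sketch (not part of the official solution; the coefficients below are sample values) in which Fraction.limit_denominator supplies the rational approximations \(\beta_k\text{.}\)

from fractions import Fraction

# A sketch, not part of the official solution: approximate the coefficients
# of p by rationals and verify the coefficient-wise sup-norm bound.
a, b = -1.0, 2.0
alphas = [0.3333333, -2.718281828, 1.414213562]   # sample coefficients of p
eps = 1e-3
c = max(abs(a), abs(b))
betas = [Fraction(al).limit_denominator(10**12) for al in alphas]
bound = sum(abs(al - float(be)) * c**k
            for k, (al, be) in enumerate(zip(alphas, betas)))
print(bound < eps)   # True: ||p - q||_{C^0([a,b])} <= bound < eps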
5.1.1.b
Solution.
Let \(f \in C^0([a,b])\) and \(\varepsilon \gt 0\text{.}\) As \(P\) is dense in \(C^0([a,b])\) by the Weierstrass approximation theorem, there exists \(p \in P\) such that \(\n{p-f}_{C^0([a,b])} \lt \varepsilon/2\text{.}\) By the previous part, \(Q\) is dense in \(P\text{,}\) and so there similarly exists \(q \in Q\) such that \(\n{p-q}_{C^0([a,b])} \lt \varepsilon/2\text{.}\) Using the triangle inequality we conclude that \(\n{f-q}_{C^0([a,b])} \lt \varepsilon\) as desired.
5.1.1.c
Solution.
We have already shown that \(Q\) is dense in \(C^0([a,b])\text{,}\) and so it suffices to show that \(Q\) is countable.
Let \(n \in \N\) and consider the set \(Q_n\) of polynomials in \(Q\) of degree \(n\) or less. Note that we have the bijection
\begin{equation*} \Q^{n + 1} \to Q_n, \qquad (\beta_0, \ldots, \beta_n) \mapsto \beta_n x^n + \cdots + \beta_1 x + \beta_0. \end{equation*}
\(\Q^{n+1}\) is countable as it is a finite product of countable sets (Lemma 1.70), and so \(Q_n\) is also countable. Thus \(Q = \bigcup_{n = 0}^\infty Q_n\) is a countable union of countable sets, and hence countable as well (Lemma 1.72).
Comment.
Note that infinite products of countably infinite sets are not countable. Often this is an intermediate step in the proof that \(\R\) is not countable.

5.2 The Stone–Weierstrass Theorem

Exercises

5.2.1. (PS11) Some non-examples.
5.2.1.a
Solution.
Consider \(f,g \in P([0,1])\) defined by \(f(x)=x\) and \(g(x)=1-x\text{.}\) Then
\begin{equation*} (f \wedge g)(x) = \begin{cases} x \amp x \in [0,1/2], \\ 1-x \amp x \in [1/2,1], \end{cases} \end{equation*}
is not a polynomial.
5.2.1.b
Solution 1. Direct argument
Following the hint, we consider the function \(f \in C^0([0,1])\) defined by \(f(x)=x\text{.}\) For any constant function \(c \in S\text{,}\) we have
\begin{equation*} \n{f-c}_{C^0([0,1])} = \sup_{t \in [0,1]} \abs{t-c} = \max\{\abs c, \abs{1-c}\} \ge \frac 12\text{.} \end{equation*}
Solution 2. Argument by contradiction using sequences
Suppose for the sake of contradiction that \(S\) is dense, and let \(f \in C^0([0,1])\) be defined by \(f(x)=x\text{.}\) Then (by Lemma 2.3) there exists a sequence \(\seq fn\) in \(S\) converging to \(f\text{.}\) In particular this implies the pointwise limits \(f_n(0) \to f(0) = 0\) and \(f_n(1) \to f(1) = 1\text{.}\) But as the functions \(f_n\) are constant, \(f_n(0)=f_n(1)\) for all \(n \in \N\text{,}\) and so these two limits must be the same, which is the desired contradiction.
This argument is more complicated than the previous one in the sense that it is an argument by contradiction and also involves thinking about sequences. On the other hand, it is simpler than the previous argument in that we don’t actually have to estimate any norms!
Comment.
Using the same kinds of arguments as in the second solution, one can check that \(S\) is a closed subset of \(C^0([0,1])\text{.}\) Thus \(\overline S = S \ne C^0([0,1])\text{,}\) which also implies (by Lemma 2.3) that \(S\) is not dense.
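The estimate in the first solution can also be checked numerically; a quick Python sketch (not part of the official solution):

# A sketch, not part of the official solution: the distance from f(x) = x
# to the constant c is max(|c|, |1 - c|), minimised at c = 1/2.
dist = lambda c: max(abs(c), abs(1 - c))
cs = [k / 1000 for k in range(-1000, 2001)]
best = min(cs, key=dist)
print(best, dist(best))   # both approximately 0.5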
5.2.1.c
Solution.
First we show that \(S\) is a linear subspace. Let \(f,g \in S\) and \(\alpha \in \R\text{.}\) Then \(f+\alpha g \in P([0,1])\) since \(P([0,1])\) is a vector space, and moreover
\begin{equation*} (f+\alpha g )(0) = f(0) + \alpha g(0) = 0 \end{equation*}
so that \(f+\alpha g \in S\text{.}\) The argument that \(S\) is an algebra is similar. For any \(f,g \in S\) as above we \(fg \in P([0,1])\) since \(P([0,1])\) is an algebra, and moreover \((fg)(0) = f(0)g(0) = 0\) so that \(fg \in S\text{.}\)
Next we show that \(S\) separates points. Clearly the function \(f(t)=t\) lies in \(S\text{.}\) Moreover, for any \(x,y \in [0,1]\) with \(x \ne y\) we have \(f(x) = x \ne y = f(y)\text{.}\)
To see that \(S\) cannot be dense, consider the constant function \(f \in C^0([0,1])\) defined by \(f(x)=1\text{.}\) Then for all \(g \in S\) we have
\begin{equation*} \n{f-g}_{C^0([0,1])} \ge \abs{f(0)-g(0)} = 1 \text{.} \end{equation*}
This does not contradict Theorem 5.14 because \(S\) does not contain the constant function \(x \mapsto 1\text{.}\)
Comment.
As in the previous part, it is not hard to show that \(S\) is a closed subset of \(C^0([0,1])\text{.}\) Thus \(\overline S = S \ne C^0([0,1])\) and so \(S\) is not dense.
5.2.5. (PS11) Separating points.
5.2.5.a
Solution.
Suppose that \(S\) does not separate points. Then there exist distinct points \(x,y \in X\) such that \(f(x)=f(y)\) for all \(f \in S\text{.}\) We claim that \(\overline S\) has the same property. To see this, let \(f_0 \in \overline S\text{.}\) Then there exists a sequence \(\seq fn\) in \(S\) with \(f_n \to f_0\text{.}\) In particular, as convergence in \(C_\bdd(X)\) implies pointwise convergence, we have
\begin{equation*} f_0(x) = \lim_{n\to\infty}f_n(x) = \lim_{n\to\infty}f_n(y) = f_0(y)\text{.} \end{equation*}
Thus \(\overline S\) also does not separate points.
5.2.5.b
Solution.
Suppose \(S \subset C^0([a,b])\) does not separate points. By the previous part we know that \(\overline S\) also does not separate points. Since \(C^0([a,b])\) does separate points, we conclude (by Lemma 2.3) that \(S\) is not dense.
The fact that \(C^0([a,b])\) separates points follows, for instance, from looking at the identity mapping \(f \in C^0([a,b])\) defined by \(f(t)=t\text{.}\) For any two distinct points \(x,y \in [a,b]\) we clearly have \(f(x)\ne f(y)\text{.}\)
Comment.
While Theorem 5.14 gives a list of sufficient conditions for an algebra to be dense, there is no claim that these conditions are also necessary. So the theorem is sadly not helpful for showing that sets are not dense.
On the other hand, this exercise shows that the hypothesis in Theorem 5.14 that \(S\) separate points is necessary for \(S\) to be dense in \(C_\bdd(X)\text{,}\) at least provided that \(C_\bdd(X)\) itself separates points.
5.2.6. A convergence criterion for series.
Solution.
Let \(m \in \N\text{.}\) Then
\begin{equation*} a_{N + m} = \frac{a_{N + m}}{a_{N + m - 1}} \cdots \frac{a_{N + 1}}{a_N} a_N \le \frac{b_{N + m}}{b_{N + m - 1}} \cdots \frac{b_{N + 1}}{b_N} a_N = \frac{b_{N + m}}{b_N} a_N. \end{equation*}
Hence if \(\sum_{n=1}^\infty b_n \lt \infty\text{,}\) then
\begin{equation*} \sum_{n = N + 1}^\infty a_n = \sum_{m = 1}^\infty a_{N + m} \le \frac{a_N}{b_N}\sum_{m = 1}^\infty b_{N + m} = \frac{a_N}{b_N} \sum_{n = N + 1}^\infty b_n \lt \infty. \end{equation*}
It follows that \(\sum_{n=1}^\infty a_n \lt \infty\) as well.
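A quick numerical illustration (a Python sketch, not part of the official solution; the sequences below are sample choices satisfying the ratio hypothesis):

# A sketch, not part of the official solution: with a_{n+1}/a_n <= b_{n+1}/b_n,
# the tail of sum a_n is dominated by (a_N / b_N) times the tail of sum b_n.
N = 5
b = lambda n: 1.0 / n**2           # convergent comparison series
a = lambda n: b(n) * 0.9**n        # then a_{n+1}/a_n <= b_{n+1}/b_n for all n
tail_a = sum(a(n) for n in range(N + 1, 100_000))
tail_b = sum(b(n) for n in range(N + 1, 100_000))
print(tail_a <= a(N) / b(N) * tail_b)   # True, as the estimate predicts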
5.2.7. Another convergence criterion for series.
5.2.7.a
Solution.
Note that
\begin{equation*} \frac{b_{n + 1}}{b_n} = \left(\frac{n}{n + 1}\right)^\beta = \left(1 - \frac{1}{n + 1}\right)^\beta. \end{equation*}
Consider the function \(g \maps (0, \infty) \to \R\text{,}\) \(x \mapsto x^\beta\text{.}\) We compute \(g'(x) = \beta x^{\beta - 1}\) and \(g''(x) = \beta (\beta - 1) x^{\beta - 2} \gt 0\) for \(x \in (0, \infty)\text{.}\) Hence this is a convex function, and it follows from Taylor’s theorem that
\begin{equation*} g(x) \ge g(1) + (x - 1) g'(1) = 1 + \beta (x - 1) \end{equation*}
for all \(x \in (0, \infty)\text{.}\) Thus
\begin{equation*} \frac{b_{n + 1}}{b_n} = g\left(1 - \frac{1}{n + 1}\right) \ge 1 - \frac{\beta}{n + 1} \ge 1 - \frac{\alpha}{n + 1} \ge \frac{a_{n + 1}}{a_n} \end{equation*}
for all \(n \ge N\text{.}\)
5.2.7.b
Solution.
By Exercise 5.2.6, it suffices to verify that \(\sum_{n=1}^\infty b_n\) converges. This follows from the integral criterion, as
\begin{equation*} \int_1^\infty \frac{dx}{x^\beta} = \frac{1}{\beta - 1} \lim_{x \to \infty} (1 - x^{1 - \beta}) = \frac{1}{\beta - 1} \lt \infty. \end{equation*}
5.2.8. Proof of Proposition 5.15.
5.2.8.a
Solution.
For \(n \in \N\text{,}\) let \(a_n = \bigl|\binom{\frac 12}{n}\bigr|\text{.}\) Then
\begin{equation*} \frac{a_{n + 1}}{a_n} = \left|\frac{(\frac 12 - n)}{n + 1}\right| = \frac{n - \frac 12}{n + 1} = 1 - \frac{\frac{3}{2}}{n + 1}. \end{equation*}
According to the result of Exercise 5.2.7, it follows that the series \(\sum_{n=1}^\infty \bigl|\binom{\frac 12}{n}\bigr|\) converges. Since for any \(N \in \N\) and all \(t \in [-1, 1]\) we have the inequality
\begin{equation*} \left|\sum_{n = N}^\infty \binom{\frac 12}{n} t^n\right| \le \sum_{n = N}^\infty \left|\binom{\frac 12}{n}\right|, \end{equation*}
this implies uniform convergence of the power series in \([-1, 1]\text{.}\)
5.2.8.b
Solution.
Every partial sum represents a continuous function on \([-1, 1]\text{.}\) Therefore, the uniform limit theorem implies that \(\phi\) is continuous on \([-1, 1]\text{.}\) Moreover, the convergence in \([-1, 1]\) implies that the radius of convergence of the power series is at least \(1\text{.}\) Hence \(\phi\) is differentiable in \((-1, 1)\) with
\begin{equation*} \phi'(t) = \sum_{n=1}^\infty \binom{\frac 12}{n} n t^{n - 1}. \end{equation*}
It follows that
\begin{equation*} \begin{split} 2(1 + t) \phi'(t) \amp = 2\sum_{n=1}^\infty \binom{\frac 12}{n} n t^{n - 1} + 2\sum_{n=1}^\infty \binom{\frac 12}{n} n t^n \\ \amp = 1 + 2\sum_{n=1}^\infty \left(\binom{\frac 12}{n + 1}(n + 1) + \binom{\frac 12}{n}n\right) t^n \\ \amp = 1 + \sum_{n=1}^\infty \binom{\frac 12}{n} t^n = \phi(t) \end{split} \end{equation*}
for \(t \in (-1, 1)\text{.}\) Therefore,
\begin{equation*} \frac{d}{dt} \left(\frac{\phi(t)}{\sqrt{1 + t}}\right) = 0 \end{equation*}
in \((-1, 1)\text{.}\) It follows that \(\frac{\phi(t)}{\sqrt{1 + t}} = \phi(0) = 1\) and hence \(\phi(t) = \sqrt{1 + t}\) for all \(t \in (-1, 1)\text{.}\) By continuity, the same holds for \(t \in [-1, 1]\text{.}\)

6 The Baire Category Theorem
6.1 The Theorem

Exercises

6.1.1. (PS11) Simple examples.
6.1.1.a
Solution.
We easily check that \(\{x\}\) is closed, and hence its own closure. Moreover, \(\{x\}^\circ = \varnothing\text{,}\) since \(B_r(x) \not\subseteq \{x\}\) for all \(r \gt 0\text{.}\)
6.1.1.b
Solution.
This set is not closed, but we can calculate its closure
\begin{equation*} \overline{\set{1/n}{n\in\N}} = \{0\} \cup \set{1/n}{n\in\N}\text{,} \end{equation*}
and the interior of this closure is empty,
\begin{equation*} (\{0\} \cup \set{1/n}{n\in\N})^\circ = \varnothing\text{.} \end{equation*}
6.1.1.c
Solution.
We calculate \(\overline{(0,1)\cap \Q} = [0,1]\text{,}\) and \([0,1]^\circ = (0,1) \ne \varnothing\text{.}\)
Comment.
If you’re wondering if you’ve calculated a closure or interior correctly, one basic check is that closures must be closed and interiors must be open; see Theorem 1.23 and Theorem 1.22.
6.1.4. (PS11) Baire for linear subspaces.
6.1.4.a
Solution.
Let \(x, y \in \overline{L}\) and \(\alpha \in \R\text{.}\) Then there exist two sequences \((x_n)_{n \in \N}\) and \((y_n)_{n \in \N}\) in \(L\) such that \(x = \lim_{n \to \infty} x_n\) and \(y = \lim_{n \to \infty} y_n\text{.}\) Since \(L\) is a linear subspace, we know that \(x_n + \alpha y_n \in L\) for all \(n \in \N\text{.}\) Moreover, thanks to Exercise 1.3.1 we have \(x_n + \alpha y_n \to x + \alpha y\text{,}\) hence \(x + \alpha y \in \overline L\) as desired.
Rather than appealing to Exercise 1.3.1, we could alternatively have shown \(x_n + \alpha y_n \to x + \alpha y\) directly from the estimate
\begin{equation*} \n{(x + \alpha y) - (x_n + \alpha y_n)} \le \n{x - x_n} + \abs \alpha \n{y - y_n} \to 0 \end{equation*}
as \(n \to \infty\text{.}\)
6.1.4.b
Solution.
Let \(x \in L^\circ\text{,}\) which we can do because \(L^\circ\) is nonempty. By the definition of \(L^\circ\text{,}\) there exists \(r \gt 0\) such that \(B_r(x) \subseteq L\text{.}\) We claim that \(B_r(0) \subseteq L\text{.}\) To see this, let \(y \in B_r(0)\text{.}\) Then the vector \(z = x + y\) belongs to \(B_r(x)\) as \(\n{z-x}=\n{(x+y)-x}=\n y \lt r\text{.}\) Hence \(z \in L\text{,}\) and since \(x \in L\) as well and \(L\) is a linear subspace, this implies that \(y = z - x \in L\text{.}\) The claim is proved.
Now let \(x \in X \setminus \{0\}\text{.}\) Then \(w = rx/(2\n x)\) belongs to \(B_r(0)\text{.}\) Again using the fact that \(L\) is a linear subspace, this implies \(x = 2\n xr^{-1} w \in L\text{.}\) Thus \(L = X\) as desired.
6.1.4.c
Solution.
Because of Part a, we may apply Part b to \(\overline{L}\) instead of \(L\text{.}\) If \(L\) is not dense in \(X\text{,}\) then \(\overline{L} \ne X\text{,}\) and hence \(\overline{L}^\circ = \varnothing\text{,}\) which means that \(L\) is nowhere dense.
6.1.4.d
Solution.
By the Baire category theorem in the form of Corollary 6.6, at least one of the spaces \(L_n\) is not nowhere dense in \(X\text{.}\) But then Part c implies that \(L_n\) is dense in \(X\text{.}\)