Monday, September 1, 2025

Would you Like Some Fusilli with your Inverse Trig?

I have a confession to make. I never really liked inverse trigonometric functions. I've had to help a number of students with them over the years, but I can more or less relate when they express some distaste for them: it's easy to make mistakes with them, their domains are restricted in what sometimes seems like arbitrary and hard-to-remember ways, sometimes there is more than one valid answer, and conventions differ. Finally, even when you get your hands on a fancy calculator or computer software that supposedly can take care of things for you, it can also give answers that differ from what's expected in class, or otherwise require some interpretation to get right (often finessing with the quadrants, etc.). Actually, dealing with quadrants is cool; that's something akin to the notion of coordinate charts in probably this blog's favorite topic, manifolds. But it all still seems haphazard, and it all adds up to the perception that math is a rigged game in which people enforce rules for seemingly arbitrary reasons just to make you feel bad about yourself.

So I'm going to take you on an adventure involving some inverse trig, and we will get to the bottom of it and understand what it is that makes them so damn hard to deal with. And we'll have a good serving of pasta to go with it. Special edition fusilli. We will also revisit a favorite blog topic: the Riemann surface.

First off, I was very fortunate that my trig class made it a point to (gently) introduce complex numbers and eventually give the big revelation that the trig functions are all actually combinations of complex exponentials. I'm definitely a fan of not making people memorize a lot of trig identities when, conceptually, just one suffices. If there's any lesson one should take from complex numbers besides "that square root of -1 sure is pretty damn useful," it's the concept of numbers carrying a generalized sign: neither positive nor negative, but a whole directional space of possibilities. That's personally when the light bulb went off for me.

Despite all of this, and, furthermore, enjoying complex analysis as an undergrad...

I still didn't like inverse trig functions. The problem is that there's some rushed discussion on "choosing a principal branch" or "making branch cuts" (I've already given Roger Penrose's opinion on that): a principal branch is chosen for log, the text shuffles through how to define it for some other functions, and then it's on to the next topic. So the concept never really had the time to "gel". On to grad school, there were many topics that had their origin in figuring out what to do with complex functions, but the overspecialization of topics, plus concerns about choosing a research topic ASAP, made it so that I never really got around to an in-depth study of the real (HAR!) solution to all of this: Riemann Surfaces. Now of course, we've mentioned them on the blog before, but even that was really more of a "beware, functions might not do what you think they will do", rather than actually getting to the bottom of what is really happening.

The Algebra of What's Going On

Graphs of inverse sine (blue) and inverse cosine (red).

So let's start off trying to understand the current state of frustration. The "definition" of an inverse trig function is easy enough: it's just whatever angle produces a given ratio of sides. Of course, the problem here is: more than one angle works. The function that produces this is not one-to-one, so therefore there can't be an inverse function. Usually, that's the end of the discussion. When an application comes up to solve for angles, they're usually in some restricted domain in which you can tease out the answer that you really want, by some reasoning about the problem. This is a good habit, but maybe it gets lost in translation. But instead there are conventions, like inverse cosine always giving an angle between $0$ and $\pi$, and inverse sine giving an angle between $-\pi/2$ and $\pi/2$ (see figure above). Don't get me started on the other ones, because I don't even bother (though, you may find a love for the programmer's atan2 after this post). Once you consider the fact that the functions are $2\pi$-periodic, it's now all good, right? You can just add a bunch of multiples of $2\pi$ and it's all great! Right? Right??? Oh right. There's like complementary/supplementary angles. Like $\pi$ minus the angle. And stuff. And other stuff. Oof, a headache already, right?
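For instance, here's a quick illustration in Python of how those conventions play out numerically, and why atan2 is worth loving: it takes both coordinates, so it can recover the quadrant that a lone arctangent throws away (the specific angle below is just my sample).

```python
import math

theta = 3 * math.pi / 4            # an angle in the second quadrant
x, y = math.cos(theta), math.sin(theta)

print(math.acos(x))                # 2.356... = 3*pi/4: acos always lands in [0, pi]
print(math.asin(y))                # 0.785... = pi/4:   asin always lands in [-pi/2, pi/2]
print(math.atan(y / x))            # -0.785... = -pi/4: atan of the ratio loses the quadrant
print(math.atan2(y, x))            # 2.356... = 3*pi/4: atan2 sees the signs of x and y
```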

To help us understand what is going on, let's first figure out what these inverse functions are in terms of logs. It might be a bit of fun, because I don't think solving for inverse trig in terms of logs is on anyone's radar even after they learn Euler's identity. First consider the cosine:

$$y = \cos(x) = \frac{e^{\mathsf{i}x} + e^{-\mathsf{i}x}}{2}.$$

Now if we multiply through by $2 e^{\mathsf{i}x}$, we get

$$2 y e^{\mathsf{i} x} = e^{2\mathsf{i}x} + 1.$$

Maybe it's not obvious what to do with this, but if we rewrite it like this:

$$e^{2\mathsf{i} x} - 2 y e^{\mathsf{i} x} + 1 = 0$$

Take $u = e^{\mathsf{i}x}$. This gives $u^{2} - 2yu + 1 = 0$. Hopefully this is more familiar. Plugging into the trusty quadratic formula,

$$e^{\mathsf{i} x} = u = \frac{2y \pm \sqrt{4y^{2} - 4}}{2} = y \pm \sqrt{y^{2} - 1}.$$

Then with the logs,

$$x = \frac{1}{\mathsf{i}} \ln\left( y \pm \sqrt{y^{2} - 1}\right) = -\mathsf{i}\ln\left( y \pm \sqrt{y^{2} - 1}\right).$$

First off, whoa. We have inverse cosine in terms of logs and algebraic operations. Maybe it's old hat to the math majors, but we should remember complex numbers are often just introduced as "Oh you can't take the square root of -1? Can't stop me! 🤪🤪". It's a big unifying concept.
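If you want to convince yourself without wading through the algebra again, here's a quick numerical sanity check (a sketch in Python's cmath; the test value is just something I picked): both choices of the square root give an $x$ with $\cos(x) = y$.

```python
import cmath

y = 0.3 + 0.2j                        # an arbitrary test value
s = cmath.sqrt(y * y - 1)             # one of the two square roots of y^2 - 1

for root in (s, -s):                  # try both choices of the +/- sign
    x = -1j * cmath.log(y + root)     # x = -i ln(y ± sqrt(y^2 - 1))
    print(x, cmath.cos(x))            # cos(x) reproduces y in both cases
```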

... but I hope you also see some, um, issues with this. First of all, that nasty $\pm$. Which one is it??? If you want purely real numbers... well... some bad news there. If $y \geq 1$, then since $\sqrt{y^2-1}$ is of smaller magnitude than $y$, both the plus and the minus give you something legal to take the log of. But then the $-\mathsf{i}$ stops you afterward. Ok, maybe we want to make the log be purely imaginary, so that the $-\mathsf{i}$ will cancel it. So we are still made to venture out into the nuances of how logs and complex numbers work. Another fact that is not obvious to start (and actually, this will be the thing that makes our pasta more interesting): it turns out that for any complex number $y$, the two numbers $y + \sqrt{y^{2} - 1}$ and $y - \sqrt{y^{2} - 1}$ are reciprocal (or: since there are two square roots for every complex number, this says the two possible values gotten by the square root are reciprocal). This is easy to verify: $(y + \sqrt{y^{2} - 1})(y - \sqrt{y^{2} - 1}) = y^2 - (y^2 -1) = 1$.

Reciprocal numbers pass through the logarithm to become a minus sign, so we can (rather surprisingly) rewrite it as

$$x = \pm \mathsf{i} \ln\left(y + \sqrt{y^{2}-1}\right).$$

(Surprising, because usually it is NOT legal to take out plus/minus signs through a log like that). Now if in our classic situation we have $-1 \leq y \leq 1$, then $y^2 - 1$ is going to be negative, and thus have a complex square root: $y + \sqrt{y^2-1} = y +\mathsf{i}\sqrt{1-y^2}$. We should note that if you take the complex modulus squared of that, you get $y^2 + (1-y^2) = 1$, so the modulus is $1$. What happens when you take a log of something of complex modulus 1? The Pythagorean Trig identity and Euler's identity, the only two you need, show that you get something purely imaginary. Which, when combined with the outside factor of $\pm\mathsf{i}$, gets you two real solutions of opposite sign. If you then think about it some, it helps to recall that cosine is an even function, i.e. it gives the same result when switching the sign of its argument. So it makes sense you can have oppositely signed results for the inverse.

Finally, now the $2\pi$-periodicity of trig functions can be brought in, since Euler's identity is valid for angles that keep wrapping 'round and 'round: you get a bunch of results separated by multiples of $2\pi$ and also the result of the opposite sign separated by multiples of $2\pi$.

That's all well and good. It was a bunch of algebra and symbol wrangling. The whole point of this blog is, what the hell does this actually look like? This requires a bit more finessing, but it shows up on the teaser title image: note that it's a surface with ramps moving up and down both in a counterclockwise and a clockwise direction. This is in contrast to the usual depictions of fusilli pasta, which only has one ramp spiraling up.

The Full Complex Definition

To understand how to get a full-blown surface from all of this, we will have to stop confining ourselves to real numbers, or simple images of the real number line (i.e. $1$-dimensional subsets), such as the unit circle. In complex analysis, we define for all complex $z$,
$$\cos(z) = \frac{e^{\mathsf{i} z} + e^{-\mathsf{i} z}}{2}.$$
where you can use the complex exponential. We don't quite have the space to define that here, but one quick way is to use power series. What's important is to realize it still satisfies the same laws of exponents we know and love: $e^{z + w} = e^{z} e^{w}$, etc. Then when writing down the formula for the inverse of $w = \cos(z)$, we get
$$z = \mp\mathsf{i} \ln\left(w+ \sqrt{w^2 - 1}\right) = \mp\mathsf{i} \ln\left(w+ \mathsf{i}\sqrt{1-w^2}\right).$$
Now the question of which square root and which logarithm becomes an issue. It's easy to say, I'm going to ask you to ignore all that we drilled into your head about functions, the vertical line test, the horizontal line test, etc. Or at least, temporarily suspend it... We'll answer questions about how to precisely, unambiguously choose a different branch of a square root in programming in a bit, but for now: the most common complex square root that is implemented is, effectively, taking the polar form of the complex number, with its angle $-\pi < \theta \leq \pi$, halving the angle, and taking the square root of the modulus (yes, the less than or equal on the right, unless you're in Apple's grapher, for which it is $-\pi \leq \theta < \pi$, which made me have to jump through additional hoops: so please beware of this if you're going to go off on some explorations on your own. I told you that conventions differ!). The picture of the square root is to take the whole complex plane, and map it to the right half-plane, including the upper segment of the boundary, but not the lower segment. Introductory complex analysis texts pay no heed to this extra boundary happening, because the theory gives preference to open sets.
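To make that concrete, here's a small sketch (Python's cmath; the helper and its name are my own, just to illustrate the convention): the built-in square root puts $-4$ on the upper boundary of the right half-plane, and shifting the angle interval by hand gives the other convention.

```python
import cmath, math

print(cmath.sqrt(-4 + 0j))        # 2j: the angle pi is included, so -4 goes to the upper boundary

# A hand-rolled square root taking the polar angle in [-pi, pi) instead of (-pi, pi],
# so the negative real axis lands on the lower boundary instead (illustrative only).
def sqrt_lower(z):
    r, theta = cmath.polar(z)                 # theta is in (-pi, pi]
    if theta == math.pi:                      # push the cut to the other side
        theta = -math.pi
    return cmath.rect(math.sqrt(r), theta / 2)

print(sqrt_lower(-4 + 0j))        # approximately -2j: the other square root of -4
```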

The strategy for choosing the multiple values is: take a principal branch, and then adjust based on two parameters: one, the sign of the square root, and two, which $2\pi$-period it comes from. We'll call the latter its ladder position. These parameters actually have a group structure which definitely surprised me the first time I saw it. We'll get to it. For now, we take the principal branch of inverse cosine to be:
$$\cos^{-1}(z) = -\mathsf{i} \ln\left(z + \mathsf{i} \sqrt{1-z^2}\right),$$
where we use the standard logarithm and square root that do the funny stuff on the negative axis. We derive this next (this choice is what reduces to that $0$ to $\pi$ range originally given, when we restrict to real values between $-1$ and $1$). The actual place where one needs to worry about a discontinuity in the complex plane (which is what a branch cut is), in this case, is $(-\infty, -1] \cup [1, \infty)$.
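As a quick check that this really is the usual principal inverse cosine, the formula can be compared against a library implementation (a sketch with Python's cmath; the test points are mine, chosen away from the cuts, where signed zeros would otherwise muddy things):

```python
import cmath

def acos_formula(z):
    # principal branch: -i ln(z + i sqrt(1 - z^2)), with the standard log and sqrt
    return -1j * cmath.log(z + 1j * cmath.sqrt(1 - z * z))

for z in (0.3, -0.9 + 0.4j, 1.5 + 0.5j, -2 - 1j):
    print(cmath.acos(z), acos_formula(z))   # these agree away from the branch cuts
```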

The Graph

The classic visualization is $y = f(x)$ meaning the set of points $(x, f(x))$ in the plane, where $x$ lies in some portion of $\mathbb R$. This we've talked about many times before. It is the graph parametrization.

For complex-valued functions, ideally we would be able to visualize $(z, f(z))$ as something in 4 dimensions, two for the domain and two for the range. And visualizing 4-dimensional things is a favorite thing for mathematicians to try to do in various ways. A common way to do it for Riemann surfaces is by simply taking real and imaginary parts: $(z, \operatorname{Re}(f(z)))$ and $(z, \operatorname{Im}(f(z)))$ as separate 3D graphs. What's nice is that this tells you one method of visualizing the inverse function: $(f(z), z)$, or for a pair of 3D functions, in terms of parametrizations as $(\operatorname{Re}(f(z)), \operatorname{Im}(f(z)), \operatorname{Re}(z))$ and $(\operatorname{Re}(f(z)), \operatorname{Im}(f(z)), \operatorname{Im}(z))$. All of this is with $z = x+\mathsf{i}y$. The former is our title image with $-3 \leq x \leq 3$ and $-3 \leq y \leq 3$.
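If you want to reproduce something like the title image yourself, a sketch of the idea in numpy/matplotlib (resolution, strides, and color map here are just my choices) is to grid up $z$, apply cosine, and use the swapped coordinates as the surface:

```python
import numpy as np
import matplotlib.pyplot as plt

# Grid for z = x + iy with -3 <= x, y <= 3, as in the title image
x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
w = np.cos(X + 1j * Y)              # the forward function; swapping its graph gives the inverse

ax = plt.figure().add_subplot(projection="3d")
# (Re f(z), Im f(z), Re z): the real-part surface of the inverse cosine
ax.plot_surface(w.real, w.imag, X, rstride=2, cstride=2, cmap="viridis")
# For the imaginary-part picture, use Y (that is, Im z) as the height instead.
plt.show()
```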

This is a similar graph of its imaginary part:


which rather surprisingly does not spiral around, namely, it has no more than two values per input value $z$. This comes from the fact that the real part of the complex logarithm comes from the complex number's radius. Moving things in the forward direction, though, namely the approach $(z, f(z))$ with the two coordinates of $z$ forming the $xy$ part of the parametrization, is a bit trickier, because we need to confront head on the multivaluedness of the inverse. But it does have an advantage that we can more finely control the domain coordinates. For that, we now talk about...

Learning to Live with the Branch Cuts We've Got

And just how exactly do we deal with branch cuts? How do we deal with them systematically? One way was described here. Basically, the key is to consider the "problematic" part of the standard functions (both along the negative real axis) and work backward, computing what the problematic part looks like when mapped this way and that. The classical way of dealing with it is to consider continuous values of the functions along curves. But standard functions provided by computers don't take continuous curves as a parameter. However, for certain parametrizations of our surfaces, such as cylindrical coordinates, coordinate curves like the polar coordinate $\theta$ will cross the cuts at very predictable values of the coordinate. Then the key to evaluating the function is to strategically switch the sign of the square root, and the ladder position, based on the coordinate, in order to maintain the continuity. For parametrizing in terms of ladder steps going around multiple times, we will parametrize as follows: taking $z = x+\mathsf{i}y = u e^{2\pi \mathsf{i} t}$, we take

$$\begin{pmatrix}u \cos(2\pi t) \\ u \sin(2 \pi t) \\ \operatorname{Re}\left(-\mathsf{i} \ln \left(z + \mathsf{i} \sqrt{1-z^2}\right)\right) \end{pmatrix}$$

Now to deal with the multivaluedness of the function, we note that the crossings happen when $u \geq 1$ and $t$ is at an integer or half-integer. Each half-integer induces both a sign switch in the square root and a bump along the ladder of the log, while at each integer, only the sign of the square root needs to switch in order to maintain continuity.

This changes the formula to something that's definitely less pretty... BUT it'll be much more concrete, in terms of being able to use common, readily available complex logs and square roots on your system without further custom hacks:
$$-\mathsf{i} \ln \left(z + \mathsf{i} \sqrt{1-z^2}\right)$$
is realized as
$$(-1)^{k+1}\mathsf{i} \ln \left(z + (-1)^{\lceil 2t\rceil }\mathsf{i} \sqrt{1-z^2}\right) + 2\pi\lceil t - \tfrac{1}{2}\rceil.$$
where $k$ is $0$ or $1$ to choose which ramp to go on, the ceiling term $2\pi\lceil t - \tfrac{1}{2}\rceil$ bumps you up to the next rung of the log's ladder each time $t$ passes a half-integer (which is what keeps the log continuous after a full trip around), and the inner sign switch of the square root keeps track of both crossings.
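Transcribing that directly into code looks something like the following (a sketch in Python; the function name and sample point are mine, and it only uses the stock cmath log and square root, per the promise above):

```python
import cmath, math

def branch_adjusted_arccos(u, t, k):
    """z = u*exp(2*pi*i*t); the inner sign and the ladder term are read off from t,
    and k = 0 or 1 picks which of the two ramps to ride."""
    z = u * cmath.exp(2j * math.pi * t)
    inner_sign = (-1) ** math.ceil(2 * t)
    ladder = 2 * math.pi * math.ceil(t - 0.5)
    return (-1) ** (k + 1) * 1j * cmath.log(z + inner_sign * 1j * cmath.sqrt(1 - z * z)) + ladder

# One point of the surface: (u cos 2πt, u sin 2πt, Re(value))
u, t, k = 1.5, 0.3, 0
val = branch_adjusted_arccos(u, t, k)
print(u * math.cos(2 * math.pi * t), u * math.sin(2 * math.pi * t), val.real)
```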

There's actually so much more to say here about getting down into the weeds and nitty gritty to really practice wrangling with the branch cuts, but this can be an entire post in itself! I know it can be unsatisfying! As the YouTubers would say, let me know in the comments. A hidden gem in this is the intertwined nature of how the sign interacts with the log's ladder step. I originally thought the two were independent. They are not, and it turns out that it's the action of an infinite dihedral group. This would be material for yet another post! To me, this is the ultimate explanation of the double ramp, the true way I learned to love the inverse trig functions ... but hopefully the awesomeness hits sooner than having to do abstract algebra!

The result of all of this:


Notice how the coordinate curves are more uniform.

Sneak Peek

A fusilli with ramps going in two opposite directions is all well and good. But what about three directions in a 3-fold symmetry? Here's a sneak peek of that. And that's another adventure, for another post.


There are lots of directions we can go from here, and it'll be a big subject for the blog because it intertwines so many topics! This 3-fold symmetry version will have us getting into the nitty gritty of defining branches, and a review of the cubic formula (also talked about here previously). We could talk about the algebraic topology aspect, where taking various paths around the branch points forms a group. And some unexpectedly cool results surrounding that.

Sunday, March 5, 2023

Interstellar Lily Pads

 


I'll someday explain what this is, but for now just enjoy! (Well, this is ultimately a visualization blog. The math explanations are bonus.)

Thursday, December 10, 2020

Some Loxodromes



Loxodromes are curves of constant bearing on a sphere (always making a constant angle with circles of latitude). If you pick a direction that's NOT one of the 4 cardinal directions and proceed forever, you'll get something very interesting. It's not really obvious that you actually will spiral forever around a pole. These have the simplest expression in the isothermal coordinates we talked about in the last post: they can be represented simply as straight lines in the $uv$ plane (well, if you allow $v$ to range over all of $\mathbb R$ and wrap around; if you insist on $v$ only being in an interval of length $2\pi$, then after the line goes off the top edge, you make it jump back to the bottom edge, so you get something that looks like this: $/////$). So the full parametrization would be: $(\operatorname{sech}u \cos(au+b), \operatorname{sech}u \sin(au+b), \tanh u)$. It is easy to intuitively see why the angle between these curves and the $\theta$ curves is constant: the $uv$-plane is first mapped (conformally) to the whole $z$-plane by $(e^u\cos v, e^u\sin v)$, and then the (conformal) stereographic projection wraps it all up. The same cannot be said of the $\varphi\theta$ plane (to confirm, of course, you need to mess with the metric coefficients in that last post).
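A quick sketch of one of these in numpy/matplotlib (the bearing $a$, phase $b$, and range of $u$ are just my picks):

```python
import numpy as np
import matplotlib.pyplot as plt

a, b = 3.0, 0.0                          # slope (bearing) and phase of the line in the uv-plane
u = np.linspace(-8, 8, 4000)             # let u run far out so the curve piles up at the poles

x = np.cos(a * u + b) / np.cosh(u)       # sech(u) cos(au + b)
y = np.sin(a * u + b) / np.cosh(u)       # sech(u) sin(au + b)
z = np.tanh(u)

ax = plt.figure().add_subplot(projection="3d")
ax.plot(x, y, z)
plt.show()
```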

Saturday, November 28, 2020

The Uniform-Distribution-on-the-Sphere's Nemesis: Isothermal Coordinates

Inspired by the last post where the conformal version of the polar grid is $(x,y) = (e^u \cos v, e^u \sin v)$ i.e. the radius is now exponentiated (namely what looks like a grid of squares in the domain gets mapped to what looks like squares in the range: the concentric circles bunch up much more densely close to the origin, to correspond to radial lines converging at the origin), I set out to do it on the sphere as well (that conformal polar grid is simply the image of the complex plane under the complex exponential $e^z$). This is in some sense the polar (hah) opposite of something that would produce a uniform distribution of points on a sphere, because this map would assign each square an equal chance (so of course, the closer you get to the poles, the squares are smaller and more of them encompass the same space in the image, so they'll be much more densely packed).

The quickest way to see this is to note the stereographic projection is a conformal map, and has the formula
\[(x,y) \mapsto \left(\frac{2x}{x^2+y^2+1}, \frac{2y}{x^2+y^2+1}, \frac{x^2+y^2-1}{x^2+y^2+1}\right)\]
and with a substitution of the complex exponential, you get \[(u,v) \mapsto \left(\frac{2e^u \cos v}{e^{2u}+1}, \frac{2e^u \sin v}{e^{2u}+1}, \frac{e^{2u}-1}{e^{2u}+1}\right) = \left(\frac{2}{e^u+e^{-u}}\cos v, \frac{2}{e^u+e^{-u}}\sin v, \frac{e^u-e^{-u}}{e^u+e^{-u}}\right)\]
I rewrote that in that form for fans of hyperbolic functions, which gives the map
\[ (u,v) \mapsto(\operatorname{sech} u \cos v, \operatorname{sech} u \sin v, \tanh u)\] (hyperbolic secant seems to be one of the least used ones; $\LaTeX$ doesn't recognize it!). But the upshot is, this formula looks very similar to the usual geographic coordinate system (just swapping out some transcendental functions).
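For the curious, a sketch of how a grid like the ones in the figures might be generated (numpy/matplotlib; the spacing and ranges are my choices): sample $u$ and $v$ uniformly, push them through $(\operatorname{sech} u \cos v, \operatorname{sech} u \sin v, \tanh u)$, and draw the wireframe.

```python
import numpy as np
import matplotlib.pyplot as plt

# Uniform grid in the (u, v) plane: equal spacing gives the conformal "squares"
u = np.linspace(-2.5, 2.5, 41)
v = np.linspace(0, 2 * np.pi, 81)
U, V = np.meshgrid(u, v)

X = np.cos(V) / np.cosh(U)          # sech(u) cos(v)
Y = np.sin(V) / np.cosh(U)          # sech(u) sin(v)
Z = np.tanh(U)

ax = plt.figure().add_subplot(projection="3d")
ax.plot_wireframe(X, Y, Z)          # the grid curves bunch up toward the poles
ax.set_box_aspect((1, 1, 1))
plt.show()
```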

Another version with finer grid lines. Notice how they cluster at the top.

Another way of looking at this is via manipulation of the metric coefficients directly, which raises some interesting points on the nature of the Laplace equation. Consider the standard geographic parametrization $(x,y,z) = (\sin\varphi\cos\theta, \sin\varphi\sin\theta, \cos\varphi)$, with the metric induced from Euclidean space, $g = ds^2 = d\varphi^2 + \sin^2 \varphi\; d\theta^2$. To say that a parametrization is conformal (grid squares to grid squares) is to say that there are some parameters $u$ and $v$ and a function $\sigma$ such that $g = \sigma^2(du^2 + dv^2)$ (forcing the matrix to not only be diagonal, but a multiple of the identity). If we set things up so that $du = -\csc\varphi\, d\varphi$ and $dv = d\theta$, then $\sin^2\varphi (du^2 + dv^2) = d\varphi^2 + \sin^2\varphi\; d\theta^2$. Consulting trusty integral tables, Wolfram Alpha'ing it, or whatever, we find that this differential equation gives $u = \ln(\csc \varphi + \cot \varphi) + C$. One can make this derivation seem a little less "out of the blue" via a very interesting derivation of Laplace's equation using certain complex analysis hackery (George Springer's Introduction to Riemann Surfaces, section 1-3). We'll derive this another day, but the summary is, we end up deriving that $u$ has to satisfy the spherical Laplace equation \[\frac{1}{\sin\varphi} \frac{\partial}{\partial \varphi}\left(\sin \varphi \frac{\partial u}{\partial \varphi}\right) + \frac{1}{\sin^2 \varphi} \frac{\partial^2 u}{\partial \theta^2} = 0,\] or really, it only has to satisfy this on an open subset of the sphere, in this case, a sphere with two points deleted. This fact often gets glossed over by people talking about geometry; we know that the only harmonic functions actually defined everywhere on the sphere are constant functions, so we can only find "interesting" harmonic functions on a sphere if they blow up at certain points. Using the very reasonable assumption that such a function $u$ should be independent of $\theta$, we derive $\frac{d}{d \varphi} \left(\sin\varphi \frac{d u}{d \varphi}\right) = 0$, making $\sin\varphi \frac{d u}{d \varphi} = C_1$ a constant, and $\frac{du}{d\varphi} = C_1 \csc \varphi$, to finally get $u = C_1 \ln(\csc \varphi + \cot \varphi) + C_2$, as before (just with two degrees of freedom, namely $C_1$ controls the initial spacing of the grid lines, and $C_2$ controls the value at the equator). This function blows up at the poles, a fact which manifests itself visually in the grid curves $u = $ constant becoming more closely spaced together at the poles (and infinitely dense there, meaning this is only a valid coordinate system on the sphere outside the poles). We just set $C_1 = 1$ and $C_2 = 0$, which assigns the equator to $u=0$.

Finally, we have to do some algebra, though, to get to where we want: trying to find the inverse of $u = \ln(\csc \varphi + \cot \varphi)$. This says $e^u = \csc\varphi+\cot\varphi = \frac{1+\cos\varphi}{\sin\varphi}$. This is just $\frac{1+z}{r}$ where $r$ is the cylindrical coordinate $\sqrt{x^2+y^2}$. But if we're on the sphere, $z^2 + r^2 =1$, so this gives us \[e^{2u} = \frac{(1+z)^2}{1-z^2} = \frac{1+z}{1-z}.\] Rearranging we get $e^{2u}-e^{2u}z = 1+z$, or $z(1+e^{2u}) = e^{2u}-1$. This gives \[z=\frac{e^{2u}-1}{e^{2u}+1} = \frac{e^{u}-e^{-u}}{e^{u}+e^{-u}} = \tanh u.\] To get the $x$ and $y$, we note that $\sin\varphi = r = e^{-u}(1+z) = e^{-u}(1+\tanh u)$. Finally,
\[ 1+ \tanh u = \frac{\cosh u + \sinh u}{\cosh u} = \frac{e^u}{\cosh u}, \] which, multiplying by the $e^{-u}$ gives $\frac{1}{\cosh u}$ or $\operatorname{sech} u$.
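A one-line numeric check of the inversion (plain Python; the sample colatitude is mine):

```python
import math

phi = math.pi / 3                                     # a sample colatitude
u = math.log(1 / math.sin(phi) + 1 / math.tan(phi))   # u = ln(csc(phi) + cot(phi))

print(math.tanh(u), math.cos(phi))                    # both 0.5:   tanh(u) = cos(phi) = z
print(1 / math.cosh(u), math.sin(phi))                # both 0.866: sech(u) = sin(phi) = r
```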

The coordinates we derived are called isothermal coordinates, and knowing what a big fan I am of stat mech, I'd like to fully understand the origin of the term someday. Finally, one might ask if there are coordinate systems (conformal or otherwise) that will cover all of the sphere without weird coordinate singularities like that. The answer is no, and the short explanation is that a smooth vector field on the sphere must vanish somewhere (the hairy ball theorem), while coordinates always set up vector fields (take the vector in the direction of a coordinate curve) that aren't supposed to collapse at good points. This is a long topic for another day (we'll show some pretty cool coordinate systems on the sphere, though!).

Saturday, October 31, 2020

Happy Halloween from Nested Tori!


Spooky greetings. A quick post with more complex analysis for you. Maybe not as visually interesting as fractals, but getting back to some basics and/or nuance regarding complex functions. For one thing, I just never had a chance to really fit in a good study of Riemann surfaces (interest in them was partially stirred up by all that recent cubic hackery). Also, a lot of stuff involving branch cuts and branch points is quickly gone over in standard complex analysis and never really explored in depth, other than to say "we take the principal branch of the logarithm/root/power function to be..." and you don't get into the nuance of the differences between different branches of functions. This example here explores the mapping $z \mapsto \sqrt{z^2 + 1}$. This is already interesting from the Riemann surface point of view, because neither the function nor its inverse is definable in its most interesting form by mapping (subsets of) the complex plane alone. Formally, to work on $\mathbb{C}$, we have to define things by taking a continuous choice of square root on $\mathbb{C}$ minus some ray from $0$ to $\infty$. The choice here, between a square root and its opposite (which is also a square root of the same complex number!), is not natural in the complex plane (unlike in the real numbers, where you can always take the positive square root of a positive number, which is in some sense a defining characteristic of the real numbers, cementing why it is the unique complete ordered field), and being forced to choose forces a discontinuity somewhere. Riemann surface theory is, in some sense, about never having to make the choice. To quote Sir Roger Penrose (congrats on that Nobel Prize!):

In particular, the domain of the logarithm function would be ‘cut’ in some arbitrary way, by a line out from the origin to infinity. To my way of thinking, this was a brutal mutilation of a sublime mathematical structure. (The Road to Reality, p. 136)

A couple of references (e.g. Penrose's student Tristan Needham, in his otherwise fantastic Visual Complex Analysis) suggest the most common branch cut for this function is two horizontal lines to the left of $\mathsf{i}$ and $-\mathsf{i}$. I wrangled with this a bit, but I believe a subtlety is left out. To study $\sqrt{z^2 + 1}$ as truly the composition of a square root function, and the mapping $z^2 + 1$, your branch cuts will have to be the corresponding inverse image of the branch cut of the square root, under $z^2 + 1$. Using the negative real axis, this is the ray above $\mathsf{i}$ and the ray below $-\mathsf{i}$. Two rays to the left of $\pm \mathsf{i}$ cannot be realized as such an inverse image, for any ray to infinity (the image under $z^2 + 1$ is a whole parabola to infinity). Of course, the essential definition for complex analysis is that you define the function in this set by taking paths from an origin point and analytically continuing, and entrusting your result to algebraic topology. But this is not very effective, computationally. This is basically saying that for some selections of branch cuts, the function composition $\sqrt{z^2 + 1}$ is a lie: you cannot get it by assembling square root, squaring, and plus one. What turns out to work, in terms of assembling functions rather than continuing along curves, is not $\sqrt{z^2 + 1}$, but $\sqrt{z+\mathsf{i}} \sqrt{z-\mathsf{i}}$, that is, if you take the branch of square root, along the negative axis, subtract and add $\mathsf{i}$, take the square root of each, and then multiply, you do get something that reproduces what you theoretically get by paths. One might wonder what is the difference between $\sqrt{z^2 + 1} = \sqrt{(z+\mathsf{i})(z-\mathsf{i})}$ and $\sqrt{z+\mathsf{i}}\sqrt{z- \mathsf{i}}$; these are not the same because $\sqrt{ab} \neq \sqrt{a}\sqrt{b}$ in the complex plane unless you regard it as a multivalued set equality.
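Here's a tiny numerical illustration of that last point (Python's cmath; the sample points are mine): the two expressions agree at some points but differ by a sign at others, so the composition really does depend on how you assemble it.

```python
import cmath

def f_composed(z):                # sqrt(z^2 + 1) built literally as a composition
    return cmath.sqrt(z * z + 1)

def f_factored(z):                # sqrt(z + i) * sqrt(z - i), each with the standard branch
    return cmath.sqrt(z + 1j) * cmath.sqrt(z - 1j)

for z in (1 + 1j, -3 + 2j):
    print(z, f_composed(z), f_factored(z))
# At 1+1j the two agree; at -3+2j they differ by a sign, since sqrt(ab) != sqrt(a)*sqrt(b)
# once a branch has been fixed.
```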

The best rendering of this function, I've discovered, is taking the branch cut from $-\mathsf{i}$ to $\mathsf{i}$ along the imaginary axis: something that doesn't even go out to infinity. To his credit, Needham does mention that connecting the two branch points in a finite part of the plane is possible, and it just has to be done in a manner that prevents you from completing a turn around just one of the two branch points $\pm \mathsf{i}$ (going around both at once is fine). But there's no explicit computation that develops a visceral understanding of this fact (and so I made this post).

The title picture is the image of a large disk (parametrized by the usual polar coordinates) under the mapping $\sqrt{z-\mathsf{i}} \sqrt{z+ \mathsf{i}}$. The cut (deleted out of the disk) is taken on the interval $-\mathsf{i}$ to $\mathsf{i}$ ($y=-1$ to $y=1$ in the picture). Computationally this means the square roots were taken with the angles in the interval $-\pi/2$ to $3\pi/2$, which required me to define the square root using some hackery of the arctan function. I leave you with the conformal version of this (where the concentric ring radii increase exponentially rather than linearly; the linear increase looks better for looking like a cobweb, but this version looks better in preserving the proportions of the grid rectangles formed by the intersecting lines).
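My reconstruction of that hackery looks something like this (Python; the function names are mine): shift any polar angle below $-\pi/2$ up by a full turn before halving, so each square root's cut sits along the negative imaginary axis, and the two factors' sign flips then cancel below $-\mathsf{i}$, leaving only the segment from $-\mathsf{i}$ to $\mathsf{i}$.

```python
import cmath, math

def sqrt_cut_down(z):
    """Square root with the branch cut along the negative imaginary axis:
    the polar angle is taken in [-pi/2, 3*pi/2) before halving."""
    r = abs(z)
    theta = math.atan2(z.imag, z.real)     # in (-pi, pi]
    if theta < -math.pi / 2:               # rotate angles below -pi/2 up by a full turn
        theta += 2 * math.pi
    return math.sqrt(r) * cmath.exp(1j * theta / 2)

def g(z):
    # the mapping of the post, with both square roots taken on this branch
    return sqrt_cut_down(z - 1j) * sqrt_cut_down(z + 1j)

print(g(2 + 0j))   # 2.236..., matching sqrt(2^2 + 1) on the real axis
```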


Or, take a look at it in Desmos:


Sunday, October 11, 2020

Chaos Still Reigns!!

A quick follow-on to last week's post. It would be blasphemous to talk about chaos without at least mentioning the most iconic fractal since fractals became a thing via The Fractal Geometry of Nature, the set of complex numbers named after its author, Benoit Mandelbrot. I've spent a lot of time exploring the Mandelbrot set since I've been able to use computers, so you'd think I'd have plenty to say about it. And I do, but not this week. Indeed it is this, more than anything else, that started me on the visualization track. Instead, I'll leave you with the real reason why the thing is so damn captivating in the first place, with, what else, a visualization.

The Mandelbrot Set with $-1.01 \leq\operatorname{Re}(z) \leq -1.006$ and $0.309 \leq \operatorname{Im}(z) \leq 0.313$.
 
The Mandelbrot set is in fact relevant to the strongest theme of this blog, that of parameter spaces. Namely, it is the set of all parameters $c$ such that the iteration $z \mapsto z^2 + c$, starting at $z=0$, remains bounded. It in fact is a catalogue of another bunch of fractals (just like having a catalogue of all lines in the plane), Julia sets, which describe, for each fixed $c$, what starting values of the iteration $z \mapsto z^2 + c$ stay bounded. In other words, Julia sets describe a collection involving the $z$ values of the iteration, and the Mandelbrot set describes a collection involving $c$ values. The main dark blobby part of the image is the actual set; the fancy colors are just colorings according to how long it takes for a point there to escape outside a certain disk in the plane ($|z|\leq 2$). Here the colors are assigned to the Viridis palette, and the iteration is done up to 1000 times (at 1000, it's just considered to be in the set). It cycles through the Viridis palette by the number of iterations modulo 50 (every 50 times, the color repeats), which is why you see interesting discontinuous jumps in color. Enjoy. (The next post will probably include some examples of Julia sets, showing precisely how the Mandelbrot set is a catalogue of them). I leave you with my all-time favorite from the '90s, the "Jewel Box," as coined by Rollo Silver in the fractal newsletter Amygdala. Unfortunately, I don't know where he is now and what the state of the newsletter is!
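(For the record, an escape-time picture like the one above can be sketched in a few lines of numpy/matplotlib; the window is the one in the first caption, and the 1000-iteration cap and mod-50 coloring are as described, though the resolution is just my choice.)

```python
import numpy as np
import matplotlib.pyplot as plt

re = np.linspace(-1.01, -1.006, 600)
im = np.linspace(0.309, 0.313, 600)
C = re[np.newaxis, :] + 1j * im[:, np.newaxis]

Z = np.zeros_like(C)
count = np.zeros(C.shape, dtype=int)
for n in range(1000):
    mask = np.abs(Z) <= 2                 # only keep iterating points that haven't escaped
    Z[mask] = Z[mask] ** 2 + C[mask]
    count[mask] = n + 1

plt.imshow(count % 50, cmap="viridis", origin="lower",
           extent=(re[0], re[-1], im[0], im[-1]))
plt.show()
```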

Centered at $-0.74542846 + 0.11300888\mathsf{i}$ and going about $10^{-5}$ on either side in the $x$-direction


Saturday, October 3, 2020

Chaos Reigns!!!

 

Chaos and fractals were all the rage when I was growing up, and were probably among the major factors in getting me interested in math, programming, and differential equations, since much of this is ultimately about how systems evolve over time. Oh, and getting me into TOTALLY AWESOME visualizations. This is a graph of the various possible final states of the quadratic iterator: $x_{n+1} = ax_n (1-x_n)$. What this means is, for a given parameter $a$, and some start value in the range $(0, 1)$, we consider iterating the map $F(a, x) = ax(1-x)$, namely, the sequence $F(a, x)$, $F(a, F(a, x))$, $F(a, F(a, F(a, x)))\dots$. For some ridiculous history here, one of my favorite things to do as a child was to repeatedly press buttons on a calculator over and over again to see what happens. Little did I know that this is basically, in math, the only thing that actually exists: everything is built on the foundation of recursion. Of course, I don't actually believe recursion is the only thing that exists; rather, there is interesting conceptual structure that is built up upon fundamental components, and it is damn amazing how something so simple can ultimately create something so complex. And here is one example in which this is true.

Anyway, to get back to it: if we keep iterating $F$ in the variable $x$, we will eventually fall into some stable pattern. For $a$ between $1$ and $3$, iterating $F$ in the variable $x$ eventually gives convergence to a fixed point, namely $1-\frac{1}{a}$. How did we find this? If we have a fixed point $x = F(a, x)$, then $x = ax(1-x)$ or $-ax^2 + (a-1)x = 0$. This gives two solutions: $x = 0$, and $-ax + (a-1) = 0$, or $x = (a-1)/a$.

Now, for each such $a$ (which we take to be the horizontal axis), we plot this fixed point (on the vertical axis). And as you move up to $a=3$, all is peaceful, as all iterations eventually settle on one and only one point. But at $a = 3$, something strange happens. You stop getting a single fixed point, but rather, you start bouncing between two values. We can actually solve for what these two values are using none other than ... the cubic equation (that's the small extent of continuity here with the previous posts!). The way to see this: if there's a period-2 sequence, $x = F(a, y)$ and $y = F(a, x)$ for the two values, which means $x = F(a, F(a, x))$ and therefore $x = aF(a, x)(1-F(a, x)) = a(ax(1-x))(1- ax(1-x))$. This is a priori a quartic equation, since the number of $x$'s eventually multiply out to degree $4$, but once you realize there's a trivial solution $x = 0$ here (because there's a factor of $x$ multiplying on both sides), it's a cubic equation $a^2(1-x)(1-ax(1-x)) - 1 = 0$. I'm not going to try solving this, despite having had much fun wrangling with Cardano's formula in the past few weeks, because that distracts from the ultimate insight (and indeed, as fun as that formula is, and as beautiful as the theory that eventually comes from trying to solve polynomial equations, Galois theory, is, you quickly enter territory in which you stop being able to solve for things in terms of roots). To visualize, we instead proceed numerically, directly iterating from a start value and waiting for things to stabilize around a set of values. One sees that as $a$ grows higher and higher, the number of final points starts doubling and doubling, faster and faster (in fact, a factor of the Feigenbaum constant faster), until all chaos breaks loose and the iterator never settles on any fixed point. It's very weird how, just by varying the parameter $a$, one can go from very nice fixed-point behavior to wild chaos.
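The numerical procedure is simple enough to sketch (numpy/matplotlib; the number of transient and recorded iterates below are just my knobs): iterate every $a$ in parallel, throw away the transient, and plot what's left.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.linspace(2.5, 4.0, 1200)       # the parameter axis
x = np.full_like(a, 0.5)              # one start value in (0, 1) for every a

for _ in range(500):                  # discard the transient
    x = a * x * (1 - x)

for _ in range(200):                  # record where the orbits settle
    x = a * x * (1 - x)
    plt.plot(a, x, ",k", alpha=0.25)  # one tiny dot per (a, x)

plt.xlabel("a")
plt.ylabel("final states")
plt.show()
```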

Ok, you can't go very far analytically, but we can still try anyway:


To explain it, we'll have to back up a little. Note that I said the $2$-periodic solution involves the solution to a cubic equation. Where is the third one? (Actually, as previously mentioned, it's really quartic, and $x=0$ is a solution.) Is the cubic always one that has a zero discriminant? Probably not. We mentioned that these "fixed" points come about from iterating the mappings, i.e., there is dynamics involved. Solving the cubic, you will actually get $1 - 1/a$, the original fixed point, as one of the solutions (in fact, here is again where numerical analysis helps: you can get around using Cardano's formula if you realize from the picture that the original fixed point continues; then you guess the solution $1-1/a$, factor it out with synthetic division, and solve the remaining quadratic equation; this is exactly how I taught college algebra students to solve cubics). Of course, this has to satisfy $x = F(a, F(a, x)) = G(a, x)$, and the two solutions to this equation are fixed points of $G$, and thus when $F$ is evaluated on them, it alternates between the two values. What goes wrong here is that this point (as well as the point $x=0$) is unstable, namely, if you start iterating from some point even slightly away from this fixed point, iterating the map will run away from that point $1 - 1/a$. What is significant about the value $a = 3$ is that the discriminant of the cubic forces two of the roots to become complex conjugates below it, so there is only one real solution, and it also marks the switchover point between when the iteration is stable there versus elsewhere. A fixed point $x$ is stable whenever $|\partial G/\partial x| < 1$ at that point. This can be roughly seen by linearizing about the fixed point, a standard technique:

\[G(a, x+h) \approx G(a, x) + (\partial G/\partial x(a, x))h + O(h^2) = x + (\partial G/\partial x) h + O(h^2).\]

In other words, $G(a, x+h)$ is closer to $x$ than $x + h$ is, because $h$ is being multiplied by something of magnitude $< 1$. So iterating brings things close to the fixed point even closer. It can be seen that at least for a little bit after $a = 3$, the central $1-1/a$ continues, but the other two solutions are the stable fixed points. So if you plot the curves $x = G(a, x)$, you'll get something that divides into $3$ branches at $a = 3$ (referred to as a pitchfork bifurcation). Similarly, if you plot $x = F(a, F(a, F(a, F(a, x)))) = H(a, x)$, the 4th order iterate, it will divide into pitchforks twice. This is given in the above plot. However, to figure out which fixed points will actually happen if you iterate from a random starting point, you can figure out where $|\partial H/\partial x| < 1$. This gives the regions in the plane where solutions are stable. You can alternatively plot the complement, where it is $\geq 1$, which can be made to cover up the parts of the pitchforks that are not stable solutions. The weird gloopy visualization of these regions is just cool to look at, even where there aren't any stable points. The curves you see inside there are still the correct initial ones before it branches out into a complete mess. It of course quickly becomes ferociously hard to compute for higher degrees, so a numerical study is still the way to go. But this is conceptually fun, though!
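A sketch of that picture for the period-2 case (numpy/matplotlib; grid sizes and shading color are my choices): draw the implicit curve $x = G(a,x)$ with a contour at level zero, and shade where $|\partial G/\partial x| \geq 1$ to cover the unstable parts.

```python
import numpy as np
import matplotlib.pyplot as plt

def F(a, x):  return a * x * (1 - x)
def Fx(a, x): return a * (1 - 2 * x)            # partial derivative of F in x

a = np.linspace(1.0, 4.0, 800)
x = np.linspace(0.0, 1.0, 800)
A, X = np.meshgrid(a, x)

G  = F(A, F(A, X))                              # the twice-iterated map
Gx = Fx(A, F(A, X)) * Fx(A, X)                  # chain rule for dG/dx

# Shade the unstable region |dG/dx| >= 1, then draw the implicit curve x = G(a, x)
plt.contourf(A, X, (np.abs(Gx) >= 1).astype(float), levels=[0.5, 1.5], colors=["lightgray"])
plt.contour(A, X, G - X, levels=[0], colors="black")
plt.xlabel("a")
plt.ylabel("x")
plt.show()
```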

I leave you with a similar implicit function graph, but now color-coded:


The red and blue correspond to $|\partial G/\partial x| \geq 1$ and the greens are where it is $< 1$.

Another version, executed much faster using TensorFlow (yes, it can be used for non-machine learning applications!).