
I have a confession to make. I never really liked inverse trigonometric functions. I've had to help a number of students with them over the years, but more or less can relate when they express some distaste for them: it's easy to make mistakes with them, their domains are restricted in what sometimes seems like arbitrary and hard-to-remember ways, sometimes there is more than one valid answer, and conventions differ. Finally, even when you get your hands on a fancy calculator or computer software that supposedly can take care of things for you, it can also give answers that differ from what's expected in class, or otherwise require some interpretation to get right (often finessing with the quadrants, etc.). Actually, dealing with quadrants is cool; that's something akin to the notion of coordinate charts in probably this blog's favorite topic, manifolds. But it all still seems haphazard, and it all adds up to the perception that math is a rigged game in which people enforce rules for seemingly arbitrary reasons just to make you feel bad about yourself.
So I'm going to take you on an adventure involving some inverse trig, and we will get to the bottom of it and understand what it is that makes them so damn hard to deal with. And we'll have a good serving of pasta to go with it. Special edition fusilli. We will also revisit a favorite blog topic: the Riemann surface.
First off, I was very fortunate that my trig class made it a point to (gently) introduce complex numbers and eventually give the big revelation that the trig functions are all actually combinations of complex exponentials. I'm definitely a fan of not making people memorize a lot of trig identities when, conceptually, just one suffices. If there's any lesson one should take from complex numbers besides "that square root of -1 sure is pretty damn useful", it's the concept of numbers carrying a generalized sign: neither positive nor negative, but a whole directional space of possibilities. That's personally when the light bulb went off for me.
Despite all of this, and, furthermore, enjoying complex analysis as an undergrad...
I still didn't like inverse trig functions. The problem is that there's some rushed discussion on "choosing a principal branch" or "making branch cuts" (I've already given Roger Penrose's opinion on that), they choose it for log and shuffle through how to define it for some functions, and then it's on to the next topic. So the concept never really had the time to "gel". On to grad school, there were many topics that had their origin in figuring out what to do with complex functions, but the overspecialization of topics, plus concerns about choosing a research topic ASAP, made it so that I never really got around to an in-depth study of the real (HAR!) solution to all of this: Riemann Surfaces. Now of course, we've mentioned them on the blog before, but even that was really more of a "beware, functions might not do what you think they will do", rather than actually getting to the bottom of what is really happening.
The Algebra of What's Going On
Inverse sine: blue, Inverse cosine: red.
So let's start off trying to understand the current state of frustration. The "definition" of an inverse trig function is easy enough: it's just whatever angle produces a given ratio of sides. Of course, the problem here is: more than one angle works. The function that produces this is not one-to-one, so therefore there can't be an inverse function. Usually, that's the end of the discussion. When an application comes up to solve for angles, they're usually in some restricted domain in which you can tease out the answer that you really want, by some reasoning about the problem. This is a good habit, but it maybe gets lost in translation. Instead there are conventions, like inverse cosine always giving an angle between $0$ and $\pi$, and inverse sine giving an angle between $-\pi/2$ and $\pi/2$ (see figure above). Don't get me started on the other ones, because I don't even bother (though, you may find a love for the programmer's atan2 after this post). Once you consider the fact that the functions are $2\pi$-periodic, it's now all good, right? You can just add a bunch of multiples of $2\pi$ and it's all great! Right? Right??? Oh right. There's like complementary/supplementary angles. Like $\pi$ minus the angle. And stuff. And other stuff. Oof, a headache already, right?
To help us understand what is going on, let's first figure out what these inverse functions are in terms of logs. It might be a bit of fun, because I don't think solving for inverse trig in terms of logs is on anyone's radar even after they learn Euler's identity. First consider the cosine:
$$y = \cos(x) = \frac{e^{\mathsf{i}x} + e^{-\mathsf{i}x}}{2}.$$
Now if we multiply through by $2 e^{\mathsf{i}x}$, we get
$$2 y e^{\mathsf{i} x} = e^{2\mathsf{i}x} + 1.$$
Maybe it's not obvious what to do with this, but we can rewrite it like this:
$$e^{2\mathsf{i} x} - 2 y e^{\mathsf{i} x} + 1 = 0.$$
Take $u = e^{\mathsf{i}x}$. This gives $u^{2} - 2yu + 1 = 0$. Hopefully this is more familiar. Plugging into the trusty quadratic formula,
$$e^{\mathsf{i} x} = u = \frac{2y \pm \sqrt{4y^{2} - 4}}{2} = y \pm \sqrt{y^{2} - 1}.$$
Then with the logs,
$$x = \frac{1}{\mathsf{i}} \ln\left( y \pm \sqrt{y^{2} - 1}\right) = -\mathsf{i}\ln\left( y \pm \sqrt{y^{2} - 1}\right).$$
First off, whoa. We have inverse cosine in terms of logs and algebraic operations. Maybe it's old hat to the math majors, but we should remember complex numbers are often just introduced as "Oh you can't take the square root of -1? Can't stop me! 🤪🤪". It's a big unifying concept.
... but I hope you also see some, um, issues with this. First of all, that nasty $\pm$. Which one is it??? If you want purely real numbers... well... some bad news there. If $y \geq 1$, then since $\sqrt{y^2-1}$ is always of smaller magnitude than $y$, both the plus and the minus give you something positive, and thus legal to take the (real) log of. But then the $-\mathsf{i}$ stops you afterward. Ok, maybe we want to make the log be purely imaginary, so that the $-\mathsf{i}$ will cancel it. So we are still made to venture out into the nuances of how logs and complex numbers work. Another fact that is not obvious to start (and actually, this will be the thing that makes our pasta more interesting): it turns out that for any complex number $y$, the two numbers $y + \sqrt{y^{2} - 1}$ and $y - \sqrt{y^{2} - 1}$ are reciprocal (or: since there are two square roots for every complex number, this says the two possible values gotten by the square root are reciprocal). This is easy to verify: $(y + \sqrt{y^{2} - 1})(y - \sqrt{y^{2} - 1}) = y^2 - (y^2 -1) = 1$.
Reciprocal numbers pass through the logarithm to become a minus sign, so we can (rather surprisingly) rewrite it as
$$x = \pm \mathsf{i} \ln\left(y + \sqrt{y^{2}-1}\right).$$
(Surprising, because usually it is NOT legal to take out plus/minus signs through a log like that). Now if in our classic situation we have $-1 \leq y \leq 1$, then $y^2 - 1$ is going to be negative (or zero), and thus have an imaginary square root: $y + \sqrt{y^2-1} = y +\mathsf{i}\sqrt{1-y^2}$. We should note that if you take the square of the complex modulus of that, you get $y^2 + (1-y^2) = 1$, so the modulus itself is $1$. What happens when you take a log of something of complex modulus 1? The Pythagorean Trig identity and Euler's identity, the only two you need, show that you get something purely imaginary. Which, when combined with the outside factor of $\pm\mathsf{i}$, gets you two real solutions of opposite sign. If you then think about it some, it helps to recall that cosine is an even function, i.e. it gives the same result when switching the sign of its argument. So it makes sense you can have oppositely signed results for the inverse.
Finally, now the $2\pi$-periodicity of trig functions can be brought in, since Euler's identity is valid for angles that keep wrapping 'round and 'round: you get a bunch of results separated by multiples of $2\pi$ and also the result of the opposite sign separated by multiples of $2\pi$.
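If you'd like to poke at this numerically, here is a minimal sketch using Python's standard `cmath` module. The function name `inverse_cosine_candidates` and the `ladder` parameter are made up for this illustration; the square root and log are whatever principal values `cmath` provides. It checks that the two roots are reciprocal, that they sit on the unit circle when $-1 \leq y \leq 1$, and that every candidate $\pm x_0 + 2\pi n$ really does have cosine $y$.

```python
import cmath
import math

def inverse_cosine_candidates(y, ladder=range(-1, 2)):
    """Return x = +/- x0 + 2*pi*n for n in `ladder`,
    where x0 = -i * ln(y + sqrt(y^2 - 1)) (principal sqrt and log)."""
    root = cmath.sqrt(y * y - 1)           # imaginary when -1 <= y <= 1
    x0 = -1j * cmath.log(y + root)
    return [sign * x0 + 2 * math.pi * n for n in ladder for sign in (1, -1)]

y = 0.5
plus = y + cmath.sqrt(y * y - 1)
minus = y - cmath.sqrt(y * y - 1)
print("product of the two roots:", plus * minus)   # should be 1 up to rounding
print("modulus of the + root:   ", abs(plus))      # should be 1 up to rounding

for x in inverse_cosine_candidates(y):
    # every candidate is (numerically) real, and its cosine is y again
    print(f"x = {x.real:+.6f}   cos(x) = {cmath.cos(x).real:.6f}")
```

For $y = 0.5$ the list contains $\pm\pi/3$ together with their shifts by $\pm 2\pi$, which is the familiar "two answers per period" picture.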
That's all well and good. It was a bunch of algebra and symbol wrangling. The whole point of this blog is, what the hell does this actually look like? This requires a bit more finessing, but it shows up on the teaser title image: note that it's a surface with ramps moving up and down both in a counterclockwise and a clockwise direction. This is in contrast to the usual depictions of fusilli pasta, which only has one ramp spiraling up.
The Full Complex Definition
To understand how to get a full-blown surface from all of this, we will have to stop confining ourselves to real numbers, or simple images of the real number line (i.e. $1$-dimensional subsets), such as the unit circle. In complex analysis, we define for all complex $z$,
$$\cos(z) = \frac{e^{\mathsf{i} z} + e^{-\mathsf{i} z}}{2},$$
where you can use the complex exponential. We don't quite have the space to define that here, but one quick way is to use power series. What's important is to realize it still satisfies the same laws of exponents we know and love: $e^{z + w} = e^{z} e^{w}$, etc. Then when writing down the formula for the inverse of $w = \cos(z)$, we get
$$z = \mp\mathsf{i} \ln\left(w+ \sqrt{w^2 - 1}\right) = \mp\mathsf{i} \ln\left(w+ \mathsf{i}\sqrt{1-w^2}\right).$$
Now the question of which square root and which logarithm becomes an issue. The easy thing to say is: I'm going to ask you to ignore all that we drilled into your head about functions, the vertical line test, the horizontal line test, etc. Or at least, temporarily suspend it... We'll answer questions about how to precisely, unambiguously choose a different branch of a square root in programming in a bit, but for now: the most common complex square root that is implemented is, effectively, taking the polar form of the complex number, with its angle $-\pi < \theta \leq \pi$, then halving the angle and taking the ordinary square root of the radius (yes, the less than or equal on the right, unless you're in Apple's Grapher, for which it is $-\pi \leq \theta < \pi$, which made me have to jump through additional hoops: so please beware of this if you're going to go off on some explorations on your own. I told you that conventions differ!). The picture of the square root is to take the whole complex plane, and map it to the right half-plane, including the upper segment of the boundary, but not the lower segment. Introductory complex analysis texts pay no heed to this extra boundary happening, because the theory gives preference to open sets.
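To see that convention in action, here is a small sketch in Python. The helper `principal_sqrt` is a made-up name for this illustration: it builds the square root by hand from the polar form (halve the angle, root the radius) and compares it with the library's `cmath.sqrt`. The sample points are arbitrary.

```python
import cmath
import math

def principal_sqrt(z):
    """Square root from the polar form: root of the radius, half of the angle.
    cmath.polar returns the angle in [-pi, pi]."""
    r, theta = cmath.polar(z)
    return cmath.rect(math.sqrt(r), theta / 2)

samples = [1 + 1j, -1 + 1j, -1 - 1j, 3 - 4j, complex(-4.0, 0.0)]
for z in samples:
    w = principal_sqrt(z)
    # w should agree with cmath.sqrt(z) and land in the right half-plane
    print(z, "->", w, "  library:", cmath.sqrt(z))
```

Every output has nonnegative real part (up to rounding), and the point $-4$ on the negative real axis lands on the positive imaginary axis, the "upper segment of the boundary". One caveat if you explore on your own: Python's `cmath` also looks at the sign of a zero imaginary part, so `complex(-4.0, -0.0)` is treated as sitting just below the cut.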
The strategy for choosing the multiple values is: take a principal branch, and then adjust based on two parameters: one, the sign of the square root, and two, which $2\pi$-period it comes from. We'll call it its ladder position. These parameters actually have a group structure which definitely surprised me the first time I saw it. We'll get to it. For now, we take the principal branch of inverse cosine to be:
$$\cos^{-1}(z) = -\mathsf{i} \ln\left(z + \mathsf{i} \sqrt{1-z^2}\right),$$
where we use the standard logarithm and square root that do the funny stuff on the negative axis. We derive this next (this choice is what reduces to that $0$ to $\pi$ range originally given, when we restrict to real values between $-1$ and $1$). The actual place where one needs to worry about a discontinuity in the complex plane (which is what branch cut means), in this case is $(-\infty, -1] \cup [1, \infty)$.
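As a quick sanity check of that principal branch, here is a sketch in Python. The name `acos_principal` is made up for this illustration. It evaluates the formula with the standard `cmath.log` and `cmath.sqrt` and compares against the library's own `cmath.acos`, both on the real interval $[-1, 1]$ and at a few arbitrary complex points chosen away from the cuts.

```python
import cmath

def acos_principal(z):
    """Principal inverse cosine via -i * ln(z + i * sqrt(1 - z^2))."""
    z = complex(z)
    return -1j * cmath.log(z + 1j * cmath.sqrt(1 - z * z))

# On [-1, 1] the values are real and run from pi down to 0.
for y in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    print(y, acos_principal(y).real)

# Away from the cuts (-inf, -1] and [1, inf), compare with cmath.acos.
for z in [0.5 + 0.5j, -1.2 + 0.7j, 2 - 3j, -0.4 - 2j, 1j]:
    print(z, acos_principal(z), cmath.acos(z))
```

The points were deliberately picked off the real axis beyond $\pm 1$, since that is exactly where the value depends on which side of the cut you approach from.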
The Graph
The classic visualization is $y = f(x)$ meaning the set of points $(x, f(x))$ in the plane, where $x$ lies in some portion of $\mathbb R$. This we've talked about many times before. It is the graph parametrization.
For complex-valued functions, ideally we would be able to visualize $(z, f(z))$ as something in 4 dimensions, two for the domain and two for the range. And visualizing 4-dimensional things is a favorite thing for mathematicians to try to do in various ways. A common way to do it for Riemann surfaces is by simply taking real and imaginary parts: $(z, \operatorname{Re}(f(z)))$ and $(z, \operatorname{Im}(f(z)))$ as separate 3D graphs. What's nice is that this tells you one method of visualizing the inverse function: $(f(z), z)$, or for a pair of 3D functions, in terms of parametrizations as $(\operatorname{Re}(f(z)), \operatorname{Im}(f(z)), \operatorname{Re}(z))$ and $(\operatorname{Re}(f(z)), \operatorname{Im}(f(z)), \operatorname{Im}(z))$. All of this is with $z = x+\mathsf{i}y$. The former is our title image with $-3 \leq x \leq 3$ and $-3 \leq y \leq 3$.
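The swapped parametrization is easy to sample without ever computing an inverse. Below is a minimal sketch in Python. The function name `inverse_graph_points` and the grid size are made up for this illustration, and it only produces point lists, not the rendered picture. It evaluates $f = \cos$ on a grid over $-3 \leq x, y \leq 3$ and stores $(\operatorname{Re} f(z), \operatorname{Im} f(z), x)$ and $(\operatorname{Re} f(z), \operatorname{Im} f(z), y)$.

```python
import cmath

def inverse_graph_points(f, lo=-3.0, hi=3.0, steps=13):
    """Sample z = x + iy on a square grid and return two point lists:
    (Re f(z), Im f(z), x) and (Re f(z), Im f(z), y)."""
    real_sheet, imag_sheet = [], []
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        for j in range(steps):
            y = lo + (hi - lo) * j / (steps - 1)
            w = f(complex(x, y))
            real_sheet.append((w.real, w.imag, x))   # height = Re(z)
            imag_sheet.append((w.real, w.imag, y))   # height = Im(z)
    return real_sheet, imag_sheet

real_sheet, imag_sheet = inverse_graph_points(cmath.cos)

# z = 1 + i and z = -1 - i sit over the same base point, at different heights:
print(real_sheet[8 * 13 + 8])
print(real_sheet[4 * 13 + 4])
```

The two printed points share their first two coordinates (because cosine is even) but have heights $1$ and $-1$, which is the multivaluedness of the inverse showing up as stacked sheets.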
This is a similar graph of its imaginary part:
which rather surprisingly does not spiral around, namely, it has no more than two values per input value $z$. This comes from the fact that the real part of the complex logarithm comes from the complex number's radius. Moving things in the forward direction, though, namely the approach $(z, f(z))$ with the two coordinates of $z$ forming the $xy$ part of the parametrization, is a bit trickier, because we need to confront head on the multivaluedness of the inverse. But it does have an advantage that we can more finely control the domain coordinates. For that, we now talk about...
Learning to Live with the Branch Cuts We've Got
And just how exactly do we deal with branch cuts? How do we deal with them systematically? One way was described here. Basically, the key is to consider the "problematic" part of the standard functions (both along the negative real axis) and work backward, computing what the problematic part looks like when mapped this way and that. The classical way of dealing with it is to consider continuous values of the functions along curves. But standard functions provided by computers don't take continuous curves as a parameter. However, for certain parametrizations of our surfaces, such as cylindrical coordinates, coordinate curves like the polar coordinate $\theta$ will cross the cuts at very predictable values of the coordinate. Then the key to evaluating the function is to strategically switch the sign of the square root, and the ladder position, based on the coordinate, in order to maintain the continuity. For parametrizing in terms of ladder steps going around multiple times, we will parametrize as follows: taking $z = x+\mathsf{i}y = u e^{2\pi \mathsf{i} t}$, we take
$$\begin{pmatrix}u \cos(2\pi t) \\ u \sin(2 \pi t) \\ \operatorname{Re}\left(-\mathsf{i} \ln \left(z + \mathsf{i} \sqrt{1-z^2}\right)\right) \end{pmatrix}$$
Now to deal with the multivaluedness of the function: the crossings happen when $u \geq 1$ and $t$ passes an integer or a half-integer. At each half-integer, maintaining continuity takes both a sign switch in the square root and a bump along the ladder of the log; at each integer, only the sign of the square root needs to switch.
This changes the formula to something that's definitely less pretty... BUT it'll be much more concrete, in terms of being able to use common, readily available complex logs and square roots on your system without further custom hacks:
$$-\mathsf{i} \ln \left(z + \mathsf{i} \sqrt{1-z^2}\right)$$
is realized as
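$$(-1)^{k+1}\left[\mathsf{i} \ln \left(z + (-1)^{\lceil 2t\rceil }\mathsf{i} \sqrt{1-z^2}\right) + 2\pi\lceil t - \tfrac{1}{2}\rceil\right],$$
where $k$ is $0$ or $1$ to choose which ramp to go on, the ceiling term $2\pi\lceil t - \tfrac{1}{2}\rceil$ moves you one step along the ladder each time $t$ passes a half-integer, which is what keeps the log continuous after a full trip around, and the inner sign $(-1)^{\lceil 2t\rceil}$ switches the square root at every integer and half-integer, keeping track of both crossings. The outer sign has to multiply the ladder term as well, since a step along the ladder is a shift of $2\pi\mathsf{i}$ inside the log, and that shift picks up the same factor of $\pm\mathsf{i}$ as the log itself.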
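Here is a minimal sketch of that bookkeeping in Python, using only the stock `cmath.log` and `cmath.sqrt`. The function name `inverse_cosine_on_ramp` is made up for this illustration, and it is only meant for $u \geq 1$, where the crossings actually happen. One assumption to flag: the sketch applies the ramp sign $(-1)^{k+1}$ to the ladder term as well as to the log, which is the arrangement under which both ramps come out continuous in the checks below.

```python
import cmath
import math

def inverse_cosine_on_ramp(u, t, k=0):
    """One continuous value of the inverse cosine at z = u * exp(2*pi*i*t).
    Intended for u >= 1; k = 0 or 1 selects the ramp."""
    z = cmath.rect(u, 2 * math.pi * t)
    # square-root sign flips each time t passes an integer or half-integer
    root_sign = 1 if math.ceil(2 * t) % 2 == 0 else -1
    # ladder position steps each time t passes a half-integer
    ladder = math.ceil(t - 0.5)
    ramp_sign = -1 if k == 0 else 1            # this is (-1)**(k + 1)
    log_term = 1j * cmath.log(z + root_sign * 1j * cmath.sqrt(1 - z * z))
    return ramp_sign * (log_term + 2 * math.pi * ladder)

u, eps = 2.0, 1e-6
for k in (0, 1):
    for crossing in (0.0, 0.5, 1.0, 1.5):
        before = inverse_cosine_on_ramp(u, crossing - eps, k)
        after = inverse_cosine_on_ramp(u, crossing + eps, k)
        print(k, crossing, abs(after - before))   # tiny: no jump at the cut
```

Each returned value $v$ still satisfies $\cos(v) = z$, and going once around (increasing $t$ by $1$) shifts the value by $2\pi$ in one direction for $k = 0$ and in the opposite direction for $k = 1$: the two ramps.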
There's actually so much more to say here about getting down into the weeds and nitty gritty to really practice wrangling with the branch cuts, but this can be an entire post in itself! I know it can be unsatisfying! As the YouTubers would say, let me know in the comments. A hidden gem in this is the intertwined nature of how the sign interacts with the log's ladder step. I originally thought the two were independent. They are not, and it turns out that it's the action of an infinite dihedral group. This would be material for yet another post! To me, this is the ultimate explanation of the double ramp, the true way I learned to love the inverse trig functions ... but hopefully the awesomeness hits sooner than having to do abstract algebra!
The result of all of this:
Notice how the coordinate curves are more uniform.
Sneak Peek
A fusilli with ramps going in two opposite directions is all well and good. But what about three directions in a 3-fold symmetry? Here's a sneak peek of that. And that's another adventure, for another post.
There are lots of directions we can go from here, and it'll be a big subject for the blog because it intertwines so many subjects! This 3-fold symmetry version will have us getting into the nitty gritty of defining branches, and a review of the cubic formula (also talked about here previously). We could talk about the algebraic topology aspect, where taking various paths around the branch points forms a group. And some unexpected cool results surrounding that.