# Extraneous Solutions – Part 2 of 3

## Solving an Equation as a Sequence of Equation Replacement Operations

Part 1 was so long because I wanted to be extremely thorough and to present things to an audience that perhaps hadn’t thought much about the logic of equation solving at all. Since we’re now all experts, perhaps it’s worth it to summarize everything very succinctly.

Given an equation in one free variable, we want to find the solution set. To do this, we replace that equation with an equivalent equation whose solution set is more obvious.

(1) $8x - 5 = 5x + 1$

(2) $8x = 5x + 6$

(3) $3x = 6$

(4) $x = 2$

If in the transition from (1)-(2), from (2)-(3), and from (3)-(4) we are careful to replace each equation with an equivalent equation, then by the transitivity of equivalence, the original equation and terminal equation are guaranteed to be equivalent. Since the solution set of the terminal equation is obvious, we know the solution set of the original equation, as well. Thus solving an equation requires establishing that certain equation replacement operations are indeed equivalence preserving and having the creativity and experience to know which ones to apply and in what order.

## What are the Equivalence-Preserving Operations on Equations?

If $a = b$, then $f(a) = f(b)$ for any well-defined function $f$. If $a$ and $b$ are expressions containing a free variable, then any value of that variable which satisfies $a = b$ will also satisfy $f(a) = f(b)$. In other words, if you find it useful, feel free to replace any equation with a new equation which is the result of applying any function to both sides of the original equation. Any solution to the original equation will also be a solution to the new equation.

If the function $f$ is also one-to-one, then by definition, $f(a) = f(b) \Rightarrow a = b$ so any solution of $f(a) = f(b)$ will also be a solution to $a = b$. Thus applying $f$ to both sides of an equation is equivalence-preserving. If $f$ is not one-to-one, then in general, the operation is not equivalence-preserving.

In solving equation (1), we applied $f(n) = n + 5$, $g(n) = n - 5x$ and $h(n) = n/3$ in that order. Since all three of the functions are one-to-one, we are assured that (1) and (4) are equivalent. If we had cause to apply a non-one-to-one function, then we should be vigilant for extraneous solutions.
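For anyone who likes to see this mechanically, here is a minimal sketch (not a proof) that treats equations (1) through (4) as predicates and compares their solution sets over a finite pool of candidates. The pool of quarter-integers between $-10$ and $10$ is my own arbitrary assumption; exact `Fraction` arithmetic avoids any rounding questions.

```python
from fractions import Fraction

# Equations (1)-(4) as predicates: each returns True when x satisfies it.
equations = {
    1: lambda x: 8 * x - 5 == 5 * x + 1,
    2: lambda x: 8 * x == 5 * x + 6,
    3: lambda x: 3 * x == 6,
    4: lambda x: x == 2,
}

# Hypothetical finite search space: multiples of 1/4 from -10 to 10.
candidates = [Fraction(k, 4) for k in range(-40, 41)]

def solution_set(predicate):
    """Return the set of candidates that satisfy the predicate."""
    return {x for x in candidates if predicate(x)}

solution_sets = {label: solution_set(eq) for label, eq in equations.items()}

for label, solutions in solution_sets.items():
    print(label, sorted(solutions))
```

Over this pool, all four equations pick out exactly the same candidates, which is what equivalence predicts. A finite check like this can only fail to find a discrepancy; the actual guarantee comes from the one-to-one argument above.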

## A More Interesting Example

Consider

(5) $\sqrt{6x-2} - \sqrt{x+1} = 2$

As I mentioned in the other post, these square roots are begging to be squared, but since there are two of them, one squaring will not be enough. Even though it’s not necessary to do so, it’s helpful to move one radical expression to the other side.

(6) $\sqrt{6x-2} = 2 + \sqrt{x+1}$

(7) $6x - 2 = 4 + 4\sqrt{x + 1} + x + 1$ We squared!

(8) $5x - 7 = 4\sqrt{x+1}$

(9) $25x^2 - 70x + 49 = 16x + 16$ We squared again!

(10) $25x^2 - 86x + 33 = 0$

(11) $(25x - 11)(x - 3) = 0$

So $x \in \{\frac{11}{25}, 3\}$

Since in the transition from (6)-(7) and again in the transition from (8)-(9) we had reason to apply the non-one-to-one function $f(n) = n^2$, we should be vigilant for extraneous solutions. [Note: since both sides of (6) are necessarily nonnegative, applying $f(n) = n^2$ is equivalence-preserving in that step, so no extraneous roots will be created there.] By checking back in the original equation, we see that 3 is a solution, but $\frac{11}{25}$ is not. I am more or less content to leave it at that. But some may ask for more clarity as to exactly what happened and when, so let’s indulge them.

I will now list each equation in reverse order along with its solution set:

(11) $(25x - 11)(x - 3) = 0$                            $\{\frac{11}{25}, 3\}$

(10) $25x^2 - 86x + 33 = 0$                             $\{\frac{11}{25}, 3\}$

(9) $25x^2 - 70x + 49 = 16x + 16$                 $\{\frac{11}{25}, 3\}$

(8) $5x - 7 = 4\sqrt{x+1}$                                     $\{3\}$

Since $5\cdot\frac{11}{25} - 7 = \frac{11}{5} - \frac{35}{5} = -\frac{24}{5} \neq 4\sqrt{\frac{11}{25} + 1} = 4\sqrt{\frac{11}{25} + \frac{25}{25}} = 4\sqrt{\frac{36}{25}} = 4\cdot\frac{6}{5} = \frac{24}{5}$

So we have isolated the precise moment when the extraneous solution $x = \frac{11}{25}$ is created and it appears exactly where we would expect it, in the transition from (8) to (9) as we replaced (8) with the result of applying the non-one-to-one function $f(n) = n^2$ to both sides.

More specifically, if $x = \frac{11}{25}$, (8) reads $- \frac{24}{5} = \frac{24}{5}$, which is false, but (9) reads $(- \frac{24}{5})^2 = ( \frac{24}{5})^2$, which is true. For this particular value of $x$, we squared both sides and replaced a false statement with a true statement. In retrospect, we can say that $x =\frac{11}{25}$ is not a solution to (8) or to any previous equation in the solving sequence, but is a solution to (9) and thus to all subsequent equations in the solving sequence.

(7) $6x - 2 = 4 + 4\sqrt{x + 1} + x + 1$                          $\{3\}$

Since both sides of (6) are positive when $x = 3$, it does not surprise us that,

(6) $\sqrt{6x-2} = 2 + \sqrt{x+1}$                               $\{3\}$

(5) $\sqrt{6x-2} - \sqrt{x+1} = 2$                                $\{3\}$

By fully analyzing the logic behind each step of our equation replacement sequence, we not only:

• confirm that $x = 3$ is a solution and that $x = \frac{11}{25}$ is not and
• understand that squaring both sides may produce an extraneous solution

but also

• isolate the precise step in the solving sequence in which this extraneous solution was created, answering the why, how, and when for this problem
• confirm that the non-solution status of $x = \frac{11}{25}$ is not merely due to an error of algebra or arithmetic, but is a direct result of the fact that this value produces an equation (8) of the form $a = -a$ (with $a \neq 0$)

That last point is crucial in distinguishing the phenomenon of extraneous roots from the phenomenon of user error in algebra or arithmetic. If our equation solving sequence consists solely of equivalence-preserving operations, we do not even need to check to see if solutions to our terminal equation are also solutions to our original equation. If we do decide to check, perhaps out of an abundance of caution, and find a discrepancy, then user error must be to blame.

On the other hand, if a solver does employ solution-set-enlarging operations in the solving sequence and finds that a solution to the terminal equation is not a solution to the original equation, is this because the solution is extraneous or due to user error? One could perform an analysis like I did above and confirm that the non-solution is not due to user error, but instead to the logic of the process.
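The analysis above can also be carried out mechanically. Here is a small sketch that plugs both candidates into every equation from (5) through (11) using exact rational arithmetic. The helper `exact_sqrt` is my own invention for this illustration; it only works because every radicand that arises here happens to be a perfect square of a rational.

```python
from fractions import Fraction
from math import isqrt

def exact_sqrt(q):
    """Exact square root of a nonnegative Fraction that is a rational square."""
    q = Fraction(q)
    num, den = isqrt(q.numerator), isqrt(q.denominator)
    if num * num != q.numerator or den * den != q.denominator:
        raise ValueError(f"{q} is not the square of a rational number")
    return Fraction(num, den)

# Equations (5)-(11) as predicates on x.
equations = {
    5: lambda x: exact_sqrt(6 * x - 2) - exact_sqrt(x + 1) == 2,
    6: lambda x: exact_sqrt(6 * x - 2) == 2 + exact_sqrt(x + 1),
    7: lambda x: 6 * x - 2 == 4 + 4 * exact_sqrt(x + 1) + x + 1,
    8: lambda x: 5 * x - 7 == 4 * exact_sqrt(x + 1),
    9: lambda x: 25 * x**2 - 70 * x + 49 == 16 * x + 16,
    10: lambda x: 25 * x**2 - 86 * x + 33 == 0,
    11: lambda x: (25 * x - 11) * (x - 3) == 0,
}

candidates = [Fraction(11, 25), Fraction(3)]

# For each equation, record which of the two candidates satisfy it.
status = {label: [x for x in candidates if eq(x)] for label, eq in equations.items()}

for label in sorted(status):
    print(label, [str(x) for x in status[label]])
```

The printed table shows $\frac{11}{25}$ satisfying (9), (10), and (11) but none of (5) through (8), so the candidate first appears at the second squaring. If a solver's own work disagreed with a table like this, that would point to an arithmetic slip rather than to the logic of the process.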

# Extraneous Solutions – Part 1 of 3?

## Disclaimer

Within my small inner circle of math teachers, the mystery of extraneous solutions seems to be the issue of the year. I have so much to say on this topic (algebraic, logical, pedagogical, historical, linguistic) that I don’t really know where to begin. My only disclaimer is that I’m not really sure if this topic is all that important.

## Solving an Equation with a Radical Expression

Consider the following equation:

(1) $2\sqrt{x+8} +5 = 11$

One hardly needs algebra skills or prior knowledge to solve this, but prior experience suggests trying to isolate $x$.

(2) $2\sqrt{x+8} = 6$ (we subtract 5 from both sides)

(3) $\sqrt{x+8} = 3$ (we divide both sides by 2)

Now, if the square root of something is 3, then that something must be 9, so it immediately follows that

(4) $x+8 = 9$

(5) $x = 1$ (we subtract 8 from both sides)

## Squaring Both Sides

In my transition from (3) to (4), I used a bit of reasoning. Some conversational common sense told me that “if the square root of something is 3, then that something must be 9”. But that logic is usually just reduced to an algebraic procedure: “squaring both sides”. If we square both sides of equation (3), we get equation (4).

On the one hand, this seems like a natural move. Since the meaning of $\sqrt{a}$ is “the (positive) quantity which when squared is $a$”, the expression $\sqrt{a}$ is practically begging us to square it. Only then can we recover what lies inside. A quantity “which when squared is $a$” is like a genie “which when summoned will grant three wishes”. In both cases you know exactly what to do next.

Unfortunately, squaring both sides of an equation is problematic. If $a = b$ is true, then $a^2 = b^2$ is also true. But the converse does not hold. If $a^2 = b^2$, we cannot conclude that $a = b$, because opposites have the same square.

This leads to problems when solving an equation if one squares both sides indiscriminately.

## A Silly Equation Leads to Extraneous Solutions

Consider the equation,

(6) $x = 4$

This is an equation with one free variable. It’s a statement, but it’s a statement whose truth is impossible to determine. So it’s not quite a proposition. Logicians would call it a predicate. Linguistically, it’s comparable to a sentence with an unresolved anaphor. If someone begins a conversation with the sentence “He is 4 years old”, then without context we can’t process it. Depending on who “he” refers to, the sentence may be true or false. The goal of solving an equation is to find the solution set, the set of all values for the free variable(s) which make the sentence true.

Equation (6) is only true if $x$ has value 4. So the solution set is $\left\{4 \right\}$. But if we square both sides for some reason…

(7) $x^2 = 16$ has solution set $\left\{4, -4\right\}$

We began with $x = 4$, “did some algebra”, and ended up with $x^2 = 16$. By inspection, $-4$ is a solution to $x^2 = 16$, but not to the original equation which we were solving, so we call $-4$ an “extraneous solution”. [Extraneous – irrelevant or unrelated to the subject being dealt with]
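A tiny sketch makes the enlargement visible. The candidate pool (integers from $-10$ to $10$) is an arbitrary assumption of mine; the point is only the comparison of the two sets.

```python
# Hypothetical finite pool of candidate values for x.
candidates = range(-10, 11)

before = {x for x in candidates if x == 4}        # equation (6)
after = {x for x in candidates if x**2 == 16}     # equation (7)

# Anything that satisfies (7) but not (6) is extraneous.
extraneous = after - before

print(sorted(before), sorted(after), sorted(extraneous))
# prints: [4] [-4, 4] [-4]
```

Every solution of (6) survives into (7), so nothing is lost by squaring; the set can only grow.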

Note that the appearance of the extraneous solution in the algebra of (6)-(7) did not involve the square root operation at all. But this example was also a bit silly because no one would square both sides when presented with equation (6), so let’s look at a slightly less silly example.

(8) $2\sqrt{x+8} + 5 = -1$

(9) $2\sqrt{x+8} = -6$

(10) $\sqrt{x+8} = -3$

People paying attention might stop here and conclude (correctly) that (10) has no solutions, since the square root of a number cannot be negative. Closer inspection of the logic of the algebraic operations in (8)-(10) enables us to conclude that the original equation (8) has no solutions either. Since $a = b \iff a - 5 = b -5$, any solution to (8) will also be a solution to (9) and vice versa. Since $a = b \iff a/2 = b/2$, any solution to (9) will also be a solution to (10) and vice versa. So equations (8), (9), and (10) are all “equivalent” in the sense that they have the same solution set.

But what if the equation solver does not notice this fact about (10) and decides to square both sides to get at that information hidden inside the square root?

(11) $x+8 = 9$

(12) $x = 1$

Again we have an extraneous solution. $x = 1$ is a solution to (12), but not to the original equation (8). Where did everything go wrong? By the previous logic, (8), (9), and (10) are all equivalent. (11) and (12) are also equivalent. So the extraneous solution somehow arose in the transition from (10) to (11), by squaring both sides.

So unlike subtracting 5 from both sides or dividing both sides by 2, squaring both sides is not an equivalence-preserving operation. But we tolerate this operation because the implication goes in the direction that matters. If $a = b$, then $a^2 = b^2$, so if $a$ and $b$ are expressions containing a free variable $x$, any value of $x$ that makes $a = b$ true will also make $a^2 = b^2$ true.

In other words, squaring both sides can only enlarge the solution set. So if one is vigilant when squaring both sides to the possible creation of extraneous solutions, and is willing to test solutions to the terminal equation back into the original equation, the process of squaring both sides is innocent and unproblematic.
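To pin down where the solution set changes in the sequence (8)-(12), here is a rough sketch that computes each equation's solution set over a pool of integers and reports the first transition at which the set differs. The pool and the helper names are assumptions for illustration only.

```python
import math

# Hypothetical pool; starting at -8 keeps x + 8 nonnegative for math.sqrt.
candidates = range(-8, 21)

steps = {
    8: lambda x: 2 * math.sqrt(x + 8) + 5 == -1,
    9: lambda x: 2 * math.sqrt(x + 8) == -6,
    10: lambda x: math.sqrt(x + 8) == -3,
    11: lambda x: x + 8 == 9,
    12: lambda x: x == 1,
}

solution_sets = {label: {x for x in candidates if eq(x)} for label, eq in steps.items()}

# Find the transitions where the solution set is not preserved.
labels = sorted(solution_sets)
changes = [
    (a, b)
    for a, b in zip(labels, labels[1:])
    if solution_sets[a] != solution_sets[b]
]

print(solution_sets)
print(changes)
```

Over this pool, (8), (9), and (10) have empty solution sets while (11) and (12) contain only $x = 1$, so the single reported change is the step from (10) to (11), the squaring step.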

## Those Who are Still Not Satisfied

Still there are some who are not satisfied with this explanation: “Why does this happen? What is really going on? Where do the extraneous solutions come from? What do they mean?”

One source of the problem is the square root operation itself. $\sqrt{a}$ is, by the conventional definition, the positive quantity which when squared is $a$. The reason that we have to stress the positive quantity is that there are always two real numbers that when squared equal any given positive real number. There are a few slightly different ways of making this same point. The operation of squaring a number erases the evidence of whether that number was positive or negative, so information is lost and we are not able to reverse the squaring process.

We can also phrase the phenomenon in the language of functions. Since squaring is a common and useful mathematical practice, information will often come to us squared and we’ll need an un-squaring process to unpack that information. $f(x) = x^2$, for all the reasons just mentioned, is not a one-to-one function, so strictly speaking, it is not invertible. But un-squaring is too important, so we persevere. As with all non-one-to-one functions, we first restrict the domain of $f(x) = x^2$ to $[0, \infty)$ to make it one-to-one, and then invert. The resulting inverse, $f^{-1}(x) = \sqrt{x}$, thus has a nonnegative range and so the convention that $\sqrt{a} \geq 0$ is born. So every use of the square root symbol comes with the proviso that we mean the nonnegative root, not the negative root. We inevitably lose track of this information when squaring both sides.

[Note: Students can easily lose track of these conventions. After a lot of practice solving quadratic equations, moving from $x^2 = 9$ effortlessly to $x = \pm 3$, students will often start to report that $\sqrt{9} = \pm 3$.]

The convention that we choose the positive root is totally arbitrary. In a world in which we restricted the domain of  $f(x) = x^2$ to $(-\infty, 0]$ before inverting, $\sqrt{9}$ would be $-3$. In that world, $x = 1$ is a perfectly good solution to $2\sqrt{x+8} + 5 = -1$, not extraneous at all.
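Here is a quick sketch of that alternate world. The function `negative_branch_sqrt` is purely hypothetical: it plays the role that $\sqrt{\ }$ would have if we had restricted the domain of squaring to $(-\infty, 0]$ before inverting.

```python
import math

def principal_sqrt(a):
    """The conventional square root: the nonnegative branch."""
    return math.sqrt(a)

def negative_branch_sqrt(a):
    """Hypothetical convention: the nonpositive branch."""
    return -math.sqrt(a)

def left_side(x, root):
    """Left side of 2*sqrt(x+8) + 5 = -1 under a chosen root convention."""
    return 2 * root(x + 8) + 5

conventional = left_side(1, principal_sqrt)        # 2*3 + 5
alternative = left_side(1, negative_branch_sqrt)   # 2*(-3) + 5

print(conventional, alternative)
```

Under the usual convention the left side at $x = 1$ is $11$, not $-1$; under the hypothetical one it is exactly $-1$, so $x = 1$ would be a genuine solution there.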

## A Trigonometric Equation which Yields an Extraneous Solution

For parallelism, consider the (somewhat artificial) equation:

(13) $\arccos(2x-1) = \frac{4\pi}{3}$

Like in (10), careful and observant solvers might notice that the range of the $\arccos(x)$ function is $[0, \pi]$ and correctly conclude that the equation has no solutions. But there seems to be a lot going on inside that $\arccos$ expression, so many will rush ahead and try to unpack it by “cosineing”. Indeed, since $a=b \Rightarrow \cos(a) = \cos(b)$, this seems innocent.

(14) $2x - 1 = -\frac{1}{2}$

(15) $2x = \frac{1}{2}$

(16) $x = \frac{1}{4}$

But $x = \frac{1}{4}$ is an extraneous solution, since $\arccos(-\frac{1}{2}) = \frac{2\pi}{3}$, not $\frac{4\pi}{3}$.

The explanation for this extraneous solution will be similar to the logic we used above. If $a = b$, then $\cos(a) = \cos(b)$, so if $a$ and $b$ are expressions containing a free variable $x$, any value of $x$ that makes $a = b$ true will also make $\cos(a) = \cos(b)$ true. So we will not lose any solutions by “taking the cosine of both sides”. But as the cosine function is not one-to-one, $\cos(a) = \cos(b)$ does not imply that $a = b$. So taking the cosine of both sides, just like squaring both sides, can enlarge the solution set.

The above paragraph explains why extraneous solutions could appear in the solution of (13), but maybe not why they do appear. For that, we again must look to the presence of the $\arccos$ function. Since $\cos$ is not one-to-one, we had to arbitrarily restrict its domain to $[0, \pi]$ prior to inverting. So every use of the $\arccos$ symbol comes with its own proviso that we are referring to a number in a particular interval of values. In a world in which we had restricted the domain of $\cos$ to $[\pi, 2\pi]$ prior to inverting, $x = \frac{1}{4}$ would be a perfectly good solution to $\arccos(2x-1) = \frac{4\pi}{3}$, not extraneous at all.
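The same comparison can be sketched numerically. Python's `math.acos` follows the usual convention (range $[0, \pi]$); the function `arccos_upper_branch` below is a hypothetical stand-in for the inverse of cosine restricted to $[\pi, 2\pi]$, built from the symmetry $\cos(2\pi - \theta) = \cos(\theta)$.

```python
import math

x = 0.25
target = 4 * math.pi / 3

# Conventional arccos: returns a value in [0, pi].
conventional = math.acos(2 * x - 1)

def arccos_upper_branch(v):
    """Hypothetical inverse of cos restricted to [pi, 2*pi]."""
    return 2 * math.pi - math.acos(v)

alternative = arccos_upper_branch(2 * x - 1)

print(math.isclose(conventional, target))   # conventional branch misses the target
print(math.isclose(alternative, target))    # hypothetical branch hits it
```

Both branches return angles whose cosine is $-\frac{1}{2}$; they simply disagree about which angle to report, and only the hypothetical branch reports $\frac{4\pi}{3}$.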

The above examples seem to suggest that one can avoid dealing with extraneous solutions by carefully examining one’s equations at each step. But in practice, this really isn’t possible. I saved the fun examples for the end, but as this post is already way way too long, they will have to wait for a bit later.

-Will Rose

## Thanks

Thanks to John Chase for letting me guest post on his blog. Thanks to James Key for encouraging me again and again to think about extraneous solutions.

# Proving identities – what’s your philosophy?

What happens in your classroom when you give students the following task?

Prove $1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$.

Sometimes the command is Verify or Show instead of Prove, but the intent is the same.

# Two non-examples

Here are two ways that a student might work the problem.

Method 1

$1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}$

$\sec^2{\theta}-1=\tan^2{\theta}$

$\tan^2{\theta}=\tan^2{\theta}$

Method 2

$1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$1+\sec{\theta}=\frac{\sec^2{\theta}-1}{\sec{\theta}-1}$

$1+\sec{\theta}=\frac{(\sec{\theta}-1)(\sec{\theta}+1)}{\sec{\theta}-1}$

$\sec{\theta}+1=\sec{\theta}+1$

How do you feel about these methods? In my opinion, both methods represent a fundamental misunderstanding of the prompt. Method 1 is especially grotesque, but Method 2 also leaves a lot to be desired. Let me explain. And if you think the above methods are perfectly fine, please be patient and hear me out.

This is the crux of the issue:

The prompt was to prove the statement. But if the first line of our work is the very thing we’re out to prove, then we are already assuming the thing we want to prove. We’re Begging the Question.

It’s as if someone demands, “Prove Statement X,” and we respond,

“Well, let’s first start by assuming that Statement X is true.”

This is nonsense.

# What went wrong?

So what is the proper way to engage this proof? Let’s roll back a bit.

The error in these approaches seems to stem from a desire to perform algebraic operations on both sides of an equation in the same way that you might if you were solving an equation.

When we “do algebra” and write Equation B below another Equation A without any words, we always mean that Equation A implies Equation B. That is, when we write

Equation A

Equation B

Equation C

etc…

we mean that Equation C follows from Equation B, which follows from Equation A.

Some might claim that each line should be equivalent to the last. But, again, when we “do algebra” by performing algebraic manipulations to both sides of an equation to transform it from equation A into equation B, we always mean $A\Rightarrow B$, we don’t mean $A\iff B$. Take, for example, the following algebra which results in an extraneous solution:

$\sqrt{x+2}=x$

$(\sqrt{x+2})^2=x^2$

$x+2=x^2$

$0=x^2-x-2$

$0=(x-2)(x+1)$

$x=2 \text{ or } x=-1$

In this example, each line follows from the previous; however, reversing the logic doesn’t work. But we accept that this is the usual way we do algebra ($A\Rightarrow B\Rightarrow C\Rightarrow \cdots$). Here the last line doesn’t hold because only one solution satisfies the original equation ($x=2$). Remember that our logic is still flawless, though. Our logic just says that IF $\sqrt{x+2}=x$ for a given $x$ THEN $(\sqrt{x+2})^2=x^2$.

As we move through the algebra line by line, we either preserve the solution set or increase its size. In the case above, the solution set for the original equation is {2}, and as we go to line 2 and beyond, the solution set is {2,-1}.
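To make the direction of the arrows concrete, here is a rough sketch that tests, over a finite pool of integers, whether each line implies the next and whether the next implies it back. The pool and the `implies` helper are my own assumptions; a finite check illustrates the logic but does not prove it.

```python
import math

# Hypothetical pool; starting at -2 keeps the radicand x + 2 nonnegative.
candidates = range(-2, 11)

lines = [
    ("sqrt(x+2) = x", lambda x: math.sqrt(x + 2) == x),
    ("x + 2 = x^2", lambda x: x + 2 == x**2),
    ("0 = x^2 - x - 2", lambda x: 0 == x**2 - x - 2),
    ("0 = (x-2)(x+1)", lambda x: 0 == (x - 2) * (x + 1)),
]

def implies(p, q):
    """True when every candidate satisfying p also satisfies q."""
    return all(q(x) for x in candidates if p(x))

pairs = list(zip(lines, lines[1:]))
forward = [implies(a[1], b[1]) for a, b in pairs]    # A => B for each step
backward = [implies(b[1], a[1]) for a, b in pairs]   # B => A for each step

print(forward)
print(backward)
```

Every forward implication holds on this pool, while the first backward implication fails: $x = -1$ satisfies $x + 2 = x^2$ but not $\sqrt{x+2} = x$. That is the one step where the chain is $\Rightarrow$ and not $\iff$.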

For more, James Tanton has a nice article about extraneous solutions and why they arise, which I highly recommend.

So if this is the universal way we interpret algebraic work, which is what I argue, then it is wrong to construct an argument of the form $A\Rightarrow B\Rightarrow C$ in order to prove statement A is true from premise C. The argument begs the question.

Both Method 1 and Method 2 make this mistake.

# How does a proof go again?

I want to actually make a more general statement. The argument I gave above regarding how we “do algebra” is actually how we present any sort of deductive argument. We always present such an argument in order, where later statements are supported by earlier statements.

ANY time we see a sequence of statements (not just equations) A, B, C that is being put forward as a proof, if logical connectives are missing, the mathematical community agrees that “$\Rightarrow$” is the missing logical connection.

That is, if we see the proof A,B,C as a proof of statement C from premise A, we assume that the argument really means $A\Rightarrow B\Rightarrow C$.

This is usually the interpretation in the typical two-column proof, as well. We just provide the next step with a supporting theorem/definition/axiom, but we don’t also go out of our way to say “oh, and line #7 follows from the previous lines.”

Example: Given a non-empty set $E$ with lower bound $a$ and upper bound $b$, show that $a\leq b$.

1. $E$ is non-empty and $a$ and $b$ are lower and upper bounds for $E$. (given)
2. Set $E$ contains at least one element $x$. (definition of non-empty)
3. $a\leq x$ and $x\leq b$. (definitions of lower and upper bound)
4. $a\leq b$. (transitive property of inequality)

Notice I never say that each line follows from the previous ones. And also notice that it would be a mistake to interpret the logical connectives as biconditional.

# The path of righteousness

I encourage my students to work with only ONE side of the expression and manipulate it independently, in its own little dark box, and when it comes out into the light, if it looks the same as the other side, you’ve proved the equivalence of the expressions.

For example, to show that $\log\left(\frac{1}{t-2}\right)-\log\left(\frac{10}{t}\right)=-1+\log\left(\frac{t}{t-2}\right)$ for $t>2$, I would expect this kind of work for “full credit”:

$\text{LHS }=\log\left(\frac{1}{t-2}\right)-\log\left(\frac{10}{t}\right)$

$=-\log(t-2)-\log(10)+\log(t)$

$=-\log(10)+\log(t)-\log(t-2)$

$= -1 + \log\left(\frac{t}{t-2}\right)$

$=\text{ RHS}$
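A numeric spot-check is not a proof, but it is a cheap way to catch an algebra slip before writing up the real argument. The sketch below assumes $\log$ means the base-10 logarithm (which the step $\log(10) = 1$ requires), and the sample values of $t$ are arbitrary choices of mine with $t > 2$.

```python
import math

def lhs(t):
    """Left side: log(1/(t-2)) - log(10/t), with log taken base 10."""
    return math.log10(1 / (t - 2)) - math.log10(10 / t)

def rhs(t):
    """Right side: -1 + log(t/(t-2)), with log taken base 10."""
    return -1 + math.log10(t / (t - 2))

# Hypothetical sample points, all satisfying t > 2.
samples = [2.5, 3.0, 10.0, 123.456]

agree = all(
    math.isclose(lhs(t), rhs(t), rel_tol=1e-9, abs_tol=1e-12) for t in samples
)
print(agree)
```

Agreement at a handful of points only says the identity is plausible; the one-sided manipulation above is what actually establishes it for all $t > 2$.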

Interestingly, I WOULD also accept an argument of the form $A\iff B\iff C$ as justification for conclusion A from premise C, but I would want a student to say “A is true if and only if B is true, which is true if and only if C is true.” Even though it provides a valid proof, I discourage students from using this somewhat cumbersome construction.

So let’s return to the original problem and show a few ways a student could do it correctly.

# Three examples

Method A – A direct proof by manipulating only one side

$\text{RHS}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$=\frac{\sec^2{\theta}-1}{\sec{\theta}-1}$

$=\frac{(\sec{\theta}-1)(\sec{\theta}+1)}{\sec{\theta}-1}$

$=\sec{\theta}+1$

$=1+\frac{1}{\cos{\theta}}$

$=\text{LHS}$

Method B – A proof starting with a known equality

$\tan^2{\theta}=\tan^2{\theta}$

$\sec^2{\theta}-1=\tan^2{\theta}$

$(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}$

$1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

Method C – Carefully specifying biconditional implications

$1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$\text{if and only if}$

$1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}$

$\text{if and only if}$

$(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}$

$\text{if and only if}$

$\sec^2{\theta}-1=\tan^2{\theta}$

$\text{if and only if}$

$\tan^2{\theta}=\tan^2{\theta}$

While all of these are now technically correct, I think we all prefer Method A. The other methods are cool too. But please, please, promise me you won’t use Methods 1 or 2 which I presented in my introduction.
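If students want reassurance before committing to a proof, a numeric spot-check like the following sketch can help. The sample angles are my own arbitrary picks, chosen to avoid the places where either side is undefined: $\cos\theta = 0$, and $\sec\theta = 1$ (for example $\theta = 0$, where the right side has a zero denominator).

```python
import math

def lhs(theta):
    """Left side: 1 + 1/cos(theta)."""
    return 1 + 1 / math.cos(theta)

def rhs(theta):
    """Right side: tan^2(theta) / (sec(theta) - 1)."""
    sec = 1 / math.cos(theta)
    return math.tan(theta) ** 2 / (sec - 1)

# Hypothetical sample angles (radians) where both sides are defined.
samples = [0.3, 1.0, 2.0, 4.0, 5.5]

agree = all(math.isclose(lhs(t), rhs(t), rel_tol=1e-9) for t in samples)
print(agree)
```

As with any spot-check, this only makes the identity plausible. It is also a reminder that the identity is a statement about the angles where both sides make sense.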

# In conclusion

Some might argue that the heavy criticism I’ve leveled against Methods 1 and 2 is nitpicking. But I disagree. This kind of careful reasoning is exactly the business of mathematicians. It’s not good enough to just produce “answers”; our job is to produce good reasoning. Mathematics, remember, is a sense-making discipline.

Thanks for staying with me to the end of this long-winded post. Can you tell I’ve had this conversation with a lot of students over the last ten years?

1. Dave Richeson has a similar rant with a similar thesis here.
2. This article was originally inspired by this recent post on Patrick Honner’s blog. A bunch of us fought about this topic in the comments, and in the end, Patrick encouraged me to write my own post on the subject. So here I am. Thanks for pushing me in the right direction, Mr. Honner!

# What does it mean to truly prove something?

Let me point you to the following recent blog post from Prof Keith Devlin, entitled “What is a proof, really?”

After a lifetime in professional mathematics, during which I have read a lot of proofs, created some of my own, assisted others in creating theirs, and reviewed a fair number for research journals, the one thing I am sure of is that the definition of proof you will find in a book on mathematical logic or see on the board in a college level introductory pure mathematics class doesn’t come close to the reality.

For sure, I have never in my life seen a proof that truly fits the standard definition. Nor has anyone else.

The usual maneuver by which mathematicians leverage that formal notion to capture the arguments they, and all their colleagues, regard as proofs is to say a proof is a finite sequence of assertions that could be filled in to become one of those formal structures.

It’s not a bad approach if the goal is to give someone a general idea of what a proof is. The trouble is, no one has ever carried out that filling-in process. It’s purely hypothetical. How then can anyone know that the purported proof in front of them really is a proof?

(more)

I won’t be shy in saying that I disagree with Keith Devlin. Maybe I misunderstand the subtle nuance of his argument. Maybe I haven’t done enough advanced mathematics. Please help me understand.

Devlin says that proofs created by the mathematical community (on the blackboard, and in journals) are informal and non-rigorous. I think we all agree with him on this point.

But the main point of his article seems to be that these proofs are non-rigorous and can never be made rigorous. That is, he’s suggesting that there could be holes in the logic of even the most vetted & time-tested proofs. He says that these proofs need to be filled in at a granular level, from first principles. Devlin writes, “no one has ever carried out that filling-in process.”

The trouble is, there is a whole mathematical community devoted to this filling-in process. Many high-level results have been rigorously proven going all the way back to first principles. That’s the entire goal of the metamath project. If you haven’t ever stumbled on this site, it will blow your mind. Click on the previous link, but don’t get too lost. Come back and read the rest of my post!

I’ve reread his blog post multiple times, and the articles he linked to. And I just can’t figure out what he could possibly mean by this. It sounds like Devlin thoroughly understands what the metamath project is all about, and he’s very familiar with proof-checking and mathematical logic. So he definitely isn’t writing his post out of ignorance–he’s a smart guy! Again, I ask, can anyone help me understand?

I know that a statement is only proven true relative to the axioms of the formal system. If you change your axioms, different results arise (like changing Euclid’s Fifth Postulate or removing the Axiom of Choice). And I’ve read enough about Gödel to understand the limits of formal systems. As mathematicians, we choose to make our formal systems consistent at the expense of completeness.

Is Devlin referring to one of these things?

I don’t usually make posts that are so confrontational. My apologies! I didn’t really want to post this to my blog. I would have much rather had this conversation in the comments section of Devlin’s blog. I posted two comments but neither one was approved. I gather that many other comments were censored as well.

Here’s the comment I left on his blog, which still hasn’t shown up. (I also left one small comment saying something similar.)

Prof. Devlin,

You said you got a number of comments like Steven’s. Can you approve those comments for public viewing? (one of those comments was mine!)

I think Steven’s comment has less to do with computer *generated* proofs than it does with computer *checked* proofs, like those produced by the http://us.metamath.org/ community.

There’s a big difference between the proof of the Four Color Theorem, which doesn’t really pass our “elegance” test, and the proof of $e^{i\pi}=-1$ which can be found here: http://us.metamath.org/mpegif/efipi.html

A proof like the one I just linked to is done by humans, but is so rigorous that it can be *checked* by a computer. For me, it satisfies both my hunger for truth AND my hunger to understand *why* the statement is true.

I don’t understand how the metamath project doesn’t meet your criteria for the filling in process. I’ll quote you again, “The trouble is, no one has ever carried out that filling-in process. It’s purely hypothetical. How then can anyone know that the purported proof in front of them really is a proof?”

What is the metamath project, if not the “filling in” process?

John

If anyone wants to continue this conversation here at my blog, uncensored, please feel free to contribute below :-). Maybe Keith Devlin will even stop by!

# Math on Quora

I may not have been very active on my blog recently (sorry for the three-month hiatus), but it’s not because I haven’t been actively doing math. And in fact, I’ve also found other outlets to share about math.

Have you used Quora yet?

Quora, at least in principle, is a grown-up version of Yahoo Answers. It’s like Stack Overflow, but more philosophical and less technical. You’ll (usually) find thoughtful questions and thoughtful answers. Like most question-answer sites, you can ‘up-vote’ an answer, so the best answers generally appear at the top of the feed.

The best part about Quora is that it somehow attracts really high quality respondents, including: Ashton Kutcher, Jimmy Wales, Jeremy Lin, and even Barack Obama. Many other mayors, famous athletes, CEOs, and the like, seem to darken the halls of Quora. For a list of famous folks on Quora, check out this Quora question (how meta!).

Also contributing quality answers is none other than me. It’s still a new space for me, but I’ve made my foray into Quora in a few small ways. Check out the following questions for which I’ve contributed answers, and give me some up-votes, or start a comment battle with me or something :-).

And here are a few posts where my comments appear:

# A TOK Lecture on Mathematical Thinking

Students in our International Baccalaureate program here at RM are required to take a core class called Theory of Knowledge (TOK) which is kind of a philosophy class for high school students–or, at least the epistemology piece.

In some schools, this course is taught by math teachers. Here at RM, no math teachers currently teach TOK, which is too bad. So I volunteered to put together a guest lecture on Mathematical Thinking. I’ve tried it out once with a TOK class and I gave the lecture for some of my math teacher colleagues today after school. I plan to give the lecture to more TOK classes this spring.

I thought I’d share it with the MTBoS as well, so here it is. Feel free to read, comment on, or borrow my materials. I think other IB math teachers would especially benefit:

# Why Calculus still belongs at the top

AP Calculus is often seen as the pinnacle of the high school mathematics curriculum*–or the “summit” of the mountain as Professor Arthur Benjamin calls it. Benjamin gave a compelling TED talk in 2009 making the case that this is the wrong summit and the correct summit should be AP Statistics. The talk is less than 3 minutes, so if you haven’t yet seen it, I encourage you to check it out here and my first blog post about it here.

I love Arthur Benjamin and he makes a lot of good points, but I’d like to supply some counter-points in this post, which I’ve titled “Why Calculus still belongs at the top.”

Full disclosure: I teach AP Calculus and I’ve never taught AP Statistics. However I DO know and love statistics–I just took a grad class in Stat and thoroughly enjoyed it. But I wouldn’t want to teach it to high school students. Here’s why: For high school students, non-Calculus based Statistics seems more like magic than mathematics.

When I teach math I try, to the extent that it’s possible, to never provide unjustified statements or unproven claims. (Of course this is not always possible, but I try.) For example, in my Algebra 2 class I derive the quadratic formula. In my Precalculus class, I derive all the trig identities we ask the students to know. And in my Calculus class, I “derive” the various rules for differentiation or integration. I often tell the students that copying down the proof is completely optional and the proof will not be tested–“just sit back and relax and enjoy the show!”

But such an approach to mathematical thinking can rarely be applied in a high school Statistics course because statistics rests SO heavily on calculus and so the ‘proofs’ are inaccessible. I’d like to make a startling claim: I claim that 99.99% of AP Statistics students and 99% of AP Statistics teachers cannot even give the function-rule for the normal distribution.

Image used by permission from Interactive Mathematics. Click the image to go there and learn all about the normal distribution!

In what other math class would you talk about a function ALL YEAR and never give its rule? The normal distribution is the centerpiece (literally!) of the Statistics curriculum. And yet we never even tell them its equation nor where it comes from. That should be some kind of mathematical crime. We might as well call the normal distribution the “magic curve.”
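For reference, here is the rule in question, written as the standard density of a normal distribution. The symbols $\mu$ (mean) and $\sigma$ (standard deviation) are the usual textbook names, not notation taken from this post:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Setting $\mu = 0$ and $\sigma = 1$ gives the standard normal curve that the tables in the back of a stat book are built from.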

Furthermore, a kid can go through all of AP Statistics and never think about integration, even though that’s what they’re doing every single time they look up values in those stat tables in the back of the book.
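To make the link between a table lookup and integration concrete, here is a minimal Python sketch. It assumes the table reports $P(Z \le z)$ for a standard normal variable, which is one common layout. The function names and the midpoint-rule step count are illustrative choices, not anything from a particular textbook.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standard_normal_cdf(z, steps=10_000):
    """Approximate P(Z <= z) by integrating the density from 0 to z.

    Uses the midpoint rule; the area to the left of 0 is exactly 0.5
    by symmetry, so only the piece between 0 and z is integrated.
    """
    h = z / steps
    area = sum(normal_pdf((k + 0.5) * h) for k in range(steps)) * h
    return 0.5 + area

def exact_cdf(z):
    """Closed form in terms of the error function, for comparison."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    for z in (0.5, 1.0, 1.96):
        print(z, round(standard_normal_cdf(z), 4), round(exact_cdf(z), 4))
```

Each row of a printed z-table is, in effect, one of these integrals rounded to four decimal places.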

I agree with Arthur Benjamin that statistics is more applicable to the ‘real world’ of most of these kids’ lives. But I would argue that application is not the most important reason we teach mathematics. The most important thing we teach kids is mathematical thinking.

The same thing is true of every other high school subject area. Will most students ever need to know particular historical facts? No. We aim to train them in historical thinking. What about balancing an equation in Chemistry? Or dissecting a frog? They’ll likely never do that again, but they’re getting a taste of what scientists do and how they think. In general, two of our aims as secondary educators are to (1) provide a liberal education for students so they can engage in intelligent conversations with all people in all subject areas in the adult world and (2) to open doors for a future career in a more narrow field of study.

So where does statistics fit into all of this? I think it’s still worth teaching, of course. It’s very important and has real world meaning. But the value I find in teaching statistics feels VERY different than the value I find in teaching every other math class. Like I said before, it feels a bit more like magic than mathematics.**

I argue that Calculus does a better job of training students to think mathematically.

But maybe that’s just how I feel. Maybe we can get Art Benjamin to stop by and weigh in!

.

….

*In our school, and in many other schools, we actually have many more class options beyond Calculus for those students who take Calculus in their Sophomore or Junior year and want to be exposed to even more math.

** Many parts of basic Probability and Statistics can be taught with explanations and proof, namely the discrete portions–and this should be done. But working with continuous distributions can only be justified using Calculus.

# 87th Carnival of Mathematics

The 87th Carnival of Mathematics has arrived!! Here’s a simple computation for you:

What is the sum of the squares of the first four prime numbers?

That’s right, it’s $2^2 + 3^2 + 5^2 + 7^2 = 4 + 9 + 25 + 49 = 87$.

Good job. Now, onto the carnival. This is my first carnival, so hopefully I’ll do all these posts justice. We had lots of great submissions, so I encourage you to read through this with a fine-toothed comb. Enjoy!

# Rants

Here’s a post (rant) from Andrew Taylor regarding the coverage from the BBC and the Guardian on the Supermoon that occurred in March 2011. NASA reports the moon as being 14% larger and 30% brighter, but Andrew disagrees. Go check out the post, and join the conversation.

Have you ever heard someone abuse the phrase “exponentially better”? I know I have. One incorrect usage occurs when someone makes the claim that something is “exponentially better” based on only two data points. Rebecka Peterson has some words for you here, if you’re the kind of person who says this!

# Physics and Science-flavored

Frederick Koh submitted Problem 19: Mechanics of Two Separate Particles Projected Vertically From Different Heights to the carnival. It’s a fun projectile motion question which would be appropriate for a Precalculus classroom (or Calculus). I like the problem, and I think my students would like it too.

John D. Cook highlights a question you’ve probably heard before: Should you walk or run in the rain? An active discussion is going on in the comments section. It’s been discussed in many other places too, including twice on Mythbusters. (I feel like I read an article in an MAA or NCTM magazine on this topic once, as well. Anyone remember that?)

Murray Bourne submitted this awesome post about modeling fish stocks. Murray says his post is an “attempt to make mathematical modeling a bit less scary than in most textbooks.” I think he achieves his goal in this thorough development of a mathematical model for sustainable fisheries (see the graph above for one of his later examples of a stable solution under lots of interesting constraints). If I taught differential equations, I would absolutely use his examples.

Last week I highlighted this new physics blog, but I wanted to point you there again: Go check out Five Minute Physics! A few more videos have been posted, and also a link to this great video about the physics of a dropping Slinky (see above).

# Statistics, Probability, & Combinatorics

Mr. Gregg analyzes European football using the Poisson distribution in his post, The Table Never Lies. I liked how much real world data he brought to the discussion. And I also liked that he admitted when his model worked and when it didn’t–he lets you in on his own mathematical thought process. As you read this post, you too will find yourself thinking out loud with Mr. Gregg.

Card Colm has written this excellent post that will help you wrap your mind around the number of arrangements of cards in a deck. It’s a simple high school-level topic, but he really puts it into perspective:

the number of possible ways to order or permute just the hearts is 13!=6,227,020,800. That’s about what the world population was in 2002. So back then if somebody could have made a list of all possible ways to arrange those 13 cards in a row, there would have been enough people on the planet for everyone to get one such permutation.

I think it’s good to remind ourselves that whenever we shuffle the deck, we can be almost certain that our arrangement has never been created before (since $52!\approx 8\times 10^{67}$ arrangements are possible). Wow!
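Both counts are easy to check with exact integer arithmetic. This is a small Python sketch; the variable names are just illustrative.

```python
import math

# Orderings of the 13 hearts, and of a full 52-card deck.
hearts_orderings = math.factorial(13)
deck_orderings = math.factorial(52)

if __name__ == "__main__":
    print(f"13! = {hearts_orderings:,}")
    print(f"52! is about {deck_orderings:.3e}")
    print(f"52! has {len(str(deck_orderings))} digits")
```

Python integers have arbitrary precision, so `deck_orderings` is the exact value of 52!, not a floating-point approximation.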

Alex is looking for “random” numbers by simply asking people. Go contribute your own “random” number here. Can’t wait to see the results!

Quick! Think of an example of a real-world bimodal distribution! Maybe you have a ready example if you teach stat, but here’s a really nice example from Michael Lugo: Book prices. Before you read his post, you should make a guess as to why the book prices he looked at are bimodal (see histogram above).

# Philosophy and History of Math

Mike Thayer just attended the NCTM conference in Philadelphia and brings us a thoughtful reaction in his post, The Learning of Mathematics in the 21st Century. Mike wrote this post because he had been left with “an ambivalent feeling” after the conference. He wants to “engage others in mathematics education in discussions about ways to improve what we do outside of the frameworks that are being imposed on us by those outside of our field.” As a secondary educator, I agree with Mike completely and really enjoyed his post. Mike isn’t satisfied with where education is going. In his post, he writes, “We are leaping ahead into the unknown with new educational models, and we never took the time to get the old ones right.”

Edmund Harriss asks Have we ever lost mathematics? He gives a nice recap of foundational crises throughout the history of mathematics, and wonders, ultimately, if we’ve actually lost any mathematics. There’s also a short discussion in the comments section which I recommend to you.

Peter Woit reflects on 25 Years of Topological Quantum Field Theory. Maybe if you have a degree in math and physics you might appreciate this post. It went over my head a bit, I’m afraid!

# Book Reviews

In this post, Matt reviews a 2012 book release, Who’s #1, by Amy N. Langville and Carl D. Meyer. The book discusses the ranking systems used by popular websites like Amazon or Netflix. His review is thorough and balanced–Matt has good things to say about the book, but also delivers a bit of criticism for their treatment of Arrow’s Impossibility Theorem. Thanks for this contribution, Matt! [edit: Thanks MATT!]

Shecky R reviews David Berlinski’s 2011 book, One, Two, Three…Absolutely Elementary Mathematics, in his Brief Berlinski Book Blurb. I’m not sure his review is an *endorsement*. It sounds like a book that only a small eclectic crowd will enjoy.

# Uncategorized…

Peter Rowlett submitted this post about linear programming and provides a link to an interactive problem-solving environment.

Peter Rowlett also weighs in on the recent news about a German high school boy who has (reportedly) solved an open problem. Many news sources have picked up on this, and I’ve only followed the news from a distance. So I was grateful for Peter’s comments–he questions the validity of the news in his recent post “Has schoolboy genius solved problems that baffled mathematicians for centuries?” His comments in another recent post are perhaps even more important though–Peter encourages us to think of ways we can remind our students that lots of open problems still exist, and “Mathematics is an evolving, alive subject to which you could contribute.”

Jess Hawke IS *Heptagrin Girl*

Here’s a fun-loving post about Heptagrins, and all the crazy craft projects you can do with them. Don’t know what a Heptagrin is? Neither did I. But go check out Jess Hawke’s post and she’ll tell you all about them!

Any Lewis Carroll lovers out there? Julia Collins submitted a post entitled “A Night in Wonderland” about a Lewis Carroll-themed night at the National Museum of Scotland. She writes, “Other people might be interested in the ideas we had and also hearing about what a snark is and why it’s still important.” When you check out this post, you’ll not only learn about snarks but also about creating projective planes with your sewing machine. Cool!

Mike Croucher over at Walking Randomly gives a shout out to the free software Octave, which is a MATLAB replacement. Check out his post, here. MATLAB is ridiculously expensive, and so the world needs an alternative like Octave. He provides links to the Kickstarter campaign–and Mike has backed the project himself. I too believe in Octave. I’ve used it a few times for my grad work and I’ve been very grateful for a free alternative to MATLAB.

# The End

Okay, that’s it for the 87th Carnival of Mathematics. Hope you enjoyed all the posts! Sorry it took me a couple days to post it–there was a lot to digest :-).

If you missed the previous carnival (#86), you can find it here. The next carnival (#88) will be hosted by Christian at checkmyworking.com. For a complete listing of all the carnivals, and more information & FAQ about the carnivals, follow this link.

Cheers!

# Pi R Squared

[Another guest blog entry by Dr. Gene Chase.]

You’ve heard the old joke.

Teacher: Pi R Squared.
Student: No, teacher, pie are round. Cornbread are square.

The purpose of this Pi Day note two days early is to explain why $\pi$ is indeed a square.

The customary definition of $\pi$ is the ratio of a circle’s circumference to its diameter. But mathematicians are accustomed to defining things in two different ways, and then showing that the two ways are in fact equivalent. Here’s a first example appropriate for my story.

How do we define the function $\exp(z) = e^z$ for complex numbers z? First we define $a^b$ for integers $a > 0$ and b. Then we extend it to rationals, and finally, by requiring that the resulting function be continuous, to reals. As it happens, the resulting function is infinitely differentiable. In fact, if we choose a to be e, the limit $\lim_{n\to\infty} (1 + \frac{1}{n})^n$, then not only is $e^x$ infinitely differentiable, but it is its own derivative. Can we extend the definition of $\exp(z)$ to complex numbers z? Yes, in an infinite number of ways, but if we want the reasonable assumption that it too is infinitely differentiable, then there is only one way to extend $\exp(z)$.
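Two of the claims in that paragraph can be checked numerically: that $(1 + \frac{1}{n})^n$ approaches e, and that $e^x$ is its own derivative. The following is a rough Python sketch. The helper names and the step size `h` are illustrative assumptions, and a numerical check is of course not a proof.

```python
import math

def compound(n):
    """The n-th term of the sequence (1 + 1/n)^n, which converges to e."""
    return (1.0 + 1.0 / n) ** n

def central_difference(f, x, h=1e-5):
    """Estimate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    for n in (1, 10, 1_000, 1_000_000):
        print(n, compound(n))
    print("math.e =", math.e)
    # The slope of exp at x should match the value of exp at x.
    for x in (0.0, 1.0, 2.0):
        print(x, central_difference(math.exp, x), math.exp(x))
```

The convergence is slow: the gap between `compound(n)` and e shrinks roughly like e/(2n).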

That’s amazing!

The resulting function $\exp(z)$ obeys all the expected laws of exponents. And we can prove that the function when restricted to reals has an inverse for the entire real number line. So define a new function $\ln(x)$ which is the inverse of $\exp(x)$. Then we can prove that $\ln(x)$ obeys all of the laws of logarithms.

Or we could proceed in the reverse order instead. Define $\ln(x) = \int_1^x \frac{1}{t} dt$. It has an inverse, which we can call $\exp(x)$, and then we can define $a^b$ as $\exp ( b \ln (a))$. We can prove that $\exp(1)$ is the above-mentioned limit, and when this new definition of $a^b\,$ is restricted to the appropriate rationals or reals or integers, we have the same function of two variables a and b as above. $\ln(x)$ can also be extended to the complex domain, except the result is no longer a function, or rather it is a function from complex numbers to sets of complex numbers. All the numbers in a given set differ by some integer multiple of

[1] $2 \pi i$.

With either definition of $\exp(z)$, Euler’s famous formula can be proven:

[2] $\exp(\pi i) + 1 = 0$.

But where’s the circle that gives rise to the $\pi$ in [1] and [2]? The answer is easy to see if we establish another formula to which Euler’s name is also attached:

[3] $\exp(i z) = \cos (z) + i \sin(z)$.

Thus complex numbers unify two of the most frequent natural phenomena: exponential growth and periodic motion. In the complex plane, the exponential is a circular function.
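Euler’s formula, $\exp(iz) = \cos z + i \sin z$, and the $2 \pi i$ periodicity behind [1] can both be spot-checked with Python’s standard `cmath` module. The helper name `euler_gap` is made up for this sketch.

```python
import cmath
import math

def euler_gap(z):
    """Distance between exp(i z) and cos(z) + i sin(z) for real z."""
    return abs(cmath.exp(1j * z) - complex(math.cos(z), math.sin(z)))

if __name__ == "__main__":
    for z in (0.0, 0.7, math.pi / 2, 3.0):
        print(z, euler_gap(z))
    # Formula [2]: exp(pi i) + 1 is zero up to rounding error.
    print(abs(cmath.exp(1j * math.pi) + 1))
    # Adding 2 pi i to the exponent lands on the same point of the circle.
    w = 0.3 + 0.4j
    print(abs(cmath.exp(w + 2j * math.pi) - cmath.exp(w)))
```

Every printed gap should be on the order of floating-point rounding error, not exactly zero.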

That’s amazing!

Here’s a second example appropriate for my story. Define the function on integers $\text{factorial (n)} = n!$ in the usual way. Now ask whether there is a way to extend it to (some of) the complex plane, so that we can take the factorial of a complex number. There is, and as with $\exp(z)$, there is only one way if we require that the resulting function be infinitely differentiable. The resulting function is (almost) called Gamma, written $\Gamma$. I say almost, because the function that we want has the following property:

[4] $\Gamma (z + 1) = z!$

Obviously, we’d like to stay away from negative values on the real line, where the meaning of (–5)! is not at all clear. In fact, if we stay in the half-plane where complex numbers have a positive real part, we can define $\Gamma$ by an integral which, via [4], agrees with the factorial function for positive integer values of z:

[5] $\Gamma (z) = \int_0^\infty \exp(-t) t^{z - 1} dt$.

If we evaluate $\Gamma (\frac{1}{2})$ we discover that the result is $\sqrt{\pi}$.

In other words,

[6] $\pi = \Gamma(\frac{1}{2})^2$.

Pi are indeed square.

That’s amazing!
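Python’s standard library has a real-valued gamma function, `math.gamma`, so the claim is easy to test numerically. This is a quick sketch; `gamma_matches_factorial` is a made-up helper that checks the shift property $\Gamma(n + 1) = n!$ for small n.

```python
import math

def gamma_matches_factorial(n):
    """True if Gamma(n + 1) equals n! up to floating-point tolerance."""
    return math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=1e-12)

# Gamma(1/2) squared, to compare against pi.
gamma_half_squared = math.gamma(0.5) ** 2

if __name__ == "__main__":
    print(all(gamma_matches_factorial(n) for n in range(10)))
    print(gamma_half_squared, math.pi)
```

The two printed numbers agree to within rounding error, as [6] predicts.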

I suspect that the $\pi$ arises because there is an exponential function in the definition of $\Gamma$, but in other problems involving $\pi$ it’s harder to find where the $\pi$ comes from. Euler’s Basel problem is a good case in point. There are many good proofs that

$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + ... = \frac{\pi^2}{6}$

One proof uses trigonometric series, so you shouldn’t be surprised that $\pi$ shows up there too.
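The sum converges slowly, which a few partial sums make visible. Here is a short Python sketch; the function name `basel_partial` is illustrative.

```python
import math

def basel_partial(n):
    """Sum of 1/k^2 for k = 1..n."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))

if __name__ == "__main__":
    target = math.pi ** 2 / 6
    for n in (10, 1_000, 100_000):
        s = basel_partial(n)
        print(n, s, target - s)
```

The remaining gap after n terms is roughly 1/n, so each extra correct digit costs about ten times as many terms.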

$\pi$ comes up in probability in Buffon’s needle problem because the needle is free to land with any angle from north.
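A small Monte Carlo simulation shows where the angle enters. This sketch assumes the classic setup, which the post does not spell out: parallel lines a distance `spacing` apart and a needle of length `length` no longer than the spacing, for which the crossing probability is $\frac{2 \cdot \text{length}}{\pi \cdot \text{spacing}}$. All names and the seed are illustrative. Note that the code uses $\pi$ to draw a uniform angle, so this is a demonstration of the relationship, not an independent way to compute $\pi$.

```python
import math
import random

def estimate_pi_buffon(throws, length=1.0, spacing=1.0, seed=0):
    """Estimate pi from simulated needle drops (requires length <= spacing)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0.0, spacing / 2.0)
        # Acute angle between the needle and the lines: this is the "circle".
        angle = rng.uniform(0.0, math.pi / 2.0)
        if center <= (length / 2.0) * math.sin(angle):
            hits += 1
    # P(cross) = 2 * length / (pi * spacing), solved for pi.
    return 2.0 * length * throws / (spacing * hits)

if __name__ == "__main__":
    print(estimate_pi_buffon(200_000))
```

With a few hundred thousand throws the estimate typically lands within about 0.01 of $\pi$, though the exact value depends on the seed.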

Can you think of a place where $\pi$ occurs, but you cannot find the circle?

George Lakoff and Rafael Núñez have written a controversial book that bolsters the argument that you won’t find any such examples: Where Mathematics Comes From. But Platonist that I am, I maintain that there might be such places.

# The Important Theorems Are the Beautiful Ones

Dr. Gene Chase guest blog author here again.

What makes a math theorem important?

The usual answer is that it is either beautiful or useful. If like me you think that being useful is a beautiful thing, then important theorems are the beautiful ones.

But what makes a theorem beautiful? For example, why is the Theorem of Pythagoras widely regarded as beautiful: $a^2 + b^2 = c^2$ and a, b, and c are not 0 if and only if a, b, and c are the sides of a right triangle? (OK, break into small groups and discuss this among yourselves! An answer appears at the bottom of this post.)

But the theorem 1223334444 = 1223334443 + 1 is not beautiful, won’t you agree?

If the theorem is geometric, we can appeal to visual beauty. For example, three circles pairwise tangent have a beautiful property that is animated here.

But beautiful theorems do not have to be geometric. Numbers are beautiful. For example, Euclid’s theorem that there are an infinity of primes is beautiful. No one has been able to draw a beautiful picture about that, although people have tried from astronomer and mathematician Eratosthenes in 200 BC to science fiction writer and mathematician Stanislaw Ulam in 1963.

For \$15 you can have a mathematical theorem named after you. But I can guarantee that it won’t be beautiful. So if you want a theorem named after you, give Mr. Chase the \$15 instead and he’ll find one for you. Don’t use 1223334444 = 1223334443 + 1. I claim that as “Dr. Gene Chase’s theorem.”

Answer to discussion question above: Most folks say that a beautiful theorem has to be “deep,” which is just a metaphor for “having many connections to many other things.” For example, the Theorem of Pythagoras has to do with areas, not squares specifically. The semicircle on the hypotenuse of a right triangle has an area equal to the sum of the areas of the semicircles on the adjacent sides. And so for any three similar figures.
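The semicircle version can be worked out in one line. Taking a and b as the legs and c as the hypotenuse (the usual labels, assumed here), a semicircle whose diameter is a side of length s has area $\frac{1}{2}\pi\left(\frac{s}{2}\right)^2 = \frac{\pi}{8}s^2$, so

$$\frac{\pi}{8}a^2 + \frac{\pi}{8}b^2 = \frac{\pi}{8}\left(a^2 + b^2\right) = \frac{\pi}{8}c^2.$$

The same argument works for any family of similar figures: the area of the figure built on a side of length s is $k s^2$ for some constant k that depends only on the shape, and multiplying $a^2 + b^2 = c^2$ through by k gives the result.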

Do you remember the joy that you felt when you first learned that two of your friends are also friends of each other? That’s the joy that a mathematician feels when she discovers that the Theorem of Pythagoras and the Theorem of Euclid are intimate with each other. But I’ll leave that connection to another post.

Math is about surprising connections. Which is to say, it’s about beauty.