Proving identities – what’s your philosophy?

What happens in your classroom when you give students the following task?

Prove 1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}.

Sometimes the command is Verify or Show instead of Prove, but the intent is the same.


Two non-examples

Here are two ways that a student might work the problem.

Method 1

1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}

\sec^2{\theta}-1=\tan^2{\theta}

\tan^2{\theta}=\tan^2{\theta}

Method 2

1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

1+\sec{\theta}=\frac{\sec^2{\theta}-1}{\sec{\theta}-1}

1+\sec{\theta}=\frac{(\sec{\theta}-1)(\sec{\theta}+1)}{\sec{\theta}-1}

\sec{\theta}+1=\sec{\theta}+1

How do you feel about these methods? In my opinion, both methods represent a fundamental misunderstanding of the prompt. Method 1 is especially grotesque, but Method 2 also leaves a lot to be desired. Let me explain. And if you think the above methods are perfectly fine, please be patient and hear me out.

This is the crux of the issue:

The prompt was to prove the statement. But if the first line of our work is the very thing we’re out to prove, then we are already assuming the thing we want to prove. We’re Begging the Question.

It’s as if someone demands,

“Prove Statement X, please!”

and we reply,

“Well, let’s first start by assuming that Statement X is true.”

This is nonsense.

What went wrong?

So what is the proper way to engage this proof? Let’s roll back a bit.

The error in these approaches seems to stem from a desire to perform algebraic operations on both sides of an equation in the same way that you might if you were solving an equation.

When we “do algebra” and write Equation B below Equation A without any connecting words, we mean that Equation A implies Equation B. That is, when we write

Equation A

Equation B

Equation C

etc…

we mean that Equation C follows from Equation B, which follows from Equation A.

Some might claim that each line should be equivalent to the last. But, again, when we “do algebra” by performing algebraic manipulations on both sides of an equation to transform it from Equation A into Equation B, we always mean A\Rightarrow B; we don’t mean A\iff B. Take, for example, the following algebra, which produces an extraneous solution:

\sqrt{x+2}=x

(\sqrt{x+2})^2=x^2

x+2=x^2

0=x^2-x-2

0=(x-2)(x+1)

x=2 \text{ or } x=-1

In this example, each line follows from the previous one; however, reversing the logic doesn’t work, because squaring both sides is not a reversible step. But we accept that this is the usual way we do algebra (A\Rightarrow B\Rightarrow C\Rightarrow \cdots). Here the last line doesn’t fully hold because only one solution satisfies the original equation (x=2). Remember that our logic is still flawless, though. Our logic just says that IF \sqrt{x+2}=x for a given x, THEN (\sqrt{x+2})^2=x^2.

As we move through the algebra line by line, we either preserve the solution set or increase its size. In the case above, the solution set for the original equation is {2}, and as we go to line 2 and beyond, the solution set is {2,-1}.
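The way the solution set grows can be checked mechanically. Here’s a small sketch in Python (using sympy, which I’m assuming is available) that solves the squared equation and then filters the candidates through the original one:

```python
import sympy as sp

x = sp.symbols('x')

# Solve the squared equation x + 2 = x^2 from the work above.
candidates = sorted(sp.solve(x + 2 - x**2, x))
print(candidates)  # [-1, 2]

# Keep only the candidates that satisfy the ORIGINAL equation sqrt(x + 2) = x.
survivors = [c for c in candidates if sp.sqrt(c + 2) == c]
print(survivors)  # [2]
```

The squared equation has solution set {2, -1}, while the original has only {2}, exactly as described above.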

For more, James Tanton has a nice article about extraneous solutions and why they arise, which I highly recommend.

So if this is the universal way we interpret algebraic work, which is what I argue, then it is wrong to construct an argument of the form A\Rightarrow B\Rightarrow C, where C is a known truth, and claim that it proves statement A. The argument begs the question.

Both Method 1 and Method 2 make this mistake.

 

How does a proof go again?

I want to actually make a more general statement. The argument I gave above regarding how we “do algebra” is actually how we present any sort of deductive argument. We always present such an argument in order, where later statements are supported by earlier statements.

ANY time we see a sequence of statements (not just equations) A, B, C that is being put forward as a proof, if logical connectives are missing, the mathematical community agrees that “\Rightarrow” is the missing logical connection.

That is, if we see the proof A,B,C as a proof of statement C from premise A, we assume that the argument really means A\Rightarrow B\Rightarrow C.

This is usually the interpretation in the typical two-column proof, as well. We just provide the next step with a supporting theorem/definition/axiom, but we don’t also go out of our way to say “oh, and line #7 follows from the previous lines.”

Example: Given a non-empty set E with lower bound a and upper bound b, show that a\leq b.

1. E is non-empty and a and b are lower and upper bounds for E. (given)
2. Set E contains at least one element x. (definition of non-empty)
3. a\leq x and x\leq b. (definitions of lower and upper bound)
4. a\leq b. (transitive property of inequality)

Notice I never explicitly say that each line follows from the previous ones; the implication is understood. And also notice that it would be a mistake to interpret the logical connectives as biconditional.

The path of righteousness

I encourage my students to work with only ONE side of the expression and manipulate it independently, in its own little dark box, and when it comes out into the light, if it looks the same as the other side, you’ve proved the equivalence of the expressions.

For example, to show that \log\left(\frac{1}{t-2}\right)-\log\left(\frac{10}{t}\right)=-1+\log\left(\frac{t}{t-2}\right) for t>2, I would expect this kind of work for “full credit”:

\text{LHS }=\log\left(\frac{1}{t-2}\right)-\log\left(\frac{10}{t}\right)

=-\log(t-2)-\log(10)+\log(t)

=-\log(10)+\log(t)-\log(t-2)

= -1 + \log\left(\frac{t}{t-2}\right)

=\text{ RHS}
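A numerical spot-check is no substitute for the derivation above, but it’s a quick way to catch algebra slips. Here’s a sketch in plain Python (the logs in the identity are base 10):

```python
import math

def lhs(t):
    return math.log10(1 / (t - 2)) - math.log10(10 / t)

def rhs(t):
    return -1 + math.log10(t / (t - 2))

# The two sides agree for several sample values of t > 2.
for t in (2.5, 3.0, 10.0, 1000.0):
    assert abs(lhs(t) - rhs(t)) < 1e-12
```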

Interestingly, I WOULD also accept an argument of the form A\iff B\iff C as justification for conclusion A from premise C, but I would want a student to say “A is true if and only if B is true, which is true if and only if C is true.” Even though it provides a valid proof, I discourage students from using this somewhat cumbersome construction.

So let’s return to the original problem and show a few ways a student could do it correctly.

Three examples

Method A – A direct proof by manipulating only one side

\text{LHS}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

=\frac{\sec^2{\theta}-1}{\sec{\theta}-1}

=\frac{(\sec{\theta}-1)(\sec{\theta}+1)}{\sec{\theta}-1}

=\sec{\theta}+1

=\text{RHS}
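Again, a quick numerical sanity check can back up the symbolic work. A sketch in plain Python, testing sample angles where both sides are defined (\cos\theta\neq 0 and \sec\theta\neq 1):

```python
import math

def lhs(theta):
    return 1 + 1 / math.cos(theta)

def rhs(theta):
    sec = 1 / math.cos(theta)
    return math.tan(theta) ** 2 / (sec - 1)

# Sample angles avoiding cos(theta) = 0 and sec(theta) = 1.
for theta in (0.3, 1.0, 2.0, 4.0):
    assert abs(lhs(theta) - rhs(theta)) < 1e-9
```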

Method B – A proof starting with a known equality

\tan^2{\theta}=\tan^2{\theta}

\sec^2{\theta}-1=\tan^2{\theta}

(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}

1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

Method C – Carefully specifying biconditional implications

1+\frac{1}{\cos{\theta}}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

\text{if and only if}

1+\sec{\theta}=\frac{\tan^2{\theta}}{\sec{\theta}-1}

\text{if and only if}

(\sec{\theta}-1)(1+\sec{\theta})=\tan^2{\theta}

\text{if and only if}

\sec^2{\theta}-1=\tan^2{\theta}

\text{if and only if}

\tan^2{\theta}=\tan^2{\theta}

While all of these are now technically correct, I think we all prefer Method A. The other methods are cool too. But please, please, promise me you won’t use Methods 1 or 2 which I presented in my introduction.

In conclusion

Some might argue that the heavy criticism I’ve leveled against Methods 1 and 2 is nitpicking. But I disagree. This kind of careful reasoning is exactly the business of mathematicians. It’s not good enough to just produce “answers”; our job is to produce good reasoning. Mathematics, remember, is a sense-making discipline.

Thanks for staying with me to the end of this long-winded post. Can you tell I’ve had this conversation with a lot of students over the last ten years?

Further reading

  1. Dave Richeson has a similar rant with a similar thesis here.
  2. This article was originally inspired by this recent post on Patrick Honner’s blog. A bunch of us fought about this topic in the comments, and in the end, Patrick encouraged me to write my own post on the subject. So here I am. Thanks for pushing me in the right direction, Mr. Honner!

 

Improper integrals debate

Here’s a simple Calc 1 problem:

Evaluate  \int_{-1}^1 \frac{1}{x}dx

Before you read any of my own commentary, what do you think? Does this integral converge or diverge?

image from illuminations.nctm.org

Many textbooks would say that it diverges, and I claim this is true as well. But where’s the error in this work?

\int_{-1}^1 \frac{1}{x}dx = \lim_{a\to 0^+}\left[\int_{-1}^{-a}\frac{1}{x}dx+\int_a^{1}\frac{1}{x}dx\right]

= \lim_{a\to 0^+}\left[\ln(a)-\ln(a)\right]=\boxed{0}

Did you catch any shady math? Here’s another equally wrong way of doing it:

\int_{-1}^1 \frac{1}{x}dx = \lim_{a\to 0^+}\left[\int_{-1}^{-a}\frac{1}{x}dx+\int_{2a}^{1}\frac{1}{x}dx\right]

= \lim_{a\to 0^+}\left[\ln(a)-\ln(2a)\right]=\boxed{\ln{\frac{1}{2}}}

This isn’t any more shady than the last example. The change in the lower limit of integration in the second piece of the integral from a to 2a is not a problem, since 2a approaches zero if a does. So why do we get two values that disagree? (In fact, we could concoct an example that evaluates to ANY number you like.)

Okay, finally, here’s the “correct” work:

\int_{-1}^1 \frac{1}{x}dx = \lim_{a\to 0^-}\left[\int_{-1}^{a}\frac{1}{x}dx\right]+\lim_{b\to 0^+}\left[\int_b^{1}\frac{1}{x}dx\right]

= \lim_{a\to 0^-}\left[\ln|a|\right]+\lim_{b\to 0^+}\left[-\ln|b|\right]

But notice that we can’t actually resolve this last expression, since the first limit is -\infty, the second is \infty, and the overall expression has the indeterminate form \infty - \infty. In our very first approach, we assumed the limit variables a and b were the same. In the second approach, we let b=2a. But one assumption isn’t necessarily better than another. So we claim the integral diverges.
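The dependence on how the two endpoints are coupled can be seen directly in the limits themselves. Here’s a sketch using sympy (assumed available):

```python
import sympy as sp

a, b = sp.symbols('a b', positive=True)

# Coupled endpoints -a and a: ln(a) - ln(a) is identically 0, so the limit is 0.
print(sp.limit(sp.log(a) - sp.log(a), a, 0, '+'))      # 0

# Coupled endpoints -a and 2a: ln(a) - ln(2a) is the constant -ln(2).
print(sp.limit(sp.log(a) - sp.log(2 * a), a, 0, '+'))  # -log(2)

# Uncoupled, each one-sided piece diverges on its own,
# leaving the indeterminate form -oo + oo.
print(sp.limit(sp.log(a), a, 0, '+'))   # -oo
print(sp.limit(-sp.log(b), b, 0, '+'))  # oo
```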

All that being said, we still intuitively feel like this integral should have the value 0 rather than something else like \ln\frac{1}{2}. For goodness’ sake, it’s symmetric about the origin!

In fact, that intuition is formalized by Cauchy in what is called the “Cauchy Principal Value,” which for this integral, is 0. [my above example is stolen from this wikipedia article as well]

I’ve been debating about this with my math teacher colleague, Matt Davis, and I’m not sure we’ve come to a satisfying conclusion. Here’s an example we were considering:

If you were to color in the region under the infinite graph of y=\frac{1}{x} between -1 and 1, and then throw darts at the graph uniformly, wouldn’t you bet on there being an equal number of darts to the left and right of the y-axis?

Don’t you feel that way too?

(Now there might be another post entirely about measure-theoretic probability!)

What do you think? Anyone want to weigh in? And what should we tell high school students?


**For a more in-depth treatment of the problem, including a discussion of the construction of Riemann sums, visit this nice thread on physicsforums.com.

Great NCTM problem

Yesterday I presented this problem from NCTM’s facebook page:

Solve for all real values of x:

\frac{(x^2-13x+40)(x^2-13x+42)}{\sqrt{x^2-12x+35}}=0

We’ve had an active discussion about this problem on their facebook page, and you should go check it out and join the conversation yourself. Go ahead and try it if you haven’t already.

Don’t read below until you’ve tried it for yourself.

Okay, here’s the work. Factor everything.

\frac{(x-8)(x-5)(x-7)(x-6)}{\sqrt{(x-5)(x-7)}}=0

Multiply both sides by the denominator.

(x-8)(x-5)(x-7)(x-6)=0

Use the zero-product property to find x=5,6,7,8. Now check for extraneous solutions: x=5 and x=7 give \frac{0}{0}\neq 0, and x=6 gives \frac{0}{\sqrt{-1}}=\frac{0}{i}=0. This last statement DOES actually hold for x=6, but we exclude it because it’s not in the domain of the original expression. The original expression has domain (-\infty,5)\cup(7,\infty). We could have started by identifying this domain, and right away we would know not to accept any solutions outside it. The only solution is x=8.
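The domain-first approach is easy to mechanize. Here’s a sketch with sympy (assumed available): compute the set where the original expression is real-valued, then intersect it with the roots of the numerator:

```python
import sympy as sp

x = sp.symbols('x', real=True)

# The expression is real-valued only where (x - 5)(x - 7) > 0.
domain = sp.solve_univariate_inequality((x - 5) * (x - 7) > 0, x,
                                        relational=False)

# Real roots of the numerator.
roots = sp.solveset((x - 8) * (x - 5) * (x - 7) * (x - 6), x, sp.S.Reals)

# Only roots lying inside the domain count as solutions.
print(sp.Intersection(roots, domain))  # {8}
```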

Does this seem problematic? How can we exclude x=6 as a solution when it (a) satisfies the equation and (b) is a real solution? This is why we had such a lively discussion.

But this equation could be replaced with a simpler equation. Here’s one that raises the same issue:

Solve for all real values of x:

\frac{x+5}{\sqrt{x}}=0

Same question: Is x=-5 a solution? Again, notice that it DOES satisfy the equation and it IS a real solution. So why would we exclude it?

Of course a line is drawn in the sand and many people fall on one side and many fall on the other. It’s my impression that high-school math curriculum/textbooks would exclude x=-5 as a solution.

Here’s the big question: What does it mean to “solve for all real values of x“? Let’s consider the above equation within some other contexts:

Solve over \mathbb{Z}:

\frac{x+5}{\sqrt{x}}=0

Is x=-5 a solution? No, I think we must reject it. If we try to check it, we must evaluate \frac{0}{\sqrt{-5}}, but this expression is undefined because \sqrt{-5}\notin\mathbb{Z}. Here’s another one:

Solve over \mathbb{Z}_5:

\frac{x+5}{\sqrt{x}}=0

Is x=-5 a solution? No. Now when we try to check the solution we get \frac{0}{\sqrt{-5}}=\frac{0}{\sqrt{0}}=\frac{0}{0} (since -5\equiv 0 in \mathbb{Z}_5), which is undefined.
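A tiny brute-force check (a plain Python sketch) confirms the modular arithmetic behind the \mathbb{Z}_5 case:

```python
# In Z_5, the input x = -5 reduces to 0 ...
print((-5) % 5)  # 0

# ... and the only square root of 0 (mod 5) is 0 itself,
# so checking x = -5 really does mean evaluating 0/0.
print([r for r in range(5) if (r * r) % 5 == (-5) % 5])  # [0]
```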

The point is that, if we go back to the same question and ask about the solutions of \frac{x+5}{\sqrt{x}}=0 over the reals, and we check the solution x=-5, we must evaluate \frac{0}{\sqrt{-5}} which is undefined in the reals.[1]

So in the original NCTM question, we must exclude x=6 for the same reason. When you test this value, you get \frac{0}{i} on the left side which YOU may think is 0. But this is news to the real numbers. The reals have no idea what \frac{0}{i} evaluates to. It may as well be \frac{0}{\text{moose}}.

There’s a lot more to say here, so perhaps I’ll return to this topic another time. Special thanks to all the other folks on facebook who contributed to the discussion, especially my dad who helped me sort some of this out. Feel free to comment below, even if it means bringing a contrary viewpoint to the table.

________________________

[1] This last bit of work, where we fix the equation and change the domain of interest touches on the mathematical concept of algebraic varieties, which I claim to know *nothing* about. If someone comes across this post who can help us out, I’d be grateful! 🙂

Inverse functions and the horizontal line test

I have a small problem with the following language in our Algebra 2 textbook. Do you see my problem?

Horizontal Line Test

If no horizontal line intersects the graph of a function f more than once, then the inverse of f is itself a function.

Here’s the issue: The horizontal line test guarantees that a function is one-to-one. But it does not guarantee that the function is onto. Both are required for a function to be invertible (that is, the function must be bijective).

Example. Consider f:\mathbb{R}\to\mathbb{R} defined f(x)=e^x. This function passes the horizontal line test. Therefore it must have an inverse, right?

Wrong. The mapping given is not invertible, since there are elements of the codomain that are not in the range of f. Instead, consider the function f:\mathbb{R}\to (0,\infty) defined f(x)=e^x. This function is both one-to-one and onto (bijective). Therefore it is invertible, with inverse f^{-1}:(0,\infty)\to\mathbb{R} defined f^{-1}(x)=\ln{x}.
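Sympy (assumed available) makes the distinction concrete: over the reals, nothing maps to a negative value under \exp, while on the restricted codomain the logarithm really does undo it:

```python
import sympy as sp

x = sp.symbols('x', real=True)

# exp is one-to-one on R (it passes the horizontal line test),
# but it is not onto R: no real x satisfies e^x = -1.
print(sp.solveset(sp.Eq(sp.exp(x), -1), x, sp.S.Reals))  # EmptySet

# On the codomain (0, oo), log inverts exp (valid since x is real).
print(sp.simplify(sp.log(sp.exp(x))))  # x
```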

This might seem like splitting hairs, but I think it’s appropriate to have these conversations with high school students. It’s a matter of precise language, and correct mathematical thinking. I’ve harped on this before, and I’ll harp on it again.