Archive for the ‘Problem solving’ category

Prove a more general theorem

July 20, 2010

Often when you try to prove something, it is easier prove a stronger statement. Many of the examples of this, are statements that are proven by induction. E.g. define the sequence of functions f_n:\mathbb{R}\to\mathbb{R} by f_1(x)=x^2-1 and f_n(x)=f_{n-1}(f_1(x)) for n\geq 2. Prove that f_n has n+1 real roots (inspired by problem 2 day 1  IMC 2004).

When solving this problem, it helps to find a stronger statement, that is true for all n, because that gives you a stronger induction hypothesis, so that you can make the induction step. But sometimes, this is not enough, and you have to “invent” (or should I say discover?) other cases. Let me give an example of this:

Problem 5 day 1 IMC 1999: Suppose that 2n points of a n\times n grid are marked. Show that for some k > 1 one can select 2k distinct marked points, say a_1,...,a_{2k}, such that a_{2i-1} and a_{2i} are in the same row, a_{2i} and a_{2i+1} are in the same column, \forall i, indices taken mod 2k.

We have 2n point and n rows and n columns, so on average there is 2 points in each row and in each column. If we knew that there was at least two points in each row and column, the problem would be easy: We could just start at one point, go to a point in the same row,  then to a point in the same column, and so on. Continue to do this (you can do that, because there is a least two points in each row and column), until you hit a point you have visited before, and you have a loop (if you both began and ended with a “row-move” or with “column-moves”, you can just “jump over” the first point). But unfortunately, there could be rows or columns with only one point or none points at all. If there is both a row and a column with at most 1 point in each, we can delete this row and column, and we have a problem with a (n-1)\times (n-1) grid and at least 2(n-1) marked points. This gives us a hope, that we can prove the statement by induction. But what if all rows contains 2 marked points, but some columns only contain 1? If we delete a column, we would get a n\times (n-1) grid with 2n-1 marked points, so this suggests that we should try to prove something stronger:

Stronger statement: Let n+m points of a n\times m grid be marked. Now for some k > 1 you can select 2k distinct marked points, a_1,...,a_{2k}, such that a_{2i-1} and a_{2i} are in the same row, a_{2i} and a_{2i+1} are in the same column, \forall i, indices taken mod 2k.

Proof: By induction on n and m. This is true if n=1 or if m=1 because you can’t choose m+1 points in a 1\times m grid. If we have two point in every row and column, we can just use the above proof, if not, we delete a row or column with at most one point, and thereby reduces the problem to a smaller one.

Here is another IMC problem, you can try to solve:

Problem 5 day 1 IMC 2004: Let S be a set of \displaystyle { 2n \choose n } + 1 real numbers, where n is a positive integer. Prove that there exists a monotone sequence \{a_i\}_{1\leq i \leq n+2} \subset S such that
|x_{i+1} - x_1 | \geq 2 | x_i - x_1 | ,\text{for all } i=2,3,\ldots, n+1

Update: Here are two MathOverflow questions about this subject, where you can find more examples

Advertisements

A solution to IMC09 problem 2

May 7, 2010

In this post I will explain how I solved the following problem:

Problem: Let A, B, and C be real square matrices of the same size, and suppose that A is invertible. Prove that if (A-B)C=BA^{-1} then C(A-B)=A^{-1} B.

It is possible to prove this in three lines, but I don’t think that anyone would learn anything about problem solving from just seeing the proof. Instead I want to describe how I solved it. In order to make it easier to read, I decided to write the description in present tense and to include the reader, so instead of “Then I tried to” I write “Now we try to”. I solved this problem about two months ago, so I probably had a lots of thoughts that I have forgotten about now.

At first this problem seems a bit confusing, so we’ll try get a better understanding of the problem. It is a problem from a contest, so unlike when you’re doing research, you don’t have to worry about the possibility that the statement could be false. The assumption is that A, B, and C are real square matrices, so the first problem is to decide if this is important. What if A, B, and C are complex matrices? Or more generally: What if A, B, and C are elements in a ring? (If you don’t know what a ring is, don’t worry. All I’m asking is: Is it possible to prove that statement, by only using algebraic rules like (A+B)C=AC+BC?). It is easy to state the problem for elements in a ring, and it seems unlikely that it should be false for rings, given that it is true for real matrices, so let’s try to work on this conjecture:

Conjecture: Let A, B, and C be elements of a ring R, and suppose that A is invertible. Now (A-B)C=BA^{-1} implies C(A-B)=A^{-1}B.

In other words, you shouldn’t think of the matrices as sets of real numbers arranged in arrays, but only of their algebraic properties. The problem is still a bit difficult to get your head around, so we’ll try to “turn off” some of the difficulties. The three most obvious difficulties are A, B and C, so let’s see what happens if we let one of them be the identity. We get three new problems:

  1. (I-B)C=B \Rightarrow C(I-B)=B
  2. (A-I)C=A^{-1} \Rightarrow C(A-I)=A^{-1}
  3. A-B=BA^{-1} \Rightarrow A-B=A^{-1}B

Proving one or more of these won’t solve the original problem, but it might give an idea of how to solve it. (You could also set two of the three matrices to be equal to I, but the resulting statements are too trivial to be interesting). Let’s look at the first of the three statements:

(I-B)C=B \Rightarrow C(I-B)=B

That is, if (I-B)C=B then I-B and C commute. How can we prove that two elements in a ring commute? Well, if their product is the identity, we know that they are each others inverses, and therefore commute (Edit: This is not true in a general ring, but it is true for matrices. See my comment). Unfortunately the right hand side is B, and if we assume B=I the statement is trivial. So instead we assume B to be invertible and multiply by B^{-1}. Remember that right now we are not trying to give a formal proof of anything, but only to get some ideas of what a proof might look like, so we are free to add this “niceness” –assumptions about B. Assume also that (I-B)C=B. Now we get:

(I-B)CB^{-1}=BB^{-1}=I

That is, I-B is the inverse of CB^{-1}, so they commute and we have:

CB^{-1}(I-B) =I \Rightarrow C(B^{-1}-I)= I \Rightarrow C(I-B)B^{-1}=I \\ \Rightarrow C(I-B)=B

We have now proved the statement in the case where A=I and B is invertible. Now it’s natural to try the same idea in the B=I-case. Without any further assumptions we get:

(A-I)C=A^{-1}\Rightarrow (A-I)CA=I\Rightarrow CA(A-I)=I\Rightarrow \\ C(A^2-A)=I \Rightarrow C(A-I)A=I \Rightarrow  C(A-I)=A^{-1}

As we wanted. The final case is equivalent to showing that if A-B=BA^{-1} then B and A^{-1} commute. We can’t just use the same trick as before, because we don’t know that A and B commute. But we have learned one important lesson from the two other cases: In order to show that two matrices commute, it is useful to find two matrices that are each other’s inverse, and use that they commute. So we add the identity on both sides and move everything else to the left hand side, and look for a factorization:

A-B=BA^{-1}\Rightarrow A-B-BA^{-1}+I=I\Rightarrow \\ (A-B)(A^{-1}+I)=I\Rightarrow  (A^{-1}+I)(A-B)=I\Rightarrow \\I-A^{-1}B+A-B=I\Rightarrow A-B =A^{-1}B

Now that we have solved the three simpler problems, we can look back at the original conjecture.

(A-B)C=BA^{-1}\Rightarrow C(A-B)=A^{-1}B

There are two ways you can use a proof of a simpler case to prove a more general theorem: The first is to use the fact that the simpler case is true. If we make the assumption that C is invertible, this technique is actually useful. We get:

(A-B)C=BA^{-1}\Rightarrow AC-BC=BC(AC)^{-1}\Rightarrow \\AC-BC=(AC)^{-1}BC\Rightarrow  A-B=C^{-1}A^{-1}B\Rightarrow \\ C(A-B)=A^{-1}B

Where we use that A-B=BA^{-1} \Rightarrow A-B=A^{-1}B with AC and BC as A and B. Now we have proved the statement under the assumption that C is invertible, but if we don’t have this assumption, we have to do something else. (We could use a limit argument, like Gowers did in a comment to my last post, but I didn’t think of this trick when I solved the problem).

I mentioned that there are two ways of using a proof of a special case to prove a more general theorem, but I only gave one. The other is one to modify your proof to cover the more general theorem. So let’s look back at the proof that A-B=BA^{-1} \Rightarrow A-B=A^{-1}B, and try to modify the proof to showing that (A-B)C=BA^{-1}\Rightarrow C(A-B)=A^{-1}B. Now the proof almost writes itself:

(A-B)C=BA^{-1}\Rightarrow (A-B)C-BA^{-1}+I=I\Rightarrow \\ (A-B)(A^{-1}+C)=I\Rightarrow (A^{-1}+C)(A-B)=I\Rightarrow \\ I-A^{-1}B+C(A-B)=I\Rightarrow C(A-B) =A^{-1}B

QED.

IMC09 problem 2

May 5, 2010

For a couple of years I have thought about starting a mathematical blog, but I decided not to, because I thought I didn’t have enough time and ideas to keep a high enough post rate. But now and then I have something I want to say, and no place to say it, so I have finally decided to start this blog anyway, but also that I won’t have to post something every month.

This post and the next are partly inspired by these two posts by Gowers, where he tries to explore how people think when they do math, by asking his readers to solve a very simple equation, and explain how they solved it. In this and my next post I will consider a more difficult problem: It is problem 2 from IMC09 (IMC, International Mathematics Competition, is a math completion for university students). The reason I’m using an IMC-problem is that University of Copenhagen is going to participate in IMC for the first time this year, and I hope that talking about how we solve problems will be a good preparation for the contest both for me and for the rest of the team. Still, I hope that this discussion will be of interest to both IMO contestants and to professional mathematicians.

I solved this problem about two months ago, but I thought a lot about what problem solving strategies I used, so I still remember most of my thoughts. I have chosen to write about this particular problem because I noticed that I used many general problem solving strategies, that are good to know. You are welcome to post your thoughts on the problem in a comment, even if you haven’t solved the problem. If you have solved the problem, please try to describe the whole thought process, not just the thoughts that led you to the solution. I will post a description of how I solved it in a couple of days.

The problem is:
Let A, B, and C be real square matrices of the same size, and suppose that A is invertible. Prove that if (A-B)C=BA^{-1} then C(A-B)=A^{-1} B.