A solution to IMC09 problem 2

In this post I will explain how I solved the following problem:

Problem: Let $A$, $B$, and $C$ be real square matrices of the same size, and suppose that $A$ is invertible. Prove that if $(A-B)C=BA^{-1}$ then $C(A-B)=A^{-1} B$.

It is possible to prove this in three lines, but I don’t think that anyone would learn anything about problem solving from just seeing the proof. Instead I want to describe how I solved it. In order to make it easier to read, I decided to write the description in present tense and to include the reader, so instead of “Then I tried to” I write “Now we try to”. I solved this problem about two months ago, so I probably had a lots of thoughts that I have forgotten about now.

At first this problem seems a bit confusing, so we’ll try get a better understanding of the problem. It is a problem from a contest, so unlike when you’re doing research, you don’t have to worry about the possibility that the statement could be false. The assumption is that $A$, $B$, and $C$ are real square matrices, so the first problem is to decide if this is important. What if $A$, $B$, and $C$ are complex matrices? Or more generally: What if $A$, $B$, and $C$ are elements in a ring? (If you don’t know what a ring is, don’t worry. All I’m asking is: Is it possible to prove that statement, by only using algebraic rules like $(A+B)C=AC+BC$?). It is easy to state the problem for elements in a ring, and it seems unlikely that it should be false for rings, given that it is true for real matrices, so let’s try to work on this conjecture:

Conjecture: Let $A$, $B$, and $C$ be elements of a ring $R$, and suppose that $A$ is invertible. Now $(A-B)C=BA^{-1}$ implies $C(A-B)=A^{-1}B$.

In other words, you shouldn’t think of the matrices as sets of real numbers arranged in arrays, but only of their algebraic properties. The problem is still a bit difficult to get your head around, so we’ll try to “turn off” some of the difficulties. The three most obvious difficulties are $A$, $B$ and $C$, so let’s see what happens if we let one of them be the identity. We get three new problems:

1. $(I-B)C=B \Rightarrow C(I-B)=B$
2. $(A-I)C=A^{-1} \Rightarrow C(A-I)=A^{-1}$
3. $A-B=BA^{-1} \Rightarrow A-B=A^{-1}B$

Proving one or more of these won’t solve the original problem, but it might give an idea of how to solve it. (You could also set two of the three matrices to be equal to $I$, but the resulting statements are too trivial to be interesting). Let’s look at the first of the three statements:

$(I-B)C=B \Rightarrow C(I-B)=B$

That is, if $(I-B)C=B$ then $I-B$ and $C$ commute. How can we prove that two elements in a ring commute? Well, if their product is the identity, we know that they are each others inverses, and therefore commute (Edit: This is not true in a general ring, but it is true for matrices. See my comment). Unfortunately the right hand side is $B$, and if we assume $B=I$ the statement is trivial. So instead we assume $B$ to be invertible and multiply by $B^{-1}$. Remember that right now we are not trying to give a formal proof of anything, but only to get some ideas of what a proof might look like, so we are free to add this “niceness” –assumptions about $B$. Assume also that $(I-B)C=B$. Now we get:

$(I-B)CB^{-1}=BB^{-1}=I$

That is, $I-B$ is the inverse of $CB^{-1}$, so they commute and we have:

$CB^{-1}(I-B) =I \Rightarrow C(B^{-1}-I)= I \Rightarrow C(I-B)B^{-1}=I \\ \Rightarrow C(I-B)=B$

We have now proved the statement in the case where $A=I$ and $B$ is invertible. Now it’s natural to try the same idea in the $B=I$-case. Without any further assumptions we get:

$(A-I)C=A^{-1}\Rightarrow (A-I)CA=I\Rightarrow CA(A-I)=I\Rightarrow \\ C(A^2-A)=I \Rightarrow C(A-I)A=I \Rightarrow C(A-I)=A^{-1}$

As we wanted. The final case is equivalent to showing that if $A-B=BA^{-1}$ then $B$ and $A^{-1}$ commute. We can’t just use the same trick as before, because we don’t know that $A$ and $B$ commute. But we have learned one important lesson from the two other cases: In order to show that two matrices commute, it is useful to find two matrices that are each other’s inverse, and use that they commute. So we add the identity on both sides and move everything else to the left hand side, and look for a factorization:

$A-B=BA^{-1}\Rightarrow A-B-BA^{-1}+I=I\Rightarrow \\ (A-B)(A^{-1}+I)=I\Rightarrow (A^{-1}+I)(A-B)=I\Rightarrow \\I-A^{-1}B+A-B=I\Rightarrow A-B =A^{-1}B$

Now that we have solved the three simpler problems, we can look back at the original conjecture.

$(A-B)C=BA^{-1}\Rightarrow C(A-B)=A^{-1}B$

There are two ways you can use a proof of a simpler case to prove a more general theorem: The first is to use the fact that the simpler case is true. If we make the assumption that $C$ is invertible, this technique is actually useful. We get:

$(A-B)C=BA^{-1}\Rightarrow AC-BC=BC(AC)^{-1}\Rightarrow \\AC-BC=(AC)^{-1}BC\Rightarrow A-B=C^{-1}A^{-1}B\Rightarrow \\ C(A-B)=A^{-1}B$

Where we use that $A-B=BA^{-1} \Rightarrow A-B=A^{-1}B$ with $AC$ and $BC$ as $A$ and $B$. Now we have proved the statement under the assumption that $C$ is invertible, but if we don’t have this assumption, we have to do something else. (We could use a limit argument, like Gowers did in a comment to my last post, but I didn’t think of this trick when I solved the problem).

I mentioned that there are two ways of using a proof of a special case to prove a more general theorem, but I only gave one. The other is one to modify your proof to cover the more general theorem. So let’s look back at the proof that $A-B=BA^{-1} \Rightarrow A-B=A^{-1}B$, and try to modify the proof to showing that $(A-B)C=BA^{-1}\Rightarrow C(A-B)=A^{-1}B$. Now the proof almost writes itself:

$(A-B)C=BA^{-1}\Rightarrow (A-B)C-BA^{-1}+I=I\Rightarrow \\ (A-B)(A^{-1}+C)=I\Rightarrow (A^{-1}+C)(A-B)=I\Rightarrow \\ I-A^{-1}B+C(A-B)=I\Rightarrow C(A-B) =A^{-1}B$

QED.

Explore posts in the same categories: Problem solving

Tags: , ,

You can comment below, or link to this permanent URL from your own site.

4 Comments on “A solution to IMC09 problem 2”

1. Jon Says:

Sune,

I really like your blog. For you blog being so awesome, here is an invite to OpenStudy. We’re based in Atlanta, Georgia and a community of students that study math and cs.

http://openstudy.com/signup/star/new/4be86a2373c33a7ff64bc3bb

2. Qiaochu Yuan Says:

Very nice. This is closely related to another famous problem in “noncommutative high school algebra” whose standard heuristic translates into the following. Suppose you live in a ring where power series make sense; in particular, where $(1 - x)^{-1} = 1 + x + x^2 + ...$. Then $(A - B)C = BA^{-1}$ is equivalent to

$C = A^{-1} BA^{-1} + A^{-1} BA^{-1} BA^{-1} + ...$

which you can verify is also equivalent to the desired identity. One can turn this into a proof of the original statement by scaling A appropriately so that the above sequences converge in an operator norm.

The really cool thing is that there is a theorem to the effect that an identity in “noncommutative high school algebra” holds if and only if it holds in power series rings! Unfortunately, I can never remember the appropriate reference for this.

3. I claimed that if A and B are elements in a ring with $AB=I$ then $BA=I$. This is not true in general: It could be that $A$ was only right invertible (Actually I don’t know any rings with such an element. Do you?). But it is true for matrices: Matrices corresponds to linear maps, and if $AB=I$ the map corresponding to $A$ must be surjective. It is map from a finite dimensional space to itself so it must be injective, and thus it has a left-inverse. If $C$ is a left-inverse for $A$ we have $C=CI=CAB=IB=B$, so $B$ is the inverse of $A$.

• Qiaochu Yuan Says:

Take, for example, the ring of endomorphisms of a vector space of countable dimension. With respect to a basis $e_1, e_2, ...$, the linear operator $Se_1 \to 0, Se_{i+1} \to e_i$ is right invertible but not left invertible.