Entries in C (1)

Saturday
May292010

I Found A Compiler Bug (really)

When you read the title if you said to yourself "sure you did" you're probably not alone.  In fact I would probably have the same reaction myself reading that title.  I can remember when I first was learning to program I would occasionally want to blame problems I was having with my code on one of the tools or libraries I was using.  I think this is an experience many programmers have when they start out -- then you learn to blame yourself for problems.  The answer to the question "Why isn't the computer do what I want it to?" is almost always "Because it is doing what I am telling it to do."  However about a year ago, while I was working on the firmware for my USB Password Manager project, I had my first experience where that was not the answer...

So the bug I found wasn't in a big name compiler or anything.  It was in SDCC, a great open source C compiler that targets several different microcontroller architectures.  I used SDCC to develop the firmware for my USB Password Manager project.

I hadn't done a lot of embedded development before, so throughout my work on this project I ran into many normal bugs that were explained by something I was overlooking or misunderstanding about the 8051 architecture or the capabilities of the microcontroller I was using.  So while tracking down this bug I was quite skeptical that the code generation was actually incorrect.  However, after closely examining the assembly code the compiler was generating I was convinced that I was actually telling the compiler (in C code) to do exactly what I wanted the microcontroller to do, but the compiler was generating machine code that told the microcontroller to do something different. 

After I was convinced, I filed a bug report with the SDCC developers.  I was very impressed with how quickly the developers responded, fixing the problem in a little over 3 hours!

So, what fun would this blog entry be if I didn't dissect the technical details of the bug and explain it? I found it very interesting to understand the details of the bug an why it was happening, so I am hoping others will be interested too and learn something from my explanation of it here.

So let's get started, here is the C code that I submitted with my bug report that would reproduce the problem:

 1 typedef unsigned char U8;
 2
 3 __xdata U8 a[] = "111111";
 4 __xdata U8 alen = sizeof(a)-1;
 5 //dummy is just to push b back into an address that causes the bug
 6 __code  U8 dummy[] = "ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd";
 7 __code  U8 b[] = "2222222222222222";
 8 __xdata U8 blen = sizeof(b)-1;
 9 __xdata U8 buffer[128];
10
11 void main (void)
12 {
13     U8 i;
14     for (i=0; i<alen; i++)
15     {
16         buffer[i] = a[i];
17     }
18     // if this loop were indexed from 0 and the body was
19     // buffer[i+64] = b[i] it would work properly
20     for (i=64; i<(64+blen); i++)
21     {
22         buffer[i] = b[i-64];
23     }
24 }

 

The problem occurs in the code generated for the second for loop (lines 20-23).  This loop should copy elements from b (starting at index 0) to buffer (starting at index 64).  What happens is that instead of copying from b[0] it will effectively try to copy data from b[256] (obviously overrunning the length of b).  As the comment says, if an equivalent loop is written that is indexed from 0 the bug doesn't happen.  While this workaround would fix the problem, there is no reason why this C code should not work correctly as is.

So lets dive into a snippet of the assembly code SDCC was generating for this loop.  This assembly code is what is generated for the entire second loop in the C code.  Don't be overwhelmed as I will only be focusing on a small portion of it.

177 00104$:
178 ;       bug.c:20: for (i=64; i<(64+blen); i++)
179         mov    r2,#0x40
180 00105$:
181         mov    dptr,#_blen
182         movx   a,@dptr
183         mov    r3,a
184         mov    r4,#0x00
185         mov    a,#0x40
186         add    a,r3
187         mov    r3,a
188         clr    a
189         addc   a,r4
190         mov    r4,a
191         mov    ar5,r2
192         mov    r6,#0x00
193         clr    c
194         mov    a,r5
195         subb   a,r3
196         mov    a,r6
197         xrl    a,#0x80
198         mov    b,r4
199         xrl    b,#0x80
200         subb   a,b
201         jnc    00109$
202 ;       bug.c:22: buffer[i] = b[i-64];
203         mov    a,r2
204         add    a,#_buffer
205         mov    r3,a
206         clr    a
207         addc   a,#(_buffer >> 8)
208         mov    r4,a
209         mov    a,r2
210
211 ; **BUG**
212 ; a=64 (0x40) and _b is 0x121 (the address of array b in code memory)
213 ; Breaking down the add instruction:
214 ;   * #0xc0+_b = 0x0121+0xC0 = 0x1E1 (computed at assemble-time)
215 ;   * a = a + 0xE1 = 0x40 + 0xE1 = 0x121 (but a is 8-bits so a=0x21 and
216 ;     and the carry bit is set.)
217         add    a,#0xc0+_b
218         mov    dpl,a
219         clr    a
220 ; Since the carry is set by the add above this means that the result of the
221 ; addc instruction is that a = 0x02 (0x00 + 0x01 + Carry). Therefore when the
222 ; movc instruction is executed it will get data from 0x0221 instead of 0x0121.
223         addc   a,#(_b >> 8)
224         mov    dph,a
225         clr    a
226         movc   a,@a+dptr
227         mov    r5,a
228         mov    dpl,r3
229         mov    dph,r4
230         movx   @dptr,a
231 ;       bug.c:20: for (i=64; i<(64+blen); i++)
232         inc    r2
233         sjmp   00105$
234 00109$:
235         ret

 

First a couple of notes about the assembly code:

  • a is the accumulator (it is 8-bits wide)
  • dptr is a data pointer register used to access data memory (it is 16-bits wide)
  • dph is the high-order byte of dptr and dpl is the low-order byte of dptr.
  • _b is the address of the array b from the C code.  In the following discussion I will use _b even when referring to this array using C syntax since the 8051 has a register called b.

The instructions in lines 217-224 are supposed to load the address of _b[i-64] into the dptr register so that the value of _b[i-64] can be retrieved.  For my description here I will assume that we are executing the first iteration of the loop so that i=64.  Note that on line 209 a is set to the value of i (the compiler chose to store the value of i in r2).  Lines 217-224 should set dptr to the address of _b[64-64] or _b[0] (which is 0x121).  However what actually happens is that it is set to 0x221.

The bug originates on line 217 of the listing.  The intention of this line is to set calculate the low-order byte of the address of _b[64-64] and store it in a.  So after line 217 executes a=0x21, which is correct.  However because the addition that is done is 0x40 + 0xE1 the result 0x121 does not fit in 8-bits so a is set to 0x21 and the carry bit is set.

This carry bit causes a problem on line 223 when the addc instruction is executed to set a to the high-order byte of the address of _b[0] (the high-order byte should be 0x01).  Since a is currently 0, the addc is setting a to the high-order byte of _b (0x01) plus any carry.  In this case the carry bit is set, so a is set to 0x02 instead of 0x01.  There are circumstances where the carry would need to be set to calculate the correct high-order byte.  For example if we _b were a larger array and were calculating _b[0xdf] (which in this case would be at address 0x121 + 0xdf = 0x200).  In this example the first add instruction would calculate that the low-order byte is 0x00 and set the carry bit.  The carry bit would cause the high-order byte to incremented by 1 when the addc executes, causing the high byte to be 0x02.  When calculating _b[0xdf] this would be correct, but having the carry set when calculating _b[0] is not correct.

So the code that is incorrect is line 217.  It makes an assumption that the carry will only be set if the addition of the loop index and the address of the array causes a carry.  This seems correct at first, but what is happening in this case is that this single add instruction has two "jobs".  Its "first job" is to calculate the loop index from the expression (i - 64).  Its "second job" is to take that loop index and add it to the address of the array.  The assumption the code makes is really that the carry bit will only be set as a consequence of the "second job".  However in our case it is really the first job that causes the carry bit to be set.

To further illustrate this idea that line 217 performs two "jobs" we can split it up into two lines that each perform only one of the "jobs".  This looks like this:

add a,#0xc0 ; Calculate the index (i-64)
add a,_b ; Add the address and index

 

Since the (i-64) subtraction is done by adding the 2s complement, the first instruction causes the carry bit to be set (0x40 + 0xC0 = 0x100).  However, the second instruction causes the carry bit to be cleared.  So if this code were used in place of line 217 the addc instruction at 223 would result in 0x01 which would be correct.

As it turns out the fix the SDCC developers made was to disable an optimization rule that replaced two add instructions with a single add instruction where the operand is the sum of the operands to the original two instructions.  This replacement would not always be valid because in the case of two add instructions only the carry from the second addition matters to subsequent carry-sensitive instructions.  So there is a subtle difference in using one add instruction versus two.  So the two add instructions I illustrated above is actually what the compiler originally generated and then during an optimization step these two add instructions were replaced by one.

I hope that you found this article interesting and maybe even learned something from it.  I think that digging into this bug and understanding it provides a small illustration of what a C compiler's job is and how much of the details of the underlying machine it really it hides from you.