Beginning x64 Assembly Programming Errata
I’ve recently completed the book Beginning x64 Assembly Programming.
The book has several errors, at least a couple of which are significant; since there is no official errata, I’m publishing my findings.
- Page 68: Addressing forms
- Page 156: Using objdump
- Page 160: Working with I/O
- Page 206: Moving Strings
- Page 217: Using cpuid
- Page 328: Matrix Print: printm4x4
- Page 384: Using More Than Four Arguments
This book left me very conflicted. I was enthusiastic at the beginning, but I’ve found its production to be very unprofessional.
First, Apress didn’t publish an errata (besides a small file with two corrections in the book’s companion repository). One of the errors is also, very amusingly, caused by improper typography.
Second, the authors don’t seem to take the subject seriously, either:
We have carefully written and tested the code used in this book. However, if there are any typos in the text or bugs in the programs, we do not take any responsibility. We blame them on our two cats, who love to walk over our keyboard while we are typing.
It’s worrying that two errors are conceptual (the explanations given are fundamentally wrong); after finishing the book I’m now questioning the quality of what I’ve learned.
Errors are a fact of life, so there’s nothing wrong with them in themselves; but since readers will spend time and hair on them (I did), the matter should be addressed. If a publisher doesn’t want to spend resources, at a minimum they could set up a web page where readers can submit their findings (Manning adopts this approach, for example).
But now, to the errors!
Page 68: Addressing forms
I find this error very funny. In the following listing:
    mov rax, text1+1      ;load second character in rax
    lea rax, [text1+1]    ;load second character in rax
the operations actually load the address of the second character in rax.
The funny part is that the word is actually there: copy/pasting the listing reveals the missing text (address), but due to a typographic error in the PDF, it’s not visible.
Page 156: Using objdump
This is one of the two conceptual errors.
From page 156:
The assembler took the liberty to change the sal instruction into shl, and that is for performance reasons.
The two mnemonics are exactly the same instruction: sal and shl share the same encoding. Therefore, the explanation of why sal is turned into shl is baseless.
Consequently, the statement that follows is also inexact:
As you remember from Chapter 16 on shifting instructions, this can be done without any problem in most cases.
Replacing sal with shl can be done without any problem in every case, not just in most cases.
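To illustrate why the two mnemonics cannot differ, here is a small Python model (a sketch, not assembler output). It treats shl as a logical left shift of an unsigned 64-bit value and sal as an arithmetic left shift of the same bits read as a signed value, then checks that the resulting bit patterns match. The helper names are mine, chosen for illustration.

```python
MASK64 = (1 << 64) - 1

def shl64(value, count):
    """Logical left shift: treat the register as an unsigned 64-bit value."""
    return (value << count) & MASK64

def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value

def sal64(value, count):
    """Arithmetic left shift: multiply the signed value by 2**count, wrap to 64 bits."""
    return (to_signed(value) * (1 << count)) & MASK64

samples = [0, 1, 0x7FFFFFFFFFFFFFFF, 0x8000000000000000,
           0xFFFFFFFFFFFFFFFF, 0x123456789ABCDEF0]

# Compare the two models for every sample and every shift count from 0 to 63
all_equal = all(shl64(v, n) == sal64(v, n) for v in samples for n in range(64))
print(all_equal)
```

The distinction between arithmetic and logical only matters for right shifts (sar versus shr), where the sign bit has to be either preserved or cleared.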
Page 160: Working with I/O
In the following listing:
reads:
    push rbp
    mov rbp, rsp
    ; rsi contains address of the inputbuffer
    ; rdi contains length of the inputbuffer
    mov rax, 0      ; 0 = read
    mov rdi, 1      ; 1 = stdin
    syscall
    leave
    ret
the length of inputbuffer is in rdx, not rdi.
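As a reminder of where each argument goes, here is a small Python sketch of the Linux x86-64 system call register order, applied to read(fd, buf, count). The function name and the symbolic argument values are placeholders of mine, not names from the book.

```python
# Argument registers for Linux x86-64 system calls, in order
SYSCALL_ARG_REGISTERS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]

def read_syscall_registers(fd, buf, count):
    """Return the register assignment for read(fd, buf, count)."""
    registers = {"rax": 0}  # rax holds the syscall number; 0 = read
    registers.update(zip(SYSCALL_ARG_REGISTERS, [fd, buf, count]))
    return registers

# Symbolic placeholders instead of real values
regs = read_syscall_registers("fd", "inputbuffer", "length")
print(regs)
```

The third argument, the length, lands in rdx; rdi is taken by the file descriptor, which is also what the listing itself loads into it.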
Page 206: Moving Strings
This is the other conceptual error, which I find alarming; it is also very interesting.
On page 206, there is a routine to print a string in reverse:
;reverse copy my_string to other_string
    prnt string6,40
    mov rax, 48                     ;clear other_string
    mov rdi,other_string
    mov rcx, length
    rep stosb
    lea rsi,[my_string+length-4]
    lea rdi,[other_string+length]
    mov rcx, 27                     ;copy only 27-1 characters
    std                             ;std sets DF, cld clears DF
    rep movsb
    prnt other_string,length
    leave
    ret
The companion repository has an additional error in the comment; it reads “copy only 10 characters”.
The paper version reads as above; since the string consists of the alphabet, it’s intuitive that the loop count should be 26 instead of 27.
However, the authors don’t notice this error, and in the following page, they give another baseless explanation to support the value 27:
Why do we put 27 in rcx when there are only 26 characters? It turns out that rep decreases rcx by 1 before anything else in the loop. You can verify that with a debugger such as SASM.
Anybody who actually tries the routine in a debugger (I did) will find that it incorrectly copies one byte more than it should (the first one copied). As a consequence, the starting addresses loaded into rsi and rdi should also be decreased by one.
Something that I find mystifying is that the authors themselves copy the specification of the rep prefix:
WHILE CountReg ≠ 0
DO
    Service pending interrupts (if any);
    Execute associated string instruction;
    CountReg ← (CountReg – 1);
    IF CountReg = 0
        THEN exit WHILE loop; FI;
    IF (Repeat prefix is REPZ or REPE) and (ZF = 0)
    or (Repeat prefix is REPNZ or REPNE) and (ZF = 1)
        THEN exit WHILE loop; FI;
OD;
This conflicts with their statement: according to the specification, the associated string instruction is executed before rcx is decreased, so a count of 27 performs 27 copies.
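The pseudocode can be replayed in a few lines of Python. This is only a model of rep movsb, with simplified, hypothetical buffer offsets rather than the book’s exact addresses; the '#' byte stands in for whatever happens to follow the 26 letters in memory.

```python
def rep_movsb(mem, rsi, rdi, rcx, df):
    """Model `rep movsb`: copy first, then decrement the count, as in the pseudocode."""
    step = -1 if df else 1          # std sets DF, so the pointers move downward
    copied = 0
    while rcx != 0:
        mem[rdi] = mem[rsi]         # execute the string instruction...
        rsi += step
        rdi += step
        copied += 1
        rcx -= 1                    # ...and only then decrement the count
    return rsi, rdi, rcx, copied

ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SRC, DST = 0, 32                    # hypothetical buffer offsets

def fresh_memory():
    """26 letters followed by '#', with the rest of the memory filled with '.'."""
    mem = bytearray(b"." * 64)
    mem[SRC:SRC + 27] = ALPHABET + b"#"
    return mem

# Book-style call: count of 27, pointers starting one byte past the last letter
buggy = fresh_memory()
_, _, _, buggy_count = rep_movsb(buggy, SRC + 26, DST + 26, 27, df=True)

# Corrected call: count of 26, pointers starting on the last letter
fixed = fresh_memory()
_, _, _, fixed_count = rep_movsb(fixed, SRC + 25, DST + 25, 26, df=True)

print(buggy_count, bytes(buggy[DST:DST + 27]))
print(fixed_count, bytes(fixed[DST:DST + 27]))
```

In the model, the count of 27 drags the stray '#' byte along with the alphabet, while the count of 26 with both pointers moved back by one copies exactly the 26 letters.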
I find it amusing that this bug is hidden (which is likely why the authors didn’t notice it) by the fact that the prnt routine takes the string length as an argument, so copying any text before or after the correct locations doesn’t yield any visible effect.
Above all, this bug leads to a fundamental reflection. It is an off-by-one error, a very famous type, which shows how difficult and utterly fragile Assembly programming is; so much so that the error found its way even into a book written by experienced programmers.
Page 217: Using cpuid
In the following listing:
ssse3:
    test ecx,9h     ;test bit 0 (SSE 3)
    jz sse41        ;SSE 3 available
the mask 9h tests bits 0 and 3, not bit 9; the SSSE3 flag is bit 9 of ecx, so the correct instruction is:
    test ecx,200h   ;test bit 9 (SSSE 3)
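The mask arithmetic is easy to check with a short Python sketch; the helper names are mine and only illustrate how a bit position maps to the immediate used by test.

```python
def bit_mask(bit):
    """Mask with only the given bit set, as used by `test ecx, mask`."""
    return 1 << bit

def set_bits(mask):
    """List the bit positions that are set in a mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

mask_bit_9 = bit_mask(9)
bits_in_9h = set_bits(0x9)

print(hex(mask_bit_9))   # mask needed to test bit 9
print(bits_in_9h)        # bits actually tested by 9h
```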
Page 328: Matrix Print: printm4x4
In the following reference (emphasis mine):
To align the stack on a 16-byte boundary, we cannot use the trick with the and instruction from Chapter 16.
the trick is actually in Chapter 15 (page 125).
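For context, I assume the trick in question is the usual one of clearing the low four bits of rsp (for example with and rsp, 0xfffffffffffffff0); that is my reading, not a quote from the book. A Python sketch of the arithmetic, with made-up addresses:

```python
MASK64 = (1 << 64) - 1

def align_down_16(rsp):
    """Clear the low four bits, rounding the address down to a multiple of 16."""
    return rsp & (MASK64 & ~0xF)

# Hypothetical stack pointer values
for rsp in (0x7FFFFFFFE008, 0x7FFFFFFFE010, 0x7FFFFFFFE01F):
    print(hex(rsp), "->", hex(align_down_16(rsp)))
```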
Page 384: Using More Than Four Arguments
The following listing shows how to perform a Windows call with more than four arguments:
    sub rsp, 8
    mov rcx, fmt
    mov rdx, first
    mov r8, second
    mov r9, third
    push tenth
    push ninth
    push eighth
    push seventh
    push sixth
    push fifth
    push fourth
    sub rsp, 32         ; shadow space
    call printf
    add rsp, 32 + 8
However, the stack pointer reset following the call does not account for the seven pushes; the correct reset is:
    add rsp, 32 + 56 + 8    ; 56 = 7 * 8
In the alternative call structure, on pages 385/386, the stack pointer is correctly reset, by adding the value (32 + 56 + 8).
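The bookkeeping can be verified with a few lines of Python. This is only an accounting sketch of the listing above; the constant and function names are mine.

```python
SLOT = 8            # bytes per push on x64
SHADOW_SPACE = 32   # `sub rsp, 32` before the call
PADDING = 8         # the initial `sub rsp, 8`
STACK_ARGS = 7      # fourth through tenth arguments, pushed on the stack

def leftover_bytes(cleanup):
    """Bytes still allocated after the post-call `add rsp, cleanup`."""
    allocated = PADDING + STACK_ARGS * SLOT + SHADOW_SPACE
    return allocated - cleanup

book_leftover = leftover_bytes(32 + 8)        # cleanup as printed in the book
fixed_leftover = leftover_bytes(32 + 56 + 8)  # cleanup including the pushes

print(book_leftover, fixed_leftover)
```

With the book’s cleanup, the 56 bytes of pushed arguments are never released, so rsp stays 56 bytes lower than it was before the sequence.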
I’m not sure whether I should recommend this book or not.
For a casual user who wants a fun (!) read, it may be an effective book. On the other hand, motivated readers who want quality knowledge should definitely consider The Art of 64-Bit Assembly, written by a veteran Assembly programmer (although, sadly, it’s based on MASM/Windows).
Happy optimizing 😃