Archive for March, 2006

The simplest working compiler!
Saturday, March 18th, 2006

It's pretty dumb, haha.
The source program is as follows:

class Program {
    static int i;
    static int j;
    static boolean b;

    static void Main() {
        i = 3 * 5;
        j = i + 10;
        func();
        print(j);
    }
    static void func() {
        print(i);
    }
}

Translated into assembly code:

.section    .rodata
format:
.string    "%d\n"
.text
.globl main
.type    main, @function
main:
pushl    %ebp
movl    %esp, %ebp
subl    $8, %esp
movl    $3, %eax
movl    %eax, %ebx
movl    $5, %eax
imull    %ebx, %eax
movl    %eax, i
movl    i, %eax
movl    %eax, %ebx
movl    $10, %eax
addl    %ebx, %eax
movl    %eax, j
call    func
movl    j, %eax
movl    %eax, 4(%esp)
movl    $format, (%esp)
call    printf
leave
ret
.globl func
.type    func, @function
func:
pushl    %ebp
movl    %esp, %ebp
subl    $8, %esp
movl    i, %eax
movl    %eax, 4(%esp)
movl    $format, (%esp)
call    printf
leave
ret
.local    i
.comm    i, 4, 4
.local    j
.comm    j, 4, 4
.local    b
.comm    b, 4, 4

My first small assembly program built with GAS
Thursday, March 16th, 2006

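I wrote my first small assembly program on Linux, haha: it reads two numbers, computes their sum, and prints the result.
Save it as example.s and build it with: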
# gcc -o example example.s

.section	.rodata
prompt:
.string	"Please input an integer:"
format:
.string	"%d"
oform:
.string "%d + %d = %d\n"
.text
.globl main
.type	main, @function
main:
pushl	%ebp
movl	%esp, %ebp
subl	$32, %esp
andl	$-16, %esp
movl	$0, %eax
subl	%eax, %esp	# The andl aligns the stack; movl/subl subtract 0 (a no-op).

movl	$prompt, (%esp)
call	printf		# Prompt for the first integer.
movl	%esp, %eax
addl	$20, %eax
movl	%eax, 4(%esp)
movl	$format, (%esp)
call	scanf		# Accept the first input.
movl	$prompt, (%esp)
call	printf		# Prompt for the second.
movl	%esp, %eax
addl	$16, %eax
movl	%eax, 4(%esp)
movl	$format, (%esp)
call	scanf		# Accept the second.
movl	16(%esp), %eax
addl	20(%esp), %eax	# Compute the sum of two integers
movl	%eax, 12(%esp)	# sum
movl	16(%esp), %eax
movl	%eax, 8(%esp)	# num2
movl	20(%esp), %eax
movl	%eax, 4(%esp)	# num1
movl	$oform, (%esp)
call	printf		# printf("%d + %d = %d\n", num1, num2, sum);
leave
ret

AT&T assembly syntax
Sunday, March 12th, 2006

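I remember doing assembly on Windows before, and even calling printf from it. For some reason I can't get that working anymore, so I had to move to Linux. Here I'm reposting an introduction to the AT&T syntax (original: http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html):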
# Register naming:
Register names are prefixed with “%”. To reference eax:

AT&T: %eax
Intel: eax

# Source/Destination Ordering:
In AT&T syntax (which is the UNIX standard, BTW) the source is always on the left, and the destination is always on the right.
So let’s load ebx with the value in eax:

AT&T: movl %eax, %ebx
Intel: mov ebx, eax

# Constant value/immediate value format:
You must prefix all constant/immediate values with “$”.
Let’s load eax with the address of the “C” variable booga, which is static.

AT&T: movl $_booga, %eax
Intel: mov eax, _booga

Now let’s load ebx with 0xd00d:

AT&T: movl $0xd00d, %ebx
Intel: mov ebx, 0d00dh

# Operator size specification:
You must suffix the instruction with one of b, w, or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (GNU assembler) will attempt to guess. You don’t want GAS to guess, and guess wrong! Don’t forget it.

AT&T: movw %ax, %bx
Intel: mov bx, ax

The equivalent forms for Intel are byte ptr, word ptr, and dword ptr, but that is for when you are…

# Referencing memory:
DJGPP uses 386-protected mode, so you can forget all that real-mode addressing junk, including the restrictions on which register has what default segment, which registers can be base or index pointers. Now, we just get 6 general purpose registers. (7 if you use ebp, but be sure to restore it yourself or compile with -fomit-frame-pointer.)
Here is the canonical format for 32-bit addressing:

AT&T: immed32(basepointer,indexpointer,indexscale)
Intel: [basepointer + indexpointer*indexscale + immed32]

You could think of the formula to calculate the address as:

immed32 + basepointer + indexpointer * indexscale

You don’t have to use all those fields, but you do have to have at least 1 of immed32, basepointer and you MUST add the size suffix to the operator!
Let’s see some simple forms of memory addressing:

* Addressing a particular C variable:

AT&T: _booga
Intel: [_booga]

Note: the underscore (“_”) is how you get at static (global) C variables from assembler under DJGPP; ELF targets such as Linux use the plain name with no underscore. This only works with global variables. Otherwise, you can use extended asm to have variables preloaded into registers for you. I address that farther down.

* Addressing what a register points to:

AT&T: (%eax)
Intel: [eax]

* Addressing a variable offset by a value in a register:

AT&T: _variable(%eax)
Intel: [eax + _variable]

* Addressing a value in an array of integers (scaling up by 4):

AT&T: _array(,%eax,4)
Intel: [eax*4 + _array]

* You can also do offsets with the immediate value:

C code: *(p+1) where p is a char *
AT&T: 1(%eax) where eax has the value of p
Intel: [eax + 1]

* You can do some simple math on the immediate value:

AT&T: _struct_pointer+8

I assume you can do that with Intel format as well.

* Addressing a particular char in an array of 8-character records:
eax holds the number of the record desired. ebx has the wanted char’s offset within the record.

AT&T: _array(%ebx,%eax,8)
Intel: [ebx + eax*8 + _array]

Whew. Hopefully that covers all the addressing you’ll need to do. As a note, you can put esp into the address, but only as the base register.

Exercise and study hard
Thursday, March 9th, 2006

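My health has gotten worse and worse since I started university; last semester I even had a pneumothorax. Sigh.
Recently I decided to play some basketball every morning. I went yesterday morning but skipped my midday nap, so I couldn't get up this morning and made up for it with a game at noon, haha. If I keep at it, my health will improve.
I remember that in high school I could jump and touch the roof beam at home with half a finger to spare. On this trip home I managed to touch it only once, and never again after that...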