Learn "The Art of Assembly Language" - Chapter 1

07 Sep 2019

Install

Ref:

Download HLA and unzip it to /usr/hla.

Add the following to .bashrc or .zshrc:

# For HLA
export PATH="/usr/hla:$PATH"
alias hla='hla -main:_main -l"macosx_version_min 10.9" -l"lSystem" -l"no_pie"'

Then we can run hla -v hello_world.hla to compile hello_world.hla into the hello_world.o and hello_world files.

We will still get a warning:

ld: warning: The i386 architecture is deprecated for macOS (remove from the Xcode build setting: ARCHS)

We can just ignore it.
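For reference, a minimal hello_world.hla in the style of the book's first program might look like the sketch below. The program name HelloWorld is arbitrary; it only has to match in the program, begin, and end lines.

```hla
program HelloWorld;
#include( "stdlib.hhf" )        // pulls in the HLA Standard Library (stdout, stdin, nl, ...)

begin HelloWorld;

    // print a string followed by a newline
    stdout.put( "Hello, World of Assembly Language", nl );

end HelloWorld;
```

With the alias above, hla -v hello_world.hla should build this into an executable named hello_world.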

Some Basic Declaration

static begins a section that declares static variables.

:= gives a variable its initial value. The initializer must be a constant expression, so it can't be the value of another variable.

static
	i8: int8 := 8;
	i16: int16;
	i32: int32 := -320000;

stdin.get reads a value from standard input and stores it in a variable.

stdout.put accepts multiple parameters.

// for code comments

boolean is the Boolean type: true is 1, false is 0, and a value is stored in 1 byte.

static
	BoolVar: 	boolean;
	HasClass: boolean := false;
	IsClear:	boolean := true;

char holds a single character; a character literal is written in single quotes.

static
	c: char;
	LetterA: char := 'A';
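The declarations above can be combined into a complete program. This is only a sketch: the program name DeclDemo and the prompt text are made up for illustration, and the variable names are taken from the snippets above.

```hla
program DeclDemo;
#include( "stdlib.hhf" )

static
    i8:      int8    := 8;      // initialized with a constant expression
    i32:     int32;             // no initial value given; filled in by stdin.get
    IsClear: boolean := true;
    LetterA: char    := 'A';

begin DeclDemo;

    stdout.put( "Enter an integer: " );
    stdin.get( i32 );           // read a value and store it in i32

    // stdout.put accepts any number of comma-separated items
    stdout.put( "i8 = ", i8, ", i32 = ", i32, nl );
    stdout.put( "IsClear = ", IsClear, ", LetterA = ", LetterA, nl );

end DeclDemo;
```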

CPU Family

A Von Neumann architecture machine contains three building blocks:

  • CPU - central processing unit
  • Memory
  • I/O - input/output

They use the system bus to communicate.

80x86 CPU registers fall into four categories:

  • General-purpose registers
  • Special-purpose application-accessible registers
  • Segment registers
  • Special-purpose kernel-mode registers

Register name:

  • General-purpose
    • Eight 32-bit registers
      • EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
      • The “E” prefix means “extended”
    • Eight 16-bit registers
      • AX, BX, CX, DX, SI, DI, BP, SP
    • Eight 8-bit registers
      • AL, AH, BL, BH, CL, CH, DL, DH

The 80x86 does not provide 24 independent registers. Instead, it overlays the 32-bit registers with the 16-bit registers, and the 16-bit registers with the 8-bit registers.

It means, for general-purpose registers, modifying one register may modify as many as three other registers.
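A small sketch of the overlay in action, assuming the standard library is available. The program name RegOverlay is hypothetical, and the comments describe the values the registers hold, not the exact formatting that stdout.put uses when printing them.

```hla
program RegOverlay;
#include( "stdlib.hhf" )

begin RegOverlay;

    mov( $1234_5678, eax );     // eax = $1234_5678, so ax = $5678, ah = $56, al = $78

    mov( $FF, al );             // writing al also changes ax and eax
    stdout.put( "eax after writing al: ", eax, nl );    // eax now holds $1234_56FF

    mov( 0, ax );               // writing ax changes al, ah, and eax
    stdout.put( "eax after writing ax: ", eax, nl );    // eax now holds $1234_0000

end RegOverlay;
```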

Almost all calculations on the 80x86 CPU involve a register.

Memory Subsystem

A typical 80x86 processor running a modern 32-bit OS can access a maximum of 2^32 different memory locations.

Memory [125] := 0;

CPU := Memory [125];

Modern 80x86 processors don’t connect directly to memory. There is a special buffer on the CPU known as the cache.

For the fastest access, a 4-byte object’s address should be a multiple of 4.

A 2-byte object’s address should be a multiple of 2.

A 1-byte object can sit at any address.

Basic Machine Instructions

mov( a, b ):

Equivalent to b = a.

Technically, the 80x86 instruction set does not allow both operands to be memory variables.

HLA, however, will automatically translate a mov instruction with two word or double-word memory operands into a pair of instructions that copy the data from one location to another.

mov really performs a copy, because it doesn’t change a.

Both operands must be the same size: an 8-bit operand only works with an 8-bit operand, a 16-bit operand only with a 16-bit operand, and a 32-bit operand only with a 32-bit operand.

add( a, b )
sub( a, b )

Equivalent to b = b + a and b = b - a.
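As a rough illustration of mov, add, and sub working together, the sketch below computes i + j - 1 through a register. The program name Arith and the variables i, j, and k are made up for this example.

```hla
program Arith;
#include( "stdlib.hhf" )

static
    i: int32 := 10;
    j: int32 := 3;
    k: int32;

begin Arith;

    mov( i, eax );              // eax = i (i itself is unchanged)
    add( j, eax );              // eax = eax + j
    sub( 1, eax );              // eax = eax - 1
    mov( eax, k );              // k = i + j - 1

    stdout.put( "k = ", k, nl );

end Arith;
```

All operands here are 32 bits wide, which satisfies the same-size rule described above.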

Conditions: Conjunction, Disjunction, and Negation

&& takes precedence over ||.

The ! operator may only prefix a register or a boolean variable.

while(  ) do
endwhile;

for( Initial_Stmt; Termination_Expression; Post_Body_Statement ) do
endfor;

repeat
until(  );

break;
breakif();

forever
endfor;
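The loop forms above can be tried out with a sketch like the following. The program name LoopDemo and the counter i are hypothetical; each loop uses only the constructs listed above.

```hla
program LoopDemo;
#include( "stdlib.hhf" )

static
    i: int32;

begin LoopDemo;

    // while: the condition is tested before each pass
    mov( 0, i );
    while( i < 5 ) do
        stdout.put( "while i = ", i, nl );
        add( 1, i );
    endwhile;

    // for: the same loop with its three parts in the header
    for( mov( 0, i ); i < 5; add( 1, i )) do
        stdout.put( "for i = ", i, nl );
    endfor;

    // repeat..until: the body always runs at least once
    mov( 5, i );
    repeat
        sub( 1, i );
    until( i <= 0 );

    // forever: an endless loop, left here with breakif
    mov( 0, i );
    forever
        add( 1, i );
        breakif( i >= 3 );
    endfor;

end LoopDemo;
```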

try
exception( exceptionID )
exception( exceptionID )
endtry;
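A common use of try is validating numeric input. The sketch below follows the pattern the book uses, with ex.ConversionError and ex.ValueOutOfRange as the exception IDs; the program name TryDemo and the variable names are assumptions for illustration.

```hla
program TryDemo;
#include( "stdlib.hhf" )

static
    intval:      int32;
    GoodInteger: boolean;

begin TryDemo;

    // keep asking until stdin.get succeeds without raising an exception
    repeat
        mov( false, GoodInteger );
        try
            stdout.put( "Enter an integer: " );
            stdin.get( intval );
            mov( true, GoodInteger );   // only reached if no exception occurred

        exception( ex.ConversionError )
            stdout.put( "Illegal numeric value, please re-enter", nl );

        exception( ex.ValueOutOfRange )
            stdout.put( "Value is out of range, please re-enter", nl );

        endtry;
    until( GoodInteger );

    stdout.put( "You entered ", intval, nl );

end TryDemo;
```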

The HLA Standard Library

nl is a constant predefined by the stdio module. It can be used outside the stdio namespace.

stdio.bell, stdio.bs, stdio.tab, stdio.lf, and stdio.cr are constants predefined inside the stdio namespace. They can’t be used outside it, so they must always be written with the stdio. prefix.

Standard IN / OUT

testpgm <input.data
testpgm >output.txt
testpgm <in.txt >output.txt

<infile changes the input source to “infile”.

>outfile redirects the output to “outfile”.

stdout.newln(); is equivalent to stdout.put( nl );

stdout.puti8( var_name ) prints a single one-byte parameter as a signed integer value.

stdout.puti16( var_name ) prints a single two-byte parameter as a signed integer value.

stdout.puti32( var_name ) prints a single four-byte parameter as a signed integer value.

stdout.puti8Size( Value8, width, padchar );

stdout.puti16Size( Value16, width, padchar );

stdout.puti32Size( Value32, width, padchar );

  • When abs(width) < value size (the number of characters the value needs), no padchar is added.
  • When abs(width) > value size and width is positive, the padchar is printed to the left of the value.
  • When abs(width) > value size and width is negative, the padchar is printed to the right of the value.
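The three width rules can be checked with a sketch like this one. The program name WidthDemo, the variable n, and the choice of '.' as the pad character are assumptions for illustration; the brackets only make the padding visible.

```hla
program WidthDemo;
#include( "stdlib.hhf" )

static
    n: int32 := 1234;

begin WidthDemo;

    stdout.put( "[" );
    stdout.puti32Size( n, 8, '.' );     // positive width: pad characters on the left
    stdout.put( "]", nl );

    stdout.put( "[" );
    stdout.puti32Size( n, -8, '.' );    // negative width: pad characters on the right
    stdout.put( "]", nl );

    stdout.put( "[" );
    stdout.puti32Size( n, 2, '.' );     // abs(width) smaller than the value: no padding
    stdout.put( "]", nl );

end WidthDemo;
```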

stdout.put(Value)

stdout.put( Value:width ); e.g. stdout.put( i32:5 );

stdin.getc

stdin.geti8, stdin.geti16, stdin.geti32

In general, the stdin routines read text from the user only when the input buffer is empty. As long as the input buffer contains additional characters, the input routines will attempt to read their data from the buffer.

We can take advantage of this by entering two values on the same line:

stdout.put( "Enter two integer values: " );
stdin.geti32();
mov( eax, intval );
stdin.geti32();
mov( eax, AnotherIntVal );

stdin.readln

stdin.flushInput

The stdin.readln routine is rarely necessary, so you should use stdin.flushInput unless you really need to immediately force the input of a new line of text.

stdin.get()

stdout.put( "Enter two integer values: " );
stdin.get( intval, AnotherIntVal );