80x86 16-bit Compiling How-to
by Alexei A. Frounze

1 Introduction
2 Revising Memory Addressing in Real Mode of 80x86 CPU
2.1 From 8080/8085 to 8086
2.2 16 or 20 Bits? Meet the Segment:Offset Pair!
2.3 More Than 1 MB?
2.4 Which Segment Register?
3 Memory Models Employed by Realmode Compilers
4 The Two Memory Models and File Formats We'll Use
4.1 Tiny Memory Model (.COM)
4.2 Small Memory Model (.EXE)
5 The Two Compiler Sets We'll Use
6 Compiling with Borland/Turbo C/C++ and NASM
6.1 Important details on Borland/Turbo C/C++ compiler
6.2 Calling Conventions and Register Conventions
6.3 Turbo C++ Tools
6.3.1 TCC, C/C++ compiler
6.3.2 TLINK, linker
6.3.3 TLIB, librarian
6.3.4 MAKE
6.4 NASM, assembler (not TASM, has nothing to do with Borland)
7 Compiling with Open Watcom C/C++
7.1 Important details on Open Watcom C/C++ compiler
8 Downloads
9 Work In Progress

1 Introduction

The need for 16-bit code is primarily due to the following facts:

An 80x86 CPU starts up in the real mode, employing its 16-bit addressing scheme
An 80x86 PC BIOS (which is what the CPU starts executing after reset/power on) is mostly 16-bit and cannot be easily used in 32-bit protected mode of the 80386+ CPU
To load the OS kernel from a disk (floppy or hard) it's natural to use the BIOS, when no other I/O drivers are available
To change screen modes, perform power management, etc, it's also natural to use the BIOS functionality (for the same reason as above)

So, one would want 16-bit real mode code to run on the 80x86 PC to take advantage of the BIOS and/or to prepare the switch to the 32-bit protected mode of the CPU, e.g. in bootloaders or OS loaders. For some purposes, pure 16-bit real mode code is enough as well. And you can compile your own ROM BIOS for an embedded x86-based system!

2 Revising Memory Addressing in Real Mode of 80x86 CPU

Let's review realmode 80x86 memory addressing.

2.1 From 8080/8085 to 8086

The Intel 8086 CPU was derived from the Intel 8080/8085 CPU and inherited 16-bit addressing ideas from it. Although it is 16-bit and somewhat compatible with the 8080/8085, the 8086 has an enhanced memory addressing mechanism that isn't limited to 16 address lines: the 8086 has a 20-line-wide address bus. So, unlike the 8080/8085 (which could address up to 2¹⁶ = 65536 bytes of memory, i.e. 64 KB), the 8086 can address up to 2²⁰ = 1048576 bytes of memory, i.e. 1 MB.

Now, let's see how Intel implemented memory addressing...

An 8080/8085 would access its 64 KB of memory using direct and indirect forms of address specification in the CPU instructions.

For example:

Instruction	Action
`LDA 2050H`	Load A (8-bit accumulator register) with byte from memory location 2050H.
`LHLD 0A00H`	Load HL (16-bit register) with word from memory location 0A00H (byte at 0A00H would go to L (least significant half of HL) and byte at 0A01H would go to H (most significant half of HL)).
`MOV A, M`	Load A (8-bit accumulator register) with byte from memory location specified in the 16-bit register HL (M designates accessing memory indirectly thru the HL register).
`LDAX B`	Load A (8-bit accumulator register) with byte from memory location specified in the 16-bit register BC.

Hence, it's very simple with 8080/8085. Either the 16-bit address is a constant value encoded in the CPU instruction and the memory location is accessed directly by using the encoded address (this is direct addressing) or the 16-bit address is contained in a 16-bit register of the CPU (BC or HL in our examples) and this address is read from the register before accessing a memory location by this address (this is indirect addressing).

Now, the 8086 can do the same thing...

Instruction	Action
`MOV AL, [2050H]`	Load AL (least significant half of 16-bit accumulator register AX) with byte from memory location 2050H.
`MOV BX, [0A00H]`	Load BX (16-bit register) with word from memory location 0A00H (byte at 0A00H would go to BL (least significant half of BX) and byte at 0A01H would go to BH (most significant half of BX)).
`MOV AL, [BX]`	Load AL (least significant half of 16-bit accumulator register AX) with byte from memory location specified in the 16-bit register BX.
`LODSB`	Load AL (least significant half of 16-bit accumulator register AX) with byte from memory location specified in the 16-bit register SI (SI is then incremented or decremented by 1, depending on the direction flag).

Same thing.
Almost...

2.2 16 or 20 Bits? Meet the Segment:Offset Pair!

Do you remember that the 8086 has been said to have a 20-bit-wide address bus?
You surely do, don't you?
Then how come the four 8086 instructions above specify only 16 bits of the address?
Where's the leftover, the 4 other bits to make it 20-bit? :)

The fun part is that there's one special address register involved, the DS register (data segment register). The DS register is also a 16-bit register.
The value of the DS register is combined with the 16-bit address specified in the instruction. The combination is a bit tricky. The DS value is shifted left by 4 binary positions (or, equivalently, multiplied by 16) and then added to the 16-bit address specified in the instruction.

Example:

With BX=341BH and DS=123AH, the instruction MOV AL, [BX]

would load the AL register with a byte from memory location 123AH * 16 + 341BH = 123A0H + 341BH = 157BBH. 157BBH is the physical 20-bit address that is placed on the address bus so that the memory value at this address can be transferred to the CPU (or back, from the CPU to memory, with e.g. MOV [BX], AL).

Really simple.

An address of the form 123AH:341BH is referred to as a logical address.
The part that is specified before the colon is referred to as the segment part of the address (or often, for short, just segment). The part that is specified after the colon is referred to as the offset part of the address (for short, just offset or sometimes displacement).

segment:offset pair is a logical address
segment * 16 + offset = physical address

So, with a constant value of segment (say, constant DS; there can be other segment registers used) but with different values of offset, we can address up to 2¹⁶ = 65536 bytes = 64 KB of memory starting at the physical address equal to segment * 16. This 64 KB region of memory is also referred to as a segment. Right, the same word is often used to refer to different things and smart guys are known to do it all the time. :) This is important to remember if you're new to this addressing stuff and its terminology. Hopefully, you'll be able to deduce from context what segment stands for.

By changing the segment value (say, the DS value) and the offset value we can generate all the physical addresses from 0 up to 2²⁰-1, but this is not the upper bound. Technically, if we take segment=0FFFFH and offset=0FFFFH, then we'll end up with a physical address equal to 10FFEFH, which needs 21 bits to be represented. The 8086 CPU has only 20 address lines, so such an address would lose its most significant bit and wrap around zero; in this example the 8086 CPU would access the byte at physical address 0FFEFH instead of 10FFEFH.
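The segment * 16 + offset formula and the 20-bit wrap-around can be checked with a few lines of C. This is only a sketch for experimenting on any modern machine; the function names are mine and not part of any compiler library.

```c
#include <stdint.h>

/* Full segment * 16 + offset sum. The result may need 21 bits
   (the maximum is 0x10FFEF for FFFF:FFFF). */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* What an 8086 places on its 20 address lines: bit 20 of the sum
   is simply lost, so the address wraps around zero. */
uint32_t physical_address_8086(uint16_t segment, uint16_t offset)
{
    return linear_address(segment, offset) & 0xFFFFFul;
}
```

For 123AH:341BH both functions give 157BBH. For 0FFFFH:0FFFFH the first gives 10FFEFH and the second gives 0FFEFH, as described above.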

It is important to mention that many different logical addresses transform to the same physical address. This is an effect of the way the segment:offset pair is transformed into the final, physical address.

Just an example:

123AH * 16 + 341BH = 123A0H + 341BH = 157BBH
1239H * 16 + 342BH = 12390H + 342BH = 157BBH
143AH * 16 + 141BH = 143A0H + 141BH = 157BBH
...
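One way to see whether two logical addresses hit the same byte is to bring both to a canonical form with the offset in the range 0 to 15. The sketch below does that, assuming the 20-bit wrap-around of the 8086; the type and function names are made up for this example.

```c
#include <stdint.h>

typedef struct {
    uint16_t segment;
    uint16_t offset;
} LogicalAddress;

/* Rewrite segment:offset so that the offset is 0..15. Every physical
   address has exactly one such representation. */
LogicalAddress normalize(LogicalAddress a)
{
    uint32_t physical = (((uint32_t)a.segment << 4) + a.offset) & 0xFFFFFul;
    LogicalAddress n;
    n.segment = (uint16_t)(physical >> 4);
    n.offset = (uint16_t)(physical & 0xF);
    return n;
}

/* Nonzero if both logical addresses refer to the same physical byte. */
int same_location(LogicalAddress a, LogicalAddress b)
{
    LogicalAddress na = normalize(a);
    LogicalAddress nb = normalize(b);
    return na.segment == nb.segment && na.offset == nb.offset;
}
```

All three pairs from the example above normalize to 157BH:000BH.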

2.3 More Than 1 MB?

With the introduction of the Intel 80286 CPU, the number of address lines was extended to 24, so on the 80286 you can access memory above the 1 MB mark by using the segment:offset pair. Only FFF0H = 65520 bytes (almost 64 KB) above 1 MB can be accessed this way, and only if you enable the A20 address line (the 8086 had only lines A0 thru A19). For reasons of compatibility with 8086 PCs, the PC engineers added a programmable hardware mechanism on 80286+ based PCs to enable and disable the A20 address line, so that the address wrap-around is possible just like on the 8086. When A20 is disabled, both the 10FFEFH and 0FFEFH physical addresses generated by an 80286+ CPU appear to the memory as physical address 0FFEFH, i.e. address bit 20 (the 21st bit) is always 0.
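The effect of the A20 line on real-mode addresses can be modeled like this. It is a sketch of the visible result only, not of the hardware that switches the line; the function name and the a20_enabled flag are hypothetical.

```c
#include <stdint.h>

/* Address seen by the memory on an 80286+ PC in real mode.
   With A20 disabled, address bit 20 is forced to 0, which
   reproduces the 8086 wrap-around. */
uint32_t bus_address_286(uint16_t segment, uint16_t offset, int a20_enabled)
{
    uint32_t address = ((uint32_t)segment << 4) + offset;
    if (!a20_enabled)
        address &= ~((uint32_t)1 << 20);
    return address;
}
```

With A20 enabled, 0FFFFH:0010H is the first byte above 1 MB (100000H) and 0FFFFH:0FFFFH is the last reachable one (10FFEFH). With A20 disabled, the same two pairs land on 00000H and 0FFEFH.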
We won't discuss details of A20 enabling and disabling here because it's off-topic.
For now, let's just mention that in the protected mode of the Intel 80286 and 80386+ CPUs, it's possible to access much more memory than 1 MB. The 80286 can access up to 16 MB of memory and the 80386 and 80486 can access up to 4 GB. Pentium class CPUs can access even more. That's it about protected mode for now.

2.4 Which Segment Register?

OK. Let's get back to the segment registers... In fact, the 8086 CPU always uses some segment register to read code/data from memory or write data to memory.

The instructions executed by the 8086 CPU are sequentially read from memory using the CS:IP pair of CPU registers (CS is the Code Segment register, IP is the Instruction Pointer register). After execution of an instruction has completed, IP has advanced by the length of that instruction so that the next instruction can be fetched and executed. IP can also be changed by the near jump, call and return instructions, i.e. control is transferred within the 64 KB segment starting at the physical address equal to CS * 16. The far jump, call and return instructions modify both IP and CS and make it possible to transfer control to any part of a program anywhere in the 1 MB of addressable memory. Interrupt and return from interrupt instructions always modify CS and IP, similarly to far call and return instructions.

The 8086 CPU stack is organized with the SS:SP pair of registers (SS is the Stack Segment register, SP is the Stack Pointer register). SP decrements by 2 before a 16-bit word is stored on the stack, and conversely increments by 2 after a 16-bit word is removed from the stack. All interrupt, call and return instructions change SP but not SS.
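The push and pop behaviour just described can be simulated in C. This is a toy model under my own assumptions: the Cpu structure and the function names are invented, and the caller supplies a 1 MB buffer that stands in for real-mode memory.

```c
#include <stdint.h>

#define MEM_SIZE 0x100000ul   /* 1 MB of simulated real-mode memory */

typedef struct {
    uint16_t ss, sp;
    uint8_t *mem;             /* caller-supplied buffer of MEM_SIZE bytes */
} Cpu;

static uint32_t phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & (MEM_SIZE - 1);
}

/* SP decrements by 2 first, then the word is stored at SS:SP
   (low byte at the lower address). SS is never touched. */
void push16(Cpu *cpu, uint16_t value)
{
    cpu->sp = (uint16_t)(cpu->sp - 2);
    cpu->mem[phys(cpu->ss, cpu->sp)] = (uint8_t)(value & 0xFF);
    cpu->mem[phys(cpu->ss, (uint16_t)(cpu->sp + 1))] = (uint8_t)(value >> 8);
}

/* The word is read from SS:SP first, then SP increments by 2. */
uint16_t pop16(Cpu *cpu)
{
    uint16_t lo = cpu->mem[phys(cpu->ss, cpu->sp)];
    uint16_t hi = cpu->mem[phys(cpu->ss, (uint16_t)(cpu->sp + 1))];
    cpu->sp = (uint16_t)(cpu->sp + 2);
    return (uint16_t)(lo | (hi << 8));
}
```

Note that SP is the offset of the last word pushed, so a push with SS=2000H and SP=0100H writes to physical addresses 200FEH and 200FFH.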

Leaving aside instruction fetch (with CS:IP) and stack manipulations (with SS:SP), the interesting thing is how the 8086 CPU transfers data between itself and memory using direct and indirect addressing with registers other than IP and SP. It might look a bit complicated, but here's how it works...

The 8086 CPU registers are:

16-bit register	High byte	Low byte
AX	AH	AL
BX	BH	BL
CX	CH	CL
DX	DH	DL
FLAGS

Just for completeness, here is a description of the 8086 CPU registers:

Register	Description
AX	16-bit Accumulator register, least and most significant halves (AL and AH respectively) are separately accessible. Most suited for/dedicated to the ALU operations and I/O.
BX	16-bit Base register, least and most significant halves (BL and BH respectively) are separately accessible. Can be used as indirect address register when accessing memory.
CX	16-bit Counter register, least and most significant halves (CL and CH respectively) are separately accessible. Can be used to organize loops and repeat string instructions.
DX	16-bit Data register, least and most significant halves (DL and DH respectively) are separately accessible. Used in some special ALU and I/O operations.
FLAGS	16-bit Flags register. Contains control/status flags.
IP	16-bit Instruction Pointer register. Points to an instruction to be executed.
SP	16-bit Stack Pointer register. Points to the last 16-bit word pushed to the stack.
BP	16-bit Base Pointer register. Can be used as indirect address register when accessing memory (handy for stack memory accesses).
SI	16-bit Source Index register. Can be used as indirect address register when accessing memory (used by string instructions).
DI	16-bit Destination Index register. Can be used as indirect address register when accessing memory (used by string instructions).
CS	16-bit Code Segment register. Selects the 64 KB region of memory, from which instructions are fetched and executed by the CPU.
SS	16-bit Stack Segment register. Selects the 64 KB region of memory, where the CPU stack is located.
DS	16-bit Data Segment register. Selects the 64 KB region of memory, with which most of memory reads and writes are done.
ES	16-bit Extra data Segment register. Selects an additional 64 KB region (additional to one selected by DS) of memory, with which more memory reads and writes can be done. Used by string instructions that work with DI.

Now, having introduced all of the 8086 CPU registers, let's see how we can access memory using them for indirect addressing. What if I want to use, say, register SI to indirectly address memory? Which segment register will be used by default in this case? The table below lists all possible addressing modes and the default data segment register used in each of them.

Addressing Mode	Address Operand Format	Default Segment Register
Direct/Displacement	[displacement/offset/label/whatever you call it]	DS
Indirect	[BX]	DS
Indirect	[BP]	SS
Indirect	[SI]	DS
Indirect	[DI]	DS (ES for string instructions)
Indirect+Displacement	[BX+displacement]	DS
Indirect+Displacement	[BP+displacement]	SS
Indirect+Displacement	[SI+displacement]	DS
Indirect+Displacement	[DI+displacement]	DS
Double Indirect+Displacement	[BX][SI]+displacement	DS
Double Indirect+Displacement	[BX][DI]+displacement	DS
Double Indirect+Displacement	[BP][SI]+displacement	SS
Double Indirect+Displacement	[BP][DI]+displacement	SS

Notes:

displacement is a constant 8/16-bit value.
[reg] means that a memory location is being indirectly accessed thru the register reg. The memory address (offset) is contained in the register reg.
[reg+displacement] means that a memory location is being indirectly accessed thru the register reg. The memory address (offset) is the sum of the register reg value and the displacement value.
[reg1][reg2]+displacement means that a memory location is being indirectly accessed thru the two registers reg1 and reg2. The memory address (offset) is the sum of the values of the registers reg1 and reg2 and the displacement value. That is, all three values are added together to form the offset.

To summarize:

Wherever the BP register is used for indirect addressing, SS is used as the default segment register to make up the physical address
Wherever the DI register is used by a string instruction, it's used together with the ES segment register
In all other cases, DS is used as default segment register for accessing data
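The three rules above, together with the offset sums from the notes, fit into two small C functions. This is a sketch of the rules only, with names I made up; it is not how any assembler represents operands.

```c
#include <stdint.h>

typedef enum { SEG_ES, SEG_CS, SEG_SS, SEG_DS } SegReg;
typedef enum { BASE_NONE, BASE_BX, BASE_BP } BaseReg;

/* Default segment register for a data access.
   is_string_destination is nonzero for the DI operand of a string
   instruction; the index register (SI or DI) otherwise plays no role. */
SegReg default_segment(BaseReg base, int is_string_destination)
{
    if (is_string_destination)
        return SEG_ES;
    if (base == BASE_BP)
        return SEG_SS;
    return SEG_DS;
}

/* Offset for [reg1][reg2]+displacement. Pass 0 for a missing part.
   The sum is truncated to 16 bits, so it wraps within the segment. */
uint16_t effective_offset(uint16_t base, uint16_t index, uint16_t disp)
{
    return (uint16_t)(base + index + disp);
}
```

For example, [BP][SI]+5 with BP=0FFF0H and SI=0020H addresses offset 0015H in the SS segment, because the 16-bit sum wraps around.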

If you need to override the use of the default segment register, you can explicitly specify the segment register to use, like so:
MOV AL, CS:MyTable[BP][SI] or
MOV AL, [CS:BP+SI+MyTable] whichever format is supported by your assembler (TASM/MASM/WASM/NASM/etc).
The prefix, consisting of the segment register name and a colon, overrides the default segment register with the one specified before the colon.

3 Memory Models Employed by Realmode Compilers

The following table summarizes the most common memory models employed by 16-bit realmode 80x86 compilers.

Near pointers (in real mode) are 16-bit pointers, consisting only of a 16-bit offset. The default segment register (CS for code, DS/SS for data/stack) is assumed to be constant. Near pointers are small and quick, and need less code to handle.

Far pointers (in real mode) are 32-bit pointers, consisting of both 16-bit parts, segment and offset. Far pointer increment/decrement usually doesn't affect the segment part of the far pointer. Far pointers are big and slow, and need more code to handle.

It is problematic to access objects or arrays bigger than 64 KB with either near or far pointers in HLL (C/C++) compilers because this needs a manual implementation of far pointer arithmetic.
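To show why objects bigger than 64 KB are a problem, here is a model of far pointer arithmetic in portable C. The FarPtr type and both functions are illustrative assumptions; real 16-bit compilers have their own far pointer types, which this sketch doesn't use.

```c
#include <stdint.h>

typedef struct {
    uint16_t segment;
    uint16_t offset;
} FarPtr;

/* Usual far pointer arithmetic: only the offset changes, so the
   pointer wraps around inside its 64 KB segment. */
FarPtr far_add(FarPtr p, uint16_t delta)
{
    p.offset = (uint16_t)(p.offset + delta);
    return p;
}

/* Manual arithmetic that crosses segment boundaries: go through the
   20-bit physical address and rebuild a segment:offset pair from it. */
FarPtr far_add_normalized(FarPtr p, uint32_t delta)
{
    uint32_t physical = ((uint32_t)p.segment << 4) + p.offset;
    physical = (physical + delta) & 0xFFFFFul;
    p.segment = (uint16_t)(physical >> 4);
    p.offset = (uint16_t)(physical & 0xF);
    return p;
}
```

Adding 1 to 1000H:0FFFFH with far_add gives 1000H:0000H, which is 64 KB behind where we wanted to be. far_add_normalized gives 2000H:0000H.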

Memory Model	Code Segment Size, Pointer Type	Data Segment Size, Pointer Type	Description
Tiny	< 64 KB, near	< 64 KB, near	Use the tiny model for small size applications. All four segment registers (CS, DS, ES, SS) are set to the same address, so you have a total of 64 KB for all of your code, data, and stack. Near pointers are always used. Tiny model programs can be compiled to .COM format. SS=ES=DS=CS, always
Small	< 64 KB, near	< 64 KB, near	Use the small model for average size applications. The code and data segments are different and don't overlap, so you have 64 KB of code and 64 KB of data and stack. Near pointers are always used. SS=DS, usually
Medium	< 1 MB, far	< 64 KB, near	The medium model is best for large programs that don't keep much data in memory. Far pointers are used for code but not for data. As a result, data plus stack are limited to 64 KB, but code can occupy up to 1 MB. SS=DS, usually
Compact	< 64 KB, near	< 1 MB, far	Use Compact model if your code is small but you need to address a lot of data. The opposite of the medium model is true for the compact model: far pointers are used for data but not for code; code is then limited to 64 KB, while data has a 1 MB range. All functions are near by default and all data pointers are far by default. SS!=DS, usually
Large	< 1 MB, far	< 1 MB, far	Use the Large model for very large applications only. Far pointers are used for both code and data, giving both a 1 MB range. All functions and data pointers are far by default. SS!=DS, usually
Huge	< 1 MB, far	< 1 MB, far	Use the Huge model for very large applications only. Far pointers are used for both code and data. Turbo C++ normally limits the size of all data to 64 KB; the huge memory model sets aside that limit, allowing data to occupy more than 64 KB. The Huge model allows multiple data segments (each 64 KB in size), up to 1 MB for code, and 64 KB for stack. All functions and data pointers are assumed to be far. SS!=DS, usually

4 The Two Memory Models and File Formats We'll Use

Let's limit ourselves to the two simplest memory models, which, I believe, are sufficient for most of our 16-bit applications. These memory models are Tiny and Small, both with near pointers, never changing segment registers (except when we really need to, e.g. when calling BIOS functions and managing memory outside our program). This makes things easy.

4.1 Tiny Memory Model (.COM)

An application compiled with the Tiny memory model option is usually compiled to the .COM file format, the format well-known from the DOS world. This format is basically a raw/flat executable binary without any relocation information. Since a program in this format is assumed to always have CS=DS=ES=SS and occupy 64 KB at most (i.e. a full 64 KB segment), relocation of such a program in memory is just a simple matter of choosing a segment of memory for the program, loading it there as-is, and setting the segment registers to point to this segment before jumping to the entry point of the program.
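The loading steps just listed can be sketched in C as follows. This is a simplified model, not DOS code: load_com and StartRegs are hypothetical names, the 100h load offset is the one implied by the "ORG 100h" requirement of the .COM format, and the initial SP of FFFEH is an assumption about what DOS does when a full 64 KB segment is available. The size check also ignores the room the stack needs.

```c
#include <stdint.h>
#include <string.h>

#define COM_ENTRY_OFFSET 0x0100u    /* .COM code starts at offset 100h */
#define SEGMENT_SIZE     0x10000ul  /* one full 64 KB segment */

typedef struct {
    uint16_t cs, ds, es, ss, ip, sp;
} StartRegs;

/* Copy a .COM image as-is into a 64 KB segment buffer and fill in the
   start-up register values. Returns 0 on success, -1 if the image
   doesn't fit into the segment. */
int load_com(uint8_t *segment_mem, uint16_t load_segment,
             const uint8_t *image, size_t image_size, StartRegs *regs)
{
    if (image_size > SEGMENT_SIZE - COM_ENTRY_OFFSET)
        return -1;
    memcpy(segment_mem + COM_ENTRY_OFFSET, image, image_size);
    regs->cs = regs->ds = regs->es = regs->ss = load_segment;
    regs->ip = COM_ENTRY_OFFSET;
    regs->sp = 0xFFFE;              /* assumed: stack at the top of the segment */
    return 0;
}
```

No address inside the image is patched, whatever the value of load_segment is. That is all the "relocation" a .COM program needs.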

The good side of the Tiny/.COM model/format is that it's the simplest one for relocation (basically, no relocation is needed) and you can always run your program in DOS, in a DOS box of Windows, or in DOSemu in Linux.

4.2 Small Memory Model (.EXE)

The Small memory model allows making a program whose code and data/stack segments can both be as big as 64 KB because these segments are separate and CS!=DS, unlike Tiny/.COM. An application compiled with the Small memory model option is compiled to the .EXE format, also well-known from the DOS world. Applications in this format can also be relocated in memory, and this format (unlike .COM) keeps relocation information inside as well as the entry point address (CS:IP) and the initial stack configuration (SS:SP), which vary from program to program (unlike .COM, where the entry point and stack configuration are fixed).

DOS allocates segment(s) of memory to load an .EXE, loads the .EXE image (which goes after the .EXE header portion of the file) and performs address fixups inside the loaded image using the relocation information from the .EXE file header, thereby completing relocation. After that DOS sets up the stack and performs a jump to the entry point.
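The fixup step itself is small. Below is a minimal sketch, assuming the usual .EXE relocation table layout where each entry is a 16-bit offset followed by a 16-bit segment, both relative to the start of the loaded image. ExeReloc and apply_relocations are names I chose for the example; parsing the .EXE header is left out.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t offset;    /* offset of the word to patch */
    uint16_t segment;   /* segment of the word to patch, relative to the image */
} ExeReloc;

static uint16_t read16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void write16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Add the load segment to every word listed in the relocation table.
   Returns 0 on success, -1 if an entry points outside the image. */
int apply_relocations(uint8_t *image, size_t image_size,
                      const ExeReloc *relocs, size_t count,
                      uint16_t load_segment)
{
    size_t i;
    for (i = 0; i < count; i++) {
        size_t pos = ((size_t)relocs[i].segment << 4) + relocs[i].offset;
        if (pos + 1 >= image_size)
            return -1;
        write16(image + pos, (uint16_t)(read16(image + pos) + load_segment));
    }
    return 0;
}
```

Each patched word is a segment value that the linker computed as if the image were loaded at segment 0. Adding the actual load segment makes it valid for wherever DOS placed the program.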

I won't discuss .EXE relocation with its address fixups here; the explanation of this process can be found elsewhere (e.g. http://www.wotsit.org/). The .EXE format is best for 16-bit applications, and with it you can do more than with .COM (all other memory models (Medium, Compact, Large, Huge) also naturally compile to .EXE).

5 The Two Compiler Sets We'll Use

As a full-time DOS/Windows user, I found the following popular (and now free!) compilers to be very well suited for compiling non-DOS 16-bit 80x86 realmode applications such as bootloaders and various tools:

Compiler	Free?	16/32 Bit	Assembler	Linker	Librarian	Make	Debugger	IDE	Description
Borland's Turbo C++ 1.01	Free	16-bit	TASM (not free); use NASM or other	TLINK	TLIB	MAKE	TD (not free); use ZD86, WD or other	TC	An old, but very good 16-bit C/C++ compiler by Borland. Has a very nice IDE with an integrated debugger. Unfortunately, the free distribution doesn't include the assembler (TASM) and as a result you can't have inline assembly with this compiler. But if you're not afraid of writing external subroutines in assembly and linking them with the high-level C/C++ code, you can use NASM with Turbo C++. Together they go very well. The free compiler distribution doesn't include a standalone 16-bit debugger (TD), but you may use ZD86, WD or some other.
Open Watcom 1.x C++ compiler	Free	16,32-bit	WASM	WLINK	WLIB	WMAKE	WD	IDE (for win32)	A really good 16/32 bit C/C++ compiler, free, comes with everything (all development tools, documentation, examples, source code), runs under DOS, Windows, OS/2, can be used to compile applications for those OSes. Open Watcom also includes a Fortran compiler.

6 Compiling with Borland/Turbo C/C++ and NASM

6.1 Important details on Borland/Turbo C/C++ compiler

Code is put to _TEXT segment with class CODE and USE16 attribute
Initialized data are put to _DATA segment with class DATA
Example:
```
     const int cvar = 1;
     int var = 2;
     int table[5] = {1, 2, 3, 4, 5};
     char* birds[3] = {"robin", "finch", "wren"};
```
All these variables "cvar", "var", "table" and "birds" and strings "robin", "finch", "wren" are put to the _DATA segment.
Uninitialized data are put to _BSS segment with class BSS.
Example:
```
     int var1;
     int array1[400];
```
For Tiny/.COM model/format the _TEXT, _DATA and _BSS segments are grouped together into the DGROUP group. The _TEXT segment must have an "ORG 100h" or equivalent ("RESB 100h" if NASM is used) directive so that the .COM format is possible. The _TEXT segment must be the first in the DGROUP group.
For Small/.EXE model/format only the _DATA and _BSS segments are grouped together into the DGROUP group. The .EXE stack segment (named _STACK, with attribute STACK and class STACK) is either small and unused (SS:SP is instead initialized by the application to point to the end of the data segment (DGROUP), so that SS=DS=DGROUP) or big enough to be usable (and also grouped into DGROUP, so that SS=DS=DGROUP).
Some of the arithmetic operators (long multiplication, division and shift) and data copying routines (structure copying) are implemented as functions and must be additionally linked with your program.
The compiler prepends an underscore character to the function and variable names when compiling C/C++ code to the object file, e.g. "void MyFunction()" would have the name "_MyFunction" in the object file. Therefore, any external assembly functions must be written with this in mind. An assembler function must have a name with a leading underscore to be accessible from C/C++, e.g. the asm name "_MyAsmFxn" will be seen by the C/C++ code as, say, "extern void MyAsmFxn()". And of course, if MyAsmFxn() needs to call MyFunction(), it must "call _MyFunction" because in the object files the C/C++ names have the leading underscore.
Note: the additional underscore character in function/variable names appears at different positions in Borland/Turbo C/C++ and Open Watcom C/C++. By default Borland/Turbo uses a leading underscore, Watcom uses a trailing underscore.
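When writing the assembly side by hand it's easy to put the underscore at the wrong end, so here is the naming rule expressed as a tiny C helper. It only builds the object-file name from the C name under the two default conventions mentioned above; the function and enum names are hypothetical.

```c
#include <string.h>

typedef enum {
    NAME_LEADING_UNDERSCORE,    /* Borland/Turbo C/C++ default: _name */
    NAME_TRAILING_UNDERSCORE    /* Open Watcom C/C++ default: name_ */
} NameStyle;

/* Write the object-file name for c_name into out.
   Returns 0 on success, -1 if out is too small. */
int decorate_name(char *out, size_t out_size, const char *c_name,
                  NameStyle style)
{
    size_t len = strlen(c_name);
    if (len + 2 > out_size)     /* name + underscore + terminating NUL */
        return -1;
    if (style == NAME_LEADING_UNDERSCORE) {
        out[0] = '_';
        memcpy(out + 1, c_name, len);
    } else {
        memcpy(out, c_name, len);
        out[len] = '_';
    }
    out[len + 1] = '\0';
    return 0;
}
```

So "MyFunction" becomes "_MyFunction" for Turbo C++ and "MyFunction_" for Open Watcom, and these are the names that the assembly code must declare or call.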

6.2 Calling Conventions and Register Conventions

When calling a function, the following is pushed onto the stack, in the specified order:
function arguments from last to first (notice reverse order!),
return address.
The called function never removes its arguments from stack when returning to the caller. The caller pushes arguments to the stack and removes them after the call.
8-bit arguments are extended to 16-bit when pushed on the stack.
Function return values are placed into
AL (8-bit value) or
AX (16-bit value) or
both DX and AX (32-bit value, most significant half goes to DX, least significant half goes to AX). Pointers in the real mode can be 16-bit (near) and 32-bit (far). Segment part of a far pointer goes to DX, while offset part goes to AX.
A function must preserve values of:
DS, SS, BP, SI, DI (remember them for writing functions in assembler).
The direction flag (DF, in FLAGS register) should also be preserved as 0.
The ES register is not guaranteed to be equal to DS. Set ES to value of DS if needed.
The reserved interrupt keyword provides additional entry and exit code for void(void) functions to make them directly usable as interrupt service routines. Their addresses can be directly stored to the interrupt vector table. Remember that the compiler is 16-bit. The entry and exit code of interrupt functions won't save/restore 80386+ 32-bit registers entirely, it will only save/restore AX of EAX, etc. Floating point unit state isn't saved/restored either.
Structure passing and returning by value isn't covered here. Neither is floating point data. If you feel you need this information, you may create a C function that makes use of structures or floating point types and generate assembly source code from it. You may find most of the answers to your questions by inspecting the generated assembly source code. I'm not used to passing structures; for most things a pointer to a structure is enough. And I don't consider floating point support in Turbo C++ serious or really helpful for the kind of stuff OS developers do in the first place.
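For writing assembly functions, the stack layout and the return registers described at the start of this section can be put into numbers. The sketch below assumes the standard stack frame, i.e. the function begins with "push bp" and "mov bp, sp", and that every argument occupies one 16-bit word. The function and type names are invented for the illustration.

```c
#include <stdint.h>

/* Displacement from BP of argument number arg_index (0 = first argument
   in the C prototype). Above BP lie the saved BP (2 bytes) and the
   return address: IP only for a near call, CS:IP for a far call. */
int arg_bp_displacement(int arg_index, int is_far_call)
{
    int saved_bp_size = 2;
    int return_address_size = is_far_call ? 4 : 2;
    return saved_bp_size + return_address_size + 2 * arg_index;
}

typedef struct {
    uint16_t dx;
    uint16_t ax;
} RetPair;

/* A 32-bit value is returned with its most significant half in DX
   and its least significant half in AX. */
RetPair return_long(uint32_t value)
{
    RetPair r;
    r.dx = (uint16_t)(value >> 16);
    r.ax = (uint16_t)(value & 0xFFFF);
    return r;
}

/* A far pointer is returned with the segment in DX and the offset in AX. */
RetPair return_far_pointer(uint16_t segment, uint16_t offset)
{
    RetPair r;
    r.dx = segment;
    r.ax = offset;
    return r;
}
```

In the Tiny and Small models all calls are near, so the first argument is at [BP+4], the second at [BP+6], and so on. Because arguments are pushed from last to first, the first one ends up closest to the return address.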

6.3 Turbo C++ Tools

6.3.1 TCC, C/C++ compiler

Command line:
TCC [options] file[s]

Useful options:

-K- Default char is signed (int and long are signed, likewise char is very often signed)

-1 Generate 80186/80286 instructions

-N- Don't check stack overflow

-k Standard stack frame (arguments referenced thru SS:BP+disp)

-ms Set memory model to small (for .COMs and discussed .EXEs)

-c Compile only (TCC can also link by calling linker and produce executable)

-S Produce assembly output (useful for studying assembler and finding bugs)

-wxxx Warning control

-v Include source level debug information into output object files (useful for TD only)

-y Include source file line number debug information into output object files (useful for TD only)

6.3.2 TLINK, linker

Command line:
TLINK [options] objfiles, executablefile, mapfile, libfiles
Note: you may omit names of executable and/or map file, but commas must remain if there are any file names specified after them (e.g. library files)

Useful options:

/m	Generate map file with public symbols (map files are useful to find link stage bugs, e.g. wrong addresses)
/s	Include detailed map of segments into the map file (map files are useful to find link stage bugs, e.g. wrong addresses)
/n	No default libraries linked
/d	Warn if duplicate symbols in libraries
/c	Case sensitive processing of symbols, e.g. name!=NAME
/3	Enable 32-bit processing (link 32-bit code if any encountered)
/t	Create .COM file (or raw binary; the extension in executablefile must not be .COM for raw binaries or the linker will try to make a .COM and thus will fail)
/v	Include full debug information into executable file (useful for TD only)

Note #1: to avoid any problems with linking, always specify the object file containing the entry point (e.g. c0s.obj or c0t.obj) as the first one to be linked.

Note #2: default libraries are:

c?.lib	Standard C library. ? denotes memory model symbol (s for tiny/small).
math?.lib	Standard C library, mathematical functions. ? denotes memory model symbol (s for tiny/small).
emu.lib	Emulation of 80x87 floating point unit.
fp87.lib	Functions for 80x87 floating point unit.

6.3.3 TLIB, librarian

Command line:
TLIB [/C] [/E] libfile, commands, listfile
Note: commands and listfile are optional

Useful options and commands:

/C	Case-sensitive library
+module	Add module (object file) to library
-module	Remove module (object file) from library
*module	Extract module (object file) from library
-+module +-module	Replace module (object file) in library
-*module *-module	Extract module (object file) and remove it from library

6.3.4 MAKE

MAKE doesn't need tab characters in the makefile where unix make would normally require.
Note: make sure you don't have another make available thru the PATH environment variable if you intend to use this particular MAKE.

6.4 NASM, assembler (not TASM, has nothing to do with Borland)

Command line:
NASM [-o outfilename] [-f format] [-l listfilename] [options] filename

Useful options and commands:

-f obj	Will generate Intel/OMF .OBJ object outfile (compatible with Borland/Turbo C/C++/Pascal compilers) from the specified file.
-F obj	Will generate Borland debug information (useful for TD only).
-Dmacro[=value]	Predefines a macro.
-Umacro	Undefines a macro.

7 Compiling with Open Watcom C/C++

7.1 Important details on Open Watcom C/C++ compiler

Code is put to _TEXT segment with class CODE and USE16 attribute
By default, the data group DGROUP consists of the CONST, CONST2, _DATA and _BSS segments. The compiler places certain types of data in each segment.
The CONST segment (of class DATA) contains constant literals that appear in your source code.
Example:
```
     char* birds[3] = {"robin", "finch", "wren"};
     printf ("Hello world\n");
```
In the above example, the strings "Hello world\n", "robin", "finch", etc. appear in the CONST segment.
The CONST2 segment (of class DATA) contains initialized read-only data.
The _DATA segment (of class DATA) contains initialized writable data.
Example:
```
     const int cvar = 1;
     int var = 2;
     int table[5] = {1, 2, 3, 4, 5};
     char* birds[3] = {"robin", "finch", "wren"};
```
In the above example, the constant variable "cvar" is placed in the CONST2 segment, "var", "table" and "birds" are placed in the _DATA segment. Finally, the strings "robin", "finch", "wren" are placed in the CONST segment.
The _BSS segment (of class BSS) contains uninitialized data such as scalars, structures or arrays.
Example:
```
     int var1;
     int array1[400];
```
For Tiny/.COM model/format the _TEXT, _DATA, CONST, CONST2 and _BSS segments are grouped together into the DGROUP group. The _TEXT segment must have an "ORG 100h" or equivalent ("RESB 100h" if NASM is used) directive so that the .COM format is possible. The _TEXT segment must be the first in the DGROUP group.
For Small/.EXE model/format only the _DATA, CONST, CONST2 and _BSS segments are grouped together into the DGROUP group. The .EXE stack segment (named _STACK, with attribute STACK and class STACK) is either small and unused (SS:SP is instead initialized by the application to point to the end of the data segment (DGROUP), so that SS=DS=DGROUP) or big enough to be usable (and also grouped into DGROUP, so that SS=DS=DGROUP).
Some of the arithmetic operators (long multiplication and division) routines are implemented as functions and must be additionally linked with your program.
By default, the compiler uses the register-based argument passing (unlike Turbo C++). This register convention isn't covered here, but I suppose, it can be deduced from the generated code and from the assembler source codes for the Watcom standard C library.
By default, the compiler appends an underscore character to the function and variable names when compiling C/C++ code to the object file, e.g. "void MyFunction()" would have "MyFunction_" name in the object file. Therefore, any external assembly functions must be written with this in mind. An assembler function must have a name with trailing underscore to be accessible from C/C++, e.g. asm name "MyAsmFxn_" will be seen to the C/C++ code as say "extern void MyAsmFxn()". And of course, if MyAsmFxn() needs to call MyFunction(), it must "call MyFunction_" because in the object files the C/C++ names must have the trailing underscore.
Note: the additional underscore character in function/variable names appears at different positions in Borland/Turbo C/C++ and Open Watcom C/C++. By default, Borland/Turbo uses a leading underscore, while Watcom uses a trailing underscore.
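The two naming rules can be summed up in a few lines of C. The helper below is purely illustrative (decorate_symbol and the enum names are hypothetical, not part of any compiler or library); it just builds the object-file name you would have to use in your assembly source for a given C name.

```c
#include <string.h>

/* Which underscore rule to apply (names are made up for this sketch). */
enum naming {
    NAMING_WATCOM_DEFAULT,   /* trailing underscore: MyFunction -> MyFunction_ */
    NAMING_BORLAND_DEFAULT   /* leading underscore:  MyFunction -> _MyFunction */
};

/* Writes the decorated name into out.
   Returns 0 on success, -1 if out is too small. */
int decorate_symbol(const char *c_name, enum naming conv,
                    char *out, size_t out_size)
{
    size_t len = strlen(c_name);

    if (len + 2 > out_size)          /* name + '_' + terminating NUL */
        return -1;

    if (conv == NAMING_BORLAND_DEFAULT) {
        out[0] = '_';
        memcpy(out + 1, c_name, len + 1);   /* copies the NUL as well */
    } else {
        memcpy(out, c_name, len);
        out[len] = '_';
        out[len + 1] = '\0';
    }
    return 0;
}
```

So for "MyFunction" the Watcom rule gives "MyFunction_" and the Borland/Turbo rule gives "_MyFunction", and these are the names an assembly module has to export or call.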
It is, however, possible to generate code with stack-based argument passing and to link Watcom-compiled code with code whose functions have a leading underscore in the object files. For this, there's a special reserved keyword cdecl (may also be _cdecl and __cdecl). Functions defined as, say, int cdecl fxn (int x); will compile for stack-based argument passing, and the additional underscore in the name will appear in front of the C name, e.g. _fxn. This (cdecl) calling and naming convention is exactly the same as the one adopted by the Turbo C++ compiler, see 6.2 Calling Conventions and Register Conventions.
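Here is a small sketch of how the keyword might be used in a source file that has to build with more than one compiler. The macro STACKCALL and the function call_fxn_twice are my own hypothetical names. I rely on the predefined macros __WATCOMC__ and __TURBOC__ to detect the two compilers, and on other compilers the keyword is simply left out, so there the sketch only shows the syntax, not the convention.

```c
/* Use the cdecl keyword only where the compiler is known to accept it. */
#if defined(__WATCOMC__) || defined(__TURBOC__)
#define STACKCALL cdecl
#else
#define STACKCALL            /* keyword omitted: for illustration only */
#endif

/* Same form as in the text: int cdecl fxn (int x);
   With Watcom this should be emitted as _fxn and take x on the stack. */
int STACKCALL fxn(int x);

int STACKCALL fxn(int x)
{
    return x * 2;
}

/* An ordinary function (default convention) calling the cdecl one;
   the compiler generates the right call sequence from the prototype. */
int call_fxn_twice(int x)
{
    return fxn(fxn(x));
}
```

The important design point is that the keyword goes into the prototype that every caller sees; if a caller compiled with the default convention doesn't see it, the arguments end up in the wrong place.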

8 Downloads

Item

URL

Turbo C++ 1.01

http://community.borland.com/museum/
You will have to register at the Borland/Inprise web site to download the compiler.

NASM 0.98+

http://nasm.sourceforge.net/

Open Watcom 1.2 or 1.5

http://www.openwatcom.org/
C/C++ & Fortran compilers.
You will need these files for DOS/realmode/DPMI development:

File	Description

readme.txt
c_doswin.zip	C compiler (DOS & Win16 hosts)
clib_a16.zip	C runtime libraries (16-bit, All targets)
clib_d16.zip	C runtime libraries (16-bit DOS target)
clib_samples.zip	C runtime library sample programs
cm_clib_a16.zip	C runtime libraries (16-bit, all targets)
cm_clib_a32.zip	C runtime libraries (32-bit, all targets)
cm_clib_d16.zip	C runtime libraries (16-bit DOS target)
cm_clib_d32.zip	C runtime libraries (32-bit DOS target)
cm_clib_hdr.zip	C runtime library header files
cm_core_all.zip	Core binaries (All hosts)
cm_core_dos.zip	Core binaries (DOS host)
cm_core_doswin.zip	Core binaries (DOS & Win hosts)
cm_dbg_all.zip	Debugger (All hosts)
cm_dbg_dos.zip	Debugger, profiler & sampler (DOS host)
cm_dbg_dosos2.zip	Debugger (DOS & OS/2 hosts)
cm_dbg_doswin.zip	Debugger (DOS & Win16 hosts)
cm_dbg_misc1.zip	Debugger (DOS host or target)
cm_hlp_dos.zip	Help files (DOS host)
cm_hlp_win.zip	Help files (Win16 host), may be easier to use
cm_ide_all.zip	IDE (All hosts)
cm_ide_dos.zip	IDE (DOS host)
cm_plib_a16.zip	C++ runtime libraries (16-bit, all targets)
cm_plib_a32.zip	C++ runtime libraries (32-bit, all targets)
cm_samples.zip	Sample programs (all targets)
core_all.zip	Core binaries (All hosts)
core_doswin.zip	Core binaries (DOS & Win16 hosts)
ext_causeway.zip	Causeway DOS extender / DPMI host
ext_dos32a.zip	DOS/32A DOS extender / DPMI host
ext_dos4gw.zip	DOS/4GW DOS extender / DPMI host
ext_pmodew.zip	PMODE/W DOS extender / DPMI host
hlp_dos.zip	Help files (DOS host)
hlp_win.zip	Help files (Win16 host), may be easier to use
ide_samples.zip	Sample IDE files
misc_src.zip	Misc source files and sample programs, including application startup code
plib_a16.zip	C++ runtime libraries (16-bit, all targets)
plib_a32.zip	C++ runtime libraries (32-bit, all targets)
plib_hdr.zip	C++ runtime library header files
plib_samples.zip	C++ runtime library sample programs
open_watcom_1.2.0-src.zip	Open Watcom 1.2 source codes - you may learn more from them

Borland C++ clib src

BCpp31CLibSrc.zip
Borland C++ 3.1 standard C/C++ library source codes. Make your own using its API and helper functions.

SDK for Turbo C++ & NASM

C16SDKTurboNASM.zip

SDK for Open Watcom C/C++

C16SDKWatcom.zip

9 Work In Progress

The work on this document is in progress. Meanwhile, try learning things from the compiler documentation and source codes provided here (already available).

If you want to contact me regarding this doc or anything else, please post a message on the usenet: news:alt.os.development.
To post, use http://groups.google.com/ or http://news.individual.de/.

Alexei A. Frounze
July the 4th, 2004

-K-	Default char is signed (int and long are signed, likewise char is very often signed)
-1	Generate 80186/80286 instructions
-N-	Don't check stack overflow
-k	Standard stack frame (arguments referenced thru SS:BP+disp)
-ms	Set memory model to small (for .COMs and discussed .EXEs)
-c	Compile only (TCC can also link by calling linker and produce executable)
-S	Produce assembly output (useful for studying assembler and finding bugs)
-wxxx	Warning control
-v	Include source level debug information into output object files (useful for TD only)
-y	Include source file line number debug information into output object files (useful for TD only)
