anos1.asm

[ORG 0x7c00] ; and it goes on from here for 512 bytes, the so called “bootsector” of the memory map.
; “… It doesn’t matter if you use 0000:7c00 or 07c0:0000, ….”:
; from http://www.brokenthorn.com/Resources/OSDev4.html
; because “Technically, there is exactly 4,096 different combinations of segment:offset that can refer to the same byte in memory — This is for each byte in memory! ”
; that the segment:
; 1/23/2014 degeneracy of address means cpu deals mostly with metaphors for addresses resolving [pm qm and relativity: india indian guy resembling albert einstein welcoming his wife
; from the hospital back home] to real address only when it is called for … gia ba?o shut room door and said bad words … prof mike longo …
; using metaphorical addresses is connected to how the equal sign becomes not the equal sign but becomes an assignment operator
; e.g. “muo^n loa`i” <= “muo^n loa`i va` messageA va` messageB va` messageC va` ….”
; gia ba?O “ba?o to^`n chi’nh nghi~a” [in a world where no one is outside of “muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”,
; each and everything meaning resolves to “muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”]
; offset form is “degenerate” is suggestive of how the origin or ORG is also “degenerate” in the following picture of two circles
; enclosing one another meeting at a common point, the origin:
;                      __
;                     /  \
;                     \()/
; from http://www.numberplanet.com/number/07c0/index.html: The Number 1984 (hex 0x07C0)
; from http://www.numberplanet.com/number/7c00/index.html: The Number 31744 (hex 0x7C00)
; pi = 3.14159265359
; 3:25 AM 1/22/2014 nurse at Women’s Health “…if you’re going to San Francisco, remember to wear some flower on your hair …”
; This is where BIOS loads the bootloader in a typical memory map for a typical pc board
; it is a word, or rather a number, play on the circle and on the British writer George Orwell (0x07C0 is 1984 in decimal, the title of his novel)
; from http://www.glamenv-septzen.net/en/view/6:
;”0x7C00″ was decided by IBM PC 5150 BIOS developer team (Dr. David Bradley).
; As mentioned above, this magic number was born at 1981 and “IBM PC/AT Compat” PC/BIOS
; vendors did not change this value for BIOS and OS’s backward compatibility.
; nothing-kho^ng … 7C0: ba^?y co’/co^? …; C00 ~ sound of a dove or–father and
; pak-ming ho koo stark poster– “CU” ~ “bird” ~ “sex” 00 ~ oolitic ~ egg c ~ sea 0x7c00 ~ seven seas “travel the world and the seven seas ….”
; [the “well” in “Orwell”: “muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”:
; nguye^n thu?y origin is “muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”]
; Perpetual Motion or “forever young” or hoa`i hoa`i ma~i ma~i vi~nh vie^~n “muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”
; jmp $  ;
; or this version
;hang: jmp hang
; the universal programming loop …. in pseudo-code …
; Gia Ba?o is struggling with his two selves “Gia” ~ “wall” ~ “if (maintainable) …” and “Ba?o” ~ “bao” ~ “tolerate” ~ expansion of what’s maintainable to
; ever larger inclusions …
;     {
;       start-ORG-nguye^n-thu?y: maintain-gi`n-giu+~ (“muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”); // in “gia ba?o”, “ba?o” ~ maintain as in “ba?o thu?” …
; push-and-pop-or-sent-and-receive (&messageNEW-hay-tinLA`NH); // tin and shakespeare’s version of “all roads lead to rome”: “doubt thou the stars are fire doubt truth to be a liar but never doubt I loved ‘muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well'”:  1/19/2014 Sunday Service … Gospel ~ Good News Tin La\nh …”Gia Ba?o”:  the “gia” attempts to reach an agreement with the “ba?o” …
; if (maintainable-giu+~-ddu+o+.c) maintain-thi`-giu+~ (“muo^n loa`i va` messageA va` messageB va` messageNEW va` tinLA`NH va`… ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”); // the message “stack” is loaded or push-pop with messages …
; go-to-jump-tro+?-ve^` start-ORG-nguye^n-thu?y: maintain-gi`n-giu+~ (“muo^n loa`i ddu+o+.c so^’ng la^u bi`nh thu+o+`ng; everyone live long and well”);
;     }
;
; from http://forum.osdev.org/viewtopic.php?f=1&t=20933
; The BIOS loads the boot block at physical address 07C00H but there is no guarantee that it is logical address 0000:7C00H or 07C0:0000H.
; To set all segment registers to 0, this should do:
;[org 7C00H]
;    jmp 0:start
;    nop
;//global start
;start:
;    xor ax, ax
;    mov ds, ax
;    mov es, ax
;    mov ss, ax
;    mov sp, …
; I prefer this because it gives you direct [segment:offset 0:offset] access to the Interrupt Vector Table and the BIOS Data Area.
; in fact, this is not only a software problem: hardware-wise, voltages that claim to be “digital” with sharp boundaries/transitions
; and are supposed to be impressed upon the pins of a cpu [when the ocean waves hit a boat, the boat has no choice/computation but to
; respond by bobbing to the waves: tri ha`nh ho+.p nha^’t:  gia ba?o “pho^’i ho+.p” “sie^u nha^n” … ] can be more similar to
; analog voltages (To^n DDi.nh asked about common mode rejection ratio in layman terms, balanced or unbalanced audio cables for the speakers he
; just bought that just arrived), and these analog voltages take a finite amount of time to stabilize to digital form
; from http://geezer.osdevbrasil.net/johnfine/segments.htm:
;WARNING: A 386 needs a very tiny delay (any instruction would be more than enough) after switching to pmode, before it can correctly load a selector
;into a segment register. In one version of my switch to flat real mode I had the selector value in a general register before switching to pmode, and
;the very first instruction after switching to pmode was a fast instruction to MOV that selector to a segment register. Depending on instruction alignment,
;it could corrupt the hidden part of the segment register (on a 386 only). You can safely write to a segment register with the very first instruction after
;switching to pmode if it is a slow instruction like a POP or a far JMP, but not if it is a fast instruction like “MOV DS,BX”. Normally you wouldn’t even
;notice this problem because it is more natural to move the selector to a register right before you move it to the segment register.

;from http://www.supernovah.com/Tutorials/BootSector2.php:
;Setting up CS and IP
;As stated earlier, we cannot be sure if the BIOS set us up with the starting address of 0x7C0:0x0 or 0x0:0x7C00.
;We will use the second segment offset pair to execute our boot sector so we know for sure how the CPU will access
;our code. To do this, our very first instruction will be a far jump that simply jumps to the next instruction.
;The trick is, if we specify a segment, even if it is 0x0, the jmp will be a far jump and the CS register will be
;loaded with the value 0x0 and the IP register will be loaded with the address of the next instruction to be executed.
; jmp 0x0:Start
;This code will set the CS segment to 0x0 and set the IP register to the very next instruction, at label “Start:”, which will be slightly past 0x7C00

; from http://faydoc.tripod.com/cpu/jmp.htm
;Description
; Transfers program control to a different point in the instruction stream without recording return information. The destination (target) operand specifies the address of the instruction being jumped to. This operand can be an immediate value, a general-purpose register, or a memory location.
;
; This instruction can be used to execute four different types of jumps:
; Near jump A jump to an instruction within the current code segment (the segment currently pointed to by the CS register), sometimes referred to as an intrasegment jump.
; Short jump A near jump where the jump range is limited to –128 to +127 from the current EIP value.
; Far jump A jump to an instruction located in a different segment than the current code segment but at the same privilege level, sometimes referred to as an intersegment jump.
; Task switch A jump to an instruction located in a different task.
;
;A task switch can only be executed in protected mode (see Chapter 6, Task Management, in the Intel Architecture Software Developer’s Manual, Volume 3, for information on performing task switches with the JMP instruction).
;
;Near and Short Jumps.  When executing a near jump, the processor jumps to the address (within the current code segment) that is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register). A near jump to a relative offset of 8-bits (rel8) is referred to as a short jump. The CS register is not changed on near and short jumps.
;
; An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32). The operand-size attribute determines the size of the target operand (16 or 32 bits). Absolute offsets are loaded directly into the EIP register. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared to 0s, resulting in a maximum instruction pointer size of 16 bits.
;
; A relative offset (rel8, rel 16, or rel32) is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed 8-, 16-, or 32-bit immediate value. This value is added to the value in the EIP register. (Here, the EIP register contains the address of the instruction following the JMP instruction). When using relative offsets, the opcode (for short vs. near jumps) and the operand-size attribute (for near relative jumps) determines the size of the target operand (8, 16, or 32 bits).
;
;Far Jumps in Real-Address or Virtual-8086 Mode. When executing a far jump in real-address or virtual-8086 mode, the processor jumps to the code segment and offset specified with the target operand. Here the target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). With the pointer method, the segment and address of the called procedure is encoded in the instruction, using a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address immediate. With the indirect method, the target operand specifies a memory location that contains a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address. The far address is loaded directly into the CS and EIP registers. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared to 0s.

; from http://glob.inamidst.com/bootloader:
;So I guess that jmp far 0:0x7C00 works. Maybe. Though then that is probably going to reach the jmp instruction again, so that’ll just loop. You can’t set CS and EIP directly, hence the jump. Perhaps you can also do something like this:
;org 0x7C00
;jmp far 0:start
;start:
;    …
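;
; A hedged worked example of the idea above: how "jmp 0:start" is encoded when it is the
; first instruction of the sector. This assumes NASM, [ORG 0x7c00] and [BITS 16]; the byte
; values follow from the ptr16:16 form of JMP (opcode 0xEA, 16-bit offset first, then segment).
;        org 0x7c00
;        jmp 0x0000:start    ; bytes: EA 05 7C 00 00  (5 bytes, so start = 0x7C05)
;start:                      ; after the jump, CS = 0x0000 and IP = 0x7C05
;        jmp $               ; bytes: EB FE  (short jump to itself)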

; from http://gaztek.sourceforge.net/osdev/boot/gbootsect.txt:
;Creating your own bootsector is simpler than you may think,
;the only requirement is that the bootsector is 512 bytes long, and at
;offset 0x1FE (decimal=510), the word 0xAA55 is placed. This is the first
;thing the BIOS does when the PC boots up, it first looks on the first
;floppy drive at the first sector for 0xAA55 at the end, and if it finds it
;then it loads it into memory, and starts executing it, otherwise it tries the
;primary harddisk, and if that isn’t found it just bombs out with an error.
;You should place your boot sector at:
;  Sector 1
;  Cylinder 0
;  Head 0
;The BIOS loads the bootsector at linear offset 0x7C00, the state of
;the registers are:
;
; DL = Boot drive, 0h = first floppy, 80h = primary harddisk, etc
; CS = 0        (typical, not guaranteed: some BIOSes enter at 07C0:0000, the same physical address)
; IP = 0x7c00
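;
; A minimal sketch of using that register state, assuming the code is assembled with
; [ORG 0x7c00]; "boot_drive" is a hypothetical variable name, not part of this file.
; DS is loaded first because the BIOS does not promise any particular DS value:
;        xor ax, ax
;        mov ds, ax
;        mov [boot_drive], dl    ; remember the BIOS drive number for later INT 13h calls
;        jmp $
;boot_drive: db 0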

; from http://webcache.googleusercontent.com/search?q=cache:F-Lp9kDLukcJ:www.cs.cmu.edu/~410/lectures/L20_Bootstrap.pdf+&cd=25&hl=en&ct=clnk&gl=us:
;Ground Zero
;You turn on the machine
;Execution begins in real mode at a specific memory address
;Real mode – primeval x86 addressing mode
;Only 1 MB of memory is addressable
;First instruction fetch address is 0xFFFF0 (???)
;“End of memory” (20-bit infinity), minus 15…
;Contains a jump to the actual BIOS entry point
;Great, what’s a BIOS?

; a code segment
[SEGMENT CODESEGMENT1 PROGBITS ALIGN=16]  ; SECTION .text USE16 … “.text” is a standard well-known name version of “CODESEGMENT1”
[BITS 16]  ; one directive per line; [] is “primitive form”, BITS 16 is “user-level form”

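start:
;mov     ax,seg DATASEGMENT1  ; not usable here: the bin format has no segment relocations, so SEG is rejected
xor     ax,ax           ; [ORG 0x7c00] assumes segment 0, so use 0 for both DS and SS
mov     ds,ax
mov     ss,ax           ; interrupts are held off for one instruction after a move into SS
mov     sp,0x7c00       ; stack grows downward from just below the boot sector (stacktop is not defined)
jmp     $               ; hang here rather than run on into the padding bytes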

; another code segment
[SEGMENT CODESEGMENT2 PROGBITS FOLLOWS=CODESEGMENT1 ALIGN=16]  ; SECTION .text USE16 … [] is “primitive form”, BITS 16 is “user-level form”
[BITS 16]
; chay tron tinh yeu movie “kho^ng nghe chu’ khuye^n nu+~a dda^u”
; and another code segment
[SEGMENT CODESEGMENT3 PROGBITS FOLLOWS=CODESEGMENT2 ALIGN=16]  ; SECTION .text USE16 … [] is “primitive form”, BITS 16 is “user-level form”
[BITS 16]

; a stack segment
[SEGMENT STACKSEGMENT  PROGBITS FOLLOWS=CODESEGMENT3 ALIGN=16]  ; SECTION .data USE16 … [] is “primitive form”, BITS 16 is “user-level form”
[BITS 16]
; resb 64  ; resb = reserve-byte, length of stack is 64 bytes
;stacktop:
; from http://www.osdever.net/bkerndev/Docs/basickernel.htm
; Remember that a stack actually grows downwards, so we declare the size of the data before declaring the label “stacktop:”

; a data segment
[SEGMENT DATASEGMENT1 PROGBITS FOLLOWS=STACKSEGMENT ALIGN=4]  ; SECTION .data USE16 … [] is “primitive form”, BITS 16 is “user-level form”
[BITS 16]
db    'hello',13,10,'$'   ; so are string constants (plain ASCII quotes are required)

; another data segment
[SEGMENT DATASEGMENT2 PROGBITS FOLLOWS=DATASEGMENT1 ALIGN=4]  ; SECTION .data USE16 … [] is “primitive form”, BITS 16 is “user-level form”
[BITS 16]
db    0x55,0x56,0x57      ; three bytes in succession

; be sure that all the segments above do not add up in size to more than 512 bytes …
; The padding is emitted once, from inside the first section: $$ is the start of the *current*
; section, so ($-$$) only measures CODESEGMENT1 (assumed here to be placed first in the image).
; Caveat: sections that FOLLOW CODESEGMENT1 are then laid out after these 512 bytes,
; that is, outside the single sector the BIOS loads.
[SEGMENT CODESEGMENT1]
end:
times 512 - 2 - ($-$$) db 0 ; Pad remainder of 512-byte “boot sector” with zeros …
dw 0xAA55          ; and the standard PC boot signature (bytes 0x55, 0xAA)
; ye^’n  mentioned $ 500 day earning …

; what if one tries:
;absolute  0x7C00 + 0x200 ; 0x200 is 512 bytes, 0x1FE is 510
;
;    magic_chr    resw    1
; this only reserves memory; no code is generated by the assembler …
;***************************************************************************************************************************
;***** General NOTES *****
; from http://www.bioscentral.com/misc/biosbasics.htm:
; 1.  Power is applied to the computer
;
;   When power is applied to the system and all output voltages from the power supply are good, the power supply will generate a power good signal which is received by the motherboard timer.  When the timer receives this signal, it stops forcing a reset signal to the CPU and the CPU begins processing instructions.
;
;    2.  Actual boot
;
;    The very first instruction performed by a CPU is to read the contents of a specific memory address that is preprogrammed into the CPU.  In the case of x86 based processors, this address is FFFF:0000h.   This is the last 16 bytes of memory at the end of the first megabyte of memory.   The code that the processor reads is actually a jump command (JMP) telling the processor where to go in memory to read the BIOS ROM.  This process is traditionally referred to as the bootstrap, but now commonly referred to as boot and has been broadened to include the entire initialization process from applying power to the final stages of loading the operating system.
;
;   3.  POST
;
;    POST stands for Power On Self Test.  It's a series of individual functions or routines that perform various
;    initialization and tests of the computer's hardware.  BIOS starts with a series of tests of the motherboard hardware:
;    the CPU, math coprocessor, timer ICs, DMA controllers, and IRQ controllers. The order in which these tests are
;    performed varies from motherboard to motherboard.
;
;    Next, the BIOS will look for the presence of video ROM between memory locations C000:0000h and C780:0000h.  If a
;    video BIOS is found, its contents will be tested with a checksum test.  If this test is successful, the BIOS will
;    initialize the video adapter. It will pass control to the video BIOS, which will in turn initialize itself and then
;    hand control back once it is complete.  At this point, you should see things like a manufacturer's logo from the
;    video card manufacturer, a video card description, or the video card BIOS information.
;
;    Next, the BIOS will scan memory from C800:0000h to DF80:0000h in 2KB increments.  It's searching for any other ROMs
;    that might be installed in the computer, such as network adapter cards or SCSI adapter cards. If an adapter ROM is
;    found, its contents are tested with a checksum test.  If the tests pass, the card is initialized. Control will be
;    passed to each ROM for initialization, then the system BIOS will resume control after each BIOS found is done
;    initializing. If these tests fail, you should see an error message displayed telling you “XXXX ROM Error”.  The XXXX
;    indicates the segment address where the faulty ROM was detected.
;
;    Next, BIOS will begin checking memory at 0000:0472h.  This address contains a flag which will tell the BIOS if the
;    system is booting from a cold boot or warm boot.  A value of 1234h at this address tells the BIOS that the system
;    was started from a warm boot. This signature value appears in Intel little endian format, that is, the least
;    significant byte comes first, so it appears in memory as the sequence 34h 12h.
;    In the event of a warm boot, the BIOS will skip the remaining POST routines.  If a cold start is indicated, the
;    remaining POST routines will be run.
;
;    During the POST test, a single hexadecimal code will be written to port 80h.  Some other PCs send these codes to
;    other ports however. Compaq sends them to port 84h, IBM PS/2 model 25 and 30 send them to port 90h, model 20-286
;    sends them to port 190h. Some EISA machines with an Award BIOS send them to port 300h and systems with the MCA
;    architecture send them to port 680h. Some early AT&T, Olivetti, NCR and other AT clones send them to the printer
;    port at 3BCh, 278h or 378h. This code will signify what is being tested at any given moment.   Typically, when the
;    BIOS fails at some point, this code will tell you what is failing.
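;
; A hedged sketch of reading the warm-boot flag described above from real mode.
; "warm_boot" is a hypothetical label; the address 0000:0472h and the value 1234h are the ones quoted.
;        xor ax, ax
;        mov es, ax
;        cmp word [es:0x0472], 0x1234  ; stored in memory as the bytes 34h 12h
;        je  warm_boot                 ; flag set: the restart was a warm boot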
;
;   4.  Looking for the Operating System
;
;    Once POST is complete and no errors found, the BIOS will begin searching for an operating system.   Typically, the BIOS will look for a DOS Volume Boot Sector on the floppy drive.   If no operating system is found, it will search the next location, the hard drive C.  If the floppy drive (A), has a bootable floppy in it, the BIOS will load sector 1, head 0, cylinder 0 from the disk into memory starting at location 0000:7C00h.  The first program to load will be IO.SYS, then MSDOS.SYS.  If the floppy does not contain a DOS volume boot sector, then BIOS will next search the computers hard drive for a master partition boot sector and load it into memory at 0000:7C00h.  There are some occasions in which you will encounter problems with the proper loading of the Volume Boot Sector.  Below are some of those:
;
;            A.  If the first byte of the Volume Boot Sector is less than 6h, then you will receive a message similar to “Diskette boot record error”.
;
;            B.  If the IO.SYS or MSDOS.SYS are not the first two files in the Volume Boot Sector, then you will see a message similar to “Non-system disk or disk error”.
;
;            C.  If the Volume Boot Sector is corrupt or missing, you will get a message similar to “Disk boot failure”
;
;Once the BIOS has searched for a bootable floppy device, it should turn its attention to the next boot device it's programmed to look for.  The next device is typically the hard drive, or C.   Like a floppy drive, the BIOS will attempt to load the Volume Boot Sector from sector 1, head 0, cylinder 0 from the Master Boot Sector, or MBS, into memory starting at 0000:7C00h.  The BIOS will check the last two bytes of the MBS.  They should be 55h and AAh respectively.  If they are not, then you will receive an error message similar to “No boot device available” and “System initialization will halt”.  If they are correct, then the BIOS will continue the loading process.   At this point, the BIOS will scan the MBR in search of any extended partitions.   If any extended partitions are identified, the original boot sector will search for a boot indicator byte which indicates an active and bootable partition.  If it cannot find one, you will receive a message similar to “Invalid partition table”.
;
;At this point, once an active partition is found, the BIOS will search for a Volume Boot Sector on the bootable partition and load the VBS into memory and test it.  If the VBS is not readable or corrupt, you will see a message similar to “Error loading operating system”.  At this point, the BIOS will read the last two bytes of the VBS.  These bytes should be 55h and AAh respectively.  If they are not, then you will see a message similar to “Missing operating system”.  It is at this point that the BIOS will begin loading of the operating system.
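;
; A minimal sketch of the boot-indicator search described above. It assumes the MBR is still
; at 0000:7C00, DS = 0, and the classic layout of four 16-byte partition entries starting at
; offset 0x1BE, where a first byte of 0x80 marks the active partition. All labels are hypothetical.
;        mov si, 0x7c00 + 0x1be      ; first partition table entry
;        mov cx, 4                   ; four primary entries
;next_entry:
;        cmp byte [si], 0x80         ; boot indicator: 0x80 = active
;        je  found_active
;        add si, 16                  ; each entry is 16 bytes
;        loop next_entry
;        jmp no_active_partition     ; where “Invalid partition table” would be reported
;found_active:                       ; SI now points at the active entry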

; from http://en.wikipedia.org/wiki/BIOS
;When the x86 processor is reset, it loads its program counter with a fixed address near the top of the 1 megabyte real-mode address space. The address of the BIOS’s memory is located such that it will be executed when the computer is first started up. A jump instruction then directs the processor to start executing code in the BIOS. If the system has just been powered up or the reset button was pressed (“cold boot”), the full power-on self-test (POST) is run. If Ctrl+Alt+Delete was initiated (“warm boot”), a special flag value is detected in Nonvolatile memory (NVRAM) and the BIOS does not run the POST. This saves the time otherwise used to detect and test all memory. The NVRAM is in the real-time clock (RTC).
;
;The power-on self-test tests, identifies, and initializes system devices such as the CPU, RAM, interrupt and DMA controllers and other parts of the chipset, video display card, keyboard, hard disk drive, optical disc drive and other basic hardware. The BIOS then locates boot loader software held on a storage device designated as a ‘boot device’, such as a hard disk, a floppy disk, CD, or DVD, and loads and executes that software, giving it control of the PC.[8] This process is known as booting, or booting up, which is short for “bootstrapping”.
;
;Boot devices[edit]
;
;The BIOS selects candidate boot devices using information collected by POST and configuration information from EEPROM, CMOS RAM or, in the earliest PCs, DIP switches. Option ROMs may also influence or supplant the boot process defined by the motherboard BIOS ROM. The BIOS checks each device in order to see if it is bootable. For a disk drive or a device that logically emulates a disk drive, such as a USB Flash drive or perhaps a tape drive, to perform this check the BIOS attempts to load the first sector (boot sector) from the disk to memory address 0x007C00, and checks for the boot sector signature 0x55 0xAA in the last two bytes of the (512 byte long) sector. If the sector cannot be read (due to a missing or blank disk, or due to a hardware failure), or if the sector does not end with the boot signature, the BIOS considers the disk unbootable and proceeds to check the next device. Another device such as a network adapter attempts booting by a procedure that is defined by its option ROM (or the equivalent integrated into the motherboard BIOS ROM). The BIOS proceeds to test each device sequentially until a bootable device is found, at which time the BIOS transfers control to the loaded sector with a jump instruction to its first byte at address 0x007C00 (1 KiB below the 32 KiB mark).

;from http://www.brokenthorn.com/Resources/OSDev7.html:
;General x86 Real Mode Memory Map:
;•0x00000000 – 0x000003FF – Real Mode Interrupt Vector Table
;•0x00000400 – 0x000004FF – BIOS Data Area
;•0x00000500 – 0x00007BFF – Unused
;•0x00007C00 – 0x00007DFF – Our Bootloader
;•0x00007E00 – 0x0009FFFF – Unused
;•0x000A0000 – 0x000BFFFF – Video RAM (VRAM) Memory
;•0x000B0000 – 0x000B7FFF – Monochrome Video Memory
;•0x000B8000 – 0x000BFFFF – Color Video Memory
;•0x000C0000 – 0x000C7FFF – Video ROM BIOS
;•0x000C8000 – 0x000EFFFF – BIOS Shadow Area
;•0x000F0000 – 0x000FFFFF – System BIOS
;
;Note: It is possible to remap all of the above devices to use different regions of memory. This is what the BIOS POST does to map the devices to the table above.
;
;Okay, this is cool and all. Because these addresses represent different things, by reading (or writing) to specific addresses, we can obtain (or change) information with ease from different parts of the computer.
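;
; For instance, a hedged sketch of "writing to a specific address": putting one character in the
; top-left screen cell through color video memory at 0xB8000. This assumes the BIOS left the
; display in 80x25 color text mode, where each cell is a character byte followed by an attribute byte.
;        mov ax, 0xb800
;        mov es, ax                    ; ES:0000 = physical 0xB8000
;        mov word [es:0x0000], 0x0f41  ; 'A' (0x41) with attribute 0x0F, bright white on black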
;***** NOTES *****

; from http://wiki.osdev.org/Babystep1:
; jmp $  ;
; or this version
;hang: jmp hang
; Near and Short Jumps.  When executing a near jump, the processor jumps to the address (within the current code segment) that
;is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of
;the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the
;EIP register). A near jump to a relative offset of 8-bits (rel8) is referred to as a short jump. The CS register is not changed on
; near and short jumps.

; degeneracy of the segment:offset scheme of addressing means it’s a “metaphor”: the cpu interfaces with the “muo^n loa`i” expansion
; as a metaphor resolving it to a real physical address …

; from http://wiki.osdev.org/Babystep2:
; In real mode, addresses are calculated as segment * 16 + offset. Since offset can be much larger than 16, there are many pairs
; of segment and offset that point to the same address. For instance, some say that the bootloader is loaded at 0000:7C00,
; while others say 07C0:0000. This is in fact the same address: 16 * 0x0000 + 0x7C00 = 16 * 0x07C0 + 0x0000 = 0x7C00.
; It doesn’t matter if you use 0000:7c00 or 07c0:0000, but if you use ORG you need to be aware of what’s happening. By default,
; the start of a raw binary is at offset 0, but if you need it you can change the offset to something different and make it work.
; For instance the following snippet accesses the variable msg with segment 0x7C0.
; segment:offset  ds:offset   0x07C0:offset-from-0
; ; boot.asm
; ; by default, [ORG 0]
;   mov ax, 0x07c0
;   mov ds, ax
; or the other version:
; segment:offset ds:offset 0:offset-from-0x7C00
; ; boot.asm
; ; [ORG 0x7c00]
;   xor ax, ax ; make it zero
;   mov ds, ax
; from http://geezer.osdevbrasil.net/johnfine/segments.htm : In real mode the CPU shifts the segment
; register value left by four places (multiplying it by 16) and adds the 16 bit offset to get a 20 bit physical address.
; Any physical address can be represented in multiple ways, with different segments and offsets. For
; example, physical address 0x210 can be 0020:0010, 0000:0210, or 0021:0000.
; thus,
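; the arithmetic above can be checked at assembly time. This is a small sketch using NASM
; preprocessor conditionals, so it emits no bytes into the image; it assumes nothing beyond
; physical = segment * 16 + offset.
%if (0x07c0 << 4) + 0x0000 != (0x0000 << 4) + 0x7c00
  %error "07C0:0000 and 0000:7C00 should be the same physical address"
%endif
%if (0x0020 << 4) + 0x0010 != 0x210
  %error "0020:0010 should be physical 0x210"
%endif
%if (0x0021 << 4) + 0x0000 != 0x210
  %error "0021:0000 should be physical 0x210"
%endif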

; from NASM manual:
;The bin format provides an additional directive to the list given in chapter 6: ORG.
; The function of the ORG directive is to specify the origin address which NASM will assume the program begins at when it is loaded into memory.

;For example, the following code will generate the longword 0x00000104:
;
;        org     0x100
;        dd      label
;label:
; guess: the statement “dd label” is not a “critical expression”, so on the first pass NASM does not yet know the value of
; the argument to “dd”, namely “label”; on the second pass NASM has figured out that “label”
; is the address of “label:”, which is one doubleword (4 bytes) past “org”, that is 0x100 + 4 = 0x104 …
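;
; The same reasoning applied to this file's origin, as a hedged sketch (the label name is
; arbitrary). With a 2-byte "dw" in front of it, the label sits 2 bytes past the origin:
;        org     0x7c00
;        dw      label       ; assembles to the word 0x7C02 (bytes 02 7C)
;label: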

;ALIGN is used, as shown above, to specify how many low bits of the segment start address must be forced to zero [that is, “round off” or “modulo”].
;nasm default
;section .text    code  align=16
;section .data    data  align=4
;section .bss     bss   align=4

; from http://ece425web.groups.et.byu.net/stable/labs/NASM.html:
;The align directive allows programmers to align their code to word, dword, or larger boundaries in memory. To align to a word boundary, the following
;line of assembly could be used:
; align 2 ; Align to nearest 2-byte boundary
;This will cause an unused byte to be inserted if the address of the next instruction or data would have been odd. The parameter given to align must be a
;power of 2. Code and data alignment are important in ensuring memory performance.
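;
; A short sketch of what that padding looks like in practice. The offsets assume the fragment
; starts at an aligned address, and "db 0" is given as the filler because NASM's ALIGN
; otherwise pads with NOP instructions:
;        db      0x01            ; offset 0
;        align   4, db 0         ; offsets 1..3 become zero bytes
;value:  dd      0x12345678      ; offset 4, now dword-aligned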
; When linking several .OBJ files into a .EXE file, you should ensure that exactly one of them has a start point
; defined (using the ..start special symbol defined by the obj format: see section 6.2.6). If no module defines
; a start point, the linker will not know what value to give the entry-point field in the output file header; if
; more than one defines a start point, the linker will not know which value to use.

; OMF linkers require exactly one of the object files being linked to define the program entry point, where execution will begin when
; the program is run. If the object file that defines the entry point is assembled using NASM, you specify the entry point by declaring
; the special symbol ..start at the point where you wish execution to begin.
; An example of a NASM source file which can be assembled to a .OBJ file and linked on its own to a .EXE is given here.
; It demonstrates the basic principles of defining a stack, initialising the segment registers, and declaring a start point.
; This file is also provided in the test subdirectory of the NASM archives, under the name objexe.asm.
; This initial piece of code sets up DS to point to the data segment, and initialises SS and SP to point to the top of the
; provided stack. Notice that interrupts are implicitly disabled for one instruction after a move into SS, precisely for this
; situation, so that there’s no chance of an interrupt occurring between the loads of SS and SP and not having a stack to execute on.

;NASM contains no mechanism to support the various C memory models directly; you have to keep track yourself of which one you are writing for. This means you have to keep track of the following things:
;•In models using a single code segment (tiny, small and compact), functions are near. This means that function pointers, when stored in data segments or pushed on the stack as function arguments, are
;16 bits long and contain only an offset field (the CS register never changes its value, and always gives the segment part of the full function address), and that functions are called using ordinary near
;CALL instructions and return using RETN (which, in NASM, is synonymous with RET anyway). This means both that you should write your own routines to return with RETN, and that you should call external C
;routines with near CALL instructions.
;•In models using more than one code segment (medium, large and huge), functions are far. This means that function pointers are 32 bits long (consisting of a 16-bit offset followed by a 16-bit segment),
;and that functions are called using CALL FAR (or CALL seg:offset) and return using RETF. Again, you should therefore write your own routines to return with RETF and use CALL FAR to call external routines.
;•In models using a single data segment (tiny, small and medium), data pointers are 16 bits long, containing only an offset field (the DS register doesn’t change its value, and always gives the segment
;part of the full data item address).
;•In models using more than one data segment (compact, large and huge), data pointers are 32 bits long, consisting of a 16-bit offset followed by a 16-bit segment. You should still be careful not to modify
;DS in your routines without restoring it afterwards, but ES is free for you to use to access the contents of 32-bit data pointers you are passed.
;•The huge memory model allows single data items to exceed 64K in size. In all other memory models, you can access the whole of a data item just by doing arithmetic on the offset field of the pointer you
;are given, whether a segment field is present or not; in huge model, you have to be more careful of your pointer arithmetic.
;•In most memory models, there is a default data segment, whose segment address is kept in DS throughout the program. This data segment is typically the same segment as the stack, kept in SS, so that
;functions’ local variables (which are stored on the stack) and global data items can both be accessed easily without changing DS. Particularly large data items are typically stored in other segments. However,
;some memory models (though not the standard ones, usually) allow the assumption that SS and DS hold the same value to be removed. Be careful about functions’ local variables in this latter case.
;
;In models with a single code segment, the segment is called _TEXT, so your code segment must also go by this name in order to be linked into the same place as the main code segment. In models with a single
;data segment, or with a default data segment, it is called _DATA.
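;
; A minimal sketch of the near and far cases described above, assuming the usual 16-bit C calling convention (arguments
; pushed right to left, BP-based stack frame). The names _near_get, _far_get and MYFILE_TEXT are hypothetical and are not
; part of this boot sector:
;          segment _TEXT                   ; small model: the single code segment
;          global  _near_get
;_near_get:
;          push    bp
;          mov     bp,sp
;          mov     bx,[bp+4]               ; 16-bit data pointer, above saved BP (2) and near return address (2)
;          mov     ax,[bx]                 ; DS already holds the only data segment
;          pop     bp
;          retn                            ; near return: pops the offset only
;
;          segment MYFILE_TEXT             ; large model: one of several code segments (name assumed)
;          global  _far_get
;_far_get:
;          push    bp
;          mov     bp,sp
;          les     bx,[bp+6]               ; 32-bit data pointer, above saved BP (2) and far return address (4)
;          mov     ax,[es:bx]              ; go through ES so that DS is left untouched
;          pop     bp
;          retf                            ; far return: pops offset and segment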

; the advantage of using the SEGMENT directive to “label” code:
; When you define a segment in an obj file, NASM defines the segment name as a symbol as well, so that you can access the segment
; address of the segment. So, for example:
;          segment data
;dvar:     dw 1234
;          segment code
;function: mov ax,data            ; get segment address of data
;          mov ds,ax              ; and move it into DS
;          inc word [dvar]        ; now this reference will work
; ith, bin, …: this is a flat memory image format with no support for relocation or linking.

;7.1.3 Multisection Support for the bin Format

;The bin format allows the use of multiple sections, of arbitrary names, besides the “known” .text, .data, and .bss names.
;•Sections may be designated progbits or nobits. Default is progbits (except .bss, which defaults to nobits, of course).
;•Sections can be aligned at a specified boundary following the previous section with align=, or at an arbitrary byte-granular position with start=.
;•Sections can be given a virtual start address, which will be used for the calculation of all memory references within that section with vstart=.
;•Sections can be ordered using follows=<section> or vfollows=<section> as an alternative to specifying an explicit start address.
;•Arguments to org, start, vstart, and align= are critical expressions. See section 3.8. E.g. align=(1 << ALIGN_SHIFT) – ALIGN_SHIFT must be defined before it is used here.
;•Any code which comes before an explicit SECTION directive is directed by default into the .text section.
;•If an ORG statement is not given, ORG 0 is used by default.
;•The .bss section will be placed after the last progbits section, unless start=, vstart=, follows=, or vfollows= has been specified.
;•All sections are aligned on dword boundaries, unless a different alignment has been specified.
;•Sections may not overlap.
;•NASM creates the section.<secname>.start for each section, which may be used in your code.
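;
; A minimal sketch of the options listed above for a bin image, assuming a hypothetical second section named stage2 and
; a hypothetical constant ALIGN_SHIFT (neither appears elsewhere in this file):
;ALIGN_SHIFT equ 4                         ; defined before use, because align= takes a critical expression
;          org     0x7c00
;          mov     ax,section.stage2.start ; code before any SECTION directive goes into .text
;          jmp     stage2_entry
;          section stage2 follows=.text align=(1 << ALIGN_SHIFT)
;stage2_entry:
;          hlt
;          jmp     stage2_entry
;          section .bss                    ; nobits by default: reserves addresses, emits no bytes into the file
;buffer:   resb    64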
;7.4.1 obj Extensions to the SEGMENT Directive
;
;The obj output format extends the SEGMENT (or SECTION) directive to allow you to specify various properties of the segment you are defining. This is done by appending extra qualifiers to the end of the segment-definition line. For example,
;
;segment code private align=16
;
;
;defines the segment code, but also declares it to be a private segment, and requires that the portion of it described in this code module must be aligned on a 16-byte boundary.
;
;The available qualifiers are:
;•PRIVATE, PUBLIC, COMMON and STACK specify the combination characteristics of the segment. PRIVATE segments do not get combined with any others by the linker; PUBLIC and STACK segments get concatenated together at link time; and COMMON segments all get overlaid on top of each other rather than stuck end-to-end.
;•ALIGN is used, as shown above, to specify how many low bits of the segment start address must be forced to zero. The alignment value given may be any power of two from 1 to 4096; in reality, the only values supported are 1, 2, 4, 16, 256 and 4096, so if 8 is specified it will be rounded up to 16, and 32, 64 and 128 will all be rounded up to 256, and so on. Note that alignment to 4096-byte boundaries is a PharLap extension to the format and may not be supported by all linkers.
;•CLASS can be used to specify the segment class; this feature indicates to the linker that segments of the same class should be placed near each other in the output file. The class name can be any word, e.g. CLASS=CODE.
;•OVERLAY, like CLASS, is specified with an arbitrary word as an argument, and provides overlay information to an overlay-capable linker.
;•Segments can be declared as USE16 or USE32, which has the effect of recording the choice in the object file and also ensuring that NASM’s default assembly mode when assembling in that segment is 16-bit or 32-bit respectively.
;•When writing OS/2 object files, you should declare 32-bit segments as FLAT, which causes the default segment base for anything in the segment to be the special group FLAT, and also defines the group if it is not already defined.
;•The obj file format also allows segments to be declared as having a pre-defined absolute segment address, although no linkers are currently known to make sensible use of this feature; nevertheless, NASM allows you to declare a segment such as SEGMENT SCREEN ABSOLUTE=0xB800 if you need to. The ABSOLUTE and ALIGN keywords are mutually exclusive.
;
;NASM’s default segment attributes are PUBLIC, ALIGN=1, no class, no overlay, and USE16.
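;
; A short sketch of how these qualifiers combine on a segment-definition line in the obj format. The segment and class
; names are illustrative assumptions, apart from the SCREEN example quoted above:
;          segment code   public align=16 class=CODE use16
;          segment data   public align=4  class=DATA
;          segment stack  stack  align=16 class=STACK
;          segment screen absolute=0xB800  ; no ALIGN here: ABSOLUTE and ALIGN are mutually exclusive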
; An example of a NASM source file which can be assembled to a .OBJ file and linked on its own to a .EXE is given here.
; It demonstrates the basic principles of defining a stack, initialising the segment registers, and declaring a start point.
; This file is also provided in the test subdirectory of the NASM archives, under the name objexe.asm.

;from http://www.supernovah.com/Tutorials/BootSector2.php : The processor uses the SS:SP segment offset address to determine the
;location of the stack. We must also clear the interrupt flag because we set the stack segment register. Setting the stack segment
;may cause an interrupt to be fired. Calling cli will prevent this from happening. After we set up the stack, we will re-enable
;interrupts. Ignore the fact that we disable interrupts right after re-enabling them. This won’t be the case much longer.
;..start:
;        mov     ax,DATASEGMENT1
;        mov     ds,ax
;        mov     ax,STACKSEGMENT
;        mov     ss,ax
;        mov     sp,stacktop
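;
; For the fragment above to link as an obj module, the two segments and the stacktop label have to be defined somewhere.
; A minimal sketch, assuming the names used above, a hypothetical 64-byte stack and a hypothetical data item:
;          segment DATASEGMENT1
;hello:    db      'hello, world',13,10,'$'
;          segment STACKSEGMENT stack
;          resb    64
;stacktop:
;
; A bin boot sector has no named segments to load from. An equivalent sketch with explicit cli/sti, assuming the stack
; is placed to grow down from 0000:7C00, just below the loaded boot sector:
;          cli                             ; belt and braces: no interrupts while SS:SP is inconsistent
;          xor     ax,ax
;          mov     ss,ax
;          mov     sp,0x7c00
;          sti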

;NASM’s directives come in two types: user-level directives and primitive directives. Typically, each directive has a user-level
;form and a primitive form. In almost all cases, we recommend that users use the user-level forms of the directives, which are
;implemented as macros which call the primitive forms.
;Primitive directives are enclosed in square brackets; user-level directives are not.
;The BITS directive specifies whether NASM should generate code designed to run on a processor operating in 16-bit mode, 32-bit mode or 64-bit mode. The syntax is BITS XX, where XX is 16, 32 or 64.

;In most cases, you should not need to use BITS explicitly. The aout, coff, elf, macho, win32 and win64 object formats, which are designed for use in 32-bit or 64-bit operating systems, all cause NASM to select 32-bit or 64-bit mode, respectively, by default. The obj object format allows you to specify each segment you define as either USE16 or USE32, and NASM will set its operating mode accordingly, so the use of the BITS directive is once again unnecessary.
;The most likely reason for using the BITS directive is to write 32-bit or 64-bit code in a flat binary file; this is because the bin output format defaults to 16-bit mode in anticipation of it being used most frequently to write DOS .COM programs, DOS .SYS device drivers and boot loader software.
;You do not need to specify BITS 32 merely in order to use 32-bit instructions in a 16-bit DOS program; if you do, the assembler will generate incorrect code because it will be writing code targeted at a 32-bit platform, to be run on a 16-bit one.
;When NASM is in BITS 16 mode, instructions which use 32-bit data are prefixed with an 0x66 byte, and those referring to 32-bit addresses have an 0x67 prefix. In BITS 32 mode, the reverse is true: 32-bit instructions require no prefixes, whereas instructions using 16-bit data need an 0x66 and those working on 16-bit addresses need an 0x67.
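;
; A small sketch of the prefix rule above. The byte sequences in the comments follow from the standard x86 encodings of
; these instructions; assembling with a listing file (nasm -l) is an easy way to check them:
;          bits 16
;          mov     ax,1                    ; B8 01 00            native 16-bit operand, no prefix
;          mov     eax,1                   ; 66 B8 01 00 00 00   32-bit operand needs 0x66
;          mov     ax,[ebx]                ; 67 8B 03            32-bit address needs 0x67
;          bits 32
;          mov     eax,1                   ; B8 01 00 00 00      native 32-bit operand, no prefix
;          mov     ax,1                    ; 66 B8 01 00         16-bit operand needs 0x66
;          mov     eax,[bx]                ; 67 8B 07            16-bit address needs 0x67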
;When NASM is in BITS 64 mode, most instructions operate the same as they do for BITS 32 mode. However, there are 8 more general and SSE registers, and 16-bit addressing is no longer supported.
;The default address size is 64 bits; 32-bit addressing can be selected with the 0x67 prefix. The default operand size is still 32 bits, however, and the 0x66 prefix selects 16-bit operand size. The REX prefix is used both to select 64-bit operand size, and to access the new registers. NASM automatically inserts REX prefixes when necessary.
;When the REX prefix is used, the processor does not know how to address the AH, BH, CH or DH (high 8-bit legacy) registers. Instead, it is possible to access the low 8 bits of the SP, BP, SI and DI registers as SPL, BPL, SIL and DIL, respectively; but only when the REX prefix is used.
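;
; A sketch of the BITS 64 rules above; the encodings in the comments follow from the standard x86-64 instruction formats:
;          bits 64
;          mov     eax,ebx                 ; 89 D8       default 32-bit operand size, no prefix
;          mov     ax,bx                   ; 66 89 D8    0x66 selects 16-bit operand size
;          mov     rax,rbx                 ; 48 89 D8    REX.W selects 64-bit operand size
;          mov     eax,[ebx]               ; 67 8B 03    0x67 selects 32-bit addressing
;          mov     r8d,eax                 ; 41 89 C0    REX also reaches the new registers
;          mov     sil,al                  ; 40 88 C6    SIL is reachable only with a REX prefix
;          ;mov    ah,sil                  ; rejected by NASM: AH cannot be encoded together with REX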

;There are two approaches to storing data in memory called big endian and little endian. Big endian order means that the most
; significant byte (or word) is stored first in memory. That is, at a lower memory address. Intel IA-32 processors store data in little endian order.
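;
; A short sketch of what little endian order means for data emitted by NASM on x86:
;          dw      0xAA55                  ; stored as 55 AA
;          dd      0x12345678              ; stored as 78 56 34 12
;          db      0x12,0x34               ; single bytes are stored in the order written: 12 34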

; from http://forum.osdev.org/viewtopic.php?f=1&t=20933:
; Also, those “0x55 0xAA” magic bytes really are meant to be at offset 511 and 512 [counting from 1, i.e. zero-based offsets 510 and 511] in the first sector (rather than the last 2 bytes of the sector).
; This might seem like it’s exactly the same thing, until you consider (for e.g.) floppy disks that are formatted with 1024-byte sectors or larger
; sectors (which is something that the BIOS is meant to support, but also something that I’d assume most BIOSs have bugs/problems with).
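;
; With ordinary 512-byte sectors, the usual NASM idiom for a bin boot sector puts the signature in the 511th and 512th
; bytes, that is, at zero-based offsets 510 and 511. A minimal sketch, placed at the very end of the source:
;          times 510-($-$$) db 0           ; pad from the current position up to offset 510
;          dw      0xAA55                  ; little endian: 0x55 at offset 510, 0xAA at offset 511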
;

 

 
