Reference Manual
for the 8051 microcontroller
Instruction Set Simulator
Simon Southwell
August 2012
Contents
Introduction
The 8051 ISS package
comprises an instruction set simulator,
modelling the 8051 8 bit processor. It implements all the non-optional
features, and most of the optional features for that core. It has a C
compatible API and is extensible to include additional modelled functionality. The
source is free-software, released under the terms of the GNU licence (see
LICENSE included in the package).
Features
Features include
- All supported core instructions
- Built-in internal and external memory
- Built-in code memory
- Standard SFRs
- Execution timing model based on the Silion Labs CIP-51
core
- Model of timer 0 and 1
- Configurable user callbacks
- Interrupt generating callback
- External memory access callback
- SFR access callback
- General purpose callback
- Serial interface callback
- Configurable execution breakpoints
- On a given address
- After a fixed number of cycles
- On 'loop to self' locking instruction
- Access to internal memory and SFRs
- Access to external memory
- Compatible with GNU tool chain
The code is a simple exercise in modelling an 8 bit embedded
CPU. It comes with absolutely no warranties for accuracy, or fitness for any
given purpose, and is provided 'as-is'. Hopefully it is useful for someone, and
feel free to extend and enhance the model, and maybe let me know how it's
going.
Simon Southwell (simon@anita-simulators.org.uk)
Cambridge, August 2012
Source Files
Listed and described here are those source files that make
up the 8051 ISS library (e.g. libcpu8051.so).
These are the source files needed for integration into other C environments.
The source files for the example executable and test bench program, cpu8051, are not
described (i.e. cpu8051.c).
These are still freely available for use, under the terms of the GNU public
license, but do not form part of the core functionality of the simulator, and
are not documented.
The main header files comprise those listed below:
- src/cpu8051.h
- src/8051.h
- src/read_ihx.h
For integrating the model with external programs only cpu8051.h needs be
included in source code that references the API. The 8051.h header is only used by the internal
source files, and includes all the definitions and types needed by this code.
The read_ihx.h
header is only used by the internal source code to interface the program
loading code.
The following listed files define the functions that belong
to the 8051 ISS. The functions are split over several files, but all belong to
the 8051 core model.
- src/execute.c
- src/inst.c
- src/read_ihx.c
Building Code
Included in the package is a makefile to build the code under Linux or
Cygwin, and support is also provided for MSVC 2010. Under the UN*X systems, by
default (i.e. simply typing 'make')
it will build the following:
- cpu8051
- libcpu8051.a
- libcpu8051.so
The first is an executable for running simple programs,
particularly the self-test programs provided in the package-"Testing"
section below. The next two are a static and dynamic library respectively, and
are the libraries an external program can use to link with the model, choosing
the appropriate one depending whether static or dynamic linking was most
appropriate for the particular application. The API for the libraries, and its
use, is described in the "API" section below.
The makefile also, by default, builds the code with debug
information (with gcc
option -g) and
as position-independent code (with option -fPIC---though this option is not needed in
Cygwin). These are defined in the make variable "COPTS", and can be overridden at
the make command line.
Support for MSVC 2010 is provided, with a solution file (.sln) in the msvc/ directory, along
with the minimal set of project files to read in to the MSVC 2010 IDE, and
compile and run the model, but if MSBuild.exe is in the PATH under Cygwin, then the makefile has support
to build from the command line with "make MSVC". The MSBuild.exe executable is part of
Microsoft.NET, and thus can normally be found in a directory (for example,
under Cygwin) such as:
<cdrive_path>/Windows/Microsoft.NET/Framework/v4.0.30319
The <cdrive_path>
is the Cygwin path to the windows disk (most likely /cygdrive/c) and the final directory
name will depend on the particular version of Microsoft NET installed. For 64
bit machines, a 64 bit version of the executable will be under Framework64.
By default, a make build for MSVC builds a "Release"
executable, which is placed in the same directory as for the other builds of
the makefile. If a "Debug" version is required, then the default can
be overridden via the MSVCCONF
make variable---i.e.:
make
MSVCCONF="Debug" MSVC
Like the make for UN*X, the MSVC build produces a cpu8051.exe executable,
but only a single library, libcpu8051.dll.
API
The API to the model is a C interface that consists of a set
of functions for configuring the model, setting control of program flow, and
running executable code. Definitions are provided in cpu8051.h needed to communicate with
some of these functions, and set their parameters. This is all described in the
sections to follow. A summary of the API functions is given below.
void set_verbosity_lvl
(int lvl);
void set_output_stream
(FILE* file_pointer);
void set_disable_lock_break
();
void clr_disable_lock_break
();
int run_program
(char* ihx_fname, int run_cycles, int
break_addr, int timer_enable);
int register_int_callback
(pintcallback_t callback_func, int
callback_type);
int register_time_callback
(ptimecallback_t callback_func);
int register_ext_mem_callback
(pmemcallback_t callback_func);
int register_sfr_callback
(pmemcallback_t callback_func);
int get_iram_byte
(int addr);
void set_iram_byte
(int addr, int
data);
int get_ext_ram_byte
(int addr);
void set_ext_ram_byte
(int addr, int
data);
int get_cycle_time
(void);
Initialisation
There is little initialisation to be done on the 8051 model,
and it can be run using all the default settings. The following allow some changes
to the default settings before code execution:
void
set_verbosity_lvl (int lvl)
|
default
|
VERBOSITY_LVL_OFF
|
description
|
Allows the verbosity level to be changed. Its single
parameter has only two valid values currently:
VERBOSITY_LVL_OFF 0
VERBOSITY_LVL_1 1
As well as at initialisation, this can be called during
execution. This is useful for debugging of long programs, which would
generate a large output if verbosity specified from time 0. A break can be set
up to return at a known point before the area of interest, and verbosity
increased before continuing.
|
void
set_output_stream (FILE* file_pointer)
|
default
|
stdout
|
description
|
Set the file pointer to which all output is directed (e.g.
verbose output)
|
void
set_disable_lock_break ()
void
clr_disable_lock_break ()
|
default
|
Breaking on lock condition enabled
|
description
|
Set/clear the disable for breaking on lock condition,
i.e.:
label0 : SJMP label0 (0x80 0xfe)
When enabled, the 'jump to self' instruction above will
cause a program execution break.
|
Execution and Breakpoints
A single function call is used to run a program. The
function loads a program from the specified filename, with two configurable breakpoint
arguments and an enable for the timers.
int run_program (char* ihx_fname, int
run_cycles, int break_addr,
int timer_enable)
|
default
|
Returns NO_ERROR
|
description
|
The ihx_fname
specifies the program filename. This must be a valid intel hex file format
8051 program, otherwise an error results and a non-zero value is returned.
The number of cycles to run is specified in the run_cycles
parameter. This can be any valid positive number < 2^31, after which the
program execution will break. Two special values are defined:
FOREVER
ONCE
The first of these disables the break on cycle count, and
the second single steps through the code.
A break address (break_addr) may also be specified which will terminate
the program execution if the PC hits the specified address. If the address is
less than 0, the PC cannot reach this address, and breaking on address is disabled.
The function returns and integer:
NO_ERROR -- returned successfully
> NO_ERROR -- breakpoint or error
|
Callbacks
The model supports four types of user defined callbacks that
can be registered with the model. The first is a general purpose callback that is
called at regular intervals, of lengths from a single instruction execution to
a user defined sleep time. A similar callback is called regularly but,
additionally, can return an interrupt status. A third callback can be
registered for invocation on any access to external memory, whilst the final
callback is called on any SFR access *not* to a standard SFR register, to allow
extension of the SFR space (e.g. for SFR mapper peripherals).
int
register_time_callback (ptimecallback_t callback_func)
|
default
|
No callback
|
description
|
Registers a callback function to be called at set regular
times. The registered callback functions must be of type ptimecallback_t,
i.e.
int cb_func (int time, int *wakeup_time)
When the function is called, the current model time is
passed in as the argument "time". By default the callback is invoked after
every instructions execution, but the function can delay this invocation to a
later cycle by returning a cycle time in the pointer "wakeup_time". The
function can return an absolute cycle count, or as a delta by adding the
offset to the value passed in as "time".
This callback is useful for extending the model with
functionality that causes no exceptions, such as a debug output port.
|
int
register_int_callback (pintcallback_t callback_func, int callback_type)
|
default
|
No callback
|
description
|
Registers a callback function to be called at set regular
times, but also can generate an interrupt exception to the core. The
registered callback functions must be of type pintcallback_t, i.e.
int cb_func (int time, int *wakeup_time)
When the function is called, the current model time is
passed in as the argument "time". By default the callback is
invoked after every instructions execution, but the function can delay this
invocation to a later cycle by returning a cycle time in the pointer "wakeup_time". The
function can return an absolute cycle count, or as a delta by adding the
offset to the value passed in as "time".
If the callback returns 0, then no exception is generated
in the core. If the callback returns >0, then an interrupt is generated.
The type of the interrupt is defined at registration with the "callback_type"
argument. This can be one of two values:
INT_CALLBACK_EXT0
INT_CALLBACK_EXT1
These map to the two external interrupts defined for the
8051.
This callback is useful for extending the model with
functionality that cause exceptions, such as a UART
|
int
register_ext_mem_callback (pmemcallback_t callback_func)
|
default
|
No callback
|
description
|
Registers a callback function that is called for any
access to external memory. The function must be of type pmemcallback_t,
i.e.:
int cb_func (int addr, int data, int rnw, uint8_t *mem, int time)
Upon invocation, an address is supplied (addr), along with a
data value
(valid only on writes), a rnw
flag indicating read (1) or write (0), a pointer to a memory buffer (*mem) and the
current time (time).
If the callback does not model functionality located at
the supplied address, then the function must access the buffer pointed to by mem[] on behalf of
the model, returning any read value. If the callback models the functionality
at the supplied address then it may respond as necessary, including returning
any read value, and must not updated mem[].
This callback is useful for adding peripherals to the
core, with memory mapped registers in the external memory space.
|
int
register_sfr_callback (pmemcallback_t callback_func)
|
default
|
No callback
|
description
|
Registers a callback function that is called for any
access to the SFR registers outside of the standard set. The function must be
of type pmemcallback_t,
i.e.:
int cb_func (int addr, int data, int rnw, uint8_t *mem, int time)
The functionality is essentially the same as for the
external memory callback (see above), but for non-standard SFR accesses.
|
Memory Access
The API allows inspection of the memory of the model, both
external memory and internal memory (including SFRs). If callbacks are
registered for access to these memories, these will be invoked by calls to these
functions, allowing access to memory modelled externally to the core.
int get_iram_byte (int
addr)
void set_iram_byte (int
addr, int data)
|
description
|
Access internal memory, including SFRs. get_iram_byte does
a read access to addr,
returning the byte value. set_iram_byte
does a write access to addr,
writing the byte value in data.
|
int get_ext_ram_byte
(int addr)
void set_ext_ram_byte
(int addr, int data)
|
description
|
Access external memories. get_ext_ram_byte does a read access to addr, returning the
byte value. set_ext_ram_byte
does a write access to addr,
writing the byte value in data.
|
Internal State Access
Currently a single function is implemented for internal
state access, fetching the current cycle time.
int get_cycle_time
(void)
|
description
|
return the current model cycle count.
|
Timing Model
The timings for various 8051 implementations are not standard
and vary anywhere from 12 cycles for a particular instruction to 1 cycle for
the same instruction in another device. The timing model used here is based on
that for the Silicon Laboratories' CIP-51 core (see [1], section 8.1.1).
Each instruction has an base execution cycle count, which is
added to the internal cycle count of the core when the instruction is executed.
In addition, the jump instructions have an extra cycle if the jump is
performed, which is added appropriately.
Source Code Architecture
It is not the intention to go into minute detail for the
internal architecture of the model here, but a brief overview of the main
program flow, internal state, and major structures is in order, to allow anyone
wishing to understand or modify the code enough of a handle, that they can
explore the details on their own.
Main execution flow
Below is shown some pseudo-code of the main program flow
when executing a program. The main functions are shown as "<funcname>()",
and the phrases between "<"
and ">"
describe local functionality. The indentation of the pseudo-code shows the calling
hierarchy as implemented in the code.
run_program(){
if timers enabled...
register_time_cb( timer() )
end if
// Load program
read_ihx()
reset_cpu()
// Main execution loop
while no breakpoint condition ...
process_interrupts() {
if interrupts enabled and not already
interrupting ...
if ext mem 0 callback and not sleeping ...
if ext_int_cb_0()
check if interrupting
if interrupting ...
interrupt(EXT0_INT_VECTOR)
end if
end if
end if
if timer 0 callback and not sleeping ...
if timer_cb() ...
check if interrupting
if interrupting ...
interrupt(TIM0_INT_VECTOR)
end if
end if
end if
if ext mem 1 callback and not sleeping ...
if ext_int_cb_1()
check if interrupting
if interrupting ...
interrupt(EXT1_INT_VECTOR)
end if
end if
end if
if timer 1 callback and not sleeping ...
if timer_cb() ...
check if interrupting
if interrupting ...
interrupt(TIM1_INT_VECTOR)
end if
end if
end if
if serial interface callback and not
sleeping ...
if serial_cb() ...
check if interrupting
if interrupting ...
interrupt(SER_INT_VECTOR)
end if
end if
end if
end if
}
if timer callback ...
timer()
end if
// Execution instruction
execute() {
fetch next opcode from code_mem[pc]
lookup instruction function and decode data from
decode_table[opcode]
fetch any argument bytes from code_mem[pc++]
if verbose...
print verbose output
execute instruction *func()
}
<process breakpoints>
end while
}
The above pseudo-code is a rough outline only. The main
structure is a while loop that continues until a breakpoint (such as reaching a
break address or executing a specified number of cycles etc.). Interrupts are
processed first, with a hierarchy of inspection, as dictated by the 8051
functionality. Peripherals are then processed, with the timers being the only
internal peripherals, before the execute() function is called to decode and execute the
next opcode. After execution, any new breakpoint state is processed to flag for
the next loop iteration.
Key Model State
The list below shows some of the key model state used in the
model.
- code_mem[],
ext_ram[], int_ram[] : internal 8 bit memories
- acc,
sp, pc, b : variables containing the specific 8051 register state
- *r
: pointer to the currently active bank of r registers
- dptr,
tcon, tmod, tl0, tl1, th0, th1, scon, sbuf, pcon, p0, p1, p2, p3, psw, ie,
ip : Variables holding the standard function register state.
- cycle_count
: counter of execution cycles.
- int_level
: current model interrupt level
- decode_table[]
: table of instruction decode information. See the Decode Table section below
for more details.
Decode Table
At the heart of the execution of the model is a decode table
used for quick lookup of decode information for a given instruction's opcode. The
decode table consists of 256 entries with the following structure type:
typedef struct
{
pfunc_t func; // Pointer to execution function
char*
instr_name; // Instruction name
int
instr_size; // Instruction size in bytes
int
clk_cycles; // Number of clock cycles for
instruction execution
int
addr_mode_op1; // Addressing mode operand 1
int
addr_mode_op2; // Addressing mode operand 2
} DecodeData_t, *pDecodeData_t;
It is a constant table, and held in the global 'decode_table'
variable, initialised at compilation. The instr_name field is a string for
verbose/debug purposes, whilst the instruction size defines the number of bytes
used by the instruction. The timing model information for instruction is
defined in the clk_cycles
field, and is taken directly from [1]. The addr_mode_op1 and addr_mode_op2 fields define the
addressing mode operand's 1 and 2 (or just operand 1, or both don't care, as
appropriate for the instruction. These can be one of the following:
- ACC
: accumulator
- REG[0-7]
: registers 0 to 7
- DPTR
: data pointer
- REL
: relative
- CODE
: code address
- CC
: carry flag
- BIT
: bit address
- NBIT
: bit address
- BREG
: register B
- IMM16
: immediate, 16 bits
- PC
: program counter
All instructions have the same structure type in the decode
table, but fields not relevant for some instructions are set to 'don't care'
definitions. An example entry is show below:
{DIV,
"DIV AB ", 1, 8, ACC, BREG}
During execution, the opcode is used to index into the
decode table and the entry retrieved. If further bytes are required, as
indicated by "instr_size",
then these are fetched.
A new structure type is populated with this information,
and has the following definition:
struct DecodeStruct {
uint8_t opcode;
// Instruction opcode
uint8_t arg0;
// First byte (where applicable)
uint8_t arg1;
// Second byte (where applicable)
pDecodeData_t decode;
// Pointer to entry in decode table for opcode
};
The opcode
field is set to the raw fetch opcode byte, whilst arg0 and arg1 are populated with any subsequent opcode
bytes. The decode table entry is pointed to by decode, and the function pointed to by "func" is called,
passing in the DecodeStruct
variable just constructed.
Instruction Functions
The actual instruction execution functions are defined in the
source file inst.c,
and all have a similar basic format. An example is shown below for the divide
instruction.
void DIV
(pDecode_t d) {
int op1, op2, tmp;
// Update the cycle count
cycle_count += d->decode->clk_cycles;
fetch_arg(d->decode->addr_mode_op1, &op1, d->arg0, d->arg1);
fetch_arg(d->decode->addr_mode_op2, &op2, d->arg0, d->arg1);
SET_PSW_OV(psw, op2 == 0 ? 1 : 0);
// Clear carry
SET_PSW_CY(psw, 0);
if (op2) {
tmp = op1 % op2;
op1 = op1 / op2;
}
write_arg(d->decode->addr_mode_op1, op1, d->arg0);
write_arg(d->decode->addr_mode_op2, tmp, d->arg0);
pc
+= d->decode->instr_size;
}
The function is passed in the decode table entry and adds
the clk_cycles count
to the master cycle_count
time state. A local function "fetch_arg()" is used to get the values indicated by op1 and op2
operands, based on the passed in addressing mode fields from the decode table
entry.
Testing
A set of assembler programs were developed and are provided
for execution on cpu8051, that execute
a range of self-tests to verify the model. These tests are all directed tests, but
cover nearly all aspects of the model including all instructions and all exceptions.
Each program lives in a solitary directory under the directory test/<category>/ and each
sub-directory has a single source file, '<category>.asm'.
These tests are self-checking and return a value 0x99
in internal memory location 0x7f if
the test passes, or 0xbb if it fails
(if the program never terminates cleanly, then this value is undefined-but is
unlikely to be the pass value).
Executing Tests
The tests are all run via a 'runtest.sh'
script that lives in the test/ directory.
Changing directory to 'test/' and
running the 'runtest' script will
execute all the tests, giving a pass/fail criteria for each. An easier way to
execute the tests is to use the makefile.
When building code, a command 'make test'
will get the build up-to-date, and then run the test script. The tail end of
the output should be something like that shown below:
.
.
jz : PASS
push : PASS
mov : PASS
movc : PASS
xch : PASS
reti : PASS
There are currently 27 tests that cover all instructions and
an EXT0 exception test. Note that some of the tests (e.g. rl/) actually cover
multiple instructions: A full list is given below.
- acall
includes testing of lcall,
ret
- rl
includes testing of rlc,
rr, rrc
- jb
includes testing of jbc,
jnb
- jc
includes testing of jnc
- jz
includes testing of jnz
- jmp
includes testing of sjmp,
ljmp, ajmp
- push
includes testing of pop
- movc
includes testing of movx
- xch
includes testing of xchd,
swap
All the others tests are for the single instructions
specified by the name of the test directory:
- add
- addc
- subb
- anl
- orl
- xrl
- cjne
- clr
- setb
- cpl
- da
- dec
- inc
- div
- mul
- djnz
- mov
- reti
Downloads
The model is released under version 3 of the GPL, and comes with no warranties whatsoever.
A copy of the license is included.
The cpu8051 package is available for download from github. As
well as all the source code, make files and MSVC 2010 file, the package contains all the
test assembly code, and scripts to run them.
Further Reading
[1] C8051F52x/F53x Data Sheet, Rev 1.4, Silicon
Laboratories, April 2012
[2] KEIL/ARM 8051
instruction manual website,
www.keil.com/support/man/docs/is51/is51_instructions.htm
[3] 8052.com tutorial: http://www.8052.com/tut8051
[4] "Using as, the GNU Assembler", version 2.19.51, 2009
Copyright © 2012-2014 Simon Southwell
simon@anita-simulators.org.uk