Assembler Delays

by Vasily Koudymov
When programming the PIC16 family of microcontrollers, it is sometimes necessary to do absolutely nothing for
a certain number of cycles, thereby causing a real world delay of some amount of time. This can be useful when
programming a clock or frequency generator, for example, and it is often easier to implement a delay
loop than to use a built-in TMR timer.
This document will use the following PIC16 assembly instructions:
decfsz f,d:
Although f refers to a memory location, when describing operations performed upon f it is easier to write
something like [ f = (f - 1) ] to mean “the value at the address which is called f will now be equal
to the original value at address f with one subtracted from it.” Strictly speaking, a statement such as
“the new value of f will be (f - 1)” is imprecise, but throughout this document it is shorthand for the
longer idea, because it makes following the operations much easier.
The instruction decfsz performs the operation (f - 1). If d is substituted with the character 'W', the result
(f - 1) will be placed in the Working Register. If d is substituted with the character 'F', it will be placed back
into f such that [ f = (f - 1) ].
With regards to the cycle time of the decfsz instruction, it goes as follows: “one cycle if (f - 1) is not equal
to 0, in which case the next instruction is executed, and two cycles if (f - 1) is equal to 0, in which case the
instruction immediately after is discarded.” This is irrespective of whether d is 'W' or 'F'.
Examples:
; W = 0x00
; file = 0x05
decfsz file,F ; 1 cycle, will execute 'decfsz file,W'
; W = 0x00
; file = 0x04
decfsz file,W ; 1 cycle, will execute 'movlw 0x01'
; W = 0x03
; file = 0x04
movlw 0x01 ; 1 cycle, move the value 0x01 to Working Register
movwf file ; 1 cycle, file = 0x01
; W = 0x01
; file = 0x01
decfsz file,F ; 2 cycles, file = (file - 1) is zero, so discard next instruction
movlw 0x20 ; discarded, we ignore
; W = 0x01
; file = 0x00
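As a cross-check of the rules above, the behaviour of decfsz can be modelled off-chip. The following is a minimal Python sketch, not PIC code; the function name decfsz_model and the shape of its return value are assumptions made purely for illustration.

```python
def decfsz_model(value):
    """Model decfsz on an 8-bit register.

    Returns (result, cycles, skip_next), where skip_next is True when the
    instruction following decfsz would be discarded.
    """
    result = (value - 1) & 0xFF      # 8-bit arithmetic: 0x00 - 1 wraps to 0xFF
    skip_next = (result == 0)        # a zero result discards the next instruction
    cycles = 2 if skip_next else 1   # the skip costs one extra cycle
    return result, cycles, skip_next


print(decfsz_model(0x05))  # (4, 1, False)
print(decfsz_model(0x01))  # (0, 2, True)
```

Where the result is stored (W or f) does not affect the cycle count, so the model leaves the destination out.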
goto k:
This command goes to the address k in the program memory. Although the “goto” statement is looked down upon
in higher level languages, in assembly languages it is necessary to complete most programs. What is
great about Microchip's assembler, which we will use to assemble our PIC16 code, is that it allows using relative
addresses for k rather than absolute addresses.
For example, 0x1234 is an absolute address which refers to the address 0x1234 in the PIC16's program memory.
Please do not confuse program memory with RAM. RAM is where file registers are located, and they can be
modified at runtime. Program memory, however, holds the actual program code which you upload to the
microcontroller, and for the purposes of this document it is read only while the program is running.
In contrast, relative addresses work as follows: if the character '$' is substituted for k, then it refers to
the current address of the goto instruction (resulting in an infinite loop). If “$+1” is substituted for k, then
it will go to the code one instruction below the goto. If k is replaced with “$-5”, it will go to the code
5 instructions above the goto. In all cases, goto takes two cycles to execute.
Examples:
clrf PORTA ; 1 cycle - clear PORTA file register.
goto $-1 ; 2 cycles - go one instruction above and continue execution.
In the above case, it would cause an infinite loop where PORTA is constantly cleared.
goto $+2 ; 2 cycles - go two instructions below and continue execution.
movlw 0x00 ; 1 cycle - place the value 0x00 in the W register.
andlw 0x01 ; 1 cycle - AND 0x01 and the contents of W register.
In the above case, movlw 0x00 is skipped over due to the goto, and it takes 3 cycles altogether.
Let's combine the decfsz and goto instructions to create the simplest loop:
decfsz aa, F
goto $-1
We will refer to this as a one stage delay loop, but what does it do? So long as the decfsz instruction does
not produce a value of zero, it takes 1 cycle to execute, and the goto statement after it then executes,
taking 2 cycles. In sum, this segment of code takes 3 cycles each time decfsz does
not produce zero. When it produces zero, decfsz discards the goto instruction and takes 2 cycles.
So in short: 3 cycles if not resulting in zero, 2 cycles if resulting in zero.
Let's plug in some values for aa:
• aa = 1, which yields zero and makes the total 2 cycles.
• aa = 2, which does not yield zero on the first pass, taking 3 cycles; the loop then runs again with aa = 1,
which takes 2 cycles, making a total of 5 cycles.
• aa = 3, which does not yield zero on the first pass, taking 3 cycles, and then follows the pattern established
by aa = 2, making a total of 8 cycles.
Or, in general [3(aa - 1)] + 2 cycles for this delay loop, because eventually there will be one value for aa
which yields zero, while all of the rest do not. As such, if we had aa = 32, 31 of the values will take 3 cycles,
and 1 of the values will take 2 cycles. We can simplify this function as follows:
[3(aa - 1)] + 2
3aa - 3 + 2
3aa - 1
What about when aa is initialized to zero?
When aa = 0, it will be decreased by one first, which will yield 255, a non-zero value. Effectively,
initializing aa = 0 is like initializing aa = 256. Note that 256 is not a valid value for an 8 bit number, so in order
to initialize aa = 256, one has to use aa = 0.
What are the limits of the delay loop?
Given that aa can effectively be set from aa = 1 to aa = 256 (by way of initializing aa = 0), the minimum and
maximum number of cycles which can be generated using this delay loop are found by plugging those values
into the derived equation:
min: 3(1) - 1 = 2 cycles
max: 3(256) - 1 = 767 cycles
Knowing this range will become important when we want to figure out how many variables we need in a delay
loop. In order to simplify its presentation, the minimum and maximum number of cycles for a delay loop will
be written as [2~767], which indicates that using the loop will only allow for at
least 2 cycles and at most 767 cycles.
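The formula and its limits can be checked by counting cycles in a small simulation. This is a hedged Python sketch that runs off-chip; the name one_stage_cycles is hypothetical, and the cycle costs are the ones stated above (1 for a decfsz that does not skip, 2 for one that does, 2 for goto).

```python
def one_stage_cycles(aa):
    """Count the cycles of 'decfsz aa,F / goto $-1' for an initial 8-bit aa."""
    cycles = 0
    while True:
        aa = (aa - 1) & 0xFF       # decfsz decrements with 8-bit wraparound
        if aa == 0:
            return cycles + 2      # decfsz skips the goto: 2 cycles, loop ends
        cycles += 1 + 2            # decfsz without skip (1) + goto (2)


# Compare the simulation against 3aa - 1 for every possible initial value.
for initial in range(256):
    effective = 256 if initial == 0 else initial   # aa = 0 behaves like 256
    assert one_stage_cycles(initial) == 3 * effective - 1

print(one_stage_cycles(1), one_stage_cycles(0))  # 2 767
```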
If we want to make a delay loop that includes more cycles, we would simply add another variable as follows:
decfsz aa,F
goto $-1
decfsz bb,F
goto $-3
Before we delve into what this specific code does, we will introduce new notation. cyc() will be used to refer
to a delay loop. cyc(aa) means that the delay loop has only one eight-bit variable in it (called aa);
we refer to it as a one stage loop, and its output is a number of cycles. cyc(aa,bb) means that the loop
involves two variables (called aa and bb, respectively), and we refer to it as a two stage loop. This pattern
continues onward to however many variables you use. From this point onward, the following:
cyc(aa) = 3aa - 1 u [2~767]
will be taken to mean a one stage delay loop with the formula (3aa - 1), which has one 8 bit variable called
aa within it and which can generate between 2 and 767 cycles inclusive. Note that the range for all input
variables (such as aa) is actually [0~255], with 0 equating to 256.
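With regards to the above code, it is a two stage loop. If time is taken to analyze the code, the following
pattern will emerge:
[3(aa - 1) + 2] + { [(max of one stage loop) + 3](bb - 1) + 2}
In brief, the first pass through the inner loop takes 3(aa - 1) + 2 cycles and leaves aa at zero, so every later
pass runs the inner loop for its maximum. Each of the (bb - 1) non-final passes of the outer stage therefore
costs that maximum plus 3 cycles for decfsz bb,F and goto $-3, and the final decfsz bb,F costs 2 cycles. To
confirm the pattern, it would be prudent to sit down with an empty sheet of paper and go through a few
iterations of a two stage loop. Now to actually plug in the values for the above derivation: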
[3(aa - 1) + 2] + {[767 + 3](bb - 1) + 2}
[3aa - 3 + 2] + {770(bb - 1) + 2}
[3aa - 1] + {770bb - 770 + 2}
3aa - 1 + 770bb - 770 + 2
3aa + 770bb - 769
After plugging in the minimum values allowed for aa and bb as well as the maximum values, we obtain the
range and our result is the following:
cyc(aa,bb) = 3aa + 770bb - 769 u [4~197119]
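Instead of pencil and paper, the two stage loop can also be stepped through in a simulation. The Python sketch below is an off-chip model under the cycle costs given earlier; the name two_stage_cycles is an assumption for illustration only.

```python
def two_stage_cycles(aa, bb):
    """Count cycles of the two stage loop for initial 8-bit values aa and bb."""
    cycles = 0
    while True:
        # Inner stage: decfsz aa,F / goto $-1
        while True:
            aa = (aa - 1) & 0xFF
            if aa == 0:
                cycles += 2        # skip taken, inner stage finished (aa is now 0)
                break
            cycles += 3            # decfsz (1) + goto $-1 (2)
        # Outer stage: decfsz bb,F / goto $-3
        bb = (bb - 1) & 0xFF
        if bb == 0:
            return cycles + 2      # final skip, loop finished
        cycles += 3                # decfsz (1) + goto $-3 (2), back to the inner stage


def formula(aa, bb):
    """3aa + 770bb - 769, treating an initial value of 0 as 256."""
    aa = 256 if aa == 0 else aa
    bb = 256 if bb == 0 else bb
    return 3 * aa + 770 * bb - 769


for aa in (0, 1, 2, 100, 255):
    for bb in (0, 1, 2, 7, 255):
        assert two_stage_cycles(aa, bb) == formula(aa, bb)

print(two_stage_cycles(1, 1), two_stage_cycles(0, 0))  # 4 197119
```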
In addition, a general formula for any stage is found in the following:
[ previous stage cycle formula ] + [(max of previous formula) + 3][(new variable) - 1] + 2
Let us apply this to a three stage loop:
decfsz aa,F
goto $-1
decfsz bb,F
goto $-3
decfsz cc,F
goto $-5
Cycle formula:
[3aa + 770bb - 769] + [197119 + 3](cc - 1) + 2
[3aa + 770bb - 769] + [197122](cc - 1) + 2
[3aa + 770bb - 769] + [197122cc - 197122] + 2
3aa + 770bb + 197122cc - 769 - 197122 + 2
3aa + 770bb + 197122cc - 197889
After finding the range via plugging in allowable minimums and maximums, we obtain the formula:
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197889 u [6~50463231]
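The general rule for adding a stage lends itself to a short program. The following Python sketch applies that rule repeatedly to produce the coefficients, the constant term and the range for any number of stages; the function name stage_formula and its return layout are hypothetical.

```python
def stage_formula(stages):
    """Return (coefficients, constant, minimum, maximum) for an N-stage loop.

    cycles = sum(coefficient * variable) + constant, where each variable
    ranges over 1..256 (an initial value of 0 counts as 256).
    """
    coefficients, constant = [3], -1            # one stage loop: 3aa - 1
    for _ in range(stages - 1):
        previous_max = 256 * sum(coefficients) + constant
        step = previous_max + 3                 # full previous loop + decfsz (1) + goto (2)
        coefficients.append(step)               # coefficient of the new variable
        constant += 2 - step                    # from [step](new - 1) + 2
    minimum = sum(coefficients) + constant          # every variable = 1
    maximum = 256 * sum(coefficients) + constant    # every variable = 256
    return coefficients, constant, minimum, maximum


for n in (1, 2, 3):
    print(n, stage_formula(n))
# 1 ([3], -1, 2, 767)
# 2 ([3, 770], -769, 4, 197119)
# 3 ([3, 770, 197122], -197889, 6, 50463231)
```

The same function can be called with a larger stage count if a longer delay is needed, although the derivations in this document stop at three stages.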
Thus far, our formulas are:
cyc(aa) = 3aa - 1 u [2~767]
cyc(aa,bb) = 3aa + 770bb - 769 u [4~197119]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197889 u [6~50463231]
These formulas are great, so long as you copy and paste the code each time you need a delay. However, in
the real world, it's more efficient and often much easier to use subroutines. A subroutine in assembly
language is analogous to a function in a higher level language. Instead of typing out your code each time you
need a delay, you can use subroutines. Here's how they appear in code:
; this is an excerpt from the main code section of an assembly program
call delay ; 2 cycles, call the subroutine which we call 'delay', it's like calling a function
; the actual code for a one stage delay loop subroutine
delay:
decfsz aa,F ; use the formula for next two lines
goto $-1
return ; 2 cycles, return to the caller
Or, in general:
; this is an excerpt from the main code section of an assembly program
call delay ; 2 cycles, call the delay subroutine
; the declaration for the delay subroutine
delay:
<code for an N-stage loop>
return
You may notice that before, we were just using the code for an N-stage loop; however, when we turn it into
a subroutine, we add on four more cycles: two for the call and two for the return. This applies to any number
of stages. In order to adjust the delay loop formulas for these additional cycles, we add four to both the
formula and the limits, such that:
cyc(aa) = 3aa - 1 u [2~767]
cyc(aa,bb) = 3aa + 770bb - 769 u [4~197119]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197889 u [6~50463231]
becomes:
cyc(aa) = 3aa - 1 + 4 u [2 + 4~767 + 4]
cyc(aa,bb) = 3aa + 770bb - 769 + 4 u [4 + 4~197119 + 4]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197889 + 4 u [6 + 4~50463231 + 4]
and simplifies to:
cyc(aa) = 3aa + 3 u [6~771]
cyc(aa,bb) = 3aa + 770bb - 765 u [8~197123]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197885 u [10~50463235]
Although the formula is becoming more and more complete, there is still one last step before it is
finished. It involves initializing a loop with specific values so as to obtain the desired number
of cycles.
It is best to initialize the loop within the subroutine, as it makes the code less clunky and makes it easier
to use conditional skip instructions (such as btfss, decfsz, and incfsz). The reason is that if the loop is
initialized inside the subroutine, after the call statement, code like this can be used:
btfss STATUS,Z ; skip the next instruction if the previous operation yielded zero
call delay ; delay for some amount of time
while in contrast, this would not be possible if the loop were initialized outside of the subroutine:
btfss STATUS,Z ; skip the next instruction if the previous operation yielded zero
clrf aa ; clear the aa variable
call delay ; call the delay
As you can see from these two examples, only in the first is the delay conditional. The second example is
not equivalent to the first, as the only part which is conditional is the clrf aa: “if the previous operation
yields zero, do not clear aa” is not equivalent to “call the delay routine only if the previous operation does
not yield zero.”
As to initializing within subroutines, there are two choices: static delay loops and variable delay loops.
Here are examples of how static loops are to be initialized:
; one stage delay loop subroutine
delay:
movlw D'5' ; 1 cycle
movwf aa ; 1 cycle
<code for one stage delay loop>
return
; two stage delay loop subroutine
delay:
movlw <value for aa> ; 1 cycle
movwf aa ; 1 cycle
movlw <value for bb> ; 1 cycle
movwf bb ; 1 cycle
<code for two stage delay loop>
return
; three stage delay loop subroutine
delay:
movlw <value for aa> ; 1 cycle
movwf aa ; 1 cycle
movlw <value for bb> ; 1 cycle
movwf bb ; 1 cycle
movlw <value for cc> ; 1 cycle
movwf cc ; 1 cycle
<code for three stage delay loop>
return
From here, it should be noticed that for static delay loop subroutines, if the loop involves N variables, it
will require 2N cycles to initialize them. What this translates to is that for a one stage loop it takes 2
cycles to initialize, for a two stage loop it takes 4 cycles to initialize, and for a three stage loop it takes 6 cycles
to initialize. Similarly to the adjustments required in turning simple delay loops into subroutines, we must
also add these additional cycles to the formulas as follows:
cyc(aa) = 3aa + 3 u [6~771]
cyc(aa,bb) = 3aa + 770bb - 765 u [8~197123]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197885 u [10~50463235]
becomes:
cyc(aa) = 3aa + 3 + 2 u [6 + 2~771 + 2]
cyc(aa,bb) = 3aa + 770bb - 765 + 4 u [8 + 4~197123 + 4]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197885 + 6 u [10 + 6~50463235 + 6]
and simplifies to:
cyc(aa) = 3aa + 5 u [8~773]
cyc(aa,bb) = 3aa + 770bb - 761 u [12~197127]
cyc(aa,bb,cc) = 3aa + 770bb + 197122cc - 197879 u [16~50463241]
Static delay loops are great for when you only need a constant number of delay cycles, but what about when
it varies? We can use variable delay loops, and the code is as follows:
; one stage delay loop subroutine
delay:
movf aak,W ; 1 cycle
movwf aa ; 1 cycle
<code for one stage delay loop>
return
; two stage delay loop subroutine
delay:
movf aak,W ; 1 cycle
movwf aa ; 1 cycle
movf bbk,W ; 1 cycle
movwf bb ; 1 cycle
<code for two stage delay loop>
return
; three stage delay loop subroutine
delay:
movf aak,W ; 1 cycle
movwf aa ; 1 cycle
movf bbk,W ; 1 cycle
movwf bb ; 1 cycle
movf cck,W ; 1 cycle
movwf cc ; 1 cycle
<code for three stage delay loop>
return
Where aak, bbk, and cck are the values you need to initialize only once. Afterward, whenever you call the
delay subroutine, it initializes the loop with aa = aak, bb = bbk, cc = cck, or mnemonically aa(variable) =
aa(constant). Should you desire to change the number of cycles to delay for, all that needs to be changed
are the constant values.
With regards to the number of cycles that variable delay loops take, it is exactly the same as static delay
loops, thereby making the formulas (with the exception of the constant values playing a role):
cyc(aak) = 3aak + 5 u [8~773]
cyc(aak,bbk) = 3aak + 770bbk - 761 u [12~197127]
cyc(aak,bbk,cck) = 3aak + 770bbk + 197122cck - 197879 u [16~50463241]
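To avoid keeping three separate formulas in mind, the total can be computed for any number of stages at once. This is a minimal Python sketch of the arithmetic, assuming the overheads described above (2 cycles for call, 2 for return, and 2 per variable for initialization); the name subroutine_cycles is hypothetical.

```python
def subroutine_cycles(constants):
    """Total cycles of 'call delay' for a delay subroutine that initializes itself.

    constants is (aak,), (aak, bbk) or (aak, bbk, cck); a value of 0 counts as 256.
    """
    values = [256 if c == 0 else c for c in constants]
    coefficients, constant = [3], -1                 # bare one stage loop: 3aa - 1
    for _ in range(len(values) - 1):
        step = 256 * sum(coefficients) + constant + 3
        coefficients.append(step)
        constant += 2 - step
    loop = sum(k * v for k, v in zip(coefficients, values)) + constant
    overhead = 4 + 2 * len(values)                   # call + return + initialization
    return loop + overhead


print(subroutine_cycles((1,)), subroutine_cycles((0,)))        # 8 773
print(subroutine_cycles((1, 1)), subroutine_cycles((0, 0)))    # 12 197127
print(subroutine_cycles((1, 1, 1)), subroutine_cycles((0, 0, 0)))  # 16 50463241
```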
Please be certain that you initialize variable delay loops before their initial calling.
Now we can discuss the topic of solving for the constants given that a certain number of cycles is required:
We need 600 cycles:
cyc(aak) = 3aak + 5 = 600
3aak = 600 - 5
3aak = 595
aak = 595/3 = 198.3333
As aak can only be a whole integer, we assign 198 to it. We absolutely do not round up or down; we always
truncate. What we do with the fractional part 0.3333 is multiply it by 3 to convert it back to how many cycles
are needed in addition to what the loop can supply. Essentially, we are finding the remainder:
198.3333 - 198 = [value after division - value assigned to constant aak] = 0.3333
0.3333 x 3 = 0.9999
We round to the nearest integer only at this very last step. This remainder tells us how far short of the
desired 600 cycles we are if we use the value 198 for aak.
As 0.9999 rounded to the nearest integer is 1, we are short 1 cycle, which indicates that the delay
loop only generates 599 cycles (3 x 198 + 5). In practice, after the final division and before multiplying by
three, a 0.3333 excess indicates one cycle short, while a 0.6666 excess indicates two cycles short. To remedy
this deficiency, we recommend the following:
; this can be placed in the main code
call delay ; 599 cycles
nop ; 1 cycle
; or we can add the null operation to the subroutine
delay:
movf aak,W ; 1 cycle
movwf aa ; 1 cycle
<code for one stage delay loop>
nop ; 1 cycle
return
Note that for the second solution, embedding a one cycle null operation within the delay subroutine will add one
more cycle to this subroutine, making the formula change from:
cyc(aak) = 3aak + 5 u [8~773]
to this:
cyc(aak) = 3aak + 5 + 1 u [8 + 1~773 + 1]
which simplifies to:
cyc(aak) = 3aak + 6 u [9~774]
Therefore, in case you decide to embed the nop instruction within your delay subroutine, make certain
to modify the formula as well.
Before we continue, I will now explain the procedure for finding the remainder on a calculator:
We want to take 50002 and divide it by 35. We require both the quotient and the remainder, and begin by
entering the division in our calculator:
50002/35 = 1428.628571
This makes the quotient 1428, and now to find the remainder:
1428.628571 - 1428 = 0.628571
We take this and multiply it by the number we divided by:
0.628571 x 35 = 22
Therefore, the final answer is:
1428 remainder 22
or in the shorthand we will use (where ex is excess) throughout this document:
1428 ex 22
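If a computer is at hand, integer division gives the quotient and the excess in one step and avoids any rounding of decimals. A small Python sketch using the built-in divmod follows; the variable names are chosen here for illustration.

```python
# Quotient and excess of 50002 / 35 in a single integer operation.
quotient, excess = divmod(50002, 35)
print(quotient, "ex", excess)  # 1428 ex 22

# The same step for the earlier one stage example: 3aak = 600 - 5.
aak, cycles_short = divmod(600 - 5, 3)
print(aak, "ex", cycles_short)  # 198 ex 1
```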
With this knowledge in mind, we can now solve for a three stage delay loop for 600 cycles:
cyc(aak,bbk,cck) = 3aak + 770bbk + 197122cck - 197879 u [16~50463241]
The procedure should be fairly intuitive if followed. Its rules are nearly the same as above, but remember to
bring the constant (integer at the very end) to the other side, and to start with the largest coefficient in division:
cyc(aak,bbk,cck) = 3aak + 770bbk + 197122cck - 197879 = 600
3aak + 770bbk + 197122cck = 600 + 197879
3aak + 770bbk + 197122cck = 198479
Solve for cck by taking the new constant and dividing it by the coefficient in front of cck:
198479/197122 = 1 ex 1357 [as a result, cck = 1]
We find bbk by taking the remainder and dividing it by the coefficient in front of bbk:
1357/770 = 1 ex 587 [as a result, bbk = 1]
And finally, we find aak by taking this remainder and dividing it by the coefficient in front of aak:
587/3 = 195 ex 2 [as a result, aak = 195]
Because the last remainder is two, we now know that this formula is two cycles short of 600. Let's
verify:
cyc(aak,bbk,cck) = 3aak + 770bbk + 197122cck - 197879
cyc(195,1,1) = 3(195) + 770(1) + 197122(1) - 197879
cyc(195,1,1) = 585 + 770 + 197122 - 197879
cyc(195,1,1) = 585 + 770 - 757
cyc(195,1,1) = 585 + 13
cyc(195,1,1) = 598
We must also discuss the topic of picking stages. Clearly, with the three formulas we have derived,
any of them will work for a 600 cycle loop; however, in practice it is better to use the loop with the fewest
variables where possible, as it conserves memory. Consider that a one stage loop requires two bytes of RAM for
aa and aak, and a two stage loop requires four bytes of RAM for aa, aak, bb, and bbk. As you can see by the
pattern, a three stage loop will require six bytes of RAM, and in general an N-stage loop will require 2N bytes
of RAM. In addition, copying the constants aak, bbk, and cck into the loop variables takes two cycles per
variable, so for a three stage loop, this means 6 cycles. This may be wasteful if your application barely fits
into the microcontroller.
Another concern when using delay loops is offset errors. You may be tempted to use a 5000 cycle delay loop if
you need your instruction to execute every 5000 cycles; however, this would create an offset error, since your
instruction would take at least 1 cycle, so that it would actually execute every 5001 cycles.
In order to avoid this, pad your time critical routine or set of instructions so that it always takes the same
number of cycles to process, and set the delay loop to your desired value minus the
number of cycles your instructions or routine take.
Sometimes, it is important to use delay loops to create real time delays. What this means is that sometimes
it is necessary to calculate how many seconds a delay loop will take. To do this, we include the following:
The PIC16 architecture is odd in that each instruction cycle takes four oscillator periods, so when you use a
crystal oscillator oscillating at 4.000 MHz, the number of instructions per second with simple instructions
(such as nop, movlw, bcf, etc.) is actually 1.000 MIPS (million
instructions per second). Therefore, the number of MIPS can be found by taking the frequency in megahertz
and dividing it by 4, and the number of instructions per second (IPS) can be found by multiplying this answer
by 1,000,000. Thus:
[ips] = [xtal frequency in MHz] * 1,000,000 / 4
[ips] = [xtal frequency in MHz] * 250,000
Now that we know how many instructions per second it executes, we can figure out how many seconds a
number of cycles will take by dividing the number of cycles by the number of instructions per second.
Thereby:
[seconds] = [cycles] / [ips]
or:
[seconds] = [cycles] / ([xtal frequency in MHz] * 250,000)
and since frequency is the inverse of time (period):
[frequency in Hz] = 1 / [seconds]
[frequency in Hz] = 1 / [[cycles] / ([xtal frequency in MHz] * 250,000)]
[frequency in Hz] = [xtal frequency in MHz] * 250,000 / [cycles]
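These conversions are easy to wrap in two helper functions. The Python sketch below is a direct transcription of the formulas above; the function names delay_seconds and loop_frequency_hz are assumptions for illustration, and a 4 MHz crystal is used only as an example value.

```python
def delay_seconds(cycles, xtal_mhz):
    """Seconds taken by a number of cycles at a given crystal frequency in MHz."""
    ips = xtal_mhz * 250_000          # one cycle per four oscillator periods
    return cycles / ips


def loop_frequency_hz(cycles, xtal_mhz):
    """Repetition frequency in Hz of something that takes the given cycles."""
    return xtal_mhz * 250_000 / cycles


print(delay_seconds(1_000_000, 4))    # 1.0
print(loop_frequency_hz(5000, 4))     # 200.0
```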
Last Revision: 2007-05-14
