
The Nano Processor (nP) is a stored-program processor that achieves application-specific performance with general-purpose programmable control.

The nP implements application-specific functionality through the development of custom instructions. Similar to the Reconfigurable Microprocessor, the nP implements the processor control within an FPGA instead of using a standard microprocessor. Not only does this reduce the part count, but it allows full control over processor operation. The nP also offers available reconfigurable logic for implementing application-specific hardware to achieve application-specific performance. In addition to the development of custom execution units, the nP offers the ability to develop custom hardware modules for each individual processor. Yet, unlike other reconfigurable processors that require extensive FPGA resources, the nP requires only a fraction of the resources available in a moderate-sized FPGA. Minimizing the control logic, registers, and busses frees the logic and routing resources necessary to implement application-specific hardware in a single FPGA. With most of the FPGA resources dedicated to application-specific hardware, the nP can approach the performance achieved by application-specific hardware systems. The nP is currently implemented on any of the Xilinx 3000 series parts [17] in conjunction with a variable-size 8-bit static RAM (Figure 1). Many Xilinx device-specific features are used to minimize FPGA resource utilization, but the architecture can be adapted to other FPGA families with similar results. Multiple Nano Processors can be implemented on relatively small printed circuit boards to obtain a low-cost reconfigurable multiprocessing system.

Working:
The nP contains an inner core that serves as the hardware basis for each custom processor. This core implements six instructions using 21 IOBs and 40 CLBs of any part in the Xilinx 3000 series FPGA family. Depending on the amount of custom hardware needed, any of the 3000 series parts can be chosen. Resources available after implementing the nP core vary from 24 CLBs when using the XC3020 to 444 CLBs when using the XC3195.

3.1 Processor Organization

3.1.1 Nano Processor Core

The innermost processor level is the nP core. This core is a general-purpose processor that has been carefully developed to accommodate a wide range of custom instructions and is not intended to be modified. The core contains six essential instructions and can operate without any customization. In fact, several designs have been implemented on smaller FPGAs with little or no customization.

3.1.2 Custom Instruction Set

The next processor level is the custom instruction set. With the core nP design minimized, most of the FPGA resources are available for application-specific hardware in the form of a custom instruction set. An instruction set is built by choosing instructions from an instruction library or by designing new instructions. New instructions are currently developed using standard schematic entry or high-level synthesis tools. After a new custom instruction has been designed and verified, it is placed in the library of nP custom instructions. This allows custom functions to be reused - unique operations and instructions only have to be designed once. As more and more special-purpose instructions are developed, it becomes much easier to develop high-speed custom processors. Implementing special-purpose functionality in the form of an instruction allows quick and easy control of the custom functionality. Custom logic of nearly any form can be encapsulated in a custom instruction to provide easy interfacing and control. The instruction can become an active member of the processor and operate in parallel with other events in the processor. Custom instructions can also take over the functions of dedicated logic in conventional computer systems.

As an example, a special-purpose data sorting processor could be built with high-speed hardware sorting algorithms. Without any custom instructions, the nP core could perform simple sorting algorithms. But, like most processors, it must proceed byte by byte through the data structure and perform individual comparisons on the data set. A custom sort instruction could be developed that, when given two address pointers, would read the values, compare them, and swap them if necessary. Much of the overhead of data calculation and instruction processing would be removed. If additional reconfigurable logic is available, a more complex sorting algorithm could be implemented. A "sort-block" instruction could be developed that loads several bytes of data into custom registers, performs a hardware sort, and writes the block back to memory in sorted order. Such instruction modules may require much more logic than simple compare-and-swap instructions, but they could dramatically improve performance. Custom instructions can remove much of the overhead associated with general-purpose computing algorithms by encapsulating time-consuming activities within dedicated logic.
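The behavior of the two sorting instructions described above can be sketched in software. This is only an illustrative model: the names `sort2`, `sort_block`, and `bubble_sort`, the default block size of four bytes, and the use of a Python list as memory are assumptions, not part of the nP design.

```python
# Software model of two hypothetical custom sort instructions.
# Memory is modeled as a list of byte values; all names are illustrative.

def sort2(mem, p, q):
    """Compare-and-swap: read mem[p] and mem[q], swap them if out of order."""
    if mem[p] > mem[q]:
        mem[p], mem[q] = mem[q], mem[p]


def sort_block(mem, base, n=4):
    """Load n bytes into 'custom registers', sort them, write the block back."""
    regs = mem[base:base + n]      # load the block into registers
    regs.sort()                    # stands in for the hardware sort
    mem[base:base + n] = regs      # write back in sorted order


def bubble_sort(mem, start, length):
    """Sort a memory region using only the compare-and-swap instruction."""
    for end in range(start + length - 1, start, -1):
        for i in range(start, end):
            sort2(mem, i, i + 1)
```

In this model each call to `sort2` replaces the separate load, compare, and store steps a program would otherwise issue with core instructions, which is the overhead the text expects a custom instruction to remove.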

Once the instruction set of a processor has been chosen, the processor must be mapped to a specific FPGA device. Using manufacturer tools, the netlists of the nP core and the custom instructions are flattened and converted to a vendor-specific netlist. The custom processor netlist is then implemented using place-and-route tools.

3.1.3 Software Executable

The software executable is the outermost level of the processor. Users program the nP in assembly language using any of the core nP instructions or custom instructions specified in the processor definition. Hardware processors for a class of applications can be reused, so users do not have to create a custom processor for each application. This gives users the ability to develop custom applications without any understanding of the hardware in the special-purpose processor. When writing applications on a custom processor, no extra tools are required except the nP assembler. In summary, the multi-level organization of the nP provides users with the flexibility necessary to reconfigure the processing environment at two levels - hardware and software.

3.2 nP Core Architecture

Figure 3: Nano Processor Core Architecture.

The data path size for the nP core is eight bits - the width of the attached SRAM. The various register sizes are established as a result of this 8-bit data width. The Nano Processor consists of five registers: Instruction Register (IR), Page Address Register (PAR), Program Counter (PC), Address Register (AR), and Accumulator (A). To conserve resources, the IR, the PAR, and the AR are all stored in Xilinx IOB flip-flops (Figure 3). Under the current architecture, the IR contains five bits and the PAR contains three bits. Five IR bits allow up to 32 unique instructions, and three PAR bits allow up to eight different 256-byte pages. For the Xilinx implementation, both registers can be mapped into IOBs to conserve available registers and logic. The program counter (PC) and the address register (AR) are both eleven bits wide, allowing for a 2K addressing space. The PC controls the program flow as in conventional processors, and is often loaded into the AR. The AR is the final register that addresses external memory.
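The register widths above fit together neatly: the 5-bit IR and 3-bit PAR fill exactly one byte from the data bus. The sketch below packs and unpacks those fields, using the bit positions given in the instruction format of Section 3.3 (opcode in the lower five bits, page address in the upper three). The function names `encode` and `decode` are assumptions made for illustration.

```python
OPCODE_BITS = 5   # width of the Instruction Register
PAGE_BITS = 3     # width of the Page Address Register
OFFSET_BITS = 8   # one byte from the 8-bit data bus


def encode(opcode, page, offset):
    """Pack the fields into two instruction bytes (hypothetical helper)."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 5 bits")
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page must fit in 3 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset must fit in 8 bits")
    first = (page << OPCODE_BITS) | opcode   # page in upper 3 bits, opcode in lower 5
    return bytes([first, offset])


def decode(first, second):
    """Split two instruction bytes into (opcode, page, offset)."""
    opcode = first & ((1 << OPCODE_BITS) - 1)
    page = (first >> OPCODE_BITS) & ((1 << PAGE_BITS) - 1)
    return opcode, page, second & 0xFF
```

The widths also confirm the figures in the text: 2^5 = 32 opcodes, 2^3 = 8 pages, and 8 pages of 256 bytes give the 2K (2^11) address space.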

The arithmetic capabilities are contained in the single data register of the processor, the accumulator (A). The accumulator is eight bits wide with a single carry bit. Under the current implementation, the accumulator can perform addition and subtraction. Other logical functions are possible, but limiting functionality to these two operations ensures that each bit fits within a single CLB for single-level logic performance. Additional functionality should be implemented in custom instructions.
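A minimal behavioral sketch of such an accumulator follows. The text does not specify how the carry bit behaves on subtraction or whether it feeds back as a carry-in, so this model assumes no carry-in and treats the carry bit as a borrow flag on subtraction; the class and method names are likewise illustrative.

```python
class Accumulator:
    """8-bit accumulator with a single carry bit (behavioral model only)."""

    def __init__(self):
        self.value = 0   # 8-bit contents
        self.carry = 0   # single carry bit

    def add(self, operand):
        total = self.value + (operand & 0xFF)
        self.value = total & 0xFF      # keep the low eight bits
        self.carry = total >> 8        # 1 if the sum overflowed eight bits

    def sub(self, operand):
        diff = self.value - (operand & 0xFF)
        self.value = diff & 0xFF       # wrap around modulo 256
        # Assumption: carry is set when a borrow occurred.
        self.carry = 1 if diff < 0 else 0
```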

The internal data paths of the processor include the 8-bit data bus and the 11-bit address bus. The bi-directional data bus is used to load the IR, PAR, A, and AR registers. This bus is coupled with the external SRAM. The address bus is used to address the external SRAM and to load the program counter. The AR can be loaded by multiplexing between the PC and a combination of the PAR and the data bus. The limited bus connections allow for easy FPGA routing.
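The AR load path can be modeled as a two-input multiplexer. The sketch below assumes that the PAR supplies the upper three address bits and the data bus the lower eight, which matches the page/offset scheme but is not stated explicitly for the hardware; the function name `load_ar` and its parameters are hypothetical.

```python
ADDRESS_MASK = 0x7FF   # 11-bit address bus


def load_ar(select_pc, pc, par, data_bus):
    """Return the next AR value: either the PC or PAR combined with the data bus."""
    if select_pc:
        return pc & ADDRESS_MASK                 # instruction fetch: AR <- PC
    # Operand access: assumed layout is PAR in bits 10..8, data bus in bits 7..0.
    return ((par & 0x7) << 8) | (data_bus & 0xFF)
```

For example, with the PAR holding page 5 and the data bus carrying offset 0x7F, the model yields address 0x57F.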

The control circuitry for the processor is hardwired in the control module. This module controls the latches, multiplexers, and global clocking. As stated previously, the nP core consumes 40 Xilinx CLBs, with resources divided among the functional units. The goal in this design is to minimize the logic necessary for control in order to leave valuable reconfigurable logic for custom hardware.
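The resource budget can be checked with a few lines of arithmetic. The sketch below uses the 40-CLB core figure and the free-CLB counts quoted earlier (24 on the XC3020, 444 on the XC3195); the device totals of 64 and 484 CLBs are inferred from those numbers and should be treated as assumptions, as should the helper names.

```python
CORE_CLBS = 40   # CLBs consumed by the nP core

# Device totals inferred from the free-CLB figures in the text (assumption).
DEVICE_CLBS = {"XC3020": 64, "XC3195": 484}


def free_clbs(device):
    """CLBs left for custom instructions after placing the core."""
    return DEVICE_CLBS[device] - CORE_CLBS


def core_share(device):
    """Fraction of the device's CLBs used by the core."""
    return CORE_CLBS / DEVICE_CLBS[device]


for name in DEVICE_CLBS:
    print(f"{name}: {free_clbs(name)} CLBs free, core uses {core_share(name):.1%}")
```

On the smallest part the core takes most of the device, while on the largest it takes under a tenth, which is why the choice of part depends on how much custom hardware an application needs.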

3.3 Instruction Set

As stated previously, the nP core instruction set consists of six standard instructions. To simplify execution, the Nano Processor has a fixed instruction length of two bytes. Each instruction contains only two parts: an instruction opcode and one operand reference. The operand reference is split into two parts: the page address (three bits), which specifies which of the eight 256-byte pages the reference belongs to, and the page offset, an 8-bit offset value within the specified page. The first byte contains the instruction opcode in the lower five bits and the page address in the upper three bits. The second byte contains the page offset.

... allows for flexible control over the interface without redesigning the nP.

4.2 Interface Operating System

The audio interface nP offers all the hardware capability necessary to control the external devices simultaneously. Although the hardware for the interfaces is available, software modules must be present to control each interface. Software modules allow custom control of the interfaces to tailor the hardware to the specific needs of the user. Currently, there are five software modules that run on the audio interface. Other software modules may be available in the future to allow further control over the processor. The five software modules differ in their control over the PC and Codec interfaces. For varying audio data formats, each interface must transfer data differently. Each of the five software modules changes the control of the interfaces to adapt the card to the appropriate data format. The five data formats are as follows: 16-bit stereo (in/out), 16-bit mono (in/out), 8-bit stereo (in/out), 8-bit mono (in/out), and dual-channel 16-bit mono (in/out).

Using a custom program for custom interfacing provides exceptional flexibility in controlling the audio interface. Adding other software modules will provide further flexibility and customization of the X2 sound system. The X2 reconfigurable sound system is a good example of how the nP can be implemented to take advantage of customization at two levels of development. Multiple nP hardware configurations optimize hardware resources to maximize performance for application-specific algorithms and control. In addition, multiple software executable modules for the various nP hardware configurations reuse carefully designed application-specific functionality while customizing these resources to unique algorithms.

5 Conclusion

We have found that the Nano Processor, a low-resource reconfigurable stored-program processor, is an effective tool for implementing reconfigurable logic systems. Its low resource utilization frees the reconfigurable hardware needed to implement high-performance application-specific hardware. Custom instructions that take advantage of application-specific hardware have been implemented successfully, producing results not available on general-purpose processors.
