
ABSTRACT
This paper is a study of the USB standard based on the authors' own concept of a specialized digital architecture providing USB communication. The presented module connects a peripheral device with a computer via a USB cable. In accordance with its functionality, the design was partitioned into three units. The UTMI block deals with the USB cable, time frame synchronization and serial transmission. The PIE block, divided into two parts, is responsible for packet construction/extraction and byte-oriented communication with the peripheral system. Each module is controlled by a dedicated Finite State Machine. The architecture was implemented in VHDL, verified and synthesized. The design complexity raises questions about the feasibility of applying the USB interface in specialized devices and low-volume segments of electronic devices, especially when compared to traditional RS232/UART interfaces. The Universal Serial Bus (USB) Transceiver Macrocell Interface (UTMI) is a two-wire, bidirectional serial bus interface. The UTMI consists of transmitting and receiving sections: the transmitter sends data to different USB devices through the D+ and D- lines, whereas the receiver gets data on the same lines. The UTMI is one of the important functional blocks of a USB controller, which can transmit data to and receive data from USB devices. There are three functional blocks in a USB controller: the Serial Interface Engine (SIE), the UTMI and the Device Specific Logic (DSL). The parallel data from the SIE is taken into the transmit hold register and then passed to the transmit shift register, where it is converted to serial form. This serial data is bit stuffed to guarantee data transitions for clock recovery and is then NRZI encoded. The encoded data is then sent onto the serial bus. When data is received on the serial bus, it is decoded, bit unstuffed and sent to the receive shift register. After the shift register is full, the data is sent to the receive hold register.
This presentation describes an FPGA implementation of the UTMI that meets the transmission rates of the USB 2.0 specification. The UTMI was designed in VHDL, then simulated, synthesized and programmed onto a targeted Spartan-family FPGA in the Xilinx environment.
UNIT-I INTRODUCTION
Since its introduction, the Universal Serial Bus (USB) interface has gained huge interest and a wide scale of application. In spite of this popularity, the details and capabilities of the protocol remain unknown to some digital circuit designers. One of the reasons is the surprisingly high level of functional and structural complexity of the USB interface. Analysis of some details, especially when compared with other serial data transmission systems and supported by observation of market trends, may lead to creative ideas opening new paths in contemporary digital communication. This paper presents the authors' own concept of a USB receiver/transmitter architecture. The concept was implemented in a hardware description language to provide a model for simulations. At the same time the code is synthesizable and may be physically implemented in programmable logic devices. At this stage the full functionality is not ready; however, a significant part is available for embedding in any digital system that requires high-speed communication with a computer.
Universal Serial Bus (USB) is an industry standard developed in the mid-1990s that defines the cables, connectors and protocols used in a bus for connection, communication, and power supply between computers and electronic devices. USB was designed to standardize the connection of computer peripherals (including keyboards, pointing devices, digital cameras, printers, portable media players, disk drives and network adapters) to personal computers, both to communicate and to supply electric power. It has become commonplace on other devices, such as smartphones, PDAs and video game consoles. USB has effectively replaced a variety of earlier interfaces, such as serial and parallel ports, as well as separate power chargers for portable devices. As product form factors continue to shrink, designers are increasingly pushed to find new ways to incorporate the USB interface into their products.
To enable the development of these new system architectures, the SMSC USB Transceiver products are also evolving to address the needs associated with the High Speed USB 2.0 interface. To enable maximum reliability and flexibility of design in the customer application, SMSC's USB Transceivers offer software configurability and hardware integration features: Programmable Transmitter Amplitude, Programmable Receiver Sensitivity, Integrated USB Switch, and Integrated ESD Protection. Optimizing both the Transmitter and the Receiver allows for maximum USB transfer speeds and simplifies High Speed USB-IF certification. Through the use of PHYBoost and VariSense technology, SMSC USB Transceivers allow designers flexibility when defining their product while reducing risk and time to market. Whether the goal is to improve margin on an existing product design or to avoid a hardware modification in order to pass USB-IF certification, SMSC transceivers provide configurability to support it.
Universal Serial Bus (USB) is a standard interface for connecting peripheral devices to a host computer. The USB system was originally devised by a group of companies including Compaq, Digital Equipment, IBM, Intel, Microsoft, and Northern Telecom to replace the existing mixed connector system with a simpler architecture. USB was designed to replace the multitude of cables and connectors required to connect peripheral devices to a host computer. The main goal of USB was to make the addition of peripheral devices quick and easy. All USB devices share some key characteristics to make this possible. All USB devices are self-identifying on the bus. All devices are hot-pluggable to allow for true Plug-and-Play capability. Additionally, some devices can draw power from the USB, which eliminates the need for extra power adapters. To ensure maximum interoperability, the USB standard defines all aspects of the USB system from the physical layer (mechanical and electrical) all the way up to the software layer. The USB standard is maintained and enforced by the USB Implementers Forum (USB-IF). USB devices must pass a USB-IF compliance test in order to be considered compliant and to be able to use the USB logo.
UNIT-II USB PROTOCOL OVERVIEW

The USB protocol provides communication between a computer and a peripheral device. Its construction is based on 3 layers: functional, which covers high-level relations between a computer program and a peripheral device; logic, responsible for the flow of the data stream; and physical, including wires, connections and analog devices. The physical connection consists of 4 wires: 2 for power and 2 for bi-directional differential data transmission. The same set of wires may be shared by up to 127 peripheral devices (USB 2.0). The master functionality of the host includes control of the time division multiple access to the communication channel. Depending on the character (speed) of the peripheral device, this time division may be based on frames or micro-frames. A single frame is allotted 1 ms, whilst a micro-frame is allotted 125 µs. The beginning of each frame is determined by the specific Start of Frame packet sent by the host. The rest of the frame space may be used for transmission of other packets, which may contain control symbols or payload. Each peripheral device is recognized by its unique address (with individual endpoints inside a device selected by the endpoint number, ENDP), and this identification is used to allocate transmission time within a frame. Time division multiple access enables concurrent communication of the host with several devices. Whilst the frames organize the time space for transmission, the data is transmitted in packets. The general structure of a USB packet is presented in Fig. 1. The start and end of a packet are determined by the 8-bit SYNC and EOP patterns respectively.
PID is the 8-bit packet identifier. Depending on the transmission type, the PAYLOAD field may contain various kinds of data and some more specific fields (e.g. address, frame number). For packets containing data, the length of the PAYLOAD field may reach 8192 bits. For control packets it is shorter, e.g. 8 bits. CRC is the result of a cyclic redundancy check calculated for the payload field only. In the typical communication scenario the host (the computer) starts transmission by sending some data to the peripheral device. At the very beginning the transmission parameters (e.g. speed, kind of packets to be transmitted) are clarified, and then the transmission may start. There are 4 main types of transmission: control transmission, used for setting up the devices; isochronous transmission, used for real-time transmission of data (to/from e.g. a camera, microphone or speaker); interrupt transmission, used for very fast transmission of small data structures (e.g. from a mouse); and bulk transmission, which is asynchronous, designed for large amounts of data, and optimized for speed and protection of data integrity (e.g. flash memory, hard disk).
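As a rough worked example of the frame timing described earlier in this unit, the number of raw bit times that fit into one frame or micro-frame follows directly from the signaling rate. The sketch below assumes the USB 2.0 rates quoted elsewhere in this text (12 Mbit/s full speed, 480 Mbit/s high speed); the function name is illustrative only, and the result ignores SOF packets, bit stuffing and other protocol overhead.

```python
def bit_times_per_interval(rate_bits_per_s, interval_us):
    """Raw number of bit times in an interval given in microseconds."""
    return rate_bits_per_s * interval_us // 1_000_000

# Full speed: 12 Mbit/s with 1 ms frames.
fs_bits = bit_times_per_interval(12_000_000, 1000)
# High speed: 480 Mbit/s with 125 us micro-frames.
hs_bits = bit_times_per_interval(480_000_000, 125)

print(fs_bits, hs_bits)
```

Under these assumptions a full-speed frame offers 12000 bit times and a high-speed micro-frame 60000 bit times, shared by all devices on the bus.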
All USB packets are prefaced by a SYNC field and then a Packet Identifier (PID) byte. Packets are terminated with an End-of-Packet (EOP). The SYNC field, which is a sequence of KJ pairs followed by 2 Ks on the data lines, serves as a Start of Packet (SOP) marker and is used to synchronize the device's transceiver with that of the host. This SYNC field is 8 bits long for full/low speed and 32 bits long for high speed. The EOP field varies depending on the bus speed. For low- or full-speed buses, the EOP consists of an SE0 for two bit times. For high-speed buses, because the bus is at SE0 when it is idle, a different method is used to indicate the end of the packet: the transmitter induces a bit stuff error. So if the line state before the EOP is J, the transmitter will send 8 bits of K. The exception is the high-speed SOF EOP, in which case the high-speed EOP is extended to 40 bits. This is done for bus disconnect detection. The PID is the first byte of valid data sent across the bus, and it encodes the packet type. The PID may be followed by anywhere from 0 to 1026 bytes, depending on the packet type. The PID byte is self-checking: in order for the PID to be valid, the last 4 bits must be the ones' complement of the first 4 bits. If a received PID fails its check, the remainder of the packet will be ignored by the USB device.
There are four types of PID, which are described in the table below.

USB Packet Types
PID Type     PID Name   Description
Token        OUT        Host to device transfer
Token        IN         Device to host transfer
Token        SOF        Start of Frame marker
Token        SETUP      Host to device control transfer
Data         DATA0      Data packet
Data         DATA1      Data packet
Data         DATA2      High-speed data packet
Data         MDATA      Split/high-speed data packet
Handshake    ACK        The data packet was received error free
Handshake    NAK        Receiver cannot accept data or the transmitter could not send data
Handshake    STALL      Endpoint halted or control pipe request is not supported
Handshake    NYET       No response yet
Special      PRE        Preamble to full-speed hub for low-speed traffic
Special      ERR        Error handshake for Split Transaction
Special      SPLIT      Preamble to high-speed hub for low/full-speed traffic
Special      PING       High-speed flow control token
Special      EXT        Protocol extension token
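The PID self-check described above can be sketched in a few lines. This is a minimal illustration, not the authors' VHDL: it assumes the PID has already been assembled into a byte with the first four received bits in the low nibble, and it uses the mapping of the two lowest PID bits to the four PID types as given in the USB 2.0 specification (that mapping is not stated in this text). The names `pid_is_valid` and `pid_type` are hypothetical.

```python
# Two lowest PID bits select the PID type (per the USB 2.0 specification).
PID_TYPES = {0b01: "Token", 0b11: "Data", 0b10: "Handshake", 0b00: "Special"}

def pid_is_valid(pid_byte):
    """True when the high nibble is the ones' complement of the low nibble."""
    low = pid_byte & 0x0F
    high = (pid_byte >> 4) & 0x0F
    return high == (~low & 0x0F)

def pid_type(pid_byte):
    """Return the PID type name, or None when the self-check fails."""
    if not pid_is_valid(pid_byte):
        return None  # the rest of the packet would be ignored
    return PID_TYPES[pid_byte & 0b11]

print(pid_type(0xA5))  # SOF assembled as a byte: PID 0101, check 1010
print(pid_type(0xD3))  # corrupted byte, fails the self-check
```

A receiver built along these lines would drop the packet as soon as `pid_type` reports a failed check, which matches the behavior described for the USB device above.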
UNIT-III ARCHITECTURE
The USB interface in the peripheral device consists of two main blocks: the USB Transceiver Macrocell Interface (UTMI) and the Parallel Interface Engine (PIE). The UTMI interacts with the cable and the PIE with the digital system embedded in the peripheral device. Consequently the UTMI block deals with the SYNC and EOP patterns, whilst the PIE constructs packets and deals with the PID field. Communication between the 2 modules is provided by 3 groups of signals. The first group contains data in and data out, separate signals for data transmission in both directions. The second group, TX valid and TX ready, controls transmission in the direction from the peripheral device to the computer, i.e. from PIE to UTMI. The third group consists of 3 signals, RX active, RX valid and RX error, which control the process of data transmission from the computer to the peripheral device, i.e. from UTMI to PIE.
3.1. UTMI block:
The general schematic of the UTMI block is presented in Fig. 2. According to its functionality it is divided into parts responsible for transmitting and receiving the data. A separate part provides two clocks, 60 MHz and 480 MHz. On the receiver side the incoming serial data (480 MHz) is recovered from the non-return-to-zero inverted (NRZI) form and stored in the deserializer shift register. Then it is sent to the 8-bit data out bus.
Figure 2. USB Transceiver Macrocell Interface (UTMI) block
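The relation between the two UTMI clocks can be illustrated with a small software model of the serializer and deserializer shift registers: one byte on the 60 MHz parallel side corresponds to eight bit times on the 480 MHz serial side. This is a sketch under the assumption that bits are shifted out least significant bit first, as USB does; the function names are illustrative and do not come from the VHDL design.

```python
SERIAL_CLOCK_MHZ = 480
PARALLEL_CLOCK_MHZ = 60
BITS_PER_PARALLEL_CLOCK = SERIAL_CLOCK_MHZ // PARALLEL_CLOCK_MHZ  # 8

def serialize(byte):
    """Shift a byte out as a list of bits, least significant bit first."""
    return [(byte >> i) & 1 for i in range(8)]

def deserialize(bits):
    """Collect eight serial bits (LSB first) back into a byte."""
    if len(bits) != 8:
        raise ValueError("expected exactly 8 bits")
    value = 0
    for position, bit in enumerate(bits):
        value |= (bit & 1) << position
    return value

print(serialize(0x80))
print(hex(deserialize(serialize(0xA5))))
```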
Figure 3. Finite State Machine controlling the USB Transceiver Macrocell Interface (UTMI) block
On the receiver side, the Finite State Machine controls the process and drives all the RX signals communicating with the PIE block. The transmitter part collects the data coming in parallel from the PIE (8-bit data out bus, 60 MHz), serializes the stream, codes it to NRZI form and sends it serially to the cable at a frequency of 480 MHz. This functionality is controlled by another branch of the Finite State Machine, dependent on the TX valid signal coming from the PIE. At the same time the FSM is responsible for sending the TX ready signal to the PIE. An additional function of the receiver and transmitter parts is the recognition and construction, respectively, of the SYNC and EOP patterns. The schematic of the UTMI FSM is presented in Fig. 3. During a typical receive operation the FSM transits from the SWAIT idle state to RSYNC when the SYNC pattern sent by the computer is detected in the shift register. The next state is RDATA_LOAD. In this state the stream of data is continuously deserialized and sent to the 8-bit data in bus until the EOP signal is recognized. (Erroneous detection of SYNC and EOP is avoided by bit stuffing/unstuffing.) After detection of EOP, the FSM transits to the REOP state, where the receive operation is finished. If no errors are detected, the next state is SWAIT.
The other branch of the FSM controls the transmit process. The PIE module may generate a request to transmit data to the computer by setting the TX valid signal to 1. If the UTMI is not busy (e.g. receiving data from the computer or dealing with errors), i.e. the FSM is in the SWAIT state, it may transit to the TSYNC_LOAD state. In this single-clock-cycle state the SYNC pattern is loaded into the serial output buffer. Then the FSM transits to the TSYNC state, where the SYNC pattern is transmitted serially to the USB cable. In the next state, DATA_SENDING, the regular transmission takes place. The PIE block sends consecutive bytes via the 8-bit data out bus at 60 MHz and the UTMI serializes them for the 480 MHz output. When the PIE block decides to stop the transmission, it switches the TX valid signal back to 0. The FSM transits to the TEOP state, where the EOP pattern is loaded into the buffer and then shifted out. The next state is SWAIT again. Switching the TX valid line to 0 during the transmission causes the FSM to transit to the SERROR state.
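The state sequence described above can be written down as a next-state function. The following is a behavioral sketch only, not the authors' VHDL. The state names come from the text; the input names (`sync_sent`, `eop_sent`, `rx_error`) are hypothetical. Several choices are assumptions and are marked as such in the comments, because the text does not specify them: receive has priority over transmit in SWAIT, a receive error in REOP leads to SERROR, and SERROR is left only by a reset. The TX-valid error path to SERROR is not modeled, since its exact condition is not given.

```python
def utmi_next_state(state, sync_detected=False, eop_detected=False,
                    rx_error=False, tx_valid=False,
                    sync_sent=False, eop_sent=False):
    """Return the next UTMI FSM state for one clock cycle."""
    if state == "SWAIT":
        if sync_detected:          # assumption: receive wins over transmit
            return "RSYNC"
        if tx_valid:
            return "TSYNC_LOAD"
        return "SWAIT"
    if state == "RSYNC":
        return "RDATA_LOAD"
    if state == "RDATA_LOAD":      # deserialize until EOP is recognized
        return "REOP" if eop_detected else "RDATA_LOAD"
    if state == "REOP":            # assumption: errors lead to SERROR
        return "SERROR" if rx_error else "SWAIT"
    if state == "TSYNC_LOAD":      # single-cycle state
        return "TSYNC"
    if state == "TSYNC":
        return "DATA_SENDING" if sync_sent else "TSYNC"
    if state == "DATA_SENDING":    # PIE drops TX valid to end the packet
        return "DATA_SENDING" if tx_valid else "TEOP"
    if state == "TEOP":
        return "SWAIT" if eop_sent else "TEOP"
    if state == "SERROR":          # assumption: left only by reset
        return "SERROR"
    raise ValueError("unknown state: " + state)

# Walk through one transmit operation.
state = "SWAIT"
for inputs in ({"tx_valid": True}, {"tx_valid": True},
               {"tx_valid": True, "sync_sent": True},
               {"tx_valid": False}, {"eop_sent": True}):
    state = utmi_next_state(state, **inputs)
    print(state)
```

Keeping the transitions in a pure function makes it easy to compare the model against simulation traces of the hardware description, cycle by cycle.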
Additional requirements and clarifications on top of UTMI:
1. Use of Line State for timers: The UTMI spec mentions several times that Line State is the most accurate signal to be used for timing a certain state on the USB bus. It is not a hard requirement for the USB device core designer to use this signal. The designer may use any method, as long as correct behavior on the USB bus is guaranteed without forcing additional constraints on the PHY design.
2. Line State filtering: Minimal filtering should be applied to Line State to ensure that skew on the DP/DM signals does not generate unwanted SE0 or SE1 states between J and K states. For instance, for FS mode Table 7-9 of the USB 2.0 Specification identifies the width of the SE0 interval during a differential transition to be 14 ns max. These SE0 states are noise to the SIE and should not be propagated by Line State. To be able to filter worst-case SE0 noise, the transceiver should implement filtering as indicated in the table below. Filtering should only occur on an SE0. If a non-SE0 event occurs while an SE0 is being filtered, the filtering should stop and Line State behavior continues as before.
Bus speed                    8-bit interface (CLK = 60 MHz)   16-bit interface (CLK = 30 MHz)
Low-speed mode filtering     14 CLK cycles                    7 CLK cycles
Full-speed mode filtering    2 CLK cycles                     1 CLK cycle
High-speed mode filtering    2 CLK cycles                     1 CLK cycle

3. RxActive/RxValid during transmit: The UTMI PHY must internally block the USB receive path once a USB transmit has begun. The receive path can be unblocked when the internal Squelch (HS) or SE0-to-J (FS/LS) is seen.
4. TxReady behavior when not bit stuffing: TxReady must be used in chirp mode. If TxReady is not asserted by the UTMI PHY while the USB device core is sending a chirp, the device core can lock up if it is holding the transmit data on the bus until it sees TxReady asserted. By explicitly requiring that TxReady be asserted for all transmit data, including chirp data, this problem can be avoided.
5. Receive End Delay: At the end of page 59 of the UTMI spec v.1.05 there is a contradiction between the number of bit times and the number of clock cycles for the Total Receive End Delay for an interface running at 30 MHz. Six 30 MHz clock cycles is actually 96 bit times. For a 16-bit transceiver interface, the Total Receive End Delay must be between 32 and 96 bit times, or 2 to 6 30 MHz CLKs.
This block handles the low-level USB protocol and signaling. This includes features such as data serialization and deserialization, bit stuffing, and clock recovery and synchronization. The primary focus of this block is to shift the clock domain of the data from the USB 2.0 rate to one that is compatible with the general logic in the ASIC. The UTMI is designed to support HS/FS, FS Only and LS Only UTM implementations. The three options allow a single SIE implementation to be used with any speed USB transceiver. A vendor can choose the transceiver performance that best meets their needs. An HS/FS implementation of the transceiver can operate at either a 480 Mb/s or a 12 Mb/s rate.
Two modes of operation are required to properly emulate High-speed device connection and suspend/resume features of USB 2.0, as well as Full-speed connections if implementing a Dual-Mode device. FS Only and LS Only UTM implementations do not require the speed selection signals since there is no alternate speed to switch to.
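The Receive End Delay figures in item 5 above can be checked with simple arithmetic: at the 480 Mbit/s high-speed rate, one cycle of the parallel interface clock spans a fixed number of bit times. The helper names below are illustrative only.

```python
HS_BIT_RATE_HZ = 480_000_000

def bit_times_per_clk(clk_hz):
    """High-speed bit times covered by one parallel-interface clock cycle."""
    return HS_BIT_RATE_HZ // clk_hz

def clks_to_bit_times(clk_cycles, clk_hz):
    """Convert a number of interface clock cycles to high-speed bit times."""
    return clk_cycles * bit_times_per_clk(clk_hz)

# 16-bit interface at 30 MHz: 16 bit times per clock.
print(clks_to_bit_times(2, 30_000_000), clks_to_bit_times(6, 30_000_000))
```

With a 30 MHz clock each cycle covers 16 bit times, so 2 to 6 clocks correspond to 32 to 96 bit times, in agreement with the corrected range stated in item 5.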
Serial Interface Engine: This block can be further sub-divided into 2 types of sub-blocks: the SIE Control Logic and the Endpoint Logic. The SIE Control Logic contains the USB PID and address recognition logic, and other sequencing and state machine logic to handle USB packets and transactions. The Endpoint Logic contains the endpoint-specific logic: endpoint number recognition, FIFOs and FIFO control, etc. Generally the SIE Control Logic is required for any USB implementation, while the number and types of endpoints will vary as a function of application and performance requirements. SIE logic modules can be developed by peripheral vendors or purchased from IP vendors. The standardization of the UTMI allows compatible SIE VHDL to drop into an ASIC that provides the macrocell.
Device Specific Logic: This is the glue that ties the USB interface to the specific application of the device.
Applications: The UTMI has been developed into common code (a Generalized USB Transceiver) which can be used for developing the complete USB device stack. Some of the low-speed and high-speed USB devices presently on the market are:
1. Optical Mouse
2. Keyboard
3. Printer
4. Scanner
5. Joystick
6. Memory Stick
7. Flash Memory
8. Mobiles
9. Video cameras
The Universal Serial Bus (USB) Transceiver Macrocell Interface (UTMI) is one of the most important blocks of a USB controller. This block handles the low-level USB protocol and signaling. This includes features such as data serialization and deserialization, bit stuffing and unstuffing, Non Return to Zero Invert on 1 (NRZI) encoding and decoding, and clock recovery and synchronization. The primary focus of this block is to shift the clock domain of the data from the USB 2.0 rate to one that is compatible with the general logic in the ASIC.
Key features of the USB 2.0 Transceiver:
1. Supports 480 Mbit/s "High Speed" (HS)/12 Mbit/s "Full Speed" (FS), FS Only and "Low Speed" (LS) Only 1.5 Mbit/s serial data transmission rates.
2. Utilizes an 8-bit parallel interface to transmit and receive USB 2.0 cable data.
3. SYNC/EOP generation and checking.
4. High Speed and Full Speed operation to support the development of "Dual Mode" devices.
5. Data and clock recovery from the serial stream on the USB.
6. Bit stuffing/unstuffing; bit stuff error detection.
7. Holding registers to stage transmit and receive data.
8. Logic to facilitate Wake Up and Suspend detection.
9. Ability to switch between FS and HS terminations/signaling.
10. Single parallel data clock output with on-chip PLL to generate higher speed serial data clocks.
The UTMI is divided into two modules: the transmitter module and the receiver module.
1. The Transmitter Module Specifications:
The transmitter module of the UTMI has been implemented by considering the following specifications. The SYNC pattern 01111110 has to be transmitted immediately after the transmitter is initiated by the SIE. After six consecutive 1s occur in the data stream, a zero has to be inserted. The data should be encoded using the Non Return to Zero Invert on 1 (NRZI-1) encoding technique. The EOP pattern, consisting of two single-ended zeros (the D+ and D- lines both carry zero for two clock cycles) followed by a bit 1, has to be transmitted after each packet or after the SIE suspends the transmitter.
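The bit-stuffing and line-coding steps of the transmitter specification can be sketched in software as follows. This is an illustration, not the VHDL implementation. The stuffing rule (insert a zero after six consecutive ones) is taken from the text. For the line code, the sketch assumes the convention of the USB 2.0 specification, in which a 0 bit toggles the line level and a 1 bit leaves it unchanged; if the design's "invert on 1" label denotes the opposite polarity, the condition inside `nrzi_encode` would simply be flipped. Function names and the idle starting level are assumptions.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == 1:
            ones += 1
            if ones == 6:
                out.append(0)  # stuffed bit forces a line transition
                ones = 0
        else:
            ones = 0
    return out

def nrzi_encode(bits, start_level=1):
    """Encode bits as line levels: 0 toggles the level, 1 keeps it."""
    level = start_level
    levels = []
    for bit in bits:
        if bit == 0:
            level ^= 1
        levels.append(level)
    return levels

stuffed = bit_stuff([1, 1, 1, 1, 1, 1, 1, 0])
print(stuffed)
print(nrzi_encode(stuffed))
```

Because a long run of ones would otherwise produce no level changes under this convention, the stuffed zero guarantees a transition often enough for the receiver to recover the clock.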
The behavior of the Transmit State Machine given by USB 2.0 specifications is described. The Reset signal forces the state machine into the Reset state which negates TX Ready. When Reset is negated the transmit state machine will enter the TX Wait state. In the TX Wait state, the transmit state machine looks for the assertion of TX Valid. When TX Valid is detected, the state machine will enter the Send SYNC state and begin transmission of the SYNC pattern. When the transmitter is ready for the first byte of the packet (PID), it will enter the TX Data Load state, assert TX Ready and load the TX Holding Register. The state machine may enter the TX Data Wait state while the SYNC pattern transmission is completed. TXReady is used to throttle transmit data. The state machine will remain in the TX Data Wait state until the TX Data Holding register is available for more data. In the TX Data Load state, the state machine loads the Transmit Holding register. The state machine will remain in the TX Data Load state as long as the transmit state machine can empty the TX Holding Register before the next rising edge of CLK. When TXValid is negated the transmit state machine enters the Send EOP state where it sends the EOP. While the EOP is being transmitted TXReady is negated and the state machine will remain in the Send EOP state. After the EOP is transmitted the Transmit State Machine returns to the TX Wait state.
2. The Transmitter Module Design:
The transmitter module is designed by considering all the above specifications, using VHDL. The transmitter module of the UTMI consists of several blocks: the SYNC generator, the transmit hold and shift register, the bit stuffer, the NRZI encoder and the EOP generator. A transmit state machine is developed by considering all the states given by the USB 2.0 transmit state machine. Initially the transmitter is in the Reset state, where the reset signal is high. When the reset signal goes low, the transmitter changes to the TX Wait state, where it waits for the assertion of the TX Valid signal by the SIE.
2.1 SYNC generator:
When the TX Valid signal is asserted by the SIE, the transmit state machine enters the Send SYNC state, where it asserts a signal called sync enable. This signal is checked outside the state machine at every rising edge of the clock; when it is enabled, the SYNC pattern 01111110 is sent to the NRZI encoder.
2.2 Transmit hold and shift register:
The data byte placed on the data lines by the SIE is sampled by the UTMI at the rising edge of the clock. For this purpose, an 8-bit vector is declared in the entity declaration of the transmitter module. This 8-bit vector serves as the transmit hold and shift register. It is loaded with 8-bit parallel data from the SIE at the rising edge of the clock; at this moment the transmit state machine is in the Data Load state. After the register is loaded, the data is sent serially to the other modules: each bit of the register is sent to the Bit stuff module. After all the bits have been sent to the Bit stuff module, the TX Ready signal is asserted by the transmit state machine. During the parallel-to-serial conversion, the transmit state machine is in the Data Wait state.
2.3 EOP Generator:
When the TX Valid signal is negated by the SIE, the transmit state machine enters the Send EOP state, where it enables a signal called eop_enable. This signal is checked outside the state machine on every clock. If it is high, the EOP pattern, two single-ended zeros (i.e. both the DP and DM lines at zero) followed by a J (i.e. a 1 on the DP line and a 0 on the DM line), is transmitted on the DP and DM lines.
The Receiver Module Specification:
The receiver module has been implemented by considering the following specifications. When the SYNC pattern is detected, this should be signaled to the SIE. If a zero is not detected after six consecutive 1s, an error should be reported to the SIE. When the EOP pattern is detected, this should be signaled to the SIE. The behavior of the Receive State Machine is as follows. The assertion of Reset forces the Receive State Machine into the Reset state. The Reset state negates RXActive and RXValid. When the Reset signal is negated, the Receive State Machine enters the RX Wait state and starts looking for a SYNC pattern on the USB.
When a SYNC pattern is detected the state machine will enter the Strip SYNC state and assert RXActive. The length of the received SYNC pattern varies and can be up to 32 bits long. As a result, the state machine may remain in the Strip SYNC state for several byte times before capturing the first byte of data and entering the RX Data state. After 8 bits of valid serial data is received the state machine enters the RX Data state, where the data is loaded into the RX Holding Register on the rising edge of CLK and RXValid is asserted.
The SIE must clock the data off the DataOut bus on the next rising edge of CLK. Stuffed bits are stripped from the data stream. Each time 8 stuffed bits are accumulated the state machine will enter the RX Data Wait state, negating RXValid thus skipping a byte time. When the EOP is detected the state machine will enter the Strip EOP state and negate RXActive and RXValid. After the EOP has been stripped the Receive State Machine will reenter the RX Wait state and begin looking for the next packet. If a Receive Error is detected, the Error State is entered and RXError is asserted. Then either the Abort 1 State is entered where RXActive, RXValid, and RXError are negated, or the Abort 2 State is entered where only RXValid, and RXError are negated. The Abort 1 State proceeds directly to the RX Wait State, while Abort 2 State proceeds to the Terminate State after an Idle bus state is detected on DP and DM. The Terminate State proceeds directly to the RX Wait State. When the last data byte is clocked off the DataOut bus the SIE must also capture the state of the RXError signal. The description of the receiver given above is summarized below.
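As one concrete piece of the receiver behavior described above, the following sketch decodes line levels back to bits, strips stuffed bits and flags a bit-stuff error when six consecutive ones are not followed by a zero. It is a software illustration rather than the VHDL design. The NRZI convention assumed here is that of the USB 2.0 specification (a level change decodes to 0, no change decodes to 1), and the function names and the idle starting level are hypothetical.

```python
def nrzi_decode(levels, start_level=1):
    """Decode line levels: a change yields 0, no change yields 1."""
    previous = start_level
    bits = []
    for level in levels:
        bits.append(1 if level == previous else 0)
        previous = level
    return bits

def bit_unstuff(bits):
    """Remove stuffed zeros; return (data_bits, error_flag).

    error_flag is True when a 1 appears where a stuffed 0 was expected,
    which the receiver would report to the SIE as an error.
    """
    data = []
    ones = 0
    for bit in bits:
        if ones == 6:
            if bit == 1:
                return data, True   # bit-stuff error
            ones = 0                # drop the stuffed zero
            continue
        data.append(bit)
        ones = ones + 1 if bit == 1 else 0
    return data, False

print(bit_unstuff([1, 1, 1, 1, 1, 1, 0, 1, 0]))
print(bit_unstuff([1, 1, 1, 1, 1, 1, 1]))
```

In the high-speed case described earlier, the same error condition is produced deliberately by the transmitter to mark the end of a packet, so a full receiver would have to distinguish that case from a genuine error.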
3.2. PIE block:
The Parallel Interface Engine provides byte-oriented communication between the UTMI and the functional blocks of the peripheral device. On the receiver side it extracts and handles the special parts of the packet, the Packet Identifier (PID) and the CRC, coming from the cable via the UTMI. The transmit functionality includes preparation of the CRC-5 and CRC-16 codes. The general schematic of the PIE block interconnection is presented in Fig. 4. Its architecture was divided into two parts, PACKET and TRANS, controlled by two separate Finite State Machines. The key functionality of the PACKET module is the extraction, analysis and construction of packets, in particular the PID and CRC fields. The schematic of the corresponding FSM is presented in Fig. 5. Receive and transmit functions are realized in two different branches accessible from the common NONE state. The receive sequence starts from the detection of the SYNC pattern in the UTMI, signaled by the high state of the RX active and RX valid lines. In the next state, RPID, the packet identifier (PID) is recognized. The PID determines the content of the packet and consequently the further sequence of states. For packets containing data the next state is RPID_DATA, whilst for special packets like token or start of frame, the next states are RPID_TOKEN and RPID_SOF respectively.
Regardless of the branch taken, the low state of RX active line from UTMI signals end of transmission. In the next state RCHECK, the CRC code which has been calculated on-line is compared with the incoming one attached to the end of packet. Depending on CRC result the next state is either RFIN where the transmission is closed or the SERROR where the alerts about invalid data are generated. The other branch of FSM controls data transmission functionality, with alternative sequences depending on the kind of packet to be sent (data or various kinds of control again).
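The CRC comparison performed in the RCHECK state can be sketched as a bit-serial computation. The text names CRC-5 and CRC-16 but gives no parameters, so the sketch assumes those of the USB 2.0 specification: generator polynomials x^5 + x^2 + 1 and x^16 + x^15 + x^2 + 1, a shift register preset to all ones, and a transmitted CRC equal to the inverted remainder, most significant bit first. All function names are illustrative and the bit patterns in the usage lines are arbitrary examples.

```python
CRC5_POLY = 0b00101   # x^5 + x^2 + 1 (assumed, per USB 2.0)
CRC16_POLY = 0x8005   # x^16 + x^15 + x^2 + 1 (assumed, per USB 2.0)

def crc_remainder(bits, poly, width):
    """Bit-serial CRC register after processing bits in wire order."""
    mask = (1 << width) - 1
    reg = mask                                # register preset to all ones
    for bit in bits:
        feedback = ((reg >> (width - 1)) & 1) ^ bit
        reg = (reg << 1) & mask
        if feedback:
            reg ^= poly
    return reg

def crc_field(bits, poly, width):
    """CRC bits as sent on the wire: inverted remainder, MSB first."""
    crc = crc_remainder(bits, poly, width) ^ ((1 << width) - 1)
    return [(crc >> i) & 1 for i in range(width - 1, -1, -1)]

def bytes_to_bits(data):
    """Expand bytes to bits, least significant bit of each byte first."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]

def crc_matches(data_bits, received_crc_bits, poly, width):
    """Compare the locally computed CRC with the one attached to the packet."""
    return crc_field(data_bits, poly, width) == list(received_crc_bits)

token = [1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1]     # 11 token bits, arbitrary
print(crc_matches(token, crc_field(token, CRC5_POLY, 5), CRC5_POLY, 5))

payload = bytes_to_bits(b"\x00\x01\x02\x03")
crc16 = crc_field(payload, CRC16_POLY, 16)
payload[0] ^= 1                               # simulate a transmission error
print(crc_matches(payload, crc16, CRC16_POLY, 16))
```

In hardware the register is updated one bit per serial clock while the packet streams through, which is what makes the "on-line" calculation mentioned above possible without buffering the whole packet.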
Figure 4. Parallel Interface Engine (PIE) block structure and connection schematic
The behavior of the TRANS module strongly depends on the control packets received by the UTMI block and recognized by the PACKET module. The TRANS module performs the dialog with the computer, but at a higher level of abstraction; consequently the variety of its reactions is greater. Depending on the requested action, receive or transmit, the Finite State Machine presented in Fig. 6 transits from the IDLE state to the SOUT or SIN state respectively. SOUT starts the receive operation. The basic sequence of states WAIT_4_PID_DATA, RPID_DATA, RDATA, and RECEIVING_DATA leads to the collection of bulk data. The final stage of the receive operation requires a handshake response sent to the computer in the form of an acknowledgment or a not-acknowledgment packet. This action is performed in two alternative branches with the WAIT2_ACK/TACK and WAIT2_TNAK/TNAK sequences respectively.
After sending the handshake the Finite State Machine goes back to the IDLE state. Alternative branch of the FSM controlling TRANS module starts from the SIN state. The next state is WAIT2T_PID_DATA, where the TRANS module sets the TX valid signal to 1 requesting UTMI block to send the SYNC pattern to the USB cable. If the UTMI is able to do it, it responds with a TX ready signal (via the PACKET block) and the next state is TPID_DATA. In this state the TRANS block sends the packet identifier to 8-bit data out bus. Simultaneously it sets the TX valid signal to inform the prospective peripheral system that the transmission may start. In the next TPID_DATA state the main transmission takes place until the peripheral system resets the Ready line to 0 (typically after sending all the data) and then the PACKET module sets the TX fin signal to 1 (after transporting the last byte of packet to UTMI). The next state is WAIT_4_ACK. All the blocks wait for appropriate handshake packet, which must be sent by the computer. If the appropriate PID is recognized the next state is IDLE. If any other PID is received the next state is SERROR. This state may be reached from most of the states when further step of transmission is impossible, e.g. because some device is busy. Depending on the reason, the appropriate alerts may be sent to the peripheral or to PACKET and UTMI blocks.
Figure 5. Finite State Machine controlling PACKET module of Parallel Interface Engine block
Figure 6. Finite State Machine controlling TRANS module of Parallel Interface Engine block
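The state sequence of the TRANS module described above can be sketched as a small transition table. This is only an illustrative software model, not the authors' VHDL implementation: the state names are taken from the text, whereas every event name (receive_request, data_pid_seen, data_accepted and so on) is a hypothetical label introduced here, and routing every unlisted event to SERROR is a simplifying assumption.

```python
# Illustrative model of the TRANS FSM. State names follow the text;
# all event names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    # Receive branch: IDLE -> SOUT -> ... -> handshake -> IDLE
    ("IDLE", "receive_request"): "SOUT",
    ("SOUT", "next"): "WAIT_4_PID_DATA",
    ("WAIT_4_PID_DATA", "data_pid_seen"): "RPID_DATA",
    ("RPID_DATA", "next"): "RDATA",
    ("RDATA", "next"): "RECEIVING_DATA",
    ("RECEIVING_DATA", "data_accepted"): "WAIT2_ACK",
    ("RECEIVING_DATA", "data_refused"): "WAIT2_TNAK",
    ("WAIT2_ACK", "tx_ready"): "TACK",
    ("WAIT2_TNAK", "tx_ready"): "TNAK",
    ("TACK", "handshake_sent"): "IDLE",
    ("TNAK", "handshake_sent"): "IDLE",
    # Transmit branch: IDLE -> SIN -> ... -> WAIT_4_ACK -> IDLE or SERROR
    ("IDLE", "transmit_request"): "SIN",
    ("SIN", "next"): "WAIT2T_PID_DATA",
    ("WAIT2T_PID_DATA", "tx_ready"): "TPID_DATA",
    ("TPID_DATA", "tx_fin"): "WAIT_4_ACK",
    ("WAIT_4_ACK", "ack_pid"): "IDLE",
    ("WAIT_4_ACK", "other_pid"): "SERROR",
}


def step(state, event):
    """Return the next state. Assumption: any unlisted event leads to SERROR."""
    return TRANSITIONS.get((state, event), "SERROR")


def run(events, state="IDLE"):
    """Feed a sequence of events to the FSM and return the final state."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    transmit_ok = ["transmit_request", "next", "tx_ready", "tx_fin", "ack_pid"]
    print(run(transmit_ok))
```

A table-driven form keeps each branch of Fig. 6 visible as one group of entries, which makes it straightforward to compare the model against the state diagram.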
Finite state machines for modeling control logic:
A finite state machine is a model of a reactive system. The model defines a finite set of states and behaviors, and how the system transitions from one state to another when certain conditions are true. Finite state machines are used to model complex logic in dynamic systems, from automatic transmissions to robotic systems to mobile phones. Examples of this complex logic include scheduling a sequence of tasks or steps for a system; defining fault detection, isolation, and recovery logic; and supervising how to switch between different modes of operation.
Finite state machines can be modeled with Stateflow software and integrated into a Simulink model. These finite state machines are represented by state charts, which provide additional capabilities beyond traditional finite state machines, including modeling hierarchical states for large-scale systems, adding flow graphs to define complex decision logic, and defining orthogonal states to represent systems with parallelism.
UNIT-IV FPGA
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing, hence "field-programmable". The FPGA configuration is generally specified using a hardware description language (HDL), similar to that used for an application-specific integrated circuit (ASIC). Circuit diagrams were previously used to specify the configuration, as they were for ASICs, but this is increasingly rare. Contemporary FPGAs have large resources of logic gates and RAM blocks to implement complex digital computations. As FPGA designs employ very fast I/Os and bidirectional data buses, it becomes a challenge to verify that valid data meets its setup and hold times. Floor planning enables resource allocation within the FPGA to meet these timing constraints. FPGAs can be used to implement any logical function that an ASIC could perform. The ability to update the functionality after shipping, partial reconfiguration of a portion of the design, and the low non-recurring engineering costs relative to an ASIC design (notwithstanding the generally higher unit cost) offer advantages for many applications.
FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like many (changeable) logic gates that can be inter-wired in (many) different configurations. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory. Some FPGAs have analog features in addition to digital functions.
The most common analog feature is programmable slew rate and drive strength on each output pin, allowing the engineer to set slow rates on lightly loaded pins that would otherwise ring unacceptably, and to set stronger, faster rates on heavily loaded pins on high-speed channels that would otherwise run too slowly.[4][5] Another relatively common analog feature is differential comparators on input pins designed to be connected to differential signaling channels. A few "mixed signal FPGAs" have integrated peripheral analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) with analog signal conditioning blocks, allowing them to operate as a system-on-a-chip. Such devices blur the line between an FPGA, which carries digital ones and zeros on its internal programmable interconnect fabric, and a field-programmable analog array (FPAA), which carries analog values on its internal programmable interconnect fabric.
Applications of FPGAs include digital signal processing, software-defined radio, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, metal detection and a growing range of other areas.
FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.
Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
Common FPGA Applications
Aerospace and Defense
  o Avionics/DO-254
  o Communications
  o Missiles & Munitions
  o Secure Solutions
  o Space
ASIC Prototyping
Audio
  o Connectivity Solutions
  o Portable Electronics
  o Radio
  o Digital Signal Processing (DSP)
Automotive
  o High Resolution Video
  o Image Processing
  o Vehicle Networking and Connectivity
  o Automotive Infotainment
Broadcast
  o Real-Time Video Engine
  o EdgeQAM
  o Encoders
  o Displays
  o Switches and Routers
Consumer Electronics
  o Digital Displays
  o Digital Cameras
  o Multi-function Printers
  o Portable Electronics
  o Set-top Boxes
Distributed Monetary Systems
  o Transaction verification
  o BitCoin Mining
Data Center
  o Servers
  o Security
  o Routers
  o Switches
  o Gateways
  o Load Balancing
High Performance Computing
  o Servers
  o Super Computers
  o SIGINT Systems
  o High-end RADARS
  o High-end Beam Forming Systems
  o Data Mining Systems
Industrial
  o Industrial Imaging
  o Industrial Networking
  o Motor Control
Medical
  o Ultrasound
  o CT Scanner
  o MRI
  o X-ray
  o PET
  o Surgical Systems
Security
  o Industrial Imaging
  o Secure Solutions
  o Image Processing
Video & Image Processing
  o High Resolution Video
  o Video Over IP Gateway
  o Digital Displays
  o Industrial Imaging
Wired Communications
  o Optical Transport Networks
  o Network Processing
  o Connectivity Interfaces
Wireless Communications
  o Baseband
  o Connectivity Interfaces
  o Mobile Backhaul
  o Radio
ARCHITECTURE:
The most common FPGA architecture consists of an array of logic blocks (called Configurable Logic Block, CLB, or Logic Array Block, LAB, depending on vendor), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.
An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs/LABs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. For example, a crossbar switch requires much more routing than a systolic array with the same gate count. Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of lookup tables (LUTs) and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
In general, a logic block (CLB or LAB) consists of a few logical cells (called ALM, LE, Slice etc.). A typical cell consists of a 4-input LUT, a full adder (FA) and a D-type flip-flop, as shown below. In this figure the LUT is split into two 3-input LUTs. In normal mode those are combined into a 4-input LUT through the left mux. In arithmetic mode, their outputs are fed to the FA. The selection of mode is programmed into the middle multiplexer. In the figure example, the output can be either synchronous or asynchronous, depending on the programming of the mux to the right. In practice, the entire FA or parts of it are put as functions into the LUTs in order to save space.
FIGURE: Simplified example illustration of a logic cell
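The behavior of such a logic cell can be illustrated with a small software model. The sketch below is a behavioral approximation written for this section, not a description of any vendor's cell: the class name LogicCell, its method names, and the encoding of the truth tables as 8-entry lists are all assumptions made for illustration.

```python
class LogicCell:
    """Behavioral sketch of a logic cell: two 3-input LUTs, a full adder, a D flip-flop.

    All names here are hypothetical; the structure follows the simplified figure.
    """

    def __init__(self, lut_lo, lut_hi, arithmetic=False, registered=False):
        self.lut_lo = list(lut_lo)    # 8-entry truth table, used when d == 0
        self.lut_hi = list(lut_hi)    # 8-entry truth table, used when d == 1
        self.arithmetic = arithmetic  # middle mux: normal or arithmetic mode
        self.registered = registered  # right mux: flip-flop or direct output
        self.ff = 0                   # state of the D-type flip-flop

    def combinational(self, a, b, c, d=0, carry_in=0):
        """Return (value, carry_out) before the output mux."""
        index = a | (b << 1) | (c << 2)
        lo, hi = self.lut_lo[index], self.lut_hi[index]
        if self.arithmetic:
            # Arithmetic mode: both LUT outputs feed the full adder.
            total = lo ^ hi ^ carry_in
            carry_out = (lo & hi) | (carry_in & (lo ^ hi))
            return total, carry_out
        # Normal mode: the left mux uses d to pick one LUT, giving a 4-input LUT.
        return (hi if d else lo), 0

    def clock(self, a, b, c, d=0, carry_in=0):
        """Rising clock edge: capture the combinational value in the flip-flop."""
        self.ff = self.combinational(a, b, c, d, carry_in)[0]

    def output(self, a, b, c, d=0, carry_in=0):
        """Cell output: registered (synchronous) or combinational (asynchronous)."""
        if self.registered:
            return self.ff
        return self.combinational(a, b, c, d, carry_in)[0]


if __name__ == "__main__":
    # 4-input AND: the d == 0 half is all zeros, the d == 1 half is a AND b AND c.
    and4 = LogicCell([0] * 8, [0] * 7 + [1])
    print(and4.output(1, 1, 1, 1))

    # One-bit adder: lut_lo passes a, lut_hi passes b, the FA adds them with carry_in.
    adder = LogicCell([i & 1 for i in range(8)],
                      [(i >> 1) & 1 for i in range(8)], arithmetic=True)
    print(adder.combinational(1, 1, 0, carry_in=1))
```

In this model the two configuration flags play the role of the middle and right multiplexers in the figure, while the contents of the two lists correspond to the LUT configuration bits.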
FPGA DESIGN AND PROGRAMMING:
To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form is better suited to large structures because it is possible to specify them numerically rather than having to draw every piece by hand. However, schematic entry can allow for easier visualisation of a design. Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA. This file is transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.
The most common HDLs are VHDL and Verilog. Because designing in HDLs has been compared to programming in assembly languages, there are moves to reduce the complexity by raising the abstraction level through the introduction of alternative languages. National Instruments' LabVIEW graphical programming language (sometimes referred to as "G") has an FPGA add-in module available to target and program FPGA hardware.
To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically released under free and open source licenses such as the GPL, BSD or similar license), and other sources.
In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
BASIC PROCESS TECHNOLOGY TYPE:
SRAM - based on static memory technology. In-system programmable and reprogrammable. Requires external boot devices. CMOS. Currently in use.
Antifuse - One-time programmable. CMOS.
PROM - Programmable Read-Only Memory technology. One-time programmable because of plastic packaging. Obsolete.
EPROM - Erasable Programmable Read-Only Memory technology. One-time programmable but with window, can be erased with ultraviolet (UV) light. CMOS. Obsolete.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some but not all EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some but not all flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar. Obsolete.
UNIT-V CONCLUSIONS
Selected details of the construction and functionality of the USB protocol were presented. The introduction was followed by the authors' proposal of dedicated hardware containing most of the data transmission mechanisms. The design was partitioned into three modules forming a system of bidirectional, sequential data flow. Each module, responsible for a specific stage of data processing, is controlled by its own Finite State Machine. Other imaginable solutions are splitting the architecture according to the direction of data transmission into two parts, receive and transmit, controlled by e.g. two Finite State Machines, or implementing all the functionality in a single automaton. The selected approach brings some risk of deadlock caused by cross-dependencies of the three automata. Its key advantage, however, is that it preserves the natural concurrency of operation of the three modules. The functionality of the USB interface quite often requires simultaneous processing of various parts of the transmitted data.
USB is nowadays probably the most commonly used interface between computers and peripherals. There is a constantly growing number of low-cost devices whose great advantage is meant to be ease of use. The detailed analysis of protocol mechanisms accompanying the presented project led to the surprising conclusion that these mechanisms are very complicated. The variety of sequential actions and reactions that must be performed to provide plug and play operation of the target device requires substantial effort devoted to both design and verification of the prospective module. This level of complication is hard to justify. It also forces digital design teams to buy and reuse the appropriate Intellectual Property modules rather than develop their own versions and integrate them with application-specific architectures, e.g. at the level of Hardware Description Language code.
This practice stands in total opposition to the old RS232 serial interface, where the design of UART modules used to be a popular task for students and young engineers starting their careers in digital electronic design. Another functional difference between USB and RS232 is the need for software drivers that provide the computer operating system's supervision of, and communication with, the USB-equipped device. Problems with driver compatibility and installation are well known to all users of niche products and have made them skeptical about the plug and play myth. These problems are rarely observed for mass-market products, where the vendors are able to (technically speaking) buy compatibility with e.g. Microsoft Windows systems.
The third issue, and a limitation for individual constructions of USB interfaces and their embedding in designs implemented in low-cost programmable logic devices, is the relatively high clock frequency of the serial transmission. The 480 MHz bit clock of the high-speed mode remains quite high and difficult to reach in the low-cost programmable circuits available nowadays (A.D. 2011/2012). Continuous development and the growing speed of FPGA circuits will eventually enable implementation of architectures working at this speed. Before that time, however, a faster version of USB may become the new standard.
These three issues raise serious questions about the sense of the continuous growth in variety and volume of USB-based devices on the market. It seems that the reliable (unlikely to crash), designer-friendly and user-friendly RS232/UART solution was replaced by the much faster USB at the price of more frequent crashes, problems with drivers, and the loss of the open character of the design. This replacement is extensively supported by personal computer vendors, who remove RS232 from main boards as obsolete. The small and desperate exception to this trend is the RS232-over-USB concept based on the popular off-the-shelf FT232 circuit, which enables connection of a classic UART interface with RS232-dedicated computer software via the USB socket. This approach is extensively used by developers of specialized devices and in some low-cost products. For higher speeds, however, at this stage there is no escape from purchasing USB cores developed by a small group of design houses and then either fabricated as separate chips or embedded in higher-quality programmable logic devices.

