HERA-B ECAL PRETRIGGER BOARD DESCRIPTION
C.Baldanza, M.Bruschi, I.D'Antone, M.Piccinini, M.Zuffa
Istituto Nazionale di Fisica Nucleare
This report describes the HERA-B ECAL pretrigger board. It also shows how all the operations on the board are pipelined, and it gives the latency of each operation.
CENTRO DI ELETTRONICA
ISTITUTO NAZIONALE DI FISICA NUCLEARE
Sezione di Bologna
1. Introduction
The following sections describe the HERA-B ECAL pretrigger board. They also show how all the operations on the board are pipelined and give the latency of each operation.
2. Pretrigger data input
The input data are supplied by the readout cards on two wires per channel, according to a synchronous serial protocol.
The input data rate equals the HERA-B bunch crossing (BX) rate, one datum every 96 ns. Each datum contains 8 bits: seven are the value given by the ADC of the front-end card and the remaining bit is a flag, set by a comparator on the front-end card, that indicates for each channel a value over threshold. The pretrigger recomposes the data distributed on the two lines and processes only the data with the flag bit set.
The data are synchronized by means of the experiment HERA_clock, whose period is 96 ns. On the pretrigger board the HERA_clock frequency is multiplied by four with PLL circuits, and the resulting fCLK signal, with a period of 24 ns, is used to synchronize all the circuits of the board.
The data on the input lines are serialized with the 24 ns clock. The timing of the data transfer on a pair of lines with respect to the HERA_clock and to the fCLK is shown in fig.1.
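As an illustration of how a datum could be recomposed from the two serial lines, the following Python sketch models one channel. It assumes that the odd line carries bits 7, 5, 3, 1 and the even line bits 6, 4, 2, 0, most significant pair first, with bit 7 being the flag. The function names are hypothetical; the real bit timing is the one shown in fig.1.

```python
from typing import List, Tuple

def serialize(datum: int) -> List[Tuple[int, int]]:
    """Split an 8-bit datum into four (odd_bit, even_bit) pairs, MSB pair first.

    Assumption: one pair is sent per fCLK period, so four periods make one BX.
    """
    if not 0 <= datum <= 0xFF:
        raise ValueError("datum must fit in 8 bits")
    return [((datum >> (7 - 2 * k)) & 1, (datum >> (6 - 2 * k)) & 1)
            for k in range(4)]

def recompose(pairs: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Rebuild the datum and return (flag_bit, adc_value)."""
    datum = 0
    for odd, even in pairs:
        datum = (datum << 2) | (odd << 1) | even
    return datum >> 7, datum & 0x7F  # bit 7 is the flag, bits 6..0 the ADC value

flag, adc = recompose(serialize(0x55))
```

The four pairs of one datum fit exactly into one 96 ns bunch crossing when one pair is transferred per 24 ns fCLK period.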
3. Data organization
The pretrigger board must process groups of 48 or 50 data coming from the calorimeter. The data are organized in a 12x8 matrix, so that each pretrigger card can handle two groupings of calorimeter cells, 10x5 and 8x6, plus the border cells.
For each BX the pretrigger board can process three clusters. Each cluster is composed of a central cell and the eight cells that surround it (nonet).
To process a nonet whose central cell lies on the border of the matrix, for example the C1 or C30 cell in a 10x5 matrix (fig.2), the values of the border cells of the adjacent matrices are needed.
Fig.2 shows the arrangement of the cells in the 10x5 and 8x6 matrices. The cells of each matrix are indicated with the letter C (C1, C2, ..., C50). The U, D, L and R border cells originate from the eight matrices in the neighbouring pretrigger boards.
4. Block diagram description
The serial data are transferred simultaneously to the Event Buffer & Multiplexer (EBM) and to the Local Maxima Finder Unit (LMFU). The EBM stores the data of an event and associates them with the BX number. The LMFU identifies three cells whose value is higher than that of their neighbours and whose flag is active.
The addresses of the three cells are sent to the Process Controller (CTRL), which stores each address, together with the corresponding BX number, in one of three FIFOs.
The CTRL also contains four more FIFOs, which store the coordinates of the cells that must be processed in the search for the Bremsstrahlung photon (BRM).
Two FIFOs are used for the on-board Bremsstrahlung recovery and the other two for the Bremsstrahlung recovery requested by the external pretrigger boards on the right or left side.
These coordinates identify the right-side and left-side photons belonging to the current nonet.
A sequencer within the CTRL extracts three coordinates per BX from the FIFOs according to a suitable priority criterion and sends them to the EBM and to the Message Buffer (MBUF).
The data coming from the EBM are processed by the LUT, whose results are sent to the MBUF. The MBUF is a circuit composed of 8 dual-port RAMs and a controller that handles the data flow. The controller stores the results in the dual-port RAM, distinguishes the results of a main cluster from those of the Bremsstrahlung recovery and, finally, extracts the complete data.
The complete data are transferred to the VME Interface & Message Formatter (VIMF), in which the message for the TFU is assembled. The message is a packet of four 20-bit words.
The MBUF also controls the BRM Request Interface (BRI), which provides, by means of a look-up table, the coordinates for the Bremsstrahlung recovery.
The coordinates may belong to the matrix processed on the same board or to the two matrices on the right and left sides. In the first case the coordinates are sent directly to the FIFO of the on-board CTRL; in the second case they are sent to the CTRL of the pretrigger board on the right or left side.
At the same time the BRI may receive requests from the card on the right or left side and send them to the CTRL. The results of this processing go through the BRI to the external boards.
The FCS interface receives the bunch crossing number (BCN) and transfers it to the CTRL. The FCS interface also receives the HERA_clock and sends it to the Fast Clock Generator (FCG), where the fCLK is generated and distributed to all the on-board circuits.
The FCG contains PLL circuits that can delay or advance the fCLK. In this way the clock edges of all the pretrigger boards can be synchronized.
The data arriving at the Input Interface (InI) are in differential TTL format; in the InI they are converted to TTL and synchronized with the fCLK.
The block diagram of the pretrigger board is shown in fig.3.
5. Input Interface
|JCA||Central cells first part||pCO1-12, nCO1-12, pCE1-13, nCE1-13|
|JCB||Central cells second part||pCO13-25, nCO13-25, pCE14-25, nCE14-25|
|JCC||Central cells third part||pCO26-37, nCO26-37, pCE26-38, nCE26-38|
|JCD||Central cells fourth part||pCO38-50, nCO38-50, pCE39-50, nCE39-50|
|JU||Upper border cells||pUE1-10, nUE1-10, pUO1-10, nUO1-10|
|JD||Lower border cells||pDE1-10, nDE1-10, pDO1-10, nDO1-10|
|JL||Left border cells||pLE1-8, nLE1-8, pLO1-8, nLO1-8|
|JR||Right border cells||pRE1-8, nRE1-8, pRO1-8, nRO1-8|
The Input Interface (InI) receives the signals coming from the front-end cards in differential TTL format, converts them to single-ended TTL, synchronizes them with the fCLK and distributes them to the EBM and to the LMFU.
There are eight input connectors; Table 1 shows the signal distribution. The signals are named according to their destination. A 'p' or an 'n' at the beginning of the name distinguishes the positive and negative signals. The second letter identifies the group to which the signal belongs: 'C' for central cells, 'U' for the upper edge, 'D' for the lower edge, 'L' for the left edge and 'R' for the right edge. The third letter is 'E' if the signal line carries the even bits and 'O' if it carries the odd bits (Table 1).
The number of signals is 344: 50 central cells, 10 upper, 10 lower, 8 right and 8 left cells, multiplied by 4 (two lines carry the even and odd bits, and each line needs the positive and negative wires of the differential signal). The number of cells handled by the input interface is therefore larger than required by the 10x5 format (84 cells including the edges) and by the 8x6 format (80 cells including the edges). In this way the two formats are handled with the same pretrigger board.
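The counts quoted above can be checked with a few lines of Python. The sketch assumes that the figures of 84 and 80 cells include the four corner cells of the border, which is the reading that reproduces them; the names are hypothetical.

```python
# Cells wired to the input connectors, as listed in the text.
CELLS = {"central": 50, "upper": 10, "lower": 10, "left": 8, "right": 8}

# Two serial lines (even and odd bits), each with a positive and a negative wire.
WIRES_PER_CELL = 2 * 2

def total_signals() -> int:
    """Number of input signals on the eight connectors."""
    return sum(CELLS.values()) * WIRES_PER_CELL

def cells_with_border(width: int, height: int) -> int:
    """Cells of a width x height matrix plus a one-cell border, corners included."""
    return (width + 2) * (height + 2)

print(total_signals())            # wires on the connectors
print(cells_with_border(10, 5))   # 10x5 format with its edges
print(cells_with_border(8, 6))    # 8x6 format with its edges
```

The 86 wired cells cover both formats, which is why one board layout can serve either matrix.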
The connectors feed the differential receivers and the latches synchronized with the fCLK. The signals are then transferred to the LMFU and to the multiplexers within the EBM that create the 8x6 or 10x5 matrix (fig.2). The selection between the two formats is made by means of a jumper. The multiplexers distribute the input signals into two matrices like those shown in fig.2. The signals are distributed in this way because, to extract the nonet, the data in the EBM are organized by columns.
6. Local Maxima Finder Unit
The Local Maxima Finder Unit (LMFU) is a device that finds, in a matrix, three cells containing a cluster. It is implemented in a Xilinx 4013 FPGA. The capability to process the two types of matrix, 10x5 or 8x6, is obtained by loading one of two different programs into the FPGA at start-up, according to the position of a jumper.
The input data come from the InI in serial format and concern the central and the edge cells. At its output the device gives three addresses of cells containing an event and, for each address, a flag that validates it.
The criteria for the choice of a cell are:
1. the cell must have a value higher than the cell above it and the cell to its left.
2. the cell must have a value equal to or higher than the cell below it and the cell to its right.
3. the cell must have the flag bit active ('0').
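The three criteria can be sketched in Python as follows. This is only an illustration: the grid layout (a one-cell border around the matrix, row 0 at the top) and the function name are assumptions. The mix of strict and non-strict comparisons means that, of two equal neighbouring cells, only one is selected.

```python
from typing import List, Tuple

FLAG_ACTIVE = 0  # the text states that the flag is active at '0'

def find_local_maxima(values: List[List[int]],
                      flags: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, col) of every interior cell that satisfies the three criteria.

    values and flags include a one-cell border on every side (assumed layout).
    """
    hits = []
    for r in range(1, len(values) - 1):
        for c in range(1, len(values[0]) - 1):
            v = values[r][c]
            if (flags[r][c] == FLAG_ACTIVE
                    and v > values[r - 1][c]       # strictly higher than upper cell
                    and v > values[r][c - 1]       # strictly higher than left cell
                    and v >= values[r + 1][c]      # equal or higher than lower cell
                    and v >= values[r][c + 1]):    # equal or higher than right cell
                hits.append((r, c))
    return hits

# Two equal cells side by side: only the left one passes.
values = [[0, 0, 0, 0],
          [0, 5, 5, 0],
          [0, 0, 0, 0]]
flags = [[0] * 4 for _ in range(3)]
print(find_local_maxima(values, flags))
```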
Two cells are compared with a two-line serial comparator, adapted to the format of the input data. At each fCLK edge two 2-bit data arrive, one for each cell. The first pair of bits, i.e. the most significant bits of the datum, is marked by a '1' on a synchronization line. The most significant bit of this pair contains the flag. If the flag is not active, the cell is immediately discarded. If the first 2-bit data of the two cells are equal, the comparator waits for the second pair, then the third and the fourth. As soon as two pairs differ the comparator stops. The result of the comparison is stored and transferred to the next circuit at the next synchronization pulse.
Each C cell is connected to 4 comparators called Upper (U), Lower (D), Left (L) and Right (R). As shown in Fig.5, two cells are connected to the d input of the R and D comparators and to the c input of the U and L comparators.
The flow chart of the comparator cycle is shown in Fig.6. The description that follows refers to the elements of the comparator in Fig.5.
The cycle starts at the arrival of a synchronization signal R; at the same time bits 7 and 6 of the data arrive on the c and d inputs.
First the flag bit is tested and, if it is not active ('1'), the cell connected to d is immediately discarded. Otherwise the pairs of bits are compared sequentially.
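The comparator cycle can be modelled in Python as below. This is a behavioural sketch, not the FPGA logic: the function names and the returned strings are hypothetical, and it assumes that the flag bit is excluded from the magnitude comparison, which the text does not state.

```python
from typing import List

def to_pairs(datum: int) -> List[int]:
    """Split an 8-bit datum into four 2-bit values, most significant pair first."""
    return [(datum >> shift) & 0b11 for shift in (6, 4, 2, 0)]

def serial_compare(c_pairs: List[int], d_pairs: List[int]) -> str:
    """Compare the cell on input d with the cell on input c, pair by pair.

    Returns "discarded" if the flag of d is not active, otherwise
    "greater", "less" or "equal" for d relative to c.
    """
    if d_pairs[0] & 0b10:          # flag bit '1' means not active
        return "discarded"
    for i, (c, d) in enumerate(zip(c_pairs, d_pairs)):
        if i == 0:
            c &= 0b01              # assumption: compare only ADC bit 6 here
            d &= 0b01
        if c != d:                 # first differing pair decides, then stop
            return "greater" if d > c else "less"
    return "equal"

print(serial_compare(to_pairs(0x10), to_pairs(0x20)))
```

Because the most significant pair arrives first, the first pair that differs settles the comparison and the remaining pairs need not be examined.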
Each cell has a circuit that collects the results of the four comparators to which it is connected.
The response of this circuit, which compares the cell with the CU, CD, CL and CR cells that surround it, is '0' if the cell is valid and '1' if the cell has been discarded.
This output is captured by a latch synchronized with the fCLK.
The results are then transferred to three pipelined priority encoders with 50 inputs each. Each encoder contains five priority encoders with ten inputs each, for the 10x5 matrix, or six priority encoders with eight inputs each, for the 8x6 matrix. These encoders are in turn connected to a secondary encoder with variable priority (fig.7).
The matrices are divided into horizontal lines, 5 for the 10x5 and 6 for the 8x6. Each line is connected to a main encoder. The encoder identifies the active cell with the highest priority and puts the address of the corresponding cell position on its output lines. The first cell to the right of each line has address 0, the second 1 and so on. Each main encoder is connected to the secondary encoder by a line that becomes active when at least one of its cells is valid. If more than one main encoder is active, the secondary encoder selects the one with the highest priority, enabling its output buffer, which gives the column information (Column). At the same time it outputs the address of the row handled by that encoder (Row). The secondary encoder also handles the Valid line, which is active when at least one cell in the matrix is valid.
The row with the highest priority changes at each BX. At power-up the row with the highest priority is the first, then the second, the third and so on.
The first 50-input encoder selects the first event and, at the following BX, sends the data to the second encoder, blanking the cell that it has selected. The second encoder, one BX later, sends the data to the third encoder, again blanking the cell that it has selected.
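The cascade of three encoders with blanking can be sketched in Python as follows. The sketch ignores the pipeline timing and makes two assumptions that the text does not fix: within a row the lowest address has the highest priority, and the highest-priority row advances by one at each BX. All names are hypothetical.

```python
from typing import List, Optional, Tuple

def first_valid(valid: List[List[bool]], first_row: int) -> Optional[Tuple[int, int]]:
    """Return (row, column) of the highest-priority valid cell, or None."""
    rows = len(valid)
    for k in range(rows):
        r = (first_row + k) % rows          # rotating row priority
        for c, is_valid in enumerate(valid[r]):
            if is_valid:                    # assumed: lowest address wins in a row
                return (r, c)
    return None

def select_clusters(valid: List[List[bool]], bx: int = 0,
                    n_stages: int = 3) -> List[Tuple[int, int]]:
    """Model the three encoder stages: each selects one cell and blanks it."""
    remaining = [list(row) for row in valid]
    first_row = bx % len(remaining)
    selected = []
    for _ in range(n_stages):
        hit = first_valid(remaining, first_row)
        if hit is None:
            break
        selected.append(hit)
        remaining[hit[0]][hit[1]] = False   # blanking before the next stage
    return selected

grid = [[False] * 10 for _ in range(5)]     # 10x5 matrix: 5 rows of 10 cells
for r, c in [(0, 3), (0, 7), (2, 1), (4, 9)]:
    grid[r][c] = True
print(select_clusters(grid, bx=0))
print(select_clusters(grid, bx=3))
```

With more than three valid cells, which ones are kept depends on the row that currently has the highest priority, so the rotation spreads the selection over the matrix.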
Before transmission to the CTRL, the data are converted from the Row/Column value to the cell number in the matrix minus one: cell C1 has address 0, cell C2 address 1 and so on. From now on this cell address identifies the cells of the matrix.
The addresses of the three selected cells (HA, HB, HC), all relative to the same BX, are sent to the CTRL at three different times.
7. Process Controller
The Process Controller handles the processing of the data contained in the cells. It receives the addresses of the valid cells and the addresses of the cells with a BRM photon, and it starts three processing cycles for each BX. The CTRL takes the Bunch Crossing Number (BCN) from the FCS interface and uses it as the write address in the EBM. The part of the Process Controller that receives and distributes the signals is implemented in a Xilinx 4013E FPGA. The control of the EBM and the management of the BCN are implemented in a Xilinx 4003E FPGA. This last part receives the BCN from the FCS card, delays it by means of a shift register to compensate for the latency of the read-out cards, and stores it at the same address as the matrix of the same BX, so that it can be dispatched to the MBUF when the valid cells of the matrix have been processed.
There are seven sources of cell addresses to process: HA, HB, HC, BR, BL, XR and XL.
The addresses BR, BL, XR and XL carry the BCN of the active cell whose processing produced the address. The BCN of the addresses HA, HB and HC is associated directly by the CTRL.
The addresses are stored in seven FIFOs, each 16 words deep. Each FIFO is built with a RAM block in an FPGA and can be read and written simultaneously. Each FIFO has four status signals that inform the extraction circuit about its content. The signals are:
The extraction circuit takes three addresses per BX from the FIFOs and sends them to the EBM and to the MBUF. The addresses are output at a rate of one every 24 ns. Besides the address and the associated BCN, a group of 5 flags containing information on the data is also sent to the MBUF (tab.2).
|VALID||1 = The packet contains valid data.|
|LEFT||If EVENT = 0 and bit = 1 the packet contains left BRM data; if bit = 0 the packet contains right BRM data.|
|EXT||If EVENT = 0 and bit = 1 the packet contains an external BRM request to be transferred to the right or the left board according to the status of bit LEFT; if bit = 0 the packet contains a BRM request internal to the board.|
|EVENT||If bit = 1 the data concern an event processing, otherwise a BRM request processing.|
|ERR_IF_X||If bit = 1 an external BRM request has to be checked; if this is the case the processing is discarded.|
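A possible software view of the five flags is sketched below in Python. The bit positions 0 to 3 and the names VALID, LEFT, EXT and EVENT are those used in the Message Buffer section; placing ERR_IF_X at bit 4 is an assumption, and the class and function names are hypothetical.

```python
from enum import IntFlag

class PacketFlag(IntFlag):
    VALID = 1 << 0      # packet contains valid data
    LEFT = 1 << 1       # BRM data of the left side (right side if 0)
    EXT = 1 << 2        # external BRM request (internal if 0)
    EVENT = 1 << 3      # event processing (BRM request processing if 0)
    ERR_IF_X = 1 << 4   # assumed position: discard if an external BRM is needed

def describe(flags: int) -> str:
    """Return a short description of a packet from its flag bits."""
    f = PacketFlag(flags)
    if not f & PacketFlag.VALID:
        return "no valid data"
    if f & PacketFlag.EVENT:
        return "event"
    origin = "external" if f & PacketFlag.EXT else "internal"
    side = "left" if f & PacketFlag.LEFT else "right"
    return f"{origin} BRM request, {side}"

print(describe(PacketFlag.VALID | PacketFlag.EVENT))
print(describe(PacketFlag.VALID | PacketFlag.EXT))
```

LEFT and EXT are only interpreted when EVENT is 0, which is why the sketch tests EVENT first.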
The extraction circuit operates according to the following rules:
The output of each FIFO holds the next address (and the associated BCN) to be sent and is connected to a tri-state buffer. The extraction circuit enables only one buffer at a time, when it must send an address from a FIFO to the rest of the card.
Each FIFO has three circuits that check the data and flag and correct certain error conditions:
1. OVERRUN: It checks whether the BCN associated with the address is too old. The EBM has a depth of 64 words, so after 64 BX a buffered event is overwritten.
The addresses of cells with events (HA, HB, HC) are eliminated before the addresses of cells with BRM, because the processing of an event is always followed by the BRM processing. In case of an error on a BRM FIFO, the corresponding BRM processing is not launched and the incomplete message is overwritten and lost within the MBUF. The error flag is set.
2. NOEXTERNAL: Starting from the BCN of an event address (HA, HB, HC), it checks whether the system has time to process an external BRM request after the event processing.
In a limit situation there is time to process an internal BRM but not an external one, since the external BRM cycle has a longer latency of 6 BX. Consequence: in this case the ERR_IF_X flag is set and, if the result of the processing calls for an external BRM processing, the message is discarded.
3. FIFOFULL: It checks that the write cycles are inhibited when the FULL line of the FIFO is active. In this case the oldest address of the FIFO is overwritten.
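The OVERRUN condition amounts to comparing the age of a stored BCN with the depth of the EBM. A minimal Python sketch is given below; the wrap-around value of the BCN counter is not given in this report, so the `bcn_modulus` default is a purely hypothetical parameter, as is the function name.

```python
EBM_DEPTH = 64  # depth of the event buffer in bunch crossings, from the text

def is_overrun(stored_bcn: int, current_bcn: int, bcn_modulus: int = 256) -> bool:
    """Return True if the event stored at stored_bcn has been overwritten.

    bcn_modulus is a hypothetical wrap-around value of the BCN counter.
    """
    age = (current_bcn - stored_bcn) % bcn_modulus
    return age >= EBM_DEPTH

print(is_overrun(10, 73))   # 63 BX old
print(is_overrun(10, 74))   # 64 BX old
```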
The outputs of the three circuits described above go to three 7-bit registers readable from the VME bus. Each register contains the state of the FIFOs and is reset after it is read from the VME.
Furthermore, the output of each group of registers is connected to a red LED on the front panel. The three LEDs are called OVER, XERR and
8. Event Buffer & Multiplexer
The EBM stores 64 matrices, 8x6 or 10x5, in a buffer composed of static RAM implemented in 6 Xilinx 4013 FPGAs. The first pair of bits of the data arrives at the input of the EBM, where a serial-to-parallel converter handles the input serial data. At a suitable clock phase a valid parallel datum is obtained, which is stored in the buffer at the address given by the CTRL. For each cell of the matrix a RAM block of 64 7-bit words has been implemented in the FPGA; in total the buffer holds 64 x 96 7-bit words.
For each BX three nonets are extracted. The extracted data are sent to the LUT by means of nine 7-bit buses; each bus carries one cell of the nonet. During a write cycle the CTRL supplies the current BCN, which is used as the address in the buffer, while during a read cycle the CTRL supplies the address of the central cell of the nonet and the BCN of the event that must be analyzed. The address of the central cell is given by the CTRL, in a command word, to the multiplexer in the EBM, which sends the data on the buses to the LUT.
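The write and read cycles can be pictured with the following Python model of the buffer. It is a behavioural sketch only: the class and method names are hypothetical, the matrix is taken as 8 rows of 12 columns, and the buffer address is assumed to be the BCN modulo 64.

```python
from typing import List

class EventBuffer:
    """Circular buffer of 64 matrices of 12x8 cells, addressed by BCN."""
    DEPTH, ROWS, COLS = 64, 8, 12

    def __init__(self) -> None:
        self.ram = [[[0] * self.COLS for _ in range(self.ROWS)]
                    for _ in range(self.DEPTH)]

    def write(self, bcn: int, matrix: List[List[int]]) -> None:
        """Write cycle: store one matrix of 7-bit values at the address given by the BCN."""
        self.ram[bcn % self.DEPTH] = [list(row) for row in matrix]

    def read_nonet(self, bcn: int, row: int, col: int) -> List[List[int]]:
        """Read cycle: return the 3x3 nonet centred on (row, col) of event bcn."""
        if not (1 <= row <= self.ROWS - 2 and 1 <= col <= self.COLS - 2):
            raise ValueError("the central cell needs neighbours on every side")
        m = self.ram[bcn % self.DEPTH]
        return [[m[r][c] for c in (col - 1, col, col + 1)]
                for r in (row - 1, row, row + 1)]

buf = EventBuffer()
buf.write(70, [[r * 12 + c for c in range(12)] for r in range(8)])
print(buf.read_nonet(70, 1, 1))
```

With this addressing an event is overwritten 64 BX after it was stored, which is the situation the OVERRUN check in the Process Controller guards against.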
|Col ‘l’ bit||SLl||SCl||SRl||SLh||SCh||SRh||MLl||MCl||MRl||MLh||MCh||MRh||Col ‘h’ bit|
|Col ‘l’ bit||SLl||SCl||SRl||SLh||SCh||SRh||MLl||MCl||MRl||MLh||MCh||MRh||Col ‘h’ bit|
Each FPGA can store 16 cells organized in two columns of 8 cells. Tab.4 shows the mapping of the 6 FPGAs onto the matrix. The cells are part of a 12x8 matrix. The 8x6 and 10x5 matrices can both be embedded in the 12x8 matrix; in this way the two cases are handled with only a slightly larger matrix. This configuration has been adopted to optimize the FPGA area of the overall EBM implementation.
The six FPGAs are divided into 3 Slaves and 3 Masters. The Slaves are called SL, SR and SC and are connected to the Masters ML, MR and MC by means of a 21-bit bus. The Masters are connected to the Slaves and also to one another with three 21-bit buses. Each Master has an output that connects it to the other two Masters (fig.8).
Three 7-bit buses leave each Master FPGA and carry the nonet to the LUT. The first Master FPGA, ML, transfers the left-side values of the nonet. The second, MC, transfers the central values of the nonet. The third Master FPGA, MR, transfers the right-side values of the nonet (fig.8).
The multiplexer that extracts the nonet is distributed over the six FPGAs and acts in the following steps:
1. From all the columns, the values located on the same row of the matrix, on the row above and on the row below are extracted.
2. A second multiplexer, with a select signal that can be different for each Master-Slave pair, selects the trio of data from the first or the second column in every FPGA.
3. A third multiplexer within the Master, with a select signal different for each Master-Slave pair, selects the trio of data from the second multiplexer or from the Slave FPGA connected through the 21-bit bus. At this point the data have been reduced to the 9 values that compose the nonet.
4. Finally, a rotation of the three trios may be needed. Each FPGA has one 21-bit output bus, AB, and two 21-bit input buses, A and B (fig.8). To correct the positions of the trios, a further multiplexer on the three outputs of the Master performs the suitable rotation.
With a select signal common to the three Masters, the multiplexer selects the trio on A, producing an anticlockwise rotation of the data, or on B, producing a clockwise rotation of the data. The bus that brings the nonet to the LUT is connected to the multiplexer output.
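Why a rotation can be needed is illustrated by the Python sketch below. It assumes, from the column order listed in the FPGA mapping row (SLl, SCl, SRl, SLh, ...), that the columns are interleaved over the L, C and R Master-Slave pairs, so that column k (counted from 0) belongs to pair k mod 3. The function names are hypothetical and the sketch does not say which direction the hardware calls clockwise.

```python
from typing import List, Sequence

GROUPS = ("L", "C", "R")

def column_group(col: int) -> int:
    """Index in GROUPS of the Master-Slave pair holding column col (assumed interleaving)."""
    return col % 3

def order_trios(trio_by_group: Sequence, centre_col: int) -> List:
    """Rotate the trios delivered by the L, C and R pairs into left, centre, right order."""
    shift = column_group(centre_col - 1)   # pair that holds the left column
    return [trio_by_group[(shift + k) % 3] for k in range(3)]

# Central column 5: columns 4, 5, 6 sit in pairs C, R, L, so a rotation is needed.
print(order_trios(["col6", "col4", "col5"], centre_col=5))
# Central column 4: columns 3, 4, 5 sit in pairs L, C, R, no rotation needed.
print(order_trios(["col3", "col4", "col5"], centre_col=4))
```

Under this assumption only three cases exist (no rotation, one step in either direction), which matches a multiplexer that chooses between the local trio and the trios received on A and B.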
9. LUT
The LUT performs the processing that produces the x-y coordinates of the centre of gravity, the energy of the cluster and other information useful for the first level trigger (Energy, h, x, dx, ddx, DEST, BCN).
The result of this processing is a message as defined in [2]. Fig.9 shows the logical scheme with which the RAMs perform the processing, together with the processing performed on the nonet.
The LUT also evaluates the coordinates of the cells where the Bremsstrahlung correction has to be searched for.
10. Message Buffer
The MBUF temporarily stores the data coming from the LUT relative to the energy of the selected cell, and waits for the data of the BRM processing to complete the message. To do this it uses a 256x75-bit dual-port RAM built with IDT7014S12J chips and controlled by a Xilinx 4013E FPGA.
For each data group coming from the LUT, the CTRL supplies the event parameters: the address of the central cell, the BCN of the event and the flags describing the event.
The parameters are re-synchronized through a delay circuit to compensate for the latency of the LUT. A control circuit routes the data arriving from the LUT, distinguishing the data of a principal event from those of a right-side or left-side BRM. It does so by means of the flags describing the event, following these rules:
1. Bit 0 (VALID) must be active, otherwise the data packet is discarded.
2. If bit 3 (EVENT) is active, the data are relative to a cell having an event selected by the LMFU. All the elements (Energy, h, x, dx, ddx, DEST, BCN) are stored in the dual-port RAM at the address BCN. The address of the cell and the Dbrem parameter from the LUT are then transferred to the BRI, to evaluate the addresses of the cells with the two BRM photons.
3. If bit 3 (EVENT) is not active, the data coming from the LUT contain the energy of a cell with a BRM photon, which can be on the left side, bit 1 (LEFT) active, or on the right side, bit 1 not active. The BRM request can be internal or external, according to the state of bit 2 (EXT):
4. When VALID is not active, the control circuit enables the storage of the energy values coming from the right and left boards, results of previous processing. The interface that receives the data coming from the right or left side contains a FIFO that holds the data while they wait for extraction by the control circuit. This FIFO also contains some test circuits to recognize error conditions, which are signalled on the front panel by means of an LED and are readable from the VME bus.
[Figure: LUT block diagram]
11. Message Formatter & VME Interface
The communication of the message to the following cards, the TFU, is controlled by the MBUF and the MFVI. When the board is enabled and has a message to transfer, the MBUF sends the data to the MFVI, which creates the message for the TFU, formatted in four 20-bit words.
Up to 16 pretrigger cards can be connected in parallel on a local bus. At each BX one of the cards on the bus transfers its message to the TFU.
Fig.10 shows the output section of 4 Message Formatter blocks. A bus arbiter handles the bus access in such a way that no board is privileged. By means of a local bus on the J3 connector, the arbiter handles the data transfer of 16 pretrigger boards. The 80-bit data are transferred to the TFU on a 20-bit data bus at a rate of 4 x 20 bit / BX.
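The splitting of the 80-bit message into four 20-bit words can be sketched in Python as follows. The word order (most significant word first) is an assumption, since the actual layout is defined in [2], and the function names are hypothetical.

```python
from typing import List

WORD_BITS = 20
WORDS_PER_MESSAGE = 4
WORD_MASK = (1 << WORD_BITS) - 1

def split_message(message: int) -> List[int]:
    """Split an 80-bit message into four 20-bit words, most significant first (assumed)."""
    if not 0 <= message < 1 << (WORD_BITS * WORDS_PER_MESSAGE):
        raise ValueError("message must fit in 80 bits")
    return [(message >> (WORD_BITS * i)) & WORD_MASK
            for i in reversed(range(WORDS_PER_MESSAGE))]

def join_message(words: List[int]) -> int:
    """Rebuild the 80-bit message from its four words."""
    message = 0
    for word in words:
        message = (message << WORD_BITS) | (word & WORD_MASK)
    return message

words = split_message((0xABCDE << 60) | 0x12345)
print([hex(w) for w in words])
```

Four words per BX correspond to one word per 24 ns, i.e. one word per fCLK period.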
12. Pretrigger Latency
In the pretrigger board the events are synchronized with the fCLK. The event synchronization therefore has the time resolution of fCLK = 0.25 * BX = 24 ns.
In the case of an event with 3 clusters, the latency of the complete process in the pretrigger board, including the Bremsstrahlung recovery, follows the time scale shown in fig.11.
If there are 3 events with 3 clusters each, the FIFOs in the Process Controller (CTRL) queue the data. Because of the queue, the Process Controller gives the BCN+SEL relative to the last input 4 BX after its arrival. The data are extracted sequentially and processed by the LUT. In this case the latency, including the Bremsstrahlung recovery, increases by 4 BX.
In this report the HERA-B ECAL pretrigger board has been described. It has been shown how all the operations on the board are pipelined. Finally, the latency of all the operations has been described.
2. "HERA-B FLT Message Transfer Module", Universität Mannheim, 1996
3. D.Ressing, "FCS Specifications", January 13, 1997
Page edited by Bisi