I.D'Antone, S.Meneghini, G.Mandrioli
Istituto Nazionale di Fisica Nucleare
Via Irnerio 46, 40126 Bologna, Italy
Abstract
In this report is presented a new VME hardware processor realized to perform a fast reconstruction of straight tracks. It performs a correlation with suitable masks generated with the aid of the Rademacher functions.
The reconstruction algorithm and the VLSI realization of the correlator are described.
For a fast track finding the processor works more times with different resolutions and the passage to different resolutions is controlled by an external DSP.
A VLSI ASIC has been developed; it contains 4 RAM 1024x1 and has a cascadable architecture to realize a correlator for more than 4 planes. The VME board contains 8 ASICs and globally it can handle images composed by 1024x32 pixels in a view. It is used to analyze the projections of the tracks to perform a spatial track reconstruction.
The algorithm has been verified by means of a hardware processor that
has been set up for a physics experiment at the Gran Sasso laboratory in
Italy.
ISTITUTO NAZIONALE DI FISICA NUCLEARE
Sezione di Bologna
1. Introduction
A hardware processor for track reconstruction has been set up for a physics experiment at the Gran Sasso laboratory in Italy [1] to analyse spatial images of 14x800 pixels in the front view and 30x800 in the lateral view. A VME board containing the processor, controlled by a commercial DSP, can handle images 32x1024 and has been used to elaborate both the views, for a spatial reconstruction of the tracks.
The present paper describes the algorithm developed to perform a fast reconstruction of straight tracks, the hardware architecture of a chip implementing it and finally the VME board architecture.
The track identification is carried out in two steps: the first step evaluates the slope, which is a global characteristic because the tracks are parallel, and in the second step the tracks are counted. In the present paper the algorithm and the VLSI correlator are described. The correlator functionality is controlled by an extenal DSP board that performs loading of the fired pixels, the starting of the correlator, and furthermore it outputs the slope value, the number of tracks and the number of points for each track.
To solve the problem of the image recognition it is useful to begin
studying the properties of the images. In our case we must analyse parallel
straight tracks contained in a matrix 32x1024. To count the track in the
image it is useful to know the common feature of these objects: the slope.
Then in the first phase of the algorithm we try to evaluate this slope.
The slope evaluation algorithm can be seen as the application of the pattern recognition method known as "template matching" in which the templates are the prototypes of the straight tracks, having all possible slopes in the matrix.
Due to the noise the real tracks are a little different from the prototype tracks evaluated a priori and therefore a maximum correlation is evaluated to establish matching; in other words, we search for the template similar to the original. In the spatial domain this process is essentially a correlation between the original image F and all the S_{i }templates corresponding to all the slopes. By rotating a template with vertical tracks in all possible positions, all the S_{i} are obtained. The correlation is evaluated counting the pixels in common. In the case of a large image, the memory for storing all the templates must be very large. To overcome this drawback the comparisons can be performed in a sequential mode; it is possible to store only one matrix and to pass from one template to another with suitable functions. The sequence for performing these operations is shown in fig.1. The functions obtained are the Rademacher [2] [3] functions RAD(n), which have 2^{(n-1)} periods of square wave over a normalized time base 0<=t<=1.
Adding a Rademacher function at each step, a track rotation is obtained; in an 8x8 image 8 functions are necessary where the RAD(1), RAD(2), RAD(1), RAD(3), RAD(1), RAD(2), RAD(1) functions perform a complete rotation.
In an NxN image the number of the required Rademacher functions for a complete sequence is N_{Rad}=N; therefore, one can store N_{Rad} functions in a memory and generate all the templates in a sequential mode. If the image is not squared, i.e. NxM, the Rademacher functions repeat themselves after N rotations (fig.1).
At each step the correlation between the original image and a template is evaluated by performing all the vertical sums. A sum larger than a prefixed threshold shows a high value in the correlation; the number of steps required to reach this condition gives the value of the slope.
The global image property, i.e. the slope of the objects in the image, can be obtained with less computational effort if a resolution hierarchy is employed, which requires the construction of a pyramid of images representing the pixel data in different resolutions.
Given an image at a certain resolution level r(n), it is necessary to evaluate the image at the next reduced level r(n-1).
In the resolution pyramid the image at the base is the full resolution original and the image at level l is obtained from the image (l-1) below by a process of OR-ing. The pyramid is used to guide the correlation process (hierarchical correlation). Essentially, we start with the lower resolution r_{min}. High correlation maxima at this starting level of the pyramid constrain the orientation at which correlation must be performed at higher resolution level.
Each ASIC can be used to implement a layer and more ASICs to construct the resolution pyramid. To simplify the board architecture we have used the same memory structure and we have loaded on this structure the pattern configurations corresponding to the different resolution layers.
The hardware based on the VLSI ASICs has been used to perform the slope
search of the straight tracks.
3. Multiplicity evaluation by means of the DSP
When the slope has been exactly evaluated, the DSP performs a scanning of the memory with a direction equal to the slope, i.e. it adds a variable offset to the linear set of addresses corresponding to the slope found. During the scanning the correlation with this slope is evaluated. The peaks in the correlation give the number of the tracks, i.e. the multiplicity of the event. In fig.2 is shown a flow chart of the work performed by the DSP.
The DSP 2181 is a single-chip microcomputer optimized for digital signal processing and other high speed numeric processing applications; it is delivered by the Analog Devices.
It combines a base architecture of three computational units, data address generators and program sequencer with two serial ports, a 16-bit internal DMA port, a byte DMA port, a programmable timer, extensive interrupt capabilities, and on-chip program and data memory.
The memory is configured as 16K (24-bit) of program RAM, and 16k words (16-bit) of data RAM.
The DSP supports instructions that include bit manipulations, multiplication instruction, I/O memory transfers. It operates with a 30ns instruction cycle time and every instruction can execute in a single processor cycle. It provides flexible data moves and multifunction, i.e. one or two data move with a computation.
fig.2
The DSP communicates with an external central CPU and by means of the VME bus it loads the image on the memory, gives the start to the correlator for the slope search and finally rewrite the memory (fig.2).
The DSP algorithm parameters
have been tuned with a simulation of the correlator written in a high level
program. The efficiency of the algorithm has been verified in thousands
of real events.
fig.3
4. VME board description
To perform the on-line spatial reconstruction of the tracks coming from the detector we use two boards controlled by a DSP to elaborate the front and the lateral views (fig.3). Each board can handle 32x1024 images and can be used to elaborate one view, i.e a bidimensional array.
A VME board contains the logic to perform the slope search using 8 ASICs and the logic to interface the DSP. In fig.4 is shown a block diagram of this board.
The sequential rotation previously described to implement the slope search is performed keeping the data fixed and addressing the ASIC matrix with a flexible address generator.
The NxM image is stored in a memory block within the ASICs, in which a 1xM RAM takes the pixels of a plane and N RAMs realize the complete matrix. These are addressed
fig.4
sequentially by means of the Rademacher functions. The addresses, generated with EPROMs, are a linear set of addresses, i.e. the difference of consecutive addresses is constant and equal to the slope.
The developed ASIC [4] contains 4 RAM 1024x1 and has a cascadable architecture to realize a correlator for more than 4 planes. However, to have a fast slope search we have chosen to elaborate the ASIC outputs in parallel and, furthermore, we have used two image resolutions [5]; the passage from lower to higher resolution is controlled by the external DSP. For a faster computation it is possible to use a chip for each layer.
The sequencer has been chosen external to the chip because the required Rademacher functions depend on the geometry of the detector planes.
The VME board functionality is controlled by the external DSP board
(fig.3), as described in the flow chart shown in fig.2.
5. Track finding performance
At each step we address the memory and we count the pixel fired at these addresses. The evaluation of the "original image - templates" correlation contains an error due to the spatial quantization. This error depends on the number of planes: err < (N/2)-1. Therefore, the discrimination threshold must be set to N-err. Choosing a conservative value we decide that the slope is identified when more than 4 points are found.
If the first point of the first plane is a noise the slope cannot be identified, then the search
fig.5
restarts from the second point, and so on. For these reason there is a mean number of attempts to find the slope.
The ASIC clocked at 50 MHz analyzes an image at a rate of about 7 MHz, i.e. it takes about 150msec to address the memory with a set of Rademacher functions, to read the contents of the RAM and to evaluate the number of hits. The time to evaluate the slope in an image 32x800, obviously, depends on the noise in the image, the number of the chosen resolution layers and the work performed by the external DSP.
In average, 3-4 attempts need to find a point belonging to a track and then the average time to evaluate the slope, working with a 8 MHz clock, is approximately 150-200 msec.
However, to perform the complete track finding, for an event with Np points, it is necessary to add the DSP work. It loads the matrix in Np*1.4msec, performs the correlator management in about 170msec (in this time is included the slope search) and clears the matrix in Np*1.4msec.
Furthermore the DSP performs the multiplicity evaluation that take about
300msec+Nt*3.6msec,
where Nt is the number of the tracks. The total time is depending by the
number of points and the number of tracks following the relation:
T_{tot} =0.47 + 0.0028*Np + 0.0036 *Nt msec.
A plot of this function is shown in the fig.5.
fig.6
The track finding system described in this report is an upgrade of a
previous board [5] in which the correlations have been performed on a RAM
block and the controller was a commercial mcontroller.
In fig.6 is shown the performance of the new ASIC+DSP board compared with
the previous RAM+mcontroller board.
6. Conclusion
A method for a fast reconstruction of straight tracks that is easy to be implemented in hardware has been presented. It is a correlation algorithm combined with a multiresolution technique. A VME board containing the correlator, controlled by a commercial DSP, has been realized; it is a flexible solution to analyze image coming from position detectors used in the particle physics research. The VME board contains 8 ASICs and globally it can handle images composed by 32x1024 pixels. It can be used to elaborate both the views of the tracks.
The architecture presented is however an interesting solution in all
the industrial application in which straight tracks must be on-line detected.
References
[1] P.Calligola, I.D’Antone, G.Mandrioli, S.Meneghini, "A Real Time Diagnostic System in a Complex Data Acquisition of a Particle Physics Experiment", Proceeding of the Twelfth IASTED International Conference, "Applied Informatics", May 17-20, 1994 Annecy, France.
[2] H.F.Harmuth , "Transmission of Information by Orthogonal Functions",
Springer-Verlag, 1972.
[3] H.C. Andrews, "Computer Techniques in Image Processing", Academic
Press, 1970.
[4] I.D'Antone et.al, "A VLSI hierarchical correlator for track reconstruction in real time",
Technical note CEB/NT-97/08.
[5] P.Calligola et al.,"A hardware processor for real time track reconstruction", Technical note
CEB/NT-96/01.
Page edited by Bisi
Fabio