I.D'Antone, S.Meneghini, G.Torromeo
Istituto Nazionale di Fisica Nucleare
Via Irnerio 46, 40126 Bologna, Italy
ABSTRACT
In this paper an ASIC performing a fast reconstruction of straight tracks is presented. In the present paper after the algorithm analysis the VLSI correlator is described. It performs a correlation with suitable masks generated with the aid of the Rademacher functions. This correlation is combined with the multiresolution technique for a fast identification of the slope. It requires the construction of a pyramid of images representing the pixel data in different resolutions. Each ASIC is used to implement a layer and more ASICs can be used to construct the resolution pyramid. The algorithm has been verified by means of a hardware processor that has been set up for a physics experiment at the Gran Sasso laboratory in Italy.
The design has been implemented using Cadence tools on a SUN workstation
and the chip size is 4.12 mm x 3.94 mm for 1.0 mm
CMOS process.
ISTITUTO NAZIONALE DI FISICA NUCLEARE
Sezione di Bologna
The present paper describes the algorithm developed to perform a fast reconstruction of straight tracks and the hardware architecture of a chip implementing it.
The track identification is carried out in two steps: the first step evaluates the slope, which is a global characteristic because the tracks are parallel, and in the second step the tracks are counted. In the present paper after the algorithm analysis the VLSI correlator is described. It performs a correlation with suitable masks and is controlled by a CPU that performs loading of the fired pixels, the starting of the correlator, and furthermore it outputs the slope value, the number of tracks and the number of points for each track.
A hardware processor for track reconstruction has been set up for a
physics experiment at the Gran Sasso laboratory in Italy in order to analyse
images of 10x800 pixels. A VME board contains the correlator controlled
by a commercial microprocessor.
To solve the problem of the image recognition it is useful to begin studying the properties of the images. In our case we must analyse parallel straight tracks in a matrix 10x800. To count the track in the image it is useful to know the common feature of these objects: the slope. Then in the first phase of the algorithm we try to evaluate this slope. This phase can be seen as the application of the pattern recognition method known as "template matching" in which the templates are the prototypes of the straight tracks, having all possible slopes in the matrix.
Due to the noise the real tracks are a little different from the prototype tracks evaluated a priori and therefore a maximum correlation is evaluated to establish matching; in other words, we search for the template similar to the original. In the spatial domain this process is essentially a correlation between the original image F and all the Si templates corresponding to all the slopes. By rotating a template with vertical tracks in all possible positions, all the Si are obtained. The correlation is evaluated counting the pixels in common. In the case of a large image, the memory for storing all the templates must be very large. To overcome this drawback the comparisons can be performed in a sequential mode; it is possible to store only one matrix and to pass from one template to another with suitable functions. The sequence for performing these operations is shown in fig.1. The functions obtained are the Rademacher functions RAD(n), which have 2^{(n-1)} periods of square wave over a normalized time base 0<=t<=1.
Adding a Rademacher function at each step a track rotation is obtained;
in an 8x8 image 8 functions are necessary where the RAD(1), RAD(2), RAD(1),
RAD(3), RAD(1), RAD(2), RAD(1) functions perform a complete rotation.
In an NxN image the number of the required Rademacher functions for a complete sequence is N_{Rad}=N; therefore, one can store N_{Rad} functions in a memory and generate all the templates in a sequential mode. If the image is not squared, i.e. NxM, the Rademacher functions repeat themselves after N rotations (fig.1).
At each step the correlation between the original image and a template
is evaluated by performing all the vertical sums. A sum larger than a prefixed
threshold shows a high value in the correlation; the number of steps required
to reach this condition gives the value of the slope.
The global image property, i.e. the slope of the objects in the image, can be obtained with less computational effort if a resolution hierarchy is employed, which requires the construction of a pyramid of images representing the pixel data in different resolutions.
Below, we use the term "hierarchy" to mean resolution hierarchy. In a hierarchical system, the global information contained at a given level in the hierarchy is derived from operations at a lower resolution level. Of course we assume that the "global" information supplied by the lower resolution level contains the same information used by the global operation on the image at higher resolution.
In real time track recognition, the objective is to reduce the computation time, and therefore, in order to evaluate the best resolution hierarchy, we have to calculate the time necessary to produce all the images at different resolutions and that required to find the slope at each resolution.
The number of layers in an image pyramid between a maximum resolution r_{max} and a minimum resolution r_{min} is: Nl=log_{p}(r_{max}/r_{min}), where p=[points]_{r(n+1)}/[points]_{r(n)} is the ratio between the number of points in two adjacent resolutions.
Given an image at a certain level r(n), it is necessary to evaluate the image at the next reduced resolution level r(n-1). The reduction of the resolution is achieved by a generating kernel g_{ik}, i,k=0,1,..,m-1. In our case the kernel g_{ik} performs the OR operation of p points at the resolution r(n) to obtain one value at the resolution r(n-1).
The elaboration time to construct all the layers increase and the slope computation decrease with the number of layers; then for each r_{max}, r_{min} and number of pixels fired an optimal number of layers in the resolution pyramid exist. Furthermore the value r_{min} must be selected by evaluating the maximum number of points fired in the image. At a resolution lower than r_{min} the global information contained in the original image is lost.
At this point the optimal pyramid as a stack of images is obtained,
where the image at the base is the full resolution original and the image
at level l is obtained from the image (l-1) below by a process of OR-ing.
The pyramid is used to guide the correlation process (hierarchical correlation).
Essentially, we start with the lower resolution r_{min}.
High correlation maxima at this level of the pyramid constrain the orientation
at which correlation must be performed at the level below, with an higher
resolution.
Each ASIC is used to implement a layer and more ASICs can be used to
construct the resolution pyramid.
4. CHIP ARCHITECTURE
The sequential rotation previously described could be performed with a shift register for each plane and with suitable clock sequences. However, with a large matrix it is more convenient to keep the data fixed and address the matrix with a flexible address generator.
The NxM image can be stored in a memory block in which a 1xM RAM takes
the pixels of a plane and N RAMs are necessary to realize the complete
matrix. These are addressed sequentially by means of the Rademacher functions.
The addresses generated are a linear set of addresses, i.e. the difference
of consecutive addresses is constant and equal to the slope.
Fig.2. shows a block diagram of this architecture. The developed ASIC performs the operations into the dotted box. It contains 4 RAM 1024x1 and has a cascadable architecture to realize a correlator for more than 4 planes.
The correlator can work more times with different resolutions and the passage to higher resolution is controlled by an external microprocessor. However, for a faster computation, it is possible to use a chip for each layer.
The sequencer has been chosen external to the chip because the required Rademacher functions depend on the geometry of the detector planes.
To implement the described algorithm with this chip, at each step we address the memory with the Rademacher functions and count the pixels fired at these addresses.
The evaluation of the "original image - templates" correlation contains an error due to the spatial quantization. This error depends on the number of planes: err < (N/2)-1. Therefore, the discrimination threshold must be set to N-err.
When the slope has been exactly evaluated, the sequencer of the correlator
performs a scan of the memory with a direction equal to the slope, i.e.
it adds a variable offset to the linear set of addresses corresponding
to the slope found. In this way the peaks in the correlation give the number
of the tracks.
The chip clocked at 50 MHz analyzes an image at a rate of about 7 Mhz, i.e. about 150msec to address the memory with a set of Rademacher functions, to read the contents of the RAM and to evaluate the number of hits. The time to evaluate the slope in an image 10x800, obviously, depends on the noise in the image, the number of the chosen resolution layers and the work performed by the external microprocessor.
The chip design has been implemented using Cadence tools on a SUN workstation
and the ES2 Standard Cell Design Kits. After the place and route operations
the die size is 4.12 mm x 3.94 mm for 1.0 mm
CMOS process. A floorplan of the chip is shown in fig.3 and it has required
a 68-pin package.
A method for a fast reconstruction of straight tracks that is easy to
be implemented in hardware has been presented. It is a correlation algorithm
combined with a multiresolution technique. The algorithm presented can
be parallelized using different processors in order to analyse different
groups of slopes. The developed ASIC is a flexible solution to analyze
image coming from position detectors used in the particle physics research.
However it is an interesting solution in all the industrial application
in which straigth tracks must be on-line detected.
REFERENCES
[1] P.Calligola, I.D’Antone, G.Mandrioli, S.Meneghini, "A Real Time
Diagnostic System in a Complex Data Acquisition of a Particle Physics Experiment",
Proceeding of the Twelfth IASTED International Conference, "Applied Informatics",
May 17-20, 1994 Annecy, France.
[2] H.F.Harmuth , "Transmission of Information by Orthogonal Functions",
Springer-Verlag, 1972.
[3] H.C. Andrews, "Computer Techniques in Image Processing", Academic
Press, 1970.
Page edited by Bisi
Fabio