The Small Vision System (SVS) is an efficient software implementation of the SRI stereo algorithms that runs on standard PC hardware under either MS Windows or Linux. These algorithms are 3 to 4 times faster than similar algorithms, and have high-quality filtering to reject false stereo matches. Coupled with stereo cameras and an IEEE 1394 (Firewire) interface, the SVS is a complete, low-cost development environment for realtime stereo applications.
An alternative to PC-based computation is the Stereo-on-a-Chip (STOC) device. These devices perform realtime stereo right on the camera, freeing up the PC for application processing.
You can use the SVS with any of the stereo heads available from Videre Design. You can also use it with images already in computer memory, under both MS Windows and Linux.
Specification Summary
- General
  - Two-image stereo computation
  - Arbitrary frame sizes
  - Area correlation algorithm
  - Video-rate implementation at up to 320x240 frames on standard PCs
- Calibration
  - Fixed-baseline devices (STH-DCSG, STOC) are pre-calibrated at the factory
  - All devices can be field-calibrated by presentation of a simple planar target
  - Internal parameters: radial and tangential distortion, lens decentering, focal length, pixel aspect ratio
  - External registration: baseline, orientation of each camera
  - Based on Tsai's algorithms [Tsai 1991]
- Rectification
  - Bilinear interpolation using the calibration parameters
- Disparity computation
  - Laplacian of Gaussian image filter
  - Correlation: sum of absolute differences over a square window
  - Correlation window sides from 5 to 21 pixels
  - Disparity search from 8 to 128 pixels
  - Subpixel interpolation to 1/16 pixel
- Post-filtering
  - Low-texture confidence check
  - Uniqueness check
- 3D Reconstruction
  - Transform routines to generate 3D points from image point and disparity
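The disparity and reconstruction steps in the summary above can be illustrated with a minimal Python sketch. It is not the SVS implementation or its API: the function names, the plain nested-list image layout, and the pinhole parameters (`f`, `baseline`, `cx`, `cy`) are assumptions made for illustration. It shows only integer-disparity matching by sum of absolute differences over a square window, followed by standard triangulation for rectified images; the Laplacian of Gaussian prefilter, subpixel interpolation, and post-filters are omitted.

```python
def sad(left, right, row, col, d, half):
    """Sum of absolute differences between the square window centred at
    (row, col) in the left image and at (row, col - d) in the right image."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc] - right[row + dr][col + dc - d])
    return total


def best_disparity(left, right, row, col, window=5, search=8):
    """Return the integer disparity in [0, search) with the lowest SAD score.

    Images are lists of rows of grey values (an assumed layout). The caller
    must keep the window inside the image vertically and on the right edge.
    """
    half = window // 2
    max_d = min(search - 1, col - half)  # keep the right-image window in bounds
    scores = [sad(left, right, row, col, d, half) for d in range(max_d + 1)]
    return scores.index(min(scores))


def reproject(u, v, d, f, baseline, cx, cy):
    """Triangulate pixel (u, v) with disparity d (in pixels) for a rectified
    pair: focal length f in pixels, principal point (cx, cy), and the result
    (X, Y, Z) in the units of `baseline`."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    z = f * baseline / d
    return ((u - cx) * z / f, (v - cy) * z / f, z)
```

Depth is inversely proportional to disparity in this model, so a wider disparity search range lets the matcher handle objects closer to the cameras.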
Performance
The SVS algorithms are optimized for Pentium processors with MMX instructions. Frame rates are a function of frame size (number of pixels) times the number of disparities (search range). Here are some timings on Pentium M and IV processors.
- Correlation window: 15
- Texture filter
- Uniqueness filter
- Input: rectified grayscale images
| Processor | Speed | OS | Memory | Resolution | Disparities | FPS | FOM (Mp*d/s) |
|---|---|---|---|---|---|---|---|
| Pentium M | 1.4GHz | MSW | 500MB | 512x384 | 48 | 28 | 264 |
| Pentium M | 1.4GHz | MSW | 500MB | 640x480 | 64 | 15 | 295 |
| Pentium M | 2GHz | Linux | 1GB | 512x384 | 48 | 43 | 405 |
| Pentium M | 2GHz | Linux | 1GB | 640x480 | 64 | 22 | 432 |
| Pentium 4 | 2.5GHz | Linux | 500MB | 640x480 | 64 | 15 | 295 |
For demanding stereo applications, the recommended PC configuration is a Pentium M. Besides being power-efficient, these processors are better than the Pentium IV at executing the integer and MMX/SSE instructions that the algorithm uses heavily.
The Figure of Merit (FOM) is the best direct comparison of the efficiency of the algorithm on different systems. It gives the number of pixel-disparities processed per second.
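As a worked illustration, the FOM values in the timing table can be reproduced from the other columns. The Python sketch below uses hypothetical helper names (`figure_of_merit`, `estimated_fps`); the second function simply inverts the formula and is only a rough estimate, on the assumption that the FOM stays about the same for a different frame size or search range on the same machine.

```python
def figure_of_merit(width, height, disparities, fps):
    """Pixel-disparities processed per second, in millions (Mp*d/s)."""
    return width * height * disparities * fps / 1e6


def estimated_fps(fom, width, height, disparities):
    """Frame rate implied by a given FOM (in Mp*d/s) for another frame size
    and disparity range, assuming the FOM is unchanged."""
    return fom * 1e6 / (width * height * disparities)


if __name__ == "__main__":
    # (width, height, disparities, fps) taken from the timing table
    rows = [
        (512, 384, 48, 28),
        (640, 480, 64, 15),
        (512, 384, 48, 43),
        (640, 480, 64, 22),
    ]
    for width, height, disparities, fps in rows:
        fom = figure_of_merit(width, height, disparities, fps)
        print(f"{width}x{height}, {disparities} disparities, {fps} fps: {fom:.1f} Mp*d/s")
```

For the first table row, 512 x 384 pixels times 48 disparities times 28 frames per second gives about 264 million pixel-disparities per second, which matches the tabulated value.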
Because the stereo algorithms are storage-efficient, performance scales linearly with increasing frame sizes. The algorithms execute almost entirely from L1 cache, so that future increases in processor speed will translate directly to higher frame rates. The diagram below, which normalizes different frame sizes and disparity ranges to a common scale based on the pixel-disparity law, shows how the amount of processing needed per pixel-disparity stays relatively constant across frame sizes and disparity ranges.

System Description
The SVS is a set of algorithms implemented as a software library. There are routines for:
- Calibration of stereo heads using a simple planar target
- Capturing video streams using standard frame grabbers
- Computing dense stereo range images at video rates
- Displaying video images and range information

Host Requirements
The Stereo Engine is written in optimized MMX assembly code for Pentium-based PCs running Linux or MS Windows. The recommended hardware configuration for best performance is a Pentium III/IV processor, a PCI bus, and a display card with at least 8 MB of video memory.
If you have your own cameras, then you must use frame grabbers to digitize the stereo video stream and place it in main memory, where the Stereo Engine can process it. You must write your own code to do this; SVS provides function calls to take images from memory and process them.
Videre Design has developed several stereo head assemblies that have direct interfaces to SVS.
- STH-MDCS3 and STH-DCSG (fixed and variable baseline models)
Digital devices with resolutions up to 1280 x 960, and a full range of user controls including sub-sampling and exposure control. They connect to a host system via the digital IEEE 1394 (Firewire) bus.