A single-pixel machine vision framework leverages optical networks designed through deep learning to bypass the need for an image sensor array or a digital processor. The system, developed in the lab of UCLA Chancellor's Professor Aydogan Ozcan, paves the way to solving challenges that are beyond the capabilities of current imaging and machine learning technologies.
Most machine vision systems in use today pair a lens-based camera with a digital processor that performs machine learning tasks. Even with state-of-the-art hardware, these systems suffer from certain drawbacks: because of the camera's high pixel count, the video stream contains a large volume of data, much of it redundant. This can overload the processor and create inefficiencies in the power and memory needed to handle that redundant information.
UCLA researchers have created a single-pixel machine vision system that can encode objects’ spatial information in the light spectrum to optically classify input objects and reconstruct their images using a single pixel detector. Courtesy of Ozcan Lab, UCLA.
Additionally, fabricating high-pixel-count image sensors beyond the visible region presents challenges that limit the application of standard machine vision methods at longer wavelengths, such as the infrared and terahertz regions.
“Here in this work, we have circumvented these challenges because we can classify objects and reconstruct their images through a single-pixel detector using a diffractive network that encodes the spatial characteristics of objects or a scene in the spectrum of the diffracted light,” Ozcan told Photonics Media. The team deployed deep learning to design optical networks built from successive diffractive surfaces that perform statistical inference as input light passes through the specially designed, 3D-fabricated layers. Unlike lens-based cameras, these diffractive optical networks are designed to process incoming light at set wavelengths.
The goal is to extract the spatial characteristics of an input object and encode them in the spectrum of the diffracted light. A single-pixel detector is used to collect this light.
Different types of objects, or classes of data, are then assigned to different wavelengths. An all-optical classification process identifies the input objects using the output spectrum detected by the single pixel, removing the need for an image sensor array or a digital processor.
“During the training phase of this diffractive network, we assign a unique wavelength to each class of objects,” Ozcan said.
In this case, the object classes were handwritten digits; the researchers selected 10 wavelengths, evenly spaced over the available bandwidth, to represent the numbers 0 through 9.
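In this scheme, all-optical classification reduces to reading off which of the 10 class wavelengths carries the most power at the detector. The following sketch illustrates that decision rule; the wavelength grid and power values are illustrative placeholders, not figures from the paper.

```python
import numpy as np

# Illustrative: 10 evenly spaced wavelengths, one per digit class 0-9.
# The actual band and spacing used by the researchers are not reproduced here.
wavelengths = np.linspace(1.0, 1.9, 10)

def classify(spectral_power):
    """The trained diffractive network concentrates power at the wavelength
    assigned to the input's class, so classification is an argmax over the
    10 powers measured by the single-pixel detector."""
    return int(np.argmax(spectral_power))

# Example: for an input digit "3", the network should channel the most
# power to the fourth wavelength bin.
power = np.array([0.02, 0.05, 0.03, 0.61, 0.04, 0.06, 0.05, 0.05, 0.04, 0.05])
predicted_digit = classify(power)  # -> 3
```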
In addition to the all-optical classification of object types, the team connected the system to a simple, shallow electronic neural network to rapidly reconstruct images of the classified input objects based solely on the power detected at the distinct wavelengths, demonstrating task-specific image decompression.
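One way to picture such a shallow decoder is a single-hidden-layer network that maps the 10 detected spectral powers to a reconstructed image. The layer sizes, activation, and random weights below are hypothetical stand-ins for illustration only, not the trained model from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shallow decoder: one hidden layer mapping the 10 spectral
# power readings to a 28x28 reconstructed image. The weights here are
# random placeholders; in the actual system they would be learned.
W1 = rng.standard_normal((10, 64)) * 0.1
b1 = np.zeros(64)
W2 = rng.standard_normal((64, 28 * 28)) * 0.1
b2 = np.zeros(28 * 28)

def reconstruct(spectral_power):
    h = np.tanh(spectral_power @ W1 + b1)   # single shallow nonlinearity
    return (h @ W2 + b2).reshape(28, 28)    # decoded 28x28 image

image = reconstruct(rng.random(10))
```

The point of the sketch is the data flow: 10 scalar detector readings in, a full image out, with only a shallow electronic stage after the optics.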
“The diffractive features on the layers were iteratively tuned (using deep learning principles) until the physical structure of the diffractive layers converged to patterns that directed the greatest spectral power to the wavelength, among the 10 selected, matching the class/digit of the input object,” Ozcan told Photonics Media. “In other words, the diffractive network in front of the single pixel has learned to channel the largest spectral power to the single-pixel detector depending on the class (digit) of the input object.”
Researchers have created a single-pixel machine vision system that can encode objects’ spatial information in the light spectrum to optically classify input objects and reconstruct their images using a single-pixel detector in a single snapshot. Courtesy of Ozcan Lab, UCLA.
During training, an error was generated whenever the diffractive network misidentified, or could not identify, the input handwritten digit based on the peak spectral power at the single-pixel detector. This error was used to adjust the diffractive features on the transmissive layers until they yielded the correct inference of the input class from the detected spectral power.
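The error-driven loop described above can be sketched with a toy surrogate. Here a simple linear map stands in for the physical light propagation from input object to the 10 detected spectral powers, and a softmax cross-entropy error adjusts the stand-in "diffractive features"; the real training instead simulates wave propagation through the 3D layers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy surrogate: a linear map from an 8x8 input object to the 10 spectral
# powers at the detector plays the role of the optical forward model.
n_pixels, n_classes = 64, 10
theta = rng.standard_normal((n_pixels, n_classes)) * 0.01  # "diffractive features"

def forward(obj):
    return obj @ theta  # stand-in for the detected spectral powers

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

obj = rng.random(n_pixels)   # a synthetic input object
label = 3                    # its true class (digit)

lr = 0.5
for _ in range(200):
    p = softmax(forward(obj))
    err = p.copy()
    err[label] -= 1.0                    # error signal from misclassification
    theta -= lr * np.outer(obj, err)     # iteratively adjust the features

trained_pred = int(np.argmax(forward(obj)))  # -> 3 after training
```

As in the article, the update keeps reshaping the features until the largest spectral power lands at the wavelength bin matching the input's class.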
Despite using a single-pixel detector, the researchers achieved an optical classification accuracy of over 96% on the handwritten digits. A proof-of-concept experimental study with 3D-printed diffractive layers showed close agreement with numerical simulations, demonstrating the effectiveness of the single-pixel machine vision framework for building low-latency, resource-efficient machine learning systems.
Applications and usability
The framework follows research published earlier this year in which Ozcan and his colleagues used diffractive surfaces to shape terahertz pulses. The previous work, he said, demonstrated a “deterministic” task, whereas the new advance is best characterized as an implementation of “statistical inference,” since handwritten digit recognition is not a deterministic operation or function.
“In our previous work (pulse shaping), we knew exactly what the input pulse profile is and what we want it to be at the output location (in other words, the inputs were known),” Ozcan said. “In this work, we have demonstrated a single-pixel object classifier where the input objects are unknown, new handwritten digits that have never been seen by the network before. In this sense, our single-pixel diffractive classifier recognized the spatial characteristics of handwritten digits using the spectral power detected by a single pixel.”
The importance of both lines of research extends to the lack of readily available large-pixel-count image sensors in the far- and mid-infrared and terahertz bands. Building on the machine vision framework, Ozcan said, it is conceivable to envision an infrared-operating diffractive optical network that could be applied in defense and security to detect and image certain target objects, and under certain conditions.
Other potential uses include applications in biomedical imaging and metrology/interferometry.
“This work shows how the spatial features of a scene can be encoded into the spectrum, in a single shot, and demonstrates that a properly trained diffractive network can perform this encoding to achieve all-optical classification of input objects and the reconstruction of their images,” Ozcan told Photonics Media. “The basic principles of this spectrally encoded single-pixel machine vision framework can therefore be applied to construct new 3D imaging systems, finding uses, for example, in optical coherence tomography (OCT).”
The research was published in Science Advances (www.doi.org/10.1126/sciadv.abd7690).