Acquiring the spatial (x, y, z), temporal (t), and spectral (λ) information of an object is very important in natural science exploration. Multi-dimensional optical imaging, as a visualization method, can provide information covering space, time, and spectrum.1 So far, multi-dimensional optical imaging has played an irreplaceable role in exploring the unknown world and unraveling natural phenomena such as light–matter interactions,2 light scattering in tissues,3 and physical or biochemical reactions.4–6 Scanning-based multi-dimensional optical imaging must operate sequentially, so its imaging speed is restricted to hundreds of frames per second (fps) by the limited data readout speed and on-chip storage of charge-coupled devices or complementary metal-oxide-semiconductor (CMOS) sensors.7 Therefore, snapshot multi-dimensional optical imaging has aroused great interest among researchers because of its ability to capture dynamic scenes with imaging speeds of up to a billion or a trillion fps, corresponding to temporal frame intervals at the picosecond or femtosecond scales. To capture as much spatial–temporal–spectral information as possible, various multi-dimensional optical imaging techniques have been developed. For example, spectral imaging techniques, including coded aperture snapshot spectral imaging,8 adaptive optics spectral-domain optical coherence tomography,9 volume holographic spatial–spectral imaging,10 and compressive spectral time-of-flight (ToF) imaging,11 can capture spatial–spectral four-dimensional (4D) information but no temporal information. Conversely, ultrafast imaging techniques, such as compressed ultrafast photography (CUP),12–15 sequentially timed all-optical mapping photography,16 and single-shot femtosecond time-resolved optical polarimetry,17 can record spatial–temporal three-dimensional (3D) information, but both the depth (i.e., z) and spectral information are missing.
Some improved techniques have been developed to further extend the imaging dimensions of CUP, such as hyperspectrally compressed ultrafast photography (HCUP)18 and compressed ultrafast spectral photography,19 which can capture spatial–temporal–spectral (4D) information but still lack depth information. Recently, a stereo-polarimetric compressed ultrafast photography method was able to detect spatial–temporal–polarization five-dimensional (5D) information,20 but it could not detect the spectral information. Consequently, no optical imaging technique has so far been able to capture the complete spatial–temporal–spectral 5D information in a single exposure.