Stochastically structured illumination microscopy: scan-less super-resolution imaging

Introduction
The diffraction limit, formulated almost 150 years ago by Ernst Abbe1,2, sets the maximal resolution that can be achieved with a specific lens and wavelength. This barrier can be surpassed thanks to super-resolution techniques: a set of technologies merging prior knowledge of the experiment with multiple image acquisitions (such as sample scanning or photoactivation cycles) to generate a single higher-resolution image3. Single-molecule localization-based techniques can deliver an optical resolution of up to 10 nm4,5,6, requiring, however, chemically engineered fluorochromes and activation/bleaching cycles performed with high-power lasers7. On the other hand, structured illumination microscopy (SIM) based techniques do not require nonlinear optical effects, resorting instead to the additional information obtained by delivering to the sample a set of spatially tailored illuminations8,9, and can now benefit from fast and artifact-free algorithms10,11. By merging the information obtained from the multiple (N) measurements, it is possible to expand the support space in the frequency domain, so that the super-resolved image cutoff frequency kSR, originally equal to the collection cutoff frequency kC, becomes kSR ~ kEX + kC, where kEX is the spatial cutoff frequency of the illumination pattern12. Since the achievable resolution is inversely proportional to kSR, both reducing the speckle grain size (increasing kEX) and improving the objective numerical aperture (increasing kC) can enhance imaging resolution13. SIM is limited by the strict requirement of a fully controlled illumination pattern: the presence of aberrations or deformations in the spatial structure of the illumination results in an immediate decrease in the image enhancement performance. Indeed, SIM requires an external “scanning system”, capable of cycling between multiple illumination configurations.
Several approaches have been developed to control the illumination, including multiple-beam interferometry14, microlens arrays15, digital micromirror devices16, and reflective phase masks17. It is important to note that all these approaches require micrometric alignment and need to maintain this precision while operating at high speed, as the cycling speed affects the super-resolution framerate. These illumination-cycling devices are typically expensive and light-wasting, and need maintenance and realignment by an expert operator.
The cost of this increased experimental complexity is typically compensated by the possibility of extracting information from the nano-world, such as capturing images of sub-cellular structures in the range of a few tens of nanometers. This enhanced optical capability typically requires the use of SIM together with the best-performing optical systems available, i.e., immersion objectives with the highest commercially available numerical aperture. These systems, however, operate at very close range to the sample (a few hundred microns); thus maximal super-resolution performance cannot be reached in intrinsically long-working-distance experiments such as astronomical or atmospheric imaging, extreme-condition experiments requiring special chambers (e.g., high-pressure or high-temperature chambers), or non-invasive ophthalmoscopic environments, where the optical system of the eye (the cornea plus crystalline lens) is employed as a ~22 mm focal-length lens18,19. In these scenarios, the distance between the first optical element and the sample cannot be further reduced, resulting in images with a resolution much larger than the illumination wavelength. Here, structured illumination techniques can improve resolution without the need for nonlinear high-power pulses (potentially harmful in live imaging or clinical contexts) or engineered (and potentially toxic) fluorochromes.
The eye is more than just the organ responsible for vision; it serves as a direct portal to the central nervous system and is increasingly recognized for its potential in the early diagnosis of neurodegenerative diseases20,21. Recent advancements have introduced specific fluorescent markers for Alzheimer’s-related protein aggregates22, suggesting that clinical fluorescence retinal microscopy could play a significant future role in the prevention and study of neurodegenerative conditions. Thus, the detection of these protein aggregates and of the subtle morphological changes in the vasculature associated with Alzheimer’s and other neurodegenerative disorders necessitates the development and use of super-resolved imaging techniques to enhance the resolution and clarity of retinal images23,24,25,26.
Here we present a novel super-resolution technique, stochastically structured illumination microscopy (S2IM): a scan-less version of the structured illumination technique in which we leverage the natural, stochastic movement of the sample to realize resolution enhancement. Our idea is an evolution of computational structured illumination microscopy (C-SIM), which is a “translation version” of SIM. Indeed, classical SIM, reported in the seminal paper by Gustafsson9, relies on illumination with a controlled spatial structure consisting of a periodic two-dimensional pattern of parallel lines in the sample plane (coordinates [x, y] = r). The fluorescence pattern to be reconstructed (ρ(r)) and the illumination (I(r)) superpose multiplicatively, giving rise to beat patterns known as moiré fringes. Merging the information from the moiré fringes with the known shape of the illumination pattern, it is possible to reconstruct with high fidelity the spatial distribution of the fluorophores ρ(r), which typically contains the biologically relevant information. Each line pattern increases the Fourier support along the axis of its modulation; thus, to reconstruct a fully super-resolved image, multiple patterns with different tilts have to be delivered to the sample, so as to cover all the directions/phases in Fourier space, typically requiring at least 9 independent acquisitions9,27. Blind SIM revolutionized the field of linear super-resolution by providing the possibility to employ uncontrolled laser speckle patterns28, thus relaxing the constraint on the illumination control. However, blind SIM also works under restrictive constraints, such as homogeneity of the speckle grains, appearance probability, sparsity of the grains, and fully developed speckles.
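For intuition, the moiré frequency mixing behind SIM can be made explicit for an idealized sinusoidal illumination (a standard textbook sketch, not the speckle patterns used in this work):

```latex
% Sinusoidal illumination of spatial frequency k_EX and phase \varphi
I(\mathbf{r}) \propto 1 + \cos(\mathbf{k}_{EX}\cdot\mathbf{r} + \varphi)

% The emitted moire pattern is the product \rho(\mathbf{r}) I(\mathbf{r});
% in Fourier space the object spectrum is replicated and shifted:
\widehat{\rho I}(\mathbf{k}) \propto \hat{\rho}(\mathbf{k})
  + \tfrac{1}{2} e^{ i\varphi}\, \hat{\rho}(\mathbf{k}-\mathbf{k}_{EX})
  + \tfrac{1}{2} e^{-i\varphi}\, \hat{\rho}(\mathbf{k}+\mathbf{k}_{EX})

% Object frequencies up to |k| = k_C + k_EX are thus folded into the
% collection passband |k| <= k_C, giving k_SR ~ k_EX + k_C.
```

Varying the phase φ and the pattern orientation allows the shifted copies to be separated and reassembled, which is why at least 9 acquisitions are typically needed.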
In an environment as non-ideal for microscopy as the human eye, it is difficult to meet these requirements: aberrations, back reflections, and retinal curvature play a role, yielding a resolution enhancement much smaller than two29,30. Indeed, C-SIM31,32 lies between SIM and blind SIM: it does not require strict control of the illumination structure, but it does require the capability to carefully translate the input pattern without inducing modifications and with a controlled (and exactly known) translation amplitude. On the other hand, C-SIM enables reaching the maximally achievable resolution enhancement RE = kSR/kC, which is ~2 in the case of a single optical element employed both for illumination and collection (see ref. 31 for the full theoretical background of this “translated speckles” approach).
S2IM is instead applicable to an “active” sample, i.e., an object characterized by intrinsic movement. For samples characterized by intrinsic movement, such as spinning astronomical objects or active matter, imaging should be accompanied by image registration, i.e., the process of geometrically overlaying two or more images of the same scene taken at different times or after rotations or translations. Traditional super-resolution of a moving object would be an exceptionally difficult task, because one should de-multiplex image variations due to illumination scanning from variations originating from sample movement. With S2IM we turn the terms around, employing the intrinsic object movement to realize an effective scanning, and at the same time removing the need for complex and delicate scanning optical elements. Here we demonstrate super-resolution with S2IM in one of the most relevant systems characterized by intrinsic movement: the human eye.
Results
Stochastic structured illumination microscopy
Figure 1a–f reports a scheme with the basic elements of the S2IM functioning, while panel h reports a sketch of the optical experimental setup. Laser light is turned into a propagating speckle pattern through a speckle generator module (speckle light is identified with dashed green lines in panel h) and is delivered to the eye. The intrinsic saccadic and micro-saccadic movements of the eye (panel a) produce a movement of the retina with respect to the steady illumination pattern (panel b). A separate optical line provides flat LED illumination of the fundus (yellow illumination in panel h). The pattern reflected under the LED illumination produces a reflectance image on camera Cam2. Being acquired with flat, speckle-less illumination, the reflectance images can be employed for image registration. At the same time, camera Cam1 retrieves images of the fluorescent signal from the eye fundus. Both cameras and sources are synchronized employing a DAQ board for triggering, while the exposure time is set to 2 ms to avoid motion-blur-induced image deformation33. The system thus produces two stacks of images: the reflectance image stack R(r, t), obtained with the homogeneous LED illumination, and the laser-generated fluorescence image stack F(r, t). The first set of images is employed for the registration. The precision of the registration process being essentially limited by noise34,35, by employing acquisitions counting a large number of photons (our typical reflectance imaging configuration results in an average count rate of 5000 photons per pixel, for images composed of 384 × 384 pixels), our registration process achieves sub-pixel (<0.125 pix) accuracy. In particular, we employed freely available, optimized software dedicated to the registration of rigidly translated images, based on an upsampled cross-correlation between images36.
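The registration and re-centering steps can be sketched with a few lines of numpy (an integer-pixel illustration with function names of our choosing; the actual pipeline uses the upsampled cross-correlation of ref. 36 to reach the reported sub-pixel accuracy):

```python
import numpy as np

def register_shift(ref, img):
    """Estimate the rigid translation of `img` relative to `ref` via
    phase cross-correlation.  Integer-pixel sketch: the software of
    ref. 36 upsamples the correlation peak for sub-pixel accuracy."""
    # Normalized cross-power spectrum; its inverse FFT peaks at the shift.
    F = np.conj(np.fft.fft2(ref)) * np.fft.fft2(img)
    corr = np.abs(np.fft.ifft2(F / (np.abs(F) + 1e-12)))
    peak = np.array(np.unravel_index(np.argmax(corr), corr.shape), dtype=float)
    # Displacements beyond half the frame wrap around to negative values.
    shape = np.array(corr.shape)
    peak[peak > shape // 2] -= shape[peak > shape // 2]
    return peak  # (dy, dx)

def unshift(img, delta):
    """Translate `img` by -delta (integer pixels) to re-register it."""
    dy, dx = int(round(delta[0])), int(round(delta[1]))
    return np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
```

In the S2IM pipeline, the shifts Δ(t) are estimated on the reflectance stack R(r, t) and then applied with opposite sign to the fluorescence stack F(r, t).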
After the image displacements Δ(t) are retrieved, each image F(r, t) (panel c) is translated by −Δ(t), thus retrieving images TF(r, t) = T(F(r), −Δ(t)) (panel d) resulting from a steady superimposed fluorescent pattern ρ(r) and translated illuminations I(r, t) = T(I(r), −Δ(t)), with T indicating the translation operator. We thus obtain a dataset realized from a steady fluorescent object with translated illuminations

TF(r, t) = ℋ ⊗ [ρ(r) I(r, t)],    (1)
where the ⊗ operator indicates convolution and ℋ is the point spread function of the collection optics. Before applying the super-resolution algorithm, a further post-processing step is required. Indeed, for a gradient descent algorithm it is important that the differences from frame to frame are minimal; thus only displacements smaller than the point spread function size are acceptable. On the other hand (as will be described below), the eye movements are uncontrolled and larger displacements are possible. To avoid these unwanted “big jumps”, we rearrange the dataset temporally so that each frame shows a small distance with respect to the previous one (red-framed panel between c and d). After this frame rearrangement, data are passed to the S2IM gradient descent algorithm to generate the super-resolved frame (panel f) from the stack of low-resolution frames (an element of which is reported in panel e). First we tested S2IM in a numerical experiment, then in lab operative conditions, simulating the ocular environment experimentally with the motorized version of the biological model eye (BIME)37, the m-BIME. This system is, to our knowledge, the best experimental reproduction of the optical properties of the human eye which can be employed without involving patients. The m-BIME (i) has the same optical properties as the human eye, including a non-mydriatic numerical aperture; (ii) is a watertight device which, being water-filled, also perfectly emulates the vitreous and aqueous humors; (iii) provides the possibility to host interchangeable biological samples, such as fixed and stained induced pluripotent stem cells (iPSC); (iv) has its positioning (tilt) controlled by two DC motor actuators with the capability to mimic natural human saccadic movement. This last feature requires further explanation.
Human patients, even during “fixation” experiments (in which they are asked to stare continuously at an artificial target), are subject to uncontrolled, natural ocular movements, named saccades, which are responsible for range finding and visual acuity enhancement38. Thus every patient or volunteer can have their fixation characterized by a shift probability cloud, which contains the occupation probability of every position in the vision plane. To each position corresponds a relative retinal displacement. Indeed, we measured the eye displacement with millisecond resolution (see “Methods”) on volunteers, to extract the relative retinal displacement probability cloud reported in Fig. 1 panel g. The m-BIME includes software which, given a measured displacement probability cloud, replicates the eye movement by controlling the tilt motors. Thus we can employ the m-BIME not only to simulate the ocular environment but also to emulate eye movement during fixation. We employed this feature of the m-BIME to realize S2IM.
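The way the m-BIME controller could draw tilt targets from a measured probability cloud can be sketched as sampling from a 2D histogram (a hypothetical interface of our own naming; bin centers are assumed to be in microns of retinal displacement):

```python
import numpy as np

def sample_displacements(prob_cloud, x_centers, y_centers, n, rng=None):
    """Draw n retinal-displacement targets from a measured
    occupation-probability cloud stored as a 2D histogram
    (rows index y, columns index x)."""
    if rng is None:
        rng = np.random.default_rng()
    cloud = np.asarray(prob_cloud, dtype=float)
    p = cloud.ravel() / cloud.sum()          # normalize to probabilities
    idx = rng.choice(p.size, size=n, p=p)    # sample flat bin indices
    iy, ix = np.unravel_index(idx, cloud.shape)
    return np.column_stack([np.asarray(x_centers)[ix],
                            np.asarray(y_centers)[iy]])
```

A sequence of such samples, sent to the two tilt motors, reproduces the statistics of the measured fixational movement.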

a–d Visual representation of the various steps of the S2IM data acquisition and analysis. e A typical single frame and f the super-resolved image obtained with our super-resolution approach. g The probability cloud obtained from the eye measurements and replicated thanks to the m-BIME (x- and y-axes span 100 μm). h A simplified scheme of the experiment.
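The temporal frame rearrangement described above, which removes the “big jumps” before the gradient descent, can be sketched as a greedy nearest-neighbour reordering of the retrieved displacements (a minimal illustration with names of our choosing; the exact ordering strategy used in the paper may differ):

```python
import numpy as np

def rearrange_frames(displacements):
    """Greedily reorder frame indices so that successive frames
    correspond to small relative displacements."""
    d = np.asarray(displacements, dtype=float)
    remaining = set(range(len(d)))
    order = [0]                      # start from the first acquired frame
    remaining.remove(0)
    while remaining:
        last = d[order[-1]]
        # pick the unvisited frame whose displacement is closest
        nxt = min(remaining, key=lambda i: np.linalg.norm(d[i] - last))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

The fluorescence stack is then processed in this order, so that each frame differs from the previous one by less than the point spread function size.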
A numerical experiment
We first demonstrate S2IM in a numerical experiment, which provides free access to and control of all experimental parameters and enables us to characterize the performance and compare it to C-SIM. We tested the two algorithms on the Siemens star, a sample traditionally employed to assess the resolution and performance of imaging techniques (a low-resolution image of the Siemens star is reported in Fig. 2a). In the numerical experiment we generated two different datasets, one for C-SIM and one for S2IM. In C-SIM the illuminations are translated along a fully deterministic and controlled pattern corresponding to a square lattice (see inset of Fig. 2b), while in S2IM the illuminations are translated by a stochastic amount (see inset of Fig. 2c) (see “Methods” for simulation details). Panel 2d reports the intensity profiles along a circumference at a fixed radius for C-SIM (red curve) and S2IM (orange curve), demonstrating their intrinsically similar performance. Moreover, given the symmetry of the Siemens star, it is particularly easy to determine the resolution enhancement (see ref. 29 and Supplementary Information). In panel 2e, we report the RE versus the number of acquired images, demonstrating how the technique starts saturating toward the value of 2 when more than 200 frames are acquired. Panel 2f instead reports on the robustness of S2IM with respect to noise. In the numerical experiment, synthetic noise (Poissonian shot noise and readout noise) is added to the data in order to assess the RE. In the Supplementary Materials, we also report on the robustness of the technique with respect to errors in the position retrieval, demonstrating that our registration error (<0.125 pix) is much smaller than the value at which S2IM performance starts to be affected (1 pix).
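The forward model of such a numerical experiment (steady object, translated speckle illumination, band-limited collection optics, shot noise) can be sketched as follows; the ideal circular low-pass filter, the cutoff value, and the photon budget are illustrative simplifications of our own choosing:

```python
import numpy as np

def low_pass(img, cutoff_frac):
    """Ideal circular low-pass filter on a square image, a crude
    stand-in for the collection optics OTF (cutoff as a fraction of
    the Nyquist frequency)."""
    k = np.fft.fftfreq(img.shape[0])
    kx, ky = np.meshgrid(k, k)
    mask = np.hypot(kx, ky) <= cutoff_frac * 0.5
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))

def make_dataset(obj, speckle, shifts, photons=1000.0, rng=None):
    """Simulate N frames: the steady object multiplied by an
    integer-translated speckle pattern, low-pass filtered and
    corrupted by Poissonian shot noise."""
    if rng is None:
        rng = np.random.default_rng()
    frames = []
    for dy, dx in shifts:
        illum = np.roll(np.roll(speckle, dy, axis=0), dx, axis=1)
        clean = np.clip(low_pass(obj * illum, 0.25), 0.0, None)
        frames.append(rng.poisson(clean / clean.max() * photons))
    return np.array(frames)
```

Feeding a deterministic square lattice of shifts reproduces the C-SIM dataset, while random shifts drawn from a measured probability cloud reproduce the S2IM one.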

a–c A Siemens star retrieved, respectively, with diffraction-limited optics, C-SIM, and S2IM. Insets report the qualitative arrangement of the translation positions. Scale bar in a is 10 times the diffraction-limited optical resolution. Inset in c reports lines connecting points in temporal order before (blue) and after (red) the temporal rearrangement which minimizes the distance between successive positions. d Intensity versus angle (intensity along the circular profile depicted in the inset) for the low-resolution, C-SIM, and S2IM images. e Resolution enhancement (Res. Enhanc.) versus the number of illumination translations (Num. of Translations), while panel f reports the same observable versus the signal-to-noise ratio (SNR).
Results on a test target
We tested our technique on a test target made of 15 μm diameter, shell-stained beads (Invitrogen FocalCheck F7235) randomly dispersed on a 12 mm circular cover slip mounted on the m-BIME. In clinical experiments, the camera exposure cannot exceed 4 ms in order to avoid eye-movement-induced motion blur33. In a conservative framework, we set the exposure time of all the experiments (both test target and retinal cultures) to the even more restrictive value of 2 ms.
These shell-stained beads produce a donut-like signal, which appears as full disks due to the limited resolution of the ocular environment (the system resolution is set by the eye model to 6.5 μm39).
Figure 3 reports two fields. In panels a–c, we report, respectively, the super-resolved, the realigned-and-averaged (smart averaging), and the single-shot images for the first field. Figure 3a, b result from 2000 acquisitions, each with a 2 ms exposure time. In Fig. 3c, intensity fluctuations due to the speckled illumination are distinguishable. Figure 3d, e contain a comparison between S2IM and smart-averaged images for two zoomed-in areas together with the intensity profiles along the highlighted lines. Panels h–l and f are organized in the same way as the previous ones but represent a different field. Figure 3g reports the retrieved resolution estimates for the smart-averaged and super-resolved images.

Two fields containing 15 μm diameter fluorescent donuts. a–c S2IM, smart-averaged (realigned and then averaged), and single-shot images of the same field. d–f Zoom-in of the areas indicated by the blue rectangles. Each of these panels is organized in three sub-panels reporting S2IM, smart averaging, and intensity profiles along the blue–red lines. g Resolution extracted from the images. Scale bars are 40 μm. Panels h, i, l report S2IM, smart-averaged (realigned and then averaged), and single-shot images for a different field.
The resolution in the two cases is estimated as the smallest distinguishable distance between donut boundaries (Rayleigh criterion, i.e., peaks separated by a dip whose intensity is at most 73.5% of that of the peaks40). With this approach we retrieve a resolution of 6.5 ± 0.2 μm for the wide-field images and of 3.4 ± 0.1 μm for the S2IM super-resolved images. Averages and errors are obtained from estimates on 10 different imaging fields. The resolution enhancement is thus RE = 1.9 ± 0.1, in agreement with the estimate from the numerical simulations on the Siemens star.
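The dip-between-peaks test can be sketched as follows (function name and interface are ours; the threshold is the Rayleigh-like value quoted above):

```python
import numpy as np

def rayleigh_resolved(profile, p1, p2, dip_threshold=0.735):
    """Check whether two peaks at indices p1, p2 of a 1D intensity
    profile are resolved: the dip between them must fall below
    `dip_threshold` times the weaker peak."""
    profile = np.asarray(profile, dtype=float)
    lo, hi = sorted((int(p1), int(p2)))
    dip = profile[lo:hi + 1].min()
    return bool(dip <= dip_threshold * min(profile[lo], profile[hi]))
```

Applying this test to intensity profiles drawn across neighbouring donut boundaries, at decreasing separations, yields the smallest resolved distance.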
Results on a retinal neurons culture
We tested S2IM on human iPSC-derived retinal neurons (see “Methods”) stained with Phalloidin-Atto 532. Note that the signal from this biological sample is 3 to 10 times less intense than the one produced by the synthetic fluorescent donuts of our test target. This, added to the reduced exposure time needed to avoid motion blur in the eye (2 ms33, 50–100 times shorter than the typical exposure time employed in fluorescence microscopy experiments) and to the reduced luminosity of the collection optics (numerical aperture of ~0.04), makes this imaging task extremely challenging, thus further supporting the importance of the presented results. Figure 4 reports two different fields of view comparing S2IM, smart-averaged, and single-shot images, highlighting the level of detail that can be obtained thanks to our technique. A larger set of images is reported in the Supplementary Materials.

Images of an actin-stained retinal neuron culture loaded on the m-BIME. a–c Report (from left to right) S2IM, smart-averaged, and single-shot images of the first field. d–g Zoom-in of the areas highlighted by the blue squares. Each of these panels is organized in three sub-panels reporting S2IM, smart averaging, and intensity profiles along the blue–red lines. Exposure time is 2 ms, and N = 2000 single-shot images have been acquired. Scale bars are all 40 μm except for panel f, which is 20 μm. Panels h–l are organized as panels a–c, showing another field of view.
Discussion
S2IM not only provides a platform free from sophisticated and costly scanning hardware, but also provides an avenue to super-resolve moving objects, which are typically inaccessible because of the intrinsic difficulty of de-multiplexing the illumination scanning from the uncontrolled movement. This scan-less technique can reach the same theoretical resolution enhancement as standard SIM experiments even if the sample is characterized by uncontrolled, unpredictable movements. We demonstrated S2IM in the clinically relevant environment of the human eye, but it can potentially be exported to other important fields such as long-distance imaging (atmospheric, aeronautic, or astronomical) or to other biological systems characterized by unpredictable intrinsic motion, such as active matter systems.
Methods
Measurement of the retinal saccadic movements
Volunteers took part in a fixation task while their binocular eye movements were recorded using a high-resolution (spatial accuracy: 0.25–0.50°; resolution: 0.01°) infrared eye tracker (sampling rate: up to 2000 Hz; EyeLink® 1000 Plus, SR Research). Head movements were restrained. Built-in calibration and validation procedures were employed to ensure acceptable spatial accuracy (maximum error for each point < 1°; average error < 0.5°).
Each trial started with a central black fixation cross (0.53° × 0.28°) at which participants had to look for at least 300 ms to initiate the task. Afterwards, a black fixation dot (0.27° × 0.27°) appeared at the center of the screen for a variable time. Participants were instructed to look at the dot until its disappearance, maintaining fixation as steadily as possible. Finally, a blank screen was presented for 100 ms. The order of trials (n = 15) was randomized. The methodology followed the latest reporting guidelines in eye-tracking research41.
Human iPSC differentiation into retinal neurons
Retinal cultures were differentiated from a healthy human induced pluripotent stem cell (iPSC) line according to a previously published protocol with minor modifications42. Human iPSCs were dissociated to single cells with Accutase (Gibco) and seeded into Matrigel-coated dishes (Corning, dilution 1:100) at a density of 100,000 cells per cm2. On the day of seeding, the cells were maintained in mTeSR Plus with 10 μM Rock Inhibitor (RI; Peprotech, Cranbury, New Jersey, United States). One day after seeding, the medium was switched to neurogenic basal medium (N2B27w/oA) composed of 50% DMEM/F12 (Sigma), 50% Neurobasal (ThermoFisher), 1% GlutaMAX Supplement (Gibco), 0.1% Pen–Strep (Sigma), 1% NEAA (Gibco), 1% N2 Supplement (ThermoFisher), and 2% B27 Supplement w/oA (Gibco). Different mixes of small molecules were added to the medium at specific intervals to induce retinal progenitor induction (DIV 20, days in vitro) and differentiation into retinal neurons up to 30 days: 1 μM Dorsomorphin (Sigma) and 2.5 μM IDE2 (Sigma) from DIV 0 to DIV 6; 10 mM Nicotinamide (Sigma) until DIV 10; 25 μM Forskolin (Sigma) from DIV 0 to DIV 30; 10 μM DAPT (Peprotech) from DIV 20 to DIV 30. The retinal progenitor cells were plated onto PLO/Laminin (Sigma-Aldrich) coated round cover glasses (⌀ 12 mm, Thorlabs, Newton, New Jersey, US) at a density of 150,000 cells per glass for subsequent analysis.
Immunostaining
iPSC-derived retinal neurons were fixed at DIV 30 of differentiation with 4% paraformaldehyde (PFA, Sigma Aldrich) for 15 min at room temperature. Fixed retinal cells were permeabilized using a PBS solution with 0.2% Triton X-100 (Sigma Aldrich) for 10 min and incubated in a blocking solution containing PBS, 0.1% Tween-20 (Sigma-Aldrich), and 5% goat serum (Merck KGaA, Darmstadt, Germany) for 45 min. Subsequently, retinal neurons were incubated with Phalloidin-Atto 532 (1:30, 49429, Sigma Aldrich) for 1 h at room temperature. Hoechst was used to stain nuclei (1:300, 33258, Merck). The retinal cells were sealed between two cover glasses using DAKO (Fluorescent Mounting Medium, Sigma-Aldrich) and mounted on the m-BIME.
Super-resolution algorithm
After the steady-fluorescent-object dataset is obtained as described in the main text (Eq. (1)), the problem is reduced to the following. A fluorophore density ρ(r) is illuminated by an unknown intensity profile IΛ(r) in a set of Λ = 1, 2, 3…N consecutive measurements. Each measurement differs from the previous one in that the illumination pattern on the sample plane r is translated by a distance rΛ with respect to the first measurement, so that IΛ(r) = I(r + rΛ), where I(r) is the illumination pattern of the first measurement. The image is formed on the camera image plane y and captured by a detector D(y), so that for each measurement Λ

DΛ(y) = ∫ h(x, y) ρ(x) IΛ(x) dx,
where h(x, y) is the point spread function of the optical setup.
The task of the super-resolution algorithm is to find approximations ρ̃(x) and ĨΛ(x) of the density and of the illuminations by minimizing a loss function comparing the measured images DΛ with the model predictions, in an iterative process assuming that the low-resolution image that the algorithm produces is

D̃Λ(y) = ∫ h(x, y) ρ̃(x) ĨΛ(x) dx.
The fluorophore density guess ρ̃(x) and the intensity guesses ĨΛ(x) are updated iteratively for a number of steps s:

ρ̃ ← ρ̃ − α ∂L/∂ρ̃

and

ĨΛ ← ĨΛ − α ∂L/∂ĨΛ,

where α is the gradient descent step, which we take to be α = 0.01, and the gradients follow from the chain rule applied to the imaging model:

∂L/∂ρ̃ = ΣΛ ĨΛ [ℋ ⊗ log(D̃Λ/DΛ)],  ∂L/∂ĨΛ = ρ̃ [ℋ ⊗ log(D̃Λ/DΛ)].
For the low-photon-count experimental data, we make use of the Kullback–Leibler divergence43

L = ΣΛ Σy [D̃Λ(y) log(D̃Λ(y)/DΛ(y)) − D̃Λ(y) + DΛ(y)],

whose gradient with respect to the prediction D̃Λ is log(D̃Λ/DΛ).
A simplified pseudocode for the algorithm is the following:

Algorithm
for s = 1:S do
    for Λ = 1:N do
        D̃Λ = ℋ ⊗ (ρ̃ ĨΛ)
        ρ̃ ← ρ̃ − α ĨΛ [ℋ ⊗ log(D̃Λ/DΛ)]
        ĨΛ ← ĨΛ − α ρ̃ [ℋ ⊗ log(D̃Λ/DΛ)]
    end for
end for
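A minimal numpy transcription of this pseudocode could look as follows; this is a sketch on the camera grid, assuming a symmetric PSF (so that the convolution is its own adjoint) and omitting the upsampling and regularization a production implementation would need:

```python
import numpy as np

def conv2(a, psf_ft):
    """Circular convolution via FFT (psf_ft = FFT of the centered PSF)."""
    return np.real(np.fft.ifft2(np.fft.fft2(a) * psf_ft))

def s2im_reconstruct(frames, psf, alpha=0.01, n_iter=25, eps=1e-9):
    """Joint gradient descent on the density rho and one illumination
    per frame under the KL loss, mirroring the pseudocode above."""
    psf_ft = np.fft.fft2(np.fft.ifftshift(psf))
    rho = np.full(frames.shape[1:], float(frames.mean()))  # flat initial guess
    illum = np.ones_like(frames, dtype=float)
    for _ in range(n_iter):
        for lam in range(len(frames)):
            d_hat = conv2(rho * illum[lam], psf_ft)          # D~_Lambda
            grad = conv2(np.log((d_hat + eps) / (frames[lam] + eps)), psf_ft)
            rho = np.clip(rho - alpha * illum[lam] * grad, 0.0, None)
            illum[lam] = np.clip(illum[lam] - alpha * rho * grad, 0.0, None)
    return rho, illum
```

The non-negativity clipping is a common stabilization choice for intensity-valued unknowns; larger stacks and step counts follow the parameters reported below.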
A user-friendly version of the algorithm can be found at https://github.com/emmxyp/super-resolution-algorithm
For the numerical simulations we used the experimentally relevant parameters: numerical aperture NA = 0.04, wavelength λ = 0.605 μm, magnification m = 4.6875, camera pixel size px = 6.5 μm, and field of view FOV = 531.1061 μm. For both C-SIM and S2IM we used 625 translations. The translation step for C-SIM was 2.7734 μm.
The processing time for a 384 × 384 × 2000 acquisition stack is about 6 min (25 iterations) on an AMD Ryzen 9 7900X3D CPU and an Nvidia GTX 1080 GPU. The code has been developed in Matlab in a non-parallel format. Rough estimates indicate that the processing time can be reduced by a factor of 10–50 with parallelization and optimization. Quasi-live imaging (1–5 s processing time) can eventually be realized with an optimized code for limited-size stacks (384 × 384 × 200).