
Articles

https://doi.org/10.1038/s41592-018-0216-7

Content-aware image restoration: pushing the limits of fluorescence microscopy

Martin Weigert1,2*, Uwe Schmidt1,2, Tobias Boothe1,2, Andreas Müller3,4,5, Alexandr Dibrov1,2, Akanksha Jain2, Benjamin Wilhelm1,6, Deborah Schmidt1, Coleman Broaddus1,2, Siân Culley7,8, Mauricio Rocha-Martins1,2, Fabián Segovia-Miranda2, Caren Norden2, Ricardo Henriques7,8, Marino Zerial2, Michele Solimena2,3,4,5, Jochen Rink2, Pavel Tomancak2, Loic Royer1,2,9*, Florian Jug1,2* and Eugene W. Myers1,2,10

Fluorescence microscopy is a key driver of discoveries in the life sciences, with observable phenomena being limited by the optics of the microscope, the chemistry of the fluorophores, and the maximum photon exposure tolerated by the sample. These limits necessitate trade-offs between imaging speed, spatial resolution, light exposure, and imaging depth. In this work we show how content-aware image restoration based on deep learning extends the range of biological phenomena observable by microscopy. We demonstrate on eight concrete examples how microscopy images can be restored even if 60-fold fewer photons are used during acquisition, how near isotropic resolution can be achieved with up to tenfold under-sampling along the axial direction, and how tubular and granular structures smaller than the diffraction limit can be resolved at 20-times-higher frame rates compared to state-of-the-art methods. All developed image restoration methods are freely available as open source software in Python, FIJI, and KNIME.

Fluorescence microscopy is an indispensable tool in the life sciences for investigating the spatio-temporal dynamics of cells, tissues, and developing organisms. Recent advances such as light-sheet microscopy1–3, structured illumination microscopy4,5, and super-resolution microscopy6–8 enable time-resolved volumetric imaging of biological processes within cells at high resolution. The quality at which these processes can be faithfully recorded, however, is determined not only by the spatial resolution of the optical device used, but also by the desired temporal resolution, the total duration of an experiment, the required imaging depth, the achievable fluorophore density, bleaching, and photo-toxicity9,10. These aspects cannot all be optimized at the same time—trade-offs must be made, for example, by sacrificing signal-to-noise ratio (SNR) by reducing exposure time to gain imaging speed. Such trade-offs are often depicted by a design space that has resolution, speed, light exposure, and imaging depth as its dimensions (Fig. 1a), with the volume being limited by the maximal photon budget compatible with sample health11,12.

These trade-offs can be addressed through optimization of the microscopy hardware, yet there are physical limits that cannot easily be overcome. Therefore, computational procedures to improve the quality of acquired microscopy images are becoming increasingly important. Super-resolution microscopy4,13–16, deconvolution17–19, surface projection algorithms20,21, and denoising methods22–24 are examples of sophisticated image restoration algorithms that can push the limit of the design space, and thus allow the recovery of important biological information that would be inaccessible by imaging alone. However, most common image restoration problems have multiple possible solutions, and require additional assumptions to select one solution as the final restoration. These assumptions are typically general, for example, requiring a certain level of smoothness of the restored image, and therefore are not dependent on the specific content of the images to be restored. Intuitively, a method that leverages available knowledge about the data at hand ought to yield superior restoration results.

Deep learning is such a method, because it can learn to perform complex tasks on specific data by employing multilayered artificial neural networks trained on a large body of adequately annotated example data25,26. In biology, deep learning methods have, for instance, been applied to the automatic extraction of connectomes from large electron microscopy data27, for classification of image-based high-content screens28, fluorescence signal prediction from label-free images29,30, resolution enhancement in histopathology31, or for single-molecule localization in super-resolution microscopy32,33. However, the direct application of deep learning methods to image restoration tasks in fluorescence microscopy is complicated by the absence of adequate training data and the fact that it is impossible to generate them manually.

We present a solution to the problem of missing training data for deep learning in fluorescence microscopy by developing strategies to generate such data. This enables us to apply common convolutional neural network architectures (U-Nets34) to image restoration tasks, such as image denoising, surface projection, recovery of isotropic resolution, and the restoration of sub-diffraction structures. We show, in a variety of imaging scenarios, that trained

1Center for Systems Biology Dresden, Dresden, Germany. 2Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany. 3Molecular Diabetology, University Hospital and Faculty of Medicine Carl Gustav Carus, TU Dresden, Dresden, Germany. 4Paul Langerhans Institute Dresden of the Helmholtz Center Munich at the University Hospital Carl Gustav Carus and Faculty of Medicine of the TU Dresden, Dresden, Germany. 5German Center for Diabetes Research (DZD e.V.), Neuherberg, Germany. 6University of Konstanz, Konstanz, Germany. 7MRC Laboratory for Molecular Cell Biology, University College London, London, UK. 8The Francis Crick Institute, London, UK. 9CZ Biohub, San Francisco, CA, USA. 10Department of Computer Science, Technical University Dresden, Dresden, Germany. *e-mail: [email protected]; [email protected]; [email protected]


Fig. 1 | CARE. a, Trade-offs between imaging speed, spatial resolution, and light exposure need to be made owing to the constraints of the maximal photon budget a sample permits. Image restoration enlarges this design space. b, Overview of the proposed pipeline for image denoising. Pairs (xi, yi) of registered high- and low-SNR volumes are acquired at the microscope. A convolutional neural network is trained to restore yi from xi. The trained CARE network can then be applied to previously unseen low-SNR images x̃, yielding ỹ. c, Input data and restorations for nucleus-stained (RedDot1) flatworm (S. mediterranea). Shown are a single image plane of a raw input stack (top row), the output of NLM denoising22 (second row), the network prediction (third row), and the high-SNR gold standard/ground truth (bottom row). d, Prediction error for data from c at three imaging conditions C1–C3. Box-dot plots (n = 20 per condition) show NRMSE and SSIM (higher is better) for the input, for the denoising baseline (NLM), and for network restorations. Boxes show interquartile range (IQR), lines signify medians, and whiskers extend to 1.5 times the IQR. e, Input data and restorations for a nucleus-labeled (EFA::nGFP) red flour beetle (Tribolium castaneum) embryo. Figure structure as in c.

content-aware image restoration (CARE) networks produce results that were previously unobtainable. This means that the application of CARE to biological images transcends the limitations of the design space (Fig. 1a), pushing the limits of the possible in fluorescence microscopy through machine-learned image computation.

Results
Images with a low SNR are difficult to analyze in fluorescence microscopy. One way to improve SNR is to increase laser power or exposure times, which is usually detrimental to the sample, limiting the possible duration of the recording and introducing artifacts due to photodamage. An alternative solution is to image at low SNR, and later to computationally restore acquired images. Classical approaches, such as non-local-means denoising22, can in principle achieve this, but without leveraging available knowledge about the data at hand.

Image restoration with physically acquired training data. To demonstrate the power of machine learning in biology, we developed CARE. We first demonstrate the utility of CARE on microscopy acquisitions of the flatworm Schmidtea mediterranea, a model organism for studying tissue regeneration. This organism is exceptionally sensitive to even moderate amounts of laser light35, exhibiting muscle flinching at desirable illumination levels even when anesthetized (Supplementary Video 1). Using a laser power that reduces flinching to an acceptable level results in images with such low SNR that they are impossible to interpret directly. Consequently, live imaging of S. mediterranea has thus far been intractable.

To address this problem with CARE, we imaged fixed worm samples at several laser intensities. We acquired well-registered pairs of images, a low-SNR image at laser power compatible with live


imaging, and a high-SNR image, serving as a ground truth (Fig. 1b). We then trained a convolutional neural network and applied the trained network to previously unseen live-imaging data of S. mediterranea (Supplementary Notes 1 and 2). We used networks of moderate size (∼10⁶ parameters) based on the U-Net architecture34,36, together with a per-pixel similarity loss, for example absolute error (Supplementary Fig. 1, Supplementary Note 2, and Supplementary Table 3). We consistently obtained high-quality restorations, even if the SNR of the images was very low, for example, being acquired with a 60-fold reduced light dosage (Fig. 1c,d, Supplementary Video 2, and Supplementary Figs. 2–4). To quantify this observation, we measured the restoration error between prediction and ground-truth images for three different exposure and laser-power conditions. Both the normalized root-mean-square error (NRMSE) and the structural similarity index (SSIM; a measurement of the perceived similarity between two images37) improved considerably compared with results obtained by several potent classical denoising methods (Fig. 1d, Supplementary Figs. 3 and 5, and Supplementary Table 1). We further observed that even a small number of training images (for example, 200 patches of size 64 × 64 × 16) led to an acceptable image restoration quality (Supplementary Fig. 6). Moreover, while training a CARE network can take several hours, the restoration time for a volume of size 1,024 × 1,024 × 100 was less than 20 s on a single graphics processing unit. (We used a common consumer graphics processing unit (Nvidia GeForce GTX 1080 or Titan X) for all presented experiments.) In this case, CARE networks are able to take input data that are unusable for biological investigations and turn them into high-quality time-lapse data, providing a practical framework for live-cell imaging of S. mediterranea.
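This workflow is implemented in the authors' open-source CSBDeep framework (see Methods and http://csbdeep.bioimagecomputing.com/doc/). The following minimal Python sketch outlines training and prediction for such a denoising network; the folder layout, patch count, and epoch number are illustrative assumptions, and argument names should be checked against the installed csbdeep version.

from tifffile import imread
from csbdeep.data import RawData, create_patches
from csbdeep.io import load_training_data
from csbdeep.models import Config, CARE

# Registered pairs: low-SNR inputs in 'data/low', high-SNR counterparts in 'data/GT'.
raw_data = RawData.from_folder(basepath='data', source_dirs=['low'],
                               target_dir='GT', axes='ZYX')
create_patches(raw_data, patch_size=(16, 64, 64), n_patches_per_image=64,
               save_file='data/train.npz')
(X, Y), (X_val, Y_val), axes = load_training_data('data/train.npz',
                                                  validation_split=0.1)

# Residual U-Net trained with a per-pixel mean absolute error ('mae') loss.
config = Config(axes, n_channel_in=1, n_channel_out=1,
                train_loss='mae', train_epochs=100)
model = CARE(config, 'planaria_denoising', basedir='models')
model.train(X, Y, validation_data=(X_val, Y_val))

# Restore a previously unseen low-SNR stack.
restored = model.predict(imread('data/test/stack.tif'), axes='ZYX')

Only the data folders change between the denoising applications described in this paper; the architecture and loss stay the same.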
We next asked whether CARE improves common downstream analysis tasks in live-cell imaging, such as nuclei segmentation. We used confocal microscopy recordings of developing Tribolium castaneum (red flour beetle) embryos, and as before trained a network on image pairs of samples acquired at high and low laser powers (Fig. 1e). The resulting CARE network performed well even on extremely noisy, previously unseen live-imaging data (Supplementary Note 4, Supplementary Video 3, and Supplementary Fig. 7). To test the benefits of CARE for segmentation, we applied a simple nuclei segmentation pipeline to raw and restored image stacks of T. castaneum. The results show that, compared to manual expert segmentation, the segmentation accuracy (as measured with the standard SEG score38) improved from SEG = 0.47 on the classically denoised raw stacks to SEG = 0.65 on the CARE restored volumes (Supplementary Fig. 8). Since this segmentation performance is achieved at substantially reduced laser power, the gained photon budget can now be spent on the imaging speed and light-exposure dimensions of the design space. This means that Tribolium embryos, when restored with CARE, can be imaged for longer and at higher frame rates, thus enabling improved tracking of cell lineages.

Encouraged by the performance of CARE on two independent denoising tasks, we asked whether such networks can also solve more complex, composite tasks. In biology it is often useful to image a three-dimensional (3D) volume and project it to a two-dimensional (2D) surface for analysis, such as when studying cell behavior in developing epithelia of the fruit fly Drosophila melanogaster39,40. Also, in this context, it is beneficial to optimize the trade-off between laser power and imaging speed, usually resulting in rather low-SNR images. Thus, this restoration problem is composed of projection and denoising, presenting the opportunity to test whether CARE networks can deal with such composite tasks. For training, we again acquired pairs of low- and high-SNR 3D image stacks, and further generated 2D projection images from the high-SNR stacks20 that serve as ground truth (Fig. 2a). We developed a task-specific network architecture that consists of two jointly trained parts: a network for surface projection, followed by a network for image denoising (Fig. 2b, Supplementary Fig. 9, and Supplementary Note 2). The results show that with CARE, reducing light dosage up to tenfold has virtually no adverse effect on the quality of segmentation and tracking results obtained on the projected 2D images with an established analysis pipeline41 (Fig. 2c,d, Supplementary Video 4, and Supplementary Figs. 10–12). Even for this complex task, the gained photon budget can be used to move beyond the design space, for example, by increasing temporal resolution, and consequently improving the precision of tracking of cell behaviors during wing morphogenesis41.

Image restoration with semi-synthetic training data. A common problem in fluorescence microscopy is that the axial resolution of volumetric acquisitions is substantially lower than the lateral resolution (some advanced modalities allow for isotropic acquisitions, such as multiview light-sheet microscopy19,42). This anisotropy compromises the ability to accurately measure properties such as the shapes or volumes of cells. Anisotropy is caused by the inherent axial elongation of the optical point spread function (PSF), and the often low axial sampling rate of volumetric acquisitions required for fast imaging. For the restoration of isotropic image resolution, adequate pairs of training data cannot directly be acquired at the microscope. Rather, we took well-resolved lateral slices as ground truth, and computationally modified them (applying a realistic imaging model; Supplementary Note 2) to resemble anisotropic axial slices of the same image stack. In this way, we generated matching pairs of images showing the same content at axial and lateral resolutions. These semi-synthetically generated pairs are suitable to train a CARE network that then restores previously unseen axial slices to nearly isotropic resolution (Fig. 3a, Supplementary Fig. 13, Supplementary Note 2, and refs 43,44). To restore entire anisotropic volumes, we applied the trained network to all lateral image slices, taken in two orthogonal directions, averaged to a single isotropic restoration (Supplementary Note 2).

We applied this strategy to increase axial resolution of acquired volumes of fruit fly embryos45, zebrafish retina46, and mouse liver, imaged with different fluorescence imaging techniques. The results show that CARE improved the axial resolution in all three cases considerably (Fig. 3b–d, Supplementary Videos 5 and 6, and Supplementary Figs. 14 and 15). To quantify this, we performed Fourier spectrum analysis of Drosophila volumes before and after restoration, and showed that the frequencies along the axial dimension are fully restored, while frequencies along the lateral dimensions remain unchanged (Fig. 3b and Supplementary Fig. 16). Since the purpose of the fruit fly data is to segment and track nuclei, we applied a common segmentation pipeline47 to the raw and restored images, and observed that the fraction of incorrectly identified nuclei was reduced from 1.7% to 0.2% (Supplementary Note 2 and Supplementary Figs. 17 and 18). Thus, restoring anisotropic volumetric embryo images to effectively isotropic stacks leads to improved segmentation, and will enable more reliable extraction of developmental lineages.

While isotropic images facilitate segmentation and subsequent quantification of shapes and volumes of cells, vessels, or other biological objects of interest, higher imaging speed enables imaging of larger volumes and their tracking over time. Indeed, respective CARE networks deliver the desired axial resolution with up to tenfold fewer axial slices (Fig. 3c,d; see Supplementary Fig. 19 for comparison with classical deconvolution), allowing one to reach comparable results ten times faster. We quantified the effect of subsampling on raw and restored volumes with respect to restorations of isotropically sampled volumes for the case of the liver data (Fig. 3d and Supplementary Fig. 20). Finally, we observed that for two-channel datasets such as the zebrafish, networks learned to exploit correlations between channels, leading to a better overall restoration quality compared to results based on individual channels (Supplementary Fig. 15).
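A schematic numpy/scipy sketch of the semi-synthetic pair generation described above; here an axis-aligned Gaussian blur stands in for the rotated PSF of the microscope, and the subsampling factor is illustrative (the paper applies a realistic imaging model, see Supplementary Note 2).

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def make_anisotropic_pair(lateral_slice, subsample=5, psf_sigma=2.0):
    # The well-resolved lateral (xy) slice is the ground truth.
    target = lateral_slice.astype(np.float32)
    # Blur along one axis (stand-in for the rotated PSF), subsample that axis
    # to mimic the coarse axial sampling, then upsample back to the grid.
    blurred = gaussian_filter(target, sigma=(psf_sigma, 0.0))
    coarse = blurred[::subsample]
    source = zoom(coarse, (target.shape[0] / coarse.shape[0], 1.0), order=3)
    return source[:target.shape[0]], target

Pairs pooled from many such slices train a 2D network; at prediction time, the network is applied to every axial slice of a volume in two orthogonal orientations and the two restorations are averaged.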


Fig. 2 | Joint surface projection and denoising. a, Schematic of the composite task at hand. A single cell layer of interest of a Drosophila wing is embedded in an imaged 3D volume. The desired pipeline extracts and denoises the 2D tissue layer from low-SNR input volumes. b, The proposed CARE network first projects the data and then performs a 2D denoising step. c, Restoration results on E-cadherin-labeled fly wing data acquired with a spinning disk microscope. Shown are a max-projection of the raw input data (top row), a surface projection baseline obtained by the state-of-the-art method PreMosa20 (second row), CARE network results (third row), and ground-truth projections obtained by applying PreMosa on a very high-laser-power acquisition of the same sample (bottom row). d, Prediction error for three imaging conditions (C1–C3; see Methods). Box-dot plots (n = 26 per condition) show NRMSE and SSIM (higher is better) for results obtained using PreMosa (blue), PreMosa with additional denoising (NLM22, green), and CARE network results (orange). Boxes show IQR, lines signify medians, and whiskers extend to 1.5 times the IQR. Comparison to additional baselines can be found in Supplementary Fig. 11.

Image restoration with synthetic training data. Having seen the potential of using semi-synthetic training data for CARE, we next investigated whether reasonable restorations can be achieved even from synthetic image data alone, that is, without involving real microscopy data during training.

In most of the previous applications, one of the main benefits of CARE networks was improved imaging speed. Many biological applications additionally require the resolution of sub-diffraction structures in the context of live-cell imaging. Super-resolution imaging modalities achieve the necessary resolution, but suffer from low acquisition rates. In contrast, widefield imaging offers the necessary speed, but lacks the required resolution. We therefore tested whether CARE can computationally resolve sub-diffraction structures using only widefield images as input. Note that this is a fundamentally different approach compared to recently proposed methods for single-molecule localization microscopy that reconstruct a single super-resolved image from multiple diffraction-limited input frames using deep learning32,33. To this end, we developed synthetic generative models of tubular and point-like structures that are commonly studied in biology. To obtain synthetic image pairs for training, we used these generated structures as ground truth, and computationally modified them to resemble actual microscopy data (Supplementary Note 2 and Supplementary Fig. 21). Specifically, we created synthetic ground-truth images of tubular meshes resembling microtubules, and point-like structures of various sizes mimicking secretory granules. Then we computed synthetic input images by simulating the image degradation process by applying a PSF, camera noise, and background auto-fluorescence (Fig. 4a, Supplementary Note 2, and Supplementary Fig. 21). Finally, we trained a CARE network on these generated image pairs, and applied it to two-channel widefield time-lapse images of rat INS-1 cells where the secretory granules and the microtubules were labeled (Fig. 4b). We observed that the restoration of both microtubules and secretory granules exhibited a dramatically improved resolution, revealing structures imperceptible in the widefield images (Supplementary Video 7 and Supplementary Fig. 22). To substantiate this observation, we compared the CARE restoration to the results obtained by deconvolution, which is commonly used to enhance widefield images (Fig. 4b). Line profiles through the data show the improved performance of the CARE network over deconvolution (Fig. 4b). We additionally compared results obtained by CARE with those from super-resolution radial fluctuations (SRRF14), a state-of-the-art method for reconstructing super-resolution images from widefield time-lapse data. We applied both methods on time-lapse widefield images of GFP-tagged microtubules in HeLa cells. The results show that both CARE and SRRF are able to resolve qualitatively similar microtubular structures (Fig. 4c and Supplementary Video 8). However, CARE reconstructions enable imaging to be carried out at least 20 times faster, since they are computed from a single average of up to 10 consecutive raw images while SRRF required about 200 consecutive widefield frames. We also used SQUIRREL48 to quantify the error for both methods and observed that CARE generally produced better results, especially in image regions containing overlapping structures of interest (Fig. 4d and Supplementary Fig. 23).

Taken together, these results suggest that CARE networks can enhance widefield images to a resolution usually obtainable only with super-resolution microscopy, yet at considerably higher frame rates.
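A schematic sketch of this synthetic degradation step, with a Gaussian PSF stand-in and illustrative noise parameters (the exact forward model of the paper is described in Supplementary Note 2).

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def degrade(ground_truth, psf_sigma=3.0, background=0.1,
            photons=50.0, read_noise=0.01):
    # PSF blur, constant background/auto-fluorescence, Poisson shot noise,
    # and Gaussian camera read noise.
    blurred = gaussian_filter(ground_truth.astype(np.float32), psf_sigma)
    expected = photons * (blurred + background)
    noisy = rng.poisson(expected).astype(np.float32)
    noisy += rng.normal(0.0, read_noise * photons, size=noisy.shape)
    return noisy / photons

Applying degrade() to rendered tubular meshes or point-like structures yields matched (input, ground truth) pairs without any acquired microscopy data.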


Fig. 3 | Isotropic restoration of 3D volumes. a, Schematic of the semi-synthetic generation of training data. Lateral slices of the raw data (green) are used as ground truth, which are then synthetically downsampled and convolved with the rotated PSF of the microscope used. This results in corresponding anisotropic slices (black) with similar resolution as the raw axial slices (orange). b, Application to time-lapse acquisitions of D. melanogaster45. Shown are three areas of the raw axial input data (top row) and their respective isotropic restorations (bottom row). Additionally, the Fourier spectrum of raw and restored images illustrates how missing spectral components are recovered. c, An axial slice through a developing zebrafish (Danio rerio) eye. Shown are the anisotropic raw data (top row) and isotropic restoration (bottom row). Nuclei are stained with DRAQ5 (magenta), and the nuclear envelope is labeled with GFP-LAP2b (green). d, An axial slice through mouse liver tissue. Shown are the anisotropic raw data (with a subsampling of σ = 8, top row) and the isotropic restoration by the network (middle row). Nuclei and membranes of hepatocytes are labeled with DAPI and phalloidin, respectively, and imaged in a single channel. Plots show the effect of increasing levels of axial subsampling of raw (blue) and isotropically restored (orange) volumes. We plot the NRMSE and SSIM (higher is better) with respect to our restorations of the shown isotropically sampled raw data. Details on data and training can be found in Supplementary Table 3.

Reliability of image restoration. We have shown that CARE networks perform well on a wide range of image restoration tasks, opening up new avenues for biological observations (Supplementary Table 2). However, as for any image processing method, the issue of reliability of results needs to be addressed.

CARE networks are trained for a specific biological organism, fluorescent marker, and microscope setting. When a network is applied to data it was not trained for, results are likely to suffer in quality, as is the case for any (supervised) method based on machine learning. Nevertheless, we observed only minimal 'hallucination' effects, where structures seen in the training data erroneously appear in restored images (Supplementary Figs. 24 and 25). In Supplementary Fig. 25a we show the two strongest errors across the entire body of available image data.

Nevertheless, it is essential to identify cases where the above-mentioned problems occur. To enable this, we changed the last network layer so that it predicts a probability distribution for each pixel (Fig. 5a, Methods, and Supplementary Note 3).

Fig. 4 | Resolving sub-diffraction structures at high frame rates. a, Schematic of the fully synthetic training pipeline. b, Raw widefield images of rat
secretory granules (pEG-hIns-SNAP; magenta) and microtubules (SiR-tubulin; green) in insulin-secreting INS-1 cells (top row), the corresponding
network restorations (second row), and a deconvolution result of the raw image as a baseline (bottom row). Line plots show image intensities along the
dashed lines in the top panels. c, GFP-tagged microtubules in HeLa cells. Raw input image (top row), network restorations (second row), super-resolution
images created by the state-of-the-art method SRRF14 (bottom row). Line plots show image intensities along the dashed lines in the top panels. d, Error
quantification via SQUIRREL48 for network results and the results obtained by SRRF. Shown are error maps corresponding to the dashed box in c and the
resolution scaled error (RSE) for 20 consecutive frames. The data shown in c correspond to frame 1.

We chose a Laplace distribution for simplicity and robustness (Supplementary Note 3). For probabilistic CARE networks, the mean of the distribution is used as the restored pixel value, while the width (variance) of each pixel distribution encodes the uncertainty of pixel predictions. Intuitively, narrow distributions signify high confidence, whereas broad distributions indicate low-confidence pixel predictions. This allows us to provide per-pixel confidence intervals of the restored image (Fig. 5a, Supplementary Figs. 26 and 27). We observed that variances tend to increase with restored pixel intensities. This makes it hard to intuitively understand which areas of a restored image are reliable or unreliable from a static image of per-pixel variances. Therefore, we visualize the uncertainty in short video sequences, where pixel intensities are randomly sampled from their respective distributions (Supplementary Video 9).
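Training such a probabilistic network amounts to minimizing the per-pixel negative log-likelihood of a Laplace distribution; a minimal TensorFlow sketch follows. The two-channel μ/σ output layout and the softplus parameterization of the scale are assumptions of this sketch (in the csbdeep package, the probabilistic variant is enabled via the Config option probabilistic=True).

import tensorflow as tf

def laplace_nll(y_true, y_pred):
    # y_pred carries two channels per pixel: predicted mean mu and scale sigma.
    mu, raw_scale = y_pred[..., :1], y_pred[..., 1:]
    sigma = tf.nn.softplus(raw_scale) + 1e-6  # keep the scale strictly positive
    # Laplace negative log-likelihood: |y - mu| / sigma + log(2 * sigma)
    return tf.reduce_mean(tf.abs(y_true - mu) / sigma + tf.math.log(2.0 * sigma))

Minimizing this loss drives μ towards the restored pixel value, while σ grows wherever the training data leave the intensity ambiguous, which is the quantity visualized by the confidence intervals in Fig. 5a.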
We additionally reasoned that by analyzing the consistency of predictions from several trained models we can assess their reliability. To that end, we train ensembles (Fig. 5b) of about five CARE networks on randomized sequences of the same training data. We introduced a measure D that quantifies the probabilistic ensemble disagreement per pixel (Methods, Supplementary Note 3). D takes values between 0 and 1, with higher values signifying larger disagreement, that is, smaller overlap among the distributions predicted by the networks in the ensemble. Using fly wing denoising as an example, we observed that in areas where different networks in an ensemble predicted very similar structures, the disagreement measure D was low (Fig. 5c, top row), whereas in areas where the same networks predicted obviously dissimilar solutions, the corresponding values of D were large (Fig. 5c, bottom row). Therefore, training ensembles of CARE networks is useful for detecting problematic image areas that cannot reliably be restored. Another example for the utility of ensemble disagreement can be found in Supplementary Fig. 28.
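The Methods define D via the average Kullback–Leibler divergence between each member's predicted distribution and the ensemble mixture. A Monte-Carlo numpy sketch is given below; the estimator and the normalization by the maximum value log K are assumptions of this illustration.

import numpy as np

def ensemble_disagreement(mus, sigmas, n_samples=64, seed=0):
    # mus, sigmas: Laplace parameters of K ensemble members, shape (K, H, W).
    rng = np.random.default_rng(seed)
    K = mus.shape[0]
    # Draw samples from each member: shape (K, n_samples, H, W).
    x = rng.laplace(mus[:, None], sigmas[:, None],
                    size=(K, n_samples) + mus.shape[1:])

    def log_pdf(v, mu, sigma):  # Laplace log-density
        return -np.abs(v - mu) / sigma - np.log(2.0 * sigma)

    log_member = log_pdf(x, mus[:, None], sigmas[:, None])
    # Mixture density at every sample: average of the K member densities.
    mix = np.mean(np.exp(log_pdf(x[None], mus[:, None, None],
                                 sigmas[:, None, None])), axis=0)
    # Average KL(member || mixture), normalized by its maximum log(K).
    kl = np.mean(log_member - np.log(mix), axis=(0, 1))
    return np.clip(kl / np.log(K), 0.0, 1.0)

Identical member distributions give D = 0, while non-overlapping distributions approach D = 1.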
Discussion
We have introduced CARE networks designed to restore fluorescence microscopy data. A key feature of our approach is that training data can be generated without laborious manual annotation. With CARE, flatworms can be imaged without unwanted muscle contractions, beetle embryos can be imaged much more gently and therefore for longer and much faster, large tiled scans of entire Drosophila wings can be imaged and simultaneously projected at dramatically increased temporal resolution, isotropic restorations of embryos and large organs can be computed from existing anisotropic data, and sub-diffraction structures can be restored from widefield systems at high frame rates. In all these examples, CARE allows the photon budget saved during imaging to be invested into improvement of acquisition parameters relevant for a given biological problem, such as speed of imaging, photo-toxicity, isotropy, and resolution.

Whether experimentalists are willing to make the above-mentioned investment depends on their trust that a CARE network is accurately restoring the image. This is a valid concern that applies to every image restoration approach. What sets CARE apart is the availability of additional readouts, that is, per-pixel confidence intervals and ensemble disagreement scores, which allow users to identify image regions where restorations might not be accurate.


Fig. 5 | Reliability readouts for CARE. a, For every pixel of a restored image, CARE networks can predict a (Laplace) distribution parameterized by its mean μ and scale σ (top). These distributions provide pixel-wise confidence intervals (bottom), here shown for a surface projection and denoising network (see Fig. 2). The line plot shows the predicted mean (blue), the 90% confidence interval (light blue), and corresponding ground-truth intensities (dashed red) along the yellow dashed line in the image on the left. b, Multiple independently trained CARE networks are combined to form an ensemble, resulting in an ensemble distribution and an ensemble disagreement measure D ∈ [0, 1]. c, Ensemble predictions can vary, especially on challenging image regions. Shown are two examples for a surface projection and denoising ensemble of four networks (rows). From left to right we show the maximum projection of input data, predictions of the four networks of the ensemble, the pixel-wise ensemble mean, and the ensemble disagreement measure. While the top row shows an image region with low ensemble disagreement, the bottom row shows a region where individual network predictions differ, resulting in a high disagreement score in respective image areas.

We have shown multiple examples where image restoration with CARE networks positively impacts downstream image analysis, such as segmentation and tracking of cells needed to extract developmental lineages. Interestingly, in the case of Tribolium, CARE improved segmentation by efficient denoising, whereas in the case of Drosophila, the segmentation was improved by an increase in the isotropy of volumetric acquisitions. These two benefits are not mutually exclusive and could very well be combined. In fact, we have shown on data from developing Drosophila wings that composite tasks can be jointly trained. Future explorations of joint training of composite networks will further broaden the applicability of CARE to complex biological imaging problems (see ref. 49).

However, CARE networks cannot be applied to all existing image restoration problems. For instance, the proposed isotropic restoration relies on the implicit assumption that structures of interest do appear in arbitrary orientations and that the PSF is constant throughout the image volume. This assumption is only approximately true, and becomes increasingly worse as the imaging depth in the sample tissue increases. Additionally, because of the nonlinear nature of neural network predictions, CARE must not be used for intensity-based quantifications such as, for example, fluorophore counting. Furthermore, the disagreement score we introduced may be useful to additionally identify instances where training and test data are incompatible, that is, when a CARE network is applied on data that contain biological structures absent from the training set.

Overall, our results show that fluorescence microscopes can, in combination with CARE, operate at higher frame rates, shorter exposures, and lower light intensities, while reaching higher resolution, and thereby improving downstream analysis. The technology described here is readily accessible to the scientific community through the open source tools we provide. We predict that the current explosion of image data diversity and the ability of CARE networks to automatically adapt to various image contents will make such learning approaches prevalent for biological image restoration and will open up new windows into the inner workings of biological systems across scales.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, statements of data availability and associated accession codes are available at https://doi.org/10.1038/s41592-018-0216-7.

Received: 7 June 2018; Accepted: 10 October 2018; Published online: 26 November 2018

References
1. Huisken, J. et al. Optical sectioning deep inside live embryos by selective plane illumination microscopy. Science 305, 1007–1009 (2004).
2. Tomer, R. et al. Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy. Nat. Methods 9, 755–763 (2012).
3. Chen, B.-C. et al. Lattice light-sheet microscopy: imaging molecules to embryos at high spatiotemporal resolution. Science 346, 1257998 (2014).
4. Gustafsson, M. G. Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy. J. Microsc. 198, 82–87 (2000).
5. Heintzmann, R. & Gustafsson, M. G. Subdiffraction resolution in continuous samples. Nat. Photon. 3, 362–364 (2009).
6. Betzig, E. et al. Imaging intracellular fluorescent proteins at nanometer resolution. Science 313, 1642–1645 (2006).

7. Rust, M. J., Bates, M. & Zhuang, X. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat. Methods 3, 793–795 (2006).
8. Mortensen, K. I. et al. Optimized localization analysis for single-molecule tracking and super-resolution microscopy. Nat. Methods 7, 377–381 (2010).
9. Icha, J. et al. Phototoxicity in live fluorescence microscopy, and how to avoid it. Bioessays 39, 1700003 (2017).
10. Laissue, P. P. et al. Assessing phototoxicity in live fluorescence imaging. Nat. Methods 14, 657–661 (2017).
11. Pawley, J. B. Fundamental limits in confocal microscopy. In Handbook of Biological Confocal Microscopy (ed Pawley, J. B.) 20–42 (Springer, Boston, MA, 2006).
12. Scherf, N. & Huisken, J. The smart and gentle microscope. Nat. Biotechnol. 33, 815–818 (2015).
13. Müller, M. et al. Open-source image reconstruction of super-resolution structured illumination microscopy data in ImageJ. Nat. Commun. 7, 10980 (2016).
14. Gustafsson, N. et al. Fast live-cell conventional fluorophore nanoscopy with ImageJ through super-resolution radial fluctuations. Nat. Commun. 7, 12471 (2016).
15. Dertinger, T. et al. Superresolution optical fluctuation imaging (SOFI). In Nano-Biotechnology for Biomedical and Diagnostic Research (eds Zahavy, E. et al.) 17–21 (Springer, Dordrecht, the Netherlands, 2012).
16. Agarwal, K. & Macháň, R. Multiple signal classification algorithm for super-resolution fluorescence microscopy. Nat. Commun. 7, 13752 (2016).
17. Richardson, W. H. Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 62, 55–69 (1972).
18. Arigovindan, M. et al. High-resolution restoration of 3D structures from widefield images with extreme low signal-to-noise-ratio. Proc. Natl. Acad. Sci. USA 110, 17344–17349 (2013).
19. Preibisch, S. et al. Efficient Bayesian-based multiview deconvolution. Nat. Methods 11, 645–648 (2014).
20. Blasse, C. et al. PreMosa: extracting 2D surfaces from 3D microscopy mosaics. Bioinformatics 33, 2563–2569 (2017).
21. Shihavuddin, A. et al. Smooth 2D manifold extraction from 3D image stack. Nat. Commun. 8, 15554 (2017).
22. Buades, A., Coll, B. & Morel, J.-M. A non-local algorithm for image denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (eds Schmid, C., Soatto, S. & Tomasi, C.) 60–65 (IEEE, New York, 2005).
23. Dabov, K., Foi, A., Katkovnik, V. & Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 16, 2080–2095 (2007).
24. Morales-Navarrete, H. et al. A versatile pipeline for the multi-scale digital reconstruction and quantitative analysis of 3D tissue architecture. eLife 4, e11214 (2015).
25. LeCun, Y. et al. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
26. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
27. Beier, T. et al. Multicut brings automated neurite segmentation closer to human performance. Nat. Methods 14, 101–102 (2017).
28. Caicedo, J. C. et al. Data-analysis strategies for image-based cell profiling. Nat. Methods 14, 849–863 (2017).
29. Ounkomol, C. et al. Label-free prediction of three-dimensional fluorescence images from transmitted-light microscopy. Nat. Methods 15, 917–920 (2018).
30. Christiansen, E. M. et al. In silico labeling: predicting fluorescent labels in unlabeled images. Cell 173, 792–803 (2018).
31. Rivenson, Y. et al. Deep learning microscopy. Optica 4, 1437–1443 (2017).
32. Nehme, E. et al. Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5, 458–464 (2018).
33. Ouyang, W. et al. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468 (2018).
34. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) (eds Navab, N. et al.) 234–241 (Springer, Cham, 2015).
35. Shettigar, N. et al. Hierarchies in light sensing and dynamic interactions between ocular and extraocular sensory networks in a flatworm. Sci. Adv. 3, e1603025 (2017).
36. Mao, X.-J., Shen, C. & Yang, Y.-B. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In Advances in Neural Information Processing Systems (NIPS) Vol. 29 (eds Lee, D. D. et al.) 2802–2810 (Curran Associates, Red Hook, NY, 2016).
37. Wang, Z. et al. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).
38. Ulman, V. et al. An objective comparison of cell-tracking algorithms. Nat. Methods 14, 1141–1152 (2017).
39. Aigouy, B. et al. Cell flow reorients the axis of planar polarity in the wing epithelium of Drosophila. Cell 142, 773–786 (2010).
40. Etournay, R. et al. Interplay of cell dynamics and epithelial tension during morphogenesis of the Drosophila pupal wing. eLife 4, e07090 (2015).
41. Etournay, R. et al. TissueMiner: a multiscale analysis toolkit to quantify how cellular processes create tissue dynamics. eLife 5, e14334 (2016).
42. Chhetri, R. K. et al. Whole-animal functional and developmental imaging with isotropic spatial resolution. Nat. Methods 12, 1171–1178 (2015).
43. Weigert, M., Royer, L., Jug, F. & Myers, G. Isotropic reconstruction of 3D fluorescence microscopy images using convolutional neural networks. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2017 (eds Descoteaux, M. et al.) 126–134 (Springer, Cham, 2017).
44. Heinrich, L., Bogovic, J. A. & Saalfeld, S. Deep learning for isotropic super-resolution from non-isotropic 3D electron microscopy. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2017 (eds Descoteaux, M. et al.) 135–143 (Springer, Cham, 2017).
45. Royer, L. A. et al. Adaptive light-sheet microscopy for long-term, high-resolution imaging in living organisms. Nat. Biotechnol. 34, 1267–1278 (2016).
46. Icha, J. et al. Independent modes of ganglion cell translocation ensure correct lamination of the zebrafish retina. J. Cell Biol. 215, 259–275 (2016).
47. Sommer, C. et al. Ilastik: interactive learning and segmentation toolkit. In IEEE International Symposium on Biomedical Imaging: From Nano to Macro 230–233 (IEEE, New York, 2011).
48. Culley, S. et al. Quantitative mapping and minimization of super-resolution optical imaging artifacts. Nat. Methods 15, 263–266 (2018).
49. Sui, L. et al. Differential lateral and basal tension drives epithelial folding through two distinct mechanisms. Nat. Commun. 9, 4620 (2018).

Acknowledgements
The authors thank P. Keller (Janelia) who provided Drosophila data. We thank S. Eaton (MPI-CBG), F. Gruber and R. Piscitello for sharing their expertise in fly imaging and providing fly lines. We thank A. Sönmetz for cell culture work. We thank M. Matejcic (MPI-CBG) for generating and sharing the LAP2b transgenic line Tg(bactin:eGFP-LAP2b). We thank B. Lombardot from the Scientific Computing Facility (MPI-CBG) for technical support. We thank the following Services and Facilities of the MPI-CBG for their support: Computer Department, Light Microscopy Facility and Fish Facility. This work was supported by the German Federal Ministry of Research and Education (BMBF) under the codes 031L0102 (de.NBI) and 031L0044 (Sysbio II) and the Deutsche Forschungsgemeinschaft (DFG) under the code JU 3110/1-1. M.S. was supported by the German Center for Diabetes Research (DZD e.V.). T.B. was supported by an ELBE postdoctoral fellowship and an Add-on Fellowship for Interdisciplinary Life Sciences awarded by the Joachim Herz Stiftung. R.H. and S.C. were supported by the following grants: UK BBSRC (grant nos. BB/M022374/1, BB/P027431/1, and BB/R000697/1), UK MRC (grant no. MR/K015826/1) and Wellcome Trust (grant no. 203276/Z/16/Z).

Author contributions
F.J. and E.W.M. shared last-authorship. M.W. and L.R. initiated the research. M.W. and U.S. designed and implemented the training and validation methods. U.S., M.W., and F.J. designed and implemented the uncertainty readouts. T.B., A.M., A.D., S.C., F.S.M., R.H., M.R.M., and A.J. collected experimental data. A.D., C.B., and F.J. performed cell segmentation analysis. T.B. performed analysis on flatworm data. U.S. and M.W. designed and developed the Python package. F.J., B.W., and D.S. designed and developed the FIJI and KNIME integration. E.W.M. supervised the project. F.J., M.W., P.T., L.R., U.S., and E.W.M. wrote the manuscript, with input from all authors.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41592-018-0216-7.
Reprints and permissions information is available at www.nature.com/reprints.
Correspondence and requests for materials should be addressed to M.W. or L.R. or F.J.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© The Author(s), under exclusive licence to Springer Nature America, Inc. 2018

Nature Methods | VOL 15 | DECEMBER 2018 | 1090–1097 | www.nature.com/naturemethods 1097


Articles NATurE METhods

Methods
For each of the described experiments and restoration modalities, we (1) imaged or generated suitable training data, (2) trained a neural network (or ensemble of networks), and (3) applied the trained network and quantified/reported the results.

Network architecture and training. For all experiments (except fly wing projection) we used residual versions of the U-Net architecture34 in 3D or 2D (Supplementary Figs. 1 and 13). For the fly wing projection task, we used a two-stage architecture combining a projection and a denoising sub-network (Supplementary Fig. 9). All restoration experiments were performed in Python using Keras50 and TensorFlow51. Source code for training and prediction, example applications, and documentation can be found at http://csbdeep.bioimagecomputing.com/doc/. The training details for each individual restoration experiment (e.g., number of used images, network hyper-parameters) are listed in Supplementary Table 3 and are described in Supplementary Note 2.
Image normalization. For training, prediction, and evaluation, it is important to normalize the input images to a common intensity range. We used percentile-based normalization, typically using percentile ranks p_low ∈ (1, 3) for determining the lowest value and p_high ∈ (99.5, 99.9) for the highest value. All image pixels are then affinely scaled, such that the lowest and highest values are converted to 0 and 1, respectively. For a given image y, the percentile-normalized image will be denoted by N(y, p_low, p_high).
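A direct implementation of this normalization (the default percentile ranks are picked from the stated ranges):

import numpy as np

def percentile_norm(y, p_low=2.0, p_high=99.8):
    # N(y, p_low, p_high): affinely rescale so that the p_low percentile
    # maps to 0 and the p_high percentile maps to 1.
    lo, hi = np.percentile(y, [p_low, p_high])
    return (y.astype(np.float32) - lo) / (hi - lo + 1e-20)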
Quantification of restoration errors. Since the image y predicted by any restoration method (CARE or any compared baseline) and the corresponding ground-truth image y0 typically differ in the dynamic range of their pixel values, they have to be normalized to a common range first. To that end, we first percentile-normalize the ground-truth image y0 as described before with p_low = 0.1 and p_high = 99.9. Second, we use a transformation φ(y) = αy + β that affinely scales and translates every pixel of the restored image based on parameters

    (α, β) = argmin_{α′, β′} MSE(N(y0, 0.1, 99.9), α′y + β′)

with

    MSE(u, v) = (1/N) Σ_{i=1}^{N} (u_i − v_i)²

That is, α and β are chosen such that the mean squared error (MSE) between φ(y) and N(y0, 0.1, 99.9) is minimal (note that α and β can be easily computed in closed form). All final error metrics, such as NRMSE and SSIM37, were computed on images normalized in this way. More details can be found in Supplementary Note 2.
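The closed-form solution for α and β is ordinary least squares on the pixel values; a numpy sketch follows (the NRMSE normalization by the ground-truth intensity range is an assumption of this illustration; see Supplementary Note 2 for the exact definition).

import numpy as np

def affine_rescale(y, y0_norm):
    # Closed-form least-squares alpha, beta minimizing MSE(y0_norm, alpha*y + beta).
    p, g = y.ravel(), y0_norm.ravel()
    alpha = np.cov(p, g, bias=True)[0, 1] / p.var()
    beta = g.mean() - alpha * p.mean()
    return alpha * y + beta

def nrmse(y, y0_norm):
    scaled = affine_rescale(y, y0_norm)
    rmse = np.sqrt(np.mean((scaled - y0_norm) ** 2))
    return rmse / (y0_norm.max() - y0_norm.min())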
Planaria denoising. Planaria (S. mediterranea) were cultured at 20 °C in planarian water52 and fed with organic bovine liver paste. To label nuclei, S. mediterranea samples were stained for 15 h in planarian water supplemented with 2× RedDot1 and 1% (v/v) dimethylsulfoxide (DMSO). For training data acquisition, planaria were euthanized with 5% (w/v) N-acetyl-l-cysteine in PBS and subsequently fixed in 4% (w/v) paraformaldehyde in PBS. For time-lapse recordings, RedDot1-stained planaria were anesthetized for 1 h with 0.019% (w/v) Linalool prior to mounting, which was maintained throughout the course of the live-imaging experiments. A 5-min incubation in 0.5% (w/v) pH-neutralized N-acetyl-l-cysteine was used to remove the animal's mucus before mounting. For imaging, fixed or live animals were mounted in refractive-index-matched 1.5% agarose (50% (w/v) iodixanol) to enhance signal quality at higher imaging depths as described in ref. 52. For imaging, a spinning disc confocal microscope with a 30×/1.05-NA (numerical aperture) silicon oil-immersion objective and 640-nm excitation wavelength was used. We used four different laser-power/exposure-time imaging conditions: GT (ground truth) and C1–C3, specifically 2.31 mW/30 ms (GT), 0.12 mW/20 ms (C1), 0.12 mW/10 ms (C2), and 0.05 mW/10 ms (C3). To ensure that corresponding image stacks were well aligned, we interleaved all four different imaging conditions as different channels during acquisition. In total, we acquired 96 stacks of average size 1,024 × 1,024 × 400. From these data we sampled around 17,000 randomly positioned sub-volume pairs of size 64 × 64 × 16 voxels. We evaluated our results on 20 previously unseen volumes of S. mediterranea imaged at various developmental stages. As competing denoising methods, we chose lowpass filter, median filter, bilateral filter53, non-local-means denoising (NLM)22, total variation denoising54, BM3D23, and BM4D55. Please see Supplementary Table 1 and Supplementary Note 2 for more details.
Tribolium denoising and segmentation. An EFA::nGFP transgenic line of Tribolium castaneum was used for imaging of embryonic development with labeled nuclei56. The beetles were reared and embryos were collected according to standard protocols57. Imaging was done on a Zeiss 710 multiphoton laser-scanning microscope using a 25× multi-immersion objective. Similar to the planaria dataset, we used four different laser-power imaging conditions: GT and C1–C3, specifically 20 mW (GT), 0.1 mW (C1), 0.2 mW (C2), and 0.5 mW (C3). For each condition we acquired 26 training stacks (of size ∼700 × 700 × 50) using different samples at different developmental stages. From that, we randomly sampled around 15,000 patches of size 64 × 64 × 16 and trained a 3D network as before. For testing, we used six additional volumes per condition, again acquired at different developmental stages. As a denoising baseline we again used NLM22. Nuclei segmentation was performed using a thresholding-based segmentation workflow. To create the segmentation ground truth, we used ilastik to train a pixel-wise random forest classifier to distinguish nuclei and background pixels in the high-SNR (GT) image, whose output was curated using a combination of segmentation tools from SciPy58, the 3D volume rendering software spimagine (https://github.com/maweigert/spimagine) and manual, pixel-wise corrections. To create segmentations for restorations (NLM or CARE) of the low-SNR images (C2), we thresholded their intensities and labeled connected components of pixels above the threshold as individual nuclei. The segmentation accuracy was computed as the SEG score59, which corresponds to the average overlap of segmented regions with matched ground-truth regions (0 ≤ SEG ≤ 1). More details can be found in Supplementary Note 2.
microscope using a 25×​multi-immersion objective. Similar to the planaria dataset, 28.5 °C and treated with 0.2 mM 1-phenyl-2-thiourea at 8 hours post-fertilization
we used four different laser-power imaging conditions: GT and C1–C3, specifically (hpf) onward to delay pigmentation. Embryos were fixed at 24 hpf in 4%
20 mW (GT), 0.1 mW (C1), 0.2 mW (C2), and 0.5 mW (C3). For each condition paraformaldehyde, permeabilized with 0.25% trypsin, and incubated with a
we acquired 26 training stacks (of size ∼​700 ×​  700 ×​ 50) using different samples at far-red DNA stain (DRAQ5) for 2 d at 4 °C. Imaging of agarose-mounted embryos
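The following sketch illustrates this evaluation for integer-labeled instance masks; solving the matching with the Hungarian algorithm from SciPy is an illustrative choice, since the paper does not specify the solver:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def segmentation_error(gt_labels, pred_labels, iou_thresh=0.5):
    """Fraction of GT nuclei left unmatched under a one-to-one IoU matching."""
    gt_ids = [i for i in np.unique(gt_labels) if i != 0]
    pred_ids = [j for j in np.unique(pred_labels) if j != 0]
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for a, i in enumerate(gt_ids):
        gi = gt_labels == i
        for b, j in enumerate(pred_ids):
            pj = pred_labels == j
            inter = np.logical_and(gi, pj).sum()
            union = np.logical_or(gi, pj).sum()
            iou[a, b] = inter / union if union else 0.0
    # bipartite matching that maximizes total IoU (Hungarian algorithm)
    rows, cols = linear_sum_assignment(-iou)
    matched = (iou[rows, cols] >= iou_thresh).sum()
    return 1.0 - matched / max(len(gt_ids), 1)
```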

Zebrafish retinal tissue isotropic restoration. Zebrafish (Danio rerio) imaging experiments were performed with a transgenic zebrafish line Tg(bactin:eGFP-LAP2b) that labels the nuclear envelope. Embryos were raised in E3 medium at 28.5 °C and treated with 0.2 mM 1-phenyl-2-thiourea from 8 hours post-fertilization (hpf) onward to delay pigmentation. Embryos were fixed at 24 hpf in 4% paraformaldehyde, permeabilized with 0.25% trypsin, and incubated with a far-red DNA stain (DRAQ5) for 2 d at 4 °C. Imaging of agarose-mounted embryos was performed on a spinning disk confocal microscope (Andor Revolution WD) with a 60×/1.3-NA objective, using excitation wavelengths of λ = 638 nm (DRAQ5) and λ = 488 nm (eGFP-LAP2b). Stacks were acquired with 2-μm steps, resulting in an axial subsampling factor of σ = 10.2. For generating training data, we acquired five multichannel volumes from which we extracted around 25,000 lateral patches of size 128 × 128 × 2, applied the corresponding theoretical PSF and subsampling model, yet always kept the information of both image channels. Network training was done as before. To compare the restoration quality with classical deconvolution, we ran Huygens (Scientific Volume Imaging, http://svi.nl) on the bicubically upsampled raw stacks, once with the actual PSF and once with a σ-fold down- and upsampled PSF (to account for the additional blur related to upsampling). We used the following Huygens parameters: method, MLE; number of iterations, 70; SNR parameter, 15; quality threshold, 0.05.
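A minimal sketch of this synthetic degradation of lateral slices (single channel; the function name, interpolation order, and PSF model are assumptions, not the released implementation):

```python
import numpy as np
from scipy.ndimage import convolve, zoom

def make_training_pair(lateral_slice, psf, sigma=10.2):
    """Turn a high-resolution lateral (x-y) slice into a synthetic axial view.

    psf: 2D model of the axial blur (rotated system PSF); sigma: axial
    subsampling factor. Returns (degraded input, ground truth).
    """
    blurred = convolve(lateral_slice.astype(np.float32), psf, mode="reflect")
    low = zoom(blurred, (1.0 / sigma, 1.0), order=1)   # subsample one axis by sigma
    # upsample back onto the original grid, mimicking bicubic/linear interpolation
    src = zoom(low, (lateral_slice.shape[0] / low.shape[0], 1.0), order=1)
    return src, lateral_slice
```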
Mouse liver isotropic restoration. Mouse livers were fixed through transcardial perfusion with 4% paraformaldehyde and post-fixed overnight at 4 °C in the same solution. Tissue slices were optically cleared with a modified version of SeeDB66 and stained with 4′,6-diamidino-2-phenylindole (DAPI) (nuclei) and phalloidin (membrane). The samples were imaged using a Zeiss LSM 780 NLO multiphoton laser-scanning microscope with a 63×/1.3-NA glycerol-immersion objective (Zeiss) using 780-nm two-photon excitation and an isotropic voxel size of 0.3 μm. We acquired eight stacks of mouse liver, each of size 752 × 752 × 300. For the range of subsampling factors σ = 2,…,16, we created respective axially anisotropic stacks by retaining only every σth axial slice from the original volumes, to be restored later. For each σ, we extracted around 15,000 patches of size 128 × 128 from the given body of data and trained a network as described before. Refer to Supplementary Note 2 for more details.
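The slice-retention step itself reduces to a strided view, sketched here with a small stand-in volume (real stacks are 752 × 752 × 300):

```python
import numpy as np

def axially_subsampled(volume, sigma):
    """Retain only every sigma-th axial slice of an isotropic (z, y, x) stack."""
    return volume[::sigma]

volume = np.random.rand(64, 128, 128).astype(np.float32)  # stand-in for one liver stack
anisotropic = {sigma: axially_subsampled(volume, sigma) for sigma in range(2, 17)}
print({sigma: v.shape for sigma, v in anisotropic.items()})
```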
INS-1 cell tubular/granule restoration. Rat insulin-secreting beta cells (INS-1 cells) were cultured and transiently transfected with pEG-hIns-SNAP as previously described67. The cells were labeled with 1 μM SNAP-Cell 505-Star (secretory granules) and with 1 μM SiR-tubulin (microtubules) for 1 h. Imaging was done with the DeltaVision OMX (GE) microscope using an Olympus Plan-Apochromat 60×/1.43-NA objective, yielding dual-channel images. Time-lapse movies were acquired in widefield mode with 50-ms exposure time and 10% fluorescence intensity for each channel, resulting in a final speed of 2 frames per second (fps). Deconvolution was done with the softWoRx software package running on board the OMX. We created synthetic ground-truth images of tubular networks and granular structures by simulating 2D trajectories and granular points on a pixel grid, respecting known physical properties (for example, microtubule width and persistence length). We generated the corresponding synthetic widefield input images by adding low-frequency Perlin noise mimicking auto-fluorescence, convolving the result with the theoretical PSF of the microscope, and adding Poisson and Gaussian noise mimicking camera noise. In total, we created around 8,000 synthetic patch pairs of size 128 × 128. For both secretory granules and microtubules, we trained a 2D network (as before) to invert this degradation process and applied it to the respective channel of the widefield images (Supplementary Fig. 13). More details can be found in Supplementary Note 2.
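A simplified version of this degradation pipeline is sketched below; smoothed Gaussian noise stands in for the Perlin noise of the paper, and all parameter values are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve

def degrade_to_widefield(gt, psf, seed=0, autofluor_scale=0.1,
                         gauss_std=0.02, photons=100):
    """Simulate a widefield image from a synthetic ground-truth patch.

    Adds a low-frequency background (stand-in for Perlin noise), blurs with
    the theoretical PSF, then applies Poisson (shot) and Gaussian (camera)
    noise.
    """
    rng = np.random.default_rng(seed)
    background = autofluor_scale * gaussian_filter(
        rng.standard_normal(gt.shape), sigma=16)
    img = convolve(gt + background, psf, mode="reflect")
    img = rng.poisson(np.clip(img, 0, None) * photons) / photons  # shot noise
    img = img + gauss_std * rng.standard_normal(img.shape)        # read noise
    return img.astype(np.float32)
```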
HeLa cell microtubule restoration and error map calculation. HeLa cells stably expressing H2B-mCherry/mEGFP-α-tubulin68 were grown in DMEM containing 10% FBS, 100 U ml–1 penicillin, and 100 mg ml–1 streptomycin at 37 °C with 5% CO2 in a humidified incubator. Before imaging, cells were seeded onto a #1.5 glass-bottom 35-mm μ-Dish. Imaging was performed on a Zeiss Elyra PS.1 inverted microscope at 37 °C and 5% CO2 in total internal reflection fluorescence mode with a Plan-Apochromat 100×/1.46-NA oil-immersion objective (Zeiss) and additional 1.6× magnification, with 488-nm laser illumination at an on-sample intensity of <10 W cm–2. We created synthetic microtubule training data as described before, resulting in around 5,000 patch pairs of size 128 × 128, and trained a 2D network as described before. Super-resolution images were reconstructed via SRRF14. Error maps for both the SRRF and CARE restorations were computed with SQUIRREL48 against the widefield reference frames.
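Conceptually, such an error map blurs the restored image to the reference resolution, fits a linear intensity mapping, and takes the absolute residual; the sketch below is a strong simplification of SQUIRREL, with a Gaussian resolution-scaling function as an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def error_map(superres, widefield, rsf_sigma=3.0):
    """Simplified resolution-scaled error map in the spirit of SQUIRREL.

    Blur the super-resolved image with a Gaussian resolution-scaling function,
    fit the best linear intensity mapping alpha * blurred + beta to the
    widefield reference (least squares), and return the absolute residual.
    rsf_sigma is an illustrative value, not a calibrated RSF.
    """
    blurred = gaussian_filter(superres.astype(np.float64), rsf_sigma)
    A = np.stack([blurred.ravel(), np.ones(blurred.size)], axis=1)
    (alpha, beta), *_ = np.linalg.lstsq(A, widefield.ravel(), rcond=None)
    return np.abs(alpha * blurred + beta - widefield)
```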
Reliability measures and calibration. To model the inherent (aleatoric) uncertainty of intensity predictions, we adapted the final layers of the network to output a custom probability distribution for every pixel of the restored image, instead of just a scalar value. Specifically, the network predicted the parameters μ and σ of a Laplace distribution,

p(z; μ, σ) = 1/(2σ) · exp(−|z − μ| / σ),

for intensity value z. To represent the (epistemic) model uncertainty for a specific experiment, we trained an ensemble of M networks (for example, M = 5) and averaged their results (as a mixture model; see ref. 69).
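For illustration, the per-pixel Laplace negative log-likelihood that such a network minimizes, and a Monte-Carlo estimate of the per-pixel ensemble disagreement, can be sketched in numpy as follows (the exact formulation and normalization follow Supplementary Note 3, not this sketch):

```python
import numpy as np

def laplace_nll(z, mu, sigma):
    """Per-pixel negative log-likelihood of the predicted Laplace distribution.

    Minimizing this trains the network to output both the restored intensity
    (mu) and its aleatoric uncertainty (sigma).
    """
    return np.log(2 * sigma) + np.abs(z - mu) / sigma

def ensemble_disagreement(mus, sigmas, n_samples=64, seed=0):
    """Monte-Carlo estimate of the mean KL divergence between each ensemble
    member's Laplace distribution and the equal-weight mixture, per pixel.

    mus, sigmas: arrays of shape (M, ...) for an ensemble of M networks.
    """
    rng = np.random.default_rng(seed)
    M = mus.shape[0]
    kl = np.zeros(mus.shape[1:])
    for m in range(M):
        # samples drawn from the m-th member's Laplace distribution
        z = mus[m] + sigmas[m] * rng.laplace(size=(n_samples,) + mus.shape[1:])
        log_pm = -np.log(2 * sigmas[m]) - np.abs(z - mus[m]) / sigmas[m]
        log_mix = np.log(np.mean(
            np.exp(-np.log(2 * sigmas) - np.abs(z[:, None] - mus) / sigmas),
            axis=1))
        kl += np.mean(log_pm - log_mix, axis=0)
    return kl / M
```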
We validated our probabilistic approach by adapting the concept of a calibrated classifier70 to the case of regression, which allows computation of the accuracy/confidence curves and the definition of an expected calibration error of a regression model (see Supplementary Note 3). Furthermore, we quantified the normalized per-pixel disagreement of a network ensemble via the average Kullback–Leibler divergence between the individual network distributions and the ensemble mixture distribution. This allowed us to highlight image regions with elevated disagreement scores that may indicate unreliable network predictions (for example, for very challenging low-SNR input; see Fig. 5 and Supplementary Fig. 28). For an extensive and detailed discussion, including all derivations, see Supplementary Note 3.

Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Training and test data for all experiments presented can be found at https://publications.mpi-cbg.de/publications-sites/7207. The code for network training and prediction (in Python/TensorFlow) is publicly available at https://github.com/CSBDeep/CSBDeep. Furthermore, to make our restoration models readily available, we developed user-friendly FIJI plugins and KNIME workflows (Supplementary Figs. 29 and 30).

References
50. Chollet, F. et al. Keras https://keras.io (2015).
51. Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI) (eds Keeton, K. & Roscoe, T.) 265–283 (2016).
52. Boothe, T. et al. A tunable refractive index matching medium for live imaging cells, tissues and model organisms. eLife 6, e27240 (2017).
53. Tomasi, C. & Manduchi, R. Bilateral filtering for gray and color images. In Sixth International Conference on Computer Vision 839–846 (IEEE, New York, 1998).
54. Chambolle, A. An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89–97 (2004).
55. Maggioni, M. et al. Nonlocal transform-domain filter for volumetric data denoising and reconstruction. IEEE Trans. Image Process. 22, 119–133 (2013).
56. Sarrazin, A. F., Peel, A. D. & Averof, M. A segmentation clock with two-segment periodicity in insects. Science 336, 338–341 (2012).
57. Brown, S. J. et al. The red flour beetle, Tribolium castaneum (Coleoptera): a model for studies of development and pest biology. Cold Spring Harb. Protoc. https://doi.org/10.1101/pdb.emo126 (2009).
58. Jones, E. et al. SciPy: Open Source Scientific Tools for Python http://www.scipy.org (2001).
59. Maška, M. et al. A benchmark for comparison of cell tracking algorithms. Bioinformatics 30, 1609–1617 (2014).
60. Classen, A.-K., Aigouy, B., Giangrande, A. & Eaton, S. Imaging Drosophila pupal wing morphogenesis. Methods Mol. Biol. 420, 265–275 (2008).
61. Li, K. et al. Optimal surface segmentation in volumetric images—a graph-theoretic approach. IEEE Trans. Pattern Anal. Mach. Intell. 28, 119–134 (2006).
62. Wu, X. & Chen, D. Z. Optimal net surface problems with applications. In International Colloquium on Automata, Languages, and Programming (Springer, 2002).
63. Arganda-Carreras, I. et al. Trainable Weka Segmentation: a machine learning tool for microscopy pixel classification. Bioinformatics 33, 2424–2426 (2017).
64. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
65. Aigouy, B., Umetsu, D. & Eaton, S. Segmentation and quantitative analysis of epithelial tissues. In Drosophila: Methods and Protocols (ed. Dahmann, C.) 227–239 (Humana Press, New York, 2016).
66. Ke, M.-T., Fujimoto, S. & Imai, T. SeeDB: a simple and morphology-preserving optical clearing agent for neuronal circuit reconstruction. Nat. Neurosci. 16, 1154–1161 (2013).
67. Ivanova, A. et al. Age-dependent labeling and imaging of insulin secretory granules. Diabetes 62, 3687–3696 (2013).
68. Mchedlishvili, N. et al. Kinetochores accelerate centrosome separation to ensure faithful chromosome segregation. J. Cell Sci. 125, 906–918 (2012).
69. Lakshminarayanan, B., Pritzel, A. & Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems 30 (eds Guyon, I. et al.) 6402–6413 (Curran Associates, Red Hook, NY, 2017).
70. Guo, C., Pleiss, G., Sun, Y. & Weinberger, K. Q. On calibration of modern neural networks. In Proc. 34th International Conference on Machine Learning (ICML) (eds Precup, D. & Teh, Y. W.) 1321–1330 (PMLR, Cambridge, MA, 2017).



Corresponding author(s): Martin Weigert, Loic Royer, Florian Jug

Life Sciences Reporting Summary


Nature Research wishes to improve the reproducibility of the work that we publish. This form is intended for publication with all accepted life
science papers and provides structure for consistency and transparency in reporting. Every life science submission will use this form; some list
items might not apply to an individual manuscript, but all fields must be completed for clarity.
For further information on the points included in this form, see Reporting Life Sciences Research. For further information on Nature Research
policies, including our data availability policy, see Authors & Referees and the Editorial Policy Checklist.

Please do not complete any field with "not applicable" or n/a. Refer to the help text for what text to use if an item is not relevant to your study.
For final submission: please carefully check your responses for accuracy; you will not be able to make changes later.

Experimental design
1. Sample size
Describe how sample size was determined.
For each of the experiments we chose the sample size such that training and test images are representative of the variability seen across all developmental time-points.

2. Data exclusions
Describe any data exclusions.
No data was excluded from the manuscript.

3. Replication
Describe the measures taken to verify the reproducibility of the experimental findings.
For all experiments we provide the code and training data to retrain all used models and reproduce the findings.

4. Randomization
Describe how samples/organisms/participants were allocated into experimental groups.
For each experiment, the held-out test set was randomly selected from the corpus of data.
5. Blinding
Describe whether the investigators were blinded to group allocation during data collection and/or analysis.
For each experiment, the restoration models were optimized based on a random validation set that had no overlap with the held-out test set on which the final evaluation results were based.
Note: all in vivo studies must report how sample size was determined and whether blinding and randomization were used.

6. Statistical parameters
For all figures and tables that use statistical methods, confirm that the following items are present in relevant figure legends (or in the
Methods section if additional space is needed).


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement (animals, litters, cultures, etc.)
A description of how samples were collected, noting whether measurements were taken from distinct samples or whether the same
sample was measured repeatedly
A statement indicating how many times each experiment was replicated
The statistical test(s) used and whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.

A description of any assumptions or corrections, such as an adjustment for multiple comparisons


Test values indicating whether an effect is present

Provide confidence intervals or give results of significance tests (e.g. P values) as exact values whenever appropriate and with effect sizes noted.

A clear description of statistics including central tendency (e.g. median, mean) and variation (e.g. standard deviation, interquartile range)
Clearly defined error bars in all relevant figure captions (with explicit mention of central tendency and variation)
See the web collection on statistics for biologists for further resources and guidance.

Software

Policy information about availability of computer code
7. Software
Describe the software used to analyze the data in this study.
tensorflow (1.4.0), keras (2.0.0), python (3.5), pandas, Fiji/ImageJ, Adobe Illustrator (CS5), spimagine 0.2.5, iMovie (10.1.6), softWoRx Version 6.5.2, Huygens Professional 17.10.0, Andor iQ version 3.4.1. Training and application code is/will be made available on github and csbdeep.bioimagecomputing.com.

For manuscripts utilizing custom algorithms or software that are central to the paper but not yet described in the published literature, software must be made
available to editors and reviewers upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). Nature Methods guidance for
providing algorithms and software for publication provides further information on this topic.

Materials and reagents


Policy information about availability of materials
8. Materials availability
Indicate whether there are restrictions on availability of unique materials or if these materials are only available for distribution by a third party.
No unique materials/reagents were used.
9. Antibodies
Describe the antibodies used and how they were validated for use in the system under study (i.e. assay and species).
For all antibodies, as applicable, provide supplier name, catalog number, clone name, and lot number. Also describe the validation of each primary antibody for the species and application, noting any validation statements on the manufacturer's website, relevant citations, antibody profiles in online databases, or data provided in the manuscript OR state that no antibodies were used.

10. Eukaryotic cell lines

a. State the source of each eukaryotic cell line used.
- INS-1 cells donated from Claes Wohlheim (Geneva)
- HeLa cells, image data taken from Gustafsson, Nat. Commun., 2016

b. Describe the method of cell line authentication used.
None of the cell lines have been authenticated.

c. Report whether the cell lines were tested for mycoplasma contamination.
INS-1 cells were tested for mycoplasma contamination (negative).

d. If any of the cell lines used are listed in the database of commonly misidentified cell lines maintained by ICLAC, provide a scientific rationale for their use.
INS-1 cells are not listed in the database.

Animals and human research participants


Policy information about studies involving animals; when reporting animal research, follow the ARRIVE guidelines
11. Description of research animals
Provide all relevant details on animals and/or animal-derived materials used in the study.
- Schmidtea mediterranea, RedDot1 staining, asexual, clonal strain CIW4, 3 weeks after amputation
- Tribolium castaneum, EFA::nGFP transgenic line, unknown sex (embryos), age between 10-48 hrs
- Drosophila melanogaster (projection), Ecad::GFP transgenic line, male and female, between 16 and 26 hours APF (after puparium formation)
- Drosophila melanogaster (isotropic restoration), His2Av-mRFP1 line, timelapse of 2-4 hpf
- Danio rerio (isotropic restoration), bactin:eGFP-LAP2B transgenic line, 24 hpf, mixed sex (before sex determination)
- Mus musculus liver (isotropic restoration), fixed C57BL/6JOlaHsd mice, 9 weeks old, male
- The stable INS-1 cell line (restoration of diffraction-limited structures) was donated by Claes Wohlheim (Geneva) and transfected with pEG-hIns-SNAP.
- The HeLa H2B-mCherry/mEGFP-a-tubulin stable cell line (restoration of diffraction-limited structures) was kindly provided by Dr. Buzz Baum, MRC-LMCB, UCL (doi: 10.1242/jcs.091967).

Policy information about studies involving human research participants


12. Description of human research participants
Describe the covariate-relevant population characteristics of the human research participants.
The study did not involve human research participants.
