Methods in Molecular Biology 1362
Klaus Jung, Editor
Statistical Analysis in Proteomics
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:
http://www.springer.com/series/7651
Statistical Analysis in Proteomics
Edited by
Klaus Jung
Department of Medical Statistics, University Medical Center Göttingen,
Göttingen, Germany
Editor
Klaus Jung
Department of Medical Statistics
University Medical Center Göttingen
Göttingen, Germany
ISSN 1064-3745 ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-3105-7 ISBN 978-1-4939-3106-4 (eBook)
DOI 10.1007/978-1-4939-3106-4
Library of Congress Control Number: 2015952312
Springer New York Heidelberg Dordrecht London
© Springer Science+Business Media New York 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction
on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and
regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to
be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty,
express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Cover illustration: For the complete image, please see Figure 2 of Chapter 3.
Printed on acid-free paper
Humana Press is a brand of Springer
Springer Science+Business Media LLC New York is part of Springer Science+Business Media (www.springer.com)
Preface
Among the high-throughput technologies that are currently used in biomedical research,
those used in proteomics are perhaps the oldest. While mass spectrometry and 2-D gel
electrophoresis were already used in the 1980s for simultaneously measuring the abundance of
multiple proteins, statistical methods for the analysis of high-throughput data first
experienced their great evolution with the development of DNA microarrays in the
mid-1990s.
Although there is a large overlap between statistical methods for the different “omics”
fields, methods for analyzing data from proteomics experiments need their own specific
adaptations. Therefore, the aim of this book is to provide a collection of frequently used
statistical methods in the field of proteomics. This book is intended for statisticians who are
involved in the planning and analysis of proteomics experiments, beginners as well as
advanced researchers. It is also intended for biologists, biochemists, and medical researchers
who want to learn more about the statistical opportunities in the analysis of proteomics data.
The different chapters of this book focus on the planning of proteomics experiments,
the preprocessing and analysis of the data, the integration of proteomics data with other
high-throughput data, as well as some special topics. For statisticians who are new to the
area of proteomics, the first chapter provides a detailed overview of the laboratory tech-
niques used in this exciting research area.
Göttingen, Germany Klaus Jung
Contents
Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
PART I PROTEOMICS, STUDY DESIGN, AND DATA PROCESSING
1 Introduction to Proteomics Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Christof Lenz and Hassan Dihazi
2 Topics in Study Design and Analysis for Multistage Clinical
Proteomics Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Irene Sui Lan Zeng
3 Preprocessing and Analysis of LC-MS-Based
Proteomic Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Tsung-Heng Tsai, Minkun Wang, and Habtom W. Ressom
4 Normalization of Reverse Phase Protein Microarray Data:
Choosing the Best Normalization Analyte . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Antonella Chiechi
5 Outlier Detection for Mass Spectrometric Data . . . . . . . . . . . . . . . . . . . . . . . . 91
HyungJun Cho and Soo-Heang Eo
PART II GROUP COMPARISONS
6 Visualization and Differential Analysis of Protein Expression
Data Using R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Tomé S. Silva and Nadège Richard
7 False Discovery Rate Estimation in Proteomics . . . . . . . . . . . . . . . . . . . . . . . . 119
Suruchi Aggarwal and Amit Kumar Yadav
8 A Nonparametric Bayesian Model for Nested Clustering . . . . . . . . . . . . . . . . . 129
Juhee Lee, Peter Müller, Yitan Zhu, and Yuan Ji
9 Set-Based Test Procedures for the Functional Analysis
of Protein Lists from Differential Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Jochen Kruppa and Klaus Jung
PART III CLASSIFICATION METHODS
10 Classification of Samples with Order-Restricted Discriminant Rules . . . . . . . . . 159
David Conde, Miguel A. Fernández, Bonifacio Salvador,
and Cristina Rueda
11 Application of Discriminant Analysis and Cross-Validation
on Proteomics Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Julia Kuligowski, David Pérez-Guaita, and Guillermo Quintás
12 Protein Sequence Analysis by Proximities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Frank-Michael Schleif
PART IV DATA INTEGRATION
13 Statistical Method for Integrative Platform Analysis:
Application to Integration of Proteomic and Microarray Data . . . . . . . . . . . . . 199
Xin Gao
14 Data Fusion in Metabolomics and Proteomics for Biomarker Discovery. . . . . . 209
Lionel Blanchet and Agnieszka Smolinska
PART V SPECIAL TOPICS
15 Reconstruction of Protein Networks Using Reverse-Phase
Protein Array Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Silvia von der Heyde, Johanna Sonntag, Frank Kramer,
Christian Bender, Ulrike Korf, and Tim Beißbarth
16 Detection of Unknown Amino Acid Substitutions
Using Error-Tolerant Database Search. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Sven H. Giese, Franziska Zickmann, and Bernhard Y. Renard
17 Data Analysis Strategies for Protein Modification Identification . . . . . . . . . . . . 265
Yan Fu
18 Dissecting the iTRAQ Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Suruchi Aggarwal and Amit Kumar Yadav
19 Statistical Aspects in Proteomic Biomarker Discovery . . . . . . . . . . . . . . . . . . . 293
Klaus Jung
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Contributors
SURUCHI AGGARWAL • Immunology Group, International Centre for Genetic Engineering
and Biotechnology, New Delhi, India
TIM BEIßBARTH • Department of Medical Statistics, University Medical Center Göttingen,
Göttingen, Germany
CHRISTIAN BENDER • TRON-Translational Oncology at the University Medical Center
Mainz, Mainz, Germany
LIONEL BLANCHET • Analytical Chemistry-Chemometrics, Institute for Molecules
and Materials, Radboud University Nijmegen, Nijmegen, The Netherlands; Department
of Biochemistry, Nijmegen Centre for Molecular Life Sciences, Radboud University
Medical Centre, Nijmegen, The Netherlands
ANTONELLA CHIECHI • Department of Medicine, Indiana University School of Medicine,
Indianapolis, IN, USA
HYUNGJUN CHO • Department of Statistics, Korea University, Seoul, South Korea
DAVID CONDE • Departamento de Estadística e Investigación Operativa, Facultad de
Ciencias, Universidad de Valladolid, Valladolid, Spain
HASSAN DIHAZI • Clinic of Nephrology and Rheumatology, University Medical Center
Göttingen, Göttingen, Germany
SOO-HEANG EO • Department of Statistics, Korea University, Seoul, South Korea
MIGUEL A. FERNÁNDEZ • Departamento de Estadística e Investigación Operativa,
Facultad de Ciencias, Universidad de Valladolid, Valladolid, Spain
YAN FU • National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory
of Random Complex Structures and Data Science, Academy of Mathematics and Systems
Science, Chinese Academy of Sciences, Beijing, China
XIN GAO • Department of Mathematics and Statistics, York University, Toronto, ON, Canada
SVEN H. GIESE • Research Group Bioinformatics (NG4), Robert Koch-Institute, Berlin,
Germany; Department of Bioanalytics, Institute of Biotechnology, Technische Universität
Berlin, Berlin, Germany; Wellcome Trust Centre for Cell Biology, School of Biological
Sciences, University of Edinburgh, Edinburgh, UK
SILVIA VON DER HEYDE • Department of Medical Statistics, University Medical Center
Göttingen, Göttingen, Germany; IndivuTest GmbH, Hamburg, Germany
YUAN JI • Department of Health Studies, The University of Chicago, Chicago, IL, USA
KLAUS JUNG • Department of Medical Statistics, Georg-August-University Göttingen,
Göttingen, Germany
ULRIKE KORF • Division of Molecular Genome Analysis, German Cancer Research Center
(DKFZ), Heidelberg, Germany
FRANK KRAMER • Department of Medical Statistics, University Medical Center Göttingen,
Göttingen, Germany
JOCHEN KRUPPA • Department of Medical Statistics, Georg-August-University Göttingen,
Göttingen, Germany
JULIA KULIGOWSKI • Neonatal Research Centre, Health Research Institute La Fe,
Valencia, Spain
JUHEE LEE • Department of Applied Mathematics and Statistics, Santa Cruz, CA, USA
CHRISTOF LENZ • Bioanalytical Mass Spectrometry, Max Planck Institute for Biophysical
Chemistry, Göttingen, Germany; Core Facility Proteomics, Institute of Clinical Chemistry,
University Medical Center, Göttingen, Germany
PETER MÜLLER • Department of Mathematics, Austin, TX, USA
DAVID PÉREZ-GUAITA • Centre for Biospectroscopy, School of Chemistry, Monash University,
Clayton, Australia
GUILLERMO QUINTÁS • Safety and Sustainability Division, Leitat Technological Center,
Valencia, Spain; Analytical Unit, Health Research Institute La Fe, Valencia, Spain
BERNHARD Y. RENARD • Research Group Bioinformatics (NG4), Robert Koch-Institute,
Berlin, Germany
HABTOM W. RESSOM • Department of Oncology, Lombardi Comprehensive Cancer Center,
Georgetown University Medical Center, Washington, DC, USA
NADÈGE RICHARD • CCMAR, Centre of Marine Sciences of Algarve, University of Algarve,
Faro, Portugal
CRISTINA RUEDA • Departamento de Estadística e Investigación Operativa, Facultad de
Ciencias, Universidad de Valladolid, Valladolid, Spain
BONIFACIO SALVADOR • Departamento de Estadística e Investigación Operativa, Facultad
de Ciencias, Universidad de Valladolid, Valladolid, Spain
FRANK-MICHAEL SCHLEIF • School of Computer Science, University of Birmingham,
Edgbaston, Birmingham, UK
TOMÉ S. SILVA • SPAROS Lda., Olhão, Portugal
AGNIESZKA SMOLINSKA • Department of Toxicology, Nutrition and Toxicology Research
Institute Maastricht (NUTRIM), Maastricht University, Maastricht, The Netherlands
JOHANNA SONNTAG • Division of Molecular Genome Analysis, German Cancer Research
Center (DKFZ), Heidelberg, Germany
TSUNG-HENG TSAI • Department of Oncology, Lombardi Comprehensive Cancer Center,
Georgetown University Medical Center, Washington, DC, USA; Bradley Department
of Electrical and Computer Engineering, Virginia Tech, Arlington, VA, USA
MINKUN WANG • Department of Oncology, Lombardi Comprehensive Cancer Center,
Georgetown University Medical Center, Washington, DC, USA; Bradley Department
of Electrical and Computer Engineering, Virginia Tech, Arlington, VA, USA
AMIT KUMAR YADAV • Drug Discovery Research Center (DDRC), Translational Health
Science and Technology Institute, Faridabad, Haryana, India
IRENE SUI LAN ZENG • The Department of Statistics, The University of Auckland, Auckland,
New Zealand
YITAN ZHU • Program for Computational Genomics and Medicine Research Institute,
NorthShore University HealthSystem, Evanston, IL, USA
FRANZISKA ZICKMANN • Research Group Bioinformatics (NG4), Robert Koch-Institute,
Berlin, Germany
Part I
Proteomics, Study Design, and Data Processing
Chapter 1
Introduction to Proteomics Technologies
Christof Lenz and Hassan Dihazi
Abstract
Compared to genomics or transcriptomics, proteomics is often regarded as an “emerging technology,” i.e.,
as not having reached the same level of maturity. While the successful implementation of proteomics work-
flows and technology still requires significant levels of expertise and specialization, great strides have been
made to make the technology more powerful, streamlined and accessible. In 2014, two landmark studies
published the first draft versions of the human proteome.
We aim to provide an introduction specifically to the background of mass spectrometry (MS)-based
proteomics. Within the field, mass spectrometry has emerged as a core technology. Coupled to increasingly
powerful separations and to data processing and bioinformatics solutions, it allows the quantitative analysis of
whole proteomes within a matter of days, a timescale that has made global comparative proteome studies
feasible at last. We present and discuss the basic concepts behind proteomics mass spectrometry and the
accompanying topic of protein and peptide separations, with a focus on the properties of datasets emerging
from such studies.
Key words Proteomics, 2-DE, Electrophoresis, Mass spectrometry, Separations
1 Introduction
The term “proteomics” in its original meaning denotes the study
of the entire observable protein complement (or proteome) of a
biological system, be it a relatively homogeneous microbial cell
culture or a tissue sample obtained from a hospital patient. When
Marc Wilkins first coined the term “proteome” in 1994, however,
proteomics was a distant goal rather than a tangible technological
reality. Even the identification of a few tens of proteins would take
researchers weeks to months of work, let alone the assessment of
their quantities or modification status. Over the past 20 years how-
ever proteomics has grown from a promise into a mature set of
technologies that has allowed for example the publication of first
full draft versions of the human proteome in 2014 [1, 2]. Virtually
all aspects of proteome analysis have seen huge improvements,
from sample preparation, protein and peptide separations, detection
and quantitative analysis especially by mass spectrometry which has
emerged as a core proteomics technology, to the statistical and
bioinformatic analysis of the large and multilayered datasets that a
global “omics” approach produces.
Following technological progress, Tyers and Mann in 2003
redefined proteomics as “almost everything post-genomic: the
study of the proteome in any given cell and the set of all protein
isoforms and modifications, the interactions between them, the
structural description of proteins and their higher-order complexes”
[3]. While the genome of an organism is considered to be mostly
static, the proteome shows dynamic properties, with protein profiles
changing as a function of time and of a variety of extracellular and
intracellular stimuli (e.g., cell cycle, temperature, differentiation,
stress, apoptotic signals). The realization that the proteome is highly
dynamic in turn led to an increased demand for quantitative infor-
mation, as information about the detectability of a protein was
superseded by information about relative changes in its abundance,
modification status, localization, and interaction partners [4].
Finally, an increased appreciation of the complexity of the pro-
teome led to a refinement of our understanding of what defines a
protein. The seemingly simple concept of “DNA makes RNA
makes proteins” does not describe the observed complexity of pro-
teins in its entirety. While the huge success of genome sequence
projects over the past decades has certainly been a prerequisite for
the progress observed in proteomics [4], there is a plethora of
parameters defining the biological role of a protein that are not
determined by the gene(s) encoding its sequence, e.g., splicing
events, enzymatic processing, or posttranslational modifications
[5]. Consequently the term “protein species” is finding increased
use as it more accurately describes protein diversity [6, 7].
In addition, there is currently no amplification technology for
proteins comparable to PCR. The huge dynamic range observed
for protein quantities in biological samples immediately translates
into dynamic range requirements for any analytical approach to
proteomics samples, necessitating elaborate separation and enrich-
ment strategies to simplify biological specimens [8].
In this introduction we discuss some of the major technical
and experimental approaches that are taken in proteomics research
today, and discuss how the structure of the resulting data influ-
ences bioinformatics approaches to generate knowledge from these
data. Special focus is given to protein and peptide separations, and
to mass spectrometry which has emerged as a key protein detection
and quantitation technology in proteomics research.
2 Separation Technologies in Proteomics
2.1 Bottom-Up Versus Top-Down Proteomics
Separations are a central feature of all analytical strategies in proteomics. The proteins contained in any biological specimen may be separated and analyzed either on the intact protein level or on the
peptide level following endoproteinase digestion. Digestion to
peptides has many analytical benefits that have improved the per-
formance of proteomics workflows, especially if mass spectrometry
is used for detection. On the level of sample handling and separa-
tions, peptides generated by for example trypsin digestion of pro-
teins are a far more homogeneous group of analytes than the
underlying proteins with regard to molecular weight, hydropho-
bicity, and solvent solubility, since they mostly do not exhibit any
significant higher order structure. In addition, they show a much
more controlled charging behavior under controlled pH conditions,
and will in the majority of cases not be modified by, for example,
glycosylation. Consequently many peptide separations show
much higher resolution than protein separations, especially where
chromatography-based separations are concerned.
In addition, mass spectrometry as the most frequently used
detection principle in proteomics heavily favors peptides over pro-
teins. Peptides show a more uniform and efficient ionization and
charging behavior than proteins, produce better response on several
types of mass spectrometer detectors and, most importantly, can be
routinely fragmented by ion activation techniques to provide
sequence and structure information. Taken together the detection
of for example tryptic peptides in complex mixtures by modern
mass spectrometry equipment is orders of magnitude more sensi-
tive than the detection of proteins. Therefore the most common
approach in proteomics is to prepare and separate proteins, digest
them with endoproteinases, separate the resulting peptides yet
again, and analyze them for identity, modification state, and quan-
tity by mass spectrometry. In addition, enrichment strategies that
target low abundance subpopulations may be employed. This
so-called “bottom-up” approach comes with its own challenges:
digestion multiplies the number of analytes in the sample (e.g.,
2000 proteins will produce an average of >100,000 peptides on
tryptic digestion), and it is not always straightforward to back-
assign a digest peptide to the protein it originated from, a problem
referred to as “protein inference” [5]. Still, the benefits outweigh
these challenges by far, making “bottom-up” analysis the prevalent
approach in proteomics as compared to the “top-down” approach
where proteins are treated and analyzed in their intact state through-
out [9, 10]. After discussing options for protein and peptide separa-
tions, we will then focus on the “bottom-up” approach and the
principles applied to peptide analysis by mass spectrometry.
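For readers new to the field, the digestion step is easy to mimic in silico: trypsin cleaves C-terminal to lysine (K) and arginine (R), but generally not before proline (P). The following is a minimal sketch of such a digest (the length bounds reflect the typical 6–25 residue range of tryptic peptides mentioned later in this chapter; the example sequence and all parameters are illustrative):

```python
import re

def tryptic_digest(sequence, missed_cleavages=1, min_len=6, max_len=25):
    """In-silico trypsin digest: cleave after K/R unless followed by P."""
    # re.split on a zero-width pattern (Python >= 3.7) yields the fragments
    fragments = re.split(r'(?<=[KR])(?!P)', sequence)
    peptides = []
    for i in range(len(fragments)):
        for j in range(i, min(i + missed_cleavages + 1, len(fragments))):
            pep = ''.join(fragments[i:j + 1])
            if min_len <= len(pep) <= max_len:
                peptides.append(pep)
    return peptides

print(tryptic_digest("MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFK"))
```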
2.2 Protein Level Separations
2.2.1 Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE)
The approach most frequently taken for protein separation is still SDS-PAGE. Proteins are dissolved in a buffer containing sodium dodecyl sulfate (SDS), and the resulting negatively charged adducts are pulled through a gel of a defined polymerization degree (or pore size range) by electrophoretic migration. The separation is achieved according to the apparent molecular weight, or rather the hydrodynamic radius of the resulting protein-SDS adducts [11].
SDS-PAGE is compatible with a very wide range of protein solubi-
lization and sample handling requirements, making it a very good
choice for the separation of for example very hydrophobic integral
membrane proteins. After staining with Coomassie or silver stain-
ing, entire lanes covering a broad range of apparent molecular
weight can be investigated. Depending on the scientific task, only
proteins from defined MW regions can be investigated, and results
can easily be correlated with for example Western blot analysis.
One of the shortcomings of SDS-PAGE as a one-dimensional sep-
aration is its limited resolution, which does not allow the detection
and separation of more than a few tens of bands at best. Consequently it has
to be combined with other separation strategies either on the pro-
tein or—after endoproteinase digestion—on the peptide level to
successfully analyze complex proteome samples.
2.2.2 Two-Dimensional Gel Electrophoresis (2-DE)
High-resolution two-dimensional polyacrylamide gel electrophoresis (2D PAGE) is a commonly applied separation technique in
proteomics, and has been one of its driving forces for decades
[12, 13]. 2D PAGE allows the separation of proteins according to
two largely orthogonal parameters (Fig. 1): their isoelectric point
(pI) and their apparent molecular weight (Mr), enabling the sepa-
ration of complex protein mixtures and their visualization on a
single high-resolution gel [14–17]. Depending on the gel size and
pH gradient used, 2D PAGE can resolve up to 5000 different pro-
teins simultaneously and detect and quantify <1 ng of protein/
spot [17]. 2D PAGE can thus be used to generate protein expres-
sion profiles from different samples, e.g., healthy versus diseased,
Fig. 1 2-DE reference maps of tissue extract proteins. 150 μg protein was loaded on an 11-cm IPG strip with a
linear pH gradient pI 5–8 for isoelectric focusing; a 12 % SDS-polyacrylamide gel was used for the SDS-PAGE.
Proteins were stained with fluorescent stain
Proteomics Technologies 7
knockout versus wild type. Sample solubilization is a critical step
for the reproducibility of the 2D PAGE to get as many proteins
solubilized as possible in reproducible manner, to disrupt (in most
cases) their non-covalent bonds, and to obtain them in a defined
charge state without modification of the polypeptide [17].
Following solubilization, isoelectric focusing (IEF) is used as the
first level of protein separation. IEF is very sensitive to charge alter-
ations and therefore to sample contaminations, e.g., by salts that
may alter the protein charge. The second dimension of separation
is an SDS-PAGE where the proteins are separated according to
their apparent molecular weight. Finally, the separated proteins are
then visualized using a staining technique. Several staining methods
are commonly used: (1) Coomassie Blue is used to visualize proteins
separated by 2D PAGE but suffers from low sensitivity. (2) Silver
staining is more sensitive than Coomassie Blue but involves a com-
plex multistep staining protocol, which limits gel-to-gel reproduc-
ibility [18]. (3) Fluorescence-based staining is highly sensitive and
has a wide linear range of detection between staining intensity and
protein volume, enabling accurate quantitation of high and low
abundance proteins [19, 20]. Fluorescence technologies also offer
the possibility of multicolor labeling and detection. If samples
have to be compared, they can be labeled with different dyes cor-
responding to different excitation and emission wavelengths, mixed
and separated on a single gel allowing a differential analysis of pro-
tein expression [21–24]. After the staining step, gel images are cap-
tured and the resulting 2D maps are analyzed using dedicated
image processing software. Spots of interest are then excised and
processed for identification by mass spectrometry.
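The pI used in the first dimension can itself be estimated from a protein's sequence. The sketch below bisects the pH at which the net Henderson-Hasselbalch charge crosses zero; the pKa table is an assumption (published sets differ), so computed values are only approximate:

```python
# Assumed pKa values; published sets differ, so computed pI values are approximate
PKA = {'D': 3.65, 'E': 4.25, 'C': 8.3, 'Y': 10.1, 'H': 6.0,
       'K': 10.5, 'R': 12.5, 'N_term': 9.0, 'C_term': 2.0}

def net_charge(seq, pH):
    """Net charge from Henderson-Hasselbalch terms for all ionizable groups."""
    charge = 1.0 / (1.0 + 10 ** (pH - PKA['N_term']))    # N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA['C_term'] - pH))   # C-terminus
    for aa in seq:
        if aa in 'HKR':     # basic side chains
            charge += 1.0 / (1.0 + 10 ** (pH - PKA[aa]))
        elif aa in 'DECY':  # acidic side chains
            charge -= 1.0 / (1.0 + 10 ** (PKA[aa] - pH))
    return charge

def isoelectric_point(seq, tol=1e-4):
    """Bisection: the net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if net_charge(seq, mid) > 0 else (lo, mid)
    return round((lo + hi) / 2.0, 2)

print(isoelectric_point("MKWVTFISLLFLFSSAYSR"))
```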
There are several limitations to 2D PAGE as a separation
method for proteomic studies. For example, hydrophobic proteins
hardly enter the gel and are often lost during 2D PAGE, limiting
its use for the analysis of, for example, integral membrane proteins.
Very high or very low molecular weight proteins, highly acidic or
highly basic proteins may also be lost during gel separation. Due to
the often limited staining sensitivity, 2D PAGE also requires rela-
tively large amounts of protein.
In addition, 2D PAGE involves many manual processing steps
and is therefore not easily automated. Moreover, the heterogeneity
of cell types in tissue samples makes their analysis with 2D gels a
challenge. Due to these limitations, separation techniques such as
multidimensional liquid chromatography and capillary electrophoresis
upstream of the mass spectrometer offer solid alternatives that can
overcome the limitations of 2D PAGE.
The first separation step in 2D PAGE, i.e., isoelectric focusing
(IEF), is sometimes also used as a standalone method to fraction-
ate complex protein samples, and has been commercialized for
example in the OFFGEL system. When used for protein
separations it suffers from the same limitations as mentioned above,
and is therefore much more widely used as another dimension of
peptide separation (pIEF) downstream [25].
2.2.3 Chromatography- As an alternative to electrophoretic separations such as SDS-PAGE
Based Protein Separations or 2D PAGE, a wide range of chromatographic separation
approaches have been established to separate and purify intact pro-
teins, and have found their way into proteomics research [26, 27].
The separation is most frequently based on one of the three major
physicochemical properties that describe proteins: hydrophobicity
(reversed phase chromatography), charge (ion exchange chroma-
tography), and molecular weight (size exclusion chromatography).
In addition, affinity for example by noncovalent protein–protein
interactions can be used as a separation principle [28–34].
Chromatography-based separations are relatively straightfor-
ward to scale up and automate, and are therefore especially suited
for multi-stage separation workflows. Most separation principles
suffer from a limited range of available buffer conditions, which
only allows a focus on subgroups of proteins, for example in a certain
molecular weight or charge region, and makes them less suitable for
generic proteomics approaches.
principles are mostly employed where the enrichment or purifica-
tion of a single protein or class of proteins is desired. Size exclusion
chromatography (SEC) of proteins has emerged as a suitable pre-
fractionation method for generic proteomics approaches as it is
compatible with SDS-containing buffers that allow the solubiliza-
tion of hydrophobic as well as hydrophilic proteins. It is inherently
of low chromatographic resolution, but can be used to great suc-
cess in multidimensional separation approaches as it has a high
loading capacity.
2.2.4 Chips and Arrays
In addition to the classical separation methods, array technology provides an ideal tool to study enriched subsets of proteins or protein domains. Various protein array technologies have emerged over the last decades that promise rapid examination of different samples on a proteome scale, offering better perspectives for proteomics. Antibody-based arrays are highly promising; in this case, the antibodies are immobilized on a specially treated array surface. The samples of interest are then applied to the arrays, and only the proteins that bind to the relevant antibodies remain bound to the chip, where they can be analyzed and quantified. The immobilized molecules can also be peptides or other small molecules [35–37]. Readouts for protein-based arrays can derive from protein interactions, protein modifications, or enzymatic activities. The quality of the immobilized molecule, e.g., the antibody, is critical for the readout of the system. Once developed, such an array can provide a convenient means of proteome analysis.
2.3 Peptide Level Separations
2.3.1 Chromatography-Based Peptide Separations
There is a multitude of available peptide separation approaches that can be used to simplify the hundreds of thousands of peptides produced by enzymatic digestion of a complex protein sample. Similar to proteins, peptides may be separated according to a range
of physicochemical properties, such as hydrophobicity, charging at
defined pH, or polarity. One prerequisite has driven the development
of peptide separations for proteomics over the past years, i.e.,
that the last chromatographic separation step should be readily
coupled to mass spectrometry to allow for highly automated LC-
MS/MS analyses. This makes several demands on an ideal separa-
tion strategy: it should work with volatile buffer systems at low
flow rates that do not interfere with the mass spectrometer’s ion-
ization process. In addition, it should be readily miniaturizable as
the sensitivity of the electrospray ionization process used in most
of today’s proteomics mass spectrometers is concentration-
dependent, i.e., flow rates in the low nanoliter/min regime are
highly desirable [38, 39]. Finally, this separation should be highly
resolving for, for example, tryptic peptides, which typically have a
length of 6–25 amino acid residues. All these requirements are best
met by capillary diameter (50–75 μm) reversed phase-C18 chro-
matography under acidic conditions, e.g., with volatile formic or
acetic acid buffer systems, at corresponding flow rates of 150–400
nl/min [40]. Indeed this chromatography regime seems to present
a “sweet spot” when coupled to mass spectrometry. Together with
improvements in chromatography materials, high-pressure liquid
chromatography hardware and the use of long columns, many pro-
teomics workflows today use this as the only separation step at all,
an approach referred to as “single-shot proteomics” [41, 42].
In many cases however it can still be beneficial to add another
dimension of chromatographic separation to the overall workflow
to achieve greater simplification of the sample prior to for example
mass spectrometric analysis. Any second chromatographic dimen-
sion preceding the final reversed phase separation hyphenated to
the mass spectrometer only or offline should ideally be highly
orthogonal to the latter, i.e., separate peptide analytes by a differ-
ent physicochemical principle to ensure efficient separation, and be
readily integrated into the overall workflow with regard to buffer
systems without causing need for, for example, additional desalting
or concentration steps. Examples of first dimension separations
that are frequently used in proteomics research include strong cat-
ion exchange chromatography or reversed phase chromatography
at neutral pH [43].
2.3.2 Electrophoresis-Based Peptide Separations
Several gel-free separation techniques have found their way into proteomics; among them, capillary electrophoresis (CE) is a rapid and efficient technique used to separate a variety of compounds, including proteins. In CE, analytes are driven by an electric field through an electrolyte solution and are separated according to their ion mobility. The
advantage of this separation method is that it requires a low sample
load [44, 45]. Coupled to mass spectrometry as the detection/identification
method, CE becomes an attractive separation method
in proteome analysis. The online coupling of CE and MS offers an
interesting alternative to 2D PAGE and to common chromatographic
separation techniques: protein mixtures can be analyzed
within a short time and with high resolution [46].
3 Mass Spectrometry-Based Proteomics
Mass spectrometry (MS) has emerged as a key technology in pro-
teomics as it presents the most versatile high sensitivity detection
system for peptide and protein analysis today. Contrary to for
example antibody-based detection, MS is unbiased in principle,
although mass spectrometry response is greatly influenced by the
physicochemical properties of peptides and proteins. In addition,
there are a number of different mass spectrometry techniques or
“flavors” that will be described in the next paragraphs.
3.1 Ionization Techniques
Mass spectrometry involves the manipulation of ionized peptides and proteins under high vacuum conditions. Consequently, ways
have to be found to get rid of the solvent and adduct shells that
usually surround these analytes in any solution, and transfer
charge(s) onto them in a controlled and reproducible fashion
without inducing analyte decomposition. Two so-called “soft”
ionization techniques have emerged over the past 25 years that
allow for this largely non-destructive transfer from solution to the
gas phase: Matrix Assisted Laser Desorption/Ionization (MALDI)
and Electrospray Ionization (ESI). To recognize the almost revo-
lutionary contribution that these techniques have had on the Life
Sciences, the Nobel Prize in Chemistry 2002 was awarded to key
inventors John Fenn (ESI) and Koichi Tanaka (MALDI) jointly
with Kurt Wüthrich (for NMR).
3.1.1 Matrix Assisted Laser Desorption/Ionization (MALDI)
In MALDI (Matrix Assisted Laser Desorption/Ionization), peptides or proteins are mixed in solution with a large excess (roughly 10^4-fold) of a small, UV-absorbing organic molecule, the so-called
matrix. Microliter volumes of this solution are then deposited on a
flat target made of conductive material, and the droplet dried by
slow evaporation. Under suitable conditions, co-crystallization
occurs where the analyte molecules are embedded in matrix crys-
tals. The sample plate with the dried spots is then introduced into
the vacuum chamber of the mass spectrometer, where it is irradi-
ated with short nanosecond pulses of UV laser light. In the result-
ing process of rapid desorption and ionization, positively charged
analyte ions are formed which can then be extracted from the
source region using electrostatic fields and further manipulated for
mass analysis [47–50].
During the MALDI process, peptides and proteins are ionized
mainly as singly charged [M+H]+ species. This introduces both
benefits and limitations: on the one hand, MALDI mass spectra are
usually straightforward to interpret, as in most cases each observed
signal corresponds to a single analyte. On the other hand, this neces-
sitates the use of mass analyzers with a large “mass range” (more
precisely: m/z range) as high molecular weight ions translate into
high m/z signals. In addition, singly charged biomolecules are in
many cases more problematic with regard to sequence analysis since
the repulsion between multiple charges on the same molecule is one
of the driving forces for efficient fragmentation. Finally, MALDI is
a discontinuous technique where usually several hundred to thou-
sand individual laser shot experiments have to be accumulated to
obtain high quality spectra. Even at the kHz laser frequencies avail-
able in modern instrumentation, this makes the process of, for
example, peptide sequencing slow compared to ESI-based instru-
mentation. Moreover, MALDI cannot be directly hyphenated to chro-
matographic separations. While the latter limitation can be
moderated by offline coupling (“LC-MALDI”), the combination
of limiting factors has led to a decrease in the use of MALDI-based
mass spectrometers in proteomics research. It still finds significant
use in defined applications that require rapid fingerprinting from a
non-separated sample, e.g., for microbial identification [51].
3.1.2 Electrospray Ionization (ESI)
ESI (Electrospray Ionization) today is the standard ionization technique in proteomics research. For ESI, a volume or stream of an
aqueous analyte solution usually containing organic modifiers is
sprayed from a sharp (μm diameter) needle tip towards the orifice
(i.e., the entry to the vacuum section) of a mass spectrometer. The
process is driven by application of a kV electrostatic potential dif-
ferential between the needle and the orifice and happens at atmo-
spheric pressure, making ESI an instance of the larger group of
ionization techniques referred to as Atmospheric Pressure
Ionization, or API. The thin liquid filament produced from the
needle is quickly broken up into small droplets containing a small
number of analyte ions preformed in solution. Through a combina-
tion of electrostatic repulsion (leading to “Coulomb explosions”
that break droplets apart) and evaporation of solvent molecules,
droplets of diminishing size that contain fewer and fewer analyte ions
are produced, until finally single analyte ions are released either
through droplet shrinking (“charge residue model”) or by emission
from highly charged droplets containing other analyte molecules
(“ion evaporation model”). The produced analyte ions usually con-
tain two to five charges for peptide analytes (e.g., [M + 2H]2+), or
tens of charges in the case of intact protein analytes [38, 52–54].
The higher charging observed in ESI compared to MALDI has
both advantages and disadvantages. Multiple charges compress the
m/z range required from the mass analyzer, since, for example, for
peptides produced by trypsin digestion the majority of m/z values
observed fall into the range of 350–1250. In addition, multiple
charges on an analyte help drive fragmentation through charge
repulsion or are actually (in the case of Electron Transfer
Dissociation, or ETD) a prerequisite for some fragmentation tech-
niques. In addition, multiple charge states (or m/z values) of the
same analyte provide multiple readouts of the analyte’s mass and
thus potentially more accurate mass determinations. On the down-
side the presence of multiple charge states for each analyte in a
complex mixture requires algorithms to properly assign (“decon-
volute”) these charge states, and often complicates spectra. The
main benefit of ESI as a continuous ionization technique is that it
is readily hyphenated to chromatographic or electrophoretic sepa-
rations, providing a readout of the separation eluent in real time.
Provided that the mass analyzer is fast enough to perform sequenc-
ing events at sufficient speed this leads to a very high sequencing
capacity of the resulting hyphenated LC-ESI-MS setups.
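The relation between neutral mass, charge, and observed m/z described here is simple arithmetic, as the following sketch illustrates (the masses and spacings are illustrative values):

```python
PROTON = 1.007276  # proton mass in Da

def mz(neutral_mass, z):
    """m/z of an [M + zH]z+ ion."""
    return (neutral_mass + z * PROTON) / z

def deconvolute(observed_mz, z):
    """Recover the neutral mass M from an observed m/z and charge state."""
    return z * (observed_mz - PROTON)

# A hypothetical 1600 Da tryptic peptide observed at several charge states:
for z in (1, 2, 3):
    print(f"z={z}: m/z {mz(1600.0, z):.4f}")

# The charge state itself can be read from the isotope spacing, since adjacent
# isotopes of a z+ ion are 1/z apart on the m/z axis:
def charge_from_isotope_spacing(delta_mz):
    return round(1.0 / delta_mz)

print(charge_from_isotope_spacing(0.5))  # spacing of 0.5 -> doubly charged
```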
3.2 Mass Analyzers
Following ionization, peptides and proteins of different mass and
charge are separated in the vacuum region of the mass spectrometer
by their mass-to-charge (m/z) ratio and detected. The m/z separa-
tion by different mass analyzers follows very different physical prin-
ciples. Their performance can be characterized by the following
parameters: (1) m/z range (or “mass range”), i.e., the range of m/z
values for which ions can be transmitted at all; (2) transmission, i.e.,
the percentage of ions successfully transmitted through the mass
analyzer in a given mode of operation. Transmission is invariably
dependent on m/z value; (3) resolution, i.e., the ability to separate
ions of similar m/z. Today, the most common definition used for
resolution is the m/z value of a peak divided by its width at half
height (FWHM, Full Width at Half Maximum); (4) mass accuracy, i.e.,
the deviation of observed m/z values from their theoretically
expected values, which is usually specified in parts per million
(ppm). In this section we focus on the most common analyzer types
used in proteomics mass spectrometry, and discuss their features
and benefits rather than principles of operation.
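As a worked illustration of the resolution and mass accuracy definitions above (the numbers are illustrative, not specifications of any particular instrument):

```latex
\[
R = \frac{m/z}{\Delta (m/z)_{\mathrm{FWHM}}} = \frac{800}{0.02} = 40\,000
\]
\[
\delta = \frac{(m/z)_{\mathrm{obs}} - (m/z)_{\mathrm{calc}}}{(m/z)_{\mathrm{calc}}} \times 10^{6}
       = \frac{800.0024 - 800.0000}{800.0000} \times 10^{6} = 3~\mathrm{ppm}
\]
```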
Quadrupole and quadrupole ion trap mass analyzers are inher-
ently low resolution, low mass accuracy analyzers which are often
operated at “unit” resolution, i.e., a constant peak width of ~0.7
m/z units (FWHM) that translates into resolution values of 500–1500 for typi-
cal peptide peaks in the range of m/z 400–1000. In addition, they
are relatively slow when operated in scanning mode, i.e., when
covering a wide m/z range. To make up for this low resolution
they possess excellent transmission characteristics with transmis-
sion values in excess of 90 % for wide m/z ranges. Consequently
they are often used to filter for specific ions, e.g., when selecting
for MS/MS precursors, or for manipulating ion packages, e.g.,
when used as collision cells for inducing MS/MS fragmentation
(see below) [55].
Time-of-Flight (ToF) mass analyzers are of moderate resolu-
tion (10,000–40,000 FWHM) and exhibit mass accuracies in the
range of 5–25 ppm with frequent calibration. To achieve good
resolution, ions are usually accelerated in a direction orthogonal to
their initial motion, and reflected on a so-called reflectron, or mir-
ror stage, before hitting the detector. As a consequence of orthog-
onal acceleration and reflecting the ion beam, transmission is
usually low, on the order of a few percent. The low transmission is
partially recovered by the high speed of acquisition [56]. Modern
Time-of-Flight analyzers operate at frequencies of up to 5 kHz,
i.e., 5000 individual experiments per second. Even when these are
accumulated before writing the data to disk, acquisition speeds of
up to 100 spectra written to disk per second can be obtained. Through data accu-
mulation the signal-to-noise ratio can be improved even at weak
absolute signal strength. Its discontinuous mode of operation
makes Time-of-Flight the perfect match for the equally discontinu-
ous MALDI. Indeed, MALDI-ToF mass spectrometers were one
of the first high resolution instrument classes introduced into pro-
teomics research [57, 58]. Today, however, ESI-ToF mass spec-
trometers are as common.
Orbitrap mass analyzers are high resolution (15,000–140,000
FWHM), high accuracy (0.5–5 ppm) analyzers that have almost
become a standard in proteomics mass spectrometry. Ions are
introduced into a small spindle-shaped electrostatic cell, and the
image current from their axial motion is recorded in a
non-destructive fashion. From the observed frequency transient,
the m/z spectrum is then calculated by Fourier Transformation.
As for the similarly operated Fourier Transform-Ion Cyclotron
Resonance (FT-ICR) mass analyzers, mass resolution is a function
of transient duration and decays towards higher m/z values, so practical
resolution values obtained are similar to those obtained for ToF
instruments [59, 60]. The Orbitrap mass analyzer does not require
frequent recalibration, making it a very good choice for instruments
operated in high throughput environments.
3.3 Tandem Mass Spectrometry (MS/MS)
Proteomics samples are highly complex mixtures of very similar analytes. Following the most commonly employed bottom-up approach that involves tryptic digestion, a sample containing for example 2000 protein species will produce an estimated 100,000 peptides on digestion [publication Matthias Mann]. Consequently
it is not enough to determine the accurate mass of a digest peptide
to unambiguously determine its identity. Even when combined
with chromatographic retention time information, an accurate
mass tag (AMT) will only serve to identify a tryptic peptide in pro-
teomes of limited size, and only when information about for exam-
ple posttranslational modifications is excluded [Lit]. In most cases,
information about the peptide’s sequence has to be obtained
within the mass spectrometer to allow for unambiguous
identification. This usually requires tandem mass spectrometry,
i.e., the use of two mass analyzers in combination with an event
causing sequence-specific degradation of the peptide.
3.3.1 Product Ion Scanning
The most common tandem mass spectrometry implementation is the product ion scan. A peptide ion of defined m/z value is filtered
from the whole population of ions using a first MS stage, often
achieved using a quadrupole mass filter. This isolated precursor ion
is then fragmented in the mass spectrometer to produce sequence-
specific ions, which are then separated by their m/z and detected
in a second stage mass analyzer, e.g., a ToF or an Orbitrap. Each
peptide is thus characterized by its time of introduction to the MS
(i.e., its retention time when the MS is coupled to a chromatographic
separation), its precursor m/z value, and a set of fragment m/z
values. In the positive ion mode and when using suitable fragmen-
tation techniques (see below), peptides fortunately produce a
defined set of largely sequence-specific fragments which can be
denominated using a system devised by Roepstorff and Fohlman as
early as 1984 [61, 62].
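These sequence-specific fragments can be computed directly from monoisotopic residue masses: in the Roepstorff-Fohlman nomenclature, b ions retain the N-terminus and y ions the C-terminus. A minimal sketch for singly charged b/y series (the residue mass table is truncated for brevity):

```python
PROTON, WATER = 1.007276, 18.010565
RESIDUE = {  # monoisotopic residue masses in Da (truncated table)
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'T': 101.04768, 'L': 113.08406, 'I': 113.08406,
    'N': 114.04293, 'D': 115.02694, 'K': 128.09496, 'E': 129.04259,
    'F': 147.06841, 'R': 156.10111, 'Y': 163.06333,
}

def by_ions(peptide):
    """Singly charged b- and y-ion m/z series for an unmodified peptide."""
    masses = [RESIDUE[aa] for aa in peptide]
    b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y = [sum(masses[i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b, y[::-1]  # y series is conventionally numbered from the C-terminus

b_series, y_series = by_ions("PEPTIDE")
print([round(x, 4) for x in b_series])
print([round(x, 4) for x in y_series])
```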
3.3.2 Precursor Ion and Constant Neutral Loss Scanning
In product ion scanning, all fragments derived from a single precursor are recorded. In some instances it can also be useful to alternatively record all precursors producing a single fragment, or
marker ion, e.g., when this is predictive for a structural feature
such as a posttranslational modification. For these so-called precursor
ion scans, the first stage mass analyzer is scanned across the precur-
sor m/z range while the second stage mass analyzer is set to a fixed
m/z to filter for the marker ion. Precursor ion scans have been
successfully employed to screen for, for example, phosphorylated
or glycosylated peptide precursor ions in complex mixtures using
either Triple Quadrupole (QqQ) or Quadrupole-Time-of Flight
(QqToF) mass spectrometers [63, 64].
A related experiment is the Constant Neutral Loss Scan, where
both mass analyzers are scanned simultaneously but at an m/z off-
set to detect precursors specifically losing neutral molecules indi-
cating for example phosphorylation. Neither precursor ion
scanning nor constant neutral loss scanning are much used in pro-
teomics studies today, since specific detection of, for example,
phosphopeptides may be achieved much more efficiently by
affinity enrichment.
3.3.3 Data-Dependent Versus Data-Independent Acquisition
In a typical proteomics mass spectrometry experiment, in excess of 100,000 peptide precursors need to be sequenced in a few hours of mass spectrometer acquisition time. Consequently the selection
and sequencing of peptide precursors has to be a fully automated
process, with the required sequencing speeds being on the order of
25 peptides/s [65]. Modern mass spectrometers achieve this
through Data Dependent Acquisition (DDA) routines
implemented in their acquisition software. In DDA, the mass spectrometer
first performs an MS scan to detect all peptide precursors
coming from the ion source at a specific time. Up to 25 suitable
precursors are then identified using criteria such as intensity, charge state,
and m/z, and sequentially submitted to a corresponding number
of product ion scans for obtaining sequence-specific fragmenta-
tion. Once finished, another cycle is started with the next MS scan
[66, 67]. Current instrumentation is capable of sequencing speeds
of ten product ion spectra per second, producing a capacity of up
to 36,000 sequencing events/h. As not all sequencing events are
successful and nonredundant, around five to six peptide
identifications/s of acquisition time represent the current state of
the art [65, 68, 69].
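A Top-N selection cycle as described can be sketched in a few lines; `acquire_ms2` is a hypothetical stand-in for the instrument control layer, and the filtering criteria (intensity ranking, minimum charge, dynamic exclusion) follow the description above:

```python
def acquire_ms2(precursor_mz):
    # Hypothetical instrument call triggering a product ion scan
    print(f"MS/MS on precursor m/z {precursor_mz:.4f}")

def dda_cycle(survey_scan, exclusion, top_n=25, min_charge=2):
    """One DDA cycle: pick up to top_n precursors from a survey (MS1) scan by
    intensity, requiring multiply charged ions and skipping excluded m/z."""
    candidates = [p for p in survey_scan
                  if p["z"] >= min_charge and p["mz"] not in exclusion]
    selected = sorted(candidates, key=lambda p: p["intensity"], reverse=True)[:top_n]
    for p in selected:
        exclusion.add(p["mz"])  # dynamic exclusion: do not resequence immediately
        acquire_ms2(p["mz"])
    return selected

survey = [{"mz": 523.7726, "z": 2, "intensity": 3.2e6},
          {"mz": 785.8402, "z": 1, "intensity": 9.9e6},  # rejected: singly charged
          {"mz": 644.3281, "z": 3, "intensity": 1.1e6}]
dda_cycle(survey, exclusion=set())
```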
The discrepancy between the required and the achieved
sequencing speeds and the resulting undersampling of complex
samples has prompted researchers and instrument manufacturers
to look for fundamentally different data acquisition strategies espe-
cially for reproducible quantitative comparison of large numbers of
samples.
If undersampling in DDA renders the detection and quantita-
tive analysis of analytes of interest irreproducible, one alternative is
to forego a dynamic selection of peptide precursors and rather tar-
get sets of peptides that carry the desired information, e.g., about
the quantity of a set of proteins. Selected Reaction Monitoring
(SRM, also frequently called Multiple Reaction Monitoring, or
MRM) on Triple Quadrupole Mass Spectrometers is the most
popular targeted acquisition strategy. In SRM, the two quadrupole
mass analyzers of the spectrometer are set to preprogrammed fixed
m/z values that filter for a combination of a peptide precursor and,
after fragmentation in a collision cell, a sequence-specific fragment.
While this so-called transition does not carry full spectral informa-
tion, it can be seen as a highly specific detection channel for these
peptides. Several hundred of these channels can be monitored
sequentially in a single LC-ESI-SRM experiment to provide quan-
titative information on dozens of peptides of interest [70].
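An SRM method is essentially a list of preprogrammed (precursor m/z, fragment m/z) pairs. A minimal sketch of how such a transition list might be represented and cycled through (all names, m/z values, and dwell times are illustrative; `read_intensity` is a hypothetical instrument readout):

```python
from dataclasses import dataclass

@dataclass
class Transition:
    peptide: str
    q1_mz: float   # precursor m/z filtered by the first quadrupole
    q3_mz: float   # fragment m/z filtered by the third quadrupole
    dwell_ms: int  # time spent monitoring this channel

method = [Transition("ELVISLIVESK", 613.85, 854.51, 20),
          Transition("ELVISLIVESK", 613.85, 755.44, 20)]

def srm_cycle(method, read_intensity):
    """One pass over all transitions, returning one reading per channel."""
    return {(t.q1_mz, t.q3_mz): read_intensity(t) for t in method}

print(srm_cycle(method, read_intensity=lambda t: 0.0))  # dummy readout
```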
Targeted mass spectrometry methods require upfront knowl-
edge and specification of the analytes of interest, and are limited by
the number of transitions that can be monitored in a single experi-
ment. Newer developments in Data-Independent Acquisition
(DIA), e.g., SWATH acquisition (Sequential Window Acquisition of
All Theoretical Fragment Ion Spectra) [71], allow the simultane-
ous detection and quantitation of a principally unlimited number
of analytes in a single LC-ESI-MS experiment. All peptide precur-
sors undergo fragmentation at less stringent filtering, and traces for
sequence-specific fragments contained in a previously obtained
spectral library are extracted from the data to provide quantitative
information. From a single experiment, tens of thousands of fragment ion
traces can be extracted that allow consistent quantitation of for
example 2500 proteins derived from 15,000 peptides from S. cere-
visiae [72]. For brevity we refer the reader to the literature for
details of the implementation.
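In essence, the DIA readout amounts to extracting an intensity trace over retention time for each library fragment m/z from the less stringently filtered MS/MS spectra. A minimal sketch, assuming spectra represented as (retention time, peak list) tuples and an illustrative extraction tolerance:

```python
TOL = 0.02  # extraction tolerance in m/z units (illustrative)

def extract_traces(dia_spectra, library_fragment_mzs):
    """dia_spectra: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns one (rt, summed intensity) trace per library fragment m/z."""
    traces = {frag: [] for frag in library_fragment_mzs}
    for rt, peaks in dia_spectra:
        for frag in library_fragment_mzs:
            # Sum all signal within the tolerance window around the fragment m/z
            signal = sum(i for mz, i in peaks if abs(mz - frag) <= TOL)
            traces[frag].append((rt, signal))
    return traces

spectra = [(20.1, [(400.20, 1e4), (512.28, 3e3)]),
           (20.2, [(400.21, 2e4), (512.29, 6e3)])]
print(extract_traces(spectra, [400.20, 512.28]))
```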
3.3.4 Ion Activation
All MS/MS experiments and approaches require techniques for a controlled and reproducible activation of precursor
ions to obtain structure-specific decomposition in the mass spec-
trometer’s vacuum [73]. While there are a multitude of techniques
for ion activation available, only a handful of them are suitable for
the large scale analysis of peptides for proteomics.
Collision-Induced Dissociation (or CID, sometimes referred
to as Collisionally Activated Dissociation, or CAD) is a so-called
ergodic, even-electron ion activation technique where excess vibra-
tional energy is deposited in peptide precursors through multiple
collisions with small neutral gas molecules, e.g., nitrogen, in a
collision cell of defined gas pressure in the mass spectrometer, lead-
ing to eventual breaking of covalent bonds. CID is by far the most
commonly used ion activation technique in proteomics mass spec-
trometry, and is highly reproducible even across different instru-
mental platforms and laboratories. It provides excellent sequence
information especially on non-modified peptides generated from
the trypsination of proteins for bottom-up proteome analysis [74,
75]. For large peptide precursors or peptides carrying labile modi-
fications, e.g., glycosylation, it often produces only limited
sequence information. Electron Transfer Dissociation (or ETD,
related to the less-often applied Electron Capture Dissociation, or
ECD) can be used as an alternative or even complementary ion
activation technique in these cases. In ETD, a single, odd electron
is transferred from a reactant gas onto the peptide precursor in the
mass spectrometer. The resulting odd-electron fragmentation
mechanisms are quite different from those produced in CID [76],
e.g., labile modifications are often retained on peptides. ETD
requires the peptide precursor to be higher charged (n ≥ 3) for effi-
cient fragmentation though, making it a better match for larger
peptides produced using enzymes other than trypsin, or even for
small intact proteins [77].
3.4 Analysis of MS and MS/MS Data
3.4.1 Peptide Identification
How can the information contained in LC-ESI-MS data sets obtained from complex peptide mixtures be exploited? Each fragmented peptide precursor is characterized by (1) its retention time, (2) its intact mass-to-charge ratio, (3) its charge state, which can in most cases be deduced from the isotopic pattern, and (4) a set of
more or less structure-specific fragment ions. A typical LC-ESI-MS
data set will today encompass in excess of 100,000 such precursor
“feature sets.”
In the early days of peptide mass spectrometry, sequences were
often derived from MS/MS fragment ion patterns by de novo
sequencing [78]. By reading out amino acid-specific mass
differences between ions of either C- or N-terminal fragment ion
series, partial stretches of a peptide’s sequence can in many cases be
derived from the spectrum. By combining several such stretches
and information about for example the presence of individual
amino acids or the C-terminal amino acid which may be derived
from individual marker ions, the complete sequence of a peptide
can be obtained in select cases. The process is highly error-prone
though and hampered by incomplete fragmentation, overlay of
different ion series, or additional non-sequence-specific fragmenta-
tion events. It is therefore usually used as a last resort in cases
where other approaches fail, e.g., in the case of proteins from
organisms which are poorly covered in genome and proteome
sequence databases. Related to full-blown de novo sequencing is the
peptide sequence tag approach [79], in which a short sequence tag of
as little as three to four consecutive amino acids, together with
the flanking masses required to add up to the peptide’s full mass,
is often sufficient for unambiguous identification of the peptide
sequence in a full proteome sequence database. As with de novo
sequencing, the approach is still relatively error-prone and
computationally expensive.
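To illustrate the mass-difference readout underlying both de novo
sequencing and sequence tags, the sketch below (simplified to singly
charged, unmodified y ions; the masses are standard monoisotopic
values and the example spectrum is constructed, not data from this
chapter) converts consecutive fragment m/z differences into a
partial sequence:

# Monoisotopic amino acid residue masses in Da (abridged table;
# note that L and I are isobaric and cannot be distinguished here)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "D": 115.02694, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}

def read_tag(y_ions, tol=0.02):
    """Read a partial sequence tag from an ascending series of singly
    charged y-ion m/z values: each difference between neighbors should
    equal one residue mass. Unassignable gaps are reported as '?'."""
    tag = []
    for lo, hi in zip(y_ions, y_ions[1:]):
        gap = hi - lo
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) < tol]
        tag.append(hits[0] if hits else "?")
    return "".join(tag)

# y1-y4 of the tryptic peptide ...L-E-V-K, read from the C terminus inward:
print(read_tag([147.113, 246.181, 375.224, 488.308]))  # -> "VEL"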
Today, protein identification in the majority of cases is achieved
by Peptide Fragment Fingerprint (PFF) matching. Here the set of
fragments characterizing a peptide precursor is not interpreted at
all, but is pattern-matched against fragment patterns predicted in
silico for peptides generated from a theoretical digest of all pro-
teins in a protein sequence database. Each match is then scored
based on the agreement between the observed and the predicted
pattern. In the most commonly used probabilistic approach, the
score reflects the chances of a random assignment against the back-
ground of the whole database. PFF matching is implemented in a
significant number of both academic and commercial algorithms,
or database search engines, such as SEQUEST [80], Mascot [81],
OMSSA [82], Paragon [83], or Andromeda [84].
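The core of PFF matching can be caricatured in a few lines: predict
the b- and y-ion masses of each candidate peptide from the in silico
digest and count how many are found in the observed spectrum; the
engines cited above then convert such agreement into a probabilistic
score against the whole database. The following toy scorer is our own
sketch, not the algorithm of any of those engines:

PROTON, WATER = 1.007276, 18.010565
RESIDUE = {"L": 113.08406, "E": 129.04259, "V": 99.06841,
           "K": 128.09496, "D": 115.02694}  # monoisotopic Da, abridged

def theoretical_by_ions(peptide):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [RESIDUE[aa] for aa in peptide]
    total = sum(masses)
    ions, prefix = [], 0.0
    for m in masses[:-1]:
        prefix += m
        ions.append(prefix + PROTON)                  # b ion
        ions.append(total - prefix + WATER + PROTON)  # complementary y ion
    return ions

def match_count(spectrum, peptide, tol=0.02):
    """Score = number of predicted ions present in the spectrum."""
    return sum(any(abs(obs - ion) < tol for obs in spectrum)
               for ion in theoretical_by_ions(peptide))

# Rank two candidates from a theoretical digest against one spectrum:
spectrum = [147.113, 246.181, 375.224, 488.308]
for candidate in ("LEVK", "LDVK"):
    print(candidate, match_count(spectrum, candidate))  # LEVK 3, LDVK 2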
In the case of peptide modifications, the exact position of the
modification on the peptide’s primary sequence may be as important
as its presence itself, e.g., in phosphorylation, where peptides may
contain more than one serine, threonine, or tyrosine residue that
can be phosphorylated. This so-called site
localization problem can also be addressed, often by comparing
the search engine scores obtained for different theoretically pres-
ent positional modification isomers of the same primary peptide
sequence and deriving a metascore. The most popular implemen-
tations of this concept are the AScore [85], the MASCOT Delta
Score [86], and phosphoRS [87].
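The idea behind these site-localization metascores can be sketched
as follows: place the modification on each candidate residue in
turn, score every positional isomer against the spectrum, and report
the gap between the two best scores. The sketch below is a toy
stand-in loosely in the spirit of the Mascot Delta Score, not any of
the published implementations; masses and the spectrum are
illustrative:

PROTON, WATER, PHOSPHO = 1.007276, 18.010565, 79.96633  # Da
RESIDUE = {"S": 87.03203, "A": 71.03711, "T": 101.04768,
           "E": 129.04259, "K": 128.09496}

def isomer_score(spectrum, peptide, site, tol=0.02):
    """Matched singly charged y ions for the isomer phosphorylated at
    index `site` (toy scorer; real tools use richer fragment models)."""
    masses = [RESIDUE[aa] for aa in peptide]
    masses[site] += PHOSPHO
    matched, suffix = 0, 0.0
    for m in reversed(masses[1:]):          # y1 .. y(n-1)
        suffix += m
        matched += any(abs(obs - (suffix + WATER + PROTON)) < tol
                       for obs in spectrum)
    return matched

def site_delta(spectrum, peptide):
    """Metascore: best minus second-best positional isomer score."""
    sites = [i for i, aa in enumerate(peptide) if aa in "STY"]
    scores = sorted((isomer_score(spectrum, peptide, i) for i in sites),
                    reverse=True)
    return scores[0] - scores[1]

# This spectrum supports phosphorylation on the threonine rather than
# the serine of S-A-T-E-K:
print(site_delta([147.113, 276.155, 457.169, 528.207], "SATEK"))  # -> 2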
All scoring-based approaches for peptide identification or site
localization suffer from the presence of false positive/negative
identifications, a fact that is easily recognized when different search
algorithms are compared against one another. Individual scores
cannot be validated per se, except by comparison with results
obtained on synthetic standards, a concept that is prohibitively
expensive for global analyses. Ways must therefore be found to
estimate the validity of results on the basis of the whole ensemble.
This can be achieved in two ways. The most widely taken approach
is based on the estimation of False Discovery Rates [88]. The
sequence database used for Peptide Fragment Fingerprint match-
ing is extended by (or concatenated with) sequences generated
through, for example, scrambling or reversing the individual protein
sequences. Sequence reversal is usually preferred as it will not
change the amino acid composition, the number of available
trypsin cleavage sites or the overall length distribution of the result-
ing tryptic peptides. When the ensemble of fragment ion spectra is
searched against the resulting forward/reverse database, all hits
recorded against the reverse part are considered random, with the
same number of random matches expected in the forward part of the
database, allowing a False Discovery Rate (FDR) to be estimated. The
resulting lists of forward and reverse matches can be used to
truncate the results list to a specified FDR level, both on the
peptide and on the protein level.
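A minimal sketch of this target-decoy estimate (PSM = peptide-
spectrum match; the list layout and the threshold in the example are
illustrative, and tools differ in the exact FDR formula they apply):

def fdr_truncate(psms, max_fdr=0.01):
    """psms: iterable of (score, is_decoy) pairs, higher score = better.
    Walk down the ranked list, estimate the FDR at each depth as
    decoys/targets, and keep target PSMs above the deepest score at
    which the estimate still satisfies max_fdr.
    Conventions vary: some tools use (decoys + 1) / targets or
    2 * decoys / (targets + decoys) instead."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        decoys += is_decoy
        targets += not is_decoy
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    if cutoff is None:
        return []
    return [(s, d) for s, d in ranked if not d and s >= cutoff]

# Six PSMs, two of them decoy hits; accept at an (illustrative) 34% FDR:
psms = [(98, False), (95, False), (90, True), (88, False), (70, False), (65, True)]
print([s for s, _ in fdr_truncate(psms, max_fdr=0.34)])  # [98, 95, 88, 70]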
An alternative relies on a semi-supervised machine learning scheme
that uses both high-scoring peptide-spectrum matches (“positive
PSMs”) and negative PSMs obtained against shuffled protein sequence
databases to derive a model that improves the differentiation
between correct and false positives. This scheme is implemented in
the Percolator algorithm, which has been adopted by a number of
database search pipelines [89].
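A single-iteration sketch of this semi-supervised idea using
scikit-learn (a simplification for illustration only: Percolator
itself iterates such a step with cross-validation and a linear SVM;
the feature layout and the quantile threshold are our own
assumptions):

import numpy as np
from sklearn.linear_model import LogisticRegression

def rescore_psms(features, is_decoy, search_score, quantile=0.9):
    """features: (n_psms, n_features) array of PSM properties such as
    search score, precursor mass error, or missed cleavages.
    Train on confident targets vs. all decoys, then rescore every PSM."""
    is_decoy = np.asarray(is_decoy, dtype=bool)
    search_score = np.asarray(search_score, dtype=float)
    # Provisional positives: target PSMs scoring above most decoy hits
    threshold = np.quantile(search_score[is_decoy], quantile)
    positive = ~is_decoy & (search_score > threshold)
    train = positive | is_decoy
    model = LogisticRegression(max_iter=1000)
    model.fit(features[train], positive[train])
    return model.decision_function(features)  # new discriminant score

In the full algorithm, the positive set is redefined from the learned
scores over several iterations, with cross-validation to avoid
overfitting to the decoy distribution.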
3.4.2 Protein Inference, In-Depth Proteomics, and Quantitation
Another challenge in bottom-up proteomics is that even the correct
identification of a peptide sequence does not necessarily lead to
correct identification of a protein, or even its functional state.
Peptide sequences may be conserved across whole families of pro-
teins or different splice isoforms; function might be mediated by
single or multiple posttranslational modifications, e.g., phosphory-
lation cascades in the case of cell signaling; and finally, most proteins do
not function in isolation, but rather in the context of for example
protein–protein complexes. What is more, single or even multiple
experimentally validated peptide sequences cannot necessarily be
linked to a single set of genes coding for a protein, making the cor-
relation of genomics, transcriptomics and proteomics data chal-
lenging [5–7]. It is therefore of utmost importance not only to
identify and quantitate all functionally relevant structural features
of a single proteoform in each experiment, but also to follow
changes across different cell compartments, functional states or
isoforms, and to do so with a number of biological and technical
replicates that allows their visualization on a basis of statistical
significance.
These requirements have several consequences. The first is the
implementation of algorithms that derive the most plausible set of
protein properties (e.g., identity, modification state, and quantity)
from an observed set of peptide properties. The approach followed
by most algorithms—and implemented in all relevant commercial
and academic software packages—follows the principle of Occam’s
Razor: to find and use the most concise explanation that accounts for all
relevant observations. While this approach is widely accepted in the
community, researchers should still be aware that a list of protein
identification or quantitation results may actually represent more
proteoforms than is apparent, and any mechanism or software
implementation used to communicate and discuss proteomics data
should allow mining multilayered data of this type.
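A common operationalization of this Occam’s Razor principle is a
greedy minimal set cover over the peptide-to-protein mapping,
sketched below with hypothetical identifiers (real implementations
add protein grouping, shared-peptide weighting, and FDR control on
top of this core idea):

def parsimonious_proteins(peptide_to_proteins):
    """Greedy Occam's Razor: repeatedly choose the protein that explains
    the largest number of still-unexplained peptides."""
    unexplained = set(peptide_to_proteins)
    chosen = []
    while unexplained:
        coverage = {}
        for pep in unexplained:
            for prot in peptide_to_proteins[pep]:
                coverage[prot] = coverage.get(prot, 0) + 1
        best = max(coverage, key=coverage.get)
        chosen.append(best)
        unexplained = {p for p in unexplained
                       if best not in peptide_to_proteins[p]}
    return chosen

peptides = {
    "LEVK":  {"P1"},        # unique to P1
    "TFDEK": {"P1", "P2"},  # shared; P1 already explains it
    "GYSAR": {"P3"},        # unique to P3
}
print(parsimonious_proteins(peptides))  # ['P1', 'P3'], P2 is not invoked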
Second, there is still a need to improve proteomics workflows
further so that they provide the highest possible amount of infor-
mation with moderate effort regarding sample preparation and
instrument time, and at high technical reproducibility. The ideal
workflow should provide full information (sequence coverage,
modification state, quantity) about all proteoforms [6] in the sam-
ple, not require more than a few hours of instrument time to allow
acquisition and analysis of relevant numbers of biological and tech-
nical replicates for improved statistical significance, and involve
as few sample preparation and fractionation steps as possible, since
these are potential sources of non-reproducibility. This trend
toward what is often
referred to as in-depth proteomics has been a significant driver of
both mass spectrometer technology and proteomics workflow
development over the past years [8].
Finally, it has been realized that all successful proteomics
experiments need to involve suitable strategies for quantitation and
quantitative standardization. If a protein’s concentration is just
above detection level in state A, and just below detection level in
state B of a biological system, this might reflect small changes in
the efficiency of sample preparation or instrument performance on
a given day as much as its actual concentration. The often-used
Venn diagrams that represent the sets of peptides or proteins either
detected or not detected in the different states are thus a reflection
of analytical reproducibility rather than of biological meaning.
Quantitative experimentation should include direct information
about either relative concentration changes, or absolute informa-
tion about protein concentration in relation to the attainable limits
of detection and quantitation.
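As a sketch of this last requirement, the following helper (the
thresholds, field names, and message texts are our own illustration)
reports a relative change only when both measurements sit safely
above the limit of quantitation, instead of reducing them to
Venn-style presence/absence calls:

import math

def relative_change(intensity_a, intensity_b, loq):
    """Log2 fold change between states A and B, qualified by the
    limit of quantitation (LOQ) of the assay."""
    above_a, above_b = intensity_a >= loq, intensity_b >= loq
    if above_a and above_b:
        return f"log2(A/B) = {math.log2(intensity_a / intensity_b):+.2f}"
    if above_a or above_b:
        return "one state below LOQ: direction only, magnitude unreliable"
    return "below LOQ in both states: not quantifiable"

print(relative_change(2.0e6, 5.0e5, loq=1.0e5))  # log2(A/B) = +2.00
print(relative_change(1.2e5, 6.0e4, loq=1.0e5))  # the 'Venn diagram' trap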
3.4.3 Quantitation from MS and MS/MS Data
Which properties of proteins acquired in a proteomics experiment
can be used for quantitation? And what are practical strategies for
the introduction of either relative or absolute quantitation stan-
dards? If gel staining techniques are used for detecting and resolv-
ing the different proteins in a sample then the quantitation can be
decoupled from the identification or characterization of the pro-
tein in question, which has significant implications for the work-
flow. For example, if 2DE is used to visualize and quantitate