The Syntactic Aspect of Information

As we saw in chapter 2, the myriad of structures and functions in the
biosphere can be traced back to a universal concept of information
that has its roots at the level of molecules. Admittedly, we have used
the concept "information" in an intuitive way, as a synonym for
"that which appears planned." But the existence of a molecular
symbol language and its semantic fixation in the genetic code
suggest strongly that the concepts of information theory may be
applicable here. Our task in this chapter will be to express the
concept of biological information more precisely and to discuss it
in relation to existing information theory.

In the discussion to follow, it will be important to remember that
"information, even considered as a concrete object, occupies a
special intermediate position between the natural sciences and the
humanities."23 This is seen particularly clearly in the case of
biological information. Everything about biological structures that
works "according to plan," that is, everything that is controlled by
information, has a "meaning" and a "significance" in the context of
the functional order that we find in living systems. This means that
biological information is associated with defined semantics. When
in the rest of this chapter we refer to "biological" information, it will
be precisely this semantic aspect of information that is meant. A
theory of the origin of life must therefore necessarily include a
theory of the origin of semantic information. But precisely here lies
the difficulty; traditionally, the natural sciences have never taken
account of semantic phenomena. Michael Polanyi has expressed
this with particular force: "All objects conveying information are
irreducible to the laws of physics and chemistry."24 This statement
is the basal thesis upon which Polanyi attempted to justify his
assertion of the irreducibility of biological phenomena (this will be
discussed more fully in chapter 7).

The central question in connection with the problem of the origin
of life is therefore that of how far, if at all, the concept of semantic
information can be made objective and thus become the subject
matter of a mechanistically oriented science.
What is information? "Information is only that which is understood,"
says Carl Friedrich von Weizsäcker.25 This simple statement
underlines the general fact that we can only speak of information
when it has both a sender and a recipient. Information can only be
represented and conveyed, whether in acoustic, optical, or other
form, by the use of characters or letters. Characters that have a
meaning we shall here call "symbols"; their recognition assumes a
prearranged semantic agreement between sender and recipient.
This idea will be expanded in the following paragraphs.

We regard the symbols as elementary, nondissectable units of
information. The symbols thus fix one semantic level as a microstate,
while the various combinations of the symbols each define a certain
semantic level as a macrostate. Information thus exists only with
reference to two semantic levels that behave toward one another as
micro- and macrostate (chapter 4).

"Information" has no absolute meaning; it exists only with respect
to an idea or, more accurately, "relatively, between two semantic
levels."26 These two semantic levels are presupposed as necessary
common communication structures for mutual understanding,
without which the meaningful exchange of information cannot take
place.

The concept of information that we have introduced possesses
three dimensions:27

a. the syntactic dimension, which comprises relationships between
the individual characters,

b. the semantic dimension, which comprises relationships between
the individual characters and what they stand for, and

c. the pragmatic dimension, which comprises relationships between
the individual characters, what they stand for, and what action this
implies for the sender and the recipient.

In accordance with this definition, the pragmatic aspect of information
contains a semantic part, and this in turn a syntactic part.
Conversely, syntactic information is meaningless unless the recipient
is already in possession of semantic information. The identification
of a character as a "symbol" presupposes, as mentioned above,
certain prior knowledge in the form of an agreement between
sender and recipient. Moreover, semantic information is unthinkable
without pragmatic information, because the recognition of semantics
as semantics must cause some kind of reaction from the recipient.
The resolution of the concept of information into a syntactic, a
semantic, and a pragmatic dimension is therefore only justifiable in
the interest of simple representation.

In this section we shall consider these three dimensions separately,
in the order of increasing complexity. Thus we turn first to the
syntactic aspect of information, then to the semantic (chapter 4),
and finally to the pragmatic (chapter 5).

According to this definition, the syntactic dimension covers the
relation of characters to each other. It is the centerpiece of
"classical" information theory, as put forward by Claude Shannon
and Warren Weaver.28 We also refer to this as Shannon information
theory.

Shannon information theory is basically a theory of communications.
It deals with certain problems of telecommunications that
arise in connection with the storage, conversion, and transmission
of characters and character sequences. The semantic aspect of
information, expressed in the "meaning" of a message, is completely
ignored in Shannon information theory. This has been expressed
aptly by Weaver: "...two messages, one heavily loaded with meaning
and the other pure nonsense, can be equivalent as regards
information."29 In this regard, Shannon information theory is a
structural science; that is to say, "It studies structures in abstracto,
independently of what objects have these structures, and indeed of
whether such objects exist."30

In the foreground of Shannon information theory, therefore,
stands the naked problem of communication: that of conveying
correctly, within certain limits of fluctuation, a given sequence of
characters from the sender to the recipient. Since these characters
will in general encode a message, we can also call them symbols.

We shall assemble some basic concepts of Shannon information
theory that are relevant to many problems in biology. In order to
make an information-processing operation accessible for mathematical
analysis, we need a quantitative measure of the amount of
information that is contained in a sequence of symbols. Shannon
information theory shows that it is useful to measure the quantity
of information in a message by the number of symbols required to
formulate the message in the shortest possible way. (Instead of a
"message" we can also in general speak of an "event.") While any
system of symbols can be used for coding, it has been found
practical in the technical realization of information-processing
systems to use the binary code. In the binary system, every message
consists of a sequence of zeros and ones. The unit of information,
the "bit" (binary digit), is the amount of information in a binary
decision, the information that specifies whether a symbol shall
adopt the value "0" or "1."
If a set of messages consists of N different messages, all equally
probable, then we need

H = ld(N)    (ld = logarithm to base 2)    (1)

binary decisions in order to select a particular message from them
(figure 10a). The quantity H is also called the decision content of a
set of messages.31
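
As a small illustration of equation (1), here is a minimal sketch in Python (the function name and the use of the standard math module are our choices, not part of the text): it computes the decision content H = ld(N) of a set of N equally probable messages and confirms that eight messages, as in figure 10a, require three binary decisions.

```python
import math

def decision_content(n_messages: int) -> float:
    """Decision content H = ld(N): the number of binary (either-or)
    decisions needed to single out one of N equally probable messages."""
    return math.log2(n_messages)

# Eight equally probable messages, as in figure 10a: H = ld(8) = 3 bits.
print(decision_content(8))   # 3.0
# Doubling the number of messages adds exactly one binary decision.
print(decision_content(16))  # 4.0
```
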
Up to now, we have assumed that all messages have the same
prior probability of occurrence. If this is not the case, then the set
of messages {x_1, ..., x_N} can possess any distribution of probabilities
P = {p(x_1), ..., p(x_N)}. If this distribution also fulfils the condition
Σ_i p(x_i) = 1, then generalization of equation (1) leads to the
conclusion that the choice of a particular message, whose prior
probability is p_k = p(x_k), now requires

I_k = -ld(p_k)    (2)

binary steps (figure 10b). Shannon and Weaver, and independently
of them Norbert Wiener, have named I_k the information content
of a message x_k with prior probability p_k.32,33

The quantification of information, for example as attempted in
equation (2), contains as an essential starting point some prior
knowledge on the part of the recipient that is characterized by the
probability distribution P. This prior knowledge is naturally a
subjective property of the recipient, and the probabilities in
equation (2) are thus subjective and related to the recipient. In the
context of Shannon information theory there is therefore no
information in an absolute sense.

Figure 10  Decision-tree for the choice of a single message out of a set of
eight messages. (a) All messages are equally probable. Any individual message
can be chosen by a series of three either-or decisions. (b) The messages A to H
have different probabilities, with these relative values (frequencies of
occurrence): p(A) = p(B) = 1/4; p(C) = p(D) = 1/8; p(E) = p(F) = p(G) = p(H) = 1/16.
The set is subdivided into groups containing messages of equal probability. To
choose message A, only two binary decisions are needed. The same is true for
message B. For each of messages C and D, three decisions are needed; for the
remainder, four. A step to the left is assigned the binary character "1" and a
step to the right the character "0" (the code words read off the figure are
111, 110, 101, 100, 011, 010, 001, 000 in panel (a) and 11, 10, 011, 010, 0011,
0010, 0001, 0000 in panel (b)). The code that results thus allots the fewest
binary characters (bits) to the most frequent messages. This principle is also
employed by languages: frequently used words are on average shorter than
rarely used ones.
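
To make equation (2) concrete, the short Python sketch below (the helper name is ours; the probability values are those of figure 10b) computes the information content I_k = -ld(p_k) of each message. The results reproduce the code lengths of 2, 3, and 4 bits allotted in the figure.

```python
import math

def information_content(p: float) -> float:
    """Information content (novelty value) I = -ld(p) of a message
    whose prior probability, for the recipient, is p; equation (2)."""
    return -math.log2(p)

# Prior probabilities of the messages A..H from figure 10b.
priors = {"A": 1/4, "B": 1/4, "C": 1/8, "D": 1/8,
          "E": 1/16, "F": 1/16, "G": 1/16, "H": 1/16}

for msg, p in priors.items():
    print(f"{msg}: p = {p:<7} I = {information_content(p):.0f} bits")
# A and B require 2 binary decisions, C and D require 3, E..H require 4,
# exactly the code lengths assigned in figure 10b.
```
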
That the foregoing definition of the information content of a
message is meaningful is shown by the properties of the function I_k.
A message x_k with a prior probability p_k = 1 cannot tell the
recipient anything new, and according to the Shannon formula it
has an information content of

I_k = -ld(1) = 0    (3)

The information content of a message is thus the greater, the less
probable its arrival was:

I_1 > I_2  for  p_1 < p_2    (4)

Finally, the information content of a message also possesses the
additive property expected of a quantitative measure; that is, for
the simultaneous choice of two independent messages with prior
probabilities p_1 and p_2 the quantity of information needed is

I_12 = -ld(p_1 p_2) = -ld(p_1) - ld(p_2) = I_1 + I_2    (5)

For the special case in which p_1 = ... = p_N = 1/N, equation (2)
reduces to the simpler equation (1).

If we posit the existence of N messages {x_1, ..., x_N} with prior
probabilities {p_1, ..., p_N}, where Σ_i p_i = 1, the expectation value H
of the information content of a single message is given by

H = Σ_k p_k I_k = -Σ_k p_k ld(p_k)    (6)

Shannon called this quantity entropy, by analogy with Boltzmann's
H-function. The entropy function H displays a number of interesting
properties, of which two basic ones should be mentioned
(figure 11):34

1. The entropy function H adopts a value of zero when one of the
probabilities has a value of p_k = 1.

2. The entropy function H attains a maximum when all the
probability values p_i are equal.

Much confusion has arisen in connection with the sign of the
Shannon entropy. The quantity defined by equation (2) gives the
novelty value of the message x_k, while the entropy in Shannon's sense
is the expectation value of the novelty content of the message.
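
The two properties of the entropy function can be checked numerically. The sketch below (Python; the helper name shannon_entropy is ours) implements equation (6) and evaluates it for a deterministic distribution, a uniform one, and the distribution of figure 10b.

```python
import math

def shannon_entropy(probs) -> float:
    """Entropy H = -sum_k p_k * ld(p_k) of a distribution, equation (6);
    terms with p_k = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Property 1: if one message is certain (p_k = 1), H = 0.
print(shannon_entropy([1.0, 0.0, 0.0, 0.0]))                # 0.0

# Property 2: H is maximal when all probabilities are equal.
print(shannon_entropy([0.25] * 4))                          # 2.0 bits
print(shannon_entropy([0.7, 0.1, 0.1, 0.1]))                # ~1.36 bits, smaller

# The distribution of figure 10b: H equals the average code length used there.
print(shannon_entropy([1/4, 1/4, 1/8, 1/8] + [1/16] * 4))   # 2.75 bits
```
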

Figure 11  Entropy of a source of information. The set of messages consists of
two independent messages x_1 and x_2 with respective prior probabilities p_1
and p_2 = 1 - p_1. The expectation value of the information content of a message
is then, according to equation (6): H = -p_1 ld(p_1) - (1 - p_1) ld(1 - p_1). In
the case of even distribution, p_1 = p_2 = 1/2, the entropy H reaches its
maximum; that is, the uncertainty in the choice of a message is greatest. For
p_1 = 0 and p_2 = 1, H becomes equal to zero, since a state of probability
p_2 = 1 has no alternative and no further information is needed to make a
choice. The same applies for p_1 = 1, p_2 = 0. (The figure plots H [bits]
against p_1.)
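
The curve of figure 11 follows directly from equation (6). A brief Python sketch (our own; no plotting, just a few sample points) evaluates H = -p_1 ld(p_1) - (1 - p_1) ld(1 - p_1) and shows the maximum at p_1 = 1/2 and the zeros at the boundaries.

```python
import math

def binary_entropy(p1: float) -> float:
    """H(p1) = -p1*ld(p1) - (1 - p1)*ld(1 - p1) for a two-message source,
    with the usual convention that a zero-probability term contributes 0."""
    return -sum(p * math.log2(p) for p in (p1, 1.0 - p1) if p > 0)

for p1 in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"p1 = {p1:4.2f}  H = {binary_entropy(p1):.3f} bits")
# Maximum H = 1 bit at p1 = 0.5; H = 0 at p1 = 0 and p1 = 1.
```
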

Thus, according to Shannon, information and entropy have the
same sign, i.e.,

information = entropy

But it is common to equate information with knowledge and
entropy with lack of knowledge, following Léon Brillouin, so that

information = negentropy

which leads to an apparent contradiction.35 However, as von
Weizsäcker has pointed out, this is only a conceptual or verbal
unclarity, which is disposed of when one distinguishes between
actual and potential information, that is, information that one
already possesses and information that one will first possess when
the next signal has arrived.36 Thus Shannon entropy, the expectation
value for the novelty content of a message, is to be understood
as potential information and as having the same sign as Boltzmann
entropy.
The phrase "potential information" expresses a connection with
the future: the information that can be obtained from an observation.

We will now describe more closely the idea of a "gain" in
information. We start with a set of messages {x_1, ..., x_N} with the
probability distribution P = {p_1, ..., p_N} and the constraints
Σ_i p_i = 1 and p_i > 0. If, by observation or by the discovery of
further constraints, we become able to replace the prior distribution
P by the posterior distribution Q = {q_1, ..., q_N} (with Σ_i q_i = 1 and
q_i ≥ 0), then the pure change in entropy, given by

H(P) - H(Q) = Σ_k p_k I(p_k) - Σ_k q_k I(q_k)    (7)

is not a correct measure of the gain in information, since the
difference H(P) - H(Q) can be either positive or negative. In
addition, this quantity is the difference between two average values
and tells us little about the gain of information in each individual
case.
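
A quick numerical check (Python; the distributions are arbitrary examples of ours, not taken from the text) shows why equation (7) fails as a measure of gain: depending on the posterior distribution, the difference H(P) - H(Q) comes out positive or negative.

```python
import math

def shannon_entropy(probs) -> float:
    """H = -sum_k p_k * ld(p_k), equation (6)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

prior = [0.5, 0.25, 0.25]

# An observation that sharpens the distribution lowers the entropy ...
sharper = [0.9, 0.05, 0.05]
print(shannon_entropy(prior) - shannon_entropy(sharper))   # positive

# ... while one that flattens it raises the entropy again.
flatter = [1/3, 1/3, 1/3]
print(shannon_entropy(prior) - shannon_entropy(flatter))   # negative
```
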
The concept of gain in information can be brought into sharper
focus by consideration not of the difference between two average
values [equation (7)] but, instead, of the change in information
content for each individual message, after which a new average is
calculated. This quantity is an average of differences, and it reflects
much better the gain in information in single events than does the
difference between averages defined in equation (7). This consideration
prompted Alfréd Rényi to define the gain in information as
follows:37

H(Q|P) = Σ_k q_k [I(p_k) - I(q_k)] = Σ_k q_k ld(q_k/p_k)    (8)

In contradistinction to Shannon entropy, this quantity fulfils the
important inequality

H(Q|P) ≥ 0    (9)

in which the equality holds when and only when the distributions
Q and P are identical, i.e., the observation has not modified but
merely confirmed the distribution (see also chapter 5).

The concept of information thus introduced can be greatly
generalized with the help of the mathematical concept of semi-order
and, based on it, the theory of mixing character. The
excursions of Shannon and Rényi then appear as special cases of
the general theory.38
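
Equations (8) and (9) are easy to verify numerically. The sketch below (Python; the function name is ours, and in present-day terminology the quantity H(Q|P) is usually called the relative entropy or Kullback-Leibler divergence of Q with respect to P) computes the Rényi information gain for the same example distributions used above and illustrates that it is never negative, vanishing only when Q merely confirms P.

```python
import math

def information_gain(q, p) -> float:
    """Renyi information gain H(Q|P) = sum_k q_k * ld(q_k / p_k),
    equation (8); terms with q_k = 0 contribute nothing."""
    return sum(qk * math.log2(qk / pk) for qk, pk in zip(q, p) if qk > 0)

prior = [0.5, 0.25, 0.25]

print(information_gain([0.9, 0.05, 0.05], prior))  # > 0
print(information_gain([1/3, 1/3, 1/3], prior))    # > 0, unlike equation (7)
print(information_gain(prior, prior))              # 0.0: Q merely confirms P
```
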
A characteristic of Shannon information theory is that it always
refers to an ensemble of possible events and analyzes the uncertainty
with which the occurrence of these events is associated. In
recent years, algorithmic information theory has been developed as
an alternative to this. Here, a measure for the information content
of individual symbol sequences is defined, without the postulates of
their incorporation into an ensemble of all possible sequences and
of a corresponding probability distribution.39 Algorithmic information
theory has far-reaching implications concerning the problem
of induction, the concept of chance, and the question of the
evolutionary origin of information.40,41,42 We shall discuss the basic
principles of this theory in chapter 9.
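
As a rough illustration of this idea (our own, not from the text, which develops the theory properly in chapter 9): the length of a losslessly compressed description can serve as a crude, computable stand-in for the information content of an individual symbol sequence, judged without reference to an ensemble. Python's standard zlib module is enough to show the qualitative effect.

```python
import random
import zlib

def compressed_length(sequence: str) -> int:
    """Length in bytes of a zlib-compressed description of the sequence:
    a crude upper-bound proxy for its individual information content."""
    return len(zlib.compress(sequence.encode("ascii"), level=9))

regular = "01" * 500                                            # highly ordered
random.seed(0)
irregular = "".join(random.choice("01") for _ in range(1000))   # patternless

print(compressed_length(regular))    # small: the pattern has a short description
print(compressed_length(irregular))  # much larger: no short description is found
```
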
