
Arithmetic Coding in Parallel

Jan Šupol and Bořivoj Melichar


Department of Computer Science & Engineering
Faculty of Electrical Engineering
Czech Technical University
Karlovo nám. 13, 121 35 Prague 2

e-mail: {supolj,melichar}@fel.cvut.cz

Abstract. We present a cost optimal parallel algorithm for the computation of arithmetic coding. We solve the problem in O(log n) time using n/log n processors on EREW PRAM. This leads to O(n) total cost.

Keywords: arithmetic coding, NC algorithm, EREW PRAM, PPS, parallel text compression.

1 Introduction
The need for data coding persists. The growing demand for network communication and for the storage of data signals from space are only two examples of coding needs. Many algorithms have been developed for text compression.
One of these is arithmetic coding [Mo98, Wi87], which is more efficient than the widely known Huffman algorithm [Hu52]. The latter rarely produces the best variable-size code; arithmetic coding overcomes this problem. The arithmetic code can be generated in O(n) time sequentially, and we present a well scalable NC parallel algorithm that generates the code in O(log n) time on EREW PRAM with n/log n processors. This leads to O(n) total cost and a cost optimal algorithm.
Despite the large number of papers on the parallel Huffman algorithm (the most recent known to us [Lb99] is work optimal), there are only a few papers on parallel arithmetic coding. Most of these are based on quasi-arithmetic coding [Ho92]. We know of only two exceptions. The first [Yo98] is based on an N-processor hypercube and is not cost optimal. The second [Ji94] is mainly focused on the hardware implementation: its authors expected the processing speed of their tree-based parallel structure to be eight times that of a sequential coder, which is still O(n) parallel time.
This paper is organized as follows. Section 2 provides a description of the sequential arithmetic coding algorithm. Section 3 presents some basic definitions. Section 4 describes the parallel prefix computation needed by our algorithm. Section 5 presents our parallel arithmetic coding algorithm. Section 6 describes the time complexity of our algorithm. Section 7 contains our conclusion. Note that this paper does not cover the decoding process.


2 Sequential Arithmetic Coding


First we review the sequential algorithm. Let A = [a_0, a_1, . . . , a_{m−1}] be the source alphabet containing m symbols, and let F = [f_0, f_1, . . . , f_{m−1}] be the associated set of frequencies giving the number of occurrences of each symbol. Next we compute the array of probabilities R = [r_0, r_1, . . . , r_{m−1}] such that r_i = f_i / T, where T = Σ_{i=0}^{m−1} f_i; the array of high ranges H = [h_0, h_1, . . . , h_{m−1}] such that h_i = Σ_{x=0}^{i} r_x; and the array of low ranges L = [l_0, l_1, . . . , l_{m−1}] such that l_0 = 0 and l_i = h_{i−1} for i > 0. Table 1 shows an example.

A  F  R          L    H
S  5  5/10 = 0.5  0.5  1.0
W  1  1/10 = 0.1  0.4  0.5
I  2  2/10 = 0.2  0.2  0.4
M  1  1/10 = 0.1  0.1  0.2
␣  1  1/10 = 0.1  0.0  0.1

Table 1: Frequencies, probabilities and ranges of five symbols.
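
To make the construction concrete, here is a small Python sketch (ours, not part of the original paper; the function name make_ranges and the use of plain floats are illustrative choices) that derives the low and high ranges from the frequencies of Table 1:

# Sketch: derive probabilities and low/high ranges from frequencies (Table 1).
def make_ranges(freqs):
    """freqs: dict mapping symbol -> occurrence count, in cumulative order."""
    total = sum(freqs.values())        # T = sum of all f_i
    low, high = {}, {}
    cum = 0.0
    for sym, f in freqs.items():
        low[sym] = cum                 # l_i = h_{i-1}, with l_0 = 0
        cum += f / total               # add r_i = f_i / T
        high[sym] = cum                # h_i = l_i + r_i
    return low, high

# Frequencies of Table 1; insertion order matches the table read bottom-up.
low, high = make_ranges({" ": 1, "M": 1, "I": 2, "W": 1, "S": 5})
print(low["S"], high["S"])             # 0.5 1.0 (up to float rounding)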

The string of symbols S = [s_0, s_1, . . . , s_{n−1}] is encoded as follows. The first character s_0 can be encoded by a number within the interval [l_y, h_y) associated with the character y = s_0, y ∈ A. The notation [a, b) means the range of real numbers from a to b, not including b. Let us define these two bounds as LowRange and HighRange.
As more symbols are input and processed, LowRange and HighRange are updated
according to

LowRange_j = LowRange_{j−1} + (HighRange_{j−1} − LowRange_{j−1}) × l_x,

HighRange_j = LowRange_{j−1} + (HighRange_{j−1} − LowRange_{j−1}) × h_x,

where l_x and h_x are the low and high ranges of the new character x ∈ A, LowRange_{−1} = 0, HighRange_{−1} = 1. Table 2 indicates an example for the word “SWISS”.

A  L/H  Calculation of the low and high ranges
S  L    0.0 + (1.0 − 0.0) × 0.5 = 0.5
   H    0.0 + (1.0 − 0.0) × 1.0 = 1.0
W  L    0.5 + (1.0 − 0.5) × 0.4 = 0.70
   H    0.5 + (1.0 − 0.5) × 0.5 = 0.75
I  L    0.7 + (0.75 − 0.7) × 0.2 = 0.71
   H    0.7 + (0.75 − 0.7) × 0.4 = 0.72
S  L    0.71 + (0.72 − 0.71) × 0.5 = 0.715
   H    0.71 + (0.72 − 0.71) × 1.0 = 0.720
S  L    0.715 + (0.72 − 0.715) × 0.5 = 0.7175
   H    0.715 + (0.72 − 0.715) × 1.0 = 0.7200

Table 2: The process of arithmetic encoding.
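
The whole sequential coder is just this update loop. The following sketch (ours; it reuses low and high from the previous snippet and plain floating point, so it is illustrative rather than a production coder, which would need renormalization) reproduces Table 2:

# Sketch: sequential arithmetic encoding of a string.
def encode(text, low, high):
    lo, hi = 0.0, 1.0                  # LowRange_{-1} = 0, HighRange_{-1} = 1
    for ch in text:
        width = hi - lo
        hi = lo + width * high[ch]     # HighRange_j
        lo = lo + width * low[ch]      # LowRange_j
    return lo, hi                      # any number in [lo, hi) encodes text

print(encode("SWISS", low, high))      # (0.7175, 0.72), up to float rounding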


3 Definitions
Our parallel algorithm is designed to run on the Parallel Random Access Machine (PRAM), which is a very simple synchronous model of the SIMD computer [Le92, Qu94, Tv94]. PRAM includes many submodels of parallel machines that differ from each other in the conditions of access to the shared memory. Our algorithm works on the Exclusive Read Exclusive Write (EREW) PRAM model, which means that no two processors can access the same cell of the shared memory at the same time.
We define the sequential time SU(n) as the worst-case time of the best known sequential algorithm, where n is the size of the input data. The parallel time T(n, p) is the time elapsed from the beginning of a p-processor parallel algorithm solving a problem instance of size n until the last (slowest) processor finishes the execution.
Consider a synchronous p-processor algorithm A with τ = T(n, p) parallel steps. Let p_i be the number of processors active (working) at step i ∈ {1, 2, . . . , τ} of A. Then the synchronous parallel work of A is

W(n, p) = p_1 + p_2 + · · · + p_τ.

Parallel cost (also called the processor-time product) is defined as

C(n, p) = p × T(n, p).

It is obvious that

SU(n) ≤ W(n, p) ≤ C(n, p).

If SU(n) = W(n, p) then the algorithm is work optimal. If SU(n) = C(n, p) then the algorithm is cost optimal.
The efficiency of the parallel algorithm is defined as

E(n, p) = SU(n) / C(n, p).

Let E_0 be a constant such that 0 < E_0 < 1. Then the isoefficiency function ψ_1(p) is the asymptotically minimum function such that

∀ n_p = Ω(ψ_1(p)) : E(n_p, p) ≥ E_0.

Hence, ψ_1(p) gives asymptotically the lower bound on the instance size of a problem that can be solved by p processors with efficiency at least E_0.
Scalability is the ability of an algorithm to adapt to a changing number of processors or to a changing size of the input data. Good scalability means that if we want to use more processors, we only have to increase the size of our problem a little. Fast growth of the function ψ_1 indicates poor scalability.
We say that the class NC (Nick's class) is the set of problems that can be computed in at most polylogarithmic time with at most a polynomial number of processors. These algorithms provide a high level of parallelization.


4 Parallel Prefix Computation

Since our parallel algorithm is based on the parallel prefix algorithm, we first show how it works. The problem is defined as follows [La80]. Let S = [s_0, s_1, . . . , s_{n−1}] be an array of numbers. The prefix problem is to compute all the prefixes of the product

s_0 ⊗ s_1 ⊗ · · · ⊗ s_{n−1},

where ⊗ is an associative operation.
Fig. 1 shows the algorithm, which assumes n processors p_0, p_1, . . . , p_{n−1} and an array M = [m_0, m_1, . . . , m_{n−1}] of numbers stored in the shared memory. Every processor p_i also has a register y_i. From now on we will use EREW PRAM under similar conditions.
for i := 0, 1, . . . , n − 1 do in parallel
    y_i := M[i];
for j := 0, 1, . . . , log n − 1 do sequentially
begin
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        y_i := y_i ⊗ M[i − 2^j];
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        M[i] := y_i;
end

Figure 1: Parallel prefix algorithm.
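
For readers who prefer executable code, here is our own simulated rendering of Fig. 1 in Python; the "in parallel" loops run sequentially over a snapshot of the array, which preserves the EREW read and write phases:

import math

# Simulated parallel prefix (Fig. 1): after ceil(log2 n) rounds,
# M[i] holds s_0 ⊗ s_1 ⊗ ... ⊗ s_i for an associative operation op.
def parallel_prefix(M, op):
    n = len(M)
    M = list(M)
    for j in range(math.ceil(math.log2(n))):
        y = list(M)                    # read phase: each p_i loads register y_i
        for i in range(2 ** j, n):     # "do in parallel" over processors
            # operand order is immaterial for the commutative ops used here
            y[i] = op(y[i], M[i - 2 ** j])
        M = y                          # write phase
    return M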

Fig. 2 indicates the parallel prefix algorithm computing an array of 7 numbers with the associative operation of addition. This is then called the parallel prefix sum.
Here we derive the parallel time T(n, p) of the parallel prefix computation on EREW PRAM. First we suppose that p < n. Each processor simulates n/p processors and sequentially sums its block of n/p numbers. This takes at most 4n/p steps (read the first number, read the second number, sum, and write the result). After that the processors run the parallel prefix algorithm in time O(log p). So the parallel time, cost, efficiency and function ψ_1 are

T(n, p) = O(n/p + log p),
C(n, p) = O(n + p log p),
E(n, p) = O(n / (n + p log p)),
ψ_1(p) = O(p log p).

By the definitions in Section 3, the parallel prefix algorithm is therefore a well scalable NC algorithm. If p = n then

T(n, n) = O(n/n + log n) = O(log n),
C(n, n) = O(n + n log n) = O(n log n).

However, when p = n/log n then

T(n, n/log n) = O((n log n)/n + log n − log log n) = O(log n),
C(n, n/log n) = O(n + (n/log n)(log n − log log n)) = O(n).

Hence, we have obtained a parallel cost optimal algorithm.


Input:         3   2   4   7   1   5   2
After step 1:  3   5   6  11   8   6   7
After step 2:  3   5   9  16  14  17  15
After step 3:  3   5   9  16  17  22  24

Figure 2: Parallel prefix sum example.
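
Running the sketch from Section 4 on the data of Fig. 2 reproduces its rows:

from operator import add

print(parallel_prefix([3, 2, 4, 7, 1, 5, 2], add))
# -> [3, 5, 9, 16, 17, 22, 24]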

5 Parallel Arithmetic Coding


Recall that we use the array A = [a_0, a_1, . . . , a_{m−1}] of the source alphabet containing m symbols, the associated set of frequencies F = [f_0, f_1, . . . , f_{m−1}], the associated set of probabilities R = [r_0, r_1, . . . , r_{m−1}] such that r_i = f_i / T where T = Σ_{i=0}^{m−1} f_i, the array of low ranges L = [l_0, l_1, . . . , l_{m−1}], and the array of high ranges H = [h_0, h_1, . . . , h_{m−1}] such that l_0 = 0, l_i = h_{i−1} for i > 0, and h_i = l_i + r_i.
Our idea of parallelism is as follows: we have a string S = [s_0, s_1, . . . , s_{n−1}] of n characters to encode, and each processor p_j is associated with a character s_j and computes the variables LowRange and HighRange for that character.

5.1 Preliminaries
We suppose that we have an array Range = [range_0, range_1, . . . , range_{n−1}] for our algorithm. Each range_j is initialized with the probability r_y such that a_y = s_j, where j is the index of the j-th character of the input string S. We also suppose that we have an array Low = [low_0, low_1, . . . , low_{n−1}]. Each low_j is initialized with the value l_y such that a_y = s_j. We need at least one variable High, initialized with the value h_y such that a_y = s_{n−1}.
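
In terms of our running Python sketch, this initialization amounts to the following (low and high as computed earlier; the array names follow the paper):

# Sketch: initialize Range, Low and High for the input string (Section 5.1).
text = "SWISS"
Range = [high[c] - low[c] for c in text]   # range_j = r_y where a_y = s_j
Low = [low[c] for c in text]               # low_j = l_y where a_y = s_j
High = high[text[-1]]                      # h_y for the last character s_{n-1}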


5.2 Changes in Sequential Algorithm


Let us return to sequential arithmetic coding and try to change the algorithm a bit so that it can be parallelized. Recall the bounds computation

LowRange_j = LowRange_{j−1} + (HighRange_{j−1} − LowRange_{j−1}) × l_x,

HighRange_j = LowRange_{j−1} + (HighRange_{j−1} − LowRange_{j−1}) × h_x,

where l_x and h_x are the low and high ranges of the new character x ∈ A, LowRange_{−1} = 0, HighRange_{−1} = 1, and denote the cumulative lower and higher bounds

LR_j = (HighRange_{j−1} − LowRange_{j−1}) × l_x,

HR_j = (HighRange_{j−1} − LowRange_{j−1}) × h_x.

So the values LowRange and HighRange are updated as

LowRange_j = LowRange_{j−1} + LR_j,

HighRange_j = LowRange_{j−1} + HR_j,

and we now focus only on the variables LR and HR.

LR_j = (HighRange_{j−1} − LowRange_{j−1}) × l_x
     = (LowRange_{j−2} + HR_{j−1} − LowRange_{j−2} − LR_{j−1}) × l_x
     = (HR_{j−1} − LR_{j−1}) × l_x,

HR_j = (HighRange_{j−1} − LowRange_{j−1}) × h_x
     = (LowRange_{j−2} + HR_{j−1} − LowRange_{j−2} − LR_{j−1}) × h_x
     = (HR_{j−1} − LR_{j−1}) × h_x.

Moreover, LowRange_j can be computed as

LowRange_j = LR_j + LowRange_{j−1} = LR_j + LR_{j−1} + LowRange_{j−2} = · · ·
           = LR_j + LR_{j−1} + · · · + LR_0 + LowRange_{−1}
           = Σ_{x=0}^{j} LR_x + LowRange_{−1} = Σ_{x=0}^{j} LR_x,

because LowRange_{−1} = 0.
The change in our algorithm is that we first compute the cumulative lower and higher bounds, and then we simply compute the sum of these cumulative bounds to obtain the final bounds LowRange and HighRange.
Let us see how the variables LR and HR can be computed for the word “SWISS”. We declare that LR_0 is the LR variable for the first character s_0 = “S”, and that l_x, h_x, r_x are the lower range, higher range and probability of the character x ∈ A. LR_{−1} and HR_{−1} are the initial cumulative bounds for the number that represents the encoded text S. For arithmetic coding this number lies by definition in the interval [0, 1). That is why LR_{−1} = LowRange_{−1} = 0 and HR_{−1} = HighRange_{−1} = 1.


LR_{−1} = 0
HR_{−1} = 1
LR_0 = (HR_{−1} − LR_{−1}) × l_S = 1.0 × 0.5 = 0.5
HR_0 = (HR_{−1} − LR_{−1}) × h_S = 1.0 × 1.0 = 1.0
LR_1 = (HR_0 − LR_0) × l_W = (h_S − l_S) × l_W = r_S × l_W = 0.5 × 0.4 = 0.2
HR_1 = (HR_0 − LR_0) × h_W = (h_S − l_S) × h_W = r_S × h_W = 0.5 × 0.5 = 0.25
LR_2 = (HR_1 − LR_1) × l_I = (r_S × h_W − r_S × l_W) × l_I = r_S × r_W × l_I = 0.5 × 0.1 × 0.2 = 0.01
HR_2 = (HR_1 − LR_1) × h_I = (r_S × h_W − r_S × l_W) × h_I = r_S × r_W × h_I = 0.5 × 0.1 × 0.4 = 0.02
LR_3 = (HR_2 − LR_2) × l_S = (r_S × r_W × h_I − r_S × r_W × l_I) × l_S = r_S × r_W × r_I × l_S = 0.005
HR_3 = (HR_2 − LR_2) × h_S = (r_S × r_W × h_I − r_S × r_W × l_I) × h_S = r_S × r_W × r_I × h_S = 0.01
. . .
So it is obvious that the lower bound of the j-th character, LR_j, and the higher bound of the j-th character, HR_j, can be computed as

LR_j = (Π_{x=0}^{j−1} r_x) × l_j,  j > 0,

HR_j = (Π_{x=0}^{j−1} r_x) × h_j,  j > 0,

where the indices now refer to positions in the string, i.e., r_x is the probability of s_x and l_j, h_j are the ranges of s_j.
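
A quick numeric check of these closed forms against the values derived above (our sketch, reusing low and high from Section 2):

# Sketch: verify LR_j = (prod_{x<j} r_x) * l_j and the analogous HR_j.
prod = 1.0                                 # empty product for j = 0
for j, ch in enumerate("SWIS"):
    print(j, prod * low[ch], prod * high[ch])   # LR_j, HR_j for j = 0..3
    prod *= high[ch] - low[ch]             # extend the product by r_{s_j}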

5.3 Parallel Prefix Production

for i := 0, 1, . . . , n − 1 do in parallel
    y_i := Range[i];
for j := 0, 1, . . . , log n − 1 do sequentially
begin
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        y_i := y_i × Range[i − 2^j];
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        Range[i] := y_i;
end

Figure 3: Parallel prefix production algorithm.



These new LR and HR variables are exactly what we need, because j−1 rx
j x=0
can be computed in parallel as we immediately show. Computation of x=0 r x =
j x
x=0 range can be done by the parallel prefix production algorithm explained in
Section 4, as shown in Fig. 3. Table 3 indicates the parallel prefix algorithm in our
example for the word “SWISS”.


        S    W     I     S      S
Input   0.5  0.1   0.2   0.5    0.5
Step 1  0.5  0.05  0.02  0.1    0.25
Step 2  0.5  0.05  0.01  0.005  0.005
Step 3  0.5  0.05  0.01  0.005  0.0025

Table 3: Parallel prefix production example for the word “SWISS”.

5.4 Cumulative Bounds Computation

Once we have computed Π_{x=0}^{j−1} r_x, we can obtain the variables LR_j and HR_j simply as the products

LR_j = (Π_{x=0}^{j−1} r_x) × l_j  and  HR_j = (Π_{x=0}^{j−1} r_x) × h_j.

The parallel algorithm computing the variables LR and the variable HR_{n−1} is shown in Fig. 4. The variables HR are not actually needed, except for the last one, HR_{n−1}; if they are required, they can be computed in a similar way. The value HR_{n−1}, which is the cumulative high range, is computed after the parallel prefix production as

HR_{n−1} = (Π_{x=0}^{n−2} r_x) × h_{n−1}.

Table 4 shows this computation in our example for the word “SWISS”. Note that the results correspond to the cumulative bounds in our sequential example.

do sequentially
begin
    y_{n−1} := High;
    y_{n−1} := y_{n−1} × Range[n − 2];
    High := y_{n−1};
    y_0 := 1;
end
for i := 1, 2, . . . , n − 1 do in parallel
    y_i := Range[i − 1];
for i := 0, 1, . . . , n − 1 do in parallel
begin
    y_i := y_i × Low[i];
    Low[i] := y_i;
end

Figure 4: Parallel computation of the variables LR and HR_{n−1}.
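
In our simulated Python setting the same two phases look as follows (a sketch reusing parallel_prefix and the arrays initialized in Section 5.1; plain loops stand in for the parallel steps, and the printed values hold up to float rounding):

from operator import mul

# Sketch: prefix products of Range, then the cumulative bounds LR and HR_{n-1}.
P = parallel_prefix(Range, mul)            # P[j] = r_0 * r_1 * ... * r_j
HR_last = P[-2] * High                     # HR_{n-1} = (prod of first n-1 r) * h_{n-1}
LR = [Low[0]] + [P[j - 1] * Low[j] for j in range(1, len(Low))]
print(LR, HR_last)                         # [0.5, 0.2, 0.01, 0.005, 0.0025] 0.005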

Now we have computed the cumulative high and low ranges: the array Low contains the LR values and the variable High contains the value HR_{n−1}. Next we have to compute the sum of these cumulative ranges LR so that we obtain the required bounds LowRange and HighRange for the arithmetic compression of the string S.


L/H  S    W     I     S      S
LR   0.5  0.2   0.01  0.005  0.0025
HR   1    0.25  0.02  0.01   0.005

Table 4: Cumulative low and high ranges.

5.5 Computation of Low and High Ranges

In Section 5.4 we computed the cumulative bounds LR and HR. Here we show how to obtain the bounds declared earlier as LowRange and HighRange for the compressed text. As shown in Section 5.2, these values can be computed as

LowRange_j = (Σ_{x=0}^{j−1} LR_x) + LR_j,

HighRange_j = (Σ_{x=0}^{j−1} LR_x) + HR_j.

To compute the sum we can use the parallel prefix algorithm once more, namely the parallel prefix sum shown earlier. Finally, after computing the sum, the variable HighRange_{n−1} is obtained as

HighRange_{n−1} = LowRange_{n−2} + HR_{n−1}.

This algorithm is shown in Fig. 5. Afterwards, the array Low contains the values LowRange and the variable High contains the value HighRange_{n−1}. Our example for the word “SWISS” is shown in Table 5.

for i := 0, 1, . . . , n − 1 do in parallel
    y_i := Low[i];
for j := 0, 1, . . . , log n − 1 do sequentially
begin
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        y_i := y_i + Low[i − 2^j];
    for i := 2^j, 2^j + 1, . . . , n − 1 do in parallel
        Low[i] := y_i;
end
do sequentially
begin
    y_{n−1} := High;
    y_{n−1} := y_{n−1} + Low[n − 2];
    High := y_{n−1};
end

Figure 5: LowRange and HighRange_{n−1} computation algorithm.


        S    W    I     S      S
Input   0.5  0.2  0.01  0.005  0.0025
Step 1  0.5  0.7  0.21  0.015  0.0075
Step 2  0.5  0.7  0.71  0.715  0.2175
Step 3  0.5  0.7  0.71  0.715  0.7175

Table 5: Parallel prefix sum example.
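
The last phase completes our simulated pipeline (again a sketch building on the previous snippets); the result agrees with the sequential computation of Table 2:

from operator import add

# Sketch: prefix-sum the LR values, then derive the final high bound.
LowRange = parallel_prefix(LR, add)        # [0.5, 0.7, 0.71, 0.715, 0.7175]
HighRange_last = LowRange[-2] + HR_last    # 0.715 + 0.005 = 0.72
print(LowRange[-1], HighRange_last)        # 0.7175 0.72, up to float rounding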

6 Time and Cost Complexities

Our algorithm does not say how to set the arrays Range, Low and the variable High in a preliminary phase. However, once the arrays A, R, L and H are set, this can be done in O(1) time on CREW PRAM with a good hash function that returns the index in the array A of an input character from the input string S.
Our EREW PRAM algorithm consists of three phases. In the first phase, the parallel prefix production is computed. As shown in Section 4, this can be done in time O(n/p + log p), where p is the number of processors used and n is the size of the input. In the second phase, shown in Fig. 4, we compute the cumulative bounds LR and HR in time O(n/p). The third phase, the parallel prefix sum shown in Fig. 5, also takes O(n/p + log p) time. The computation of HighRange_{n−1} takes only O(1) time in either phase. So the time and cost of our algorithm are

T(n, p) = O(n/p + log p),

C(n, p) = O(n + p log p).

If p = n/log n then the total time is O(log n) and the cost is O(n).
Because our algorithm consists mainly of parallel prefix computations, it inherits their best properties. Our algorithm is therefore a well scalable NC algorithm, and it can be implemented as a cost optimal algorithm.

7 Conclusions
We have presented a parallel NC algorithm for the computation of arithmetic coding. We have solved the problem in O(log n) time using n/log n processors on EREW PRAM. Our algorithm leads to O(n) total cost and is cost optimal.
The preliminary phase is a weakness of our algorithm. However, if we were able to construct a good adaptive parallel arithmetic coding scheme based on our algorithm, it could solve this problem.
Another open question is how to construct a good parallel arithmetic decoding algorithm.

References

[Ho92] Howard, Paul G. and Jeffrey Scott Vitter (1992): Parallel Lossless Image Compression Using Huffman and Arithmetic Coding. Proceedings of the IEEE Data Compression Conference, 299-308.

[Hu52] Huffman, David (1952): A Method for the Construction of Minimum-Redundancy Codes. Proceedings of the Institute of Radio Engineers, 40:1098-1101.

[Ji94] Jiang, J. and S. Jones (1994): Parallel design of arithmetic coding. IEE Proceedings-Computers and Digital Techniques, 141(6):327-333, November.

[La80] Ladner, Richard E. and Michael J. Fischer (1980): Parallel Prefix Computation. Journal of the ACM, 27(4):831-838, October.

[Lb99] Laber, Eduardo Sany, Ruy Luiz Milidiú and Artur Alves Pessoa (1999): A Work Efficient Parallel Algorithm for Constructing Huffman Codes. Proceedings of the IEEE Data Compression Conference DCC'99.

[Le92] Lewis, T. G. and H. El-Rewini (1992): Introduction to Parallel Computing. Prentice Hall.

[Mo98] Moffat, Alistair, Radford Neal and Ian H. Witten (1998): Arithmetic Coding Revisited. ACM Transactions on Information Systems, 16(3):256-294, July.

[Qu94] Quinn, M. J. (1994): Parallel Computing: Theory and Practice. McGraw-Hill.

[Tv94] Casavant, T. L., P. Tvrdík and F. Plášil, editors (1994): Parallel Computers: Architectures, Languages, and Algorithms. IEEE CS Press.

[Wi87] Witten, Ian H., Radford Neal and John G. Cleary (1987): Arithmetic Coding for Data Compression. Communications of the ACM, 30(6):520-540.

[Yo98] Youssef, A. (1998): Parallel Algorithms for Entropy Coding Techniques. Proceedings of European Parallel and Distributed Systems. ACTA Press.
