QUANTUM Series

Data Compression

+ Topic-wise coverage of entire syllabus in Question-Answer form.
+ Short Questions (2 Marks)

www.askbooks.net - A.S.K. (Always Seek Knowledge)
All AKTU QUANTUMS are available.
1. An initiative to provide free e-books to students; a hub of educational books.
2. The e-books on this website are submitted by readers; you can also donate e-books/study materials.
3. If you have any issue with any material on this website, contact us and it will be removed.
4. All logos and trademarks belong to their respective owners.

PUBLISHED BY: Apram Singh
Quantum Publications (A Unit of Quantum Page Pvt. Ltd.)
Plot No. 59/2/1, Site-4, Industrial Area, Sahibabad, Ghaziabad-201 010
Phone: 0120-4160479    Email: [email protected]    Website: www.quantumpage.co.in
Delhi Office: 6590, East Rohtas Nagar, Shahdara, Delhi-110032

© All Rights Reserved. No part of this publication may be reproduced or transmitted, in any form or by any means, without permission. Information contained in this work is believed to be reliable. Every effort has been made to ensure accuracy; however, neither the publisher nor the authors guarantee the accuracy or completeness of any information published herein, and neither the publisher nor the authors shall be responsible for any errors, omissions or damages arising out of use of this information.

Data Compression (CS/IT: Sem-8)
1st Edition: 2011-12    5th Edition: 2015-16
2nd Edition: 2012-13    6th Edition: 2016-17
3rd Edition: 2013-14    7th Edition: 2017-18
4th Edition: 2014-15    8th Edition: 2018-19
Price: Rs. 110/- only

CONTENTS
RCS-087: DATA COMPRESSION

UNIT-1: INTRODUCTION (1 D - 27 D)
Compression Techniques: Lossless compression, Lossy compression, Measures of performance, Modeling and coding. Mathematical Preliminaries for Lossless Compression: A brief introduction to information theory, Models: Physical models, Probability models, Markov models, Composite source model. Coding: Uniquely decodable codes, Prefix codes.

UNIT-2: HUFFMAN CODING (28 D - 61 D)
The Huffman coding algorithm: Minimum variance Huffman codes. Adaptive Huffman coding: Update procedure, Encoding procedure, Decoding procedure. Golomb codes, Rice codes, Tunstall codes. Applications of Huffman coding: Lossless image compression, Text compression, Audio compression.

UNIT-3: ARITHMETIC CODING (62 D - 106 D)
Coding a sequence, Generating a binary code, Comparison of Huffman coding. Applications: Bi-level image compression, the JBIG standard, JBIG2, Image compression. Dictionary Techniques: Introduction, Static dictionary: Digram coding, Adaptive dictionary: The LZ77 approach, The LZ78 approach, Applications: File compression (UNIX compress), Image compression: The Graphics Interchange Format (GIF), Compression over modems: V.42 bis. Predictive Coding: Prediction with Partial Match (ppm): The basic algorithm, The ESCAPE SYMBOL, Length of context, The Exclusion Principle. The Burrows-Wheeler Transform: Move-to-front coding, CALIC, JPEG-LS, Multiresolution approaches, Facsimile encoding, Dynamic Markov compression.

UNIT-4: MATHEMATICAL PRELIMINARIES FOR LOSSY CODING (107 D - 135 D)
Distortion criteria, Models. Scalar Quantization: The quantization problem, Uniform quantizer, Adaptive quantization, Non-uniform quantization.

UNIT-5: VECTOR QUANTIZATION (136 D - 146 D)
Advantages of vector quantization over scalar quantization, The Linde-Buzo-Gray algorithm.

SHORT QUESTIONS (147 D - 165 D)

SOLVED PAPERS (2011-12 TO 2018-19) (166 D - 176 D)

UNIT-1: Introduction

Part-1 (5D - 10D)
Compression Techniques: Lossless Compression and Lossy Compression, Measures of Performance, Modeling and Coding.
  A. Concept Outline: Part-1
  B. Long and Medium Answer Type Questions
Part-2
Mathematical Preliminaries for Lossless Compression: A Brief Introduction to Information Theory, Models: Physical Models, Probability Models, Markov Models, Composite Source Model.
  A. Concept Outline: Part-2
  B. Long and Medium Answer Type Questions

Part-3
Coding: Uniquely Decodable Codes, Prefix Codes.
  A. Concept Outline: Part-3
  B. Long and Medium Answer Type Questions

PART-1
Compression Techniques: Lossless Compression and Lossy Compression, Measures of Performance, Modeling and Coding.

CONCEPT OUTLINE: PART-1
+ Data compression is the art or science of representing information in a compact form.
+ Lossless compression involves no loss of information.
+ Lossy compression involves some loss of information, and the data cannot be reconstructed exactly the same as the original.
+ Modeling and coding are the two phases in the development of any data compression algorithm for a variety of data.

Questions-Answers
Long Answer Type and Medium Answer Type Questions

Que 1.1. What is data compression and why do we need it? Explain compression and reconstruction with the help of a block diagram.
UPTU 2013-14, Marks 05; UPTU 2015-16, Marks 02; UPTU 2015-16, Marks 10

Answer
1. In computer science and information theory, data compression is the process of encoding information using fewer bits than a decoded representation would use, through the use of specific encoding schemes.
2. It is the art or science of representing information in a compact form. This compaction of information is done by identifying the structure that exists in the data.
3. Compressed data communication only works when both the sender and the receiver of the information understand the encoding scheme.
4. For example, any text makes sense only if the receiver understands that it is intended to be interpreted as characters representing the English language.
5. Similarly, compressed data can only be understood if the decoding method is known to the receiver.

Need of data compression:
1. Compression is needed because it helps to reduce the consumption of expensive resources such as hard disk space or transmission bandwidth.
2. Uncompressed text or multimedia (speech, image or video) data requires a huge number of bits to represent it and thus requires large bandwidth; this storage space and bandwidth requirement can be decreased by applying a proper encoding scheme for compression.
3. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced, and the computational resources required to compress and decompress the data.

Compression and reconstruction:
1. A compression technique or compression algorithm actually refers to two algorithms, i.e., the compression algorithm and the reconstruction algorithm.
2. The compression algorithm takes an input X and generates a representation X_c that requires fewer bits, and the reconstruction algorithm operates on the compressed representation X_c to generate the reconstruction Y. These operations are shown in Fig. 1.1.1.

Fig. 1.1.1: Compression and reconstruction (original X -> compression -> X_c -> reconstruction -> Y).
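The lossless case of Fig. 1.1.1 can be tried directly in a few lines of Python. The sketch below is illustrative and not part of the Quantum text; it assumes the standard-library zlib module as the compression algorithm and shows that the reconstruction Y is identical to the original X.

```python
import zlib

# Original data X: a short, repetitive text (structure the compressor can exploit)
X = b"this is an example of an example of an example of compressible text"

X_c = zlib.compress(X, level=9)   # compression algorithm: X -> X_c (fewer bits)
Y = zlib.decompress(X_c)          # reconstruction algorithm: X_c -> Y

print("original bytes:  ", len(X))
print("compressed bytes:", len(X_c))
print("lossless?", Y == X)        # True: Y is exactly the original X
```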
Que 1.2. What do you mean by data compression? Explain its application areas.
UPTU 2011-12, Marks 05

Answer
Data compression: Refer Q. 1.1, Page 5D, Unit-1.
Applications of data compression:
1. Audio:
   a. Audio data compression reduces the transmission bandwidth and storage requirements of audio data.
   b. Audio compression algorithms are implemented in software as audio codecs.
   c. Lossy audio compression algorithms provide higher compression at the cost of fidelity and are used in numerous audio applications.
   d. These algorithms rely on psychoacoustics to eliminate or reduce the fidelity of less audible sounds, thereby reducing the space required to store or transmit them.
2. Video:
   a. Video compression uses modern coding techniques to reduce redundancy in video data.
   b. Most video compression algorithms and codecs combine spatial image compression and temporal motion compensation.
   c. Video compression is a practical implementation of source coding in information theory.
3. Genetics: Genetics compression algorithms are the latest generation of lossless algorithms that compress data using both conventional compression algorithms and genetic algorithms adapted to the specific data type.
4. Emulation:
   a. In order to emulate CD-based consoles such as the PlayStation 2, data compression is desirable to reduce the huge amounts of disk space used by ISOs.
   b. For example, Final Fantasy XII (a computer game) is normally 2.9 gigabytes; with proper compression, it is reduced to around 90 % of that size.

Que 1.3. What do you understand by lossless and lossy compression?
OR
What do you mean by lossless compression? Compare lossless compression with lossy compression.
UPTU 2011-12, Marks 05; UPTU 2015-16, Marks 02; UPTU 2015-16, Marks 10

Answer
Lossless compression:
1. In lossless compression, only the redundant data contained in the information is removed.
2. Due to the removal of such information, there is no loss of the data of interest. Hence it is called lossless compression.
3. Lossless compression is also known as data compaction.
4. Lossless compression techniques, as their name implies, involve no loss of information.
5. If data have been losslessly compressed, the original data can be recovered exactly from the compressed data.
6. Lossless compression is generally used for applications that cannot tolerate any difference between the original and reconstructed data.
7. Text compression is an important area for lossless compression.
8. It is very important that the reconstruction is identical to the original text, as very small differences can result in statements with very different meanings.

Lossy compression:
1. In this type of compression, there is a loss of information in a controlled manner.
2. Lossy compression is therefore not completely reversible.
3. But the advantage of this type is higher compression ratios than lossless compression.
4. Lossless compression is used for digital data.
5. For many applications, lossy compression is preferred due to its higher compression without a significant loss of important information.
6. For digital audio and video applications, we need a standard compression algorithm.
7. Lossy compression techniques involve some loss of information, and data that have been compressed using lossy techniques generally cannot be recovered or reconstructed exactly.
8. In return for accepting this distortion in the reconstruction, we can generally obtain much higher compression ratios than is possible with lossless compression.
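A minimal numerical sketch of the lossless/lossy distinction (illustrative, not from the Quantum text): the lossy branch below simply quantizes samples to a coarser step, an assumed stand-in for a real lossy codec, while the lossless branch reconstructs the data exactly.

```python
import zlib

samples = [0.12, 0.48, 0.51, 0.49, 0.90, 0.13, 0.11, 0.50]

# Lossless: compress the exact byte representation and recover it bit-for-bit.
raw = repr(samples).encode()
assert zlib.decompress(zlib.compress(raw)) == raw     # no loss of information

# Lossy (illustrative): quantize to a step of 0.1 and keep only the level indices.
step = 0.1
levels = [round(x / step) for x in samples]           # what would be stored or sent
reconstruction = [l * step for l in levels]           # cannot recover the original exactly

distortion = max(abs(x - y) for x, y in zip(samples, reconstruction))
print("levels:", levels)
print("max reconstruction error (distortion):", round(distortion, 3))
```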
Que 1.4. What is data compression and why do we need it? Describe two applications where a lossy compression technique is necessary for data compression.
UPTU, Marks 10

Answer
Data compression and its need: Refer Q. 1.1, Page 5D, Unit-1.
Applications where lossy compression is necessary:
1. Lossy image compression is used to increase storage capacities with minimal degradation of picture quality.
2. In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the audio signal.

Que 1.5. What is data compression and why do we need it? Explain compression and reconstruction with the help of a block diagram. What are the measures of performance of data compression algorithms?
UPTU 2012-13, Marks 10
OR
What are the measures of performance of data compression algorithms?
UPTU 2013-14, Marks 05; UPTU 2015-16, Marks 10

Answer
Data compression and its need: Refer Q. 1.1, Page 5D, Unit-1.
Compression and reconstruction: Refer Q. 1.1, Page 5D, Unit-1.
Measures of performance of data compression:
1. A compression algorithm can be evaluated in a number of different ways.
2. We could measure the relative complexity of the algorithm, the memory required to implement the algorithm, how fast the algorithm performs on a given machine, the amount of compression, and how closely the reconstruction resembles the original.
3. A very logical way of measuring how well a compression algorithm compresses a given set of data is to look at the ratio of the number of bits required to represent the data before compression to the number of bits required to represent the data after compression. This ratio is called the compression ratio.
4. Another way of reporting compression performance is to provide the average number of bits required to represent a single sample.
5. This is generally referred to as the rate.
6. In lossy compression, the reconstruction differs from the original data.
7. Therefore, in order to determine the efficiency of a compression algorithm, we have to have some way of quantifying the difference.
8. The difference between the original and the reconstruction is often called the distortion.
9. Lossy techniques are generally used for the compression of data that originate as analog signals, such as speech and video.
10. In compression of speech and video, the final arbiter of quality is the human observer.
11. Because human responses are difficult to model mathematically, many approximate measures of distortion are used to determine the quality of the reconstructed waveforms.
12. Other terms that are also used when talking about differences between the reconstruction and the original are fidelity and quality.
13. When the fidelity or quality of a reconstruction is high, the difference between the reconstruction and the original is small.
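The compression ratio and rate of points 3 to 5 can be computed directly. The sketch below is illustrative (not from the Quantum text); it assumes an image-like block of 8-bit samples and uses the standard-library zlib codec.

```python
import zlib

# 4096 eight-bit samples, e.g., a small 64x64 greyscale image with smooth structure
samples = bytes((i // 64) for i in range(4096))

compressed = zlib.compress(samples, level=9)

bits_before = 8 * len(samples)        # 8 bits per sample before compression
bits_after = 8 * len(compressed)

compression_ratio = bits_before / bits_after     # how many times smaller the data became
rate = bits_after / len(samples)                 # average bits per sample

print(f"compression ratio = {compression_ratio:.2f}:1")
print(f"rate              = {rate:.3f} bits/sample")
```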
Que 1.6. Explain modeling and coding with the help of a suitable example.
UPTU 2013-14, Marks 05

Answer
The development of any data compression algorithm for a variety of data can be divided into two phases: modeling and coding.
1. Modeling:
   a. In this phase, we try to extract information about any redundancy or similarity that exists in the data and describe the redundancy in the form of a model.
   b. This model acts as the basis of any data compression algorithm, and the performance of the algorithm will depend on how well the model is formed.
2. Coding:
   a. This is the second phase. It is the description of the model and a description of how the data differ from the model; the encoding is generally done using binary digits.
   b. Example: Consider the following sequence of numbers: 9, 10, 11, 12, 13. By examining and exploiting the structure of the data on a graph paper, it appears to be a straight line, so we model it with the equation
         x_n = n + 9,  n = 0, 1, 2, ...
      Only the model and the differences (residuals) between the actual data and the model then need to be encoded.

PART-2
Mathematical Preliminaries for Lossless Compression: A Brief Introduction to Information Theory, Models: Physical Models, Probability Models, Markov Models, Composite Source Model.

CONCEPT OUTLINE: PART-2
+ Entropy is the measure of the uncertainty associated with a random variable.
+ A fundamental limit to lossless data compression is called entropy.
+ Physical models, probability models and Markov models are the three approaches for building a mathematical model.

Questions-Answers
Long Answer Type and Medium Answer Type Questions

Que 1.7. Explain entropy or entropy rate as given by Shannon.

Answer
1. In information theory, entropy is the measure of the uncertainty associated with a random variable.
2. It is usually referred to as Shannon entropy, which quantifies, in the sense of an expected value, the information contained in a message, usually in units such as bits.
3. Shannon entropy is a measure of the average information content, i.e., the average number of binary symbols needed to code the output of the source.
4. Shannon entropy represents an absolute limit on the best possible lossless compression of any communication under certain constraints: treating the messages to be encoded as a sequence of independent and identically distributed random variables.
5. The entropy rate of a source is a number which depends only on the statistical nature of the source. Consider an arbitrary source X = X_1, X_2, X_3, ... The following are various models:
   i. Zero-order model: The characters are statistically independent of each other and every letter of the alphabet is equally likely to occur. Let m be the size of the alphabet. In this case the entropy rate is
         H = log2 m bits/character.
      For example, if the alphabet size is m = 27, then the entropy rate is H = log2 27 ≈ 4.75 bits/character.
   ii. First-order model: The characters are statistically independent. Let m be the size of the alphabet and let P_i be the probability of the i-th letter in the alphabet. The entropy rate is
         H = -Σ_{i=1}^{m} P_i log2 P_i bits/character.
   iii. Second-order model: Let P_{j|i} be the conditional probability that the present character is the j-th letter of the alphabet given that the previous character is the i-th letter. The entropy rate is
         H = -Σ_i P_i Σ_j P_{j|i} log2 P_{j|i} bits/character.
   iv. Third-order model: Let P_{k|i,j} be the conditional probability that the present character is the k-th letter of the alphabet given that the previous character is the j-th letter and the one before that is the i-th letter. The entropy rate is
         H = -Σ_{i,j} P_{i,j} Σ_k P_{k|i,j} log2 P_{k|i,j} bits/character.
   v. General model: Let B_n represent the first n characters. The entropy rate in the general case is given by
         H = -lim_{n→∞} (1/n) Σ_{B_n} P(B_n) log2 P(B_n) bits/character.
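A small sketch (illustrative, not from the Quantum text) comparing the zero-order and first-order entropy rates of Que 1.7 on an example string; the sample text itself is an assumption.

```python
from collections import Counter
from math import log2

text = "abracadabra abracadabra"

alphabet = set(text)
m = len(alphabet)
H_zero = log2(m)                        # zero-order model: all letters equally likely

counts = Counter(text)
n = len(text)
H_first = -sum((c / n) * log2(c / n)    # first-order model: measured letter probabilities
               for c in counts.values())

print(f"alphabet size m = {m}")
print(f"zero-order  H = {H_zero:.3f} bits/character")
print(f"first-order H = {H_first:.3f} bits/character")   # lower, since letters are not equally likely
```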
Que 1.8. Consider the following sequence:
      1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10
Assuming the frequency of occurrence of each number is reflected accurately in the number of times it appears in the sequence, and that the sequence is independent and identically distributed, find the first-order entropy of the sequence.

Answer
1. The sequence contains 16 symbols. Counting occurrences,
      P(1) = P(6) = P(7) = P(10) = 1/16
      P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16 = 1/8
2. The first-order entropy is given by
      H = -Σ P(x) log2 P(x)
        = 4 × (1/16) log2 16 + 6 × (1/8) log2 8
        = 1 + 2.25
        = 3.25 bits/symbol

Que 1.9. What do you understand by information and entropy? Find the first-order entropy over the alphabet A = {a_1, a_2, a_3, a_4} where P(a_1) = P(a_2) = P(a_3) = P(a_4) = 1/4.
UPTU 2013-14, Marks 05

Answer
Information:
1. The amount of information conveyed by a message increases as the amount of uncertainty regarding the message becomes greater.
2. The more that is known about the message a source will produce, the less the uncertainty, and the less the information conveyed.
3. The entropy of communication theory is a measure of this uncertainty conveyed by a message from a source.
4. The starting point of information theory is the concept of uncertainty.
5. Let us define an event as an occurrence which can result in one of many possible outcomes.
6. The outcome of the event is known only after it has occurred, and before its occurrence we do not know which one of the several possible outcomes will actually result.
7. We are thus uncertain with regard to the outcome before the occurrence of the event.
8. After the event has occurred, we are no longer uncertain about it.
9. If we know or can assign a probability to each one of the outcomes, then we will have some information as to which one of the outcomes is most likely to occur.
Entropy: Refer Q. 1.7, Page 11D, Unit-1.
Numerical:
The first-order entropy is given by
      H = -Σ P(x) log2 P(x)
        = (1/4) log2 4 + (1/4) log2 4 + (1/4) log2 4 + (1/4) log2 4
        = 4 × (1/4) log2 4
        = 2 bits

Que 1.10. Prove that the average codeword length l̄ of an optimal code for a source S is greater than or equal to the entropy H(S).

Answer
1. According to the Kraft-McMillan inequality, if we have a uniquely decodable code C with K codewords of lengths l_1, l_2, ..., l_K, then
         Σ_{i=1}^{K} 2^(-l_i) ≤ 1      ...(1.10.1)
2. Conversely, if we have a sequence of positive integers {l_i}_{i=1}^{K} which satisfies equation (1.10.1), then there exists a uniquely decodable code whose codeword lengths are given by this sequence.
3. Consider a source S with alphabet A = {a_1, a_2, ..., a_K} and probability model {P(a_1), P(a_2), ..., P(a_K)}. The average codeword length is given by
         l̄ = Σ_{i=1}^{K} P(a_i) l_i
4. Therefore, the difference between the entropy of the source H(S) and the average length is
         H(S) - l̄ = -Σ_i P(a_i) log2 P(a_i) - Σ_i P(a_i) l_i
                  = Σ_i P(a_i) [ log2 (1/P(a_i)) - l_i ]
                  = Σ_i P(a_i) [ log2 (1/P(a_i)) - log2 2^(l_i) ]
                  = Σ_i P(a_i) log2 ( 2^(-l_i) / P(a_i) )
                  ≤ log2 [ Σ_i 2^(-l_i) ]
5. The last step follows from Jensen's inequality, which states that if f(x) is a concave function, then E[f(X)] ≤ f(E[X]); the log function is concave.
6. Since Σ_i 2^(-l_i) ≤ 1 by equation (1.10.1), we have log2 [ Σ_i 2^(-l_i) ] ≤ 0, and hence
         H(S) - l̄ ≤ 0, i.e., H(S) ≤ l̄.
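The two quantities in Que 1.10 can be checked numerically. The sketch below is illustrative (not from the Quantum text); the four-symbol source and the codeword lengths are assumptions, chosen to correspond to the prefix code {0, 10, 110, 111}.

```python
from math import log2

# Assumed source probabilities and uniquely decodable (prefix) codeword lengths
P = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
lengths = {"a1": 1, "a2": 2, "a3": 3, "a4": 3}     # e.g., codewords 0, 10, 110, 111

kraft_sum = sum(2 ** -l for l in lengths.values())      # left side of eq. (1.10.1)
H = -sum(p * log2(p) for p in P.values())               # entropy H(S)
avg_len = sum(P[s] * lengths[s] for s in P)             # average codeword length

print(f"Kraft sum      = {kraft_sum}   (<= 1)")
print(f"entropy H(S)   = {H:.3f} bits/symbol")
print(f"average length = {avg_len:.3f} bits/symbol  (>= H(S))")
```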
First-order entropy is given by, hm 1,1,3,3 L75bits 1 and Pla, = 0.12 5 01508, Pia,)= 1, Pla, 4 s1=- [essing 0505 ing 2-2 ngs. 013 ~ 10.505(~ 0.985644) + 0.25(-2) + 0.125(~3) + 0.12(- 3.05) ~ [- 0.49775 — 0.5 ~ 0.375 ~ 0.366] = 1.73875 Difference between static length and variable length coding schemes: Static length codes : 1. Static length codes are also known as fixed length codes. A fixed length, code is one whose codeword length is fixed. 2 ‘The ASCII code for the letter ‘a’ is 1000011 and for the letter ‘A’ is coded ‘as 1000001 8. Here, it should be notice that the ASCII code uses the same number of bits to represent each symbol. Such codes are called static ar fixed length codes. Variable length codes : 1. Avvariable length code is one whose cod 2 For example, consider a table given bel ' 17 (CSET-8) D Data Compression 5 Golet [_Code2 | Code® , 00 © © & o1 1 a0 = 10 oo 110 2 i 1 m1 In this table, code 1 is fixed length code and code 2 and code 3 are variable length codes, Que 115.] What is average information ? What are the properties used in measure of average information ? [UPTU 2011-13, Marks 06) UPTU 2015-16, Marks 10 Answer Average information : 1, Average information is also called as entropy. 2. Ifwe have a set of independent events A,, which are the set of outcomes of some experiments S, such that Uses where S is the sample space, then the average self-information associated with the random experiment is given by, H = EP(A,) i(A,) = -EP(A) log, PA) 3. The quantity is called the entropy associated with the experiment. 4, One of the many contributions of Shannon was that he showed that if the experiment is a source that puts out symbol A, from a set A, then the entropy is a measure of the average number of binary symbols needed to code the output of the source. 5, Shannon showed that the best that a lossless compression scheme can do is to encode the output of a source with an average number of bits ‘equal to the entropy of the source. Given a set of independent events A,, A,, .... A, with probability P, = P\A), the following properties are used in the measure of average information H 1. We want f to be @ continuous function of the probabilities p,. That is, a small change in p, should only cause a small change in the average information. 2, Ifall events are equally likely, that is, p, = V/n for all i, then H should be a monotonically inereasing function of n. The more possible outcomes there are, the more information should be contained in the accurrence of any particular outcome. 18(CSAT-8) D Introduction, 3. Suppose we divide the possible outcomes into a number of groups. We indicate the occurrence of a particular event by first indicating the group it belongs to, then indicating which particular member of the group it is, 4. ‘Thus, we get some information first by knowing which group the event belongs to and then we get additional information by learning which particular event (from the events in the group) has occurred. ‘The information associated with indicating the outcome in a single stage. GET] Explain ditrerent approaches for building mathematical model also define two state Markov model for ary images. UPTU R014 15, Marko UPTU 3011-12, Marka 05) [UPTO 2015-16, Marks 10] There are several approaches to building mathematical models Physical models : 1. Inspeech-related applications, knowledge about the physies of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model. 
Que 1.14. Explain the different approaches for building a mathematical model. Also define the two-state Markov model for binary images.
UPTU 2014-15, Marks 10; UPTU 2011-12, Marks 05; UPTU 2015-16, Marks 10

Answer
There are several approaches to building mathematical models.
Physical models:
1. In speech-related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model.
2. Models for certain telemetry data can also be obtained through knowledge of the underlying process.
3. For example, if residential electrical meter readings at hourly intervals were to be coded, knowledge about the living habits of the populace could be used to determine when electricity usage would be high and when the usage would be low. Then, instead of the actual readings, the difference (residual) between the actual readings and those predicted by the model could be coded.

Probability models:
1. The simplest statistical model for the source is to assume that each letter that is generated by the source is independent of every other letter, and each occurs with the same probability.
2. We could call this the ignorance model, as it would generally be useful only when we know nothing about the source. The next step up in complexity is to keep the independence assumption, but remove the equal probability assumption and assign a probability of occurrence to each letter in the alphabet.
3. For a source that generates letters from an alphabet A = {a_1, a_2, ..., a_M}, we can have a probability model P = {P(a_1), P(a_2), ..., P(a_M)}.
4. Given a probability model (and the independence assumption), we can compute the entropy of the source using equation (1.14.1):
      H(S) = -Σ_i P(X_i) log2 P(X_i)      ...(1.14.1)
5. We can also construct some very efficient codes to represent the letters in A. Of course, these codes are only efficient if our mathematical assumptions are in accord with reality.

Markov models:
1. One of the most popular ways of representing dependence in the data is through the use of Markov models.
2. For models used in lossless compression, we use a specific type of Markov process called a discrete time Markov chain.
3. Let {x_n} be a sequence of observations. This sequence is said to follow a k-th order Markov model if
      P(x_n | x_{n-1}, ..., x_{n-k}) = P(x_n | x_{n-1}, ..., x_{n-k}, x_{n-k-1}, ...)      ...(1.14.2)
4. In other words, knowledge of the past k symbols is equivalent to the knowledge of the entire past history of the process.
5. The values taken on by the set {x_{n-1}, ..., x_{n-k}} are called the states of the process. If the size of the source alphabet is l, then the number of states is l^k.
6. The most commonly used Markov model is the first-order Markov model, for which
      P(x_n | x_{n-1}) = P(x_n | x_{n-1}, x_{n-2}, x_{n-3}, ...)      ...(1.14.3)
7. Equations (1.14.2) and (1.14.3) indicate the existence of dependence between samples.
8. However, they do not describe the form of the dependence. We can develop different first-order Markov models depending on our assumption about the form of the dependence between samples.
9. If we assumed that the dependence was introduced in a linear manner, we could view the data sequence as the output of a linear filter driven by white noise.
10. The output of such a filter can be given by the difference equation
      x_n = ρ x_{n-1} + ε_n      ...(1.14.4)
    where ε_n is a white noise process. This model is often used when developing coding algorithms for speech and images.
11. The use of the Markov model does not require the assumption of linearity.
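To see the benefit of a Markov (context) model numerically, the sketch below (illustrative, not from the Quantum text) estimates the first-order conditional entropy H(x_n | x_{n-1}) from a sample string and compares it with the single-letter entropy; the sample text is an assumption.

```python
from collections import Counter
from math import log2

text = "ababababcabababab"   # assumed sample with strong dependence between neighbours
n = len(text)

# Single-letter (independent) entropy
letter_counts = Counter(text)
H_iid = -sum((c / n) * log2(c / n) for c in letter_counts.values())

# First-order Markov: entropy of the next letter given the previous letter
pair_counts = Counter(zip(text, text[1:]))
prev_counts = Counter(a for a, _ in pair_counts.elements())
H_cond = 0.0
for (a, b), c in pair_counts.items():
    p_pair = c / (n - 1)               # P(x_{n-1} = a, x_n = b)
    p_b_given_a = c / prev_counts[a]   # P(x_n = b | x_{n-1} = a)
    H_cond -= p_pair * log2(p_b_given_a)

print(f"single-letter entropy: {H_iid:.3f} bits/symbol")
print(f"conditional entropy  : {H_cond:.3f} bits/symbol")  # lower: the context has predictive value
```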
Two-state Markov model for binary images:
1. Consider a binary image. The image has only two types of pixels: white pixels and black pixels.
2. We know that the appearance of a white pixel as the next observation depends, to some extent, on whether the current pixel is white or black.
3. Therefore, we can model the pixel process as a discrete time Markov chain.
4. Define two states S_w and S_b (S_w corresponds to the case where the current pixel is a white pixel, and S_b corresponds to the case where the current pixel is a black pixel).
5. We define the transition probabilities P(w|b) and P(b|w), and the probability of being in each state, P(S_w) and P(S_b). The Markov model can then be represented by the state diagram shown in Fig. 1.14.1.

   Fig. 1.14.1: A two-state Markov model for binary images.

6. The entropy of a finite state process with states S_i is simply the average value of the entropy at each state:
      H = Σ_i P(S_i) H(S_i)      ...(1.14.5)
7. For our particular example of a binary image,
      H(S_w) = -P(b|w) log2 P(b|w) - P(w|w) log2 P(w|w)
   where P(w|w) = 1 - P(b|w). H(S_b) can be calculated in a similar manner.
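A numeric sketch of the two-state model above (illustrative, not from the Quantum text): the transition probabilities P(b|w) and P(w|b) are assumed values, and the state probabilities P(S_w), P(S_b) are taken as the stationary probabilities of the chain.

```python
from math import log2

def h2(p):
    """Binary entropy of a two-outcome distribution (p, 1 - p), in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

P_b_given_w = 0.01   # assumed: a white pixel is rarely followed by a black pixel
P_w_given_b = 0.30   # assumed: a black pixel is fairly often followed by a white pixel

# Stationary state probabilities P(S_w), P(S_b) of the two-state chain
P_Sw = P_w_given_b / (P_b_given_w + P_w_given_b)
P_Sb = 1 - P_Sw

H_Sw = h2(P_b_given_w)            # H(S_w), as in the formula above
H_Sb = h2(P_w_given_b)            # H(S_b)
H = P_Sw * H_Sw + P_Sb * H_Sb     # eq. (1.14.5)

print(f"P(S_w) = {P_Sw:.3f}, P(S_b) = {P_Sb:.3f}")
print(f"H = {H:.3f} bits/pixel")  # well below the 1 bit/pixel of an independent, equiprobable model
```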
Composite source model:
1. In many applications, it is not easy to use a single model to describe the source.
2. In such cases, we can define a composite source, which can be viewed as a combination or composition of several sources, with only one source being active at any given time.
3. A composite source can be represented as a number of individual sources S_i, each with its own model M_i, and a switch that selects a source S_i with probability P_i, as shown in Fig. 1.14.2.

   Fig. 1.14.2: A composite source (Source 1, Source 2, ..., Source n feeding a switch).

4. This is an exceptionally rich model and can be used to describe some very complicated processes.

Que 1.15. What is the zero frequency model in Markov models in text compression?
UPTU 2013-14, Marks 05; UPTU 2015-16, Marks 05

Answer
1. As expected, Markov models are particularly useful in text compression, where the probability of the next letter is heavily influenced by the preceding letters.
2. In current text compression literature, the k-th order Markov models are more widely known as finite context models.
3. Consider the word "preceding". Suppose we have already processed "precedin" and are going to encode the next letter.
4. If we take no account of the context and treat each letter as a surprise, the probability of the letter g occurring is relatively low.
5. If we use a first-order Markov model or single-letter context (that is, we look at the probability model given n), we can see that the probability of g would increase substantially.
6. As we increase the context size (go from n to in to din and so on), the probability of the alphabet becomes more and more skewed, which results in lower entropy.
7. The longer the context, the better its predictive value.
8. However, if we were to store the probability model with respect to all contexts of a given length, the number of contexts would grow exponentially with the length of the context.
9. Consider a context model of order four (the contexts are determined by the last four symbols).
10. If we take an alphabet size of 95, the possible number of contexts is 95^4, which is more than 81 million.
11. Context modeling in text compression schemes tends to be an adaptive strategy in which the probabilities for different symbols in the different contexts are updated as they are encountered.
12. However, this means that we will often encounter symbols that have not been encountered before for any of the given contexts (this is known as the zero frequency problem).
13. The larger the context, the more often this will happen.
14. This problem could be resolved by sending a code to indicate that the following symbol is being encountered for the first time, followed by a prearranged code for that symbol.
15. This would significantly increase the length of the code for the symbol on its first occurrence.
16. However, if this situation did not occur too often, the overhead associated with such occurrences would be small compared to the total number of bits used to encode the output of the source.
17. Unfortunately, in context-based encoding, the zero frequency problem is encountered often enough for this overhead to be a problem, especially for longer contexts.

PART-3
Coding: Uniquely Decodable Codes, Prefix Codes.

CONCEPT OUTLINE: PART-3
+ A code is uniquely decodable if the mapping C*: A_S* → A_C* is one-to-one, that is, for all x and x' in A_S*, x ≠ x' implies C*(x) ≠ C*(x').
+ A code C is a prefix code if no codeword w_i is a prefix of another codeword w_j (i ≠ j).

Questions-Answers
Long Answer Type and Medium Answer Type Questions

Que 1.16. Write a short note on coding. What do you understand by uniquely decodable codes?

Answer
Coding:
1. Coding means the assignment of binary sequences to elements of an alphabet.
2. The set of binary sequences is called a code, and the individual members of the set are called codewords.
3. An alphabet is a collection of symbols called letters.
4. The ASCII code uses the same number of bits to represent each symbol. Such a code is called a fixed-length code.
5. If we want to reduce the number of bits required to represent different messages, we need to use a different number of bits to represent different symbols.
6. If we use fewer bits to represent symbols that occur more often, on the average we would use fewer bits per symbol.
7. The average number of bits per symbol is often called the rate of the code.

Uniquely decodable codes:
1. A code is uniquely decodable if the mapping C*: A_S* → A_C* is one-to-one, that is, for all x and x' in A_S*, x ≠ x' implies C*(x) ≠ C*(x').
2. Suppose we have two binary codewords a and b, where a is k bits long, b is n bits long, and k < n. If the first k bits of b are identical to a, then a is called a prefix of b.
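A small sketch (illustrative, not from the Quantum text) that checks the prefix condition from the concept outline above, and brute-force tests unique decodability by encoding all short messages. The two codes used are assumed examples in the style of Code 2 and Code 3 from Que 1.12.

```python
from itertools import product

def is_prefix_code(code):
    """True if no codeword is a prefix of another codeword (i != j)."""
    words = list(code.values())
    return not any(i != j and words[j].startswith(words[i])
                   for i in range(len(words)) for j in range(len(words)))

def looks_uniquely_decodable(code, max_len=6):
    """Brute-force check: distinct short messages must encode to distinct bit strings."""
    seen = {}
    for n in range(1, max_len + 1):
        for msg in product(code, repeat=n):
            enc = "".join(code[s] for s in msg)
            if enc in seen and seen[enc] != msg:
                return False              # two different messages share one encoding
            seen[enc] = msg
    return True                           # no collision found up to max_len symbols

code2 = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}     # not uniquely decodable
code3 = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}  # prefix code

for name, code in [("Code 2", code2), ("Code 3", code3)]:
    print(name, "| prefix code:", is_prefix_code(code),
          "| uniquely decodable (up to 6 symbols):", looks_uniquely_decodable(code))
```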
