Lecture 5 - Signature-Hash-MACNotes
1
Reminder of previous lecture
• Asymmetric encryption
– Difference between symmetric and asymmetric
– ‘One way’ functions (do not confuse with a hash)
– RSA
• Data encryption (Related to factoring problem)
• Euler's Totient, Modular Inverse (Extended Euclidean)
– ElGamal
• Data encryption (Related to discrete logarithm problem)
• Modular Inverse (Extended Euclidean)
– Diffie-Hellman
• Key exchange - not encryption (Discrete logarithm problem)
• Man-in-the-middle attack
2
Number Theory
• From after Problem Set 1 onwards:
– Modular exponentiation (RSA/ElGamal/DH)
– Modular Inverse (RSA/ElGamal)
– Euler's Totient (RSA – but this is just (p-1)(q-1))
3
Today’s Lecture
• Data integrity (and non-repudiation)
– Digital Signatures
– Hash function
– Message Authentication Codes (MACs)
• CILO2 and CILO5
(technology that impacts systems, and security
mechanisms)
4
Digital Signature
Hash Function
Message Authentication Code
5
Integrity
• Can encryption also provide integrity services? Does encrypting
a message prevent:
– Changing part of a message
• Levels of integrity
– Detect (accidental) modification
– Data origin authentication (verify origin/no modification)
– Non-repudiation (only one person generated this message)
You must be able to explain why encryption does not prevent data
from being modified.
Also think about the different levels of integrity you would like to
provide – and in what practical cases they might be useful.
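A minimal sketch of why encryption alone does not prevent modification, assuming a toy XOR stream cipher. The message layout, the random keystream and the helper name `xor_bytes` are illustrative assumptions, not part of the lecture. An attacker who can guess the message format flips ciphertext bits and changes the plaintext without ever learning the key.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

message = b"PAY 0100 TO BOB"
keystream = os.urandom(len(message))       # stands in for a stream cipher's output
ciphertext = xor_bytes(message, keystream)

# The attacker knows the amount sits at bytes 4..7 and guesses it is "0100".
delta = xor_bytes(b"0100", b"9100")
tampered = ciphertext[:4] + xor_bytes(ciphertext[4:8], delta) + ciphertext[8:]

# The receiver decrypts without any error and sees a different amount.
recovered = xor_bytes(tampered, keystream)
```

Decryption succeeds and gives no hint that anything changed, which is why a separate integrity mechanism is needed.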
6
For example, sending payment information between an ATM and the bank
needs only data origin authentication (make sure the message is as it was
generated by the ATM). No need for non-repudiation, as the ATM is not
later going to claim it did not send this message (it belongs to the bank).
6
Electronic Signature
• There is an electronic document to be sent from Alice to Bob.
• Is there a functional equivalence to a handwritten signature?
– Easy for Alice to sign on the document
– But hard for anyone else to forge
– Easy for Bob or anyone to verify
• What do we want a signature to do?
– Establish the origin of a message (data origin authentication)
– Settle later disputes about what was sent and who sent it (non-repudiation)
Study this slide – be familiar with the properties (i.e. easy to sign,
difficult to forge, easy to verify) and the security goals (origin
authentication and non-repudiation)
7
Digital Signature
• Use asymmetric cryptography
• Only one party should be able to sign
– Sign using Alice’s private key (signing key)
– Verify using Alice’s public key (verification key)
[Figure: the message and the private key produce the signature; verification outputs Valid/Invalid]
❑ Only the signer (who has a private key) can generate a valid signature
❑ Anyone (since the corresponding public key is published) can verify if a
signature with respect to a message is valid
Study
8
RSA Signature Scheme
• We look at how a digital signature can be implemented using RSA
• Please remember the following:
– RSA is a very popular algorithm!
– RSA is a cryptosystem that happens to have properties that allow it to be
used for both encryption and digital signature
– Not all digital signature schemes are based on RSA
• For example, DSA (Digital Signature Algorithm) based on ElGamal
– So do not be lazy with terminology
• Signing is not ‘encryption with private key’
• Verifying is not ‘decryption with public key’
• Does not hold for all signature schemes
This is for your interest (but please do not use the terminology
that this slide says not to use – if you say in the exam that you sign
by encrypting data with the private key, I will not give you a mark)
9
RSA Signature Scheme
• Setup:
– n = pq where p, q are large primes (say 512 bits long each)
– ed = 1 mod (p-1)(q-1)
– Signing (Private) Key : d
– Verification (Public) Key : (e, n)
• Signature Generation:
– S = M^d mod n
where M is some message
• Signature Verification:
– If S^e mod n = M, output valid; otherwise, output invalid
• Problem…
– What is the largest message we can sign in this way?
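The setup, signing and verification steps above can be sketched with toy numbers. The primes 61 and 53, the exponent e = 17 and the function names are assumptions chosen for illustration. Real keys use primes hundreds of digits long, and real schemes sign a padded hash rather than M itself.

```python
# Toy parameters only: far too small to offer any security.
p, q = 61, 53
n = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # (p-1)(q-1) = 3120
e = 17                         # public verification exponent
d = pow(e, -1, phi)            # private signing exponent (modular inverse, Python 3.8+)

def sign(m: int) -> int:
    """S = M^d mod n, computed with the private key d."""
    return pow(m, d, n)

def verify(m: int, s: int) -> bool:
    """Valid if S^e mod n equals M, using only the public key (e, n)."""
    return pow(s, e, n) == m

M = 65                         # the message must be an integer smaller than n
S = sign(M)
```

Because everything is reduced mod n, two messages that differ by a multiple of n get the same signature, so this direct method only makes sense for M smaller than n.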
10
Study – you should already be familiar with RSA from PKE lecture –
it is simply a matter of remembering whether the private or public
component is used for signing/verification.
Can we break the message into pieces and sign each one individually?
No, public key crypto is too slow!
10
RSA: Key Length vs. Security Strength
• RSA is inefficient – it gains strength slowly
• RSA-1024 is equivalent to an 80-bit symmetric key
• RSA-2048 is equivalent to a 112-bit key (3DES)
• RSA-3072 is equivalent to 128-bit key (AES)
• RSA-7680 is equivalent to a 192-bit AES key
• RSA-15,360 is required to equal an AES-256 key!
11
Hash Function Motivation
• Consider the RSA Signature Scheme, if M > n, how to sign M?
• Solution: instead of signing M directly, Alice signs a hash of M denoted
by h(M)
– Alice sends M and S = Sign(SK_Alice, h(M)) to Bob
– Bob verifies that Verify(PK_Alice, h(M), S) = valid
• h is called a hash function
• h maps a binary string to a non-zero integer smaller than n
• h(M) is called the message digest
You must be able to explain why a hash is useful (also see previous
slide).
You must also realise that a hash does not by itself provide any
security service.
1. Anyone can generate it, as there is no key (so an attacker can change
the message and update the hash)
2. It only supports other mechanisms (signature, HMAC)
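A hedged sketch of hash-then-sign, reusing toy RSA numbers. The parameters, the function names and the choice of SHA-256 reduced mod n as h are all assumptions for illustration. With such a tiny n the hash is trivially collidable, so this shows only the data flow, not a secure scheme.

```python
import hashlib

p, q, e = 61, 53, 17                       # toy RSA parameters, not secure
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))          # private signing exponent

def h(message: bytes) -> int:
    """Map an arbitrary-length message to an integer smaller than n.
    Reducing SHA-256 mod a tiny n is an illustrative shortcut only."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Alice signs the digest h(M), not M itself."""
    return pow(h(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Bob recomputes h(M) and checks it against S^e mod n."""
    return pow(signature, e, n) == h(message)

msg = b"a message far longer than the modulus could ever hold directly"
sig = sign(msg)
```

Note that nothing secret goes into h: the security comes from the private exponent d, which matches the point that a hash alone provides no security service.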
12
Hash Function
• A cryptographic hash function h(x) should provide
– Two functional properties
• Compression – arbitrary length input to output of small, fixed length
• Easy to compute – expected to run fast
– Three security properties
• One-way – given a hash value y it is infeasible to find an x such that h(x) = y
(also called pre-image resistance)
• Second pre-image resistance – given y and h(y), infeasible to find x ≠ y where h(x) = h(y)
• Collision resistance – infeasible to find any x and y, with x ≠ y, such that h(x) = h(y)
13
Very important slide! You must know the functional and security
properties.
Realise that if you take a message of length x (x>n) and put it into a
hash that maps it to a value of length n, then collisions must exist
mathematically (there are more possible inputs than outputs) – it must
just be infeasible to compute them.
13
If you think about it the other way, finding a collision is easier for an
attacker than breaking the one-wayness, i.e. finding *the* input that
gives a specific output.
13
Hash Function
• Well known hash functions: MD4, MD5, SHA-1, SHA-256
14
Hash Function Security vs. Hash Output Length
• If a hash function is collision resistant, then it is also one-way.
• If the adversary can compromise collision resistance, the adversary may not be able to
compromise one-wayness.
• There is a fixed output length for every collision resistant hash function h.
• To break h against collision resistance using a brute-force attack, the adversary repeatedly
chooses a random value x, computes h(x) and checks whether this hash value is equal to any of the
hash values of all previously chosen random values.
• If the output of h is N bits long, what is the expected number of times that the adversary needs
to try before finding a collision?
15
Read 15-19 together. You must know the basic idea of how to evaluate
the collision resistance of a hash.
You must know when the birthday problem is relevant, and that the
resulting computational effort to find a collision is about 2^(n/2) for an n-bit hash.
Do not confuse this with brute forcing keys, or even finding second
pre-images. If you are given x and h(x) and you need to
find y so that h(y)=h(x), then the birthday problem is not relevant.
15
Pre-Birthday Problem
• Suppose K people in a room
• How large must K be before the probability that someone has
the same birthday as me is 1/2
– Solve: 1/2 = 1 − (364/365)^K for K
– Find K = 253
• This problem is related to collision resistance but not the same.
16
Birthday Problem
• How many people must be in a room before probability is
1/2 that two or more have same birthday?
– 1 − (365/365) · (364/365) · … · ((365−K+1)/365)
– Set equal to 1/2 and solve: K = 23
• Surprising? A paradox? Perhaps not, since we compare all pairs x and y
• K is about sqrt(365)
• This problem is related to collision resistance.
– Question: suppose h’s output is 80 bits long, how many values must
the adversary try before having the probability of compromising
collision resistance be at least 1/2?
17
You do not need to study the probability maths behind this idea,
just remember the final implication and where it is applied.
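For those who want to check the two numbers, a small sketch that solves both problems numerically. The function names are illustrative assumptions; the formulas are the ones on the pre-birthday and birthday slides.

```python
import math

def pre_birthday(days: int = 365) -> int:
    """Smallest K with P(someone shares *my* birthday) >= 1/2,
    from 1/2 = 1 - ((days-1)/days)^K."""
    return math.ceil(math.log(0.5) / math.log((days - 1) / days))

def birthday(days: int = 365) -> int:
    """Smallest K with P(some pair among K people shares a birthday) >= 1/2."""
    p_all_distinct = 1.0
    k = 0
    while p_all_distinct > 0.5:
        p_all_distinct *= (days - k) / days   # person k+1 avoids the first k birthdays
        k += 1
    return k
```

The first function models a second pre-image search (match one fixed target), the second models a collision search (match any pair). By the same square-root rule, an n-bit hash needs roughly 2^(n/2) tries, so about 2^40 for the 80-bit question.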
17
Bruteforce Attack Against the Collision-resistance of a
Hash Function
• Finding collisions of a hash function using Birthday Paradox.
1. Randomly choose K messages m1, m2, …, mK
2. Search for a pair of messages, say mi and mj, such that
h(mi) = h(mj).
If so, one collision is found.
• This birthday attack imposes a lower bound on the size of message
digests.
• E.g. a 10-bit message digest is very insecure, since one collision can be
found with probability at least 0.5 after doing slightly over 2^5 (i.e. 32)
random hashes.
• E.g. a 40-bit message digest is also insecure, since a collision can be found
with probability at least 0.5 after doing slightly over 2^20 (about a million)
random hashes.
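The attack can be tried on a deliberately weak hash. The sketch below assumes a "hash" made by keeping only the first bits of SHA-256, and it hashes a counter instead of truly random messages to keep the run repeatable. Both choices, and the function names, are assumptions for illustration.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int) -> int:
    """First `bits` bits of SHA-256, used as a deliberately short digest."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int):
    """Hash distinct messages until two share a digest.
    Returns the two messages and the number of hashes computed."""
    seen = {}                                  # digest -> message that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated_hash(msg, bits)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg

m1, m2, tries = find_collision(20)
```

For a 20-bit digest the search typically ends after on the order of 2^10 hashes, far fewer than the 2^20 possible outputs. The price is the memory needed to remember every digest seen so far.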
18
Block Ciphers as Hash Functions
• We could use block ciphers as hash functions
– Set H_0 = 0
– compute: H_i = AES_(M_i)[H_(i-1)], i.e. encrypt H_(i-1) using message block M_i as the key
– and use the final block as the hash value
– If the length of the message is not a multiple of the key size,
zero-pad the last segment of the message
– Why should we not do this?
19
If you use a block cipher as a hash, then hash length = block size…
Generally block sizes are not that large – block cipher security is
more proportional to key length. Even AES has a block size of only 128 bits.
If we built a hash from a block cipher, the birthday problem would mean that
finding collisions is much easier than for normal hash functions
(about 2^64 work for a 128-bit output).
19
General Design of Hash Algorithms
• Partition the input message into fixed-sized blocks. (e.g. 512 bits per block)
• The remaining bits of the input are padded with the value of the message length.
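[Figure: Merkle-Damgard: message blocks P0, P1, …, with the final block extended by pad||Len; starting from IV, the function f is applied block by block to give H]
[Figure: Sponge: r bits are absorbed per call to f; the output is H = Z0||Z1||…]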
20
Take the message and split it into input blocks; pad the last block if the
message is not a multiple of the input block length.
Merkle-Damgard (SHA-1/SHA-2)
We first apply f to IV and M1 to give us intermediate
hash value H1 (f being a compression function). We then use H1 and
M2 to generate H2, and so we continue: H_n = f(H_(n-1), M_n)
The final compression function output is then used as the hash value.
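The chaining H_n = f(H_(n-1), M_n) can be sketched as follows. This is not SHA-1 or SHA-2: the compression function f here is a stand-in built from SHA-256, and the all-zero IV, the padding layout and the names are assumptions chosen to keep the example short.

```python
import hashlib

BLOCK = 64                        # block size in bytes (512 bits)
IV = bytes(32)                    # all-zero initial value, chosen for this sketch

def f(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: (32-byte state, 64-byte block) -> 32-byte state."""
    return hashlib.sha256(state + block).digest()

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the message length in bits as 8 bytes."""
    length = (8 * len(message)).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK
    return message + b"\x80" + bytes(zeros) + length

def md_hash(message: bytes) -> bytes:
    data = pad(message)
    state = IV
    for i in range(0, len(data), BLOCK):
        state = f(state, data[i:i + BLOCK])    # H_i = f(H_(i-1), M_i)
    return state                               # the last chaining value is the digest
```

Whatever the input length, the output is always one 32-byte state, which is the compression property from the hash function slide.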
Sponge (SHA-3)
Internal state of length r+c bits.
Absorbing
r bits of input are mixed into the state each round, then the function f
(a fixed permutation in SHA-3) mixes the state.
Squeezing
Take r bits out of the current state (then run it through f again until the
output Z has the desired length).
Sponge is very flexible in performance – the input block length (r)
and the output length can both be adjusted.
Internal state is larger than the output (believed to give better security)
20
What is the main application of cryptographic
hash functions?
21
You must be able to explain how a hash is used in digital signature
scheme.
The signer will calculate the hash of the message and then in the
case of RSA signs the hash (not the message). All signature
schemes use the hash in the signature generation.
Upon receiving the message and signature the receiver will also
compute the hash of the received message. This hash is then used
during signature verification.
If we were using RSA, the signature verification with the public
key will result in the hash the sender calculated. We then compare
this verification hash with the hash we calculated over the received
message, and if they are the same we consider the signature valid.
22
Popular Crypto Hashes
• MD5 – designed by Ronald Rivest
– 128 bit output
– Available at http://www.ietf.org/rfc/rfc1321
• SHA-1 – A US government standard (similar to MD5)
– 160 bit output
– Available at http://www.itl.nist.gov/fipspubs/fip180-1.htm
23
Security Updates of Hash Functions
MD5
• In Aug 2004, Wang et al. showed that it is “easy” to find collisions in MD5. They found many
collisions in a very short time (in minutes)
• http://eprint.iacr.org/2004/199.pdf
SHA-1
• In Feb 2005, Wang et al. showed that collisions can be found in SHA-1 with an estimated
effort of 2^69 hash computations.
– Less than the 2^80 hash computations needed by a birthday attack.
• http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
Impacts
• Hurts digital signatures
• From 2010, NIST recommends the use of SHA-2 for applications requiring collision
resistance
• SHA-1 is still an alternative for some other crypto mechanisms.
• http://csrc.nist.gov/CryptoToolkit/tkhash.html
24
For interest
24
Some Details about Finding Collisions in SHA-1
25
For interest
Why are hash collisions bad? Signatures.
The total Bitcoin network computes about 2^65 hashes per second. If every single
miner worked on finding a collision, it would take only a couple of seconds.
Bitcoin uses SHA-256, and mining is not looking for collisions but for a specific
kind of answer (more of a pre-image problem).
25
SHA-3
26
For interest
26
Message Authentication
27
Message Authentication
• Make sure what is sent is what is received
• Detect unauthorized modification of data
• Example: Inter-bank fund transfers
– Confidentiality is nice, but integrity is critical
• Encryption provides confidentiality (prevents
unauthorized disclosure)
• Reminder! Encryption alone does not assure message
authentication (a.k.a. data integrity)
28
This slide revises some comments from start of the lecture. You must be
able to give me an example where a MAC is useful (where we only need
data origin authentication)
28
MAC
• How MAC Works
– A MAC is a symmetric cryptographic mechanism
– Sender and receiver share a secret key K
1. Sender computes a MAC tag using the message and K; then sends the MAC tag along
with the message
2. Receiver computes a MAC tag using the message and K; then compares it with the
MAC tag received. If they are equal, then the receiver concludes that the message is
not changed
– Note: only sender and receiver can compute and verify a MAC tag
29
We calculate the MAC using a key K. We then send the message and
the MAC. The receiver takes the received message and also
calculates its MAC (since it has the same key K). It then compares it
to the sent MAC. If they are the same, then the message is unmodified.
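The send and verify steps can be sketched with Python's standard `hmac` module. Using HMAC-SHA256 as the MAC algorithm, and the function names `make_tag` and `check_tag`, are assumptions for illustration; any secure MAC fits the same pattern.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the MAC tag over the message with shared key K."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare it with the received one.
    compare_digest avoids leaking, through timing, where the tags differ."""
    return hmac.compare_digest(make_tag(key, message), tag)

K = b"key shared by sender and receiver"
M = b"transfer 100 to account 42"
T = make_tag(K, M)                  # the sender transmits (M, T)
```

A receiver holding K accepts (M, T) but rejects the pair if either the message or the tag was changed in transit.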
29
Message Authentication Code
• Comparison to hash
– Both map an arbitrarily long message to a fixed-length output
– Who can calculate a hash and a MAC? Need a key?
• Comparison to Digital Signature
– Faster to compute: symmetric encryption/hash is faster than signing
– MAC does not provide non-repudiation
• Since both sender and receiver share the same symmetric key, either of them could have generated the tag
• Use digital signature for non-repudiation
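[Figure: Alice computes T = MAC(K, M) and sends M, T. Eve may attempt cryptanalysis on the channel. Bob receives M’, T’, computes T’’ = MAC(K, M’) and checks T’ =? T’’. The key K is shared over a secure channel.]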
30
In a MAC two people have the key needed to generate the MAC (it uses a
symmetric key). This means the receiver could modify the message and
generate a new valid MAC (or the sender could claim that this is possible
and deny having sent the message).
30
A MAC Algorithm
• MAC can be constructed from a block cipher
operated in CBC mode (with IV=0).
• Suppose a plaintext has 4 plaintext blocks P=P0,
P1, P2, P3
• Suppose K is the secret key shared between sender
and receiver.
C0 = E(K, P0),
C1 = E(K, C0 ⊕ P1),
C2 = E(K, C1 ⊕ P2),…
C_(N−1) = E(K, C_(N−2) ⊕ P_(N−1)) = MAC tag
31
Why does a MAC work?
• Suppose Alice has 4 plaintext blocks
• Alice computes the MAC by doing the following operations:
C0 = E(K, P0), C1 = E(K, C0 ⊕ P1),
C2 = E(K, C1 ⊕ P2), C3 = E(K, C2 ⊕ P3) = MAC tag
• Alice sends P0,P1,P2,P3 and MAC tag to Bob
• Suppose Trudy changes P1 to X
• Bob computes
C0 = E(K, P0), C1’ = E(K, C0 ⊕ X),
C2’ = E(K, C1’ ⊕ P2), C3’ = E(K, C2’ ⊕ P3) = MAC tag’ ≠ MAC tag
• Hence, Trudy can’t change MAC tag to MAC tag’ without key K
• Note: The MAC algorithm above may not be secure if the messages
are of variable length.
32
Study
32
The Insecurity of Block Cipher Based MAC
Algorithm
P’ = P1, P2 (with MAC tag T’)
P’’ = P3 (with MAC tag T’’)
– Attack: anyone who has seen (P’, T’) and (P’’, T’’) can forge a message with a correct
MAC tag (P’’’, T’’’) without knowing the MAC key, by setting P’’’ = P1, P2, P3 ⊕ T’ and T’’’ = T’’.
33
How?
P’ = P1, P2:
C0 = E(K, IV ⊕ P1), with IV = 00
T’ = MAC = C1 = E(K, C0 ⊕ P2)
P’’ = P3:
T’’ = MAC = C0 = E(K, IV ⊕ P3)
P’’’ = P1, P2, P3 ⊕ T’:
C0 = E(K, IV ⊕ P1)
C1 = E(K, C0 ⊕ P2) = T’
C2 = E(K, C1 ⊕ (P3 ⊕ T’)) = E(K, T’ ⊕ (P3 ⊕ T’)) = E(K, P3), since T’ = C1
So T’’’ = C2 = T’’
New message is P1, P2, P3 ⊕ T’ with valid MAC T’’’ = T’’
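The forgery can be checked in code. The sketch below assumes a toy stand-in for the block cipher E, built from SHA-256 and truncated to 16 bytes. It is not AES and not invertible, but CBC-MAC only ever uses the encryption direction, and the attack works for any deterministic E. Block contents, the key and the names are illustrative.

```python
import hashlib

BLOCK = 16

def E(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for block cipher encryption (assumption: not AES)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, blocks: list) -> bytes:
    c = bytes(BLOCK)                       # IV = 0
    for p in blocks:
        c = E(key, xor(c, p))              # C_i = E(K, C_(i-1) XOR P_i)
    return c                               # the last block is the MAC tag

K = b"key known only to sender and receiver"
P1, P2, P3 = b"A" * BLOCK, b"B" * BLOCK, b"C" * BLOCK

T1 = cbc_mac(K, [P1, P2])                  # T'  : tag the attacker observed for P'
T2 = cbc_mac(K, [P3])                      # T'' : tag the attacker observed for P''

# The attacker builds the forgery from observed values only, never touching K.
forged_message = [P1, P2, xor(P3, T1)]     # P''' = P1, P2, P3 XOR T'
forged_tag = T2                            # T''' = T''
```

The receiver's own computation of the MAC over the forged message gives exactly T’’, so the forgery is accepted. This is the variable-length weakness noted on the previous slide.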
34
Message Authentication - HMAC
• Message Authentication Code: A = C_K(M)
– M: message
– A: authentication tag
– for integrity and authenticity
• HMAC: Keyed-hashing for Message Authentication
• Used extensively in IPSec (IP Security)
– IPSec is widely used for establishing Virtual
Private Networks (VPNs)
[Figure: HMAC(K, M) = H((K ⊕ opad) || H((K ⊕ ipad) || M))]
Let B be the block length of hash, in bytes (B = 64 for MD5 and SHA-1)
ipad = 0x36 repeated B times
opad = 0x5C repeated B times
35
Study
Hash functions are generally seen as being quite efficient and fast
– and they also give good output lengths. It has therefore been
proposed that we build MAC out of hash rather than symmetric
encryption.
K ⊕ opad and K ⊕ ipad are just a fancy way of saying you should use
two different keys (although they could be derived from each
other)
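A minimal sketch of the HMAC construction built directly from a hash, checked against Python's standard `hmac` module. Choosing SHA-256 as H (its block length B is also 64 bytes) and the function name `hmac_sha256` are assumptions for illustration.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    B = 64                                      # block length of SHA-256 in bytes
    if len(key) > B:                            # over-long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(B, b"\x00")                 # then zero-padded to B bytes
    k_ipad = bytes(k ^ 0x36 for k in key)       # K XOR ipad
    k_opad = bytes(k ^ 0x5C for k in key)       # K XOR opad
    inner = hashlib.sha256(k_ipad + message).digest()
    return hashlib.sha256(k_opad + inner).digest()
```

In practice one would call `hmac.new(key, message, hashlib.sha256)` rather than reimplement it; the point here is only to show the two nested hash calls with the two derived keys.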
35
What about integrity and
confidentiality?
• How should we go about providing both?
• We can encrypt the data and do a MAC
36
Interest only
For MAC-then-encrypt, the argument is that the MAC is calculated on
the plaintext and then encrypted along with the plaintext. This is
argued to be more in line with the purpose of a MAC…
However, if you were interested in detecting modification during transmission,
you would have to completely decrypt the data AND then
verify the MAC before knowing whether something was modified.
Depending on the implementation, this could also lead to padding oracle
attacks. As such, encrypt-then-MAC is generally considered more secure.
36
Padding Oracle
Interest only
Proposed in 2002.
In our example:
If the last byte of P2 is 0x01, the implementation considers the
padding valid. But the last byte of P2 is the last byte of C1
XORed with the last byte of D(C2). So the attacker can set the last byte of C1 to
any value and send the new message C0, C1’, C2 to the recipient. If no
error comes back, the attacker knows that the last byte of D(C2) XORed with his
changed byte in C1’ is equal to 0x01, so he learns the last byte of D(C2) and,
by XORing it with the original last byte of C1, the true value of the last byte of P2.
The attacker can then fix the last byte of C1’ so that the last plaintext byte
becomes 0x02, and start guessing the second-last byte of C1’ until the last two
bytes of C1’ XOR D(C2) are 0x02 0x02. This allows the attacker to recover one byte
in at most 256 tries, and the entire 16-byte plaintext block in at most 16x256 tries.
More info
https://en.wikipedia.org/wiki/Padding_oracle_attack
https://en.wikipedia.org/wiki/Lucky_Thirteen_attack
PKCS7 padding (N bytes of padding, each with value N)
1 byte 0x01
2 bytes 0x02 0x02
3 bytes 0x03 0x03 0x03
Etc.
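A short sketch of PKCS7 padding and the validity check that a padding oracle exploits. The function names and the 16-byte default block size are assumptions for illustration.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size     # always between 1 and block_size
    return data + bytes([n]) * n

def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip the padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size:
        raise ValueError("invalid length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        # If an attacker can observe this error, it acts as the padding oracle.
        raise ValueError("invalid padding")
    return padded[:-n]
```

A message that already fills a whole block still gets a full extra block of padding, so the receiver can always tell padding from data.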
37
Authenticated Encryption
• Encrypt and then MAC
– Two encryption operations per block
• Use mode of operation providing both
– Example: Galois/Counter Mode
– Encrypt, XOR, multiply per block
Figure: Wikipedia
38
Additional Signature Schemes
• Same basic RSA equations can be used for
encryption and signature.
– Not so for ElGamal (as seen in Lecture 4)
39
ElGamal Signature
40
Digital Signature Algorithm (DSA)
41
The end!
?
Any questions…
42