TRAITOR TRACING FOR ANONYMOUS ATTACK IN CONTENT PROTECTION

Hongxia Jin

IBM Almaden Research Center, San Jose, CA, U.S.A.

Keywords: Traitor Tracing, Content Protection, Anti-piracy, Anonymous Attack, Broadcast Encryption.

Abstract: In this paper we take a closer look at traitor tracing in the context of content protection, especially for the anonymous attack, in which the attackers pirate the content and re-distribute the decrypted plain content. Traitor tracing is a forensic technology that, once pirated copies are recovered, can identify the original users (called traitors) who participated in the pirate attack and were involved in constructing the pirated copy of the content. In the current state of the art, a traitor tracing scheme assumes a maximum coalition size of traitors in the system and is defined to detect one traitor, on the assumption that the detected traitor can be disconnected and tracing simply repeats with the remaining traitors. In this position paper we argue that this definition does not sufficiently reflect the reality in which a traitor tracing technology is used to defend against piracy, especially in the context of content protection. We believe a traitor tracing scheme should deduce the active coalition size and should be defined to detect all active traitors, even taking into consideration that found traitors need to be technically disabled. We believe the traditional definition is misleading for the design of efficient and practical traitor tracing schemes, while our definition fits reality much better and can lead to the design of efficient traitor tracing schemes for real-world use.

1 INTRODUCTION

Content protection for copyrighted materials is all about making sure the materials are accessible only to users who are authorized to access them. Pirate attackers want to bypass the restrictions and enable illegal access to the content. Business scenarios include pay-TV systems, Netflix, and the mass distribution of physical media such as DVDs. The success of these business scenarios hinges on the ability to ensure that the content is accessible only to authorized (paying) customers. Indeed, piracy is one of the biggest concerns in the entertainment industry, because digital copies are perfect copies.

In a broadcast encryption (Fiat and Naor, 1993) based content protection system, each device (also called a decoder or user) is assigned a unique set of decryption keys (called device keys). The system defines a key management scheme that assigns keys to devices and encrypts the content so that only compliant devices can decrypt it, without requiring authentication of the device. Furthermore, because the distributed content is often large (a movie, for example, is about 2 GB), hybrid encryption is usually used to save space or bandwidth during distribution. A content-encrypting key (sometimes called the media key or title key) is chosen at random to encrypt the content. The media key itself is then encrypted again and again, once under each compliant device key, while the non-compliant device keys are used to encrypt only a garbage string instead of the valid media key. The resulting block of encrypted media keys is placed as a header in the distribution package and distributed together with the encrypted content. During playback, a compliant device uses one of its valid device keys to decrypt the header and obtain a valid media key, which in turn allows it to decrypt the content. A non-compliant device can decrypt only garbage and therefore cannot decrypt the content.

There are different types of pirate attacks on the above content protection system. Every pirate attack enables someone to access content illegally.

1. Pirates use compromised device keys to build a

clone pirate decoder.

2. Pirates re-distribute content encrypting key (me-

dia key): pirates stay anonymous.

3. Pirates re-distribute decrypted content: pirates

stay anonymous.

When pirate evidence is found, forensic analysis can allow one to find the source from which the pirate copies come. In the literature, the forensic technology used to defend against piracy is termed "traitor tracing". The source devices (users) that were involved in the


Jin H. (2008).

TRAITOR TRACING FOR ANONYMOUS ATTACK IN CONTENT PROTECTION.

In Proceedings of the International Conference on Security and Cryptography, pages 331-336

DOI: 10.5220/0001929003310336

 SciTePress

construction of the pirated copies are called traitors. For different types of pirate attacks, different types of traitor tracing schemes are needed.

In the first pirate attack, the secret device keys are extracted from one or more compromised devices and put into a clone device that can decrypt the encrypted content. When a clone device is captured, traitor tracing (Chor et al., 1994; Naor et al., 2001) is done by feeding forensic test material into the clone. In each forensic test, the compliant device keys are partitioned into two subsets. Device keys in one subset are used to encrypt the valid media key, while device keys in the other subset are used to encrypt garbage, and the results are put into the header. Observing the outcome of each forensic test, namely whether the clone plays the content or not, gives us information about which keys are inside the clone device. The traceability is defined to be the number of tests needed to detect traitors.
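The forensic test on a clone can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the XOR "cipher" built from SHA-256 is a stand-in and not a secure encryption scheme, the names `build_header`, `clone_plays` and `find_one_compromised_key` are hypothetical, the clone is assumed to play whenever any of its keys yields the valid media key, and the halving strategy is just one simple way to choose the two subsets.

```python
import hashlib
import os


def _pad(key: bytes, n: int) -> bytes:
    # Toy keystream derived from a device key (NOT a secure cipher).
    return hashlib.sha256(key).digest()[:n]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def build_header(device_keys, enabled, media_key):
    """Encrypt the media key under enabled keys and garbage under the rest."""
    header = {}
    for key_id, key in device_keys.items():
        payload = media_key if key_id in enabled else os.urandom(len(media_key))
        header[key_id] = _xor(payload, _pad(key, len(media_key)))
    return header


def clone_plays(clone_keys, header, media_key):
    """Assumed clone behavior: it plays if any held key recovers the media key."""
    return any(
        _xor(header[key_id], _pad(key, len(media_key))) == media_key
        for key_id, key in clone_keys.items()
    )


def find_one_compromised_key(device_keys, clone_keys, media_key):
    """Halve the candidate set per test until one key inside the clone remains."""
    candidates = sorted(device_keys)
    tests = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        header = build_header(device_keys, set(half), media_key)
        tests += 1
        if clone_plays(clone_keys, header, media_key):
            candidates = half            # the clone holds a key in this half
        else:
            candidates = candidates[len(candidates) // 2:]
    return candidates[0], tests


device_keys = {i: os.urandom(16) for i in range(16)}
clone_keys = {i: device_keys[i] for i in (5, 11)}   # keys extracted by pirates
media_key = os.urandom(16)
found, tests = find_one_compromised_key(device_keys, clone_keys, media_key)
```

In this toy run the number of play/not-play tests is the quantity that the text calls traceability. A real clone may behave adversarially, which this sketch does not model.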

In an anonymous attack, the attackers decrypt the header and obtain the valid media key (or title keys) to decrypt the content. They can sell the decrypted content or serve the content-decrypting keys on demand. In a traitor tracing scheme (Safavi-Naini and Wang, 2003; Jin et al., 2004) that defends against the anonymous attack, content is encrypted differently for different devices. For example, content may be differently watermarked and encrypted. Of course, preparing a different version for each user is too costly. How to prepare and distribute the content space-efficiently is outside the scope of this paper; see (Jin et al., 2004). What is relevant here is that different devices will decrypt and play back the content differently. Recovering the decrypted content or the content-encrypting keys enables one to identify which keys the attackers have. The traceability of a traitor tracing scheme for the anonymous attack is defined to be the number of recovered pirated copies of content or content-encrypting keys needed to detect traitors.

In the state of the art, a traitor tracing scheme is defined as a scheme that can detect at least one traitor when the number of traitors in the system is at most a certain maximum. The efficiency of a traitor tracing scheme is defined to be the number of pieces of recovered forensic evidence needed to detect one traitor. Such schemes also assume that the discovered traitor can be disconnected in some way, and that tracing can simply be repeated for the remaining traitors.

We believe this definition is inadequate and does not capture the reality of a traitor tracing scheme in actual use. In fact, this definition does not help in designing an efficient traitor tracing scheme. In our opinion, a traitor tracing scheme should be defined to probabilistically detect all active traitors involved in the pirate attack, as well as to deduce the number of active members in the coalition. The detection should also take into consideration that some discovered traitors will be technically disabled. We argue this in the context of the anonymous attack, even though similar points can be made for other attacks as well.

2 TRADITIONAL TRAITOR TRACING

As one can imagine, a traitor tracing scheme is all about how to assign the tracing keys (or content versions) to the devices and how to perform efficient forensic analysis after recovering forensic evidence such as keys and content versions.

For example, in traitor tracing for anonymous at-

tack, content is differently watermarked and differ-

ently encrypted for different users. So, a traitor trac-

ing scheme for anonymous attack usually consists of

two basic steps:

1. Assignment Step: Assign different keys and con-

tent versions to devices,

2. Traitor/coalition Detection Step: Based on the re-

covered pirated content/keys, trace back to the

traitors.

Traitor tracing schemes in the literature have mostly focused on the assignment step. The actual detection algorithm is simple and straightforward: take the sequence of recovered forensic evidence, be it pirated copies of keys or content versions, and score every device by how many of the recovered copies match what that device was assigned. The highest-scoring device is incriminated. Traitors are therefore detected one by one. But why not detect every member of the coalition together? The classic one-by-one method has some obvious advantages:

1. It seems easier to detect one traitor at a time.

2. The identiﬁed traitor can always be disconnected,

which potentially makes it faster to detect the sec-

ond traitor given that the coalition size is smaller

after disconnection of the previously detected

traitor.

3. It seems much more complicated to detect guilty coalitions than individuals, because the number of coalitions is much bigger than the number of individuals. In fact, the number of coalitions of size t among N users is about N^t/t!, which grows exponentially with the coalition size. For example, if there are 1,000,000,000 devices/users in the world, there

SECRYPT 2008 - International Conference on Security and Cryptography


are roughly 500,000,000,000,000,000 pairs of devices (i.e., coalitions of size 2).
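The score-and-incriminate detection described before the list can be sketched as follows. This is a minimal illustration, not the implementation of any published scheme: the data layout (a per-device list of assigned variations, and recovered evidence as `(content_index, variation)` pairs) and the function names are assumptions made for the example.

```python
def score_devices(assignments, recovered):
    """Count, per device, how many recovered copies match its assignment.

    assignments: dict mapping device id -> list of variation ids, one per content
    recovered:   list of (content_index, variation_id) pairs seen in pirate copies
    """
    return {
        device: sum(1 for index, variation in recovered if versions[index] == variation)
        for device, versions in assignments.items()
    }


def incriminate_highest(assignments, recovered):
    """Classic one-by-one detection: pick the single highest-scoring device."""
    scores = score_devices(assignments, recovered)
    suspect = max(scores, key=scores.get)
    return suspect, scores


# Hypothetical assignment of four contents to three devices.
assignments = {
    "A": [0, 1, 2, 3],
    "B": [0, 2, 2, 1],
    "C": [3, 1, 0, 3],
}
recovered = [(0, 0), (1, 1), (2, 2), (3, 3)]
suspect, scores = incriminate_highest(assignments, recovered)
```

Here device A matches all four recovered copies while B and C match two each, so A is incriminated. The sketch makes no attempt to bound the chance that an innocent device tops the ranking.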

A further popular assumption is that once a traitor is detected, he can be disconnected and tracing can simply be repeated with the remaining traitors. This is sometimes true, when legal means can be taken to actually disable the traitorous device.

Moreover, the research community defines traitor tracing schemes to detect a traitor when the size of the coalition is at most some threshold. This is because the actual coalition size is usually unknown.

Indeed, the first traitor tracing scheme was defined in (Chor et al., 1994) to detect one traitor in coalitions up to a maximum size, and this is still the popular definition in use today. Tracing may be either deterministic or probabilistic. Probabilistic tracing tries to make the probability of incriminating an innocent user as small as possible. Deterministic tracing can be seen in the popular approach called the "traceability code".

A traceability code is one of the traditional approaches that incriminate the highest-scoring device, i.e., the device whose codeword is at the smallest Hamming distance from the pirated copy. Indeed, a traceability code enables one to decode to the nearest neighbor of a pirate codeword, and the nearest neighbor is deterministically a traitor.

Lemma 2.1. (Staddon et al., 2001) Assume that a code C with length n and distance d is used to assign the symbols for each segment to each user, and that there are t traitors. If code C satisfies

$$d > \left(1 - \frac{1}{t^2}\right) n, \qquad (1)$$

then C is a t-traceability code.

We raise the following questions for arguments.

1. Is one-by-one detection really efﬁcient?

2. When coalition size is unknown, is deterministic

tracing even possible?

3. Can we deduce the active coalition size?

4. Is it reasonable to assume that the traitor can always be disabled in a non-technical way?

5. Can we simply repeat tracing after disabling

traitors?

3 OUR TRAITOR TRACING

We believe the above traditional deﬁnition of a traitor

tracing scheme does not lead to the design of an efﬁ-

cient and practical scheme for real world use. In our

opinion, a traitor tracing scheme should be deﬁned to

ﬁnd all active traitors probabilistically and also de-

duce the active coalition size. In this section we will

go through the above questions and argue why our po-

sition stands.

3.1 Is One-by-one Detection Efﬁcient?

Notice that the ultimate goal of traitor tracing is to find all active traitors and disable them. While intuitively it seems easier and more efficient to detect one traitor at a time, we make a counter-intuitive observation: it is easier to find the entire coalition than to sequentially find one individual traitor, disable him, and find another one.

While the number of coalitions of a given size is far larger than the number of individual users, it turns out to be much less likely that a coalition appears by random chance than that an individual user randomly has a high score. An example can informally illustrate the underlying idea.

Suppose there are 4 people involved in a colluding attack, and we have a random sequence of 20 recovered pirated copies of keys or content. Each key/content originally has 256 variations, of which a given user (device) knows only 1. The attackers want high scores to be explainable by chance. If the four attackers use round robin, each guilty user will score exactly 5. Can we incriminate any user that shares 5 copies with the recovered sequence? No: there will be about 15 completely innocent users scoring 5 or greater due to chance alone. What can one do then? One has to recover more pirated copies of keys/content before any user can be incriminated.

However, the above 4 guilty users together can explain all the movies in the sequence. What is the chance that some coalition of size 4 has all the variations in the sequence? The answer is roughly 0.04. In other words, while there are plenty of users that can explain 5 movies, it is unlikely that any four of them can "cover" all twenty movies. If we find four users that do cover the sequence, it is unlikely that this happened by chance. It is more likely that some devices in the coalition are indeed guilty.
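The two chance estimates in this example can be reproduced approximately with a short calculation. The sketch below rests on assumptions that the text does not state explicitly: variations are assigned to users independently and uniformly at random, and the system holds one billion users (the population used in the comparison later in this section). The function names are hypothetical.

```python
from math import comb


def prob_score_at_least(threshold, copies, variations):
    """Binomial tail: chance that one innocent user matches >= threshold copies."""
    p = 1.0 / variations
    return sum(
        comb(copies, j) * p**j * (1 - p) ** (copies - j)
        for j in range(threshold, copies + 1)
    )


def expected_innocent_high_scorers(users, threshold, copies, variations):
    """Expected number of innocent users reaching the threshold by chance."""
    return users * prob_score_at_least(threshold, copies, variations)


def expected_covering_coalitions(users, size, copies, variations):
    """Expected number of random coalitions of the given size covering every copy."""
    # Chance that at least one of `size` members holds the recovered variation.
    p_cover_one = 1.0 - (1.0 - 1.0 / variations) ** size
    return comb(users, size) * p_cover_one**copies


USERS = 10**9  # assumed population
lone = expected_innocent_high_scorers(USERS, 5, 20, 256)
group = expected_covering_coalitions(USERS, 4, 20, 256)
```

Under these assumptions `lone` comes out in the low tens and `group` at a few hundredths, the same orders of magnitude as the figures quoted in the example. The exact values depend on the assignment model, so they should not be read as the paper's own computation.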

Based on this counter-intuitive observation, we believe a more efficient forensic analysis for traitor detection is to detect the entire coalition that can explain all recovered pirated copies, and then to filter out any innocent users from the found coalition. See (Jin et al., 2008) for more details.

Now let us do a concrete comparison. Suppose

each content/key comes with 256 variations and there

are 1 billion devices in the system. The traditional


tracing based on Formula 1 can deterministically identify ONE traitor in a coalition of nine after recovering 256 pirated content/keys, and it takes a similar number of pirated copies to detect the second and each subsequent traitor. In contrast, for the same coalition size, the new algorithm based on detecting the entire coalition can detect all active traitors using only 56 pirated content/keys, and the false positive rate can be as low as 0.0001%.

Of course, the attackers may use a scapegoat strategy, in which some device is used heavily, scoring, for example, 9 or 10. The traditional approach can correctly identify that device, but it has difficulty finding the lightly used devices and the true coalition size. The new tracing can nonetheless find the other members of the coalition.

Again, the ultimate goal is to detect and disable all traitors as fast as possible. We believe the traditional traitor tracing definition does not lead to the design of a tracing scheme that achieves this goal efficiently.

3.2 Assume Coalition Size vs. Deduce Coalition Size? or Deterministic vs. Probabilistic?

We believe it is not practical to assume a maximum coalition size and perform deterministic tracing based on the assumed coalition size, because the tracing agency rarely knows exactly how many devices are involved in the attack. As a result, the answers it gets are always qualified. For example, an answer might be as follows: "If N devices are involved, it must be exactly these N. However, different innocent coalitions of N + M devices may have produced the same result." We will walk readers through a simple example to show how the actual tracing is done based on forensic evidence.

Suppose each content/key comes with 256 variations and there are 255 content/keys in the sequence. Each device is thus assigned 255 content/keys, with one variation of each content/key. The assignment can be done using a systematic approach such as an error-correcting code, for example, a Reed-Solomon code. This approach can guarantee that any two users differ in at least 252 content/key assignments. This assignment can support 1 billion devices in the system. For any given content, $1/256$ of the devices (about 4 million devices) encode the content the same way. For any given three contents, only $1/256^3$ of the devices (about 60 players) encode those contents the same way. For any given four contents, no two devices encode the contents the same way. That is the essential property of the Reed-Solomon code assignment.
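The numbers in this paragraph can be checked with a little arithmetic. The sketch below assumes a Reed-Solomon code of length 255 and distance 252 over a 256-symbol alphabet, so that its dimension is k = n - d + 1 = 4 (Reed-Solomon codes meet the Singleton bound). It also assumes that the bound in Lemma 2.1 reads d > (1 - 1/t^2) n, which is the form that reproduces the coalition size of nine quoted in Section 3.1. All names are hypothetical.

```python
def max_traceable_coalition(n, d):
    """Largest t with d > (1 - 1/t^2) * n, tested in integers as d*t^2 > (t^2 - 1)*n."""
    if not 0 < d < n:
        raise ValueError("expected 0 < d < n")
    t = 1
    while d * (t + 1) ** 2 > ((t + 1) ** 2 - 1) * n:
        t += 1
    return t


Q, N, D = 256, 255, 252          # variations, contents in the sequence, distance
USERS = 10**9

K = N - D + 1                    # dimension of the assumed Reed-Solomon code
codewords = Q**K                 # distinct assignments available

# Average number of devices sharing the same variations on m given contents.
sharing = {m: USERS / Q**m for m in (1, 2, 3)}

t_max = max_traceable_coalition(N, D)
```

With these parameters `codewords` exceeds one billion, `sharing[1]` is about 3.9 million, `sharing[3]` is about 60, and `t_max` is 9, in line with the figures used in the text.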

Let us take the case of an attack where only a single device X is being used. After recovering a single content/key, the license agency has four million devices that are potential candidates, including X. After the second or third recovered pirated content/key, the number of candidates is reduced, but it is not until the fourth pirated content/key is recovered that the guilty device X is positively identified, BUT only if it is known that only a single device is involved. Millions of pairs of devices could also have produced the four pirated content/keys.

By the time nine pirated content/keys have been recovered, the license agency knows there are no possible innocent pairs of devices. (By "innocent", we mean a pair that does not include the actual guilty device X.) An innocent pair could have produced at most six of the pirated content/keys. However, an innocent triplet picked at random could have produced all nine pirated content/keys, with each member of the triplet having three content/keys in common with the guilty device. The number of such triplets is

$$\binom{9}{3} \cdot 60 \cdot \binom{6}{3} \cdot 60 \cdot \binom{3}{3} \cdot 60.$$

Among all the $\binom{1{,}000{,}000{,}000}{3}$ triplets, the probability that a triplet picked at random is in the above set is roughly 2 in $10^{18}$. If the licensing agency is willing to assign a priori probabilities to the different numbers of attackers, and assuming that the attackers cannot deduce the code and therefore must act randomly, the license agency can perform a Bayesian analysis and conclude, based on the observed result, what the probability is that the indicated device X is, in fact, guilty.
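The triplet estimate can be verified numerically. The sketch assumes the count is built as in the text: the nine recovered items are split three ways (3 of 9, then 3 of the remaining 6, then the last 3), and each group of three items is shared by about 60 devices. The pool size of 60 and the population of one billion come from the example above; the variable names are hypothetical.

```python
from math import comb

POOL = 60            # devices sharing any given three items (from the example)
USERS = 10**9

# Ways to split the nine items into three groups of three, times the
# number of candidate devices for each group.
innocent_triplets = (
    comb(9, 3) * POOL
    * comb(6, 3) * POOL
    * comb(3, 3) * POOL
)

all_triplets = comb(USERS, 3)
probability = innocent_triplets / all_triplets
```

The product counts a triplet once for each way its members can be matched to the three groups, so it is a rough count rather than an exact number of unordered triplets. Even so, `probability` comes out at about 2 in 10^18.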

One important caveat is traditionally addressed by

deﬁning the tracing problem to be ﬁnding a single

member of the coalition, not ﬁnding the exact mem-

bership of the coalition. So the Bayesian analysis

really reveals the probability that device X is an at-

tacker, not that he is the sole attacker.

As one can see from the above simple example,

without knowing the actual coalition size, the tracing

has to be probabilistic. During tracing, every time a

pirated copy is recovered, it increases the probability

that the suspect device is actually guilty. That is the

nature of the tracing when the actual coalition size

is unknown. From the example above, we can also

see during the process of ﬁguring out the traitorous

devices, it is possible to deduce the size of the active

members in the coalition without having to assume

the maximum coalition size. We believe performing

probabilistic tracing and deducing the active coalition

size much better ﬁts the real world scenarios.


3.3 Disable Found Traitors and Continued Tracing?

Traditional traitor tracing is defined as finding at least one traitor, even if there may exist a coalition. It assumes that this traitor can be disconnected in some way; if there is still piracy, the same scheme is simply repeated. As we mentioned earlier, while it is sometimes possible to use legal means to disable traitors, it is not always possible to do so. In fact, technical means are often necessary during the lifetime of a traitor tracing system to help disable traitors. In addition, we also challenge the assumption that tracing can simply be repeated with the remaining traitors after the found traitors are disabled.

So, how does one technically disable a device that is found to be a traitor? For the anonymous attack, we know each content is differently watermarked and encrypted. Each device is assigned a sequence of keys, one key for each content. To disable the traitorous device, one can render the compromised keys unusable for future content. But we know many devices might share a single compromised key. Therefore, revoking a single key on its own is not an option: we must revoke the unique set of keys assigned to the device being revoked. Furthermore, to make this work, one must make sure that no two devices have many keys in common, so that even if the system has been heavily attacked and a significant fraction of the keys in the system is compromised, all innocent devices will still have many keys that are not compromised.

For example, suppose each device is assigned a sequence of keys (called tracing keys) from a matrix, exactly one key per column. Each key enables the device to decrypt one content, repeating from the beginning when the end of the column is reached. The role of the sequence of keys assigned to each device in this system is similar to the media key discussed above in a broadcast encryption system. In order to disable a traitorous device, we use Tracing Key Blocks (TKB for short). A TKB is generated by the license agency and distributed together with new content in the future. The purpose of the Tracing Key Block is to give all innocent devices a column they can use to calculate the correct key to decrypt the content, while at the same time preventing compromised devices (which have compromised keys in all columns) from getting to the correct answer. In a TKB there are actually many correct answers, one for each variation of the content. Let us simply call those answers output keys.
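The revocation idea behind a TKB can be modeled very roughly as follows. This is a simplified sketch under assumptions, not the TKB format of (Jin and Lotspiech, 2007): each device holds one row index per column of the key matrix, revoking a device marks all of its (column, row) cells as compromised, and a device can use the first column in which its own cell is not compromised. The function names and the tiny three-column matrix are hypothetical.

```python
def revoked_cells(assignments, revoked_devices):
    """All (column, row) cells held by any revoked device."""
    return {
        (column, row)
        for device in revoked_devices
        for column, row in enumerate(assignments[device])
    }


def first_usable_column(assignments, device, compromised):
    """First column where the device still holds an uncompromised tracing key."""
    for column, row in enumerate(assignments[device]):
        if (column, row) not in compromised:
            return column
    return None  # every key of this device is compromised: device is disabled


# Hypothetical assignment: one row index per column for each device.
assignments = {
    "A": (0, 1, 2),
    "B": (0, 2, 1),   # shares the column-0 key with A
    "C": (1, 1, 0),   # shares the column-1 key with A
}

compromised = revoked_cells(assignments, ["A"])
usable = {d: first_usable_column(assignments, d, compromised) for d in assignments}
```

After device A is revoked it has no usable column, while B and C each lose one shared key but still find a column to use. This illustrates why devices must not share many keys.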

Now after some traitorous devices are disabled in

Tracing Key Blocks, can we simply repeat the same

tracing process as before for the remaining traitors?

The answer is not trivial.

Figure 1: It is possible that the new TKB will not provide new tracing information for continued tracing. [Diagram not reproduced; its surviving labels are: unconditional (1), conditional (2), conditional (3), conditional (4), hdr, link.]

As we know, each column in a TKB contains an encryption of an output key in every uncompromised tracing key's cell. More precisely, every uncompromised tracing key's cell in each column contains an encryption of one of the different correct output keys $D_i$ ($0 \le i < k$), where k is the number of different correct answers in the system. Notice that the same set of output keys is encrypted in each column, although the keys are distributed differently in different columns. A particular output key $D_i$ can be obtained from any column in the TKB. A compliant (good) device will process the TKB and obtain a correct output key from the first column in the TKB in which it has a non-revoked key. However, when there are still attackers in a coalition that have not been detected, the coalition can mix and match their revoked and non-revoked keys when processing the TKB. In turn, they have multiple ways to process the TKB and get a valid output key to play back the content. They can choose the column in which they use a non-revoked key to get a valid output key; it does not have to be the first column. Moreover, different keys can be used in different columns to obtain the same variant $D_i$. When the license agency observes a pirate copy corresponding to a particular output key $D_i$, since it can be obtained from any column, the license agency has no way of knowing exactly which key was used to obtain that output key. The entire path that the undetected traitors take to process the TKB can even look like the path of an innocent device, or a path that was never assigned to any device, and is thus untraceable.

Figure 1 illustrates the issue discussed here. Keep in mind that the output key has multiple valid versions. If the attackers combine the revoked keys with the keys that have not been detected, it is not always possible to know at which column the TKB processing ended to yield a valid key.

To force the undetected traitors to reveal the keys

they use when processing TKB, we must make sure


each column gets different variations, so that when a key/content variation is recovered, the scheme knows which column it comes from. Only by observing that can the tracing scheme continue tracing. Unfortunately, that means the q variations have to be distributed among the columns contained in the TKB. Each column effectively gets only q/c variations, where c is the number of columns in the TKB. It is clear that traceability degrades when the effective q decreases. When the number of columns c becomes big enough, the traceability degrades so much that the scheme basically becomes untraceable. The scheme is overwhelmed and broken in that case. As a result, this puts a limit on the revocation capability of the scheme. See (Jin and Lotspiech, 2007) for more details.
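One way to get a feel for this degradation is to reuse the chance-score estimate from the example in Section 3.1 with q/c effective variations. This is an illustration chosen for this sketch, not the traceability measure of the cited scheme. It assumes 20 recovered copies, a score threshold of 5, one billion users, q = 256, and independent uniform assignment; the function name is hypothetical.

```python
from math import comb


def expected_chance_scorers(users, threshold, copies, variations):
    """Expected number of innocent users matching >= threshold copies by chance."""
    p = 1.0 / variations
    tail = sum(
        comb(copies, j) * p**j * (1 - p) ** (copies - j)
        for j in range(threshold, copies + 1)
    )
    return users * tail


Q = 256  # total variations per content
# Effective variations per column when the q variations are split over c columns.
table = {
    c: expected_chance_scorers(10**9, 5, 20, Q // c)
    for c in (1, 2, 4, 8)
}
```

In this model the expected number of innocent users reaching the threshold grows by orders of magnitude as c goes from 1 to 8, so many more recovered copies would be needed before anyone could be incriminated.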

As one can see, the challenge here is to make sure the newly released TKB can continue to provide tracing information to the license agency to enable continued tracing. Unfortunately, this is not always possible. In fact, a new TKB often provides less or even no tracing information to the license agency for future traitor detection after the previous traitors are disabled. The simple assumption in the traditional traitor tracing definition, that tracing can simply be repeated with the same traceability, does not hold.

In our opinion, when considering a complete cycle, a traitor tracing scheme contains the following three steps instead of the two steps defined in Section 2:

1. Assignment step: Assign versions of the con-

tent/key to currently known innocent devices

2. Forensic Analysis step: Based on the recovered

forensic evidences (i.e., pirated content/keys),

trace back to the traitors

3. Revocation step: loop to step 1 but exclude the

currently discovered traitors, in other words, as-

sign garbage to detected traitorous devices

The traceability of a complete traitor tracing sys-

tem should be deﬁned to be the traceability to detect

all traitors in the system, including after revocation.

4 CONCLUSIONS

In this paper, we have examined what a good definition of a traitor tracing scheme should be, one that can lead to the design of efficient traitor tracing schemes. Traditionally, a traitor tracing system has been defined to find one traitor, assuming a maximum number of traitors in the system. We argue that this definition is inadequate and does not help one design an efficient and practical traitor tracing system for use in the real world. Keep in mind that the ultimate goal of a traitor tracing scheme in the real world is to detect and disable all active traitors. The traditional definition, which detects traitors one by one, seems easy but does not actually help achieve this ultimate goal. Furthermore, the assumption that tracing can simply be repeated after previously detected traitors are disabled in the system is wrong.

Our position is that a traitor tracing scheme should be defined to find all active traitors in the system and to deduce the active coalition size. This includes the case in which some traitors have been detected and disabled: the system should still efficiently find the remaining traitors. The traceability is defined as the ability to detect all traitors in the system, even with revocations of previously found traitors. Our new definition sets the tracing goal straight and could help lead to the design of an efficient and practical traitor tracing scheme.

REFERENCES

Chor, B., Fiat, A., and Naor, M. (1994). Tracing traitors. In Crypto 1994, Lecture Notes in Computer Science, volume 839, pages 480–491.

Fiat, A. and Naor, M. (1993). Broadcast encryption. In Crypto 1993, Lecture Notes in Computer Science, volume 773, pages 480–491.

Jin, H. and Lotspiech, J. (2007). Renewable traitor tracing: a trace-revoke-trace system for anonymous attack. In European Symposium on Research in Computer Security.

Jin, H., Lotspiech, J., and Megiddo, N. (2008). Efficient coalition detection in traitor tracing. In IFIP 23rd International Information Security Conference.

Jin, H., Lotspiech, J., and Nusser, S. (2004). Traitor tracing for prerecorded and recordable media. In ACM Workshop on Digital Rights Management, pages 83–90.

Naor, D., Naor, M., and Lotspiech, J. B. (2001). Revocation and tracing schemes for stateless receivers. In CRYPTO '01, Lecture Notes in Computer Science, pages 41–62.

Safavi-Naini, R. and Wang, Y. (2003). Sequential traitor tracing. In IEEE Transactions on Information Theory, volume 49, No. 5, pages 1319–1326.

Staddon, J. N., Stinson, D. R., and Wei, R. (2001). Combinatorial properties of frameproof and traceability codes. In IEEE Transactions on Information Theory, volume 47, pages 1042–1049.
