Data Leakage Identification and Blocking Fake Agents Using Pattern Discovery Algorithm
Copyright to IJIRCCE
www.ijircce.com
5660
ISSN(Online): 2320-9801
ISSN (Print): 2320-9798
Data Leakage Detection Protocol: The proposed protocol is appended to every data transmission and carries the details
of both the agent and the data. When data leaks, the protocol identifies the guilty agent immediately.
Pattern Discovery: Pattern discovery is used to create fake objects, which are then allocated to the agents that can
receive them.
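The paper does not specify the fields or wire format of the protocol, so the following is only a minimal sketch of the idea, assuming that each transmission is logged with the agent's identity and a hash of the exact bytes sent. The names `TransferRecord`, `TransferLog`, and `suspects` are hypothetical, and the lookup only singles out one agent if every agent's copy is unique (for example, because it carries a different fake object).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import hashlib


@dataclass(frozen=True)
class TransferRecord:
    """Hypothetical metadata attached to one transmission (not the paper's format)."""
    agent_id: str
    file_name: str
    content_hash: str
    sent_at: str


class TransferLog:
    """Keeps the history of all transfers so a leaked copy can be traced back."""

    def __init__(self):
        self._records = []

    def record(self, agent_id, file_name, payload):
        # Hash the exact bytes sent to this agent and store them with the agent id.
        rec = TransferRecord(
            agent_id=agent_id,
            file_name=file_name,
            content_hash=hashlib.sha256(payload).hexdigest(),
            sent_at=datetime.now(timezone.utc).isoformat(),
        )
        self._records.append(rec)
        return rec

    def suspects(self, leaked_payload):
        # Return the agents whose logged copy matches the leaked bytes exactly.
        leaked_hash = hashlib.sha256(leaked_payload).hexdigest()
        return sorted({r.agent_id for r in self._records
                       if r.content_hash == leaked_hash})


log = TransferLog()
log.record("agent1", "customers.csv", b"real-data|fake-object-1")
log.record("agent2", "customers.csv", b"real-data|fake-object-2")
print(log.suspects(b"real-data|fake-object-2"))
```

An exact hash match is the simplest possible lookup; it would miss a leak in which the agent altered or truncated the file, which is why the fake objects themselves still matter.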
IV. PROPOSED ALGORITHM
The data leakage detection and pattern discovery algorithms used in this study are shown in Fig.2. Fig.2(a) illustrates
the data request from sender to receiver, the agent receiving the data, and fake agent detection using the proposed
algorithms. The main focus of our work is the data allocation problem: how can the distributor intelligently give data
to agents in order to improve the chances of detecting a guilty agent? The admin can send files to authenticated users,
and users can edit their account details. The agent receives the secret key details by mail. The steps taken to increase
the chances of detecting agents that leak data are shown in Fig.2(b) to Fig.2(e).
Algorithm Steps
Step 1: The distributor selects the agents to send data to. The distributor selects two agents and gives the requested
data R1 and R2 to the two agents.
Step 2: The distributor creates a fake object and allocates it to an agent. The distributor can create one fake object
(B = 1), and both agents can receive one fake object (b1 = b2 = 1). If the distributor is able to create more fake
objects, he could further improve the objective.
Step 3: The distributor checks the number of agents who have already received data.
Step 4: The distributor chooses the remaining agents to send the data to. The distributor can increase the number of
possible allocations by adding fake objects.
Step 5: The distributor again selects a fake object, choosing a random fake object to allocate to the remaining agents.
Step 6: Estimate the probability that an agent is guilty. To compute this probability, we need an estimate of the
probability that values can be guessed by the target.
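Steps 1 to 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: "remaining agents" is read as the agents that have not yet reached their fake object limit b_i, each fake object is given to exactly one agent so that it can identify that agent, and the function and parameter names are hypothetical.

```python
import random


def allocate_fake_objects(requests, fake_objects, limits, seed=None):
    """Give each agent its requested data plus randomly chosen fake objects.

    requests:     dict mapping agent -> set of requested real objects (R_i)
    fake_objects: list of fake objects the distributor created (B = its length)
    limits:       dict mapping agent -> max fake objects it may receive (b_i)
    """
    rng = random.Random(seed)
    # Step 1: every selected agent receives the data it requested.
    allocation = {agent: set(objs) for agent, objs in requests.items()}
    received = {agent: 0 for agent in requests}
    # Step 2: the pool of fake objects available for allocation.
    pool = list(fake_objects)
    rng.shuffle(pool)
    while pool:
        # Steps 3 and 4: find the agents that can still take a fake object.
        remaining = [a for a in requests if received[a] < limits.get(a, 0)]
        if not remaining:
            break
        # Step 5: allocate a random fake object to a random remaining agent.
        agent = rng.choice(remaining)
        allocation[agent].add(pool.pop())
        received[agent] += 1
    return allocation


# The example from Step 2: two agents, B = 1, b1 = b2 = 1.
result = allocate_fake_objects(
    requests={"U1": {"t1", "t2"}, "U2": {"t1"}},
    fake_objects=["f1"],
    limits={"U1": 1, "U2": 1},
)
print(result)
```

With a single fake object, only one of the two agents receives it; creating more fake objects lets the distributor mark more agents' copies.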
Fig.2. (a), (b), (c), (d), (e) Data leakage detection and pattern discovery algorithm
V. RESULT
Previous systems concentrate on watermarking techniques to identify data leakage. To overcome their drawbacks, an
invisible watermarking technique has been proposed. However, it may not be certain whether a leaked object came from an
agent or from some other source, since certain data cannot admit watermarks. The proposed work gives an effective
identification of the data leakage and of the agent who leaked the data. Whereas that technique concentrates on
invisible watermarks to identify the guilty agent, the proposed technique concentrates on identifying and blocking
agents by gathering the full history of data transfers through the special protocol. The modules created for
identifying the guilty agent, and the probability of the guilty agent's access, are shown in Fig.3 to Fig.6. This study has been
implemented and evaluated in the .NET Framework by creating a client-server application on the network. The flow of the
implementation is described in the following diagrams. The distributor creates fake objects and adds them to the data
that he distributes to agents. Fake objects are objects generated by the distributor in order to increase the chances of
detecting agents that leak data. The distributor may be able to add fake objects to the distributed data in order to
improve his effectiveness in detecting guilty agents. Our use of fake objects is inspired by the use of trace records in
mailing lists. If the wrong secret key is given to download a file, a duplicate file is opened instead, and the fake
user's details are also sent by mail, as shown in Fig.3.
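The secret key check described above might look like the following sketch. It is written in Python for illustration only (the paper's system is a .NET application), and it assumes that each agent is issued one key and that the mail notification can be represented by appending to a list. The names `serve_file`, `issued_keys`, and `alerts` are hypothetical.

```python
import hmac


def serve_file(agent_id, supplied_key, issued_keys, real_file, decoy_file, alerts):
    """Return the real file for a correct key, otherwise the duplicate (decoy) file.

    alerts is a plain list standing in for the mail notification to the admin.
    """
    expected = issued_keys.get(agent_id, "")
    # Compare in constant time; an unknown agent has no key and never matches.
    key_ok = bool(expected) and hmac.compare_digest(
        supplied_key.encode("utf-8"), expected.encode("utf-8")
    )
    if key_ok:
        return real_file
    # Wrong key: record the requester's details and hand out the decoy instead.
    alerts.append({"agent_id": agent_id, "supplied_key": supplied_key})
    return decoy_file


alerts = []
keys = {"agent1": "s3cret"}
print(serve_file("agent1", "s3cret", keys, b"real", b"decoy", alerts))
print(serve_file("agent1", "guess", keys, b"real", b"decoy", alerts))
print(len(alerts))
```

Returning a decoy rather than an error means the requester gets no signal that the key was rejected, while the admin still receives the details of the attempt.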
Fig.4 and Fig.5 are mainly designed for determining fake agents. This design uses the fake objects (which are stored in
the database by the guilt model module) and determines the guilty agent along with the probability. A graph is used to
plot the probability distribution of the data leaked by fake agents.
A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data is
leaked and found in an unauthorized place. The distributor must assess the likelihood that the leaked data came from
one or more agents, as opposed to having been independently gathered by other means. The admin can view which file is
leaking, along with the fake users' details, as shown in Fig.6.
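The paper does not state how this likelihood is computed. One commonly used guilt model, assumed here purely for illustration, takes p as the probability that a leaked object was guessed or gathered independently, and treats an agent's guilt as 1 minus the product, over the leaked objects it holds, of (1 - (1 - p) / number of agents holding that object). The function name and the sample values below are hypothetical.

```python
def guilt_probability(agent, leaked, allocation, p):
    """Estimate the probability that `agent` leaked at least one object.

    leaked:     set of objects found in the unauthorized place
    allocation: dict mapping agent -> set of objects that agent was given
    p:          assumed probability that an object was obtained by other means
    """
    not_guilty = 1.0
    for obj in leaked & allocation[agent]:
        # Number of agents who were given this object (at least one: `agent`).
        holders = sum(1 for objs in allocation.values() if obj in objs)
        # Probability that this agent was NOT the source of this object.
        not_guilty *= 1.0 - (1.0 - p) / holders
    return 1.0 - not_guilty


allocation = {"U1": {"t1", "t2", "f1"}, "U2": {"t1"}}
leaked = {"t1", "t2", "f1"}
for name in allocation:
    print(name, round(guilt_probability(name, leaked, allocation, p=0.2), 3))
# prints: U1 0.976, then U2 0.4
```

In this example the fake object f1 and the unshared object t2 are held only by U1, so they raise U1's estimate sharply, while U2 is implicated only through the shared object t1.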
certain data cannot admit watermarks. In spite of these difficulties, we have shown it is possible to assess the likelihood
that an agent is responsible for a leak, based on the overlap of his data with the leaked data and the data of other agents,
and based on the probability that objects can be guessed by other means. Our model is relatively simple, but we
believe it captures the essential trade-offs. The algorithms we have presented implement a variety of data distribution
strategies that can improve the distributor's chances of identifying a leaker. We have shown that distributing objects
judiciously can make a significant difference in identifying guilty agents, especially in cases where there is large
overlap in the data that agents must receive. Our future work includes the investigation of agent guilt models that
capture leakage scenarios that are not studied in this paper. For example, what is the appropriate model for cases where
agents can collude and identify fake tuples? A preliminary discussion of such a model is available. Another open
problem is the extension of our allocation strategies so that they can handle agent requests in an online fashion (the
presented strategies assume that there is a fixed set of agents with requests known in advance).