1 © Hortonworks Inc. 2011–2018. All rights reserved.
Building A Data-Driven Authorization
Framework
The Speaker’s Info
Amer Issa:
• Platform and Security Architect at Hortonworks based in Canada.
• I specialize in Hadoop platform engineering with an emphasis on DevOps and Security.
Started my career as a Systems Engineer and transitioned into Cloud and Big Data.
• I have spent the majority of my career in highly governed and secured environments,
mostly financial and health related.
• Currently I act as an SME when it comes to security implementations and the integration
of Hadoop with the existing frameworks of organizations. I also help transform
organizations to a mentality of automation and infrastructure as code.
Introduction
• The integration of Ranger and Atlas is a fundamental shift in how to provision access to
assets within the Hadoop ecosystem.
• It allows those who understand the content and classification of data to assign
proper permissions based on data-specific attributes, rather than through the current
location- and user-based model.
• It provides a clear separation of duties and ensures the responsibility of maintaining
data access security remains with the most appropriate teams: i.e. those who know the
data best.
Object-Driven vs Data-Driven
Object-Driven
• Permissions are assigned much as they are in Unix: e.g. read permission on
/data/files/sales.csv for group dpt-sales.
• Static: changes in the data do not change permissions unless explicitly specified.
• System-level context.
• One policy per service.
• Managed and owned by administration and operations.
Data-Driven
• Permissions are assigned based on tags: data tagged sales grants read permission
to group dpt-sales.
• Dynamic: changes in metadata change permissions, e.g. adding an extra tag grants
extra permissions.
• Business-level context.
• One policy can cover multiple services.
• Managed and owned by data stewards.
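The contrast between the two models can be sketched in a few lines of Python. This is an illustration only, not how Ranger evaluates policies: the `Asset` class, both policy tables, the helper functions, and the `finance` tag with its `dpt-finance` group are hypothetical; only the path, the `sales` tag, and the `dpt-sales` group come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Asset:
    """A data asset with a location and a set of metadata tags (hypothetical model)."""
    path: str
    tags: Set[str] = field(default_factory=set)


# Object-driven: read access is keyed by the asset's location.
PATH_POLICIES: Dict[str, Set[str]] = {
    "/data/files/sales.csv": {"dpt-sales"},
}

# Data-driven: read access is keyed by tag, wherever the asset lives.
TAG_POLICIES: Dict[str, Set[str]] = {
    "sales": {"dpt-sales"},
    "finance": {"dpt-finance"},  # hypothetical tag and group, for illustration
}


def can_read_by_path(group: str, asset: Asset) -> bool:
    """Static check: only an explicit policy on this exact path grants access."""
    return group in PATH_POLICIES.get(asset.path, set())


def can_read_by_tag(group: str, asset: Asset) -> bool:
    """Dynamic check: any tag currently on the asset may grant access."""
    return any(group in TAG_POLICIES.get(tag, set()) for tag in asset.tags)
```

Under these assumptions, adding the `finance` tag to an asset is enough for `can_read_by_tag("dpt-finance", asset)` to start returning true, while `can_read_by_path` keeps returning false until someone edits the path policy itself.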
Process Flow in A Data Driven Environment
Structure of User Groups
A group for each of the following:
• Platform and System Administrator
• Hadoop Application Support
• Data Stewards, multiple if required
• Line Of Business or Distinct User Base
• Ingestion and Metadata
Tags to Groups Mappings
SalesNA
SalesMENA
SalesEuro
hrComp
hrPerf
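One way to picture a tag-to-group mapping is as a simple lookup table. The sketch below uses the tag names from this slide, but every group name in it is hypothetical, as is the `readable_tags` helper; the actual mapping shown in the original slide is not reproduced here.

```python
from typing import Dict, List, Set

# Tags are taken from the slide; the group names are placeholders.
TAG_TO_GROUPS: Dict[str, Set[str]] = {
    "SalesNA": {"grp-sales-na"},
    "SalesMENA": {"grp-sales-mena"},
    "SalesEuro": {"grp-sales-euro"},
    "hrComp": {"grp-hr-comp"},
    "hrPerf": {"grp-hr-perf"},
}


def readable_tags(user_groups: Set[str]) -> List[str]:
    """Return the tags that at least one of the user's groups is mapped to."""
    return sorted(
        tag for tag, groups in TAG_TO_GROUPS.items() if groups & user_groups
    )
```

A table like this gives data stewards a single place to review which user base can reach which classification of data.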
Atlas Features and Limitations
• Out of the box integration:
• Hive Bridge
• Sqoop Bridge
• NiFi Bridge
• Storm Bridge
• For other services, it is up to the client to register entities and lineage
• The UI does not support:
• Adding custom fields
• Editing existing fields
• Adding entities for Hive and a few other types
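For services without a bridge, registering an entity from client code might look roughly like the sketch below. It assumes the Atlas v2 REST endpoint `/api/atlas/v2/entity`, the built-in `hdfs_path` type, HTTP basic authentication, and a `path@cluster` convention for `qualifiedName`; all of these should be checked against the Atlas version in use. The function names and parameters are hypothetical.

```python
import base64
import json
import urllib.request
from typing import Any, Dict


def build_hdfs_path_entity(path: str, cluster: str) -> Dict[str, Any]:
    """Build the JSON body for one hdfs_path entity (attribute names are assumptions)."""
    return {
        "entity": {
            "typeName": "hdfs_path",
            "attributes": {
                # Assumed naming convention: <path>@<cluster name>.
                "qualifiedName": f"{path}@{cluster}",
                "name": path.rstrip("/").split("/")[-1],
                "path": path,
            },
        }
    }


def register_entity(atlas_url: str, payload: Dict[str, Any],
                    user: str, password: str) -> Dict[str, Any]:
    """POST the entity to Atlas and return the parsed JSON response."""
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{atlas_url}/api/atlas/v2/entity",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the HTTP call makes it possible to review or unit-test what would be registered without a running Atlas server.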
Integration of Ranger and Atlas
Tags and Inheritance
Associating Tags and Defining Permissions
Testing Security Settings
Adding Custom Attributes Using A Script: Configs
Adding Custom Attributes Using A Script: Execution
In Summary
A data-driven authorization framework streamlines authorization and
introduces a business perspective to it.
Other Projects By The Speaker:
https://github.com/amerissa
• Ranger MultiDomain Sync
• Cloudbreak Shared Services in the Cloud
• Ambari SSL Wizard
• Docker and ELK Based Dashboards for Hadoop
Q&A
