Robin David
Mobile: +91 9952447654    Email: robinhadoop@gmail.com
Objective:
Intend to build a career with a leading corporate in a hi-tech environment with committed and
dedicated people. Willing to work as a key player in a challenging and creative environment, in
a position where my technical abilities, education and past work experience will be utilized to
the best benefit of the organization.
Experience:
 Around 9 years of total experience in IT, with 3+ years of experience in Hadoop
 Professional experience of 4 months with Polaris as Senior Consultant, mainly focusing
on building a Reference Data Management (RDM) solution on Hadoop
 Professional experience of 18 months with iGATE Global Solutions as Technical Lead,
mainly focusing on data processing with Hadoop, developing the IV3 engine (iGATE's
proprietary Big Data platform) and Hadoop cluster administration
 Professional experience of 5.7 years with Cindrel Info Tech as Technology Specialist,
mainly focusing on ADS, LDAP, IIS, FTP, DW/BI and Hadoop
 Professional experience of 11 months with Purple Info Tech as Software Engineer,
mainly focusing on routers and switches
Achievements in Hadoop
• Built custom plug-ins with a web user interface for:
o HDFS data encryption/decryption
o Column-based data masking on Hive
o Hive benchmarking
o Sqoop automation for data ingestion into HDFS
o HDD space alert automation on Hadoop cluster environments (a sketch follows this list)
• Integrated Revolution R with the Cloudera distribution
• Integrated Informatica 9.5 with HDFS for data processing on Apache and Cloudera
clusters
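To make the HDD space alert automation concrete, here is a minimal shell sketch of the kind of check such a plug-in could run on each node; the threshold, mount list and mail recipient are illustrative assumptions, not the original plug-in's values.

    #!/usr/bin/env bash
    # Hypothetical HDD-space alert for Hadoop nodes; values below are assumptions.
    THRESHOLD=85                                  # assumed alert threshold (%)
    RECIPIENT="hadoop-admins@example.com"         # assumed alert mailing list
    HOST=$(hostname -f)

    # Check local mounts with df and mail an alert when usage crosses the threshold.
    df -hP | awk 'NR>1 {print $5, $6}' | while read -r used mount; do
        pct=${used%\%}
        if [ "$pct" -ge "$THRESHOLD" ]; then
            echo "${HOST}: ${mount} is at ${pct}%" \
              | mail -s "Disk space alert on ${HOST}" "$RECIPIENT"
        fi
    done

    # Cluster-wide view from the NameNode: overall DFS usage percentage.
    hdfs dfsadmin -report | grep -m1 'DFS Used%'

A script like this can be scheduled from cron on every node, which matches the alert automation described above.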
Professional Summary
• Expertise in creating secure data lake environments in HDFS
• Experience pulling data from RDBMS and various staging areas into HDFS
using Sqoop, Flume and shell scripts
• Provided solutions and technical architecture to migrate existing DWH to the Hadoop
platform
• Used the data integration tool Pentaho to design ETL jobs in the process of
building data lakes
• Experience building and processing data within DataStax Cassandra clusters
• Good experience handling data with advanced Hive features
• Experienced in installation, configuration and management of Pivotal, Cloudera,
Hortonworks and Apache Hadoop clusters
• Designed and built IV3, iGATE's proprietary Big Data platform
• Good experience in Hadoop Distributed File System (HDFS) management
• Experience in shell scripting for various HDFS operations
• Installation, configuration and management of EMC tools (GemFire XD, Greenplum
DB and HAWQ)
• Configured Hadoop clusters on the Amazon cloud
• Recovery of Hadoop clusters from NameNode or DataNode failures
• End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines
against very large data sets
• Kerberos implementation on Hadoop clusters
• Configuring HA on Hadoop clusters
• Integrated a Splunk server with HDFS
• Installing Hadoop cluster monitoring tools (Pivotal Command Center, Ambari,
Cloudera Manager and Ganglia)
• Cluster health monitoring and fixing performance issues
• Data balancing on the cluster and commissioning/decommissioning data nodes in
existing Hadoop clusters
• Experience creating HDFS directory structures and setting access permissions for
groups and users as required for project-specific needs (see the sketch after this list)
• Understand/analyze specific jobs (projects) and their run processes
• Understand/analyze scripts, MapReduce code, and input/output files/data for
operations support
• Built an archiving platform in a Hadoop environment
• Good understanding of Hadoop services and quick problem-resolution skills
• Good experience in ADS, LDAP, DNS, DHCP, IIS, GPO, user administration,
patch maintenance, SSH, sudo, configuring RPMs through YUM, FTP and NFS
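As a concrete illustration of the HDFS directory and permission work mentioned above, the following is a minimal sketch under assumed names (the project, service account and group names are hypothetical):

    #!/usr/bin/env bash
    # Hypothetical project-specific HDFS layout and permissions; names are illustrative.
    PROJECT=rdm                        # assumed project name
    PROJ_GROUP=${PROJECT}_users        # assumed LDAP/OS group

    # Standard raw/staging/curated layout for the project.
    hdfs dfs -mkdir -p /data/${PROJECT}/raw /data/${PROJECT}/staging /data/${PROJECT}/curated

    # Service account owns the tree; the project group gets read/execute.
    hdfs dfs -chown -R ${PROJECT}_svc:${PROJ_GROUP} /data/${PROJECT}
    hdfs dfs -chmod -R 750 /data/${PROJECT}

    # Finer-grained access for a second group via HDFS ACLs
    # (requires dfs.namenode.acls.enabled=true).
    hdfs dfs -setfacl -R -m group:${PROJECT}_analysts:r-x /data/${PROJECT}/curated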
Technical Skills
Operating System: RHEL 5.x/6.x, CentOS 5.x/6.x, Ubuntu, Windows server and client family
Hardware: Dell, IBM and HP
Database: MySQL, PostgreSQL, MSSQL and Oracle
Tools: Command Center, Check_MK, Ambari, Ganglia, Cloudera Manager and GitLab
Languages: Shell scripting and Core Java
Cloud Computing Framework: AWS
Hadoop Ecosystem: Hadoop, ZooKeeper, Pig, Hive, Sqoop, Flume, Hue and Spark
Certifications:
 Microsoft Certified IT Professional (MCITP - 2008)
 Microsoft Certified Professional (MCP-2003)
 CCNA
Educational Qualifications:
Bachelor of Science (Computer)
St. Joseph’s College, Bharathidasan University - Trichy
Major Assignments:
Project 1
Market Reference Data Management is a pure-play reference data management solution, built
on a big data platform and RDBMS, to be used in the securities market industry. As part of this
project, data is collected from different market data vendors such as Reuters, Interactive Data, etc.
for different types of asset classes such as equity, fixed income and derivatives. The entire
solution is built using Pentaho as the ETL tool and the final tables are stored in Hive. The
different downstream applications access data from Hive as and when required.
• Responsible for providing the architecture plan to implement the entire RDM solution
• Understanding the securities master data model
• Created an API for getting data from various sources
• Created Hive data models
• Designed ETL jobs with Pentaho for data cleansing, data identification and loading the data
into Hive tables
• Created Hive (HQL) scripts to process the data (a sketch follows the project details)
(12-2015) to date
Project: Reference Data Management
Domain: Solution building in the finance domain
Environment: CDH 5.0
Role: Senior Consultant
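The HQL processing step referenced above could look like the following minimal sketch; the database, table and column names are hypothetical stand-ins for the securities master model, not the project's actual schema:

    #!/usr/bin/env bash
    # Illustrative Hive processing step driven from a shell wrapper.
    hive -e "
      SET hive.exec.dynamic.partition.mode=nonstrict;

      -- Move cleansed reference data from the staging table into the curated,
      -- partitioned security master table (names are hypothetical).
      INSERT OVERWRITE TABLE rdm.security_master PARTITION (feed_date)
      SELECT isin, cusip, issuer_name, asset_class, currency, feed_date
      FROM   rdm.security_master_stg
      WHERE  isin IS NOT NULL;
    "

In practice a script like this would be invoked by the Pentaho job after the cleansing and identification steps.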
Project 2
GE purchases parts from different vendors across the world within all its business units. There
is no central repository to monitor vendors across the business units, and purchased parts were
charged on different scales between a vendor, its subsidiaries and other vendors. To identify
the purchase price differences and to build a master list of vendors, GE Software COE and
IGATE together designed a data lake on Pivotal Hadoop consisting of all PO (purchase order)
and invoice data imported from multiple SAP/ERP sources. The data in the data lake is
cleansed, integrated with DNB to build a master list of vendors, and analyzed to identify
anomalous behavior in POs.
Job Responsibilities:
• Monitoring POD and IPS Hadoop clusters; each environment has Sandbox,
Development and Production divisions
• Experience with the Check_MK tool for monitoring the Hadoop cluster environment
• Provided solutions for all Hadoop ecosystem and EMC tools
• Experience working together with EMC support
• Shell scripting for various Hadoop operations
• User creation and quota allocation on the Hadoop cluster and GPDB environments
(see the sketch after this list)
• Talend support
• Experience with the GitLab tool
• Provided solutions for performance issues in the Greenplum DB environment
• Brought failed segments back to active in the GPDB environment
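A minimal sketch of the user-creation and quota-allocation step mentioned above, with assumed user names and quota values (the real values and the exact Greenplum role settings would differ):

    #!/usr/bin/env bash
    # Hypothetical onboarding of one analyst onto the Hadoop cluster and GPDB.
    NEW_USER=ge_analyst1                                  # assumed user name

    # HDFS home directory with name and space quotas.
    hdfs dfs -mkdir -p /user/${NEW_USER}
    hdfs dfs -chown ${NEW_USER}:hadoop /user/${NEW_USER}
    hdfs dfsadmin -setQuota 100000 /user/${NEW_USER}      # assumed max file/dir count
    hdfs dfsadmin -setSpaceQuota 500g /user/${NEW_USER}   # assumed raw-space quota

    # Greenplum: create a login role for the same user (run as gpadmin).
    psql -d postgres -c "CREATE ROLE ${NEW_USER} LOGIN PASSWORD 'changeme';"

Failed Greenplum segments mentioned in the last bullet are typically brought back online with the gprecoverseg utility.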
Project 3
The Retail Omni-channel Solution leverages cross-channel analytics (web, mobile and store) along
with Bluetooth LE technology in the store to deliver a superior customer experience. Targeted
messages and personalized promotions are delivered at the right time (the "Moment of Truth") to
maximize sales conversions.
Job Responsibilities:
• Created Hive SQL scripts for data processing and merging data from multiple tables
(03-2015) to (09-2015)
Project: Data Lake Support
Client: GE Software COE
Environment: Pivotal and EMC tools
Role: Technical Lead
(01-2014) to (06-2014)
Project: Retail Omni-channel Solution
Client: Retail giant in the US
Environment: Cloudera Distribution CDH 4.x
Role: Technical Lead
• Loading data into HDFS
• Exported data from HDFS to RDBMS (MySQL) using Sqoop (a sketch follows this list)
• Created a script for IROCS automation
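For the Sqoop export step above, a minimal sketch could look like this; the JDBC URL, credentials file, table and HDFS paths are illustrative assumptions rather than the project's actual values:

    #!/usr/bin/env bash
    # Hypothetical export of IROCS output from HDFS into MySQL via Sqoop.
    sqoop export \
      --connect jdbc:mysql://mysql-host:3306/retail \
      --username retail_user \
      --password-file /user/hadoop/.mysql_pwd \
      --table store_sales_summary \
      --export-dir /data/irocs/output/store_sales_summary \
      --input-fields-terminated-by '\t' \
      --num-mappers 4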
Project 4
The principal motivation for IV3 is to provide a turnkey Big Data platform that abstracts the
complexities of technology implementation and frees up bandwidth to focus on creating
differentiated business value. IV3 is a software-based big data analytics platform designed to
work with enterprise-class Hadoop distributions, providing an open architecture and big-data-
specific software engineering processes. IV3 is power-packed with components and enablers
covering the life cycle of a Big Data implementation, from data ingestion, storage and
transformation to various analytical models. It aims to marshal the three Vs of Big Data
(Volume x Velocity x Variety) to deliver the maximum business impact.
Job Responsibilities:
• Implemented data ingestion (RDBMS to HDFS) in the IV3 platform
• Tested IV3 tools on different Hadoop distributions
• Configured the automatic YARN memory calculator on the IV3 platform
• HDFS data encryption/decryption (a sketch follows this list)
• Column-based data masking on Hive
• Hive benchmarking
• Sqoop automation for data ingestion into HDFS
• Created an automation script for detecting HDD space issues on Hadoop cluster
environments
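As an illustration of the HDFS encryption/decryption item above, the sketch below uses Hadoop's transparent encryption zones (it assumes a running Hadoop KMS; the key and path names are hypothetical):

    #!/usr/bin/env bash
    # Hypothetical HDFS encryption zone setup; requires Hadoop KMS and HDFS superuser.
    # 1. Create an encryption key in the KMS.
    hadoop key create iv3_zone_key

    # 2. Create an empty directory and mark it as an encryption zone.
    hdfs dfs -mkdir -p /data/iv3/secure
    hdfs crypto -createZone -keyName iv3_zone_key -path /data/iv3/secure

    # 3. Files written into the zone are encrypted at rest and decrypted
    #    transparently for authorized readers.
    hdfs crypto -listZones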
Project 5
(06-2014) to (09-2014)
Project: Predictive Fleet Maintenance
Client: Penske
Environment: Cloudera (CDH 4.6) - Hive
Role: Technical Lead
Penske's business requirement is to collect data from a repository of untapped data - vehicle
diagnostics, maintenance and repair - so that this data can potentially be leveraged to generate
economic value. Penske wants to create a future-ready Big Data platform to efficiently store,
process and analyze the data in consonance with its strategic initiatives. Penske engaged
IGATE to partner with them in this strategic initiative to tap insights hidden in diagnosis,
maintenance and repair data. IGATE would be leveraging its state-of-the-art Big Data
Engineering lab to implement the data engineering and data science parts of this project.
(12-2013) to (09-2015)
Project: IV3 (Proprietary Big Data Platform)
Client: IGATE
Environment: CDH, HDP and Pivotal
Role: Technical Lead
Job Responsibilities:
• Understand project scope, business requirements and current business processes
• Map business requirements to use cases
• Implemented use cases with Hive
Project 6
(01-2012) to (11-2013)
Project: WA-Insights and Analytics
Client: Watenmal Group - Global
Environment: Apache Hadoop - Hive, MapReduce, Sqoop
Role: Technology Specialist
WAIA is intended to support all retail business segments involved in the sale of goods and
supporting services. The WAIA retail store integrated data model addresses three major aspects of a
store business: (1) the physical flow and control of merchandise into, through and out of the
store; (2) the selling process, where the products and services offered for sale are transformed
into tender and sales revenue is recognized; and (3) the control and tracking of tender from the
point of sale where it is received through its deposit into a bank or other depository.
Job Responsibilities:
• Understanding the main Hadoop components and architecture
• Data migration from RDBMS to HDFS using Sqoop
• Understanding the nuances of MapReduce programs and UDFs
• Data merging and optimization in Hive
• Adding new data nodes to the existing Hadoop cluster
• Safely decommissioning failed data nodes (a sketch follows this list)
• Monitoring the Hadoop cluster with the Ganglia tool
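A minimal sketch of the decommissioning step mentioned above; the host name and the exclude-file path are assumptions and must match the dfs.hosts.exclude setting in hdfs-site.xml:

    #!/usr/bin/env bash
    # Hypothetical safe decommissioning of one data node.
    EXCLUDE_FILE=/etc/hadoop/conf/dfs.exclude        # assumed dfs.hosts.exclude path
    NODE=datanode07.example.com                      # assumed host to retire

    # 1. Add the node to the exclude file and tell the NameNode to re-read it.
    echo "$NODE" >> "$EXCLUDE_FILE"
    hdfs dfsadmin -refreshNodes

    # 2. Wait until the node reports "Decommissioned" before stopping its daemon.
    hdfs dfsadmin -report | grep -A 2 "$NODE"

    # 3. After adding new nodes (the reverse procedure), rebalance the cluster.
    hdfs balancer -threshold 10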
Project 7:
(5-2008) to (12-2011)
Project: eyeTprofit and Magniss
Client: SI and Arjun Chemicals
Environment: Windows - SQL Server, .NET Framework and LDAP
Role: Technology Specialist
EyeTprofit and Magniss enable businesses to easily analyze profitability, budget versus actual,
revenue, inventory, cash requirements, etc. instantaneously, especially when the information
is spread across multiple applications. EyeTprofit and Magniss form a non-invasive reporting
system that facilitates information from multiple functions being culled out and presented in an
appropriate form to enable informed decisions.
Job Responsibilities:
• LDAP integration with the BI tool
• Administering Active Directory, DHCP and DNS servers
• Managing group policies
• Distributed File System (DFS) management
• Administering FTP servers and IIS servers
• Patch maintenance
• Administering file shares and disk quotas
• Providing access to share drive users
• Remote support
• Configuring virtual machines
Project 8:
(5-2007) to (04-2008)
Project: TRMS
Client: TTP
Environment: Windows - Storage Server, routers and Layer 3 switches
Role: Software Engineer
A state-of-the-art traffic management system, the first of its kind in India. It helps regulate and
enforce the law with the efficiency, expediency and accuracy of technology.
Job Responsibilities:
• Expertise in handling wireless communication
• Maintaining hand-held computer devices
• Expertise in handling the storage server
• Responsible for designing the network topology at client sites and implementing
the infrastructure with high security and hierarchy
• Handling network-related problems at client sites, mainly debugging issues
related to network peripherals