
Davinder Gill's Data Engineering Expertise

Jagan Mohan Kanimetta is a seasoned IT professional with over 15 years of experience in Software Development Life Cycle, specializing in data engineering and cloud platforms, particularly Azure. He has a strong background in designing and managing data infrastructure, ETL processes, and has worked extensively with tools like Azure Data Factory, Databricks, and Informatica. His recent roles include Lead Data Engineer at Bank of America and Mitsubishi Union Finance Group, where he developed and optimized data pipelines and workflows for large-scale data solutions.

Uploaded by

anup.upadhyay504
Copyright
© All Rights Reserved

Jagan Mohan Kanimetta

sravanmaganti2@[Link]
+1 7327310281
[Link]

Summary

• 15+ years of Information Technology experience across the Software Development Life Cycle (SDLC), with a strong background in data platforms, data migration, ETL processes, and data warehousing solutions, including 4+ years on cloud data engineering platforms.
• Extensive experience with all phases of the project lifecycle, including project initiation, requirement gathering, system design, coding, testing, and debugging of client-server applications.
• Experience using Azure cloud data platforms to analyze and validate data between different processing zones.
• Proficient in designing, implementing, and managing large-scale data infrastructure and ETL pipelines.
• Expertise in SQL, data warehousing, data lakes, and Azure Data Factory, along with a strong foundation in optimizing and automating data workflows.
• Experience in data processing with Azure Databricks using the Lakehouse architecture.
• Experience with Azure Databricks and Apache Spark, including clusters, notebooks, workspaces, data loading, and running Spark jobs.
• Experience developing applications using PySpark and Spark SQL for extraction, transformation, and aggregation of data from multiple file formats.
• Demonstrated experience in automated pipeline orchestration using Databricks Workflows.
• Demonstrated proficiency in Azure cloud services, managing critical components such as Azure SQL, Azure Table Storage, and Azure Data Lake to ensure robust and secure data storage, processing, and retrieval.
• Ability to architect solutions, create data platform roadmaps, and enable scalability in Azure Data Lake, Azure Databricks, and Azure Data Factory.
• Experience with AWS developer tools such as CodeDeploy, CodeBuild, and CodePipeline, and with designing the overall Virtual Private Cloud (VPC) environment, including server instances, subnets, and high-availability zones.
• Proficient in writing complex SQL queries, including creating and managing Databricks Delta tables, views, indexes, and stored procedures. Skilled in SQL-based analytics to build efficient data models, processes, and transformation pipelines within the Databricks environment.
• Good knowledge of the Spark distributed framework and its performance concepts, and solid experience with code optimization techniques in Spark, PySpark, and Python.
• Experience working with complex structured and unstructured file formats such as XML, JSON, and text files.
• Experience using Airflow for data pipeline job scheduling, orchestration, and monitoring.
• Used Python (NumPy, SciPy, pandas, scikit-learn, seaborn) and Spark (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes.
• Experience with data formats such as JSON, Avro, Parquet, and ORC, and with compression codecs such as Snappy and Bzip.
• Experience in data integration, migration, and ETL processes using Informatica PowerCenter 10.x/9.x/8.x.
• Experienced in PL/SQL and UNIX shell scripting.
• Worked with stored procedures, triggers, cursors, indexes, and functions.

TECHNICAL SKILLS:

Cloud Platforms:        Azure (Azure Data Lake, Data Factory, Databricks, Azure SQL); AWS (AWS Glue, DMS, IAM, S3, SQS, RDS, EC2, etc.); Snowflake
ETL Tools:              Informatica 10.x/9.x/8.x/7.x (PowerCenter/PowerMart: Designer, Workflow Manager, Workflow Monitor), Informatica PowerExchange, Ab Initio 3.1
Data Modeling:          Erwin 4.0/3.5, Star Schema Modeling, Snowflake Modeling
Tools:                  Autosys, CA ESP, Control-M, DB2A, DB2I, Endevor, Perforce, QMF
Databases:              SQL Server, Azure SQL, Oracle 11g, Teradata 14.0/13/V2R5/V2R6, DB2
Programming Languages:  Python, PL/SQL, T-SQL, UNIX Shell Scripting, MVS COBOL, JCL
Data Processing:        Azure Databricks, PySpark, Apache Airflow
Operating Systems:      UNIX/Linux, Windows

EDUCATION:

• Master of Technology, 2005, JNTU, Hyderabad, India.
• Bachelor of Technology, 2002, JNTU, Hyderabad, India.

TRAININGS & CERTIFICATIONS:

• Databricks Certified Data Engineer - Associate
• AWS Certified Solutions Architect - Associate
• DB2 UDB Certified
• AINS 21 Certified

PROFESSIONAL EXPERIENCE:

Bank of America, Charlotte, NC
Sep 23 – Present
Role: Lead Data Engineer

Responsibilities:

• Designed and implemented data pipelines using Azure Data Factory to ingest, process, and transform data from multiple sources into Azure Data Lake and SQL databases.
• Built and maintained large-scale Azure Data Lake solutions to store unstructured and semi-structured data, enabling high-performance analytics.
• Leveraged Azure Databricks and Apache Spark to process large volumes of data efficiently, reducing processing times by 30%.
• Worked with Azure Data Factory (ADF) and its infrastructure, including Copy, Get Metadata, Web, and Execute Pipeline activities, data flows, integration runtimes (IRs), dataset and linked service implementation, IAM, triggers, and Synapse.
• Executed complex data processing tasks using PySpark and Python, optimizing data workflows for performance across distributed systems.
• Created ETL workflows for data transformation and cleansing, improving data quality and reporting accuracy.
• Implemented Azure Databricks notebooks to handle complex file transformations involving source data formats such as CSV, Parquet, and JSON.
• Enabled Unity Catalog for secure data governance within Databricks, managing access controls and data cataloging.
• Utilized PySpark RDDs (Resilient Distributed Datasets) and DataFrames for efficient data manipulation and analysis in distributed computing environments.
• Implemented Slowly Changing Dimensions (SCD) using Delta tables and Change Data Feed.
• Developed and maintained detailed documentation for all data engineering processes, including data models, ETL workflows, and data transformation logic, ensuring transparency and ease of knowledge transfer.
• Created DAGs in Airflow to orchestrate tasks using Python code and Airflow operators.

Environment: Azure Databricks, Data Factory, PySpark, Python, Spark SQL, Azure SQL, Informatica 10.x.

Mitsubishi Union Finance Group (MUFG), Charlotte, NC / Los Angeles, CA
Mar 15 – Aug 23
Projects: EDP Data Lake Pillar2, Application Production Support, OFSAA 6.1 upgrade, etc.
Role: Lead Data Engineer

Responsibilities:

• Engineered resilient data pipelines leveraging Azure Data Factory and Azure Blob Storage to streamline data ingestion from on-premises and cloud sources.
• Built Spark jobs to handle data ingestion, transformation, and aggregation, significantly reducing the time required for data preparation.
• Formulated and optimized complex SQL queries to extract, transform, and load (ETL) sales data from various sources into a centralized warehouse.
• Performed testing, bug tracking, and software maintenance in a CI/CD environment for database and development environments using Git and Jenkins.
• Developed ingestion, curation, and consumption processes in Azure for new and existing sources.
• Prepared and executed functional test cases, and logged and tracked defects in Jira.
• Reported and discussed status in scrum calls and attended all other meetings in line with Agile practice.
• Analyzed business requirements and transformation rules and converted them into data validation test scripts.
• Handled BAU activities and production support for various applications, ensuring no impact on the business.
• Handled business development, delegated work to teams based on task priority and team efficiency, and mentored the team.
• Designed, developed, and supported Extraction, Transformation, and Load (ETL) processes for data migration using Informatica 10.x/9.x with PL/SQL packages.
• Developed ETL mapping documents, including a High Level Design (HLD) and Low Level Design (LLD) for every mapping, and a mapping specification document for smooth transfer of the project from development to testing and then to production.
• Conducted walkthroughs of the low-level designs, unit test plans, and implementation plans prepared by the team at various stages of the project, and ensured that all team members followed PMP standards.
• Developed shell scripts and PL/SQL procedures as part of Oracle data loads.

Cloud Environment: Azure, PySpark, CI/CD, Azure Data Factory, Azure SQL DB, Python.
Environment: Informatica 10.x/9.5.1, Oracle 12c/11g, PL/SQL, and UNIX Shell Scripting.

CIGNA-IM (CCW Accel Rx/Rebates), Bloomfield, CT
May 12 – Feb 15
Role: Informatica Lead

Responsibilities:

• Studied the business rules in the High Level Design specifications and implemented the corresponding data transformation methodologies.
• Handled business development, delegated work to teams based on task priority and team efficiency, and mentored the team.
• Handled offshore, onsite, and client communication, prepared functional design documents, and reviewed deliverables and quality documentation.
• Designed, developed, and supported Extraction, Transformation, and Load (ETL) processes for data migration using Informatica 9.x with a Teradata database.
• Developed ETL mapping documents, including a High Level Design (HLD) and Low Level Design (LLD) for every mapping, and a data migration document for smooth transfer of the project from development to testing and then to production.
• Conducted walkthroughs of the low-level designs, unit test plans, and implementation plans prepared by the team at various stages of the project, ensured that all team members followed PMP standards, and worked with the client to obtain approvals for design, coding, and implementation.

Environment: Informatica 9.1.1, Teradata 14, Oracle 11g, and UNIX Shell Scripting.

Liberty Mutual
Jan 10 – Apr 12
Role: ETL Developer
Base Location: Hyderabad, India

Responsibilities:

• Gathered system requirements and created a mapping document detailing source-to-target mappings and the implementation of business rules.
• Drafted Business Requirement Documents, System Requirement Specifications, Business Workflow Diagrams, Use Case Diagrams, Data Flow Diagrams, and Cross-Functional Diagrams to represent business and system requirements.
• Designed, developed, and debugged ETL mappings using the Informatica Designer tool.
• Created complex mappings using Aggregator, Expression, Joiner, Filter, Sequence, Procedure, Connected and Unconnected Lookup, and Update Strategy transformations in Informatica PowerCenter Designer.
• Used ETL extensively to load data from different sources, such as flat files and XML, into Oracle.
• Worked on mapping parameters and variables for the calculations done in Aggregator transformations.
• Implemented slowly changing dimensions to provide access to the full history of account and transaction information.
• Tuned and monitored Informatica workflows using the Informatica Workflow Manager and Workflow Monitor tools.

Environment: Informatica PowerCenter 8.6.1, Informatica PowerExchange, Teradata, UNIX, and Mainframe.

Marks and Spencer, UK
Apr 08 – Dec 09
Role: Mainframe Developer
Base Location: Chennai, India

Responsibilities:

• Attended client work group meetings and gathered requirements during the design phase.
• Prepared low-level designs.
• Coordinated and communicated with the offshore team through weekly status calls.
• Reviewed offshore design documents and code deliverables to ensure that the code was in line with the design specifications.
• Ensured that the quality process was followed at every stage of each enhancement.
• Trained and mentored new joiners on the team, and members of other teams, by conducting knowledge transfer (KT) sessions.
• Worked on the CFTO and CSSM applications and implemented them successfully in production.
• Wrote system test scripts and test scenarios for the applications developed.

Environment: COBOL II, JCL, DB2
