Migrating On-Prem Database To Cloud

About Client

The client is a pioneer in workforce management. It manages outsourced recruiting processes, on-demand staff recruiting, and entire facilities.

The client assigns 50,000+ people to work each day, drawing from a database of hundreds of thousands of candidates, and places 75,000+ people in permanent positions each year.

The client's specialized workforce solutions meet its customers' requirements for a reliable, efficient workforce. It serves a wide variety of industries, including construction, energy, manufacturing, financial services, pharmaceuticals, transportation, and aviation.

Objective

Migrate 15+ on-prem databases to the cloud, transform the data, and generate reports to drive business decisions.

Architecture

[Figure: architecture diagram]

AWS Services Used

AWS DMS – Database Migration Service, used to migrate data from the servers to S3.

Amazon S3 – Stores the data migrated from the servers.

AWS Glue – Ingests, transforms, and loads the data.

Amazon Redshift – Warehouses the loaded data for analytics.

Amazon Athena – Queries the data in S3.

Amazon QuickSight – Generates reports and graphs to drive business decisions.

Steps Taken

1. Set Up AWS

Set up the AWS environment, including the account, VPC, subnets, internet gateways, etc., based on the business requirements of the application.
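As an illustration of the subnet planning this step involves, here is a minimal sketch that splits a VPC address range into equally sized subnets using only the Python standard library. The CIDR block `10.0.0.0/16`, the `/24` subnet size, the public/private split, and the helper name `plan_subnets` are all assumptions made for the example; the case study does not state the actual network layout.

```python
import ipaddress
from itertools import islice
from typing import List


def plan_subnets(vpc_cidr: str, new_prefix: int, count: int) -> List[str]:
    """Return up to `count` subnets of size /new_prefix inside vpc_cidr."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields the child networks in address order.
    return [str(net) for net in islice(vpc.subnets(new_prefix=new_prefix), count)]


# Hypothetical layout: one /16 VPC carved into four /24 subnets,
# two public (routed through the internet gateway) and two private.
blocks = plan_subnets("10.0.0.0/16", new_prefix=24, count=4)
layout = {"public": blocks[:2], "private": blocks[2:]}
print(layout)
```

The resulting CIDR strings are the kind of values that would be supplied when the VPC and subnets are actually created, whether through the console, infrastructure-as-code, or an SDK.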

2. Migration of 15+ Databases

  • Using Database Migration Service (AWS DMS): this service migrates the data from the servers to S3.
  • Using Glue: a few data sources can't use DMS, so for those Glue, an ETL tool, is used in the following 3 steps:
    1. Ingest the data from MySQL, S3, or an API.
    2. Transform the data, for example by manually removing or adding columns.
    3. Load the data to S3, Redshift, or DynamoDB.
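The three ETL steps above can be sketched in plain Python. This is not the Glue API and not the client's actual job; it only shows the shape of the transform step (removing and adding columns). The sample rows, the column names, and the `transform` helper are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def transform(rows: Iterable[Row], drop: List[str], add: Row) -> Iterator[Row]:
    """Remove the columns named in `drop` and add the constant columns in `add`."""
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in drop}
        kept.update(add)
        yield kept


# Step 1 (ingest): hypothetical rows standing in for data read from MySQL, S3, or an API.
ingested = [
    {"candidate_id": 1, "name": "A. Example", "internal_note": "x"},
    {"candidate_id": 2, "name": "B. Example", "internal_note": "y"},
]

# Step 2 (transform): drop one column and add one.
transformed = list(
    transform(ingested, drop=["internal_note"], add={"source_system": "mysql"})
)

# Step 3 (load): represented by a plain list here; a real job would write
# the rows to S3, Redshift, or DynamoDB.
print(transformed)
```

Keeping the transform as a pure function over rows makes it easy to test locally before wiring it into a managed ETL job.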

3. Data Segregation

Once the data has landed in S3, the bucket is divided into 3 parts:

  • Landing Bucket: here the data is in raw form.
  • Structured Bucket: here the data is in structured form.
  • Analytical Bucket: here the data is in its final form, ready to be used for generating reports and machine learning.
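One way to picture the three-part layout is as a naming convention for object keys. The sketch below assumes each part is a top-level prefix (`landing/`, `structured/`, `analytical/`) and that a dataset keeps the same relative path as it moves between them. The source does not say whether separate buckets or prefixes were used, so the zone names, the sample key, and the `promote_key` helper are hypothetical.

```python
ZONES = ("landing", "structured", "analytical")


def promote_key(key: str, target_zone: str) -> str:
    """Rewrite an object key so it points at the same dataset in another zone."""
    if target_zone not in ZONES:
        raise ValueError(f"unknown zone: {target_zone}")
    zone, sep, rest = key.partition("/")
    if zone not in ZONES or not sep:
        raise ValueError(f"key is not under a known zone: {key}")
    return f"{target_zone}/{rest}"


# Hypothetical raw file written by the migration step.
raw_key = "landing/crm_db/candidates/part-0000.csv"
structured_key = promote_key(raw_key, "structured")
print(structured_key)
```

The same idea applies if the three parts are separate buckets: only the bucket name changes while the path within it stays stable.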

4. Query Data

  • Using Athena, the data is queried directly from S3.
  • Redshift is used for warehousing the data.
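As a rough sketch of what querying from S3 with Athena can look like, the snippet below assembles the request parameters for an Athena query. The parameter names match those accepted by the boto3 Athena client's `start_query_execution` call, but the database name, table name, columns, and results location are assumptions invented for the example. The request is only built, not sent, so it runs without AWS credentials.

```python
def athena_request(database: str, sql: str, output_location: str) -> dict:
    """Build keyword arguments for an Athena start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical query over a hypothetical `placements` table.
sql = (
    "SELECT industry, COUNT(*) AS placement_count "
    "FROM placements "
    "GROUP BY industry "
    "ORDER BY placement_count DESC"
)
request = athena_request("analytical_db", sql, "s3://example-athena-results/")

# With boto3 installed and credentials configured, this could be submitted as:
#   boto3.client("athena").start_query_execution(**request)
print(request["QueryString"])
```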

5. Data Insights

  • The queried data is then used to generate reports and data models.