dsouzaankit/Data-Engineering

data-engineering

Large scale business data integration

  1. Joins SQL tables that have no common designated primary key
  2. Uses the database indexes of two different vendor tables to perform a scalable join
  3. Leverages the natural ordering of records to significantly reduce the number of search iterations
  4. Combines the data into a single consolidated table that records each row's source
  5. Uses the Python pandas library to manipulate and join records efficiently in memory
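
The consolidation step described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the repository's actual code: the table contents, the column names (`name`, `city`, `revenue`, `employees`), and the `consolidate` helper are all hypothetical. It assumes the two tables can be matched on a composite key built from shared descriptive columns, and it does not reproduce the index-based or ordering-based optimizations listed above.

```python
import pandas as pd


def consolidate(vendor_a, vendor_b, key_cols):
    """Outer-join two tables on a composite key and label each row's source.

    Hypothetical helper: assumes both frames contain every column in key_cols.
    """
    a = vendor_a.copy()
    b = vendor_b.copy()

    # Normalize the key columns so trivial differences (case, stray
    # whitespace) do not prevent a match.
    for col in key_cols:
        a[col] = a[col].astype(str).str.strip().str.lower()
        b[col] = b[col].astype(str).str.strip().str.lower()

    # indicator="source" adds a categorical column holding
    # left_only / right_only / both for every output row.
    merged = a.merge(b, on=key_cols, how="outer", indicator="source")

    # Replace the generic labels with the (assumed) table names.
    merged["source"] = merged["source"].cat.rename_categories(
        {"left_only": "vendor_a", "right_only": "vendor_b"}
    )
    return merged.sort_values(key_cols).reset_index(drop=True)


# Made-up sample data; neither table has a designated primary key.
vendor_a = pd.DataFrame(
    {
        "name": ["Acme Corp", "Globex", "Initech"],
        "city": ["Austin", "Boston", "Dallas"],
        "revenue": [120, 340, 90],
    }
)
vendor_b = pd.DataFrame(
    {
        "name": [" acme corp", "Initech", "Umbrella"],
        "city": ["austin", "Dallas", "Denver"],
        "employees": [15, 8, 42],
    }
)

result = consolidate(vendor_a, vendor_b, key_cols=["name", "city"])
print(result)
```

Rows found in only one table are kept, with missing values in the other table's columns, so the `source` column shows where each record came from.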
