Data Lake
From Clinfowiki
Revision as of 19:11, 26 October 2020
A data lake is a central repository that stores both structured and unstructured data from many sources. The concept is akin to a lake fed by multiple streams: data from each source flows into the reservoir and is stored as is, before it flows out to various applications within an organization.
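The store-as-is idea can be illustrated with a minimal sketch. It assumes a local folder stands in for the lake. The source names (`lab_system`, `clinic_notes`), the sample records, and the helper functions `ingest` and `read_json_records` are hypothetical and chosen only for illustration. Real data lakes typically sit on distributed or cloud object storage.

```python
import json
import shutil
import tempfile
from pathlib import Path


def ingest(lake_root: Path, source_name: str, file_path: Path) -> Path:
    """Copy a file into the lake unchanged, under a per-source folder."""
    target_dir = lake_root / "raw" / source_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file_path.name
    shutil.copy2(file_path, target)  # stored as is: no parsing, no schema
    return target


def read_json_records(lake_root: Path, source_name: str) -> list:
    """A consuming application applies structure only when it reads."""
    records = []
    for path in sorted((lake_root / "raw" / source_name).glob("*.json")):
        records.extend(json.loads(path.read_text()))
    return records


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        lake = tmp_path / "lake"  # hypothetical local stand-in for the lake

        # Two hypothetical sources: structured JSON and unstructured text.
        labs = tmp_path / "labs.json"
        labs.write_text(json.dumps([{"patient": 1, "test": "HbA1c", "value": 6.1}]))
        note = tmp_path / "note.txt"
        note.write_text("Patient reports improved sleep.")

        # Both streams flow into the same reservoir without transformation.
        ingest(lake, "lab_system", labs)
        stored_note = ingest(lake, "clinic_notes", note)

        # Data flows out to applications, each reading what it needs.
        return {
            "labs": read_json_records(lake, "lab_system"),
            "note": stored_note.read_text(),
        }


if __name__ == "__main__":
    print(demo())
```

In this sketch nothing is validated or reshaped on the way in. Interpretation is deferred to the reading application, which is the behavior the lake analogy describes.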
Functions of a Data Lake
Data Ingestion
- Tools
Data Storage and Retention
- Tools
Data Processing
- Tools
Data Access
- Tools
Difference from Data Warehouse
Data Swamp
A data lake that becomes unruly, so that its data can no longer be readily found or trusted, is referred to as a data swamp.
References
https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/
Submitted by Tom Nahass