{"id":44368,"date":"2022-11-30T19:05:57","date_gmt":"2022-11-30T16:05:57","guid":{"rendered":"http:\/\/datalabsua.com\/ua\/?p=44368"},"modified":"2024-05-22T17:04:26","modified_gmt":"2024-05-22T14:04:26","slug":"key-differences-between-database-data-warehouse-data-mart-and-data-lake","status":"publish","type":"post","link":"https:\/\/datalabsua.com\/en\/key-differences-between-database-data-warehouse-data-mart-and-data-lake\/","title":{"rendered":"Key differences between database, data warehouse, data mart and data lake"},"content":{"rendered":"<p>To make the right decision when choosing a data organization system, it is advisable to compare the options side by side.<\/p>\n<p><strong>Key differences between database and data warehouse<\/strong><\/p>\n<p><strong>Data Warehouse<\/strong><\/p>\n<ul>\n<li>stores summarized data;<\/li>\n<li>is used for data analysis;<\/li>\n<li>stores historical and current data;<\/li>\n<li>collects information from various sources;<\/li>\n<li>provides information on overall business operations;<\/li>\n<\/ul>\n<p><strong>Database<\/strong><\/p>\n<ul>\n<li>uses detailed data;<\/li>\n<li>records transactions;<\/li>\n<li>stores current data;<\/li>\n<li>collects data from a single source;<\/li>\n<li>records the main day-to-day operations;<\/li>\n<\/ul>\n<p><strong>Key differences between data mart and data warehouse<\/strong><\/p>\n<p><strong>Data Mart<\/strong><\/p>\n<ul>\n<li>provides a subject-specific subset of data retrieved from the data warehouse (usually less than 100 GB in size);<\/li>\n<li>is a repository of valuable data for a specific subgroup;<\/li>\n<li>enables fast data analysis;<\/li>\n<li>gets its data from the data warehouse;<\/li>\n<\/ul>\n<p><strong>Data Warehouse<\/strong><\/p>\n<ul>\n<li>is significantly larger (a terabyte or more);<\/li>\n<li>contains all cleaned data for business units;<\/li>\n<li>gets its data from databases;<\/li>\n<\/ul>\n<p><strong>Key differences between data lake and data 
mart<\/strong><\/p>\n<p><strong>Data Lake<\/strong><\/p>\n<ul>\n<li>contains all raw and unfiltered organization data;<\/li>\n<li>is well suited to wider and deeper analysis of raw data;<\/li>\n<li>is a complete solution that acts as a data warehouse, database and data mart;<\/li>\n<li>provides a central archive where data marts can be stored in different user areas;<\/li>\n<\/ul>\n<p><strong>Data Mart<\/strong><\/p>\n<ul>\n<li>contains filtered and structured data for a specific department;<\/li>\n<li>allows relevant information to be analyzed quickly and efficiently;<\/li>\n<li>is a one-time solution without an ETL process;<\/li>\n<\/ul>\n<p><strong>Key differences between data lake and data warehouse<\/strong><\/p>\n<p><strong>Data Warehouse<\/strong><\/p>\n<ul>\n<li>stores cleaned data used to create structured data models and reports;<\/li>\n<li>uses an operational data store (ODS) from transactional systems;<\/li>\n<li>is intended for users who need to create analytical reports;<\/li>\n<\/ul>\n<p><strong>Data Lake<\/strong><\/p>\n<ul>\n<li>stores all data for the organization;<\/li>\n<li>uses hardware that makes it economical to store large amounts of data (terabytes, petabytes);<\/li>\n<li>extracts data of all types, including non-traditional data types (web service logs, social media activity, sensor data, etc.);<\/li>\n<li>is designed for deep analysis that goes beyond the scope of the data stored in the repository;<\/li>\n<\/ul>\n<p><strong>Key differences between databases and data mart<\/strong><\/p>\n<p><strong>Database<\/strong><\/p>\n<ul>\n<li>is a transactional data repository (OLTP);<\/li>\n<li>records all aspects and activities of one particular subject;<\/li>\n<li>contains raw data;<\/li>\n<li>users do not interact directly with the data in databases;<\/li>\n<li>is the first step in the ETL process;<\/li>\n<\/ul>\n<p><strong>Data Mart<\/strong><\/p>\n<ul>\n<li>is a store of analytical data (OLAP);<\/li>\n<li>contains data from several 
subjects;<\/li>\n<li>contains processed and verified data, which simplifies the creation of reports;<\/li>\n<li>users interact directly with the data in data marts;<\/li>\n<li>is the last step in the ETL process;<\/li>\n<\/ul>\n<p><strong>Key differences between databases and data lake<\/strong><\/p>\n<p><strong>Database<\/strong><\/p>\n<ul>\n<li>records transactional data related to one topic;<\/li>\n<li>stores traditional data (text, numbers);<\/li>\n<li>stores raw data without performing data cleaning;<\/li>\n<li>exports its data to another process (an operational data store);<\/li>\n<li>is the first step in the ETL process;<\/li>\n<\/ul>\n<p><strong>Data Lake<\/strong><\/p>\n<ul>\n<li>records the activity of many databases and other disparate data sources;<\/li>\n<li>can store data of any type (PDF files, images, sound files, etc.);<\/li>\n<li>stores raw data, although a data cleansing procedure is also implemented;<\/li>\n<li>performs all data processing (cleansing and aggregation);<\/li>\n<li>handles all aspects of the ETL process.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Data Lake contains all raw and unfiltered organization data; it&#8217;s well suited to wider and deeper analysis of raw data. 
Data lake is a complete solution that acts as a data warehouse, database and data mart<\/p>\n","protected":false},"author":2,"featured_media":44845,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[97,85,159,86],"class_list":["post-44368","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","tag-database","tag-datalake","tag-datamart","tag-datawarehouse"],"_links":{"self":[{"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/posts\/44368","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/comments?post=44368"}],"version-history":[{"count":4,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/posts\/44368\/revisions"}],"predecessor-version":[{"id":44373,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/posts\/44368\/revisions\/44373"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/media\/44845"}],"wp:attachment":[{"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/media?parent=44368"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/categories?post=44368"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datalabsua.com\/en\/wp-json\/wp\/v2\/tags?post=44368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}