ETL (Extract, Transform, Load) is a data-integration process: data is first extracted from a source, sent to an intermediate staging area where it is transformed, and only then loaded into the destination.
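The three steps can be sketched with a toy pipeline. This is a minimal illustration, not a production tool: the source rows, table name, and cleanup rules are all hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3

# Hypothetical source rows, standing in for an extract from an API or a file.
def extract():
    return [{"name": " Alice ", "amount": "10.5"},
            {"name": "Bob", "amount": "4"}]

# The transformation happens in the staging layer,
# before anything reaches the destination.
def transform(rows):
    return [(r["name"].strip(), float(r["amount"])) for r in rows]

# Only clean, typed rows are loaded into the destination.
def load(rows, conn):
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0])  # → 14.5
```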
A new generation of tools made it possible to move from ETL to ELT. The key difference between them is the order of operations: in ELT, data is extracted from various sources, loaded directly into the destination, and only then transformed there. ELT's main benefit shows when working with large volumes of data.
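For contrast, the same toy pipeline in ELT order: raw, untyped rows are loaded first, and the transformation runs inside the destination using its own SQL engine (again a sketch with hypothetical data, with SQLite standing in for a cloud warehouse).

```python
import sqlite3

# Raw rows are loaded into the destination as-is; hypothetical example data.
raw = [(" Alice ", "10.5"), ("Bob", "4")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw)

# The transformation runs after loading, inside the destination's SQL engine.
conn.execute("""
    CREATE TABLE sales AS
    SELECT TRIM(name) AS name, CAST(amount AS REAL) AS amount
    FROM raw_sales
""")
print(conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0])  # → 14.5
```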
Nevertheless, the ELT field is still in its infancy and developing rapidly. Questions around processing confidential information (PHI, PII) remain open. That keeps the discussion about doing a small amount of processing before loading relevant, and it has led to the emergence of a hybrid approach (ETLT).
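The ETLT idea can be illustrated as a small pre-load step that pseudonymizes PII, with the heavier transformation still deferred to the destination. The records, field names, and hashing choice below are illustrative assumptions, not a prescribed compliance technique.

```python
import hashlib
import sqlite3

# Hypothetical raw records containing PII (an email address).
raw = [{"email": "alice@example.com", "amount": "10.5"},
       {"email": "bob@example.com", "amount": "4"}]

# Small "t": pseudonymize PII before the data leaves the source side,
# so the destination never sees the raw identifiers.
def mask(rows):
    return [(hashlib.sha256(r["email"].encode()).hexdigest(), r["amount"])
            for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (user_hash TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", mask(raw))

# Big "T": the remaining transformation still happens after loading.
conn.execute("""
    CREATE TABLE sales AS
    SELECT user_hash, CAST(amount AS REAL) AS amount FROM raw_sales
""")
print(conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0])  # → 14.5
```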
While companies such as Snowflake, BigQuery, and Redshift have changed how data is stored, managed, and accessed, the data-integration industry has been evolving as well. Cloud warehouses open up many opportunities to automate engineering tasks whose main goal is extracting and loading data (without transforming it). These opportunities fueled the growth of companies such as Segment, Stitch, and Fivetran.
Let's take Fivetran as an example of an automated ELT platform. It lets teams collect and analyze data by connecting sources to a central repository. Fivetran offers a wide variety of connectors through which data is extracted from different sources and loaded into a warehouse. The process runs automatically as a fully managed service that requires no maintenance, which has enabled non-engineering teams to configure connectors for data integration and management.
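Conceptually, such a connector repeatedly copies new rows from a source into the warehouse, remembering how far it got. The sketch below is a generic incremental-sync loop under assumed table names and a cursor column; it does not reflect Fivetran's actual API, and both databases are SQLite stand-ins.

```python
import sqlite3

# Stand-ins for a source system and a warehouse;
# a real connector would talk to external APIs or databases.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
source.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])

dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

cursor = 0  # high-water mark; a managed service persists this between runs

def sync():
    """Copy only rows newer than the cursor, then advance the cursor."""
    global cursor
    rows = source.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (cursor,)
    ).fetchall()
    dest.executemany("INSERT INTO events VALUES (?, ?)", rows)
    if rows:
        cursor = rows[-1][0]

sync()                                          # initial sync copies both rows
source.execute("INSERT INTO events VALUES (3, 'c')")
sync()                                          # incremental sync copies only id 3
print(dest.execute("SELECT COUNT(*) FROM events").fetchone()[0])  # → 3
```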
Such tools are now widely used, and the company's performance is proof of that: over the past year, its Series C round valued the company at $1.2 billion.