ETL Data Extraction Process

The first part of an ETL process involves extracting data from the source systems. Most data warehousing projects consolidate data from several source systems, and each system may organize or format its data differently. The most common source formats are relational databases and flat files, but sources may also include non-relational database structures such as Information Management System (IMS), other data structures such as Virtual Storage Access Method (VSAM) or Indexed Sequential Access Method (ISAM), or even outside sources reached through web spidering or screen-scraping. Extraction converts the data into a format suitable for transformation processing. An intrinsic part of extraction is parsing the extracted data to check whether it meets an expected pattern or structure. If it does not, the data may be rejected entirely.
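
The parse-and-check step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed implementation: the flat file is a small CSV held in memory, and the field names (customer_id, email, signup_date), the patterns in EXPECTED_PATTERNS, and the extract function are all hypothetical. Rows that fit the expected structure are passed on for transformation; rows that do not are rejected.

```python
import csv
import io
import re

# Hypothetical expected structure: each field must fully match its pattern.
EXPECTED_PATTERNS = {
    "customer_id": re.compile(r"\d+"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "signup_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def extract(source):
    """Parse a CSV flat file and split its rows into accepted and rejected."""
    accepted, rejected = [], []
    for row in csv.DictReader(source):
        # A field is valid when it is present and matches the expected pattern.
        valid = all(
            row.get(field) is not None
            and pattern.fullmatch(row[field].strip())
            for field, pattern in EXPECTED_PATTERNS.items()
        )
        if valid:
            # Keep only the expected fields, with surrounding whitespace removed.
            accepted.append({f: row[f].strip() for f in EXPECTED_PATTERNS})
        else:
            rejected.append(row)
    return accepted, rejected


# Made-up sample data standing in for a flat-file source.
SAMPLE = """customer_id,email,signup_date
101,ana@example.com,2023-04-01
abc,bob@example.com,2023-04-02
103,not-an-email,2023-04-03
104,dee@example.com
"""

accepted, rejected = extract(io.StringIO(SAMPLE))
print(len(accepted), "accepted,", len(rejected), "rejected")
```

In this sketch the decision is made row by row. A stricter pipeline could instead reject the whole file as soon as one row fails, which is closer to the "rejected entirely" case mentioned above.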

