How to Copy Data from SQL Table to ADLS Gen2 in ADF

In today’s data-driven landscape, organizations often need to move large volumes of information between systems for analytics, reporting, or archiving. A common requirement is to copy data from a SQL table to Azure Data Lake Storage Gen2 (ADLS Gen2), enabling scalable storage and seamless integration with Azure analytics services. Azure Data Factory (ADF), Microsoft’s fully managed data integration platform, provides an efficient way to orchestrate this data movement.

This guide walks you through how to copy data from a SQL table to ADLS Gen2 in ADF, covering setup, configuration, pipeline creation, and validation to ensure a smooth and reliable transfer.

Step 1: Establish Linked Services for Secure Connections

The first step in Azure Data Factory is to set up Linked Services—these define the connection details and authentication for your source and destination systems.

In ADF Studio, navigate to the Manage tab and select Linked Services.

Click “New” and choose Azure Data Lake Storage Gen2 as the type. Assign a meaningful name such as “LS_ADLSGen2.”

For authentication, select Account Key, then link your ADLS Gen2 storage account. Click “Test Connection” to ensure access, and then hit “Create.”
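If you prefer to work with the underlying definitions (for example via ADF Studio’s code view or Git integration), an ADLS Gen2 Linked Service with account-key authentication is stored as JSON. A minimal sketch follows; the storage account name and key are placeholders you would substitute with your own values:

```json
{
  "name": "LS_ADLSGen2",
  "properties": {
    "type": "AzureBlobFS",
    "typeProperties": {
      "url": "https://<storage-account>.dfs.core.windows.net",
      "accountKey": {
        "type": "SecureString",
        "value": "<storage-account-key>"
      }
    }
  }
}
```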

Repeat a similar process to create a Linked Service for your Azure SQL Database if you haven’t already set this up. These Linked Services act as secure bridges, allowing ADF to read from your SQL database and write to your data lake storage.
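The Azure SQL Database Linked Service follows the same pattern. Here is a rough JSON sketch using a connection string; the server, database, and credential values are placeholders, and in production you would typically reference the secret from Azure Key Vault instead of embedding it:

```json
{
  "name": "LS_AzureSqlDb",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": {
        "type": "SecureString",
        "value": "Server=tcp:<server>.database.windows.net,1433;Database=<database>;User ID=<user>;Password=<password>;Encrypt=True;"
      }
    }
  }
}
```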

Step 2: Define Source and Sink Datasets

After establishing Linked Services, you need to create datasets, which specify the data structure and location for both the source (your SQL table) and the sink (your ADLS Gen2 storage).

  • Source Dataset:
    • Choose Azure SQL Database as the dataset type.
    • Connect it to your SQL Linked Service.
    • Point it to the specific table you want to export, such as “IndustryData.”
  • Sink Dataset:
    • Select Azure Data Lake Storage Gen2 as the dataset type.
    • Link it to the “LS_ADLSGen2” Linked Service you created.
    • Choose a container (e.g., “output”) and specify a file format, typically Delimited Text (CSV) for easy interoperability.

This setup ensures ADF knows exactly where to source the data and where to write it in your data lake.
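As a rough sketch, the two dataset definitions might look like the following JSON; the dataset names, the “dbo” schema, and the “IndustryData.csv” file name are illustrative assumptions:

```json
{
  "name": "DS_SqlIndustryData",
  "properties": {
    "type": "AzureSqlTable",
    "linkedServiceName": { "referenceName": "LS_AzureSqlDb", "type": "LinkedServiceReference" },
    "typeProperties": { "schema": "dbo", "table": "IndustryData" }
  }
}
```

```json
{
  "name": "DS_AdlsCsv",
  "properties": {
    "type": "DelimitedText",
    "linkedServiceName": { "referenceName": "LS_ADLSGen2", "type": "LinkedServiceReference" },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "output",
        "fileName": "IndustryData.csv"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }
}
```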

Step 3: Build the Data Pipeline to Copy Data from SQL Table to ADLS Gen2 in ADF

With Linked Services and datasets configured, you’re ready to build the copy pipeline itself. In ADF Studio, create a new pipeline.

From the Activities pane, drag the Copy Data activity into the pipeline canvas. Configure the Source tab by selecting your SQL dataset (e.g., IndustryData). For the Sink tab, select your ADLS dataset (the CSV in your chosen container).

In the Mapping tab, click “Import Schema” or “Import Mapping” to automatically map columns from the SQL table to fields in the CSV file. Review and adjust the mappings if necessary to ensure accuracy.
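Behind the scenes, the imported mapping is stored on the Copy activity as a “translator” property. The fragment below is a sketch of what it might look like for two hypothetical columns, Id and IndustryName (your actual column names will differ):

```json
"translator": {
  "type": "TabularTranslator",
  "mappings": [
    { "source": { "name": "Id" }, "sink": { "name": "Id" } },
    { "source": { "name": "IndustryName" }, "sink": { "name": "IndustryName" } }
  ]
}
```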

This Copy Data activity functions as the engine that extracts rows from your SQL database and writes them into a CSV file inside ADLS Gen2.
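Putting it all together, a minimal pipeline definition with a single Copy Data activity might look like the sketch below, reusing the illustrative names from the earlier examples:

```json
{
  "name": "PL_SqlToAdls",
  "properties": {
    "activities": [
      {
        "name": "CopySqlToCsv",
        "type": "Copy",
        "inputs": [ { "referenceName": "DS_SqlIndustryData", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "DS_AdlsCsv", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": {
            "type": "DelimitedTextSink",
            "storeSettings": { "type": "AzureBlobFSWriteSettings" },
            "formatSettings": { "type": "DelimitedTextWriteSettings", "fileExtension": ".csv" }
          }
        }
      }
    ]
  }
}
```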

Step 4: Debug, Run, and Validate Your Pipeline

Before scheduling or operationalizing your pipeline, it’s important to test it. Click “Debug” to execute the pipeline in test mode. Monitor the run in the Output window—if everything is configured correctly, you’ll see a green “success” status.

Next, navigate to your ADLS Gen2 account, open the designated container (like “output”), and verify that the CSV file has been created. Check that the data matches your expectations and that all required columns are present.

Benefits of Using Azure Data Factory for SQL-to-ADLS Data Movement

Azure Data Factory is a powerful choice for this scenario due to its automation capabilities, scalability, and seamless integration with the broader Azure analytics ecosystem. You can schedule data transfers, handle large tables, and export data in various formats, including CSV, Parquet, and JSON. Data stored in ADLS can then be processed by services like Azure Synapse Analytics, Azure Databricks, and Power BI.
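Scheduling, for instance, is handled by triggers. A daily Schedule Trigger attached to the pipeline above might look like the following sketch; the trigger name and the 2 a.m. UTC start time are arbitrary assumptions:

```json
{
  "name": "TR_DailyExport",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2025-01-01T02:00:00Z",
        "timeZone": "UTC"
      }
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "PL_SqlToAdls", "type": "PipelineReference" } }
    ]
  }
}
```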

Conclusion

Copying data from a SQL table to ADLS Gen2 with Azure Data Factory is a powerful strategy for modern data engineering. By configuring Linked Services, defining datasets, building a pipeline, and validating the output, you can streamline data movement for analytics, reporting, and archival processes. This approach aligns with Azure best practices for scalability, security, and flexibility.

