How to copy data from SQL Table to ADLS in ADF

Copying data from a SQL table to ADLS is a common requirement for businesses that want to move structured data into scalable storage for analytics, reporting, and machine learning. In this guide, we’ll walk you step by step through exporting data from an Azure SQL Database into Azure Data Lake Storage Gen2 (ADLS Gen2) using Azure Data Factory (ADF), giving you scalable storage and seamless integration with Azure analytics tools.

Azure Data Factory (ADF) is Microsoft’s fully managed data integration service designed to simplify complex data workflows. If you need to copy data from SQL Table to ADLS, ADF provides a powerful and efficient platform to orchestrate this process.

In the sections below, you’ll learn how to set up the environment, configure Linked Services, define datasets, build the copy pipeline, and validate the data transfer.

Step 1: Establish Linked Services for Secure Connections

To copy data from a SQL table to ADLS, you first need to set up Linked Services in Azure Data Factory. Linked Services define the connection details and authentication for your source and destination systems, enabling secure connectivity between ADF, your Azure SQL Database, and ADLS Gen2.

In ADF Studio, navigate to the Manage tab and select Linked Services. Click “New” and choose Azure Data Lake Storage Gen2 as the type. Assign a meaningful name such as “LS_ADLSGen2.”

For authentication, select Account Key, then link your ADLS Gen2 storage account.

Click “Test Connection” to ensure access, and then hit “Create.”

Repeat the same steps to create a Linked Service for your Azure SQL Database source, choosing Azure SQL Database as the type and supplying the server, database, and credentials. Together, these Linked Services act as secure bridges, allowing ADF to read from your SQL database and write to your data lake storage.
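
If you prefer to script this setup rather than click through ADF Studio, the same Linked Services can also be created with the azure-mgmt-datafactory Python SDK. The sketch below is illustrative only: the subscription, resource group, factory, server, and storage account values are placeholders, and in practice you would keep the secrets in Azure Key Vault rather than inline.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    LinkedServiceResource,
    AzureSqlDatabaseLinkedService,
    AzureBlobFSLinkedService,
    SecureString,
)

# Placeholder names -- replace with your own subscription, resource group, and factory.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY_NAME = "<data-factory-name>"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Source: Azure SQL Database Linked Service (connection string values are placeholders).
sql_ls = LinkedServiceResource(
    properties=AzureSqlDatabaseLinkedService(
        connection_string=SecureString(
            value="Server=tcp:<server>.database.windows.net,1433;"
                  "Database=<database>;User ID=<user>;Password=<password>;"
        )
    )
)
adf_client.linked_services.create_or_update(
    RESOURCE_GROUP, FACTORY_NAME, "LS_AzureSQL", sql_ls
)

# Sink: ADLS Gen2 Linked Service authenticated with the storage account key.
adls_ls = LinkedServiceResource(
    properties=AzureBlobFSLinkedService(
        url="https://<storageaccount>.dfs.core.windows.net",
        account_key="<storage-account-key>",
    )
)
adf_client.linked_services.create_or_update(
    RESOURCE_GROUP, FACTORY_NAME, "LS_ADLSGen2", adls_ls
)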

Step 2: Define Source and Sink Datasets

After establishing Linked Services, you need to create datasets, which specify the data structure and location for both the source (your SQL table) and the sink (your ADLS Gen2 storage).

  • Source Dataset:
    • Choose Azure SQL Database as the dataset type.
    • Connect it to your SQL Linked Service.
    • Point it to the specific table you want to export, such as “IndustryData.”
  • Sink Dataset:
    • Select Azure Data Lake Storage Gen2 as the dataset type.
    • Link it to the “LS_ADLSGen2” Linked Service you created.
    • Choose a container (e.g., “output”) and specify a file format, typically Delimited Text (CSV) for easy interoperability.

This setup ensures ADF knows exactly where to source the data and where to write it in your data lake.
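
For readers scripting the setup, here is a minimal sketch of the same source and sink datasets using the Python SDK. It assumes the adf_client, RESOURCE_GROUP, and FACTORY_NAME variables from the Step 1 sketch, and the table, container, and file names are placeholders.

from azure.mgmt.datafactory.models import (
    DatasetResource,
    AzureSqlTableDataset,
    DelimitedTextDataset,
    AzureBlobFSLocation,
    LinkedServiceReference,
)

# Assumes adf_client, RESOURCE_GROUP, and FACTORY_NAME from the Step 1 sketch.

# Source dataset: points at the SQL table to export (table name is a placeholder).
source_ds = DatasetResource(
    properties=AzureSqlTableDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_AzureSQL"
        ),
        table_name="dbo.IndustryData",
    )
)
adf_client.datasets.create_or_update(
    RESOURCE_GROUP, FACTORY_NAME, "DS_IndustryData_SQL", source_ds
)

# Sink dataset: a delimited text (CSV) file in the "output" container of ADLS Gen2.
sink_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_ADLSGen2"
        ),
        location=AzureBlobFSLocation(
            file_system="output",
            file_name="IndustryData.csv",
        ),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf_client.datasets.create_or_update(
    RESOURCE_GROUP, FACTORY_NAME, "DS_IndustryData_CSV", sink_ds
)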

Step 3: Build the Data Pipeline with Copy Data Activity

With Linked Services and datasets configured, you’re ready to build the pipeline that performs the copy.

In ADF Studio, create a new pipeline. From the Activities pane, drag the Copy Data activity into the pipeline canvas. Configure the Source tab by selecting your SQL dataset (e.g., IndustryData).

For the Sink tab, select your ADLS dataset (the CSV in your chosen container).

In the Mapping tab, click “Import Schema” or “Import Mapping” to automatically map columns from the SQL table to fields in the CSV file. Review and adjust the mappings if necessary to ensure accuracy.

This Copy Data activity functions as the engine that extracts rows from your SQL database and writes them into a CSV file inside ADLS Gen2.
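
If you are scripting the pipeline instead, the equivalent definition with the Python SDK looks roughly like this; the activity and pipeline names are placeholders, and the client and dataset names come from the earlier sketches.

from azure.mgmt.datafactory.models import (
    PipelineResource,
    CopyActivity,
    DatasetReference,
    AzureSqlSource,
    DelimitedTextSink,
)

# Assumes adf_client, RESOURCE_GROUP, and FACTORY_NAME from the Step 1 sketch.

# Copy activity: read rows from the SQL dataset and write them to the CSV dataset.
copy_activity = CopyActivity(
    name="CopySqlToAdls",
    inputs=[DatasetReference(type="DatasetReference", reference_name="DS_IndustryData_SQL")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="DS_IndustryData_CSV")],
    source=AzureSqlSource(),
    sink=DelimitedTextSink(),
)

pipeline = PipelineResource(activities=[copy_activity])
adf_client.pipelines.create_or_update(
    RESOURCE_GROUP, FACTORY_NAME, "PL_CopySqlToAdls", pipeline
)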

Step 4: Debug, Run, and Validate Your Pipeline

Before scheduling or operationalizing your pipeline, it’s important to test it.

Click “Debug” to execute the pipeline in test mode.

Monitor the run in the Output window—if everything is configured correctly, you’ll see a green “success” status.

Next, navigate to your ADLS Gen2 account, open the designated container (like “output”), and verify that the CSV file has been created. Check that the data matches your expectations and that all required columns are present.
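
Debug runs are a Studio feature, but if you are working from code you can trigger and monitor a published pipeline run with the same SDK. This sketch reuses the client and pipeline name from the earlier examples.

import time

# Assumes adf_client, RESOURCE_GROUP, and FACTORY_NAME from the Step 1 sketch.

# Trigger a run of the pipeline created above and poll until it finishes.
run = adf_client.pipelines.create_run(
    RESOURCE_GROUP, FACTORY_NAME, "PL_CopySqlToAdls", parameters={}
)

while True:
    pipeline_run = adf_client.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME, run.run_id)
    if pipeline_run.status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

# Expect "Succeeded" when the copy completes; otherwise inspect pipeline_run.message.
print(pipeline_run.status)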

Benefits of Using Azure Data Factory for SQL-to-ADLS Data Movement

Azure Data Factory is a powerful choice for this scenario due to its automation capabilities, scalability, and seamless integration with the broader Azure analytics ecosystem. You can schedule data transfers, handle large tables, and export data in various formats, including CSV, Parquet, and JSON. Data stored in ADLS can then be processed by services like Azure Synapse Analytics, Azure Databricks, and Power BI.
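
As a concrete illustration of the scheduling capability, the sketch below attaches a daily schedule trigger to the pipeline from the earlier examples; the trigger name and recurrence settings are assumptions you would adapt to your own needs.

from datetime import datetime, timedelta
from azure.mgmt.datafactory.models import (
    TriggerResource,
    ScheduleTrigger,
    ScheduleTriggerRecurrence,
    TriggerPipelineReference,
    PipelineReference,
)

# Assumes adf_client, RESOURCE_GROUP, and FACTORY_NAME from the Step 1 sketch.

# Run the copy pipeline once a day, starting tomorrow (illustrative schedule).
recurrence = ScheduleTriggerRecurrence(
    frequency="Day",
    interval=1,
    start_time=datetime.utcnow() + timedelta(days=1),
    time_zone="UTC",
)
trigger = TriggerResource(
    properties=ScheduleTrigger(
        recurrence=recurrence,
        pipelines=[
            TriggerPipelineReference(
                pipeline_reference=PipelineReference(
                    type="PipelineReference", reference_name="PL_CopySqlToAdls"
                ),
                parameters={},
            )
        ],
    )
)
adf_client.triggers.create_or_update(RESOURCE_GROUP, FACTORY_NAME, "TR_DailyCopy", trigger)

# begin_start is the long-running start operation in recent azure-mgmt-datafactory versions.
adf_client.triggers.begin_start(RESOURCE_GROUP, FACTORY_NAME, "TR_DailyCopy").result()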

Conclusion

Using Azure Data Factory (ADF) to copy data from a SQL table to ADLS Gen2 provides a powerful and scalable solution for modern data engineering workflows. By creating Linked Services, defining datasets, building pipelines with the Copy Data activity, and validating outputs, you can automate and streamline data movement for analytics, reporting, and archival purposes.

This approach not only enhances operational efficiency but also follows Azure best practices for secure, scalable, and flexible data management—making it an ideal choice for organizations looking to optimize their cloud-based data workflows.

