Azure Synapse Analytics is a scalable, cloud-based data warehousing solution from Microsoft. It’s also the next iteration of Azure SQL Data Warehouse. In addition to offering all of the technology and features of SQL Data Warehouse, Azure Synapse also incorporates business intelligence, data analytics, and machine learning tools for both relational and non-relational data.

This guide explains what Azure Synapse Analytics is and how it can support your data goals. It also marks the release of Xplenty’s native Azure Synapse Analytics connector.

Please use these links to navigate the guide:

  1. Overview of Azure Synapse Analytics
  2. Features of Azure Synapse Analytics
  3. When to Use Azure Synapse Analytics
  4. Xplenty: ETL Data Into Azure Synapse Analytics the Easy Way

Integrate Your Data Today!

Try Xplenty free for 7 days. No credit card required.

Overview of Azure Synapse Analytics

If you’re familiar with Azure SQL Data Warehouse, you already know the core features of Synapse Analytics. Like SQL Data Warehouse, Synapse offers cloud-based relational data warehousing, massively parallel processing (MPP) scale-out technology, and enough computational power to efficiently manage petabytes of data.

In addition to these SQL Data Warehouse features, Synapse Analytics adds new capabilities like: 

  • The capability to ingest, save, query, and process non-relational data
  • More integrations with Microsoft technologies
  • Business intelligence integrations
  • Machine learning integrations
  • More efficient ingestion, transformation, management, and processing of large volumes of data

Azure Synapse Analytics can also run in an “on-demand serverless” model, which lets you scale up or down and pay only for what you use, or on pre-provisioned server resources, whichever better suits your budget and use case.

As for operational components, Synapse consists of four fundamental parts:

  1. SQL analytics: Synapse offers T-SQL analysis of your relational and non-relational data via a provisioned SQL pool (where you pay by the compute unit) and SQL on-demand (where you pay by the number of terabytes processed).
  2. Apache Spark: Synapse includes Apache Spark, a widely used engine for SQL queries, batch processing, stream processing, and machine learning analysis on large data stores.
  3. Synapse Analytics Studio: Synapse Analytics Studio offers a unified workspace where you can use all of your analytics tools related to AI, ML, IoT, and BI in a single place.
  4. Connectors for ingesting/integrating data from data sources: Synapse features 85 native connectors for integrating the most popular data sources so you can quickly ingest all of your data from diverse systems into the data warehouse.

Here’s a simple matrix that compares the general capabilities of Azure Synapse Analytics with other data management solutions:

General Feature/Capability | Azure Synapse | Azure SQL Database | SQL Server (hosted on VM) | Apache Hive (hosted on HDInsight) | Hive LLAP (hosted on HDInsight)
Is it a relational data store? X
Managed service? X
Does it need data orchestration? X X
SMP or MPP? | MPP | SMP | SMP | MPP | MPP
Does it have real-time reporting? X X
Does it offer flexible backup restore points? X
Can you integrate multiple data sources? X X
Is pausing compute supported? X X X X

(Source)

Features of Azure Synapse Analytics

Let’s review the defining features of Azure Synapse Analytics:

Cloud Data Warehousing, Machine Learning Analytics, and Business Intelligence

Through its deep integrations with a wide range of Microsoft Azure technologies, Azure Synapse offers cloud data warehousing, machine learning analytics, and dashboarding in a single workspace. This allows you to quickly ingest all of your data, transform and query it with SQL, analyze the data with advanced machine learning algorithms, and visualize it with Microsoft Power BI.

Ingest and Query Both Structured and Unstructured Data

Azure Synapse ingests all types of data, including relational (data warehouse) data and non-relational (data lake) data, and it lets you explore this data with SQL. In this way, Synapse brings all of your structured and unstructured data (LOB, CRM, Graph, Image, Social, IoT, etc.) under the same roof for easy access and analysis. 
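As a rough illustration of exploring data lake files with SQL, the sketch below builds a T-SQL query that uses `OPENROWSET` to read Parquet files in place. The storage account, container, and path are hypothetical placeholders, and the helper `build_lake_query` is invented for this example. Actually running the query would require a Synapse SQL endpoint with access to those files.

```python
def build_lake_query(storage_url: str, top: int = 10) -> str:
    """Return a T-SQL statement that samples rows from Parquet files in a data lake."""
    if top < 1:
        raise ValueError("top must be a positive integer")
    # Escape single quotes so the URL stays a valid T-SQL string literal.
    safe_url = storage_url.replace("'", "''")
    return (
        f"SELECT TOP {top} *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        f"    FORMAT = 'PARQUET'\n"
        f") AS lake_rows;"
    )


# Hypothetical ADLS Gen2 location; replace with a real account, container, and path.
EXAMPLE_URL = "https://examplelake.dfs.core.windows.net/sales/2020/*.parquet"

if __name__ == "__main__":
    print(build_lake_query(EXAMPLE_URL, top=5))
```

The function only assembles text, so you can inspect the generated SQL before sending it to any endpoint.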

Azure Data Lake Storage Gen2

Azure Synapse uses Azure Data Lake Storage Gen2 (ADLS Gen2) as its storage layer for large-volume data analytics. ADLS Gen2 combines ADLS Gen1 features (like file-level security, scaling, and file system semantics) with Azure Blob Storage features such as tiered storage, disaster recovery, and high availability.

Massively Parallel Processing (MPP)

Azure Synapse uses massively parallel processing (MPP) database technology, which allows it to aggregate and process large volumes of data efficiently for analytical workloads. An MPP database distributes data across many nodes that operate in parallel, each processing its own portion of a query. Separately from MPP, analytical databases such as Synapse commonly store tables by column rather than by row, as transactional databases typically do, which speeds up scans and aggregations that touch only a few columns. Together, these design choices facilitate complex, long-running analytical processes.
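To make the scatter-and-gather idea concrete, here’s a toy sketch in Python: rows are hash-distributed across a few simulated nodes, each node computes a partial aggregate, and a control step merges the partial results. The node count, the sample rows, and all function names are illustrative assumptions. A real MPP engine does this across separate machines rather than threads in one process.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NODE_COUNT = 4  # illustrative value; not a Synapse setting


def distribute(rows, key, node_count=NODE_COUNT):
    """Assign each row to a node by hashing its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        # crc32 gives a hash that is stable across runs (unlike built-in hash()).
        node_id = zlib.crc32(str(row[key]).encode("utf-8")) % node_count
        nodes[node_id].append(row)
    return nodes


def partial_sum(node_rows, group_key, value_key):
    """Per-node work: aggregate only the rows stored on this node."""
    totals = defaultdict(float)
    for row in node_rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def mpp_sum(rows, group_key, value_key):
    """Scatter rows to nodes, aggregate in parallel, then merge the partial results."""
    nodes = distribute(rows, group_key)
    with ThreadPoolExecutor(max_workers=NODE_COUNT) as pool:
        partials = list(pool.map(lambda n: partial_sum(n, group_key, value_key), nodes))
    merged = defaultdict(float)
    for partial in partials:
        for group, total in partial.items():
            merged[group] += total
    return dict(merged)


# Made-up sample data for the demonstration.
SALES = [
    {"region": "east", "amount": 100.0},
    {"region": "west", "amount": 250.0},
    {"region": "east", "amount": 50.0},
    {"region": "north", "amount": 75.0},
]

if __name__ == "__main__":
    print(mpp_sum(SALES, "region", "amount"))
```

Because the rows are distributed on the same key they are grouped by, each group lives on exactly one node and the merge step has little left to do. Choosing a distribution key that matches common joins and aggregations is a typical design concern in MPP systems for that reason.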

Cloud-Native Hybrid Transaction/Analytical Processing (HTAP) Implementation

Azure Synapse Analytics uses “Synapse Link” and HTAP implementation technology to achieve real-time data integrations with the Azure databases that make up your operational database infrastructure. The result is real-time machine learning and business intelligence insights drawn from live, operational data – without impacting your operational systems.

According to Gartner, “HTAP will enable business leaders to perform, in the context of operational processes, much more advanced and sophisticated real-time analysis of their business data than with traditional architectures. Large volumes of complex business data can be analyzed in real-time using intuitive data exploration and analysis without the latency of offloading the data to a data mart or data warehouse. This will allow business users to make more informed operational and tactical decisions.”
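As a conceptual sketch of the HTAP idea (and not a description of how Synapse Link is implemented), the Python below keeps an operational store and a separate analytical copy that is synced from a change feed. Analytical queries read only the copy, so they never scan the operational rows. Every class and method name here is hypothetical.

```python
from collections import defaultdict


class OperationalStore:
    """Store that serves the application's reads and writes."""

    def __init__(self):
        self.rows = {}
        self.change_feed = []  # ordered log of changes for downstream consumers

    def upsert(self, order_id, customer, amount):
        self.rows[order_id] = {"customer": customer, "amount": amount}
        self.change_feed.append((order_id, customer, amount))


class AnalyticalStore:
    """Separate copy of the data, kept in sync from the change feed."""

    def __init__(self):
        self.latest = {}  # most recent version of each order
        self._offset = 0  # how much of the change feed has been applied

    def sync(self, source):
        """Apply only the changes that arrived since the last sync."""
        for order_id, customer, amount in source.change_feed[self._offset:]:
            self.latest[order_id] = (customer, amount)
        self._offset = len(source.change_feed)

    def revenue_by_customer(self):
        """Analytical query that touches only the analytical copy."""
        totals = defaultdict(float)
        for customer, amount in self.latest.values():
            totals[customer] += amount
        return dict(totals)


if __name__ == "__main__":
    ops = OperationalStore()
    ops.upsert(1, "acme", 100.0)
    ops.upsert(2, "globex", 40.0)
    ops.upsert(1, "acme", 120.0)  # an update to an existing order

    analytics = AnalyticalStore()
    analytics.sync(ops)
    print(analytics.revenue_by_customer())
```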

On-Demand Serverless or Provisioned Processing Resources

Synapse gives you the ability to query massive data stores using either an on-demand serverless deployment (which scales automatically to handle the processing load) or provisioned resources. Organizations can either pay for what they need when they need it or reserve a set amount of pre-provisioned processing and storage capacity.
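A back-of-the-envelope comparison can help when choosing between the two models. The sketch below assumes the serverless option is billed per terabyte processed and the provisioned option is billed for the hours its compute is running. The prices are placeholders invented for illustration, not Azure’s actual rates.

```python
def serverless_cost(tb_processed: float, price_per_tb: float) -> float:
    """On-demand model: cost grows with the volume of data the queries scan."""
    return tb_processed * price_per_tb


def provisioned_cost(hours_running: float, price_per_hour: float) -> float:
    """Provisioned model: cost grows with how long the compute is kept running."""
    return hours_running * price_per_hour


def cheaper_option(tb_processed, price_per_tb, hours_running, price_per_hour):
    """Return the name of the cheaper model for one billing period."""
    on_demand = serverless_cost(tb_processed, price_per_tb)
    provisioned = provisioned_cost(hours_running, price_per_hour)
    return "serverless" if on_demand <= provisioned else "provisioned"


# Placeholder prices for illustration only; check current Azure pricing before deciding.
PRICE_PER_TB = 5.0
PRICE_PER_HOUR = 1.5

if __name__ == "__main__":
    # Light usage: few terabytes scanned in a month (720 hours).
    print(cheaper_option(10, PRICE_PER_TB, 720, PRICE_PER_HOUR))
    # Heavy usage: many terabytes scanned in the same month.
    print(cheaper_option(500, PRICE_PER_TB, 720, PRICE_PER_HOUR))
```

With these made-up numbers, occasional querying favors the serverless model, while sustained heavy scanning favors provisioned compute.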

Programming Language Compatibility

Azure Synapse is compatible with a wide range of languages, including Scala, Python, .NET, Java, R, SQL, T-SQL, and Spark SQL. This breadth makes it suitable for many different analytics tasks and data engineering profiles.

Easy Integrations with Microsoft Technology

As a Microsoft Azure product, Synapse integrates natively with your favorite Microsoft and Azure solutions such as Azure Blob Storage, Azure Data Lake, Azure Active Directory, Azure Machine Learning, and Power BI.

Open Data Initiative Compatibility

Azure Synapse readily integrates with solutions that adhere to the Open Data Initiative, which promotes easier data integration and compatibility between Adobe, Microsoft, and SAP technologies. Open Data Initiative solutions include products like Microsoft Dynamics 365, Microsoft Office, and Adobe Customer Experience Platform.

Workload Optimization and Management Features

Synapse facilitates query performance tuning and optimization via limitless concurrency, workload isolation, and workload management. For example, workload management can give greater importance to queries from key users, like the CEO, so that the CEO’s query is automatically promoted from “queued” status to “running” ahead of less important queries.

Watch this video from Microsoft for more information about what Synapse can do in terms of workload optimization.
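The importance-based promotion described above can be sketched as a small priority queue. This is a toy model in Python, not Synapse’s scheduler: the class name, the importance levels, and the single-slot setup are all assumptions made for illustration.

```python
import heapq
import itertools

# Higher number = more important. The level names are illustrative.
IMPORTANCE = {"low": 0, "normal": 1, "high": 2}


class WorkloadQueue:
    """Toy scheduler: runs at most `slots` queries; the most important queued query goes next."""

    def __init__(self, slots: int):
        self.slots = slots
        self.running = []
        self._queued = []
        self._order = itertools.count()  # tie-breaker keeps first-in, first-out within a level

    def submit(self, name: str, importance: str = "normal") -> None:
        """Run the query if a slot is free; otherwise queue it by importance."""
        if len(self.running) < self.slots:
            self.running.append(name)
        else:
            # heapq is a min-heap, so negate the importance to pop the highest first.
            heapq.heappush(self._queued, (-IMPORTANCE[importance], next(self._order), name))

    def finish(self, name: str) -> None:
        """Free a slot and promote the most important queued query, if any."""
        self.running.remove(name)
        if self._queued:
            _, _, promoted = heapq.heappop(self._queued)
            self.running.append(promoted)

    def queued(self):
        """Names of waiting queries in the order they would be promoted."""
        return [name for _, _, name in sorted(self._queued)]


if __name__ == "__main__":
    queue = WorkloadQueue(slots=1)
    queue.submit("nightly_report")
    queue.submit("analyst_query")
    queue.submit("ceo_dashboard", importance="high")
    queue.finish("nightly_report")
    print(queue.running, queue.queued())
```

Even though the analyst’s query was submitted first, the high-importance query takes the freed slot, which mirrors the CEO example.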

Security and Privacy

Synapse includes security and privacy features such as real-time data masking, dynamic data masking, always-on encryption, Azure Active Directory authentication, single sign-on authentication, and automated threat detection. The platform also allows you to control access to sensitive data via column-level and row-level security.
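To show how row-level security, column-level security, and dynamic masking fit together, here’s a minimal Python sketch. It only illustrates the concepts: in Synapse these controls are configured in the database itself, and the function names, the region-based rule, and the sample records below are all hypothetical.

```python
def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "****"
    return f"{local[0]}***@{domain}"


def apply_security(rows, user, allowed_columns, masked_columns=()):
    """Return only the rows and columns this user may see, masking where required."""
    visible = []
    for row in rows:
        # Row-level security: a user sees only rows for their own region.
        if row["region"] != user["region"]:
            continue
        # Column-level security: drop columns the user has no access to.
        out = {col: row[col] for col in allowed_columns if col in row}
        # Dynamic masking: values are obscured on read; the stored row is unchanged.
        for col in masked_columns:
            if col in out:
                out[col] = mask_email(out[col])
        visible.append(out)
    return visible


# Made-up sample data for the demonstration.
CUSTOMERS = [
    {"name": "Ana", "email": "ana@example.com", "region": "EU", "salary_band": "B"},
    {"name": "Bo", "email": "bo@example.com", "region": "US", "salary_band": "C"},
]
EU_ANALYST = {"name": "eu_analyst", "region": "EU"}

if __name__ == "__main__":
    print(apply_security(CUSTOMERS, EU_ANALYST, ["name", "email"], ["email"]))
```

The EU analyst gets one row, without the `salary_band` column and with the email address masked, while the underlying records stay intact.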

Here’s a matrix comparison of the security features of Synapse Analytics and other solutions:

Security Feature/Capability | Azure Synapse | Azure SQL Database | SQL Server (hosted on VM) | Apache Hive (hosted on HDInsight) | Hive LLAP (hosted on HDInsight)
What types of authentication? | SQL, Azure Active Directory | SQL, Azure Active Directory | SQL, Azure Active Directory | Local and Azure Active Directory | Local and Azure Active Directory
Is there row-level security? X √
Is there support for firewalls? √
Is there dynamic data masking? X √
Is there authorization? √
Is there auditing? √
Is there data encryption at rest? | √ | √ | √ | √ | √

(Source)

Compliance Certifications

According to Microsoft, Azure has more compliance certifications than any other cloud service provider. These certifications help your organization adhere to stringent government and industry compliance standards.

  • Global: CIS Benchmark; CSA-STAR attestation; CSA-STAR certification; CSA-STAR self assessment; ISO 20000-1:2011; ISO 22301; ISO 27001; ISO 27017; ISO 27018; ISO 27701; ISO 9001; SOC; WCAG
  • US Government: CJIS; CNSSI 1253; DFARS; DoD DISA L2, L4, L5; DoE 10 CFR Part 810; EAR (US Export Adm. Reg.); FedRAMP; FIPS 140-2; IRS 1075; ITAR; NIST 800-171; NIST CSF; Section 508 VPATS
  • Industry: 23 NYCRR Part 500; AFM + DNB (Netherlands); APRA (Australia); AMF and ACPR (France); CDSA; CFTC 1.31 (US); DPP (UK); EBA (EU); FACT (UK); FCA (UK); FDA CFR Title 21 Part 11; FERPA; FFIEC (US); FINMA (Switzerland); FINRA 4511; FISC (Japan); FSA (Denmark); GLBA; GxP; HIPAA / HITECH; HITRUST; KNF (Poland); MARS-E; MAS + ABS (Singapore); MPAA; NBB + FSMA (Belgium); NEN-7510 (Netherlands); NERC; OSFI (Canada); PCI DSS; RBI + IRDAI (India); SEC 17a-4; SEC Regulation SCI; Shared assessments; SOX; TISAX (Germany); TruSight; HDS (France)
  • Regional: BIR 2012 (Netherlands); C5 (Germany); CCPA (US-California); IRAP / CCSL (Australia); CS Mark Gold (Japan); Cyber Essentials Plus (UK); Canadian Privacy Laws; DJCP (China); EN 301 549 (EU); ENS (Spain); ENISA IAF (EU); EU Model Clauses; EU-US Privacy Shield; GB 18030 (China); GDPR (EU); G-Cloud (UK); IDW PS 951 (Germany); ISMS (Korea); IT Grundschutz Workbook (Germany); LOPD (Spain); MeitY (India); MTCS (Singapore); My Number (Japan); NZ CC Framework (New Zealand); PASF (UK); PDPA (Argentina); Personal Data Localization (Russia); TRUCS (China)

(Source)

When to Use Azure Synapse Analytics

Here are some general use-case scenarios where Azure Synapse Analytics may be useful:

  • Need for a managed service: Azure Synapse can serve as your managed cloud-based data warehouse instead of an on-site data warehouse that you have to maintain yourself. 
  • Large data sets and complex queries: Azure Synapse Analytics uses an MPP architecture (see above), which is excellent for managing large datasets while running complicated read and data analytics operations.  
  • Managing structured and unstructured datasets: If you’re dealing with unstructured data or a mix of structured and unstructured data, Azure Synapse integrates with Azure analytics services that let you process unstructured data with Spark, Azure Databricks, Hive LLAP, and Azure Data Lake Analytics. Azure Synapse also supports high-speed, compute-heavy read operations on structured data.
  • Data pipeline orchestration: Azure Synapse Analytics allows you to orchestrate data pipelines in order to separate historical data (into a data warehouse optimized for high-speed read operations) from real-time operational databases. 
  • Analytics on real-time operational data: Azure Synapse Analytics’ use of “Synapse Link” and HTAP implementation technology allows you to analyze real-time operational data without negatively impacting your operational systems.
  • Using many Microsoft and Azure services: If your organization already subscribes to and uses services within the Microsoft and Azure ecosystems, you’ll enjoy the fact that Synapse easily integrates with these services. 

Xplenty: ETL Data Into Azure Synapse Analytics the Easy Way

If you’re planning to use Azure Synapse Analytics to service your data warehousing, analytics, and business intelligence needs, you’ll need a way to quickly and easily move your data from diverse systems into the service. This is where Xplenty’s easy-to-use Azure Synapse Analytics connector can help.

As the only data integration tool that merges powerful ETL capabilities with ease-of-use, Xplenty empowers both non-tech-savvy team members and experienced data engineers to quickly design sophisticated workflows that safely extract, join, aggregate, mask, and encrypt data from multiple systems (while adhering to the most important data compliance standards), then load the data into your Azure Synapse Analytics data warehouse.

With Xplenty’s newly released native connector for Azure Synapse Analytics, anyone on your team, regardless of their data engineering skill level, can develop powerful ETL pipelines to your Azure Synapse Analytics data warehouse. Want to see how easy it is to use Xplenty for yourself? Contact the Xplenty team to schedule a free trial now.

Some images used courtesy of Microsoft