IBM Db2 is a cutting-edge suite of data management products from IBM that includes the following tools:

  • Db2 Database: An RDBMS for transactional workloads and operational systems.
  • Db2 on Cloud: A scalable, cloud-based RDBMS for transactional workloads and operational systems.
  • Db2 Warehouse: A data warehouse for large volume data aggregation, data analysis, and business intelligence.
  • Db2 Warehouse on Cloud: A scalable, cloud-based data warehouse for large volume data aggregations, data analysis, and business intelligence. 
  • Db2 Big SQL: A high-speed SQL data engine on Hadoop with an MPP architecture.
  • Db2 Event Store: A data management system for high-volume streaming data capture and analysis.

The Db2 family includes relational database management systems, data warehouse management systems, a high-performance data engine, and an information capture and analytics solution for streaming data. Built to work together on-premises and in the cloud, IBM Db2 incorporates AI and machine learning technologies so organizations can break down their information silos and derive deeper insights from their data.

In this guide to IBM Db2, you’ll learn about the history of IBM Db2, its current portfolio of products, and how IBM Db2 tools are used in modern business and app development. We also announce the release of Integrate.io’s native IBM Db2 connector, which allows you to extract, transform, and load data from IBM Db2’s entire portfolio of data products and services.

Table of Contents

  1. IBM Db2 History
  2. IBM Db2 Product Family
  3. Integrate.io: ETL Data From IBM Db2 the Easy Way

IBM Db2 History

The history of IBM Db2 dates back to 1970, when IBM researcher Edgar F. Codd first described the relational database model. In 1974, IBM used Codd’s theories to develop the relational database management system known as System R, which used the now-ubiquitous sublanguage SQL (Structured Query Language). After evolving its RDBMS technology over several years, IBM released “DB2” (IBM Database 2) in 1983. Marilyn Bohl, Don Haderle, and Bob Jackson led the development of DB2 Version 1 in the early 1980s.


From left to right, this image shows Marilyn Bohl, Don Haderle, and Bob Jackson, who led DB2 Version 1 development in the early 1980s. This image was sourced from Semanticscholar.org.

The contributions of these and other IBM leaders to the formation of Db2 are outlined in the IBM-published book IBM Db2 the Past, Present, and Future. According to the authors:

“Today, DB2 enjoys a wide installation base among enterprises throughout the world. The product suite, anchored by its flagship mainframe DBMS, drives billions of dollars in revenue to IBM annually and supports mission-critical applications for most major corporations around the world. However, its success wasn’t a given during its early days, as DB2 faced considerable business and technical challenges during its formative years. Ultimately, combined efforts from key IBM business leaders, developers, and researchers enabled IBM to overcome these challenges and deliver a popular suite of software offerings based on relational DBMS technology.”

At first, DB2 was only available on IBM mainframes. In the 1990s, IBM released Db2 for platforms like OS/2, MS Windows, Unix, and Linux. Over the next three decades, noteworthy Db2 developments included:

  • Mid-1990s: Db2 introduced a shared-nothing architecture, which scales by partitioning a database across multiple interconnected Db2 servers. This capability is now called the Database Partitioning Feature (DPF) and comes with Db2 Enterprise.
  • Early 2000s: Db2 incorporated object-relational extensions to become an object-SQL DBMS after IBM purchased Informix Software in 2001.
  • Mid-2006: IBM announced that DB2 9 or “Viper” was the first relational database with a native XML storage capability. Other developments included OLTP enhancements to support distributed platforms, and data warehouse enhancements to support business intelligence.
  • Mid-2009: IBM announced that DB2 9.7 or “Cobra” would feature data compression for database indexes, large objects, and temporary tables. IBM also added features to help Oracle Database users adopt Db2 with less of a learning curve.
  • 2009: IBM announced that Db2 could serve as a storage engine for MySQL.
  • Early 2012: IBM announced Db2 10.1 or “Galileo,” which added new data management capabilities. Row and column access control enabled fine-grained control over who can see which data, and multi-temperature data management moved data to more cost-effective storage depending on how “hot” or “cold” it was (that is, how frequently it was accessed). Galileo also introduced “adaptive compression” for data tables.
  • Mid-2017: IBM re-branded all DB2 and dashDB products under the name “Db2.”
  • Mid-2019: IBM announced the release of Db2 11.5 or “AI Database,” which includes AI features for better query performance and improved AI application development. 
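
The shared-nothing partitioning mentioned in the mid-1990s item can be illustrated with a small sketch. It is not Db2's implementation: the function names `partition_for` and `distribute` are hypothetical, and SHA-256 is only a stand-in for whatever hashing Db2 applies to a distribution key. The point it shows is that each row is owned by exactly one partition, chosen deterministically from its key.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a distribution key to a partition number with a stable hash.

    SHA-256 is an illustrative stand-in, not the hash Db2 uses.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribute(rows, key_column, num_partitions):
    """Group rows by the partition that owns their distribution key."""
    partitions = defaultdict(list)
    for row in rows:
        owner = partition_for(str(row[key_column]), num_partitions)
        partitions[owner].append(row)
    return dict(partitions)


# Hypothetical sample data: eight orders spread over three partitions.
rows = [{"customer_id": i, "amount": i * 10} for i in range(1, 9)]
placement = distribute(rows, "customer_id", num_partitions=3)

for partition_number, owned_rows in sorted(placement.items()):
    print(partition_number, [r["customer_id"] for r in owned_rows])
```

Because the mapping depends only on the key, any server can work out which partition holds a given row without consulting the others, which is what lets this kind of design scale by adding servers.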

The evolution of IBM Db2 spans four decades. Today, this portfolio of data management products remains a highly competitive solution in the data management, data warehousing, and data analytics industries.

IBM Db2 Product Family

The current incarnation of the Db2 family consists of a variety of data management and analytics tools – which incorporate AI-powered and machine learning technologies – to help businesses manage, analyze, and understand their structured and unstructured data. 

Db2 products work with data on-premises, in the cloud, or with multi-cloud setups. In many cases, enterprises access the Db2 family of products via IBM Cloud Pak for Data, which serves as an integrated data management/analytics solution where you can take advantage of multiple Db2 services – including AI/ML technology – all under the same roof. 

Db2 Database

Db2 Database is a SQL-based RDBMS (relational database management system) designed for operational systems. It is a high-performance, highly reliable transactional database whose features include storage optimization, in-memory technology, workload management, continuous data availability, and management and development tools. This database runs on Windows, Unix, or Linux.

Benefits of Db2 Database include:

  • Integrates with as many as 10 common programming languages.
  • Offers a REST service for integrating with web, mobile, and cloud-based apps. 
  • Lowers storage needs by 47% and increases compression by 39% (according to IBM).

Db2 on Cloud

Db2 on Cloud is a cloud-based, fully managed, highly available transactional database. IBM claims that this SQL-based RDBMS delivers 99.99% uptime. Some of the most compelling features of Db2 on Cloud are automatic security updates, independently scalable storage, and independently scalable processing.

Db2 on Cloud scales up and down based on request loads and usage requirements. It offers encryption, data federation, and backup/recovery. Db2 on Cloud is available through AWS and IBM Cloud, and you can also deploy it on a private network that you access through a secure VPN. Db2 Hosted is the unmanaged, hosted version of the product.

Use cases for Db2 on Cloud include:

  • Store/retrieve data for your web applications with a highly-available cloud-based database.
  • Satisfy compliance and security requirements by using data virtualization to store only required data in the public cloud, and maintain sensitive information (including PII data) on-prem. 
  • Build scalable, resilient web applications by storing application data in Db2 on Cloud. 

Db2 Warehouse

As a data warehouse management solution on par with leading competitors, Db2 Warehouse empowers enterprises to gather relational and non-relational data from diverse business systems and operational databases. From there, they can analyze the data using machine learning algorithms and other analytical models to reveal hidden relationships, patterns, forecasts, and other deep insights.

Db2 Warehouse supports geospatial, relational, non-relational, and XML data. Its capabilities include predictive modeling, massively parallel processing (MPP), in-memory analytics, RStudio, and the Apache Spark analytics engine. Run Db2 Warehouse (managed or unmanaged) on the private cloud, on different public clouds, or on-premises.

Use cases for Db2 Warehouse include:

  • Rapidly deploy a pre-configured, elastically scaling, fully managed data warehouse on infrastructure that supports Docker containers.
  • Perform in-database analytics using Spark and R. Predictive modeling algorithms run faster because they are built into the database itself.
  • Achieve faster performance for complex queries through in-memory SQL columnar processing and an MPP (massively parallel processing) architecture. 
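
The MPP idea in the last item can be sketched in a few lines. This is only an illustration of the general scatter-gather pattern, not Db2 Warehouse's engine: the names `partial_aggregate` and `parallel_average` are hypothetical, and threads stand in for separate database nodes. Each worker aggregates its own slice of the data, and a coordinator merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(partition):
    """Each worker computes a local (sum, count) over its own slice of data."""
    return sum(partition), len(partition)


def parallel_average(partitions):
    """Scatter the aggregation to workers, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_aggregate, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Hypothetical data already split across three "nodes".
partitions = [[10, 20, 30], [40, 50], [60]]
average = parallel_average(partitions)
print(average)  # 35.0
```

Note that the workers return a sum and a count rather than a local average. Averages of unevenly sized partitions cannot be combined directly, so the merge step needs both values.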

Db2 Warehouse on Cloud (formerly dashDB for Analytics)

Db2 Warehouse on Cloud is a fully managed data warehouse system available on the cloud. Db2 Warehouse on Cloud automatically scales up and down based on usage requirements to handle virtually any level of workload. The platform is well suited to compute-heavy machine learning tasks and other types of analytical processing on massive datasets.

Db2 Warehouse on Cloud features an autonomous, self-tuning data engine and other fully automated features for database monitoring, operations monitoring, and uptime monitoring. It also supports querying compressed datasets, in-memory processing, and data skipping, and it uses a column-oriented storage architecture. The platform includes algorithms and analyses such as k-means, ANOVA, regression analysis, and association rules, along with support for Esri data types and spatial analytics. Db2 Warehouse on Cloud also offers Python drivers and works with Jupyter Notebooks. The platform is available on IBM Cloud or AWS.
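
To make the data skipping mentioned above more concrete, here is a minimal sketch of the general technique over a single column. It assumes the common approach of keeping minimum and maximum values per block of rows. The `Block` class and the `build_blocks` and `scan_greater_than` functions are hypothetical, and Db2's internal structures are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A slice of one column plus the min/max metadata kept for it."""
    values: list
    min_value: int
    max_value: int


def build_blocks(column, block_size):
    """Split a column into fixed-size blocks and record each block's range."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_greater_than(blocks, threshold):
    """Return matching values and how many blocks actually had to be read."""
    matches, blocks_read = [], 0
    for block in blocks:
        if block.max_value <= threshold:
            continue  # skip: no value in this block can satisfy the predicate
        blocks_read += 1
        matches.extend(v for v in block.values if v > threshold)
    return matches, blocks_read


# Hypothetical column of order totals, stored in blocks of four values.
order_totals = [5, 8, 12, 15, 40, 42, 47, 51, 90, 95, 97, 99]
blocks = build_blocks(order_totals, block_size=4)
matches, blocks_read = scan_greater_than(blocks, threshold=50)
print(matches, blocks_read)  # [51, 90, 95, 97, 99] 2
```

In this example the first block is never read because its maximum is below the threshold. The benefit grows when the data is roughly ordered on the filtered column, since more blocks can be ruled out from metadata alone.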

Use cases for Db2 Warehouse on Cloud include:

  • Ingest and analyze cross-organizational data for better business intelligence and more strategic decision-making.
  • Ingest and analyze cloud-based, clickstream, and sensor data to create new machine learning models and AI-driven tools.
  • Ingest, aggregate, and analyze transactional data from POS systems to develop new sales campaigns and other business strategies. 

Db2 Big SQL (formerly IBM Big SQL)

Db2 Big SQL is a powerful SQL data engine on Hadoop. This solution offers an MPP (massively parallel processing) architecture that delivers enterprise scalability to handle virtually any level of demand. With the power of Db2 SQL, you can access and query data from diverse sources easily and securely. 

Whether you are querying HDFS, WebHDFS, RDBMS, NoSQL databases, or object stores, this hybrid ANSI-compliant SQL engine delivers the processing power and speed to handle the job. Db2 Big SQL is also particularly adept at querying unstructured streaming data, and it works with the entire Db2 product family.

Use cases for Db2 Big SQL include:

  • Integrate new kinds of unstructured and partly structured data (such as sentiment, streaming audio-visual, log, or social media) with your traditional structured data. 
  • Empower developers, data scientists, and business analysts to access data in Hadoop with the powerful tools they need for ad-hoc, real-time data queries.

Db2 Event Store

The Db2 Event Store data management system was built for high-speed, high-volume storage and analysis of streaming data. With its capacity for high-speed data capture and analytics on streaming data, Db2 Event Store allows you to save and analyze as many as 250 billion event records daily with just three server nodes. The solution also incorporates IBM Watson Studio for AI and machine learning analysis, and it’s compatible with Spark SQL and Spark Machine Learning. Db2 Event Store’s supported interfaces and languages include ODBC, JDBC, Go, and Python.
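
To put the 250 billion figure in perspective, the short calculation below converts it to a per-second rate. It assumes the load is sustained evenly over 24 hours and spread evenly across the three nodes, which the source does not state. Real ingest is usually bursty, so treat the result as an average rather than a guaranteed peak.

```python
# Figures from the text: 250 billion events per day on three server nodes.
EVENTS_PER_DAY = 250_000_000_000
NODES = 3
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: an even, sustained rate across the whole day and all nodes.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
events_per_node_per_second = events_per_second / NODES

print(f"{events_per_second:,.0f} events/s across the cluster")
print(f"{events_per_node_per_second:,.0f} events/s per node")
```

Under those assumptions, the claim works out to roughly 2.9 million events per second overall, or a little under one million per second on each node.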

Use cases for Db2 Event Store include the ingestion and analysis of streaming data for:

  • IoT (Internet of Things)
  • Financial services
  • Industrial
  • Telecom
  • Online retail data

Integrate.io: ETL Data From IBM Db2 the Easy Way

Integrate.io is an automated, easy-to-use, enterprise-grade ETL platform that empowers anyone, regardless of their tech experience, to break through data silos and extract, transform, and load data from virtually any source to any destination. Moreover, with Integrate.io’s native Db2 source connector, you can easily develop sophisticated ETL pipelines for your entire family of Db2 products. 

Want to try Integrate.io for yourself? Contact the Integrate.io team and schedule a demo and risk-free trial today!