A data warehouse, combined with analytical tools, is one way to get sound business intelligence from your raw data. But what does this insight ultimately cost your business? The ability to use your data well can deliver a healthy return on investment, but how much depends on what you initially invest in the system and processes. According to Capterra (2019), a Gartner survey found that businesses that chose a solution for its functionality saw greater business benefit than those that focused only on cost.
Does a Data Warehouse Automatically Provide Business Intelligence?
First, it's important to clarify what a data warehouse actually does. It stores your data from diverse sources in an organized manner. Combined with a business intelligence solution, companies gain insight into trends and can perform sophisticated analysis. The data warehouse ensures the information is queryable by a business intelligence solution. In other words, it typically doesn't perform the whole job, but is essential to getting the job done.
A data warehouse is distinct from a database. In fact, a transformation solution, like Xplenty, takes information from separate databases and puts it into the warehouse. Your transformation solution should work seamlessly with Shopify, Salesforce, Google eCommerce, and the other data sources you use for your business. The transformation can happen manually, or automatically on a schedule or under preset conditions. Once data is in the warehouse, it is ready for manipulation and analysis.
Assessing Typical Components
There are a few key parts to a data warehouse: the storage platform, the transformation pipeline, and the people who make it all work. A number of providers compete for your business at each stage.
Storage: In-house or Cloud-based
You have to choose where to actually house your data warehouse. As with most enterprise technology, you can opt for on-site hardware or a cloud-based solution. There is a growing shift toward cloud-based solutions, since they don't require the space, upfront investment, and ongoing maintenance that their on-site counterparts do. In addition, they are accessible from any geographic location, which makes them well suited to remote work arrangements.
When you choose to go cloud-based, you are typically already saving money. That's largely because cloud solutions do not require hardware, on-site IT staffers, space for the machines, or operational costs like electricity. Cloud solutions can cost $18 to $84 per TB per month, while on-site solutions can cost up to $12,000 per year ($1,000 a month) by some estimates.
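To see how those storage figures compare at different data volumes, here is a rough sketch. It uses only the estimates quoted above and assumes, for simplicity, that the on-site figure is a flat monthly cost that does not change with volume. The function names are made up for this illustration.

```python
# Figures quoted in the text. Treating the on-site figure as a flat
# monthly cost is a simplifying assumption.
CLOUD_LOW_PER_TB = 18      # USD per TB per month (low estimate)
CLOUD_HIGH_PER_TB = 84     # USD per TB per month (high estimate)
ON_SITE_PER_MONTH = 1_000  # USD per month


def cloud_monthly_cost(terabytes: float, rate_per_tb: float) -> float:
    """Monthly cloud storage cost for a given data volume."""
    return terabytes * rate_per_tb


def break_even_terabytes(rate_per_tb: float) -> float:
    """Data volume at which cloud storage costs as much as the on-site figure."""
    return ON_SITE_PER_MONTH / rate_per_tb


if __name__ == "__main__":
    for tb in (1, 5, 10, 50):
        low = cloud_monthly_cost(tb, CLOUD_LOW_PER_TB)
        high = cloud_monthly_cost(tb, CLOUD_HIGH_PER_TB)
        print(f"{tb:>3} TB: cloud ${low:,.0f} to ${high:,.0f}, on-site ${ON_SITE_PER_MONTH:,}")
    print(f"Break-even at the high cloud rate: {break_even_terabytes(CLOUD_HIGH_PER_TB):.1f} TB")
    print(f"Break-even at the low cloud rate: {break_even_terabytes(CLOUD_LOW_PER_TB):.1f} TB")
```

Under these assumptions, cloud storage stays cheaper than the on-site estimate until you store roughly 12 TB at the high rate, or roughly 56 TB at the low rate.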
Nonetheless, there are some reasons to choose an on-premises solution. Sometimes data moves more quickly from an in-house solution to a client's single location than it does from the cloud, since cloud data is held on global servers in multiple locations. An on-site solution also gives you complete control over how the data warehouse connects with other systems.
Some of the most noted cloud warehouse providers include:
- AWS (Amazon) Redshift
- Google BigQuery
- Microsoft Azure
- IBM Db2
If you frequently access the data in your warehouse, you'll need a "hot" storage solution, so it is important to choose one that offers high performance and speed. If you access the data less frequently, a "cold" storage solution, where you sacrifice a little speed and performance, may suffice.
Visualization/Business Intelligence Software
It is essential to think about moving the data from your disparate sources into your warehouse. From there, you can think about visualization -- this is the stage where you gain business intelligence from your data.
Often the terms "visualization tool" and "business intelligence solution" are used interchangeably. According to Capterra, a business intelligence solution costs on average $3,000 per year, with $600 on the low end and $6,000 on the high end. The goals of your business intelligence should determine which solution you purchase: you can focus on regulatory compliance, revenue optimization, or cost reduction, among others.
You can opt for an open-source solution like D3.js or a commercial provider such as Tableau or QlikView. The commercial options come at an additional cost, of course, which you have to assess when developing your business intelligence processes.
Moving the data typically happens through an ETL (extract, transform, load) solution. The cost of your solution will depend on which platform you choose and the pricing model. Each one typically supports a different suite of databases. So before choosing one option, you want to be certain it will sync with the data you want to store in your warehouse. Xplenty, for example, has a full suite of integrations including Snowflake, MySQL, Oracle, Amazon Redshift, Google BigQuery, and Microsoft Azure, among many others.
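If the three ETL stages feel abstract, the following minimal sketch shows the idea. It does not reflect how Xplenty or any other product works internally: the sample orders, the function names, and the use of an in-memory SQLite database as a stand-in warehouse are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical source rows, standing in for an export from a store or CRM.
SOURCE_ORDERS = [
    {"order_id": "1001", "customer": " alice ", "total": "19.99"},
    {"order_id": "1002", "customer": "bob", "total": "5.00"},
    {"order_id": "1003", "customer": "", "total": "12.50"},  # missing customer
]


def extract():
    """Extract: read raw records from the source system."""
    return list(SOURCE_ORDERS)


def transform(rows):
    """Transform: fix types, tidy names, and drop rows without a customer."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip().title()
        if not customer:
            continue
        cleaned.append((int(row["order_id"]), customer, float(row["total"])))
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into the warehouse table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


def run_pipeline(connection):
    """Run the three stages in order."""
    load(transform(extract()), connection)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    run_pipeline(warehouse)
    for row in warehouse.execute("SELECT customer, total FROM orders ORDER BY order_id"):
        print(row)
```

A commercial ETL platform handles the same three steps for you, along with the connectors, scheduling, and error handling that this sketch leaves out.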
When it comes to pricing, the most important thing to note is the cost pattern used by the company you select.
Many ETL solutions use the variable pricing model (diagram B). These packages start off free or at a low introductory price, but the cost climbs steeply with the number of jobs you run. This means that, while your initial monthly bill may start low, your costs are very likely to rise over time. If you have particularly heavy usage one month, you might be stuck paying massive (and unexpected) overages. That kind of unpredictability can make it difficult to stick to a consistent budget.
To make sure your costs stay consistent month to month, you will want to opt for an ETL solution that employs a fixed or stepped pricing model (diagrams A or D). The fixed-rate model starts at one price and stays there regardless of workload, while the stepped model increases at set amounts based on predictable factors. Xplenty, for example, charges a single monthly rate per connector. Even when adding more connectors, you’ll know exactly how much those connections will cost, and you can budget accordingly. You can send an unlimited amount of data through an integration without increasing your monthly bill, while still having the ability to scale up when you need to.
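To make the difference between these pricing models concrete, here is a small sketch. Every rate in it is hypothetical and chosen only to show the shape of each model; none of them is a real vendor price.

```python
# All rates are hypothetical, chosen only to illustrate each pricing model.
VARIABLE_BASE_FEE = 100            # USD per month
VARIABLE_RATE_PER_JOB = 2.50       # USD per job run
FIXED_MONTHLY_FEE = 1_500          # USD per month, regardless of workload
STEPPED_RATE_PER_CONNECTOR = 400   # USD per connector per month


def variable_cost(jobs_run: int) -> float:
    """Variable model: the bill grows with the number of jobs run."""
    return VARIABLE_BASE_FEE + VARIABLE_RATE_PER_JOB * jobs_run


def fixed_cost(jobs_run: int) -> float:
    """Fixed model: the bill ignores workload entirely."""
    return FIXED_MONTHLY_FEE


def stepped_cost(connectors: int) -> float:
    """Stepped model: the bill rises by a set amount per connector."""
    return STEPPED_RATE_PER_CONNECTOR * connectors


if __name__ == "__main__":
    monthly_jobs = [200, 250, 1_200, 300]  # the third month has a usage spike
    connectors = 3
    for month, jobs in enumerate(monthly_jobs, start=1):
        print(
            f"Month {month}: variable ${variable_cost(jobs):,.2f}, "
            f"fixed ${fixed_cost(jobs):,.2f}, "
            f"stepped ${stepped_cost(connectors):,.2f}"
        )
```

With these made-up numbers, the variable bill jumps from $600 to $3,100 in the spike month, while the fixed and stepped bills stay where they were. The stepped bill changes only when you add or remove a connector.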
People also cost money, and they are definitely a valuable part of your data warehouse process. Some of the roles you may have to fill include:
- Information systems manager ($12,000/month)
- Backend developer ($8,800/month)
- Database architect ($9,400/month)
- Data analyst ($7,500/month)
Of course, the amount you spend depends on the workload you place on each member of the team. That workload depends on the tasks you ask them to do, which in turn depends on the usability of your component solutions. Xplenty, for example, is a simple, drag-and-drop ETL solution with managed services options. Working with Xplenty could potentially reduce your backend developer costs, which would otherwise run about $8,800 per month on average.
So, what might a data warehouse solution cost overall? Here are some averages, taking into account all of the above components, to give you an idea:
- Cloud storage solution: $18 to $84 per terabyte per month
- On-site storage solution: $1,000 per month
- Visualization software: $600 to $6,000 per year
- ETL software: $800 to $8,000+ per month (either fixed or variable)
- Personnel: $37,700 per month
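As a rough worked example, the components can be added up on a monthly basis. The sketch below uses the salary and software figures quoted in this article. The 10 TB data volume in the low-end scenario is an assumption, and because the ETL range is open-ended ("$8,000+"), the high-end result is not a ceiling.

```python
# Monthly salaries quoted in the text (USD).
PERSONNEL = {
    "Information systems manager": 12_000,
    "Backend developer": 8_800,
    "Database architect": 9_400,
    "Data analyst": 7_500,
}


def monthly_total(storage: float, visualization_per_year: float,
                  etl: float, personnel: float) -> float:
    """Sum the components per month; visualization is quoted per year."""
    return storage + visualization_per_year / 12 + etl + personnel


personnel_total = sum(PERSONNEL.values())

# Low-end scenario: 10 TB (an assumed volume) at the low cloud rate of $18 per TB.
low = monthly_total(storage=10 * 18, visualization_per_year=600,
                    etl=800, personnel=personnel_total)

# High-end scenario: on-site storage with the top quoted software prices.
high = monthly_total(storage=1_000, visualization_per_year=6_000,
                     etl=8_000, personnel=personnel_total)

print(f"Personnel: ${personnel_total:,} per month")
print(f"Low-end total: ${low:,.0f} per month")
print(f"High-end total: ${high:,.0f} per month")
```

In both scenarios personnel dominates the bill, which is why tools that reduce the workload on your team can matter more than the price of the software itself.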
In short, a data warehouse can be a significant investment. However, the returns you reap on your business intelligence are invaluable. The warehouse allows you to drive your future business decisions with precision, lowering your overall business risk.
Benefits of Working With Xplenty for ETL
When it comes to the ETL options, you can find significant gains by working with Xplenty. The platform's intuitive functionality means you don't need to pay for high-tech training. Furthermore, Xplenty’s stepped-rate pricing structure is transparent, affordable, and predictable. You pay per connector, and therefore have the ability to accurately anticipate monthly costs.
But the pricing is just one benefit. Xplenty is a good platform to work with. You can create fast connections between your databases and your data warehouse, with real-time availability. The platform gives you the option for repeatable, transparent data pipelines that you can run in-house with ease. This is accomplished without sacrificing data integrity or data quality. The data is reliable, easy to access, and transformable into insights you can use to run your business.