Panoply Blog: Data Management, Warehousing & Data Analysis

9 Things Your Boss Wants to Know About Data Warehousing

Written by An Bui | Feb 7, 2018 11:43:11 PM

As an analyst who deals with data from many data sources daily, you might find yourself the go-to source for information about business intelligence in your office. While it’s cute when your grandma asks what it is you do, exactly...it’s slightly more concerning when it’s your boss.

Changes in the technology for handling large quantities of data and integrating diverse data sources have made for a constantly shifting landscape. This changing environment means analysts and data engineers can provide valuable insights to corporate leadership about how data analytics informs business intelligence, and about possible strategies for using these applications in the real world. So what are the things your boss really wants to know about data warehousing?

1. Why can't we just put our data into a BI dashboard for insights?

Pulling data from many different data sources requires a data stack. In general terms, a data stack is a collection point for different types of data from varying sources. In a business, these sources could include legacy systems, CRMs, operational systems, sales information, product information, competitive research and many other types of insight-generating data. All of this data needs to be extracted, transformed and loaded into a data warehouse before a BI dashboard can be generated.

2. And what can I do with a data stack?

A well-designed stack incorporates different types of processing programs that extract information from the data in the stack and combine it to provide insight through things like visualizations, graphs and charts. The ETL, data warehouse and BI tools become the framework for the single source of truth that your organization can use for operations and strategy.

3. What's wrong with our current situation and why didn’t we have a data stack before?

Because businesses use so many systems to create, store and report on data, special reporting processes and computer languages were needed to make all the systems work together. This usually required a programmer who was fluent in multiple computer languages, plus analysts to interpret the data, and a designer to turn it into something audiences could understand. The data stack existed; it was just complex and required a lot of manual management.

Today, technological advances have enabled vendors to provide cost-effective expertise to keep up with the volume of data, or events, being generated. Ideally, look for providers whose tools, platform or software-as-a-service offer interoperability between data sources, meaning data from multiple sources can be standardized, stored together, and queried as a group.
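To make "standardized, stored together, and queried as a group" a little more concrete, here is a minimal Python sketch. The two sources, their field names, and the `standardize` helper are all hypothetical and chosen for illustration only; a real tool would handle far more cases.

```python
# Hypothetical field mappings: each source names the same facts differently.
CRM_FIELDS = {"FullName": "customer_name", "Email": "email"}
SHOP_FIELDS = {"buyer": "customer_name", "buyer_email": "email"}


def standardize(record, field_map, source):
    """Rename source-specific fields to the shared schema and tag the origin."""
    row = {target: record[original] for original, target in field_map.items()}
    row["email"] = row["email"].strip().lower()  # normalize so rows can be matched
    row["source"] = source
    return row


crm_rows = [{"FullName": "Ada Lovelace", "Email": "Ada@Example.com "}]
shop_rows = [
    {"buyer": "Ada Lovelace", "buyer_email": "ada@example.com"},
    {"buyer": "Alan Turing", "buyer_email": "alan@example.com"},
]

# Stored together: one list holding rows from both systems in one shape.
combined = [standardize(r, CRM_FIELDS, "crm") for r in crm_rows]
combined += [standardize(r, SHOP_FIELDS, "shop") for r in shop_rows]


def sources_for(email):
    """Queried as a group: which systems know about this customer?"""
    return sorted({row["source"] for row in combined if row["email"] == email})
```

Once every row shares the same field names, a single question such as `sources_for("ada@example.com")` can be answered across both systems at once.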

4. What’s a data warehouse, and what part does it play in the stack? 

The data warehouse is a central component of the data stack. Data warehousing, just like inventory warehousing with physical products, is a place to store data. It not only centralizes all your business data, creating a single source of truth for your organization, but enables your BI tools to run faster because it's optimized for analytical querying.

5. How’s a data warehouse different from a database?

A database typically holds related information from one application or data source. A data warehouse stores information from many different applications or data sources in a single place. Often, databases make up many of the data sources that are ingested by the data warehouse.
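The difference is easier to see in a small sketch. The example below uses Python's built-in `sqlite3` module as a stand-in: two separate in-memory databases play the role of source systems, and a third set of tables plays the role of the warehouse. The database names (`crm`, `billing`) and table names are assumptions made up for this illustration, and SQLite is not a real data warehouse engine.

```python
import sqlite3

# The main connection stands in for the warehouse; each source database is
# attached under its own (hypothetical) name.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

# Source 1: a CRM database that only knows about customers.
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# Source 2: a billing database that only knows about invoices.
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 100.0), (1, 250.0), (2, 75.0)],
)

# Ingest both sources into warehouse tables that live side by side.
conn.execute("CREATE TABLE main.dim_customer AS SELECT id, name FROM crm.customers")
conn.execute(
    "CREATE TABLE main.fact_invoice AS SELECT customer_id, amount FROM billing.invoices"
)

# An analytical question neither source database could answer on its own.
revenue_by_customer = conn.execute(
    """
    SELECT c.name, SUM(f.amount)
    FROM main.dim_customer AS c
    JOIN main.fact_invoice AS f ON f.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

Each source database answers questions about its own application; the combined tables answer questions that span both.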

6. What’s ETL, and why do I need it?

ETL is an acronym for extract, transform, and load. It is a process that pulls (extracts) the data stored in the various data sources, transforms it into the proper format or structure for query and analysis, and loads the data that your business uses into the data warehouse. Traditionally, ETL commands were scripts written in the computer language of the data source, but new drag-and-drop user interfaces have eliminated the need for a programmer to write scripts in order to move data.
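For readers who want to see what the three steps look like as a traditional script, here is a minimal Python sketch using only the standard library. The CSV layout, the `orders` table, and the function names are assumptions for illustration; an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical export from a source system: US-style dates, stray whitespace.
RAW_CSV = """order_id,order_date,amount
1001,02/07/2018, 19.99
1002,02/08/2018,5.00
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert strings to typed values and dates to ISO format."""
    cleaned = []
    for row in rows:
        iso_date = datetime.strptime(row["order_date"], "%m/%d/%Y").date().isoformat()
        cleaned.append((int(row["order_id"]), iso_date, float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, order_date, amount FROM orders ORDER BY order_id"
).fetchall()
```

Keeping the three steps as separate functions makes it easy to swap in a different source or destination without touching the cleaning logic.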

Panoply’s ETL process also provides valuable features like automated extraction that updates the data in your business intelligence and visualization tools by enabling you to schedule regular (hands-free) refreshes.

7. Do I have to worry about all this data from my business systems being stored together?

Another way the data stack has evolved is that many of the tools within the stack are now cloud-based. This eliminates hardware management and maintenance, and adds a layer of security. For example, many data warehouses, such as Panoply, come with built-in support for security and compliance standards such as HIPAA, GDPR, and SOC 2.

Data is also protected from malicious attacks and human error by access controls and dynamic permissioning. These processes give individuals access to blocks of data based on things like the team they work with, the projects they’re assigned to, or the department they belong to. This access can be changed on the fly, making it responsive to business needs while mindful of data security.
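As a rough illustration of group-based access that can change on the fly, here is a small Python sketch. The `AccessPolicy` class, the group names, and the dataset names are hypothetical; this is not how any particular warehouse implements permissions, only the general idea.

```python
class AccessPolicy:
    """Maps groups (teams, projects, departments) to the datasets they may read."""

    def __init__(self):
        self._grants = {}  # group name -> set of dataset names

    def grant(self, group, dataset):
        self._grants.setdefault(group, set()).add(dataset)

    def revoke(self, group, dataset):
        # discard() is a no-op if the grant does not exist.
        self._grants.get(group, set()).discard(dataset)

    def can_read(self, user_groups, dataset):
        """A user may read a dataset if any of their groups has been granted it."""
        return any(dataset in self._grants.get(g, set()) for g in user_groups)


policy = AccessPolicy()
policy.grant("finance", "invoices")
policy.grant("marketing", "campaigns")

analyst_groups = {"marketing"}
before = policy.can_read(analyst_groups, "invoices")  # marketing has no grant yet

# A new project needs invoice data, so access changes without touching the user.
policy.grant("marketing", "invoices")
during = policy.can_read(analyst_groups, "invoices")

# When the project ends, the grant is withdrawn just as easily.
policy.revoke("marketing", "invoices")
after = policy.can_read(analyst_groups, "invoices")
```

Because permissions attach to groups rather than individuals, a single grant or revoke updates access for everyone on that team.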

8. Is cloud data warehousing better than on-premise solutions?

While on-premise data storage has the advantage of letting you see the machines that hold your data, the overwhelming advantages of cloud-based data warehousing make it a clear winner for most businesses. Cloud-based warehousing offers flexibility, scalability, and the security of large-scale data hosting, without high equipment purchase or maintenance costs.

Panoply supports the IT needs of our data warehouses, eliminating the need for self-support by the businesses that utilize our services. But that doesn’t mean Panoply users are without support. Data architects from the Panoply team can work directly with a business’s IT staff to create customized data analysis frameworks and strategies to help businesses maximize the value of their data.

9. What is the budgetary impact of data warehousing?

Especially for cloud-based warehousing, the minimal support and maintenance needs of a data stack can mean real cost savings in the IT department. Panoply’s ability to aggregate different data sets and assimilate their contents means fewer IT resources are needed for tasks like programming, script writing, and data manipulation.

Because Panoply learns data queries (the searches businesses make using their data), results come back faster over time. This allows data analysts and business intelligence teams to deliver information more quickly to their colleagues, and allows businesses to be more agile and responsive to dynamic business environments.
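How Panoply does this internally isn't covered here. As a generic illustration of why repeated queries can get faster, the Python sketch below caches query results with the standard library's `functools.lru_cache`, so an identical query is answered from memory the second time. The `sales` table and `run_query` function are made up for this example, and result caching is only one of several possible techniques.

```python
import sqlite3
from functools import lru_cache

# An in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5)],
)


@lru_cache(maxsize=128)
def run_query(sql):
    """Run a read-only query; identical SQL text is served from the cache."""
    return tuple(conn.execute(sql).fetchall())


REPORT_SQL = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"

first = run_query(REPORT_SQL)   # executes against the database
second = run_query(REPORT_SQL)  # returned from the cache, no database work
stats = run_query.cache_info()
```

A cache like this has to be cleared whenever the underlying data changes (here, with `run_query.cache_clear()`), otherwise reports would show stale numbers.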

Between the equipment savings, the savings in IT resources and the increase in responsiveness on business decisions, a data warehouse is a necessity for data analysts...and their bosses.