Panoply Blog - Data Management, Data Infrastructure, and Data Analysis.

Data Warehouse Automation: A Question of When, Not If

Written by Yaniv Leven | Mar 14, 2016 9:42:00 AM

The premise that to build a perfect data warehouse you must have perfect business understanding is simply irrelevant. We must look to technology to empower us to keep pace with our rapidly changing business logic.

A few weeks back I read an interesting article questioning the validity of data warehouse automation technologies within analytical environments. That said, I wouldn’t be writing this blog post if I agreed with everything that was said. So let’s take it apart point by point.

Argument #1

One reason is that you really shouldn’t automate concepts before you fully understand them. As the saying goes, don’t put your problems on roller skates.

The issue here is not the truth of this argument but its relevance. Without a doubt this has been true for the last 10-15 years, mainly because the knowledge and expertise needed to harness automation technologies in complex analytical infrastructure environments was (and in many cases still is) so gargantuan that it was rare to find engineers who fully understood them. When you do not fully understand these processes, the end point is usually a broken process, automated to the point of having to be rebuilt manually. The human factor has always been the cornerstone of successful technology and process implementation in complex analytical infrastructure environments. Unfortunately, it is also the chain’s weakest link, both in terms of time to value and scalability.

In 2020 the rate of data production will be 44 times what it was in 2009. The explosion of new and unstructured data sources is beginning to show its exponential curve, a trend backed by Gartner’s 2013 comment that “Business Intelligence and Analytics need to scale up to support explosive growth in data sources”. The shift from collecting data to connecting data is in force. We are fast approaching the point where our ability to see, much less understand, all the business logic at scale is coming into question.

Given this leap in data consumption, and comparing the world of data today with where it was five years ago and where it is headed five years from now, I see no way the current data management processes will be able to cope with these challenges. The human factor, in my opinion, will have to be reduced to control and administration rather than process building and logic implementation. That’s why McKinsey, in a 2013 report about automation of knowledge work, stated that by 2025 “knowledge work automation tools and systems could take on tasks that would be equal to the output of 110 million to 140 million full-time equivalents” and specifically in IT, science and engineering “there is a potential for productivity gains of about 45 to 55 percent”.

With advances in computing technology, machine learning and natural user interfaces; the rise in complexity in information technology, creating demand for workers with new skills who can perform new kinds of tasks; and the growing production and fragmentation of data, the question of analytical infrastructure automation is not a question of “if” but a question of “when”. It’s not just a cool thing to do; it’s a growing cross-industry need.

Argument #2

…. are viewed as an art and still experience cultural resistance to the concept that a process can repeat the artistry of a data model or a dashboard design.

There is no polite way for me to state my true thoughts regarding this sentence.

Any business that allows cultural resistance to get in the way of its business goals will die. Yes, change is hard and painful, but you can bet that your competitors are looking for any way big data can give them an edge. So either get on board or clear the field.

As for “the artistry of a dashboard design”, I cannot express how frustrating I find this phrase to be. Dashboards and data visualizations are an important part of BI, but they are cosmetics. Choosing on that basis is equivalent to buying a car for its color rather than its capabilities and features. At the end of the day you have a complex engine underneath the paint, and while they all drive, some are superior to others. The insights that big data provides enable us to identify and act on opportunities increasingly faster, directly affecting our time from data to value. Sometimes that might mean focusing campaigns on a specific region, but it could also lead to significant changes to our business that break our data model. Flexibility is just as vital as scale.

Argument #3

The myth that data warehouse automation or any devops practice can replace the true thought processes that go into the design of an analytical environment.

Welcome to the wonderful world of AI, machine learning, deep learning etc. Let us put this issue to rest: we have the technology to agilely design analytical environments. Some of these technologies are more robust than others, but they are here and they’re growing. Humans, unlike machines, are finite in quantity and in our capacity to compute. Every thought process automated, even if at first it’s done badly, allows us to ask a question that we never could have asked before because we were too busy thinking about repetitive, tedious tasks and processes. It’s not what door is closing; it’s what windows are opening.

We believe that the next leap, the answer to all of the challenges stated above, is smart logic learning AI, and we’ve taken on the battle to build these technologies. We are challenging today’s accepted methodologies around analytical data processes with 3 problems and 3 sets of machine learning AI based solutions. The problems are ingestion, maintenance and handling, and extraction; together they make up the data management problem. We enabled these solutions to co-exist under the same radical software and wrapped them up with a GUI to enable frictionless engagement. The result is a full solution to data warehouse automation with the ease of pushing a button.

The world of big data is changing right before our eyes. Smaller and smaller entities are processing more and more data, while data is becoming larger and more fragmented. We believe that the only way to recharge the exponential growth of technology is through smart data-driven decision making, and that the only way to do this is to shorten the time to value, from data to information, to as close to zero as possible. We are one pioneering company among many that are solving today’s big data challenges to make a smarter tomorrow.