At Panoply, we’ve heard lots of well-intentioned advice about ETL. It usually comes from well-meaning colleagues or professional connections who are in a position to give advice on the subject, but their solutions were designed for their own situations, not yours. Here are some common pieces of ETL advice we’ve flagged as “repeat offenders,” and some reasons they’ve made this list.
(1) One size fits all
Nope, it doesn’t. ETL tools are typically licensed by data set size and memory utilization, so a company’s scale of usage (both now and in the future!) matters. Getting a good match on ETL means not only understanding your immediate needs, but also planning for your future ones. Selecting tech vendors and partners that plan to grow with your business is critical; otherwise, you risk costly upgrades or equipment changes. Which brings us to our second point…
(2) Design an on-prem solution for easier management
This one can hurt you. Bad. Picking the wrong design architecture can not only hamper your data analysis efforts, it can also tank your hardware budget and your IT budget if you run into problems or (this never happens, right?) make an error in your planning. Build vs. buy is not a chance to flex your mighty development muscles; it’s an opportunity to make a sound management decision. Don’t get suckered into DIY when buying is the better fit. Cloud offers scalability and flexibility you can’t get with on-prem hardware, so your infrastructure reflects your business reality. And speaking of architecture...
(3) Your IT department can totally handle building an ETL project
While that may be true NOW, don’t forget about ongoing maintenance costs, changing data formats, future connection changes (like fixes to broken connections and new data sources to add) and the change requests you’ll inevitably get as your business changes and your data sets grow. In terms of scale, the IT effort a self-built ETL demands will grow quickly in parallel with your data volumes, ratcheting up your headcount and your HR budget. As for HR...
(4) ETL is an IT process for handling data
Sure it is. But ultimately, data is for people. They query it, consume it, interpret it, report on it. If you leave the human needs of your business out of your ETL design, you’ll pay for it in lower utilization and higher support costs. Remember, things have to be easy to spur adoption, and data utilization is no different. Most end users aren’t coders, so design your solution so that every user feels comfortable working with it. Speaking of coding...
(5) If you can code, manually building your ETL is the best way to go
Hate to be the bearer of bad news, but the machines are smarter than us now. Writing scripts is definitely one way to get the answers you need, but it’s time-consuming, and there are better, more efficient ways to do it. Tools like Panoply are plug-and-play, with easy-to-use drag-and-drop interfaces that can connect to your data, build your data warehouse, run your queries and return your data in the time it would take a data engineer to tee up data sources in SSIS. And with ETL as a whole...
(6) You need ETL
But what if we told you...you didn’t? That’s not to say you don’t need the extraction, transformation and loading processes, but what if the E-to-T-to-L sequence isn’t the right one for your business and its future growth? Data analysts and IT leaders are increasingly looking to ELT (extraction, loading, transformation) processes to handle their big data needs. Because ELT loads raw data first and transforms it inside the warehouse, it can reduce processing time and make queries more responsive, which makes it a good fit for high-volume warehouses.
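If the difference in ordering feels abstract, here is a minimal sketch of the two sequences side by side. It is an illustration only, not how Panoply or any particular product works: it uses Python’s built-in sqlite3 module as a stand-in for a warehouse, and the sample records, table names and the run_etl / run_elt helpers are all made up for this example.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system:
# everything is text and the customer names are messy.
RAW_ORDERS = [
    ("1001", " Alice ", "19.50"),
    ("1002", "BOB", "5.00"),
    ("1003", "carol", "12.25"),
]


def clean(row):
    """Transform one raw record in application code (the ETL 'T' step)."""
    order_id, customer, amount = row
    return int(order_id), customer.strip().lower(), float(amount)


def run_etl(conn):
    """ETL: transform outside the warehouse, then load only the cleaned rows."""
    conn.execute("CREATE TABLE orders_etl (id INTEGER, customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders_etl VALUES (?, ?, ?)",
        [clean(row) for row in RAW_ORDERS],
    )


def run_elt(conn):
    """ELT: load the raw rows untouched, then transform with SQL in the warehouse."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    # The transformation is a view over the raw table, so the raw data stays
    # available if the cleaning rules change later.
    conn.execute(
        """
        CREATE VIEW orders_elt AS
        SELECT CAST(id AS INTEGER)   AS id,
               LOWER(TRIM(customer)) AS customer,
               CAST(amount AS REAL)  AS amount
        FROM orders_raw
        """
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    for table in ("orders_etl", "orders_elt"):
        rows = conn.execute(f"SELECT * FROM {table} ORDER BY id").fetchall()
        print(table, rows)
```

Both paths end with the same cleaned rows. The difference is where the transformation runs and whether the raw data is kept around, which is the trade-off to weigh against your own data volumes.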
Our one piece of advice about building ETL for your company? Consider the advice you hear, but make sure to weigh all the research and input against your unique business, your current needs and your future growth.