2017 has come to an end, and as you're gearing up for a successful 2018, you might find a cheat sheet of essential ETL tools and data storage comparisons handy. This list provides commentary to help you make the best decisions for your organization as your ETL and data warehousing needs evolve.
In October 2017, Oracle announced its vision of a “fully autonomous” database at Oracle OpenWorld. Panoply's CEO, Yaniv Leven, comments on what this means for the industry and how it validates Panoply’s vision. What does an autonomous database or data warehouse mean for a company focused on large enterprises? Get the answer and find out what it means for Oracle to sign on for the same vision that Panoply is executing on.
Panoply's CTO, Roi Avinoam, offers a comprehensive overview of Amazon's data warehouse and Google's query engine. Get an in-depth look at performance with example queries, an analysis of cost structure, usability, simplicity, and available integrations with BI tools. What data types are supported, how do updates and upserts happen, and is nearline storage available? What about data loading and the ecosystem? Get an in-depth look at everything you need to compare Redshift and BigQuery.
Panoply's Lead Data Architect, Alon Brody, walks us through the differences between two Amazon technologies, Athena and Redshift. Do you care about initialization time, partitioning, UDFs, primary key constraints, and supported data formats / types? This post provides all of that as well as hands-on examples for setup. You can see the performance of each and understand the tradeoff between performance and cost. If you’re trying to decide which tool is the best one for your job, this post is worth a look.
If you work with an AWS stack, you’ve likely heard about Redshift Spectrum, which allows Redshift users to query arbitrary files stored in S3 as if they were normal Redshift tables. This separation of storage and compute within Redshift can make a real difference to your budget’s bottom line. Find out how it works, which data formats are supported, how to query, and what to expect for performance and cost right here.
Do you need your database to run on a mobile device, or to scale because you like big data and you cannot lie? If 2018 is a year of serious growth or serious mobility, you need to check out Matan Sarig’s write-up on CouchDB and MongoDB. Sarig, a data architect at Panoply, summarizes support, documentation, active community, requirements and tradeoffs for your convenience!
MongoDB is hot, Hot, HOT! This time, Sarig compares two NoSQL databases, MongoDB and Redis, so you can understand the context behind when you might choose one or the other. Just like the other comparative posts, he talks database structure, indexing, queries, deployments and availability of replication/clustering. Understand who’s behind each database, support options, documentation, community activity, and how you can choose the right one for your business. This post details the benefits of Redis and MongoDB.
Sarig’s back and providing commentary on MySQL and its fork, MariaDB. He explores the database structures and indexes, where and how the databases are deployed and what replication and clustering are available. If you want to know who’s behind the databases, support options, ongoing development work, and documentation available, this post on MariaDB and MySQL is for you. Find out where the user communities are, what connectors are available, and how to answer the question “which database is right for your business?”
Pull back the curtain and let’s dive deep! Avinoam takes us through what’s happening under the hood of ETL (extract, transform, load) and compares it to a new bulk data cleaning process, ELT (extract, load, transform). While the outcome is ultimately the same, the change in process matters. Each approach—ETL and ELT—has its pros and cons. Find out what they are and see a hands-on example in this post.
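To make the difference in ordering concrete, here is a minimal sketch (not the hands-on example from Avinoam's post). It assumes an in-memory SQLite database standing in for the warehouse, and the sample rows, table names, and cleaning rules are all hypothetical. ETL cleans the rows in application code before loading; ELT loads the raw rows first and cleans them with SQL inside the database.

```python
import sqlite3

# Hypothetical raw extract: messy names, amounts stored as text.
RAW_ROWS = [("  Alice ", "10"), ("BOB", "25"), ("carol", "7")]


def clean(name, amount):
    """Application-side transform: trim and capitalize the name, cast the amount."""
    return name.strip().title(), int(amount)


def run_etl(rows):
    """ETL: transform in application code, then load only the cleaned rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    conn.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)", [clean(n, a) for n, a in rows]
    )
    return conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()


def run_elt(rows):
    """ELT: load the raw rows untouched, then transform with SQL in the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    # The transform runs where the data lives; raw_sales stays available
    # for re-processing if the cleaning rules change later.
    conn.execute(
        """
        CREATE TABLE sales AS
        SELECT upper(substr(trim(name), 1, 1)) || lower(substr(trim(name), 2)) AS name,
               CAST(amount AS INTEGER) AS amount
        FROM raw_sales
        """
    )
    return conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()


if __name__ == "__main__":
    print(run_etl(RAW_ROWS))
    print(run_elt(RAW_ROWS))
```

Both functions end up with the same cleaned table, which matches the point that the outcome is the same. What changes is where the transform runs and whether the raw data is kept in the warehouse.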
Guess who’s back, back again? Brody’s back, so tell a friend about his comprehensive overview of 11 (mostly open source!) ETL tools. While each of these is amazing, we firmly believe that ELT is the future. The CEO of Stitch, an ETL service, has described the need to evolve from ETL to ELT. Will 2018 be the year of ELT?
2018—new year, new you—does this mean a new career? If you want to understand the day-to-day of a data engineer and how you might become one, this blog post’s for you. Leven walks us through what a data engineer does and the key skills needed. Learn the difference between data engineering and data science and see the resources available in this guide to becoming a data engineer.
If your 2018 New Year's resolution is to learn more about ETL (or ELT!), data warehousing, or business intelligence (BI), we at Panoply aim to please. Just pop in your email address below for fresh updates once a month.