Enhance Redshift With Panoply's Smart Data Warehouse

Panoply is now available on AWS Marketplace

Panoply is an “Advanced Tier” AWS Partner and can now be easily purchased directly on the AWS Marketplace with simple, upfront pricing.

Get your data warehouse up and running in minutes with no maintenance required

AWS Redshift is one of the most widely used data warehouses in the world, relied on by governments, nonprofits and companies of all sizes. As popular as it is, however, those working with Redshift are familiar with the common challenges of maintaining and managing it. It requires a tech-savvy data analyst, lots of engineering, and usually IT resources to help manage and optimize database storage - and it takes hours of coding to transform all the disparate data as well.

What’s more, Redshift is only part of the data stack that businesses need today; it lacks the key ETL (extract, transform, load) component to collect, parse, and sort data to render it available for querying. Now, for the first time on the market, there is an all-in-one solution based on AWS Redshift, which simplifies and enhances the data stack by including a built-in automated ETL for any and all types of data sources, and connects seamlessly to any BI visualization tool.

Panoply = Automated ETL + Serverless Amazon Redshift 

Panoply is a full-service, cloud data warehouse solution that automatically manages data ingestion, storage and query optimization in Redshift. Panoply enables your data to quickly and seamlessly flow to your reports through any BI visualization tool, in effect making dashboard load times 10x faster.

In other words, Panoply is Redshift on steroids. It’s built for anyone working with data analytics to collect and combine all disparate data sources in one place and then seamlessly connect them to any BI tool to gain actionable insights in minutes. Panoply is unique because it’s the only data warehouse that uses an ELT (extract, load, transform) solution, so the most labor-intensive and time-consuming IT and data engineering tasks in Redshift are simply automated. You can now get a Redshift-based data stack up and running in just a few clicks - saving time, resources and cost for an organization or company of any size, in any industry vertical. No manual configuration, no coding to transform data, and no data storage management is required. This is a complete self-serve data stack that optimizes the performance of your entire data pipeline, from raw data to actionable insights.

Panoply at-a-glance 

  • Panoply optimizes your data warehouse based on your specific usage. It automatically adjusts constraints, applies compression, materializes views and queries, and more to give you the best architecture possible.
  • Panoply performs common transformations. It identifies structured and semi-structured data formats like CSV, TSV, JSON, XML and log formats, and immediately flattens nested structures like lists and objects. Structured data can be split into different tables with a one-to-many relationship.
  • To help overcome concurrency issues, multi-cluster replication allows the compartmentalization of storage and compute. The number of available clusters scales with the number of users and the intensity of the workload, supporting hundreds of parallel queries that are load balanced between clusters.
  • Panoply exposes a standard JDBC/ODBC endpoint with ANSI-SQL support enabling seamless connection to any BI tool.
  • Panoply supports standard SQL, Python and R.
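
To make the flattening and one-to-many split mentioned above more concrete, here is a minimal Python sketch. It is an illustration only, not Panoply's implementation: the function names (flatten, split_one_to_many), the "_" key separator and the sample order record are all assumptions made for this example.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into one level, joining keys with '_'."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def split_one_to_many(
    record: dict[str, Any], parent_key: str, list_field: str
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Move a list of objects into child rows that point back at the parent row."""
    parent = {k: v for k, v in record.items() if k != list_field}
    children = [
        {f"parent_{parent_key}": record[parent_key], **flatten(item)}
        for item in record.get(list_field, [])
    ]
    return flatten(parent), children


# Hypothetical sample record: one order with a nested customer and a list of items.
order = {
    "id": 7,
    "customer": {"name": "Ada", "address": {"city": "London"}},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B4", "qty": 1}],
}
orders_row, items_rows = split_one_to_many(order, parent_key="id", list_field="items")
```

Here orders_row is a single flat row for an "orders" table, and items_rows holds one row per item, each carrying a parent_id that links it back to the order.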

Panoply makes Redshift lightning fast and boosts storage optimization 

Data Ingestion 

 

Redshift: Write code that reads the data, transforms and formats it, uploads it to S3 in the correct format, and finally runs a COPY command to load it into Redshift.
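
As a rough idea of what that hand-written pipeline involves, the Python sketch below cleans a few rows, formats them as CSV, and builds the COPY statement. The table name, bucket, IAM role ARN and the cleanup rule are hypothetical placeholders, and the S3 upload and the database connection are deliberately left out, since they depend on your AWS setup.

```python
import csv
import io


def rows_to_csv(rows: list[dict], columns: list[str]) -> str:
    """Format rows as CSV text with a header line, ready to upload to S3."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing values become empty fields so every line has the same columns.
        writer.writerow({column: row.get(column, "") for column in columns})
    return buffer.getvalue()


def build_copy_statement(table: str, s3_uri: str, iam_role: str) -> str:
    """Build a Redshift COPY statement for a CSV file with one header line."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# Hypothetical raw data and a hand-written cleanup step (the "T" in ETL).
raw = [{"id": 1, "email": " A@Example.com "}, {"id": 2}]
cleaned = [{**row, "email": row.get("email", "").strip().lower()} for row in raw]

csv_text = rows_to_csv(cleaned, ["id", "email"])
copy_sql = build_copy_statement(
    table="events",  # assumed table name
    s3_uri="s3://example-bucket/events/part-0001.csv",  # assumed location
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftRole",  # placeholder
)
```

Every new source or format change means revisiting code like this, which is the engineering overhead described above.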

Panoply: We provide an ETL-less solution (ELT) for data ingestion without the need for engineering or coding. More than 100 native data integrations, plus an SDK and 3rd party data integration solutions that add another 50+ data sources, let you push data from any current or future data source into Redshift. It's all done via Panoply’s easy-to-use dashboard interface so that anyone working with analytics can do it on their own. What’s more, analysts can immediately query the data within the data warehouse using standard SQL or seamlessly connect Panoply to any BI visualization tool.

Data/Schema Modeling

Redshift: Prepare the schema - configuring the right columns, data types and relationships - ahead of loading any data. Changes are rigid and cumbersome (impossible to change data types, for example). 

Panoply: The schema adapts in real time along with the data. You don't need any prior knowledge and changes are seamless. Just load data in; everything else, such as data modeling and reindexing, is automatic and adaptive.

Performance Tuning

Redshift: Plan ahead how you're likely to use the data (what would be the most common queries), and based on that choose the right sort keys and distkey. When data or query structure changes, it's impossible to reconfigure these parameters without starting from scratch. It doesn’t have materialized views with which to enhance performance.
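
For readers who have not set these keys by hand, the Python sketch below generates the kind of CREATE TABLE statement involved. The table, columns and key choices are hypothetical; the helper simply checks that the chosen keys exist and emits the DISTKEY and SORTKEY clauses.

```python
def create_table_ddl(
    table: str, columns: dict[str, str], dist_key: str, sort_keys: list[str]
) -> str:
    """Build a CREATE TABLE statement with a distribution key and sort keys."""
    unknown = {dist_key, *sort_keys} - set(columns)
    if unknown:
        raise ValueError(f"keys not in column list: {sorted(unknown)}")
    column_sql = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in columns.items()
    )
    return (
        f"CREATE TABLE {table} (\n{column_sql}\n)\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical table: distribute by user, sort by time for range-filtered queries.
ddl = create_table_ddl(
    table="page_views",
    columns={"user_id": "BIGINT", "event_time": "TIMESTAMP", "url": "VARCHAR(2048)"},
    dist_key="user_id",
    sort_keys=["event_time", "user_id"],
)
```

The choice only pays off if the most common queries really do join on the distribution key and filter on the sort keys, which is why it has to be planned up front.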

Panoply: Learns how you query the data and adjusts the performance configuration as you go. This includes changes to sort and dist keys, as well as materializing views and queries.

Scaling

Redshift: Start a resize operation via AWS UI or API. It involves downtime for writes until the operation completes, which can take days, during which new data cannot be ingested.

Panoply: Automatically scales up and down based on the data volume. Scaling happens as an offline operation, keeping your cluster available for both reads and writes, so ingestion can continue. When the scaling is complete, the old and new clusters are swapped instantly.

No-Maintenance Data Warehouse

Redshift: Run manual VACUUM and ANALYZE commands. Sometimes it might not be enough to free up space, and you'll need a deep copy of the entire table, which often involves scaling as well.
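
A small Python sketch of what that manual routine looks like is shown below. It only builds the SQL statements; running them against a cluster is left out. The table name and the "_copy" suffix are assumptions, and the deep copy follows the common pattern of creating a new table LIKE the original, refilling it, and swapping names.

```python
def maintenance_statements(table: str) -> list[str]:
    """Routine upkeep: reclaim space and re-sort rows, then refresh statistics."""
    return [f"VACUUM {table};", f"ANALYZE {table};"]


def deep_copy_statements(table: str) -> list[str]:
    """Rebuild a table from scratch when VACUUM alone is not enough."""
    copy_name = f"{table}_copy"  # assumed naming convention for the new table
    return [
        f"CREATE TABLE {copy_name} (LIKE {table});",
        f"INSERT INTO {copy_name} (SELECT * FROM {table});",
        f"DROP TABLE {table};",
        f"ALTER TABLE {copy_name} RENAME TO {table};",
    ]


# Hypothetical table name, printed in the order the statements would be run.
for statement in maintenance_statements("page_views") + deep_copy_statements("page_views"):
    print(statement)
```

Note that the deep copy temporarily needs room for two copies of the table, which is why it often goes hand in hand with scaling.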

Panoply: Automatically runs all of the maintenance operations, such as regular vacuuming and deep copies, without the need for IT or engineering, giving you a no-maintenance data warehouse.

Server Configuration

Redshift: Plan ahead of time how much data you'll need to store and what the performance requirements are. Based on that, attempt to figure out which instance types, and how many nodes, you'll need. It's impossible to change that choice without a complete rebuild of the infrastructure and data.

 Panoply: We’re a serverless data warehouse solution so you can avoid the manual node configurations.

Transformations

Redshift: Data is transformed ahead of time. The ingestion code needs to process, clean up and prepare the data for analytics. Changes usually involve re-running the ingestion scripts on all of the historic data, which usually takes days for every change.

Panoply: Data is loaded as is, untransformed. Transformation takes place using SQL views, either at the time of query or as materialized tables. In both cases, changes take effect immediately.
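
The idea of transforming with views rather than ingestion scripts can be sketched in a few lines of Python. The example below uses SQLite from the standard library purely as a stand-in so that it runs anywhere; it is not Redshift or Panoply, and the table, view and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Load step: raw data goes in untouched, including messy status values.
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 1250, " PAID "), (2, 900, "refunded"), (3, 4300, "paid")],
)

# Transform step: a view cleans the data at query time.
conn.execute(
    """
    CREATE VIEW orders AS
    SELECT id, amount_cents / 100.0 AS amount, LOWER(TRIM(status)) AS status
    FROM raw_orders
    """
)
paid_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE status = 'paid'"
).fetchone()[0]

# Changing the transformation means redefining the view; no data is reloaded.
conn.execute("DROP VIEW orders")
conn.execute(
    """
    CREATE VIEW orders AS
    SELECT id, amount_cents / 100.0 AS amount, UPPER(TRIM(status)) AS status
    FROM raw_orders
    """
)
statuses = [row[0] for row in conn.execute("SELECT DISTINCT status FROM orders ORDER BY status")]
```

Because the raw table never changes, the second view definition applies to all historic rows the moment it is created.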

Concurrency

Redshift: Up to 50 concurrent queries, but in reality you'll hit a glass ceiling in performance above 5 queries, which causes all queries to slow down significantly. This especially affects dashboards in BI visualization tools, which sometimes have to run 10-15 queries in parallel per dashboard, per user.

Panoply: Multi-cluster replication allows complete separation of storage and compute to alleviate concurrency issues. The number of available clusters scales with the number of users and the intensity of the workload. Panoply can support hundreds of parallel queries that are load balanced between clusters.
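
To picture what load balancing queries between clusters can mean, here is a toy Python sketch of one common strategy, least-loaded routing. It is an assumption chosen for illustration and says nothing about how Panoply actually routes queries; the ClusterPool class and the cluster names are hypothetical.

```python
class ClusterPool:
    """Route each query to the cluster with the fewest queries in flight."""

    def __init__(self, names: list[str]) -> None:
        self.active = {name: 0 for name in names}

    def acquire(self) -> str:
        # Pick the least busy cluster; ties go to the first one listed.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call this when the query on that cluster has finished.
        self.active[name] -= 1


pool = ClusterPool(["cluster-a", "cluster-b"])
assignments = [pool.acquire() for _ in range(5)]  # five dashboard queries arrive
pool.release("cluster-a")                         # one of them finishes
next_cluster = pool.acquire()                     # the next query fills the gap
```

With more clusters in the pool, the same number of dashboard queries is spread thinner, so each cluster stays below the point where queries start slowing each other down.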

Simplicity/Accessibility

Redshift: Complex user interface. API requests and SQL coding are required to troubleshoot issues, improve performance, explore the data and make changes.

Panoply: An easy-to-use console provides everything that anyone (including the non-tech-savvy analyst) might need, from the list of tables to changing column data types, although most operations are automatic and do not require much tuning or extensive interaction with the platform. We’ve done this to democratize data. Now anyone, at a company of any size and in any industry vertical, can easily add any data source and connect to any visualization tool in minutes, on their own, to get insights in record time.

If you have any questions about how Panoply works, you can sign up for a customized demo with a Panoply Data Architect or try Panoply for free for 21 days (no credit card required).
