What is an Executive Dashboard?

Turn your data into useful insights for executives with these dashboard tips.

Executive dashboards have been a key tool and resource for C-suite executives for as long as data and analytics have been around. The evolution of dashboards has led to more in-depth and interactive tooling; however, the core need of an executive remains the same: KPIs and metrics that allow them to quickly and effectively understand how the business is performing.

The core needs of speed and efficiency should be at the heart of how you create and manage an executive dashboard, but there are also several design principles to take into account. In this post we will dive into best practices for developing a dashboard for your executive team, focusing mainly on executives who look at user behavior data.

What is an executive dashboard?

Fundamentally, an executive dashboard is no different from any other dashboard your analytics or BI team creates for other functions of the business. It should consist of relevant and trusted insights that allow for fast decision-making. Where it begins to differ is in the substance and level of detail included in an executive dashboard vs. any other.

An executive needs to see KPIs first and foremost, and in the simplest terms possible. This is not to diminish the complex calculations or various inputs behind a KPI; rather, it demonstrates how valuable these metrics are at communicating complex information. Designing the proper KPIs is another topic entirely, but some common ones that should be included are expanded upon below.

Additionally, these dashboards should be accessible to executives whenever they need to see them. This allows for self-service and reduces the time between an executive’s question and the relevant information to answer, or help answer, it. When selecting a tool to deliver this information, it’s important to consider ones that don’t require human intervention prior to sharing.

Let’s look deeper into the type of information typically included in an executive dashboard, and a few you may want to consider.

What you should include on an executive dashboard

The question of what a dashboard should include depends entirely on the executive’s role and goals. Most often, each team or line of business (LOB) has its own slice of data and dashboards that dive deep into its given area. As mentioned above, executive dashboards need a summarized view of this complex world and are often composed of parts of these LOB dashboards.

[Image: Kubit executive dashboard]

A few best practices when designing these executive dashboards include:

  • Relevant Time Period: If executives typically look back and plan forward quarterly, ensure the time intervals align with that viewpoint. A daily report is less useful if the CMO plans monthly.
  • Consistent Naming: Ensure that any common nomenclature or terminology is carried onto the dashboard. Acronyms should also be defined to remove any possible confusion.
  • Critical Metrics Only: Include only the top KPIs and summarized metrics. If the dashboard contains too many details, it can be difficult to know what’s a KPI vs. a supporting metric.
  • Context When Possible: Most often the numbers presented have a story behind them. It can be valuable to include a summary of the story or incident behind the metric displayed, e.g. a data outage, large campaign, or holidays.

Let’s break down these best practices framed by a few executive roles Kubit sees leveraging executive dashboards. A good rule of thumb is to keep it short and simple: a dashboard with more than seven reports is difficult to navigate, so challenge yourself to keep it under seven charts.

CPO – Chief Product Officer

Across industries a Chief Product Officer (CPO) often needs a broad view of how users are adopting, growing and maturing through their product(s). This becomes especially important if your business has several products that promote one another. A few core KPIs we see with these types of executive dashboards include the following (a sample query sketch follows the list):

  • Daily/Weekly/Monthly Active Users
  • A summary of those users based on their current status or maturity
    • This typically includes a definition of New, Existing and Dormant users. 
    • Seeing this breakdown will best illustrate what segment comprises your average user base.
  • Conversion Rate of a core flow, e.g. Login or Purchase
    • This should be a flow that nearly all users are expected to complete, or one that denotes value within your product.
  • Top KPIs defined by your product managers that ladder to specific areas of user engagement.
    • These are typically called “Input Metrics” and include things like:
      • Average Sessions per User
      • Average Engagement Events per User
      • % of Users Who performed X event(s)
      • Average Duration of Sessions
  • A North Star Metric that encompasses the successful value exchange between your product and the user.
    • A North Star Metric can be difficult to design but when done correctly it gives the best possible picture of how the product is performing.
    • Examples include:
      • Average Session Duration per Paying User
      • Average Checkout Value per Loyalty Member User
      • Average Minutes Watched per Paying User 
  • Furthermore, if your CPO is responsible for several products, you can add another layer, Average Products per User, to understand whether cross-sell or upsell strategies are working.
    • Also provide a filter control within the dashboard to allow the CPO to filter to a specific product or product line.
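
To make a few of these KPIs concrete, here is a minimal SQL sketch (not Kubit’s actual generated query) for daily active users and average sessions per user, assuming a hypothetical events table with user_id, session_id and event_time columns:

  -- Daily active users and average sessions per user, per day
  -- (hypothetical `events` table; names will differ in your own data model)
  SELECT
    DATE_TRUNC('day', event_time)           AS activity_date,
    COUNT(DISTINCT user_id)                 AS daily_active_users,
    COUNT(DISTINCT session_id) * 1.0
      / NULLIF(COUNT(DISTINCT user_id), 0)  AS avg_sessions_per_user
  FROM events
  GROUP BY 1
  ORDER BY 1;

The same grouping, extended by a user-status dimension (New, Existing, Dormant), gives the maturity breakdown described above.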

CMO – Chief Marketing Officer

From our experience a majority of CMOs focus on two things:

  1. How good are we at attracting new customers to our platform or service?
  2. How efficient are we at converting those new customers into paying or returning users?

Most often, teams will have a good idea of #1 because marketing data like impressions, clicks and click-throughs are tracked using a single toolset or tagging infrastructure. We typically see this data generated by tools like Google Analytics, Braze, Iterable or the various other marketing acquisition tools. These do a great job of collecting this data in a standardized way across platforms and products.

An added benefit of leveraging tools like GA is that you can typically see ad spend next to the impression and click-through data, so CMOs can quickly see ROI on their team’s efforts. ROI will be a critical KPI for most CMOs, so bringing the ad spend data next to the user behavior data (i.e. impressions and clicks) is something we recommend marketing teams investigate.

Conversely, #2 is the most difficult data for CMOs to see, as the outcomes of their efforts are collected and analyzed by the CPO and the data no longer exists within their marketing toolset. Tools like Kubit help connect the dots between these two datasets, as both often live within the data warehouse and can be analyzed within Kubit.

[Image: Kubit for CMO]

A few core KPIs we see with CMO executive dashboards include the following (a sample query sketch follows the list):

  • User Acquisitions from various marketing channels like search, social media and email.
    • This metric refers to counting the number of users on your platform or site that were referred by your marketing channels. 
  • High-value Conversions via email, ad and organic channels that you can tie to product growth.
    • Often reported alongside Click-Through Rate (CTR), which you calculate by dividing the number of clicks by the number of impressions, typically per marketing channel.
  • Primary SEO metrics if that’s a point of focus for your Marketing organization.
    • This includes domain rankings, backlinks, keywords, content spend etc.
  • Reputation signals from 3rd party reviews and testimonials.
    • Even better if you have a portal that can send this data into your data warehouse for easier analysis. 
    • Often, though, this data comes in spreadsheets and isn’t connected to a specific user. If the data is challenging to fit into your data model, it’s best to leave it out of the dashboard and share it ad hoc.
  • Presence on social media (page likes and follows) to ensure your audience is growing and engaged.
    • These numbers are also often retrieved from 3rd parties or you can purchase a social media monitoring system and collate data across social networks for easier reporting.
  • Product and site traffic, conversion and engagement metrics from your acquisition audience.
    • This piece is critical to understand if the gains seen in the top of the funnel translate to long term users and revenue. 
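
To ground the acquisition and click-through numbers above, here is a hedged SQL sketch; the impressions, clicks and signups tables and their columns are hypothetical placeholders for whatever your marketing stack lands in the warehouse:

  -- New-user acquisitions and click-through rate (clicks / impressions) by channel
  SELECT
    i.channel,
    COUNT(DISTINCT s.user_id)                       AS acquired_users,
    COUNT(DISTINCT c.click_id) * 1.0
      / NULLIF(COUNT(DISTINCT i.impression_id), 0)  AS click_through_rate
  FROM impressions i
  LEFT JOIN clicks  c ON c.impression_id = i.impression_id
  LEFT JOIN signups s ON s.click_id      = c.click_id
  GROUP BY i.channel
  ORDER BY acquired_users DESC;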

Why are executive dashboards important?

Executives are busy and have dozens of workstreams, initiatives and goals they are tracking at any given time. The best lifeline and resource they can have is reliable, well-organized data to aid their decision making. Executive dashboards provide this information and should be a centerpiece of any executive’s regular routine, which makes them one of the most important resources for any C-suite.

What happens when this information isn’t readily available? In short, gut instinct and intuition take a front seat in most decision making. While those two things are incredibly valuable attributes of an executive, they don’t guarantee the executive is seeing the entire picture, which leads to blind spots and missed opportunities. Executives are asked to make big, sweeping decisions that could impact people, process and technologies across portions of an organization, so asking “Why are executive dashboards important?” is frankly a silly question… even though I just asked it!

In short, they are incredibly important and to summarize why:

  • Improves decision making 
  • Reduces time to insight
  • Provides transparency across the organization
  • Improves communication between cross-functional teams

How to use Kubit to build an executive dashboard

Kubit has several features that enable valuable executive dashboards, and our warehouse-native architecture makes it easier to bring data from various sources together. Traditional tools for analyzing user behavioral data often limit you to what the tool collects or the data format it expects. Because Kubit has a “bring your own data” model, we can mold to your unique data structure.

A few core features of Kubit specific to executive dashboards include:

Rich Text in Dashboards

Add rich text to dashboards, which allows you to infuse story, context and resource hyperlinks alongside the data.

[Image: Rich text in dashboards]

Incidents in Kubit

Create Incidents in Kubit to highlight critical releases, outages, campaigns or holidays that would impact KPIs. This means executives can toggle this information on and quickly get context without reaching out for more information.

[Image: Incidents in Kubit]

Dashboard Schedules

Schedule dashboard refreshes and send emails regularly to automate the distribution of this information via Kubit’s UI. 

[Image: Dashboard Schedules]

Unleashing the Power of Self-Service Analytics with Snowflake-native Kubit

Kubit offers the market-leading self-service analytics platform that runs natively on Snowflake.

In today’s data-centric world, the ability to sift through large amounts of information and extract actionable insights quickly is not just an advantage—it’s a necessity. With IDC predicting that global data volume will surpass 160 zettabytes by 2025, a tenfold increase from 2017, having the ability to quickly access, analyze, and act on company data that you can trust will be a competitive differentiation point that organizations will not be able to ignore.

The Rise of Snowflake

This explosion of data has led to the creation of an entirely new generation of cloud data warehousing technologies, all positioned to help organizations gain more flexibility and control of their data with a scalable cost model. Among these companies, Snowflake is a trusted leader for thousands of organizations that realize the value and necessity of data for their business.

While there are numerous ways customers can derive value from Snowflake, this article, 8 Reasons to Build Your Cloud Data Lake on Snowflake, highlights several reasons why organizations turn to Snowflake to enable a more robust data practice in their organizations. The critical takeaway from this article is that when you store data in Snowflake, your experience is drastically simplified because many storage management functionalities are handled automatically. Yet, there are still some challenges and limitations in accessing and activating that data, which we will discuss here.

The biggest challenge and most common question is:
How do non-technical (non-Snowflake) users access and use the data that is relevant to them?

The reality is that this question persisted long before cloud data warehousing was around. Company data was still held directly in databases, and any analysis required a database administrator or engineer to access it for the business. This is where product analytics was born.

The Birth of Self-Service Product Analytics

Product analytics emerged from the frustration of traditional data analysis methods. While querying databases for insights was possible, the process was slow and cumbersome, requiring significant technical expertise. Business intelligence (BI) tools offered some relief but were often rigid and pre-configured for specific reports. This meant limited flexibility for stakeholders who needed to explore data independently and answer unforeseen questions quickly. The rise of product analytics addressed this need for speed and exploration. It provided user-friendly interfaces and intuitive data visualizations specifically designed to analyze user behavior within digital products rapidly. This empowered stakeholders to delve deeper into user data, identify trends and pain points, and ultimately make data-driven decisions to optimize the product and user experience.

Product analytics has always been pivotal to understanding customer behaviors, enhancing product offerings, and driving user engagement. However, the landscape of data analytics has undergone a seismic shift with the advent of Big Data, escalating both the opportunities and challenges it brings.

Traditional product analytics tools, while offering some level of self-service analytics, essentially create data silos. This situation conflicts with the organizational drive and investment toward cloud data warehousing. The core issue with this setup is that data residing outside the warehouse leads to concerns about trust and integrity. Moreover, organizations find themselves duplicating efforts and squandering resources to manage and reconcile data across disparate locations.

Enter Kubit’s Snowflake-native Product Analytics

Kubit is the first Snowflake-native product analytics platform purpose-built to address the limitations and challenges inherent in traditional product analytics approaches. Specifically, providing a self-service analytics platform native to Snowflake allows organizations to access their complete dataset with flexibility, agility, and trust. There are other value drivers as well, including but not limited to:

  1. Self-Service Analytics
    Self-service analytics refers to the ability for non-technical users to access and analyze data without needing assistance from data engineers and analysts. This is made possible by Kubit’s intuitive and easy to use business interface that allows users to directly query and manipulate their data in real-time, without the need for SQL knowledge or complex ETL jobs.
  2. Flexibility
    Kubit empowers organizations to analyze ALL of their data within Snowflake, going beyond mere clickstream analysis to encompass a wide array of sources including marketing, product, customer success, sales, finance, and operations. By aggregating this diverse data, organizations are equipped to delve into one of the most vital inquiries – why? It’s only through a holistic overview of all data points that teams can begin to unravel this question, paving the way for more informed decision-making.
  3. Data Integrity
    The abundance and completeness of data for analysis becomes irrelevant if there’s a lack of trust in the data itself. Hence, it’s imperative that Kubit can directly access Snowflake, serving as the ‘single source of truth,’ to guarantee the accuracy and reliability of data throughout its lifecycle. This ensures compliance, operational excellence, and builds trust within any data-driven environment.
  4. Total Cost of Ownership
    Gartner’s research indicates that organizations can reduce their Total Cost of Ownership (TCO) by up to 30% through migrating to cloud data warehouses. Kubit further enhances this advantage by assisting organizations in streamlining their analytics technology stack. This enables the reallocation of valuable resources, which are currently underutilized in efforts to create, manage, measure, and validate data and analytics with tools not designed for these tasks. Kubit also cuts down on double paying for storage and compute of data residing in yet another repository for analytics purposes.
  5. The Real-world Impact
    The advantage of adopting a Snowflake-native strategy for self-service analytics lies in the ability of organizations to be operational within days, not weeks or months. This rapid realization of value empowers companies to immediately concentrate on their most crucial and impactful areas. For instance, this TelevisaUnivision case study illustrates how they focused on boosting retention rates for their ViX streaming service, showcasing just one of many successes where Kubit has facilitated the achievement of significant outcomes.
  6. Implementation Insights
    Kubit offers far more than just self-service analytics software; it boasts a world-class team dedicated to ensuring customer success through comprehensive onboarding, enablement, training, and support. Our commitment goes beyond just providing technology; we actively lean in with our customers to help create value and success.

The immediate advantages of leveraging Snowflake-native product analytics are evident, including improved decision-making capabilities and more profound insight into customer behaviors. Moreover, the long-term benefits herald a continuous shift towards predictive and prescriptive analytics, fundamentally transforming the future of business data interaction.

Get Started Today

What are you waiting for? Are you a Snowflake user ready to try Snowflake-native Kubit? Feel free to Take a Tour or Contact Us to discuss your specific goals and how Kubit can help you achieve them. Our team is here to provide personalized support and ensure a smooth onboarding experience.

If you want more information about our offering, including detailed features and implementation guidelines, check out our technical documentation. Whether you’re an experienced data analyst or a Product Manager just starting out, our resources are tailored to meet your needs and help you maximize the potential of your data.

Content Operations from Trustworthy Insights

Warehouse-native Analytics for Content Teams

Following the earlier blog “Unveiling the Truth Behind Warehouse-native Product Analytics”, let’s cover how content teams for digital products can operate effectively, and with integrity, using this new approach to analytics.

What is Content Operations?

In streaming and entertainment applications, content plays a significant role in engagement, retention and monetization, since it is what customers interact with. It can also be used as a critical tool to attract new customers, drive conversion and reactivate dormant ones.

Content, however, is typically not free. There are licensing and loyalty considerations, as well as the cost of promoting certain content to the right audience. For example, using a free show to acquire new customers or drive them to sign up for a trial subscription, or targeting a specific cohort of users with episodes from certain genres to bring them back to the platform.

Even with sophisticated recommendation systems based on modern machine learning algorithms, the content team must conduct lots of experiments and measure their results to maximize the impact. It is a tricky balance since the audience’s taste changes frequently and can be easily influenced by the season or cultural atmosphere. That’s why content management becomes live operations. 

Unfortunately, in most organizations content analytics is typically overlooked, treated as just a reporting function, and left to a few pre-built reports. Without self-service and exploration, many enterprises can’t even connect the dots between content changes and long-term customer behavior.

Problems with Siloed Product Analytics

Some organizations have managed to get content insights using last-generation product analytics platforms, but at a very high cost.

[Lucid Diagram]

Complex Integration with Stale Data

Content data is massive and changes very frequently. Imagine a content database with every show and episode, with dynamic assignments to different channels, genres, carousels and promotions, along with confidential loyalty data. None of the information is available inside the digital application when a customer starts watching a video. 

In order for these product analytics platforms to provide content insights, complex integration has to be completed to duplicate the content data into their data silos. Either the content data must be available inside the application on the devices, or special ETL jobs have to be built to sync it over periodically. Neither approach is ideal because of the dynamic nature of the content data itself: any kind of copying or syncing causes stale data or, even worse, the potential for wrong data.

There are also other vendors for each stage of a customer’s journey, like identity resolution, customer engagement and experimentation. The product analytics must have a copy of precious customer data from each and every vendor in order to deliver the insights. That is the root cause of all the headaches and issues. 

Criss-cross connections have to be established through various approaches (e.g. ETL, API and storage sharing), along with heavy-duty data copying. More often than anyone would like, these connections break, require maintenance, or, even worse, force a restatement of historical data because of mistakes made. Just imagine the impact on critical campaigns that require almost real-time insights.

Identity Resolution Becomes a Nightmare

Streaming and entertainment apps are all very sensitive about data security and privacy. As required by GDPR-like policies, most customer identifiers are obfuscated or anonymous and require identity resolution vendors (e.g. Amperity, LiveRamp) to stitch them together.

Unfortunately, identity resolution is not deterministic, and there is often a desire to test different strategies in order to measure certain content campaigns more accurately or efficiently. If the resolved IDs have already been copied into the product analytics platform’s data silos, there is no chance to restate or re-evaluate. Frankly, it is hard to imagine how these insights can be trusted, because technically, as a third party, a product analytics platform shouldn’t store customers’ PII in the first place.

No Single Source of Truth

This one is really simple: with copies of data lying outside the enterprise, how can anyone trust insights when the analytics platform is a black box with zero transparency into how those insights are generated? Needless to say, there is no reconcilability whatsoever. It takes a real leap of faith to rely on these findings to make content decisions, which often involve millions of dollars of budget.

Limited View on Customer Impact

Because some content data (like loyalty) is too sensitive to be sent to the digital app or a third-party analytics vendor, and some of it changes very fast and requires constant restating or backfilling (like catalog information), there can never be a complete 360 view of the customer journey with siloed product analytics.

In addition, most media apps generate significant amounts of behavior data, like heartbeat events for video watching, which would lead to skyrocketing cost on such platforms which typically charge by data volume because they ingest customer data. Many content teams were forced to sample their data and live with partial insights which could lead to completely wrong results.

The Warehouse-native Way

All of these problems can be solved with the warehouse-native approach when the enterprise is committed to having full control of its data within a cloud data warehouse. By bringing all of the clickstream, identity resolution, impression, conversion and A/B test data from the vendors together and making their own data warehouse the Single Source of Truth, a new generation of warehouse-native analytics platforms can connect directly to the customer’s complete data model through effortless integration, ensuring both the integrity and the self-service experience required by content operations.

[Lucid Diagram]

Effortless Data Integration

For the enterprise, they just need to collect their own customer data (including clickstream/behavior, content and operational data) and all vendors’ data into a central data warehouse which is under their full control. Often, access to vendors’ data can be achieved through Data Sharing protocols (available in most cloud data warehouses) instead of duplication with ETL or API. 
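
For example, in Snowflake this kind of vendor access is a handful of statements rather than an ETL pipeline. The sketch below is illustrative only; the database, share and account names are hypothetical placeholders:

  -- Provider (vendor) side: expose a table through a share
  CREATE SHARE campaign_share;
  GRANT USAGE  ON DATABASE vendor_db                  TO SHARE campaign_share;
  GRANT USAGE  ON SCHEMA   vendor_db.public           TO SHARE campaign_share;
  GRANT SELECT ON TABLE    vendor_db.public.campaigns TO SHARE campaign_share;
  ALTER SHARE campaign_share ADD ACCOUNTS = enterprise_account;

  -- Consumer (enterprise) side: mount the share as a read-only database, no copying
  CREATE DATABASE vendor_campaigns FROM SHARE vendor_account.campaign_share;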

There is no complex graph of data flow outside of the enterprise, especially between vendors. When there are data issues, only one place needs to be fixed, and it is easily verifiable, instead of coordinating with several third parties and praying that they do the right thing with no visibility into their black boxes. There is no data backfilling, scrubbing or restating required.

Flexible Identity Resolution

Because all the data now goes to the enterprises where the customers are from, all available customer identifiers can be explicitly stated and used for analytics internally without the need for hashing and complicated matching (often guessing) algorithms.  

Even better, the content team can experiment with different identity resolution or attribution strategies on the fly, without the need to engage with vendors or reprocess any data. The ability to ask and validate “what if” questions before committing gives complete confidence and flexibility.

Moreover, sensitive identity data can also be hidden or dynamically masked for warehouse-native analytics platform’s access since they don’t need to see the individual data as long as the underlying join works. 
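
As one hedged illustration, Snowflake’s dynamic data masking can hide a raw identifier from the analytics platform’s role while keeping the column usable for joins; the role, table and column names below are hypothetical:

  -- Show raw emails only to internal roles; everyone else sees a deterministic hash,
  -- which still joins consistently across tables without exposing the identifier
  CREATE MASKING POLICY mask_email AS (val STRING) RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() IN ('INTERNAL_ANALYST') THEN val ELSE SHA2(val) END;

  ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY mask_email;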

Exploration with Integrity

With One Single Source of Truth, and the ability to provide the SQL behind every insight, the content team can now measure content impact and explore customer insights in a self-service manner while maintaining the highest level of integrity. There are never concerns about data volume or bringing in new data for analysis.

The transparency delivered by warehouse-native analytics makes it complementary to any other BI, AI or machine learning tools, which can not only reconcile the insights but also build on top of them. For example, a complex subscription retention analysis for different content cohorts can now be embedded into the machine learning algorithm for content recommendation as the KPI for tuning purposes, because the SQL is fully accessible.

Full Customer 360 View

With all the data about customers’ complete lifecycle stored in one place, warehouse-native analytics can easily analyze the impact of content campaigns alongside subscription, lifetime value, retention and reengagement data. Best yet, because all insights are generated dynamically, there are no ETL jobs to develop and no data to backfill when new data is required. That means content teams don’t have to wait weeks or months for data model changes required by specific vendors. Live customer insights with thorough depth are not a dream any more.

Summary

The days of data silos are long gone. With its convenience and advantages, warehouse-native analytics for content operations is an undeniable trend for enterprises with media- and entertainment-focused digital products. Getting reliable, trustworthy insights from a Single Source of Truth should be top of mind for every serious content team.

Explore Customer 360 with Integrity

Warehouse-native Analytics for Growth Marketing

Following the earlier blog “Unveiling the Truth Behind Warehouse-native Product Analytics”, let’s cover how growth marketing teams for digital products can effectively explore customer 360, with integrity, using this new approach to analytics.

What is Growth Marketing?

There are many definitions out there. We’d like to think of Growth Marketing as an approach to attract, engage and retain customers through campaigns and experiments focused on the ever-changing motives and preferences of those customers. In practice, growth marketers build, deliver and optimize highly tailored and individualized messaging aligned with their customers’ needs through multiple channels. They are a cross-functional team spanning Product, Marketing, Customer and Data. Product analytics plays a significant role in this job, with a focus on self-service customer insights.

From the customer lifecycle perspective, there are several stages:

Acquisition

At the top of the funnel, customer acquisition is all about the strategy of targeting potential customers with tailored content through multiple channels, with the highest efficiency and the fastest, most accurate measurement. The campaigns can be executed as ads, paid search, calls to action, free offers or discount coupons on various third-party channels. Often, a significant amount of budget is allocated to these campaigns, which are also super dynamic.

Often the term “attribution” is used, which means to attribute every customer to the proper channel they come from in order to measure and find the most effective one. It requires constant monitoring, A/B testing and tuning to optimize acquisition channels on the fly in order to adapt to the market dynamics and get the best ROI. 
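
As a rough sketch of the idea, a simple last-touch attribution query might look like the following, assuming hypothetical touchpoints and signups tables; real attribution models are usually more nuanced (multi-touch, decay, and so on):

  -- Attribute each signup to the most recent marketing touch before it (last-touch)
  SELECT
    s.user_id,
    s.signup_time,
    t.channel AS attributed_channel
  FROM signups s
  LEFT JOIN touchpoints t
    ON  t.user_id    = s.user_id
    AND t.touch_time <= s.signup_time
  QUALIFY ROW_NUMBER() OVER (
    PARTITION BY s.user_id ORDER BY t.touch_time DESC
  ) = 1;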

Engagement

Once a new customer comes in, the focus now is to drive their engagement and collect more data to help build better experiences by enhancing their customer journey. Typically there are critical stages or funnels for a digital product like onboarding (tutorial), sign up, engaging with the core loop (e.g. watch a video, invite a friend, add to cart), and checkout. The goal of the engagement is to prompt customers with the most relevant and attractive content and push them through the desired sequence in order to keep them in the application. 

Besides optimizing the user flow by improving design and usability, growth marketers typically rely on incentivized offers (first order discount, free trial), social/viral loop (invite a friend, refer someone) and loyalty programs to keep their customers engaging. All of these efforts require a deep understanding of customers’ journey (e.g. funnel, conversion, drop off) through product analytics in order to make the right decisions. 

Reactivation

There will always be customers who become dormant or churn completely. In order to get them back into the application and retain them, growth marketers utilize every possible communication channel at their disposal: email, push, SMS or even targeted ads to get their attention and bring them back. Often a third-party tool like Braze, a Customer Engagement Platform, will be utilized to deliver these messages. Still, product analytics will be the driver for these campaigns to identify different cohorts, target them and measure the ultimate results, which are not only about impressions and open rate, but also the long-term impact inside the application: e.g. retention, subscription attach, LTV (lifetime value).

Problems with Siloed Product Analytics

Those last-generation product analytics platforms served growth marketing needs at a time when teams needed to run fast, but at a high cost.

Super Complex Data Flows

Since there are always different vendors for each stage of a customer’s journey, the product analytics must have a copy of precious customer data from each and every vendor in order to deliver the insights. That is the root cause of all the headaches and issues. 

Criss-cross connections have to be established through various approaches (e.g. ETL, API and storage sharing), along with heavy-duty data copying. More often than anyone would like, these connections break, require maintenance, or, even worse, force a restatement of historical data because of mistakes made. Just imagine the impact on critical campaigns that require almost real-time insights.

Identity Matching Becomes a Nightmare

When privacy concerns like GDPR arise, there are more and more limitations on what kinds of customer identifiers can be shared with vendors and between vendors themselves. Often the growth marketers get stuck in the middle of the battle between data engineering and security personnel. Eventually some aerobatic maneuvers have to be performed on the data pipeline, which makes everything even more complicated and fragile.

No Single Source of Truth

This one is really simple: with copies of data lying outside the enterprise, how can anyone trust insights when the analytics platform is a black box with zero transparency into how those insights are generated? Needless to say, there is no reconcilability whatsoever. It takes a real leap of faith to rely on these findings to make growth marketing decisions, which often involve millions of dollars of budget.

Limited View on Customer 360

For growth marketers, just having impression, conversion and CPI/CPM data is often not enough. The deeper the insights into customer behavior, the better. For example, measuring the open rate of a push campaign only scratches the surface; it is often desirable to understand what kind of content the customer engaged with, how long they stayed in the application, whether they came back the week after, whether they converted to a subscriber, and if or when they churned again.

In order to get this complete customer 360 view, operational data is required (e.g. Items and Subscriptions), but it is often almost impossible for traditional product analytics platforms to get this data because it is usually not part of the clickstream (behavior) data and would require very complicated ETL integration to send a copy out.

The Warehouse-native Way

All of these problems can be solved with the warehouse-native approach when the enterprise is committed to having full control of its data within a cloud data warehouse. By bringing all of the clickstream, campaign, impression and conversion data from the vendors together and making their own data warehouse the Single Source of Truth, a new generation of warehouse-native analytics platforms can connect directly to the custom data model through effortless integration, ensuring both the integrity and the self-service experience required by growth marketers.

Simplest Data Integration

For the enterprise, they just need to collect their own customer data (including clickstream/behavior and operational data) and all vendors’ data into a central data warehouse which is under their full control. Often, access to vendors’ data can be achieved through Data Sharing protocols (available in most cloud data warehouses) instead of duplication with ETL or API. 

There is no complex graph of data flow outside of the enterprise, especially between vendors. When there are data issues, only one place needs to be fixed, and it is easily verifiable, instead of coordinating with several third parties and praying that they do the right thing with no visibility into their black boxes. There is no data backfilling, scrubbing or restating required.

Customizable Identity Resolution

Because all the data now goes to the enterprises where the customers are from, all available customer identifiers can be explicitly stated and used for analytics internally without the need for hashing and complicated matching (often guessing) algorithms.  

Even better, enterprises can experiment with different identity resolution or attribution strategies on the fly, without the need to engage with vendors or reprocess any data. The ability to ask and validate “what if” questions before committing gives complete confidence and flexibility.

Moreover, sensitive identity data can also be hidden or dynamically masked for warehouse-native analytics platform’s access since they don’t need to see the individual data as long as the underlying join works. 

Exploration with Integrity

With One Single Source of Truth, and the ability to provide the SQL behind every insight, growth marketers can now explore customer insights in a self-service manner while maintaining the highest level of integrity. The transparency delivered by warehouse-native analytics makes it complementary to any other BI, AI or machine learning tools, which can not only reconcile the insights but also build on top of them.

Full Customer 360 View

With all the data about customers’ complete lifecycle stored in one place, warehouse-native analytics can easily bring any operational data (e.g. Items, Subscriptions or LTV) into the analyses. Best yet, because all insights are generated dynamically, there are no ETL jobs to develop and no data to backfill when new data is required. That means growth marketers don’t have to wait weeks or months for data model changes required by specific vendors. Live customer insights with thorough depth are not a dream any more.

Summary

The days of data silos are long gone. With its convenience and advantages, warehouse-native analytics for growth marketing is an undeniable trend for enterprises with customer-focused digital products. Besides exploring customer 360, getting reliable, trustworthy insights from a Single Source of Truth should be top of mind for every serious growth marketer.

Mastering Conversion Analysis: A Deep Dive into Kubit’s Funnel Reports

In the fast-paced world of digital marketing and product analytics, understanding the intricacies of user behavior is not just advantageous—it’s essential. One of the most powerful tools at your disposal for dissecting user journeys is the funnel report. Effective use of funnel reports can illuminate the path to increased conversions, reveal bottlenecks in your user experience, and guide strategic decisions that drive business growth.

Here are the 5 unique capabilities of Kubit’s funnel report tool that have made it an indispensable asset for data-driven professionals aiming to unlock actionable insights from their user data.

Understanding Funnel Reports in Kubit

At its core, a funnel report is a visual representation of how users progress through a predetermined series of steps or actions on your website or app. This progression could relate to anything from completing a purchase to signing up for a newsletter.

Kubit’s Funnel Report offers these five capabilities to get the most out of your data (a sketch of the underlying funnel logic follows the list):

  1. Multi-step Funnel Creation: Craft funnels that reflect the complexity of real user journeys.
  2. Partitioning Options: Slice your data by day, session, or custom conversion windows for nuanced analysis.
  3. Deeper Conversion Insights: Break down funnel stages by various fields to uncover underlying patterns.
  4. Advanced Visualization: Choose between step-by-step breakdowns or time-based line charts for dynamic report viewing.
  5. Cohort Analysis: Right click and build users into cohorts for targeted behavioral analysis over time.
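
Under the hood, a multi-step funnel with a conversion window boils down to joining a user’s entry event to a later completion event. The sketch below illustrates that logic for a two-step funnel in plain SQL; the event names and the events table are hypothetical, and Kubit generates the actual query for you:

  -- Two-step funnel: 'add_to_cart' -> 'purchase' within a 7-day conversion window
  WITH step1 AS (
    SELECT user_id, MIN(event_time) AS entered_at
    FROM events
    WHERE event_name = 'add_to_cart'
    GROUP BY user_id
  ),
  step2 AS (
    SELECT s.user_id, MIN(e.event_time) AS converted_at
    FROM step1 s
    JOIN events e
      ON  e.user_id    = s.user_id
      AND e.event_name = 'purchase'
      AND e.event_time BETWEEN s.entered_at AND DATEADD('day', 7, s.entered_at)
    GROUP BY s.user_id
  )
  SELECT
    COUNT(*)                              AS entered_step1,
    COUNT(converted_at)                   AS reached_step2,
    COUNT(converted_at) * 1.0 / COUNT(*)  AS conversion_rate
  FROM step1
  LEFT JOIN step2 USING (user_id);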

Use Cases for Funnel Reports

The applications of funnel reports in Kubit are diverse, mirroring the myriad pathways users can take towards conversion. Here are just a few scenarios where Kubit’s funnel reports can be most valuable:

  • Enhancing User Onboarding: Track new users’ progress through your onboarding sequence to identify and rectify any stumbling blocks.
  • Optimizing Product Engagement: Discover where users disengage or drop off when interacting with specific features or content.
  • Streamlining Conversion Paths: Measure the time it takes for users to move from one stage of your funnel to the next, and deploy strategies to accelerate this progression.
  • Analyzing Behavior Pre-Conversion: Understand the actions repeat users take before finally converting, providing insights into which features or content are most influential in driving conversions.

Through these use cases and beyond, Kubit’s funnel reports offer actionable insights that can powerfully impact business strategies and outcomes.

Real-World Success with Kubit Funnel Reports

Consider Influence Mobile, a customer that leveraged Kubit’s funnel reports to uncover a costly problem. By carefully analyzing their onboarding process and identifying friction points with Kubit’s tools, they significantly improved user retention. Furthermore, Kubit’s capabilities enabled them to detect patterns indicative of fraudulent activity, ensuring a secure and trustworthy platform for their users. Their success story underlines the potential of Kubit’s funnel reports to transform challenges into triumphs.

Getting Started with Funnel Reports in Kubit

Kubit simplifies the process of building and deploying funnel reports. To get started:

  1. Define Your Conversion Goals: Determine what user actions or sequences you want to analyze.
  2. Set Up Your Funnel Steps: Using Kubit, create a funnel that reflects these steps in your user’s journey.
  3. Analyze and Iterate: Once your data starts flowing, use Kubit’s insights to refine your strategy and improve user outcomes.

Understanding how to interpret the data from your funnel reports is crucial. Look not just at where users are dropping off, but also why. This often involves cross-referencing funnel data with user feedback, usability tests, or other analytics reports.

Conclusion

Kubit’s funnel reports are a potent tool for anyone looking to enhance their understanding of user behavior and drive meaningful improvements in their conversion rates. Whether you’re just starting on your analytics journey or are looking to refine your approach with cutting-edge tools, Kubit offers a robust platform designed to elevate your analytics capabilities.

“The more people are looking at this data, the better. Everyone should be monitoring our most important conversions,” states a seasoned user of Kubit, underscoring the collective benefit of widespread engagement with analytics within an organization.

Ready to transform your data into actionable insights? Sign up for Kubit or reach out for a demo today, and discover how funnel reports can redefine the way you view your users’ journeys from first touch to conversion.

Unraveling The Truth About Warehouse-native Product Analytics

In recent years, warehouse-native has become a popular topic in the product analytics market. Along with the maturity of the modern data stack, it is not a coincidence that more and more companies have realized the need for customer insights coming directly from their warehouse. However, more clarity is needed, as some vendors make false claims to ride this wave. In this blog, I will review the history of product analytics, explain the rationale behind the warehouse-native approach, and unveil the benefits of this new generation of optimization tools for digital products.

Integrity vs Speed: The History of ‘Siloed’ Product Analytics

Historically, analytics has been conducted directly in a data warehouse. Consider traditional Business Intelligence (BI) tools like Tableau, Looker, and PowerBI; typically, data analysts create reports and charts in these tools to visualize insights that ultimately stem from executing SQL in their own data warehouse. Data control is entirely in the hands of the enterprise, though this approach requires dedicated and capable engineering and analytics teams.

With the exponential growth of digital products, from web to mobile applications, a different way of conducting analytics has emerged, starting with Omniture (later becoming Adobe Analytics) and Google Analytics. Due to the dynamics in the ecosystem, few enterprises’ data teams can keep up with the constant requirement changes and new data from different vendors. It became well-accepted to sacrifice integrity for speed by embedding SDKs and sending the data to third-party silos, and relying on a black box to get insights.

For a while, everyone was happy to rely on impressions, conversions, CPI/CPM and other metrics from external analytics platforms to guide their marketing campaigns and product development. With the mobile era, the need for Continuous Product Design arose, along with a new breed of Growth Marketing people who rely on product insights to drive user acquisition, customer engagement, and content strategy. That’s when Mixpanel and Amplitude came into existence to provide self-service customer insights from their proprietary platforms, aiming to run fast and bypass data engineering and analytics teams.

Governance, Security, and Privacy: Rethink the Black Box   

Fairly soon, the industry started to correct itself. Sharing customers’ private data, like device identifiers, is no longer acceptable with other vendors. Many enterprises now realize that it is impossible to have complete data governance, security, and privacy control if their sensitive data has been duplicated and stored in third parties’ data silos. How can they trust the insights from a black box that can never reconcile with their data? Without a Single Source of Truth, there is no point in running fast when your insights don’t have the integrity to justify the decisions.

Let’s face it: why should anyone give up their data to third parties in the first place? With the new modern data stack, especially the development of cloud data warehouses like Snowflake, BigQuery, and Databricks, the days of having to rely on external analytics silos are long gone. More and more enterprises have taken data control as their top priority. It was time to rethink product analytics: is it possible to explore customer insights with integrity and speed at the same time?

Without a Single Source of Truth, there is no point in running fast when your insights don’t have the integrity to justify the decisions.

The Rise of Warehouse-native

Cloud warehouses have become many organizations’ source of truth, leveraging millions of dollars in infrastructure investments. Scaling access to this information used to be as simple as dumping all the data into a warehouse and turning to BI tools. Unfortunately, reporting tools like Tableau, Looker, or PowerBI were designed exclusively for professionals answering static questions. To get insights, most product, marketing, and business people rely on analysts to build reports to answer their questions. Going through another team is tedious, slow, and, even worse, highly susceptible to miscommunication. The nature of optimizing digital products necessitates ad-hoc exploration and spontaneous investigation. If each question takes hours or days to answer, the opportunity window may have closed long before the decision is made.

This self-service demand and the warehouse-native motion triggered a new generation of tools that provide SaaS product analytics directly from the customer’s cloud data warehouse. This approach perfectly balances integrity and speed, which should be the objective of analytics platforms.

If each question takes hours or days to be answered, the opportunity window may have closed long before the decision is made.

What is the Warehouse-native Way?

Here are four characteristics to identify a true warehouse-native solution:

Tailored to your data model

A warehouse-native solution should continually adapt to the customer’s data model instead of forcing them to develop ETL jobs to transform their data. Besides sharing data access, there should be zero engineering work required on the customer end, and all the integration should be entirely done by the vendor.

This effortless integration is one of the most significant differences from the traditional data-silo approach, which mandates that the customer build and maintain heavy-duty ETL batch jobs that can take months to develop and yet still break frequently. One example is how Amplitude claims to be warehouse-native, but in reality this just means their application is “Snowflake Native” (running as containers); it still requires customers to transform their data into Amplitude’s schema.

Data should never leave your control

This should be assumed under the term ‘warehouse-native’. However, some solutions are engaging in warehouse syncing or mirroring to copy customers’ data into their data silos. Some admin UI may be provided to configure the data connection and eliminate the need for custom ETL jobs, but if you see words like “load,” “transform,” or “sync,” the system is essentially making copies of customers’ data into its silos.

Besides losing control, the biggest problem with data duplication is how they adapt to customer data changes. There will be a constant struggle for backfilling, scrubbing, restating, and reprocessing when there are data quality issues, or data model changes (e.g., a new attribute or a dimension table), which are fairly common and happen regularly.

While data syncing may reduce some engineering work, achieving a Single Source of Truth or data integrity with it is impossible. It’s difficult to trust a black box without visibility into how insights are generated.

Complete transparency with SQL

One of the most prominent traits of a proper warehouse-native solution is providing customers with the SQL behind every report. Since the data lives in the customer’s warehouse anyway, there should be complete transparency into how the insights are computed. This level of transparency guarantees accuracy, provides reconcilability, and allows customers to extend the work of the product analytics platform to more advanced internal development, like machine learning and predictive modeling.

Dynamic configuration with exploratory insights

Because all reports come directly from the data in a customer’s warehouse leveraging SQL, every insight should be dynamically generated. There are several significant benefits:

  • Underlying data changes will immediately be reflected in analytics. There is no data to scrub, no cache to poke, and no vendor to wait for.
  • Raw data can be analyzed on the fly in an exploratory manner. Warehouse-native analytics supports virtual events, virtual properties, dynamic functions, and analyzing unstructured data (e.g., JSON or struct), which helps in hypothesis testing before committing to lengthy data engineering work.
  • Data model improvements can be iterative and incremental. When new attributes or dimensions are added, they automatically apply to historical data. There is no data backfill required because everything happens with dynamic joins. With the multi-schema support, it is possible to have both raw and clean data schemas running in parallel to satisfy the speed and consistency requirements simultaneously.

  • Incorporate operational data without the need for ETL: all of the clickstream/behavior events, vendor data and operational tables can be dynamically combined for analytics, all inside the customer’s data warehouse with no data movement required.
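
As a simple illustration of the dynamic-join idea, a newly added dimension table can be joined to raw behavior events at query time, so the new attribute applies to all historical data immediately with no backfill; the tables and columns below are hypothetical:

  -- Join behavior events to a newly added content dimension at query time;
  -- no ETL, no backfill: the genre attribute applies to every historical event
  SELECT
    d.genre,
    COUNT(DISTINCT e.user_id) AS viewers
  FROM events e
  JOIN content_dim d
    ON d.content_id = e.content_id
  WHERE e.event_name = 'video_start'
  GROUP BY d.genre;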

Summary

With its unique advantages and momentum in the market, enterprises will inevitably choose warehouse-native analytics to optimize their digital products and explore customer insights with integrity. In the meantime, it is vital to look through the marketing claims and find truthful solutions. In upcoming blogs, I will cover the real-world use cases for applying true warehouse-native product analytics solutions to different teams and industries.

Key product analytics metrics

What are product analytics metrics and why are they important

In the digital age, data is the lifeblood of any business. It can transform a company’s trajectory, inform strategic decisions, and predict customer behavior. But data alone isn’t enough. It’s the application of relevant metrics that can truly drive business growth. When created and measured appropriately, metrics can help illuminate the path to better customer experiences, optimized products, and business success. However, not all metrics are created equal. The key lies in selecting ones that are meaningful, actionable, and tied to your specific business objectives.

In this blog post, we dive into the importance of metrics in product analytics, how to set the right ones, and when to measure and evolve them.

Understanding Quality Metrics

Quality metrics provide actionable insights that are specific to your business. They’re quantifiable, easy to understand, and directly linked to your key performance indicators (KPIs).

For instance, an essential metric is Viewing Time in seconds if you’re a streaming media business like ViX. That heartbeat metric is directly tied to the business goals of driving more watch time and directly impacts revenue. Please check out this case study for a more detailed overview of how ViX teams use Kubit to support and enhance their daily work.
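
As a hedged sketch of how a heartbeat-based viewing-time metric might be computed, assuming a hypothetical events table where a 'video_heartbeat' event is emitted roughly every 30 seconds of playback:

  -- Approximate viewing time per user per day from heartbeat events
  -- (each heartbeat is assumed to represent ~30 seconds of playback)
  SELECT
    user_id,
    DATE_TRUNC('day', event_time)  AS view_date,
    COUNT(*) * 30                  AS viewing_time_seconds
  FROM events
  WHERE event_name = 'video_heartbeat'
  GROUP BY user_id, DATE_TRUNC('day', event_time);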

Setting Quality Metrics

Identifying the right metrics is vital for your product’s success. Here are some common categories of metrics to consider:

Acquisition

Acquisition metrics are crucial in understanding how effectively you’re attracting new users. By capturing and utilizing these metrics, you gain valuable insights that fuel informed decisions about your product’s growth strategy.

Acquisition metrics track the process of bringing new users into your ecosystem. This includes aspects like website visits, app downloads, sign-ups, and user acquisition cost (UAC) across different marketing channels. Analyzing these metrics helps you identify which channels are most successful in attracting your target audience. Imagine you see a surge in sign-ups from social media ads compared to email marketing. This tells you to invest more resources in social media campaigns.

Furthermore, acquisition metrics help you optimize your marketing spend. You can identify areas where you get the most bang for your buck by tracking UAC per channel. This allows you to allocate your budget more efficiently towards channels that deliver high-quality users.
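
A minimal sketch of UAC per channel, assuming hypothetical ad_spend and signups tables in your warehouse, might look like this:

  -- User acquisition cost per channel = total spend / newly acquired users
  WITH spend AS (
    SELECT channel, SUM(spend) AS total_spend
    FROM ad_spend
    GROUP BY channel
  ),
  acquired AS (
    SELECT channel, COUNT(DISTINCT user_id) AS acquired_users
    FROM signups
    GROUP BY channel
  )
  SELECT
    s.channel,
    s.total_spend,
    a.acquired_users,
    s.total_spend / NULLIF(a.acquired_users, 0) AS user_acquisition_cost
  FROM spend s
  LEFT JOIN acquired a USING (channel);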

Overall, capturing and utilizing acquisition metrics is essential for any product team aiming to grow its user base. They provide a data-driven perspective on your marketing efforts, ultimately leading to a more targeted and successful product strategy.

Activation

Once you’ve acquired a new user, you must focus on how best to activate them and turn them into engaged users. Capturing and utilizing activation data is critical for optimizing your product and maximizing its long-term value.

Activation metrics focus on that critical “aha!” moment when users discover the core value proposition of your product. This might involve completing a specific action, like purchasing in an e-commerce app or creating a first post on a social media platform. Tracking activation rates (percentage of users who reach this point) and time to activation reveals valuable insights.

For example, a low activation rate could indicate a confusing onboarding process or a lack of a clear value proposition. By analyzing user behavior leading up to activation, you can identify friction points and streamline the user journey. Additionally, a long time to activate might suggest the need for in-app tutorials or targeted prompts to nudge users toward the core functionality.
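
As a hedged sketch, activation rate and time to activation can be computed along these lines, assuming a hypothetical users table with a signup_time and an “aha” event (here 'first_purchase') in an events table:

  -- Activation rate: share of new users reaching the 'aha' action within 7 days of signup
  WITH activation AS (
    SELECT
      u.user_id,
      u.signup_time,
      MIN(e.event_time) AS activated_at
    FROM users u
    LEFT JOIN events e
      ON  e.user_id    = u.user_id
      AND e.event_name = 'first_purchase'
      AND e.event_time <= DATEADD('day', 7, u.signup_time)
    GROUP BY u.user_id, u.signup_time
  )
  SELECT
    COUNT(activated_at) * 1.0 / COUNT(*)                 AS activation_rate,
    MEDIAN(DATEDIFF('hour', signup_time, activated_at))  AS median_hours_to_activate
  FROM activation;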

Ultimately, utilizing activation metrics allows you to personalize the user experience and remove roadblocks that hinder engagement. By focusing on activation, you ensure those who acquire your product become invested users, driving long-term success.

Engagement

Engagement metrics are the lifeblood of understanding how users interact with your product. This data is paramount for fostering a sticky and successful product: engagement metrics go beyond simply acquiring users and delve into how deeply users interact with and derive value from your product.

Examples include daily/monthly active users, session duration, feature usage frequency, and content consumption. By analyzing trends in these metrics, you can identify areas that spark user interest and those that lead to drop-off.

For instance, a consistent decline in daily active users might indicate waning interest. If you investigate further, you might discover a new competitor offering a similar feature or a recent update that introduced bugs or a confusing interface. Conversely, a surge in a specific feature’s usage could signal a hit with users. This valuable insight allows you to double down on success and prioritize improvements in areas causing disengagement.

Ultimately, utilizing engagement metrics empowers you to refine your product roadmap. You can prioritize features that drive deep user engagement, fostering a loyal user base that consistently returns. This translates to increased product adoption and opens doors for monetization and long-term product viability. By focusing on engagement, you ensure your product isn’t just acquired but actively used and loved by your target audience.

Conversion

In the realm of product analytics, conversion metrics are the champions of measuring success. Capturing and utilizing this data allows you to understand how effectively you’re guiding users towards achieving your desired goals within the product. These goals can vary depending on your product type – a purchase on an e-commerce platform, completing a level in a game, or subscribing to a premium service.

Conversion metrics track the user journey towards specific actions. Common examples include click-through rates on calls to action (CTAs), add-to-cart rates, sign-up completion rates, and conversion funnel analysis. By analyzing these metrics, you gain valuable insights into how well your product is facilitating the desired user behavior.

Imagine a low conversion rate for your premium service sign-up. This could indicate a confusing pricing structure, an unclear value proposition, or a poorly designed sign-up process. Utilizing conversion metrics lets you identify these bottlenecks and optimize the user journey. A/B testing different CTAs or simplifying the sign-up flow can significantly improve conversion rates.

Ultimately, capturing and utilizing conversion metrics empowers you to maximize the value users derive from your product. By optimizing conversion funnels, you ensure users complete desired actions, leading to increased revenue, higher user satisfaction with achieving their goals, and, ultimately, a successful business.

Impact

In the fast-paced world of product development, every decision counts. Capturing and utilizing feature impact metrics is a critical tool in helping you understand how individual features influence user behavior and overall product success.

These metrics go beyond simple feature usage. They delve deeper, measuring the impact a specific feature has on key performance indicators (KPIs) like engagement, conversion rates, or even user satisfaction. This allows you to identify features that are driving positive outcomes and those that might be hindering progress.

For example, imagine you introduce a new social sharing feature in your productivity app. While user adoption might be high (many users try it out), the feature impact metric could reveal a negligible improvement in overall user engagement. This valuable insight suggests the feature might not be addressing a core user need.

By capturing and utilizing feature impact metrics, you gain a clear picture of how each aspect of your product contributes to the bigger picture. This data empowers you to make data-driven decisions, prioritize features that deliver real value, and ultimately build a product that resonates deeply with your users.

Retention

After going through the hard work of acquiring and activating new users, retention is the key to measuring long-term success. Capturing and utilizing retention data is paramount for building a product with lasting value and a loyal user base.

Common retention metrics include daily/monthly active users (DAU/MAU) and user lifetime value (LTV). By analyzing trends in these metrics, you gain valuable insights into user satisfaction and the “stickiness” of your product.

Imagine a steady decline in DAU or a high churn rate. This could indicate features that lose their appeal over time, a confusing user interface, or a lack of ongoing value proposition. Utilizing retention metrics allows you to identify these pain points and take action. This might involve introducing new features that drive continued engagement, simplifying the user experience, or implementing onboarding programs that foster deeper user understanding.

Ultimately, capturing and utilizing retention metrics empowers you to build a product that users love. By optimizing for user retention, you foster a loyal user base that consistently returns, leading to increased revenue and an improved brand reputation. Retention metrics are the compass that guides you toward building a product with lasting appeal and a sustainable future.

Churn

Having an early warning system in the form of churn metrics is critical in mitigating potential issues. By effectively capturing and utilizing churn data, you gain invaluable insights into why users abandon your product, allowing you to identify and address issues before they become widespread.

Churn metrics track the rate at which users stop using your product over a specific period. This seemingly simple metric reveals a wealth of information. Analyzing churn rates across different user segments, timeframes, and acquisition channels allows you to pinpoint areas where users are most likely to churn.
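As a rough illustration, a monthly churn rate can be sketched in SQL as the share of last month's active users who did not return this month; the events table and its columns below are hypothetical:

-- A minimal sketch: monthly churn rate, assuming a hypothetical "events"
-- table with user_id and event_date.
-- Churn = share of last month's active users who did not come back this month.
WITH last_month AS (
  SELECT DISTINCT user_id FROM events
  WHERE event_date BETWEEN '2023-09-01' AND '2023-09-30'
),
this_month AS (
  SELECT DISTINCT user_id FROM events
  WHERE event_date BETWEEN '2023-10-01' AND '2023-10-31'
)
SELECT 1.0 - COUNT(t.user_id) * 1.0 / COUNT(l.user_id) AS churn_rate
FROM last_month l
LEFT JOIN this_month t ON l.user_id = t.user_id;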

Imagine a high churn rate amongst users who signed up through a specific marketing campaign. This could indicate misleading advertising that didn’t accurately represent the product’s value proposition. Conversely, a surge in churn shortly after a major update might point to usability issues or a confusing new interface.

By capturing and utilizing churn metrics, you gain a proactive approach to user retention. This data empowers you to identify and address issues that lead users to churn, ultimately fostering a loyal user base and building a product with lasting appeal.

Choosing the right metrics depends on your business type, product, and specific goals. There’s no one-size-fits-all approach, but keeping these categories in mind will guide you toward meaningful metrics that reflect your product’s performance and user behavior.

Measuring and Evolving Metrics

Once you’ve identified the right metrics, the next step is to measure them regularly to understand the baselines and make informed decisions. The frequency of measurement depends on the specific metric and your business needs.

For instance, DAUs (Daily Active Users) might be measured daily, while churn rate or retention might be measured monthly or quarterly. Reviewing and updating your metrics periodically ensures they remain relevant as your product and market evolve.

Also, remember that metrics should be seen as tools for learning and improvement, not just reporting. If a metric is consistently underperforming, use it as an opportunity to investigate, learn, and iterate on your product.

Metrics with Kubit

Kubit stands out in the crowded data analytics space due to its unique ability to seamlessly handle a comprehensive spectrum of data types, including but not limited to online, offline, transactional, operational, and behavioral data. Our warehouse-native approach ensures that organizations have the ability to access, analyze, and assimilate ALL of their data with Zero-ETL. This sets a new standard for creating, measuring, and adjusting metrics, offering unparalleled flexibility and precision. Unlike other solutions that mandate predefined data models in their data silos or limit the scope you can view, Kubit’s platform empowers you to explore every facet of your data and gain deep, actionable insights. This differentiation unlocks improved data-driven decision-making and gives you a competitive edge in today’s data-centric business environment.

Conclusion

Meaningful metrics are the guiding compass in navigating the expansive realm of product analytics. They provide a clear direction, enable informed decisions, and drive business success. By understanding what good metrics look like, how to set them, and when to evolve them, product managers and data analysts can increase their positive impact on business outcomes.

Remember, numbers tell a story. Ensure your metrics tell a story that matters to your business. Happy analyzing!

Explore your data with Multi-Dimensional Data Tables

When analyzing data, analysts often need to compare various measures simultaneously and break them down by different properties and segments. Introducing Kubit’s latest analysis chart, Data Tables, which allows for multi-measure, multi-dimensional analysis in a single view.

Tools like Excel and Sheets have provided this type of data visualization, and it works! While you may still want to see data as funnels, lines, bars, and pie charts, it can sometimes be best to see it laid out in a table view.

Our Customers are using Data Tables to understand things like:

  1. Cross Tab Analysis
    • How do user engagement metrics compare across different user segments and features?
  2. Custom Measures and KPI Analysis
    • Compare custom-defined measures or KPIs across different dimensions
  3. Segmented A/B Testing
    • Analyze user segments by control vs. variant groups
  4. Impact of Marketing Campaigns
    • Show click-through rate and conversion rate by user segment and campaign, all in one report

Getting Started with Data Tables:

  1. Navigate to Report → Data Table.
  2. As you can see from the snapshot below, the end user can easily begin adding new events, saved measures, breakdowns and segments.
  3. Highlighted below is an example of a user selecting 3 saved measures, building 2 measures on the fly and breaking them down by Country (United States, Canada, United Kingdom), Plan Type and Platform.
  4. When executed, the table below will be displayed. Users have the ability to sort, search, adjust column widths, export to CSV, and view the SQL behind the chart.

Take it for a Ride

Now that you have a high-level overview of Kubit’s Data Tables, click through the guide below and get a feel for it yourself. If you’re interested in learning more, please reach out to our team.

Click the GIF below to walk through the demo.

Work around 5 common Data Quality issues with Kubit

Intro

We already know the perfect data model for product analytics, but even with a perfect data model you can get tripped up by other data issues on your way to obtaining insights. It often happens that a data issue is uncovered while working on a report in Kubit, suddenly blocking the task at hand. Unfortunately, data issues typically take time to fix – in the best-case scenario as early as the next sprint, often a month or two, and in some rare cases the issue cannot be resolved at all. So while at Kubit we advocate for data modeling and data governance best practices, we have also developed a few features to help you work around 5 typical data issues in a self-service fashion while the root cause is being addressed:

  • Incomplete Data
  • Duplicate Data
  • Ambiguous Data
  • Inconsistent Data
  • Too Much Data

In this blog post we’ll explore how you can leverage these features to save the day whenever a data issue tries to stop you from getting your work done!

1. Incomplete Data

Very often we have some building blocks in our data but we don’t quite have the value we want to filter by. For example, we may have a timestamp property generated when a user installs our app, but for our report we want to measure the percentage of daily active users who installed our app within 7 days. Or we might want to filter by session duration but this information is not available when each event is triggered and must be computed afterwards. Or we may even want to extract the user’s device platform from a user-agent header.

Whenever this is the case, you can reach out to the Kubit team to define what we call a “virtual property”, which will be computed on the fly on top of your existing data. To continue our first example, let’s call the virtual property Install Days and base it on a timestamp column named install_date. Now we can think of our virtual property in SQL like this:

datediff(day, install_date, event_date)

However, it looks like and is used as any other property within Kubit, which makes our analysis very simple – we get the number of unique users who are active and filter by Install Days <= 7, then divide that by the total number of daily active users like this:
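For readers who want to see the equivalent plain SQL, here is a rough sketch of the same ratio; the events table and column names are illustrative, and in Kubit you would simply filter on the Install Days virtual property instead:

-- A minimal sketch: share of daily active users who installed within 7 days.
-- "events", "install_date" and "event_date" are illustrative names.
SELECT
  event_date,
  COUNT(DISTINCT CASE WHEN datediff(day, install_date, event_date) <= 7
                      THEN user_id END) * 1.0
    / COUNT(DISTINCT user_id) AS pct_dau_installed_within_7_days
FROM events
GROUP BY event_date;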

2. Duplicate Data

Duplicate Data is always a pain to deal with, and in the context of product insights we usually see it in the form of duplicate events. You can already leverage Kubit’s zero-ETL integration to do as much data scrubbing as you need; the results of your work will be immediately available in Kubit without any extra effort required. However, we often get asked to try and resolve some duplication on the fly – maybe the team who can fix the issue is overloaded, or a third party is responsible for generating the events – in either case, resolving the issue can take anywhere from a long time to never.

Again, a “virtual property” can come to the rescue: we can generate a virtual property, based on some criteria, on only one event of a set of duplicates so we can distinguish it from the rest. Let’s consider the following example – imagine we have four purchase events for the same user, all for the same purchase but at different timestamps:


user_id | event_name | purchase_id | event_date | purchase_amount
a7cb92df1c87c07fd | completed purchase | 20041876 | 2023-10-23 15:23:11 | $18.78
a7cb92df1c87c07fd | completed purchase | 20041876 | 2023-10-23 17:05:47 | $18.78
a7cb92df1c87c07fd | completed purchase | 20041876 | 2023-10-24 10:32:03 | $18.78
a7cb92df1c87c07fd | completed purchase | 20041876 | 2023-10-25 22:11:59 | $18.78

In this case, if we want to find the number of unique users who made a purchase, the duplication is not really a problem. But if we want to count the number of purchase events or aggregate the purchase_amounts, then our results will be way off.

How does Kubit fix this?

We can advise on the best solution, but one example is to assign a boolean property Deduped with a value of true to the first of a sequence of duplicate events. Kubit can easily select the first duplicate event in a time range using some SQL along these lines:

ROW_NUMBER() OVER (PARTITION BY user_id, purchase_id ORDER BY event_date ASC NULLS LAST) AS row_number
CASE WHEN row_number = 1 THEN TRUE ELSE FALSE END AS Deduped

And once we have the first event of the sequence we can assign the virtual property. So now we can aggregate without any adverse effects caused by the event duplication:
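For illustration, an aggregation that respects the Deduped flag might look roughly like this; the table name purchase_events_deduped is hypothetical and stands in for your events enriched with the virtual property:

-- A minimal sketch: counting purchases and summing revenue using only the
-- first event of each duplicate group. Names are illustrative.
SELECT
  COUNT(*)             AS purchase_events,
  SUM(purchase_amount) AS total_revenue
FROM purchase_events_deduped
WHERE event_name = 'completed purchase'
  AND deduped = TRUE;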

3. Ambiguous Data

What if 2 events in our dataset are easy to confuse with one another? Perhaps the naming is not ideal and people often make mistakes when they need to use them for a report. Let’s say we see 2 Signup events in Kubit – Sign Up and sign_up.

But what is the difference between the two? Maybe one is a front-end event and the other is a back-end event, but the names don’t reflect that. There is a quick fix you can make yourself in Kubit to make the difference between the two events much clearer. You can simply go to Dictionary -> Event and use Rename from the context menu for both events to give them more appropriate names, e.g. Sign Up (server) and Sign Up (client), along with a nice description:

4. Inconsistent Data

This is especially true for multi-platform apps. As soon as you start instrumenting on multiple platforms, discrepancies between the implementations will inevitably creep in from time to time, which can result in any of the following issues:

  • the same event comes back with a different name from one or more platforms
  • a property name is different on one or more platforms compared to the others
  • a property value differs between platforms

4.1 Same event, different name

Let’s say we have the same event coming back from different platforms in 3 different flavors – Favor, Favourites, and Favorites.

Such a situation can be extremely frustrating, as you now have to go talk to the multiple teams responsible for each instrumentation, align with their release schedules, prioritize the fix, and wait for it to go live so you can go back and finish your work. This is one of the reasons why we developed Virtual Events as a way to group and filter raw-level events to create new entities which have exactly the meaning we want them to have.

It’s super easy to create a Virtual Event: anywhere in Kubit where you have an Event Group and a Filter, you can save that combination like this:

And then the Virtual Event will simply appear in any event drop-down with the rest of the regular events, so you can use it for all types of reports:
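Conceptually, such a Virtual Event resolves to a simple filter over the raw event names at query time; a rough sketch, with an illustrative events table, would be:

-- A minimal sketch of what the Virtual Event resolves to at query time.
-- "events" is an illustrative table name; the three spellings come from the
-- example above.
SELECT COUNT(DISTINCT user_id) AS users_who_favorited
FROM events
WHERE event_name IN ('Favor', 'Favourites', 'Favorites');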

4.2 Property name mismatch

Let’s say we have a streaming platform and for all the streaming events we have a property called Stream Type. However, a typo was made when implementing the instrumentation on Android and the property is called Stream type instead. Now, for the purposes of our reports we want to treat these two as one and the same, so that our metrics don’t get skewed.

To fix this in the data warehouse properly we would need to:

  1. correct the Android instrumentation in a new app version
  2. go back in our historical data and fix the property name retrospectively

And we still haven’t solved the issue completely – what about all the people who are using older app versions and don’t have the instrumentation fix? They will keep generating data using the inconsistent property name. Turns out a simple typo will be causing us trouble for a long time in our reporting.

There are 2 solutions which Kubit can provide to help you work around such issues:

  1. You can create Named Filters using both property names and save them for reuse
  2. The Kubit team can easily configure Kubit to treat both properties as one and the same

Let’s explore option #1 with a similar pair of properties which are actually the same – Plan Type and PlanType. Whenever we want to filter by one of them, we actually need to filter by both in order to ensure our filter is applied correctly:

To help prevent mistakes, you can then save a Named Filter which others can re-use. This also saves time, since you don’t have to create the same filter over and over again:

Once the filter is saved you can use it anywhere in Kubit:
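Under the hood, a Named Filter like this boils down to an OR condition over both spellings of the property; a rough SQL sketch, with an illustrative events table and a hypothetical 'premium' value, would be:

-- A minimal sketch of the combined condition behind such a Named Filter.
-- "events" and the 'premium' value are illustrative; the two property names
-- come from the example above.
SELECT COUNT(DISTINCT user_id) AS premium_users
FROM events
WHERE "Plan Type" = 'premium'
   OR "PlanType"  = 'premium';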

4.3 Property value mismatch

This typically wreaks havoc in our report when we group by that property. For instance, a simple typo in a property value will lead to our report containing 2 groups of the same thing instead of 1 as in the example below:

To overcome issues like this on the spot you can use Kubit’s Binning feature:

Using the Value Binning option you can define custom Groups – in this case we want to merge Winback Campaign and Winback Campaing back into one group, and we want to leave Group Others turned off so all the other groups remain as they were:

Congratulations, you’ve successfully removed the extra group from your report:
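Conceptually, Value Binning applies something like the following CASE expression at query time, so the typo never has to be fixed in the warehouse; the events table and campaign_name column are illustrative:

-- A minimal sketch of what Value Binning does at query time.
-- "events" and "campaign_name" are illustrative names.
SELECT
  CASE WHEN campaign_name IN ('Winback Campaign', 'Winback Campaing')
       THEN 'Winback Campaign'
       ELSE campaign_name
  END AS campaign_group,
  COUNT(DISTINCT user_id) AS users
FROM events
GROUP BY 1;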

5. Too Much Data

What if our perfect data model contains more event types than we need for our analytical purposes? Or we have an event which is still noisy and in the process of being fixed, so we want to prevent people from relying on it in their reports?

The Dictionary feature in Kubit keeps track of all your terms – Events and Fields coming from the raw data and also concepts defined by you such as Measure and Cohort. Dictionary also allows you to easily disable an event, which means it will no longer be available for selection in any event drop-down in Kubit. All you have to do is go to Dictionary -> Event and then hit Disable from the context menu of the event you want to hide:

Note that in the case where you are dealing with a noisy event you can easily enable it once the underlying issues with the event generation have been resolved.

Outro

We just explored 5 ways to overcome common data quality issues in Kubit and get to your insights on time. The best part is that all of these solutions are dynamic and the mapping happens at runtime, so you can take action immediately. You don’t ever need to invest in complex ETL jobs to update and backfill data. This also gives you the ability to test hypotheses against real data and get live product insights.

At Kubit, we want our customers to have the best possible experience, so please, do let us know what else you would like to get from Kubit to tackle data quality issues!

Accurate vs. Directional: The Tradeoff that Product Leaders Need to Make… Or Do They?

One of the first questions I’ve seen asked in those big meetings, the ones we’ve all spent weeks or even months preparing for, is this: “Are these numbers accurate, or more directional?” 

You break into a cold sweat… 

I think they’re accurate?…

I think we used the same data from our main source of truth?… 

Then, you nod in full confidence. “These are more… directional… For now.”

What a relief. Weight lifted off. Now, the meeting can continue.

But something changed, without anyone saying a word. 

A giant shadow has fallen over the meeting. You realize that future facts and figures you present may be seen with that shadow hanging overhead. 

Result: you didn’t make the impact you’d hoped for.

Confidence now clouded by directionality

The tradeoff between numbers being accurate vs. being directional is an ongoing battle – and one that’s particularly challenging in Product Analytics, for reasons I’m going to explore in this article. 

But first, a quick reminder of what we mean when we say “accurate” vs. “directional.” 

“Accuracy” is when the information derived from your data can be confidently shared with internal and external stakeholders; it’s been “blessed” as delivered from your Single Source of Truth. 

“Directional,” by contrast, refers to information that’s considered “good enough” to validate or justify initial decisions. Generally, directional data is not good enough to form the basis of the final numbers you present. And it certainly should not be part of the measurable outcomes shared with your stakeholders.

Often, the outcome of the accuracy-vs.-directional struggle isn’t poor decision-making, but a lack of decision-making; if you’re unsure of the accuracy of your data, then the default is to not take any action based upon it. 

But, in Product Analytics, sometimes the worst thing you can do is nothing. Product Managers (PMs) are often expected to make decisions quickly – decisions that can have a major impact on revenue and user engagement.


Why accuracy vs. directionality is challenging in Product Analytics 

There are two reasons why Product Analytics is unique when it comes to accuracy-vs.-directionality.

First, Product has an ENORMOUS amount of data. I’m talking trillions of events per day in some cases. 

For many organizations, this data is monstrous, ever-changing, and non-standard. This means that fitting their product data into existing solutions for data governance becomes very challenging. Example: “Active User definition at Company A does not equal Active User definition at Company B.”

The second reason why the accuracy-vs.-directionality question is tricky in Product Analytics: Product people are relatively new to data being a core part of their day-to-day work, compared to teams like Business Intelligence (BI) or even Marketing – which have always been front-and-center in the data game. Building confidence in the data they use and making decisions off of it can be challenging for PMs, especially when it comes to the accuracy-vs.-directional tradeoff.

(Secret third reason… I work for a company that specializes in Product Analytics, Kubit… We all know what we are doing here, right?? 😂)

Weighted Scale Confusion

Traditionally, Product Managers shared data internally, and sometimes weren’t asked for data at all because it was too cumbersome to wrangle; however, the prophecy that Product would become the profit center is coming true across many industries. With that high visibility, Product’s key numbers–like Monthly Active Users (MAU), total downloads, activated users from free-to-paid, etc.–have risen to the highest level of importance, sitting alongside the dollars and cents. This is an amazing development.

But with higher visibility comes greater scrutiny. 

In this new Product-first world, the de facto mode must be accuracy. PMs and Data Engineers can no longer rely on directional metrics.


So, how do you make accuracy your de facto mode?

Collecting information is step one, and there are several methods that businesses use to accomplish this. Each has their own tradeoff and nuance (a topic that we can dive into in another blog), but let’s run through the highlights. 

Tools that have their own data collection methods tend to be inflexible, forcing you to conform your data into their schema. Maybe you’ve been farming the collection out to an auto-tracking tool, or you trust the logging done to monitor uptime as “events.” These collection methods can lead to data that’s potentially accurate, but that may be prone to misalignment with how YOUR business thinks about this data. 

The only way a company can fully understand where and how its product data came to be: first-party collection. But even first-party collection can be challenging! 

So what do you do?

You collect stuff, using whichever method you decide. You need this data to make decisions on your product bets, experiments, and growth strategies. My point of view aligns with what we’re seeing in the market today: an increasing awareness that product data must live inside an organization’s source-of-truth data warehouse.

Typically, data warehouses and BI reports are designed, governed, and maintained to uphold a single source of truth. If we want Product Analytics data to hold the weight it deserves, then it too must live up to this standard.

So… Once your Product Analytics data is stored in the data warehouse, you have the ability to access it via multiple solutions, and you’ve achieved accuracy nirvana…right? Not so fast.

Now, you have to decide how you want to analyze this data. Tools available in the Product Analytics space typically follow the same value proposition: “send us your data, and we’ll optimize it for you so you can run high-performance queries on complex user journeys.” This is great! Until… certain limitations arise. 

When your user base grows and events balloon, you have to pay a large bill or begin pruning that data via sampling or excluding events. Another problem: you’ve also created another “source of truth,” because the data has left the warehouse. When it breaks, who fixes it? Now, we’re creeping into directional territory…

Before you know it, you can see the directionality shadow encroaching into your next Product leadership meeting.

With next-gen tools, directional data can be a thing of the past

To avoid the accuracy-vs.-directionality tradeoff, a new generation of tools has emerged that leverages your data warehouse directly – no sampling or pruning needed, and no alternative “truth” sources created in silos. These new solutions provide insights using the existing data investments made by your organization, leveraging the cloud and removing the requirement to ship data outside your firewalls. The result of this warehouse-native approach: Product Managers who are enabled to work with fully accurate data.

Rachel Herrera leads Customer Success at Kubit, the first warehouse-native Product Analytics platform. Do you have thoughts on accuracy-vs.-directionality or on improving your product analytics workflows or infrastructure? Drop her a note at rachel.herrera@kubit.co.