2 posts tagged with "Snowflake"

Why Your Data Kitchen Needs Separate Stations

· 6 min read

The case for splitting Analytics from Operations

Introduction: How I Learned to Separate My Kitchens

After 30 years as a Data Architect, I’ve seen the same scenario play out countless times. Companies, particularly those in the SaaS space, come to me because they’re struggling to scale their analytics. They’ve built impressive systems, but as their data needs grow, so do their problems. Often, the root cause is the same: they’re trying to analyze data directly from their operational stores.

Imagine trying to build detailed reports for hundreds of clients, each with their own unique needs and schedules, all while relying on the same operational system that’s supposed to keep your business running smoothly. It’s a recipe for disaster. Whether they’re using a relational SQL store or a memory-hungry NoSQL solution, these companies are burning resources and slowing down operations.

What’s more, the market often pushes the idea that copying data is inherently bad and that accessing data directly from wherever it resides is faster and more efficient. But from my experience, this approach can lead to bloated systems, sluggish performance, and skyrocketing costs. The truth is, copying data isn’t bad, as long as it’s done thoughtfully, with a focus on directionality, master versus reference data, and lifecycle management.

An Analogy:

Imagine you're running a restaurant kitchen. You have chefs preparing food for hungry customers, and every second counts. But what if, in the middle of dinner service, a food critic walks in and asks for detailed recipes, nutritional information, and a history of every dish you've ever served?

If the same chefs handling customer orders had to stop and dig through old recipe books, calculate nutrition facts, and compile a history of dishes, the kitchen would grind to a halt. Orders would get delayed, customers would be unhappy, and the overall dining experience would suffer.

This is what happens when you don't separate analytics from your operational data store.

The Operational Data Store (ODS): The Cooking Station

In our kitchen analogy, the Operational Data Store (ODS) is like the cooking station. It's where all the action happens: data is created, updated, and used in real time, just as chefs are constantly chopping, sautéing, and plating dishes for customers. The ODS is optimized for speed and efficiency, designed to handle high volumes of transactions quickly and reliably.

Analytics: The Food Critic's Desk

Analytics, on the other hand, is like the food critic's desk. It's where deeper insights are drawn, trends are analyzed, and strategic decisions are made. This process requires access to large amounts of historical data, complex calculations, and the ability to look at the data from various angles, similar to how a critic might assess a dish from taste, presentation, and nutritional value perspectives.

Why They Should Be Separate

If you were to try and handle both cooking and food critics' demands from the same station, you'd overwhelm your chefs, slow down service, and ultimately hurt your restaurant’s performance. The same thing happens in your data architecture if you try to process analytics directly from your ODS:

Performance Hit: Just like chefs can’t keep up with orders if they’re constantly interrupted, your operational systems slow down when bogged down by complex queries.

Data Contamination: Mixing operational and analytical processes can lead to inconsistencies in data. It’s like a chef accidentally using salt instead of sugar: small errors can have big impacts.

Scalability Issues: As your restaurant (or business) grows, the demands on both your ODS and analytics will increase. Keeping them separate allows each to scale according to its own needs.

The Solution: A Separate Analytics Kitchen

In a well-run restaurant, there’s a separate space where food critics are entertained. They have access to all the data they need, but they don’t interfere with the daily operations. Similarly, in data architecture, we separate the analytics environment from the operational data store.

Data Warehouses: These are specialized kitchens designed for analytics. They store historical data in a specific way to support complex queries, and are optimized for analysis without affecting operational performance.

ETL Processes and Data Pipelines: Just like how the best ingredients are carefully selected and prepared before they reach the kitchen, data is extracted, transformed, and loaded (ETL) from the ODS into the data warehouse. This ensures that the data used for analysis is clean, consistent, and ready for deep dives.
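The extract, transform, load flow described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the two in-memory SQLite databases stand in for the ODS and the warehouse, and the `orders` and `daily_revenue` tables, their columns, and the sample rows are all hypothetical names invented for the example.

```python
import sqlite3


def build_demo_ods():
    """Create a stand-in operational store with a few hypothetical orders."""
    ods = sqlite3.connect(":memory:")
    ods.execute(
        "CREATE TABLE orders (client_id TEXT, order_date TEXT, amount_cents INTEGER)"
    )
    ods.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("Acme ", "2024-01-01", 1000),  # messy client id, cleaned in transform
            ("acme", "2024-01-01", 2500),
            ("globex", "2024-01-01", 400),
            ("acme", "2024-01-02", 700),
        ],
    )
    ods.commit()
    return ods


def run_etl(ods, warehouse):
    """Copy data in one direction only: ODS -> warehouse."""
    # Extract: a single simple read against the operational store.
    rows = ods.execute(
        "SELECT client_id, order_date, amount_cents FROM orders"
    ).fetchall()

    # Transform: normalize client ids and aggregate to daily revenue per client.
    daily = {}
    for client_id, order_date, amount_cents in rows:
        key = (client_id.strip().lower(), order_date)
        daily[key] = daily.get(key, 0) + amount_cents

    # Load: write the analysis-ready table; INSERT OR REPLACE keeps re-runs idempotent.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS daily_revenue ("
        "client_id TEXT, order_date TEXT, revenue_cents INTEGER, "
        "PRIMARY KEY (client_id, order_date))"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO daily_revenue VALUES (?, ?, ?)",
        [(c, d, total) for (c, d), total in sorted(daily.items())],
    )
    warehouse.commit()
    return len(daily)


ods = build_demo_ods()
warehouse = sqlite3.connect(":memory:")
loaded = run_etl(ods, warehouse)

# Reporting queries now hit the warehouse, never the operational store.
report = warehouse.execute(
    "SELECT client_id, order_date, revenue_cents FROM daily_revenue "
    "ORDER BY client_id, order_date"
).fetchall()
print(loaded, report)
```

The point of the design is directionality: the operational store is read once per run with a cheap query, and every heavy analytical question is answered from the copy.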

Conclusion: Keep Your Kitchens Separate!

By keeping your operational and analytical processes separate, you ensure that both run smoothly: customers get their food on time, and critics get the detailed information they need without disrupting the flow. In the world of data architecture, this means faster operations, more accurate insights, and a more scalable and resilient system.

And on that note... I think I'm in the mood to cook. Maybe some shrimp in a roasted red-pepper and white-wine cream sauce on delicious homemade pasta... I like quality in my cooking, almost as much as I like it in my data!

About TrillaBit

TrillaBit is an analytics and business intelligence (BI) software company founded in 2022 and headquartered in Toronto, Canada. The company provides a no-code, search-driven, low-cost analytics cloud platform tailored for B2B SaaS providers. TrillaBit's platform, Quick Intelligence, delivers fast, easy, and secure access to data, allowing product owners and business users to create dashboards and visualizations without developer assistance.

The TrillaBit platform is a hosted, highly dynamic metadata engine that points to client data stores and automatically generates queries from tokens configured in search (also known as search-driven analytics). The platform creates smart visualizations, then allows users to refine the results. This ease of use and exploration allows end users to quickly dive into the data and create their own dashboards to derive further insights instantly, all without relying on developers.

Quick Intelligence is designed to handle complex security scenarios, including multi-tenant environments, and supports a wide range of data sources. TrillaBit emphasizes data monetization through visualization, making data easily accessible and actionable for its users. The platform is scalable and affordable, designed to meet the needs of both small and large enterprises. For more information about TrillaBit, please visit www.TrillaBit.com or email [email protected]

Snowflake is here

· 3 min read

WE’RE RUNNING ON SNOWFLAKE!

How it was

So we’ve been doing this for a while. Decades, actually. Early on I was running denormalized dimensional models on row-based databases and squeezing every ounce of performance out of them that I could, pre-aggregating years of transactional data and preprocessing as much as possible to get that fast end-user experience. Of course, this wasn’t highly efficient or cost-effective, and we couldn’t easily or dynamically get back down to fine-grained data.

Back then we pored over every detail down to the actual hardware, I/O, memory and so on (it mattered what the disk controller was and how our RAID array was configured). We kept processing as close to the data as possible, basically because we had to in order to get any kind of real performance. Then a world of change happened: distributed processing and columnar stores. Eventually columnar stores became the de facto standard for analytics. This makes a lot of sense. Columnar storage is more aligned with how data is read for analytics, reduces I/O through higher data compression rates, and lends itself better to distributed processing.
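To make the row-versus-column point concrete, here is a toy sketch. It assumes a tiny hypothetical `orders` dataset held in plain Python structures and counts "values read" as a rough stand-in for I/O; real engines are far more sophisticated, and the run-length encoding shown is just one simple compression scheme that columnar layouts tend to benefit from.

```python
# Hypothetical sample data: six orders with three fields each.
rows = [
    {"order_id": 1, "region": "east", "amount": 120},
    {"order_id": 2, "region": "east", "amount": 80},
    {"order_id": 3, "region": "east", "amount": 50},
    {"order_id": 4, "region": "west", "amount": 200},
    {"order_id": 5, "region": "west", "amount": 30},
    {"order_id": 6, "region": "west", "amount": 20},
]


def sum_amount_row_store(records):
    """Row layout: each record is stored together, so every field gets read."""
    total = 0
    values_read = 0
    for record in records:
        values_read += len(record)  # the whole record is read, not just "amount"
        total += record["amount"]
    return total, values_read


def to_columns(records):
    """Pivot the records into one list per field (a columnar layout)."""
    return {name: [record[name] for record in records] for name in records[0]}


def sum_amount_column_store(columns):
    """Column layout: only the one column the query needs is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)
row_total, row_values_read = sum_amount_row_store(rows)
col_total, col_values_read = sum_amount_column_store(columns)
region_encoded = run_length_encode(columns["region"])

print(row_total, row_values_read)   # same answer, three fields read per record
print(col_total, col_values_read)   # same answer, one field read per record
print(region_encoded)               # similar values sit together and compress well
```

Both layouts give the same total, but the columnar scan touches a third of the values here, and the gap widens as tables get wider.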

Then came big data - and with it columnar file formats from the Hadoop ecosystem, like Parquet and ORC. The cloud became a bigger thing and data lakes were the way to go. But they weren’t something that was prepackaged for you like the databases of yore. You had to build them almost from scratch, and it wasn’t easy. Your query engine was separate from your index store and separate from your data storage. You needed to handle writing with integrity on failure. Understanding Hadoop was a big thing, and tools felt like Lego bricks you needed to click together in just the right way.

A better way

With the arrival of Snowflake things changed for the better again. It handled so much for you, providing that ‘database engine feel’ on big data infrastructure. Beautiful! Because of this, Snowflake became popular in no time. It grew like crazy and became a desired tech because it made modern approaches more accessible.

What made big data expensive was resourcing, both human and hardware. Snowflake helped cut those costs by abstracting the complexity of big data ecosystems away from developers and letting you just write basic SQL. Instead of spinning up and managing a farm of machines to process everything, you let Snowflake handle the hardware side of the lake. The separation of storage and compute allowed you to minimize your data footprint and maximize your processing elasticity. Running only the resources a particular process required, for a limited period of time, lowered your cost (and the headache of dealing with node failure was gone too). Your developers could focus on implementing business needs rather than constantly maintaining and enhancing the big data cluster.
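The cost argument for elasticity comes down to simple arithmetic. The sketch below uses made-up numbers purely for illustration: the hourly rate, the 30-day month, and the two-hour nightly job are all assumptions, not Snowflake pricing.

```python
def always_on_cost(rate_per_hour, hours_in_period):
    """A fixed cluster is paid for every hour, whether or not it is busy."""
    return rate_per_hour * hours_in_period


def elastic_cost(rate_per_hour, job_hours):
    """Elastic compute is paid for only while each job is actually running."""
    return rate_per_hour * sum(job_hours)


RATE = 4                   # hypothetical price units per compute hour
HOURS_IN_MONTH = 24 * 30   # assume a 30-day month
nightly_jobs = [2] * 30    # assume one 2-hour load every night

fixed = always_on_cost(RATE, HOURS_IN_MONTH)
elastic = elastic_cost(RATE, nightly_jobs)
print(fixed, elastic)
```

Under these assumptions the fixed cluster is billed for 720 hours and the elastic one for 60, which is the effect described above: compute scales with the work, while storage is billed separately.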

Our Value Add

Enter the next step in capabilities and simplification: TrillaBit. We created a smart low-code analytics layer that dynamically runs on top of highly performant and efficient analytic processing platforms like Snowflake. We help you leverage your Snowflake investment further by letting users drill into and explore data at their whim, without knowing SQL or other underlying tech.

TrillaBit can now simply point to one or more instances of Snowflake, locally or globally, to provide self-service analytics to users in the most efficient and cost-effective way.

Thanks,

Keith

“We are the music makers and we are the dreamers of dreams” ~ Willy Wonka
