Performance Tuning Techniques for Daily Company Work

✨ Story Time — “Why Is My ETL Slower Than Yesterday?”

Meera is a data engineer at a fast-growing company.

Yesterday, her ETL job completed in 12 minutes.
Today, it takes 25 minutes.

Her first thought:

“Did I change anything?”

Nothing had changed in her SQL. But the company had added more data, more users were running queries, and the warehouse size was no longer a good fit.

Meera realized Snowflake is fast, but only if you follow performance tuning best practices.

Here’s what she learned.


🧱 1️⃣ Optimize Warehouse Usage

  • Right-size warehouses: Small → Medium → Large based on query/data size
  • Enable Auto-Suspend (1–5 mins): Avoid paying for idle compute
  • Auto-Resume: the next query wakes the warehouse automatically, no manual resume needed
  • Multi-cluster warehouses: Only for high concurrency workloads

Example: Daily ETL pipeline with 200M rows → Medium warehouse with auto-suspend/resume is cost-efficient.
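
A minimal sketch of this setup in Snowflake SQL (the warehouse name ETL_WH and the 2-minute suspend window are illustrative):

```sql
-- Medium warehouse for the daily ETL; suspends when idle, resumes on demand.
CREATE WAREHOUSE IF NOT EXISTS ETL_WH
  WITH WAREHOUSE_SIZE      = 'MEDIUM'
       AUTO_SUSPEND        = 120    -- seconds of idle time before suspending
       AUTO_RESUME         = TRUE   -- next query wakes the warehouse automatically
       INITIALLY_SUSPENDED = TRUE;  -- no cost until the first query runs

-- Resize up for a one-off heavy backfill, then back down afterwards.
ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'LARGE';
ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'MEDIUM';
```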


2️⃣ Leverage Caching

Snowflake caches:

  • Result Cache: returns results of repeated identical queries instantly, using no compute
  • Metadata Cache: answers metadata-only queries (e.g., COUNT(*), MIN/MAX) without scanning data
  • Warehouse Cache: local disk cache on a running warehouse that speeds up repeated scans of large tables

Tips:

  • Run repeated workloads on the same active warehouse so its local cache stays warm
  • Avoid unnecessary warehouse resizing or suspension between queries, since both drop the warehouse cache
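
A small sketch showing the result and metadata caches in action (the sales table and its columns are assumed for illustration):

```sql
ALTER SESSION SET USE_CACHED_RESULT = TRUE;  -- result cache on (the default)

-- First run: scans the table on the warehouse.
SELECT region, SUM(amount) FROM sales GROUP BY region;

-- Identical re-run within 24 hours: served from the result cache,
-- consuming no warehouse compute at all.
SELECT region, SUM(amount) FROM sales GROUP BY region;

-- Metadata-only query: typically answered from micro-partition
-- metadata, with no table scan.
SELECT COUNT(*), MIN(order_date), MAX(order_date) FROM sales;
```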

3️⃣ Use Clustering Keys Smartly

  • Clustering improves micro-partition pruning for large tables
  • Helps queries filter efficiently
  • Avoid over-clustering — increases maintenance cost

Example: CUSTOMER table clustered by REGION → queries for a single region read fewer micro-partitions.
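
A hedged sketch of that CUSTOMER example (assuming REGION, CUSTOMER_ID, and NAME columns):

```sql
-- Define the clustering key; Snowflake reclusters in the background.
ALTER TABLE customer CLUSTER BY (region);

-- Check how well the table is clustered on that key.
SELECT SYSTEM$CLUSTERING_INFORMATION('CUSTOMER', '(REGION)');

-- Pruning-friendly query: only micro-partitions whose REGION min/max
-- range can contain 'EMEA' are scanned.
SELECT customer_id, name FROM customer WHERE region = 'EMEA';
```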


4️⃣ Query Optimization Techniques

  • Select only needed columns → reduces scan bytes
  • Filter early → Snowflake pushes WHERE filters for pruning
  • Avoid SELECT * for large tables
  • Use CTEs carefully → materialize large intermediate results only when needed
  • Use the Query Profile to find bottlenecks
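
A before/after sketch (the orders table and its columns are hypothetical):

```sql
-- Before: scans every column of every micro-partition.
SELECT * FROM orders;

-- After: project only the needed columns and filter early,
-- so Snowflake prunes partitions and scans far fewer bytes.
SELECT order_id, customer_id, amount
FROM   orders
WHERE  order_date >= '2024-01-01'
  AND  order_date <  '2024-02-01';

-- Inspect recent queries from SQL to spot expensive ones.
SELECT query_id, bytes_scanned, rows_produced
FROM   TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
ORDER  BY start_time DESC
LIMIT  5;
```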

5️⃣ Optimize Joins

  • Let Snowflake broadcast small tables in joins (the optimizer does this automatically when one side is small)
  • Avoid cross joins unless truly necessary
  • Push filters before the join → reduces join input size
  • Check the Query Profile → identify slow join nodes

Real example: Joining 50GB SALES with 2MB CUSTOMER → broadcast join reduces runtime from 8 mins → 20 secs.
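
A sketch of the pattern (table and column names assumed; since Snowflake picks the broadcast strategy itself, the levers you control are filtering early and joining on an explicit key):

```sql
WITH recent_sales AS (
    SELECT customer_id, amount
    FROM   sales
    WHERE  sale_date >= DATEADD(day, -7, CURRENT_DATE)  -- filter before the join
)
SELECT c.region, SUM(s.amount) AS total_amount
FROM   recent_sales s
JOIN   customer c
  ON   s.customer_id = c.customer_id  -- explicit key: no accidental cross join
GROUP  BY c.region;
```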


6️⃣ Partitioning & Micro-Partition Awareness

  • Snowflake automatically stores data in micro-partitions (50–500 MB of uncompressed data)
  • Design filters to exploit the min/max metadata Snowflake keeps per micro-partition
  • Avoid full table scans for large tables when filters exist

Tip: Use date or ID filters for high selectivity → fewer partitions scanned → faster queries.
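
A sketch of checking pruning effectiveness (the orders query is illustrative; the column names come from the standard ACCOUNT_USAGE.QUERY_HISTORY view):

```sql
-- High-selectivity date filter: min/max metadata skips non-matching partitions.
SELECT order_id, amount
FROM   orders
WHERE  order_date = '2024-06-15';

-- Compare partitions scanned vs total for recent queries.
SELECT query_text, partitions_scanned, partitions_total, bytes_scanned
FROM   snowflake.account_usage.query_history
ORDER  BY start_time DESC
LIMIT  5;
```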


7️⃣ Monitor & Tune Regularly

  • Use the Query Profile: identify slow nodes and bottlenecks
  • Check warehouse utilization: avoid over- or under-sized clusters
  • Analyze bytes scanned vs rows returned: a large gap usually means poor pruning or missing filters
  • Keep ETL and BI dashboards aligned with warehouse performance
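
A monitoring sketch using the standard ACCOUNT_USAGE.QUERY_HISTORY view (which lags real time by up to about 45 minutes):

```sql
-- Yesterday's ten most expensive queries by bytes scanned.
SELECT query_id,
       warehouse_name,
       total_elapsed_time / 1000 AS elapsed_seconds,  -- column is in milliseconds
       bytes_scanned,
       rows_produced
FROM   snowflake.account_usage.query_history
WHERE  start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP)
ORDER  BY bytes_scanned DESC
LIMIT  10;
```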

🧪 Real-World Story — Meera Fixes Slow ETL

Problem:

  • ETL reads 200M rows → takes 25 mins

Analysis:

  1. Warehouse: Medium → OK
  2. Query scanned 90% of table → filter not selective
  3. Join with CUSTOMER table was missing its join key → an accidental cross join instead of an efficient broadcast join
  4. Query did SELECT * → unnecessary columns

Fix:

  • Filter pushed early
  • Broadcast join applied
  • Selected only necessary columns
  • Auto-suspend enabled

Result: Runtime reduced to 9 minutes, cost-efficient and reliable.
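
A condensed sketch of the fixed query (table and column names assumed for illustration):

```sql
-- Before (illustrative): SELECT * FROM sales, customer;  -- accidental cross join
SELECT s.order_id, s.amount, c.region                     -- only needed columns
FROM   sales s
JOIN   customer c ON s.customer_id = c.customer_id        -- proper join key
WHERE  s.sale_date = CURRENT_DATE - 1;                    -- selective filter first
```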


💡 Key Takeaways

  • Right-size warehouses → cost & speed balance
  • Leverage caching → repeated queries run faster
  • Apply clustering only when necessary
  • Optimize queries → select needed columns, filter early
  • Monitor Query Profile & adjust joins
  • Be aware of micro-partitions → design filters that benefit from pruning
  • Review performance regularly

Performance tuning isn’t one-time — it’s an ongoing practice.


📘 Summary

Snowflake performance tuning for daily company work involves:

  1. Warehouse sizing & auto-suspend/resume
  2. Smart caching usage
  3. Efficient clustering & pruning
  4. Query optimization & filtering
  5. Join strategy tuning
  6. Micro-partition awareness
  7. Continuous monitoring & adjustment

By combining these techniques, data engineers like Meera can keep ETL, dashboards, and queries fast, reliable, and cost-effective.


👉 Next Topic

Handling Semi-Structured Data (JSON, XML, Avro)