Ramblings about data communities and your contributions, no excuses

I have been active in the data community throughout my career. I have met people and made friends in the process. As I look back on it, I am thankful I was involved and participated. I firmly believe you should as well.

Contents

  • The value of data communities
  • Why you should be a contributor
  • Write about it
  • Talk about it
  • Are you ready to ramble?

The value of data communities

I want to kick off this section with my own experience with community, then delve into the value of being involved in the data community.

A little history

When I started my consulting career in SQL Server nearly 25 years ago, I was a newbie. I came from a background in Microsoft Access. At some point, I got connected with some other SQL Server professionals in Minnesota. We decided to create a user group – Minnesota SQL Server User Group. Eventually, we joined PASS.

This was my first experience in the data community. I made many friends in this community. We supported each other’s skills and career growth. We helped each other with technical issues and shared technical wins.

The value of community is community

You should definitely participate in communities. The best option is to take the time, usually once a month, to engage in community. Meeting in person is preferred because you can focus on the people and the topic. Talk to each other. If you join a virtual group, participate! Don’t do something else during the meeting. Engage in the comments, Q&A, and any banter. While virtual group meetings are more convenient, the onus is on you to interact. If there is no option to interact with each other, it is a webinar, not a user group. Find a user group.

The value of the data community is the community. Yes, we can and will learn from each other. Knowing and networking with peers leads to more growth and maturity as a person and a professional.

Why you should be a contributor

Simply put, if everyone is a consumer, the community dies. The community is not intended to be a school with a couple of teachers and a bunch of students. In a true community, we are all contributors. Every user group has consumers and contributors. To be clear, not everyone who contributes leads the group or gives talks. Some ask questions, others stick around for discussions, and some extend a hand to welcome others. Consumers come, listen, and leave. Introversion is not a good excuse. Some of the best contributors I know are introverts.

Storytelling

We all have a story to tell, from “it’s all new to me” to “I have been doing this for 20 years.” Everyone can give back to the group through questions, advice, inclusion, and insights. The key is proactively engaging and including others. In this way, we build friendships, grow careers, and expand horizons. You miss out on all of this if you only consume.

I am going to expand on two specific types of contributions in the remainder of this post – writing and talking. These are two ways to tangibly contribute to the community.

Write about it

When you write it down, you will remember it more. Writing for others forces clarity, accuracy, and defensibility. I call this “writing with CAD.” When we write to share with the community, we are compelled to write this way.

Clarity

Writing for others forces us to be clear about the topic. We need to organize our thoughts and write with purpose. We have to ask ourselves what we are writing about and whether it makes sense. I think this includes good editing. I highly recommend using tools that check grammar and spelling, like Microsoft Editor in Office and Edge. Be careful using tools like Grammarly and Copilot: use AI to clarify thoughts, not to generate them.

Accuracy

Accuracy is especially important in technical writing. We should never assume our readers can fill in the blanks. If you’re writing a step-by-step blog, make sure you have all the steps. Be sure to include any context or assumptions. While we cannot guarantee that we didn’t miss anything, we should do our best to be precise so our readers can reproduce, practice, or implement what we are writing about.

Defensibility

Are you prepared to defend what you are writing about? I don’t mean this negatively. You should be able to explain why you did it that way or why you think you are correct. Sometimes this is part of the comment or post. Other times you just need to be prepared to answer questions. Defensibility is about being prepared.

To be clear, defensibility does not mean you are always right. Be prepared to hear new ideas and accept corrections. You can’t and won’t know everything. But it is important to know your “why.”

My motivation to write

When I started to blog, I made the mistake of writing for others. I made a decision early on to change the goal of my content. No longer would I try to write about what I think people would read. I decided to write for me. My technical writing became my personal knowledge base. If no one reads it, oh well. It was the best decision I made, and I recommend it for new bloggers all the time.

One great example from my writing is my series on Excel.  I was embedding Excel workbooks into SharePoint. The workbooks were backed by SQL Server Analysis Services cubes. The goal was to make elegant dashboards without looking like Excel. This tips and tricks series has many of my most read posts and some are still being viewed today. I wrote them as a reference for me, and others still find them helpful.

Next, I want to look at some ways you can contribute to the community through writing. It’s not just about blogging.

Where to write

If you’re interested in starting to write, here are some good options. If you’re already writing, maybe try something new.

Blogging

This is where I started in 2010. However, it’s not necessarily the easiest option. There are many decisions to be made to launch a blog.

  • Where to host it? I use and like WordPress. We are still using the free version. I have seen blogs on Medium lately. Check out this article on the best options for free blogging platforms.
  • What to call it? You can be creative here.
  • Deciding on your first article.
  • Where to promote it? LinkedIn, Facebook, X?

One reason I like blogging is that I own and control my content. I can point people to my blog, and it is clearly my work.

LinkedIn articles

LinkedIn articles can be a nice way to start writing. You can also promote a newsletter for people to subscribe to. It comes with a built-in promotional platform. You are on LinkedIn after all.

Data on Rails

Not sure where to start? We have a shared blog site where you can write a couple of posts to see if you like blog writing. We will promote your work as well.

If you like it and want to start your own blog, great! We encourage you to take your content to help kick it off. You are always welcome to keep writing here if you prefer to.

Commenting

The last area I want to cover is commenting. This is a great way to piggyback onto topics, content, or questions posted by others. You can easily share your thoughts, insights, and stories with others. Some of these can serve as prompts for your own content. Here are a few options for joining the conversation.

  • LinkedIn
  • Reddit
  • Microsoft forums

Talk about it

Most technologists are afraid of public speaking. Even now some of you are getting queasy just thinking about it. But talking about what you know and what you are learning is a great way to give back to the community. Speaking on a topic requires you to succinctly describe what you are talking about.

I have used speaking opportunities to share my experience with a product, pattern, and code. What I have found is that I always have gaps to fill in about the topic. So, I learn more than I knew before I started.

I know getting started can be hard. Here is a pattern that may be helpful.

  1. Choose
  2. Prepare
  3. Practice
  4. Present
  5. Improve

Let’s break these down.

Choose

When you start out, choose something you are familiar with and think is cool. You should be excited and comfortable with your topic. Also, try to be concise. A narrow topic is easier to prepare for.

Prepare

This is where most people get stuck. They start out with the wrong questions. What do I need? A presentation? A demo? Wrong questions. What you really need is an outline. An outline will help you stay focused. Start by identifying three points (no more, no less) to make about your topic. I would recommend writing them down.

Once you have them ready, fill in the blanks. What do you want to say about each point? Do you have sample code? A picture of a whiteboard? Lessons learned? Compile these into a document. Now you can expand your outline. It could look something like this:

  • Topic: working with window functions in SQL Server
  • What are window functions
    • What they do
    • Why I needed them
  • How to build one
    • Sample code
    • Define key functions
      • Partition
      • Over
      • Order by
  • Aggregation
    • Sample code of my use case

The presentation

Now you can build a presentation. You should start a slide deck with the following slides.

  1. Title. Includes topic and your name.
  2. Who you are. Name, role, something interesting about you.
  3. Introduction. Topic, why the topic interests you.
  4. First point.
  5. Second point.
  6. Third point.
  7. Lessons learned. How did it help you? Or a quick summary about how to use it.
  8. Thank you. Q & A, references.

You may need extra slides for some of your points, especially if you have sample code or diagrams. Don’t add too many extra slides. Remember that the slides support what you are talking about. If you want the audience to read something, use a printed document. DO NOT JUST READ YOUR SLIDES!

Practice

People practice in different ways. You should try a couple to find what works best for you. Whatever works for you is the right way for you to prepare. Here are some examples that you could use:

  • Practice in front of a mirror
  • Use PowerPoint’s timing feature
  • Run through it with a friend
  • Do a dry run with an experienced speaker
  • Rehearse in your head

Present

You get to do your presentation. Exciting!

Improve

After your presentation, be critical of yourself, in a good way. We can always improve. If the event has reviews, use them to make improvements. Don’t try to change everything; focus on one or two things.

Demos are risky. If something goes wrong, have a backup ready; you don’t want to troubleshoot your demo live. I always had slides with screenshots ready to go. Everyone has demos fail, so I don’t recommend demos for your first presentation.

Where to speak

Next, we will look at some good opportunities for speaking.

User groups

User groups are a great opportunity to speak to a friendly audience.

Lunch and learn

Lunch and learns are an informal way to get comfortable talking about your topic. Usually, you do these with your peers at work or with a client.

Small conferences

SQL Saturdays, Days of Data, and Data Saturdays are examples of small conferences. These are the next step after user groups.

Calls for presenters

When you are ready to stretch your wings, open calls for presenters are where you can find many more opportunities to present.

Career growth

While community involvement benefits the community, it also benefits your career. Not only can you build up your resume, but you can also build up your professional network.

Are you ready to ramble?

Well, if you made it this far, I hope I have inspired you to get involved. Many people have started out small and grown their professional careers through these activities. Everyone can contribute, even you. Let us know what you do in the comments. We would love to hear from you.

Ramblings of a retired data architect

Let me start by saying that I have been working with data for over thirty years. I think that just means I am old. I thought it would be fun to discuss my thoughts on various topics. This is my take and some of my thoughts are definitely “tongue in cheek.” So, enjoy the ride and feel free to share your take in the comments.

SQL Saturday Baton Rouge 2025

SQL Saturday Baton Rouge is a great event hosted on LSU’s campus that has some of the best food, presenters, and attendees in the country! I had some incredible questions in my session and learned a ton throughout the conference. Thank you to all the sponsors and volunteers who make this event happen year after year (16 years running!). It’s one of the first events I ever spoke at (https://dataonwheels.wordpress.com/2022/08/23/sql-saturday-baton-rogue-recap/), and I’m so excited I was able to speak here again!

Here’s a link to the event where you can find other incredible speakers and sessions! Pictured below are some of my SML (Saturday Morning Learning) buddies – Dominick Raimato, Jason Romans, Kevin Pereira, and me! Thanks Sheila Romans for the great picture of us nerds!

My session – Mastering Microsoft Fabric Data Warehouses Tips & Tricks

Thanks to everyone who came out to my session and came with me on a journey to kill Pippin’s long-running query, create case insensitive warehouses, and monitor our warehouse and lakehouse queries on a single pane of glass. You guys were such an engaged audience and I loved being able to laugh and commiserate about our SQL experiences together!

You can find the presentation and resources here – https://github.com/Anytsirk12/DataOnWheels/tree/main/2025%20Presentations/2025%20SQL%20Saturday%20Baton%20Rouge

Jason Romans – How to Diagnose a Slow Power BI Report

Jason gave an awesome session about how to use external tools to diagnose, and even proactively fix, a slow Power BI report. Check out his session live if you get a chance; lots of great slides and demos! Below is the abstract from his session. Be sure to follow the DAX Shepherd.
https://thedaxshepherd.com/presentations/

Users complain that a Power BI report is so slow that it is not worth using. A Power BI report may not perform well for many reasons, but how do you know where to start? In this session, I will show you some of the main tools I use to dive deep into the common and the not-so-common performance issues I run into. Starting from a slow report, we will learn how to diagnose where the problem is. Beginning with Performance Analyzer in Power BI, we can start to figure out where the issue is. Is it the visual that is slow, or are the DAX queries taking too long?

Next, we will use DAX Studio to tune the DAX queries that are being generated. Along the way, we will look beneath the visuals to see how Power BI is architected so we can better understand how to write better code from the start. We will examine Query Plans and Server Timings to evaluate which of our changes are more beneficial.

Lastly, we will examine methods for detecting and preventing performance issues before they occur. Using Tabular Editor, we will learn how Best Practice Analyzer can detect performance issues with your semantic models.

Kevin Feasel – Applying Forensic Accounting Techniques using SQL and Python

Below are my stream-of-consciousness notes from this session; it was a ton of fun to think about forensically investigating a SQL environment using Python. The session was packed with great examples, and it was an absolute blast to play detective. Kevin runs Curated SQL, which is an awesome site for a “curated” list of recent blogs with incredible insights from across the data community.
https://curatedsql.com/

Forensic accounting deals primarily with investigation and litigation. The goal is to connect the dots on potential fraud in a way that will stand up in court.

Step 1 – Summary Analysis

  1. Find the largest table in terms of records or size
  2. For that largest table, review its data types and keys
  3. Look at the foreign keys off that largest table and repeat the same analysis

You end up with a database model, which gives you a high-level view of the database.
There will be some Python notebooks available in the future that will walk through how to explore a sample database.

Pandas has a method, df.describe(include="all"), that will give you stats on a DataFrame. It is a pretty neat way to easily get some info, especially on numeric values. Note – you should convert the date fields to actual datetimes within the DataFrame (they likely default to objects); then you’ll get similar stats for the date columns.
Pandas also has good ways to easily group, count, and sort (you can also do this in SQL) so you can easily see what the top artifacts are within certain categories. One example is the number of vendors within certain expense categories. For example, maybe only one vendor is selling cleaning supplies. There may be a reason for it, but in an auditing scenario, that is a good place to start looking for preferential treatment and fraud.
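The describe-then-group workflow above can be sketched in a few lines. The expense data, column names, and vendors here are entirely made up for illustration:

```python
import pandas as pd

# Hypothetical expense records (all names and amounts are invented)
df = pd.DataFrame({
    "vendor": ["Acme", "Acme", "CleanCo", "BudgetCo", "CleanCo", "Widgets Inc"],
    "category": ["Supplies", "Travel", "Cleaning", "Supplies", "Cleaning", "Hardware"],
    "amount": [120.00, 560.50, 89.99, 240.00, 95.50, 1500.00],
    "invoice_date": ["2024-01-05", "2024-02-11", "2024-02-20",
                     "2024-03-02", "2024-03-15", "2024-04-01"],
})

# Convert date strings to real datetimes so describe() reports date stats too
df["invoice_date"] = pd.to_datetime(df["invoice_date"])

# Quick summary statistics for every column, numeric and otherwise
print(df.describe(include="all"))

# How many distinct vendors serve each expense category?
vendors_per_category = (
    df.groupby("category")["vendor"].nunique().sort_values(ascending=False)
)
print(vendors_per_category)
```

A category served by exactly one vendor (here, "Cleaning") is the kind of result the session flagged as a starting point for a closer look.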

Step 2 – Growth Analysis

This looks at changes in ratios over time. The goal is to look for major changes that appear to be far outside the norm. Typically this is performed against important additive measures like revenue, cost, number of customers, number of orders, and even active users as a percent of total users (a “fixed cap” ratio).

We can perform an eyeball test at this phase. We can also do more comprehensive analysis using KS and PPS tests. The Seaborn library in Python lets you easily create line plots and other visuals to see how additive measures perform over time, which allows you to investigate and gut-check growth. This can be particularly interesting if you expect a linear increase or decrease but see something more random.
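As a hedged sketch of the "comprehensive" side, SciPy's two-sample Kolmogorov-Smirnov test can compare one period's distribution against another's. The revenue figures below are made up; in practice you would pull two slices of a real additive measure:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)

# Hypothetical daily revenue for two years; year 2 drifts noticeably higher
year1 = rng.normal(loc=10_000, scale=1_500, size=365)
year2 = rng.normal(loc=13_000, scale=1_500, size=365)

# The KS test asks: could these two samples come from the same distribution?
stat, p_value = ks_2samp(year1, year2)
print(f"KS statistic={stat:.3f}, p-value={p_value:.2e}")
```

A tiny p-value does not prove fraud; it just flags a distribution shift worth explaining.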

MinMaxScaler in scikit-learn’s preprocessing module can rescale everything to a range from 0 to 1, letting you really compare the growth of two very differently scaled measures.
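A minimal sketch of that rescaling, using invented monthly figures for two measures on very different scales:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical monthly revenue vs. order counts (very different scales)
revenue = np.array([[120_000.0], [150_000.0], [180_000.0], [300_000.0]])
orders = np.array([[310.0], [335.0], [360.0], [650.0]])

scaler = MinMaxScaler()  # rescales each column to the range [0, 1]
revenue_scaled = scaler.fit_transform(revenue)
orders_scaled = scaler.fit_transform(orders)

# After scaling, both series run from 0 to 1, so their growth shapes
# can be plotted and compared directly on the same chart
print(revenue_scaled.ravel())
print(orders_scaled.ravel())
```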

Gap analysis – sometimes the lack of a value can be more interesting. There are two meanings: a review of controls and where things can go wrong, and what data is missing.
Controls – do people have access to edit logs? Who can do what? Does data change, and do we know when it does? Are we tracking failures? Double-check calculations on reports.
Gaps and islands – gaps in data. Gaps are elements in a sequence which are missing. Islands are contiguous sets of data up to, and not including, a gap.
Accountants care about sequences that should be complete, like receipt numbers, purchase orders, check numbers, and invoice numbers. For example, if check 1003 is missing, your accountant wants to see the voided check.

How to solve gaps and islands: use DENSE_RANK() to build an ordering, then calculate the difference between the actual value and its DENSE_RANK() to find where the gaps pop up and the islands get created.
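The same rank-difference trick works outside SQL too. Here is a sketch in pandas using made-up check numbers; within a contiguous run the value-minus-rank difference is constant, and it jumps at every gap:

```python
import pandas as pd

# Hypothetical check numbers; 1003 and 1007-1008 are missing
checks = pd.Series([1001, 1002, 1004, 1005, 1006, 1009])

# Dense-rank each value, then subtract the rank from the value itself.
# The difference is constant within an island and changes across a gap.
rank = checks.rank(method="dense")
island_id = checks - rank

# Each group is one island; min/max give its boundaries
islands = checks.groupby(island_id).agg(["min", "max"])
print(islands)
```

The boundaries between consecutive islands are exactly the gaps (here, check 1003 and checks 1007-1008) an accountant would ask about.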

Warning with gaps – SQL will typically have gaps in IDENTITY columns. This happens often with rollbacks, where the identity value gets burned; the same applies to SEQUENCE. Building a gapless sequence is a huge hit on capacity, which is why it’s not the default: it forces single-row insertion and a serializable isolation level.

Regression Analysis

A regression contains a dependent variable (the thing we want to explain). For example, for a count of invoices per month, the count of invoices is the dependent variable. The dependent variable must be a continuous, countable feature, and there can be only one dependent variable in a regression. (Multivariate multiple regression is a thing, but it’s really a series of regressions.) We have one or more independent variables that we think will explain the dependent variable (time, people, customer, location, etc.). We assume the independent variables drive the dependent variable but not the other independent variables. The data should ideally have a good amount of variety in the independent variables; the rule of thumb is 30-100 data points per combination of independent variables, and more data is better. We also assume that past behavior is similar to future behavior.

Python’s from sklearn.linear_model import LinearRegression and from sklearn.metrics import mean_squared_error make conducting a linear regression super easy to implement. You need to fit the model to see how the dependent variable is impacted by the provided independent variables. sm_model.summary() will give you the OLS regression results. Be sure to use what you learned from your gut-check line charts. For example, if there’s a significant deviation from the line in a certain year, try building the linear regression for before and during the period where the significant difference occurs, and compare that to what you get after. This may also help you find the true outliers that led to the craziness overall.
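A minimal sketch of that fit, using synthetic invoice counts (the trend and noise are invented so the expected slope is known):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Hypothetical monthly invoice counts trending upward over 24 months:
# roughly 50 + 3 per month, plus a little noise
months = np.arange(24).reshape(-1, 1)  # independent variable
invoices = 50 + 3 * months.ravel() + np.random.default_rng(0).normal(0, 2, 24)

# Fit the model and inspect how the independent variable explains the trend
model = LinearRegression().fit(months, invoices)
predicted = model.predict(months)

print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"MSE={mean_squared_error(invoices, predicted):.2f}")
```

A month whose residual (actual minus predicted) is far larger than the rest is a candidate outlier worth investigating.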

Cohort analysis – looking at top-level aggregates or at a single slice might help identify problems. Differences should be explainable by domain experts.

  • Duplication of entries
  • Frequency of one thing versus another
  • Changes in frequency over time
  • Shape of the numeric distribution

Intuition on duplication

  • People don’t like to perform more work than necessary
  • When given an option between more work and less work, people will generally choose less work
  • Duplication is usually considered more work
  • If the low-work option is easy to understand, people are more likely to gravitate to it, ceteris paribus

These are not rules of nature, just intuitions.

Digit analysis – you should expect a uniform distribution of the last digit in a sequence. There are exceptions: budgets are usually rounded off and end with 5s or 0s, estimates in general are also rounded to 5 or 0, and less precise measurements often end in 0.

When people try to fake data, they tend to go with “more random” numbers like 2, 3, 7, and 8. Benford’s law – unlike the final digit, the first digit of a sequence often follows a power law: 1 is more likely to occur than 2, and so on. It works best when the data spans 2-3 orders of magnitude or more. Pretty neat, definitely look up Benford’s law.
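A small sketch of checking data against Benford's law. The expected probability of first digit d is log10(1 + 1/d); the invoice amounts below are synthetic (a lognormal spanning several orders of magnitude), chosen because such data tends to follow Benford:

```python
import numpy as np

def first_digit(x: float) -> int:
    # Shift x into [1, 10) via the fractional part of log10, take integer part
    return int(10 ** (np.log10(x) % 1))

# Benford's expected first-digit probabilities: P(d) = log10(1 + 1/d)
benford = {d: np.log10(1 + 1 / d) for d in range(1, 10)}

# Hypothetical invoice amounts spanning several orders of magnitude
rng = np.random.default_rng(42)
amounts = rng.lognormal(mean=6, sigma=2, size=10_000)

digits = np.array([first_digit(a) for a in amounts])
observed = {d: float(np.mean(digits == d)) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: expected {benford[d]:.3f}, observed {observed[d]:.3f}")
```

Large deviations from the expected curve (for example, too few 1s and too many 7s and 8s) are the signal an auditor would dig into.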

There is a Python library called Beautiful Soup for this. Pretty cool.

Keep records of your findings. As you dig, you will want to have that data readily available for future reference.

Jonathan Stewart – Building scalable data warehouses in Fabric

Jonathan walks through the basics of building a data warehouse that can easily grow without taking down capacity. He covers the best data integration options, star schema principles, and some great data modeling best practices. Column selection, JOIN, and aggregation optimization techniques were the focus.

V-Ordering, when should you do it? It’s enabled by default for read optimization on the data warehouse but off by default for data lakehouses to prioritize data timeliness.

  • Direct Lake Semantic models
  • If user experience is priority over data timeliness

Size Optimization Strategy = choose the smallest data type that supports your data requirements. Plan for scaling, but realistically. A column that is yes/no will never have more than 3 characters so it can be a VARCHAR(3).

His abstract:

Learn how to design and implement scalable data warehouses using Microsoft Fabric. This session will cover best practices for data storage, management, and retrieval, as well as the use of Fabric’s data warehousing features. Attendees will gain insights into creating efficient data pipelines, optimizing performance, and ensuring data consistency.

SQL Saturday NYC Recap 2025

My first time at SQL Saturday New York City was an absolute blast! Thank you so much to the organizers and the incredible people who chatted with me during and after the event, the people always make these events worth the trip. Here’s the link to the event, feel free to connect with speakers that you wish you would have been able to hear from! You never know, they may give the talk near you sometime! https://sqlsaturday.com/2025-05-10-sqlsaturday1105/

My Session – Mastering Microsoft Fabric Data Warehousing: Tips & Tricks You Need to Know

Thank you so much to everyone who came to my session! I need to do some more tongue twisters to not mix up Warehouse and Workspace when I do this talk in the future but thank you all for laughing with me. There were some incredible questions and comments, and I loved the conversations that came from this session. This was my first time giving this session, so please feel free to reach out with any feedback you may have, I love evolving my sessions to be more useful to everyone who attends.

Here’s a link to my GitHub that has all the slides, monitoring scripts, and Power BI report used to monitor the queries: https://github.com/Anytsirk12/DataOnWheels/tree/main/2025%20Presentations/2025%20SQL%20Saturday%20NYC

Women in Technology Panel

This was also my first time speaking on a Women in Tech (WIT) panel! It was incredible to sit up there with some extremely talented women and discuss what “Dangerous Phrases” in business impact us and others in the community. The discussion was electric, and I loved hearing all the varying experiences we’ve had and discussing how we can grow together to make an even brighter future. Can’t wait for the day when there are so many qualified, incredible women speakers that we have lines in the restroom and a panel with all the women speakers at an event isn’t possible. For context, this event had 35 wonderful speakers, 6 of whom were women (17%).

John Miner – Create an analytics foundation using Fabric Warehouse

GitHub link= https://github.com/JohnMiner3/community-work/tree/master/analytics-foundation-fabric-warehouse

Don’t be afraid to use GenAI to help rewrite scripts to be compliant with Fabric.

Definitely look at the make-metadata T-SQL notebook; it has great structures to replicate when building a metadata-driven approach. It even covers building a log for your ETL process! Super neat.

Paul Turley – Moving from Power BI to Microsoft Fabric

Slides: https://sqlserverbi.blog/presentations

With Fabric, you now have the option to push elements of BI upstream. Medallion architecture is a common way we move data from ingestion to cleaned up and ready for reporting.

Semantic modeling options

  • Vertipaq (import mode)
    • In-memory cache
    • refreshed
    • column store/compression
    • internal storage optimization is done with various formats of encoding
  • Direct lake
    • Optimized for Fabric lakehouse storage
    • similar characteristics to vertipaq
    • in-memory based on column use
    • compressed in parquet
    • data available within seconds of updates
    • v2 direct lake limitations will be eliminated in late 2025 – database views, calculated columns, mixed storage mode
  • Direct query
    • pass through to/from source
    • reads directly from the data source
    • DAX query generated by report and translated to source native query language
    • no model data on disk
    • no data cached in memory
    • no compression
    • row size limits
    • significantly slower in visuals

Don’t freak out, you don’t HAVE to change anything. You just have more options now.

Enterprise Principles:

  • Governance is king: ownership, business & IT
  • Traceability from source to semantic model & report – defendable numbers
  • endorsement & certs
  • scalable – storage, transformation, modeling, visual presentation
  • version control
  • continuous delivery
  • deployment

Version control – moving from Power BI to Fabric isn’t about what you have to do, but what you git to do lol

What do I have to do with Fabric?

  • Replace PBI Premium capacities with Fabric capacities
  • consider fabric data engineering patterns in the future

That’s it! You have more opportunities, but that’s all you HAVE to do.

SQL Saturday Atlanta BI Recap

https://sqlsaturday.com/2025-03-08-sqlsaturday1102/#schedule

SQL Saturday Atlanta BI is one of my favorite SQL Saturdays of the year. This year was especially sweet to see a lot of the SML (Saturday Morning Learning) crew plus even got to see a couple of first-time presenters! Pretty amazing to see this community continue to grow and shape more careers.
Thank you to everyone who attended my session 🙂 It was super fun to present to a post-lunch full room! If you weren’t able to attend (there were a ton of amazing sessions at the same time!), or you want to refer back to the code we covered, check out this GitHub folder for all the resources used during the demos. Thanks again to everyone who made this conference possible!
https://github.com/Anytsirk12/DataOnWheels/tree/main/2025%20SQL%20Saturday%20ATL%20BI

Power Hour- Introduction to Power BI by Christopher Small

https://www.linkedin.com/in/christopher-small-training/
https://github.com/Chris-JN/Power-Hour

Chris is a natural speaker, and it was great to see so many people eager to learn how to use Power BI for the first time! Check out his GitHub link above for all the files used in the session so you can rebuild his demos!

Automated Testing with DAX Query View and Git Integration by John Kerski and Roshini Damodaran

https://www.linkedin.com/in/john-kerski/
https://www.linkedin.com/in/roshini-damodaran/

Why test?

  • Find errors before customers do
  • Never see the same error twice
  • Tests are your safety net
DataOps = https://www.kerski.tech/

https://github.com/kerski/fabric-dataops-patterns/blob/main/DAX%20Query%20View%20Testing%20Pattern/dax-query-view-testing-pattern.md

DAX query view – don’t be afraid to use quick queries to easily generate DAX code for things like column stats. Column stats will give you DAX queries for a lot of information on columns/measures, including NULL counts, data types, etc.

Use performance analyzer to grab DAX queries for visuals to test.

PBIX Pattern:

  • Standardize schema and naming conventions (keep it simple)
  • Build Tests
    • Test content
    • Test calcs
    • Test schema and data types

One interesting thing to remember – in the online PBI/Fabric service, the DAX queries don’t stay with the model. That means you will need to store tests in a git repo if you want to reference them in notebooks and/or deployment pipelines.

Pretty mind blowing! Check out the git repo for some incredible code snippets you can use!

DAX 201: CALCULATE, Unfolding the Swiss Army Knife of DAX Functions by Mark Walter & Michael Hewitt

https://storybi.com/2020/10/01/learning/
https://www.linkedin.com/in/markwaltercpa/
https://www.linkedin.com/in/mdhewitt83/

Objectives:

  • Unfold CALCULATE
  • Understand where to use CALCULATE
  • What is CALCULATE doing behind the scenes

Order of evaluation is below. Each subsequent step can overwrite the step before it.
Steps 1 and 2 happen automatically. 3 – 5 are built by DAX devs within CALCULATE itself.

  1. Original report filters
  2. Table rows
    • Context transition
  3. CALCULATE modifiers
    • REMOVEFILTERS
    • USERELATIONSHIP
    • CROSSFILTER
  4. Add explicit filters
    • Table functions like ALL(), VALUES(), and DISTINCT(), plus simple predicates like color = “Red”, etc.
  5. KEEPFILTERS
    • Standalone protection. This can turn off the overwrite behavior that happens in steps 2–4 and make step 1 the priority.
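To make the steps concrete, here is a hedged sketch of a single measure that touches steps 3–5. The Sales, Date, and Product table/column names are assumed, not from the session:

```dax
-- Hypothetical measure showing where each step lives inside CALCULATE
Red Sales by Ship Date =
CALCULATE (
    [Total Sales],
    USERELATIONSHIP ( Sales[ShipDateKey], 'Date'[DateKey] ),  -- step 3: CALCULATE modifier
    KEEPFILTERS ( 'Product'[Color] = "Red" )                  -- step 4: explicit filter, wrapped
                                                              -- in step 5 so it ANDs with any
                                                              -- existing Color filter instead of
                                                              -- overwriting it
)
```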

Challenge #1 – USERELATIONSHIP vs TREATAS()

  • USERELATIONSHIP
    • You cannot use this on a table that leverages RLS, because security filters only propagate through the single active relationship
    • Leverages inactive relationships
  • TREATAS()
    • This creates a virtual relationship and works similarly to USERELATIONSHIP, but it is not the same
    • TREATAS() combines the active relationship with the virtual relationship, so it can yield unexpected results at times because it effectively creates an AND filter
    • You can combine TREATAS() with REMOVEFILTERS() so that the virtual relationship is the only one taken into consideration. The combination of TREATAS() and REMOVEFILTERS() returns results like USERELATIONSHIP
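A hedged sketch of the two approaches side by side (table and column names assumed):

```dax
-- Inactive physical relationship, activated for this measure only
Ship Date Sales (UR) =
CALCULATE (
    [Total Sales],
    USERELATIONSHIP ( Sales[ShipDateKey], 'Date'[DateKey] )
)

-- Virtual relationship; REMOVEFILTERS drops the active Date relationship's
-- filter so only the TREATAS filter applies, matching the version above
Ship Date Sales (TA) =
CALCULATE (
    [Total Sales],
    TREATAS ( VALUES ( 'Date'[DateKey] ), Sales[ShipDateKey] ),
    REMOVEFILTERS ( 'Date' )
)
```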

Challenge #2 – Slicer not filtering large orders

  • When users filter on a field that’s included in our CALCULATE statement, our filter overwrites theirs, so their slicer selection doesn’t actually filter anything.
  • To solve it, wrap the filter in KEEPFILTERS() so it acts as an AND with the slicer instead of only taking the filter context from CALCULATE
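As a hedged sketch (the Orders table and OrderSize column are assumed):

```dax
-- Overwrites any slicer on Orders[OrderSize]
Large Order Sales (overwrite) =
CALCULATE ( [Total Sales], Orders[OrderSize] = "Large" )

-- KEEPFILTERS intersects with the slicer instead: if the user slices to
-- "Small", the result is blank rather than silently showing large orders
Large Order Sales =
CALCULATE ( [Total Sales], KEEPFILTERS ( Orders[OrderSize] = "Large" ) )
```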

CALCULATE in slow motion

Original measure that you make and pull into a visual: Sales = SUM(SalesFact[Sales])

  1. Under the hood iterator: SUMX(SalesFact,SalesFact[Sales])
    • Row context
  2. Measure sent to the visual as [Total Sales]: CALCULATE(SUMX(SalesFact, SalesFact[Sales]), <Filters>)
    • Filter context
  3. CALCULATE filters the dim tables: CALCULATE([Total Sales], Customer Name = "Chris", Vendor = "KARPARTS", Cal Year = 2024)
    • This comes from the visual and the page filters
  4. Now our fact table will only contain values that match the filters that have been sent and will return the correct metric

CALCULATE = filter context.

Most important concept in DAX – Row Context and Filter Context.

  • Row Context: evaluates row by row, does not propagate over filters. Horizontal.
  • Filter Context: vertical. Filters propagate through relationships. Aggregating columns.
  • These work together.

Look at the slide deck; there are some killer examples in there about syntax sugar vs. different functionality. For example, CALCULATE([Sales], Color = "Red") is the same as CALCULATE([Sales], FILTER(ALL(Colors[Color]), Colors[Color] = "Red")), but CALCULATE([Sales], FILTER(VALUES(Colors[Color]), Colors[Color] = "Red")) is different because the value will not display whenever Red is not included in the filter/row context. Super interesting.
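A hedged sketch of those variants as standalone measures (the Colors[Color] naming is assumed; the boolean shortcut expands to ALL over the column):

```dax
-- Syntax sugar: boolean filter
Red Sales A = CALCULATE ( [Sales], Colors[Color] = "Red" )

-- What the sugar expands to: ignores any existing Color filter
Red Sales B =
CALCULATE ( [Sales], FILTER ( ALL ( Colors[Color] ), Colors[Color] = "Red" ) )

-- Different: VALUES respects the current filter context, so the measure
-- is blank whenever Red is filtered out
Red Sales C =
CALCULATE ( [Sales], FILTER ( VALUES ( Colors[Color] ), Colors[Color] = "Red" ) )
```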

TREATAS() is pretty amazing. Look up more info here: https://dax.guide/treatas/

KEEPFILTERS builds a barrier around levels 2–4, preventing competing filters from overwriting each other. You can also use KEEPFILTERS with multiple nested CALCULATEs to respect the outer CALCULATE’s filter context.
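For example (Date table and Year column assumed), an inner KEEPFILTERS intersects with the outer CALCULATE’s filter instead of overwriting it:

```dax
-- Outer filter: 2023 and 2024. The inner KEEPFILTERS intersects with it,
-- so filtering to 2022 yields blank; without KEEPFILTERS the inner filter
-- would overwrite the outer one and return 2022 sales instead.
Sales 2022 Within Selection =
CALCULATE (
    CALCULATE ( [Sales], KEEPFILTERS ( 'Date'[Year] = 2022 ) ),
    'Date'[Year] IN { 2023, 2024 }
)
```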

Microsoft Fabric Pipelines – Metadata-driven ETL patterns by Mike Diehl

Code and slides: https://github.com/xhead/SqlSatATL-2025
t-sql.dk/2025/02/etl-orchestration-air-traffic-control-for-data/

Why use metadata for Fabric Pipelines?

  • Reduces ETL development effort and increases velocity compared to tools like SSIS
  • Lakehouse tables – schema evolution (vs SQL databases)

Scenario: metadata-driven approach
Target: Lakehouse in OneLake

Set of tasks:

  • Get data from source
    • load all data, overwrite all data
    • load some data in staging then incrementally merge into target
  • Dependencies
    • Load dims first then facts that depend on dims
  • Develop Pipelines at the data source type level
    • SQL Server, Oracle, plus auth type
    • File source (file system, SharePoint, Azure) and type (XML, JSON, CSV)

Pretty incredible process that Mike has put together, check out his slides for all the awesome illustrations of the architecture and data processing flows (https://github.com/xhead/SqlSatATL-2025).

T-SQL Tuesday #176: That Piece of Technical Advice for the Past Me

DIVERSIFY!

We devote a lot of time to mastering the technology that we are passionate about or use for our jobs. For example, I specialized in Microsoft business intelligence, which has a wide range of technologies indeed. But I was still relatively narrow in focus.

I would suggest that you broaden your knowledge and skills in related fields like security and networking. You do not need to master them, but you should have a solid understanding of the concepts. Learning related technologies enables you to communicate effectively and confidently with your colleagues about how to provide high-quality solutions to your customers or businesses.

Learn New Things

There are many ways to learn new skills. Microsoft Learn is a great place to start, but it is not the only option. There are online training programs (Udemy, Pragmatic Works), YouTube channels (Tales from the Field, Data on Wheels), blogs (Simple Talk, Eric on Identity), events (SQL Saturday, Microsoft Ignite), and even books (Hands on Analysis Services, M365 Security).

Get Certified

Certifications require you to show that you have a solid grasp of the topic. You have to be able to reason through concepts and respond to questions about the technology and its use. While certifications do not make you an expert, they do indicate a good knowledge of the technology. Finishing the certification process also gives you confidence to engage in relevant discussions with other technologists about the subject. Another advantage is that you can include this in your resume.

Getting certified in your area of expertise should be a formality; getting certified in something different is a stretch!

Other Benefits

Here are some other reasons to diversify:

  • You will improve your troubleshooting skills.
  • You may discover a new favorite technology.
  • You could increase your value in the job market.
  • You have something new to share with the community.

Learn something new, share what you learn.