This query does something interesting, but I have no clue why.
Dear Data-Traveller, please note that this is a Linkedin-Remix.
I posted this content already on Linkedin in May 2022, but I want to make sure it doesn´t get lost in the social network abyss.
For your accessibility-experience and also for our own content backup, we repost the original text here.
Have a look, leave a like if you like it, and join the conversation in the comments if this sparks a thought!
Screenshot with Comments:
Plain Text:
Tales from SQL land – #3
This query does something interesting, but I have no clue why.
Sometimes SQL feels like a different planet for me. You can get it to do things that I thought would be unlikely to achieve, and you do this in ways that are even harder to understand.
From the workbench:
I want to eventify aggregated data. That means I have rows with a metric column, let’s say: views. To make it simple, two columns: date & views.
Now I want to create one row per 1 view. Why? Because I want to push this data into PostHog to build some funnels, I need it in an atomic event format. With python super easy.
But how to achieve that with SQL.
My approach – check the developer’s second brain: Stack Overflow.
And yes, I find some approaches. Some are using recursive approaches. Recursion was always some kind of black magic in programming to me – so I skipped this.
Another approach is using a numbers table. In some cases, just some system tables are misused to generate a table with values from 1..X. dbt Labs has their utils package, and here you can find: dbt_utils.generate_series(upper_bound=1000)
Now comes the pure magic part and the part I can’t fully explain to you (embarrassing, I know – but we are talking about SQL, not Python here).
To get 4 rows for a date with 4 views, you do this:
inner join numbers as nb on t.views >= nb.generated_number
You use an INNER JOIN with the numbers table and the sum of views as matching criteria.
I still want to take some time to dissect why it works as it works. So, if anyone can easily explain it – go for it in the comments.
Takeaway: you know how to eventify things in SQL. And most important – use Stack Overflow to find ways to query stuff you haven’t queried before.