In this blog post I show you how to do a simple Monte Carlo simulation using DAX or Power Query. I will compare both approaches and evaluate the pros and cons.

Last month a colleague of mine wrote about the amazing possibilities of the R integration in SQL Server 2016. In his blog post he showed how easy it is to do a Monte Carlo simulation with R in SQL Server 2016. The demo case calculates the expected value of leads from our CRM.

I don’t know why, but I started thinking about doing a Monte Carlo simulation in Power BI with only DAX or Power Query, so without R. There is nothing wrong with R, and my colleague’s simulation is easy to transfer to other use cases, but sometimes things arouse my curiosity even if I know that there are smarter ways.

### Monte Carlo Simulation

Monte Carlo simulation is a random sampling method. It can be used to find probability distributions. Sometimes you are confronted with problems where it is hard to evaluate the distribution analytically, because there are so many variable factors or you don’t know how the different components interact. The idea of a Monte Carlo simulation is to try out many different scenarios or variable configurations and see how the output of your model behaves. I will stick to the use case of my colleague Hilmar from his article.
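To make the idea concrete, here is a minimal sketch in Python (the post itself uses M and DAX; the lead probabilities and revenue below are made up for illustration): we estimate the expected revenue from a handful of probabilistic leads by repeated random draws.

```python
import random

def simulate_once(probabilities, revenue):
    """One Monte Carlo run: a lead is won if a uniform draw falls below its win probability."""
    return sum(revenue for p in probabilities if random.random() < p)

# hypothetical leads: win probabilities and a fixed revenue per won lead
probabilities = [0.2, 0.5, 0.8]
revenue = 1000

random.seed(42)
runs = 10_000
results = [simulate_once(probabilities, revenue) for _ in range(runs)]
average = sum(results) / runs
# the sample mean approaches the analytic expectation sum(p) * revenue = 1500
```

The more runs you do, the closer the sample mean gets to the analytic expectation; the spread of `results` is exactly the distribution we are after.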

### The setup

Let’s say we want to evaluate the range of our sales amount from new customer leads. For that we build a little demo table. The table is deliberately straightforward to keep things simple: every customer generates the same revenue, but the probability of winning the job differs. To make things a bit more interesting I added the scenario column. Let’s say our company has to make a decision that will influence the probability of getting new projects (scenario = 1 and scenario = 2). Every customer has a different probability in each of the two scenarios. Which scenario is better? To evaluate this we do a Monte Carlo simulation.

### Create a random sample with M

At the beginning we need random samples. To create them we can use Power Query to build a simple list of x elements, where every element is one Monte Carlo simulation run. By cross joining this iteration list with the customer table shown above, we get a table that has x entries for every row of the customer table. For example, if our customer table has 20 rows and the Monte Carlo simulation should do 100 iterations, we end up with 20 * 100 = 2,000 rows.
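The row count of such a cross join is easy to sanity-check outside Power Query; a quick illustrative sketch in Python (customer names are placeholders):

```python
from itertools import product

iterations = range(1, 101)                         # 100 simulation runs
customers = [f"customer_{i}" for i in range(20)]   # 20 hypothetical customers

# cross join: every iteration paired with every customer
cross = list(product(iterations, customers))
# 100 iterations * 20 customers = 2000 rows
```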

```
let
    // <= so that the list really contains "Number of Simulation Runs" elements (with < it would stop one short)
    Quelle = List.Generate(() => 1, each _ <= #"Number of Simulation Runs", each _ + 1),
    CreateTable = Table.FromList(Quelle, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    RenameCol = Table.RenameColumns(CreateTable, {{"Column1", "Iteration"}}),
    // cross join: one copy of the Customer table per iteration
    Result = Table.AddColumn(RenameCol, "temp", each Customer),
    ResultExpand = Table.ExpandTableColumn(Result, "temp", {"Szenario", "Customer", "Revenue", "Propability"}, {"Szenario", "Customer", "Revenue", "Propability"})
in
    ResultExpand
```

Now we add a new column that generates a random number between 0 and 1. This is the random draw of the Monte Carlo simulation. Then we add one more column ([Result]) with a simple condition: if [Random] < [Propability] then 1 else 0. [Result] shows whether we got the job (1) or not (0).

Next we keep only the rows where [Result] is 1. After that we simply group by [Szenario] and [Iteration] and aggregate [Revenue] with a sum.

In the end we have a table which shows a [Revenue] value for every iteration and scenario.
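The whole pipeline (cross join, random draw, filter, group by) can be mirrored in a few lines of Python; this is a hypothetical sketch with made-up demo data, not the author’s actual table:

```python
import random
from collections import defaultdict

# hypothetical demo data: (scenario, customer, revenue, win probability)
customers = [
    (1, "A", 1000, 0.3), (1, "B", 1000, 0.6),
    (2, "A", 1000, 0.5), (2, "B", 1000, 0.4),
]

random.seed(1)
runs = 1000

# aggregated revenue per (scenario, iteration), mirroring the M steps:
# cross join, random draw, keep the wins, group by and sum
agg = defaultdict(int)
for iteration in range(1, runs + 1):
    for scenario, customer, revenue, probability in customers:
        if random.random() < probability:   # the random draw; [Result] = 1
            agg[(scenario, iteration)] += revenue
```

Each value of `agg` is one simulated [Revenue] outcome for one scenario and iteration, exactly the shape of the result table described above.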

The complete M query:

```
let
    Quelle = List.Generate(() => 1, each _ <= #"Number of Simulation Runs", each _ + 1),
    CreateTable = Table.FromList(Quelle, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    RenameCol = Table.RenameColumns(CreateTable, {{"Column1", "Iteration"}}),
    Result = Table.AddColumn(RenameCol, "temp", each Customer),
    #"Erweiterte temp1" = Table.ExpandTableColumn(Result, "temp", {"Szenario", "Customer", "Revenue", "Propability"}, {"Szenario", "Customer", "Revenue", "Propability"}),
    // random draw between 0 and 1
    #"Hinzugefügte benutzerdefinierte Spalte" = Table.AddColumn(#"Erweiterte temp1", "Random", each Number.Random()),
    // "1" = we got the job, "0" = we did not
    #"Hinzugefügte benutzerdefinierte Spalte1" = Table.AddColumn(#"Hinzugefügte benutzerdefinierte Spalte", "Result", each if [Propability] > [Random] then "1" else "0"),
    #"Gefilterte Zeilen" = Table.SelectRows(#"Hinzugefügte benutzerdefinierte Spalte1", each ([Result] = "1")),
    #"Gruppierte Zeilen" = Table.Group(#"Gefilterte Zeilen", {"Szenario", "Iteration"}, {{"Kum Umfang", each List.Sum([Revenue]), type number}}),
    #"Umbenannte Spalten" = Table.RenameColumns(#"Gruppierte Zeilen", {{"Szenario", "Scenario"}, {"Kum Umfang", "Agg Revenue"}})
in
    #"Umbenannte Spalten"
```

### Monte Carlo DAX driven

It is also possible to do the random sample drawing in DAX. For that you need SSAS Tabular 2016 or Power BI Desktop; the following steps are not possible with the Power Pivot implementation in Excel, because it does not support calculated tables created via DAX.

First you have to add a table to the data model which contains a list of numbers from 1 to x (one row per iteration). You can’t create a list or set from scratch in DAX itself, so you have to use Power Query to create a dummy list with our iterations.

```
let
    Quelle = List.Generate(() => 1, each _ <= #"Number of Simulation Runs", each _ + 1),
    #"In Tabelle konvertiert" = Table.FromList(Quelle, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Umbenannte Spalten" = Table.RenameColumns(#"In Tabelle konvertiert", {{"Column1", "Iteration"}})
in
    #"Umbenannte Spalten"
```

When you have loaded the dummy table and the customer table into the data model, you can use DAX to do the same steps as before, but in a more compact way.

```
MonteCarloResultDax =
SUMMARIZE (
    FILTER (
        CROSSJOIN ( Customer; DummyTable );
        RAND () < Customer[Propability]
    );
    Customer[Scenario];
    DummyTable[Iteration];
    "Summe Umfang"; SUM ( Customer[Revenue] )
)
```

### Visualize your data

The best thing about all of this is that you can now use Power BI to build some nice visuals for the result set. In my example you can see the distribution of both scenarios, the divergence between scenario one and two, and the winner. I also extended it a bit so that you can filter by customer etc. Click here to explore my model.

### Performance DAX vs Power Query

Both approaches work and can be used, but which one is faster? The results are the same; the approaches only differ in the engine that does the computation (Power Query or the VertiPaq engine). To evaluate the performance I ran a bigger simulation with 1,000,000 Monte Carlo iterations and measured the time. The result is clear!

- Power Query needed 7 minutes and 30 seconds to create the list, do the cross join, draw the random samples, group everything by scenario and iteration, and load it to the data model.
- DAX needed 31 seconds to create the dummy list, load it to the data model, do the cross join, draw the random samples, summarize by scenario and iteration, and create the calculated table.

To be honest, I thought it would be a draw, or that Power Query would be slightly faster, but the VertiPaq engine behind DAX seems to be a real beast. I’m very impressed.

I hope you enjoyed reading this blog post.

## Update 1:

Both Power BI projects are available for download here:

Monte Carlo Simulation Power BI

Thank you for your positive feedback!

May 18th, 2016 on 12:54

Very impressive work. I’m glad you did the comparison between DAX and Power Query. I, like you, would have expected Power Query to be faster. DAX and VertiPaq are really powerful.

May 18th, 2016 on 14:06

Do you have a corresponding benchmark using R for the same data on the same machine? It would be really interesting.

Thanks and good post!

Marco

May 19th, 2016 on 07:21

Great article! Could you share the pbix for this experiment?

Thanks!

Min

May 20th, 2016 on 08:55

Hi Marco,

actually I don’t have R benchmarks yet, but I will create some and add them here!

Thanks for reading my post.

May 20th, 2016 on 08:56

Hi Min,

I added the pbix files (Update 1). Some column names will be in German, but I hope that’s fine.

Thanks for reading my post.

Lukas