{"id":2923,"date":"2007-12-07T14:33:00","date_gmt":"2007-12-07T14:33:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/pfxteam\/2007\/12\/07\/parallelizing-a-query-with-multiple-from-clauses\/"},"modified":"2007-12-07T14:33:00","modified_gmt":"2007-12-07T14:33:00","slug":"parallelizing-a-query-with-multiple-from-clauses","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/parallelizing-a-query-with-multiple-from-clauses\/","title":{"rendered":"Parallelizing a query with multiple \u201cfrom\u201d clauses"},"content":{"rendered":"<p class=\"MsoNormal\"><span><font face=\"Calibri\">Consider a simplified version of <a class=\"\" href=\"https:\/\/blogs.msdn.com\/lukeh\/archive\/2007\/10\/01\/taking-linq-to-objects-to-extremes-a-fully-linqified-raytracer.aspx\">Luke Hoban&#8217;s LINQ ray&nbsp;tracer<\/a><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> Xs = <span>Enumerable<\/span>.Range(1, screenWidth);<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> Ys = <span>Enumerable<\/span>.Range(1, screenHeight);<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><\/p>\n<p>&nbsp;<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> sequentialQuery = <span>&nbsp; <\/span><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> x <span>in<\/span> Xs<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> y <span>in<\/span> Ys<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>select<\/span><span> <span>new<\/span> { X = x, Y = y, Color = TraceRay(x, y) };<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><\/p>\n<p>&nbsp;<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">If the screen width is much larger than the screen height, we would choose to parallelize the computation along the screen width. 
This is because PLINQ is most beneficial when the overhead of splitting a given piece of work into n &ldquo;chunks&rdquo; is much lower than the cost of running each &ldquo;chunk of work&rdquo; itself. Since the screen width is larger, there is more potential for creating bigger chunks of work and, in turn, for getting a better payoff for the overhead of parallelizing than there is along the height. To change <\/font><\/span><span>sequentialQuery <\/span><span><font face=\"Calibri\">to use PLINQ, we need to modify it to<span>&nbsp; <\/span><span>&nbsp;<\/span><\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> parallelAlongWidthQuery = <\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> x <span>in<\/span> Xs<b><span>.AsParallel() <\/span><\/b><span>\/\/Parallel along screen width<\/span><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> y <span>in<\/span> Ys<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>select<\/span><span> <span>new<\/span> { X = x, Y = y, Color = TraceRay(x, y) };<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">Let us look at <\/font><\/span><span>parallelAlongWidthQuery <\/span><span><font face=\"Calibri\">more closely. We can derive the exact methods that will be invoked for this query by referring to the <\/font><\/span><a href=\"http:\/\/msdn2.microsoft.com\/en-us\/library\/bb308966.aspx#csharp3.0overview_topic18\"><span><font face=\"Calibri\">MSDN documentation on C# 3.0<\/font><\/span><\/a><font face=\"Calibri\"><span> &ndash; <\/span><b><span>26.7.1 Query Expression Translation. 
<\/span><\/b><span>Section<b> 26.7.1.4 <\/b>explains that the <\/span><\/font><span>parallelAlongWidthQuery <\/span><font face=\"Calibri\"><span>will be translated to something like: <\/span><span><\/p>\n<p><\/span><\/font><\/p>\n<p class=\"MsoNormal\"><span>(Xs<b><span>.AsParallel()<\/span><span>)<\/span><\/b>.SelectMany(<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>x =&gt; Ys, <\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>(x, y) =&gt; <span>new<\/span> {X = x, Y = y, Color = TraceRay(x, y)});<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><\/p>\n<p>&nbsp;<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">The AsParallel() call converts Xs to an IParallelEnumerable, the type that enables the &ldquo;parallel&rdquo; implementations of the standard query operators. So, this query will bind to the parallel version of SelectMany. When we run the application, we will notice that the query executes across all available processors on the machine.<\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">We have successfully parallelized the application!<\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span><\/p>\n<p><font face=\"Calibri\">&nbsp;<\/font><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\"><b><span>However, assume that our dataset was different to begin with<\/span><\/b><span> and the screen height was much larger than the screen width. Using the same logic of parallelizing along the larger dataset, we would choose Ys this time. 
To achieve this, it might be tempting to change the query to <\/p>\n<p><\/span><\/font><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> parallelAlongHeightQuery = <span>&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> x <span>in<\/span> Xs<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> y <span>in<\/span> Ys<b><span>.AsParallel()<\/span><\/b><span> \/\/Parallel along screen height??<\/span><b><span><\/p>\n<p><\/span><\/b><\/span><\/p>\n<p class=\"MsoNormal\"><span>select<\/span><span> <span>new<\/span> { X = x, Y = y, Color = TraceRay(x, y) };<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><b><u><span><\/p>\n<p><span><\/span><\/p>\n<p><\/span><\/u><\/b><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\"><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoNormal\"><span><\/span><span><font face=\"Calibri\">Let us apply the query translation to <\/font><\/span><span>parallelAlongHeightQuery.<\/span><span><font face=\"Calibri\"> Using the <\/font><\/span><a href=\"http:\/\/msdn2.microsoft.com\/en-us\/library\/bb308966.aspx#csharp3.0overview_topic18\"><span><font face=\"Calibri\">MSDN documentation on C# 3.0<\/font><\/span><\/a><font face=\"Calibri\"><font size=\"3\"> <\/font><span>rules, we can say that <\/span><\/font><span>parallelAlongHeightQuery <\/span><font face=\"Calibri\"><span>will be translated <\/span><span>to something like: <\/span><span><\/p>\n<p><\/span><\/font><\/p>\n<p class=\"MsoNormal\"><span>(Xs<b><span>)<\/span><\/b>.SelectMany(<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>x =&gt; Ys<b><span>.AsParallel()<\/span><\/b>, <\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>(x, y) =&gt; <span>new<\/span> {X = x, Y = y, Color = TraceRay(x, y)});<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><\/p>\n<p>&nbsp;<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\"><span>Xs is an ordinary IEnumerable, so the compiler will bind to the 
sequential LINQ to Objects version of SelectMany, not the expected parallel version! <\/span><span>When we run the application, we will notice that the query <b>does not<\/b> execute across all available processors on the machine.<\/p>\n<p><\/span><\/font><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\"><span>The two &ldquo;from&rdquo; clauses are not as <b>symmetric<\/b> as they seem!<\/span><span><\/p>\n<p><\/span><\/font><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">A good way to force parallelization across the screen height is to specify it as the first data source:<\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span>var<\/span><span> parallelAlongHeightQuery =<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> y <span>in<\/span> Ys<span>.AsParallel()<\/span><span> \/\/Parallel along screen height<\/span><\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>from<\/span><span> x <span>in<\/span> Xs<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>select<\/span><span> <span>new<\/span> { X = x, Y = y, Color = TraceRay(x, y) };<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">This will translate to something like: <\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span>(Ys<b><span>.AsParallel()<\/span><span>)<\/span><\/b>.SelectMany(<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>y =&gt; Xs, <\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span>(y, x) =&gt; <span>new<\/span> {X = x, Y = y, Color = TraceRay(x, y)});<\/p>\n<p><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">Since Ys is now an IParallelEnumerable, this query will bind to the &ldquo;parallel&rdquo; version of SelectMany. 
When we execute it, we will notice that the query now runs across all available processors.<\/p>\n<p><\/font><\/span><\/p>\n<p class=\"MsoNormal\"><span><font face=\"Calibri\">One more point to note is that even though SelectMany accepts multiple data sources, PLINQ will partition work across the first data source only. This is because it assumes that partitioning across just one of the datasets will give enough parallelism and speedup.<\/p>\n<p><\/font><\/span><\/p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Consider a simplified version of Luke Hoban&#8217;s LINQ ray&nbsp;tracer var Xs = Enumerable.Range(1, screenWidth); var Ys = Enumerable.Range(1, screenHeight); &nbsp; var sequentialQuery = &nbsp; from x in Xs from y in Ys select new { X = x, Y = y, Color = TraceRay(x, y) }; &nbsp; If the screen width is much larger than [&hellip;]<\/p>\n","protected":false},"author":480,"featured_media":58792,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[7908],"tags":[7911,7909,7910],"class_list":["post-2923","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-pfxteam","tag-code-samples","tag-parallel-extensions","tag-plinq"],"acf":[],"blog_post_summary":"<p>Consider a simplified version of Luke Hoban&#8217;s LINQ ray&nbsp;tracer var Xs = Enumerable.Range(1, screenWidth); var Ys = Enumerable.Range(1, screenHeight); &nbsp; var sequentialQuery = &nbsp; from x in Xs from y in Ys select new { X = x, Y = y, Color = TraceRay(x, y) }; &nbsp; If the screen width is much larger than 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/2923","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/480"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=2923"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/2923\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/58792"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=2923"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=2923"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=2923"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}