Skip to content

Supporting _include:iterate and _revinclude:iterate #1314

Merged
limorl merged 24 commits intomicrosoft:masterfrom
limorl:feature/iterate
Oct 23, 2020
Merged

Supporting _include:iterate and _revinclude:iterate #1314
limorl merged 24 commits intomicrosoft:masterfrom
limorl:feature/iterate

Conversation

@limorl
Copy link

@limorl limorl commented Sep 30, 2020

Description

The :iterate modifier (:recurse in STU3) allows to query iteratively for resources.


For example, returning MedicationRequest ‘mr123’, and the Patient resources it references and those patient's Practitioner:
/MedicationRequest?_include=MedicationRequest:patient&_include:iterate=Patient:general-practitioner&_id=mr123

The solution is based mainly on updating SqlQueryGenerator to store a dictionary of result resource type and the cte names associated with them. This way, when iterating over results, the selection is done from the relevant cte (or ctes in case of multiple results sets).

For this to work, the IncludeRewriter should have been updated to sort the search parameters so that an _include/_revinclude parameter will appear after all the expressions yielding the result sets it selects from.

The feature doesn't fully implement the spec and some capabilities were left out following a discussion with the PG.

In scope of this PR:

  • Supporting _include:iterate expressions, with single/multiple iterations and with wildcards.
    For example:

    • /MedicationRequest?_include=MedicationRequest:patient&_include:iterate=Patient:general-practitioner&_id=mr123
    • MedicationRequest?_include:MedicationRequest:*&_include:iterate=Patient:general-practitioner

    • MedicationRequest&_include:MedicationRequest:patient&_include:iterate=Patient:*
MedicationRequest?_include:MedicationRequest:*&_include:iterate=Patient:*
  • Supporting _revinclude:iterate expressions with single/multiple iterations and with wildcard.

    • This is similar to the _include queries. However, for _revinclude we require that if the TargetType can be of multiple types, a resource type must be specified. Other a BadRequest is returned. For example:
    • 
Practitioner?_revinclude=Patient:*&_revinclude:iterate=MedicationRequest:patient
    • 
Practitioner?_revinclude:iterate=MedicationRequest:*&_revinclude=Patient:general-practitioner

    • Patient?_revinclude:iterate=MedicationDispense:*&_revinclude=MedicationRequest:*

Out of scope of this PR:

  • Support recursive _include:iterate and _revinclude:iterate queries #1310
    Recursive iterate expressions are expression where the Source and the Target are of the same type. For example:

    Organization?_include:iterate=Organization:partof
Patient&_include=Patient:organization&_include:iterate=Organization:partof


  • Supporting _revinclude:iterate expressions where there is more than a single target type and the target type is not specified:


    Unsupported: Patient?_revinclude=MedicationRequest:patient&_revinclude:iterate=MedicationDispense:performer
    Supported: Patient?_revinclude=MedicationRequest:patient&_revinclude:iterate=MedicationDispense:performer:patient

    Supported: Patient?_revinclude=MedicationRequest:patient&_revinclude:iterate=MedicationDispense:prescription (since prescription can be only of type MedicationRequest)


  • _revinclude:iterate with WildCard. If the wildcard is associated with the _revinclude:iterate parameter, it is not supported. However, _revinclude:iterate can be used to iteratively query a results set of a wildcard expression:
    
Unsupported: Practitioner?_revinclude=Patient:general-practitioner&_revinclude:iterate=MedicationRequest:*

    Supported: Practitioner?_revinclude=Patient:*&_revinclude:iterate=MedicationRequest:patient


  • Limit the number of included results for _include, and not only for _revinclude #1309
    The current implementation limits the item count of each included result set. In case where _include:iterate (or _revinclude:iterate) is applied on more than one result set, each application is limited.
    For example: MedicationDispense?_include=MedicationDispense:patient&_include=MedicationDispense:prescription&_include:iterate=MedicationRequest:patient&_include:iterate=Patient:general-practitioner
    Here, '_include:iterate=Patient:general-practitioner' refers to two included result sets (ctes): MedicationDispense:patient' and MedicationRequest:patient`. So each selection will be limited.

Related issues

Addresses #1313

Testing

  • IncludeRewriter unit tests to ensure the expressions are ordered correctly before executing them
  • IncludeSearch E2E tests are added for all the supported cases
  • IncludeSearch E2E test case is added to ensure a BadRequest is returned in case of _revinclude:iterate expression is used with multiple target resource types without specifying one.

@limorl
Copy link
Author

limorl commented Sep 30, 2020

FYI @eladiw

@lgtm-com
Copy link

lgtm-com bot commented Sep 30, 2020

This pull request introduces 1 alert when merging 98de3ca into 2e8c1b6 - view on LGTM.com

new alerts:

  • 1 for Useless assignment to local variable

Copy link
Contributor

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work.
I had some questions and comments.

<ItemGroup>
<ProjectReference Include="..\Microsoft.Health.Fhir.SqlServer\Microsoft.Health.Fhir.SqlServer.csproj" />
<ProjectReference Include="..\Microsoft.Health.Fhir.Core\Microsoft.Health.Fhir.Core.csproj" />
<ProjectReference Include="..\Microsoft.Health.Fhir.Stu3.Core\Microsoft.Health.Fhir.Stu3.Core.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid the version specific reference here?

Copy link
Author

@limorl limorl Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the only way I could get it working. Had to add it when added the new IncludeRewriterTests.cs file.

Do you have any suggestion? I'd be happy to give it a try.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion above. I'm assuming this is to fulfill the requirement for ModelInfoProvider?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my reply above,

@lgtm-com
Copy link

lgtm-com bot commented Sep 30, 2020

This pull request introduces 2 alerts when merging c38223b into 6468d57 - view on LGTM.com

new alerts:

  • 2 for Useless assignment to local variable

Copy link
Contributor

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

return new IncludeExpression(resourceType, refSearchParameter, originalType.ToString(), targetType, wildCard, isReversed);
referencedTypes = new List<string>();
_searchParameterDefinitionManager.GetSearchParameters(resourceType)
.Where(p => p.Type == ValueSets.SearchParamType.Reference).ToList()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty allocation-heavy. Also, it seems like something we could cache (and populate lazily).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. The challenge was to propagate the searchParameterDefinitionManager to the IncludeExpression, so that it will be used upon need. That didn't make much sense to me since IncludeExpression represents a parsed expression, so ideally searchParameterDefinitionManager shouldn't be used in it.

Maybe we can use a delegate that is applied once when ReferencedTypes is used and populates it. WDYT?

Copy link
Author

@limorl limorl Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, after giving it another thought, It’s worth noting that it will always be used when the expression is _include wildcard since it adds the potential reference types to the cte dictionary to be used later. .

So I suggest to at least populate it for wildcard paramegers only.

WDYT?

// Order the following query:
// [base]/MedicationDispense?_include:iterate=Patient:general-practitioner&_include:iterate=MedicationRequest:patient&_include=MedicationDispense:prescription&_id=smart-MedicationDispense-567

var refSearchParameter = _searchParameterDefinitionManager.GetSearchParameter("Patient", "general-practitioner");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion I have for avoiding references to SPDM is to just create the Search Parameter you need (e.g. new SearchParameterInfo("_id", SearchParamType.Token, new Uri(ResourceId))). If this doesn't work let me know, I'll create a task to follow up

Copy link
Author

@limorl limorl Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more challenging to create SearchParameterInfo since I also need to populate the URI and the target types, etc. I think it's better to use SPDM, since this is how it is being used in production code and it populates all the fields correctly. Otherwise, we may miss populating one of the fields and we won't be testing with the right entities.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree in principle, however, by only linking to Stu3 it misses R4 and R5 Search parameter definitions which may be different. Keeping the DataLayer projects free of version specific references has been very helpful with maintaining multiple versions of FHIR. If we want the full coverage in all versions of FHIR would it make sense to move these to https://github.com/microsoft/fhir-server/tree/master/test/Microsoft.Health.Fhir.Shared.Tests.Integration?
I know its a SQL Specific rewriter, but there isn't a SQL unit test project that offers this coverage, thoughts? //cc @johnstairs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it, the R4 and R5 parameter definitions are not needed for this uni tests since we generated the relevant test cases for the unit tests using Stu3 parameters. We are not checking all the possible parameters and their version fit, we are checking a sorting algorithm... we need a few parameters (not all of them)

<ItemGroup>
<ProjectReference Include="..\Microsoft.Health.Fhir.SqlServer\Microsoft.Health.Fhir.SqlServer.csproj" />
<ProjectReference Include="..\Microsoft.Health.Fhir.Core\Microsoft.Health.Fhir.Core.csproj" />
<ProjectReference Include="..\Microsoft.Health.Fhir.Stu3.Core\Microsoft.Health.Fhir.Stu3.Core.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion above. I'm assuming this is to fulfill the requirement for ModelInfoProvider?

@johnstairs
Copy link
Member

johnstairs commented Oct 9, 2020

It looks like the link url in the bundle is leaving out the :iterate parameters:

GET https://localhost:44348/MedicationRequest?_include=MedicationRequest:patient&_include:iterate=Patient:general-practitioner&_id=123
{
  "resourceType": "Bundle",
  "id": "949805aa-9bc2-4195-b581-98df15005424",
  "meta": {
    "lastUpdated": "2020-10-09T20:29:01.8167293+00:00"
  },
  "type": "searchset",
  "link": [
    {
      "relation": "self",
      "url": "https://localhost:44348/MedicationRequest?_include=MedicationRequest%3Apatient&_id=123"
    }
  ]
}

@limorl limorl requested a review from a team as a code owner October 10, 2020 16:46
@limorl
Copy link
Author

limorl commented Oct 10, 2020

It looks like the link url in the bundle is leaving out the :iterate parameters:

@johnstairs good catch!
It's since the include iterate parameters appear in the unsupportedSearchParameters due to issue: FirelyTeam/firely-net-sdk#222.

I removed them from the unsupported parameters list and added validation tests.

@johnstairs
Copy link
Member

Parenthesis indentation is inconsistent (let's follow the style of cte7):

cte7 AS
(
    SELECT DISTINCT Sid1, IsMatch, 0 AS IsPartial
    FROM cte6
),
cte8 AS
(
    SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
    FROM dbo.ReferenceSearchParam refSource
    INNER JOIN dbo.Resource refTarget
    ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
        AND refSource.ReferenceResourceId = refTarget.ResourceId
    WHERE refSource.SearchParamId = @p11
        AND refTarget.IsHistory = 0
        AND refSource.IsHistory = 0
        AND refSource.ResourceTypeId = @p12
        AND refSource.ResourceSurrogateId IN (SELECT TOP(@p13) Sid1 FROM cte3)
    ),
cte9 AS
    (
    SELECT DISTINCT TOP (@p14) Sid1, IsMatch, CASE WHEN count(*) over() > @p15 THEN 1 ELSE 0 END AS IsPartial
    FROM cte8
    ),
cte10 AS
    (
    SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
    FROM dbo.ReferenceSearchParam refSource
    INNER JOIN dbo.Resource refTarget
    ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
        AND refSource.ReferenceResourceId = refTarget.ResourceId
    WHERE refSource.SearchParamId = @p11
        AND refTarget.IsHistory = 0
        AND refSource.IsHistory = 0
        AND refSource.ResourceTypeId = @p12
        AND refSource.ResourceSurrogateId IN (SELECT TOP(@p16) Sid1 FROM cte7)
),

@limorl
Copy link
Author

limorl commented Oct 13, 2020

Parenthesis indentation is inconsistent (let's follow the style of cte7):

cte7 AS
(
    SELECT DISTINCT Sid1, IsMatch, 0 AS IsPartial
    FROM cte6
),
cte8 AS
(
    SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
    FROM dbo.ReferenceSearchParam refSource
    INNER JOIN dbo.Resource refTarget
    ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
        AND refSource.ReferenceResourceId = refTarget.ResourceId
    WHERE refSource.SearchParamId = @p11
        AND refTarget.IsHistory = 0
        AND refSource.IsHistory = 0
        AND refSource.ResourceTypeId = @p12
        AND refSource.ResourceSurrogateId IN (SELECT TOP(@p13) Sid1 FROM cte3)
    ),
cte9 AS
    (
    SELECT DISTINCT TOP (@p14) Sid1, IsMatch, CASE WHEN count(*) over() > @p15 THEN 1 ELSE 0 END AS IsPartial
    FROM cte8
    ),
cte10 AS
    (
    SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
    FROM dbo.ReferenceSearchParam refSource
    INNER JOIN dbo.Resource refTarget
    ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
        AND refSource.ReferenceResourceId = refTarget.ResourceId
    WHERE refSource.SearchParamId = @p11
        AND refTarget.IsHistory = 0
        AND refSource.IsHistory = 0
        AND refSource.ResourceTypeId = @p12
        AND refSource.ResourceSurrogateId IN (SELECT TOP(@p16) Sid1 FROM cte7)
),

@johnstairs Yes, this is because SQLQueryGenerator uses Microsoft.Health.SqlServer.IndentedStringBuilder which has no option to un-indent (removing chars from the end doesn;t work and is quite ugly).
I don't see how we can work around it.

Note that when calling recursively to AcceptVisitor, in the case where we are iterating over multiple results sets, that is done within an already indented expression sue to the main call:

           StringBuilder.AppendDelimited($",{Environment.NewLine}", expression.TableExpressions, (sb, tableExpression) =>
            {
                sb.Append(TableExpressionName(++_tableExpressionCounter)).AppendLine(" AS").AppendLine("(");

                using (sb.Indent())
                {
                    tableExpression.AcceptVisitor(this, context);
                }

                sb.Append(")");
            });

@limorl
Copy link
Author

limorl commented Oct 13, 2020

In cte4 below, we apply a top clause on the resultset from cte3 we use for the intersection:

      cte3 AS
      (
          SELECT DISTINCT Sid1, IsMatch, 0 AS IsPartial
          FROM cte2
      ),
      cte4 AS
      (
          SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
          FROM dbo.ReferenceSearchParam refSource
          INNER JOIN dbo.Resource refTarget
          ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
              AND refSource.ReferenceResourceId = refTarget.ResourceId
          WHERE refSource.SearchParamId = @p5
              AND refTarget.IsHistory = 0
              AND refSource.IsHistory = 0
              AND refSource.ResourceTypeId = @p6
              AND refSource.ResourceSurrogateId IN (SELECT TOP(@p7) Sid1 FROM cte3)
      )

We should not only be applying iterate to the first ten includes in cte3. (We've also discussed that we should apply an overall limit to included resources like we do for _revinclude, though I'm happy to take that as a separate PR)

True, but I thought we agreed it will be in a separated PR. We even have an open issue for it: #1309

@johnstairs
Copy link
Member

Regarding the indentation, let's revisit later. We can expose the indentation level as a property on the writer, but I think I would look at an approach that does not require recursion. But later...

@johnstairs
Copy link
Member

johnstairs commented Oct 14, 2020

It looks like we still are applying a TOP clause on the intersection for non-iterate includes:

      DECLARE @p0 SmallInt = 77;
      DECLARE @p1 VarChar(64) = '1';
      DECLARE @p2 Int = 11;
      DECLARE @p3 SmallInt = 828;
      DECLARE @p4 Int = 10;

      SET STATISTICS IO ON;
      SET STATISTICS TIME ON;

      WITH cte0 AS
      (
          SELECT ResourceSurrogateId AS Sid1
          FROM dbo.Resource
          WHERE IsHistory = 0
              AND IsDeleted = 0
              AND ResourceTypeId = @p0
              AND ResourceId = @p1

      ),
      cte1 AS
      (
          SELECT DISTINCT TOP (@p2) Sid1, 1 AS IsMatch, 0 AS IsPartial
          FROM cte0
          ORDER BY  Sid1 ASC
      ),
      cte2 AS
      (
          SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
          FROM dbo.ReferenceSearchParam refSource
          INNER JOIN dbo.Resource refTarget
          ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
              AND refSource.ReferenceResourceId = refTarget.ResourceId
          WHERE refSource.SearchParamId = @p3
              AND refTarget.IsHistory = 0
              AND refSource.IsHistory = 0
              AND refSource.ResourceTypeId = @p0
              AND refSource.ResourceSurrogateId IN (SELECT TOP(@p4) Sid1 FROM cte1)
      ),
      cte3 AS
      (
          SELECT DISTINCT Sid1, IsMatch, 0 AS IsPartial
          FROM cte2
      ),
...

I don't know if that's new from this PR...

@limorl
Copy link
Author

limorl commented Oct 17, 2020

It looks like we still are applying a TOP clause on the intersection for non-iterate includes:

      DECLARE @p0 SmallInt = 77;
      DECLARE @p1 VarChar(64) = '1';
      DECLARE @p2 Int = 11;
      DECLARE @p3 SmallInt = 828;
      DECLARE @p4 Int = 10;

      SET STATISTICS IO ON;
      SET STATISTICS TIME ON;

      WITH cte0 AS
      (
          SELECT ResourceSurrogateId AS Sid1
          FROM dbo.Resource
          WHERE IsHistory = 0
              AND IsDeleted = 0
              AND ResourceTypeId = @p0
              AND ResourceId = @p1

      ),
      cte1 AS
      (
          SELECT DISTINCT TOP (@p2) Sid1, 1 AS IsMatch, 0 AS IsPartial
          FROM cte0
          ORDER BY  Sid1 ASC
      ),
      cte2 AS
      (
          SELECT DISTINCT refTarget.ResourceSurrogateId AS Sid1, 0 AS IsMatch
          FROM dbo.ReferenceSearchParam refSource
          INNER JOIN dbo.Resource refTarget
          ON refSource.ReferenceResourceTypeId = refTarget.ResourceTypeId
              AND refSource.ReferenceResourceId = refTarget.ResourceId
          WHERE refSource.SearchParamId = @p3
              AND refTarget.IsHistory = 0
              AND refSource.IsHistory = 0
              AND refSource.ResourceTypeId = @p0
              AND refSource.ResourceSurrogateId IN (SELECT TOP(@p4) Sid1 FROM cte1)
      ),
      cte3 AS
      (
          SELECT DISTINCT Sid1, IsMatch, 0 AS IsPartial
          FROM cte2
      ),
...

I don't know if that's new from this PR...

It wasn't introduced by this PR. I only removed the TOP for iterate expressions since they select from an IncludeLimit cte.

I think we should change the non-iterate Top in a separate PR. The include limit feature is not tested AFAIK and there's a bit of a challenge to test it by providing a test configuration. It deserves attention.
I think it should be part of this issue: #1309

@limorl
Copy link
Author

limorl commented Oct 18, 2020

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@limorl
Copy link
Author

limorl commented Oct 18, 2020

@johnstairs The build seems to fail on issues not related to the PR changes, but rather environment setup.
For example, provisioning environment, while a previous environment exists:
Uploading image.png…

@LTA-Thinking
Copy link
Contributor

@limorl Sorry, your PR environment got stuck in a bad state due to a provisioning issue we were having last week. I'm cleaning up the environment and will kick off a new build.

@LTA-Thinking
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

private class IncludeExpressionTopologicalSort
{
// Based on Khan's algorithm. See https://en.wikipedia.org/wiki/Topological_sorting.
// The search queries are acyclic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know that they are acyclic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a better way to say it was that we are not supporting cyclic queries (cyclic queries are actually recursive queries). According to the Fhir Team, it's up to the server implementor to decide how deep.
We agreed that recursive queries are out of scope.

If it is a cyclic query (recursive query), we can throw an error. It's really a product decision.

See the FHIR chat: https://chat.fhir.org/#narrow/stream/179166-implementers/topic/_include.3Arecurse.20(_include.3Aiterate).20ambiguity

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should throw an error if we encounter a cycle.

{
// Based on Khan's algorithm. See https://en.wikipedia.org/wiki/Topological_sorting.
// The search queries are acyclic.
public static IList<TableExpression> Sort(IEnumerable<TableExpression> includeExpressions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complexity of this implementation is pretty high. My thought would have been to think of it in this way:

Suppose you have a query like

/MedicationDispense?_include=MedicationDispense:context&_include:iterate=Encounter:appointment&_include:iterate=Encounter:based-on&_include:iterate=Appointment:patient&_include:iterate=ServiceRequest:specimen

I would model it in this way:

The edges arise from Requires to Produces, indicating a dependency. I would then do a DFS to create a topological sort. You would need a dictionary mapping a type name to the nodes that "produce it".

What am I missing?

Copy link
Author

@limorl limorl Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's pretty much what I did. Only that I was looking at all combinations of _include and _revinclude pairs.
Note that for _revinclude , the Requires and Produces are reversed. For example _revinclude=MedicationDispense:prescription requires MedicationRequest and produces MedicationDispense.

So basically, the direction of the edge is determined by the IsDependencyEdge() function.
The runtime of the topological sort (Khan's algorithm) is O(V+E) and the implementation is simple IMO.
And building the graph was O(V^2).

Copy link
Author

@limorl limorl Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we could do to make it cleaner and more intuitive is basically add IReadOnlyCollection Requires and IReadOnlyCollection Produces properties to IncludeExpression. This way IsDependencyEdge(x,y) is replaced with asking if x.Requires.Intersection(y.Produces).Any().
You can see how it looks like on my private branch: limorl@4e9a31d

The sort itself is simple enough and of the same complexity of DFS O(V+E), after the graph is built.

Are you ok with my suggestion above or should we keep as is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a lot clearer. I've expanded on it to show you what I had in mind: ca25c8b. (Quickly implemented - would not claim it's properly tested. Also would need to add a proper error message for the cycle detection and a test)

Copy link
Author

@limorl limorl Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice commit! I like the a AssertAppearsBefore, I was thinking about implementing it, but it wasn't needed in most tests after all

I'll adopt some of your changes, but continue with Khan's implementation if that's ok.

I will add the error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go for it

}
}
}
throw new InvalidSearchOperationException("Cyclic include iterate queries are not allowed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not supported". And please put the string in a resx file.

Copy link
Author

@limorl limorl Oct 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops.. Saw that after I merged.
The reason I didn;t add the string to resx is that I wanted to be consistent with how all the Rewriters throw InvalidSearchOperationException.

Will fix in a new PR

@limorl limorl merged commit 981402d into microsoft:master Oct 23, 2020
This was referenced Oct 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants