Skip to content

Fixes#1105#1194

Closed
doronp wants to merge 1 commit intomoby:masterfrom
doronp:Fix-#1105
Closed

Fixes#1105#1194
doronp wants to merge 1 commit intomoby:masterfrom
doronp:Fix-#1105

Conversation

@doronp
Copy link
Contributor

@doronp doronp commented Jul 20, 2016

service command should work with id prefix and with name as in network command
see:
#1097
#1096

@doronp
Copy link
Contributor Author

doronp commented Jul 20, 2016

@codecov-io
Copy link

codecov-io commented Jul 20, 2016

Current coverage is 55.07%

Merging #1194 into master will increase coverage by <.01%

@@             master      #1194   diff @@
==========================================
  Files            77         77          
  Lines         12190      12188     -2   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits           6713       6712     -1   
  Misses         4559       4559          
+ Partials        918        917     -1   

Sunburst

Powered by Codecov. Last updated by e0ffaf6...a654a6e

},
},
)
service, err := getServiceByName(ctx, c, input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to lookup by prefix, then let the caller know the name is ambiguous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe See #1097 In Networks we want to have the same behaviour?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any time you are trying to resolve a set to single fuzzy item, this is the behavior to use. In psuedocode, it looks like this:

results := find(query)

switch len(results) {
case 0: not found
case 1: found
default: ambiguous -> error
}

This reduces round trips and ensures that you get the right result. If you don't handle the ambiguous case, you end up with a potential security exploit.

@doronp doronp mentioned this pull request Jul 20, 2016
@stevvooe
Copy link
Contributor

@doronp Thanks for the bugfixes, but do you mind taking the time to give your PRs and commit messages descriptive titles?

@doronp
Copy link
Contributor Author

doronp commented Jul 21, 2016

@stevvooe Done.

@stevvooe
Copy link
Contributor

@doronp Thanks! It makes it a lot easier when we're going through all these issues and commits.

@doronp
Copy link
Contributor Author

doronp commented Jul 27, 2016

Are we good to go with this and PR #1097?

@stevvooe
Copy link
Contributor

@doronp No. You didn't change the code based on the feedback.

@doronp
Copy link
Contributor Author

doronp commented Jul 28, 2016

@stevvooe You must mean your comment about the result length. Can you please explain in which case The code will return an arbitrary result instead of stating ambiguous? I re-read it and Each of the private methods (getServiceByPrefixedID; getServiceByName) each returns a result IFF it has one value.

@stevvooe
Copy link
Contributor

@doronp Do the prefix lookup first, as it is least likely to fail. The way the code is written now, it incurs an extra request to get services by prefix. If you query prefix and it is unique, it should only take a single request.

There are three cases:

  1. Query by prefix and have exact match that is unique. Return the single match.
  2. Query by prefix and have exact match that is also prefix of another record. Return the record that matches exactly.
  3. Query by prefix and have multiple matches sharing the query as the prefix. This case is ambiguous.

We can have a similar set of cases for query by ID Prefix. I am not sure, but there should be a single index that has both names and ids, queryable by prefix, such that this only takes 1 RPC to do in total.

@doronp
Copy link
Contributor Author

doronp commented Aug 3, 2016

@stevvooe

I think we should try name before short ID: full ID -> name -> short ID.

see
PR #1097 (same issue for networks)

I think @aluzzardi @dongluochen @aaronlehmann and ourselves agree that services and networks should behave the same.

What you and @aluzzardi suggest is contradicting. Can we please have a resolution for both issues?
Personally I think that the order suggested by @aluzzardi Is also the order of events in plain docker when querying for a container so it makes sense to me.

@stevvooe
Copy link
Contributor

stevvooe commented Aug 3, 2016

@doronp Yes, they should all behave the same. No, they are not contradictory. I am proposing a matching algorithm that is correct, safe and efficient. This PR is not.

Both git and docker implement an algorithm similar to this and it needs to be done here, as well. Please do it or we cannot merge this PR.

@doronp
Copy link
Contributor Author

doronp commented Aug 4, 2016

@stevvooe
I suspected we misunderstand each other on some basic terminology.

You wish to have:

Do the prefix lookup first

And @aluzzardi

I think we should try name before short ID: full ID -> name -> short ID.

I may have miss understood but isn't short ID==prefix?

@ezrasilvera
Copy link

@stevvooe Just to clarify
(1) when you speak about prefix you mean just ID prefix or also name prefix (which is currently not supported in any command AFAIK)
(2) If you check prefix first (and it's just the ID prefix) it means that if the ID prefix of service A == name of service B you would never be able to access service B by name.

@stevvooe
Copy link
Contributor

@ezrasilvera @doronp Here is the data model.

First, we have a function, given an object that gives you thinks that can look it up. The current function for most objects will returns a name and id:

fn(Object) -> []string

Examples:

fn(task1) -> {<id1>, <name1>}
fn(task2) {<id2>, <name2>}

We call this function "vectorization". We take the result and apply the entries in a common reverse index-space, with tuples of <Result, Object>, usually in a sorted data structure, such as a red-black tree or suffix/patricia/whatever trie.

So, when we have fn(task1) -> {<id1>, <name1>}, we create two entries in our index-space:

<id1> task1
<name1> task1

When we look these up in a dataset, we have no clue whether they are ids or names and we can't make assumptions about their relationships. We can only assume that IDs are unique. We can look at the entries as a table. Here is one with concrete values:

1 task1
2 task2
3 task3
3 task4 # has name 3
33 task5 # has name 33
4 task4
5 task5
6 task6 
6 task7 # has name 6
7 task7
7 task6 # has name 7
bar task2
foo task1
fooer task3

On the left, we have a lookup key and on the right we have the target object (or likely an identifier). We show this lexicographically and each tuple is unique.

With such a dataset, we can serve queries. The query function has the following signature:

fn(query) -> []Object

From our dataset, here are a few results:

fn("fo") -> {task1, task3}
fn("foo") -> {task1, task3}
fn("bar") -> {task2}
fn("3") -> {task3, task4, task5}
fn("6") -> {task6, task7}
fn("7") -> {task6, task7}

Now, we have our problem laid out correctly, we can handle various cases. Given the matched set based on prefix scan of the index, we choose an item or return an error indicating ambiguity.

We have the following rules:

  1. Unique match for prefix or exact match on name or id succeeds.
  2. Multiple matches without exact match returns ambiguous error.
  3. Exact match under multiple items gets selected, favoring id.

The results applied to some queries follow:

Query Result Match Reason
"fo" {task1, task3} Error ambiguous
"foo" {task1, task3} task1 exact name match
"ba" {task2} task2 unique prefix match
"bar" {task2} task2 exact name match
"fo" {task1, task3} Error ambiguous
"3" {task3, task4, task5} task3 id unique match
"6" {task6, task7} task6 id unique match
"7" {task6, task7} task7 id unique match

In this setup, you ensure that every object is accessible, which is guaranteed by the id uniqueness property. We get both name and id prefix matching, with resolved ambiguities. Impact of mutual name-id cycles (task6, task7 above) is minimized by favoring id over name on exact match.

I hope this clarifies the methodology.

@dperny
Copy link
Collaborator

dperny commented Aug 29, 2016

closing in favor of #1279

feel free to reopen if this is in error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants