Skip to content
This repository was archived by the owner on Nov 24, 2025. It is now read-only.

Conversation

@ocket8888
Copy link
Contributor

What does this PR (Pull Request) do?

  • This PR is not related to any Issue

This is a blueprint for adding Tags to the Traffic Ops API and Traffic Portal UI.

Note that this is not the same thing as Service Categories (from PR #4518), which have semantic meaning, whereas Tags are expressly just Tags.

Which Traffic Control components are affected by this PR?

  • Documentation

What is the best way to verify this PR?

Read the blueprint

The following criteria are ALL met by this PR

  • Tests are unnecessary
  • This PR includes documentation
  • An update to CHANGELOG.md is not necessary
  • This PR includes any and all required license headers
  • This PR does not include a database migration
  • This PR DOES NOT FIX A SERIOUS SECURITY VULNERABILITY

@ocket8888 ocket8888 added the blueprint feature requirements / implementation details label Jun 23, 2020
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine they may be using the "description" field for key-value pairs as well. If the tag is alphanumeric how should someone represent a key-value pair?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They won't, which just means tags won't replace those kinds of description abuse.

Can you give me an example of something that might be used as a key-value pair in a description field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've relaxed the naming constraints to allow _ and =, but I wouldn't recommend actually using tags for key/value pairs. If there's a key that's needed on Delivery Services, I'd rather see it as a proper property of DSes. Otherwise implementing that would essentially mean creating a model of Delivery Service that is mutable at runtime, which seems like a bad idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key-value tags would allow operators to essentially create whatever informational fields that they may want on a delivery service that might not make sense for everyone else's installation. For instance, maybe Company A wants to create "service categories" for delivery services in order to group them and do certain things with those groups outside the scope of TO. Company B might not care about or even want service categories, so instead of it just always being an empty, unused field in Company B's delivery services, it could exist purely in Company A's world via key-value tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you want to put Delivery Services into groups of service categories, say either the "Foo" service category or the "Bar" category, you could use key-value pairs to say serviceCategory: "Foo" | "Bar", or you could just give them either the "Foo" or the "Bar" tag. The only difference is enforced exclusivity, and some semantic meaning (which tags are not supposed to have).

If there's a need to store some data in a way that's validated - either structurally or semantically - and it's not useful outside of your organization, such that you don't want to contribute it to the project, then perhaps a plugin for an endpoint like deliveryservices_categories could be useful?

Copy link
Contributor

@jhg03a jhg03a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Is the same tag allowed to be applied multiple times with different values?

Copy link
Contributor

@jhg03a jhg03a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ocket8888
Copy link
Contributor Author

Question: Is the same tag allowed to be applied multiple times with different values?

There's no notion of "the same tag... with different values". A Tag is uniquely identified by its value and has no other properties, so that wouldn't make sense.

@jhg03a
Copy link
Contributor

jhg03a commented Jun 23, 2020

Question: Is the same tag allowed to be applied multiple times with different values?

There's no notion of "the same tag... with different values". A Tag is uniquely identified by its value and has no other properties, so that wouldn't make sense.

Ok, so we're not talking actual KVP; just single string values. You might expect a lot of tags with magic prefixes like email=.... or contact=...; not that it's your concern as the implementer since tags have no internal TO semantic meaning. As such it might make data entry more error-prone with typos like emial=... and overall requiring operators to know magic word prefixes to do things in their external tooling.

@ocket8888
Copy link
Contributor Author

Are you saying that a current common thing to do is to put email=... in a description field for a Delivery Service? Because obviously if it would go on something like a User I'd just say "use their Email property".

If that's the case, then I don't think Tags are a good way to replace that. If there's a need to know some email address for a Delivery Service, then maybe they should just have their own Email property?

@jhg03a
Copy link
Contributor

jhg03a commented Jun 24, 2020

Are you saying that a current common thing to do is to put email=... in a description field for a Delivery Service? Because obviously if it would go on something like a User I'd just say "use their Email property".

If that's the case, then I don't think Tags are a good way to replace that. If there's a need to know some email address for a Delivery Service, then maybe they should just have their own Email property?

No, it doesn't exist today because the existing LongDesc* fields are used by too many people too many different ways we can't really touch them. Real contact tracking for DS is a thing we've needed a while. Tags let us implement it without changing the software.

@mitchell852 mitchell852 added Traffic Portal v1 related to Traffic Portal version 1 Traffic Ops related to Traffic Ops labels Apr 14, 2021
@mitchell852
Copy link
Member

mitchell852 commented Apr 14, 2021

The ability to easily attach additional info to things like delivery services would be a HUGE win imo. the ds long_desc_1 and long_desc_2 fields are the result of users needing to do this so they were given 2 fields to throw info into (bad idea imo). other times we simply add yet another column to the DS table (50+ and counting i think) to simply attach more info to a DS. It would be a great feature to be able to create a foo and bar tag and add them to delivery service(s) and then fetch those ds's via GET /ds?tag=foo&tag=bar

Copy link
Contributor

@srijeet0406 srijeet0406 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo comment, rest looks good. This will be a great addition.

@rob05c
Copy link
Member

rob05c commented May 18, 2021

For the record, I don't agree with adding "tags." We shouldn't be adding more unstructured, unrelated data to Traffic Ops.

We already have unstructured data, Profiles and Parameters, and they're a nightmare. They're constantly wrong in Production, Parameters get copied to the wrong Profile, unstructured and unverified text gets typos and then silently doesn't take effect. I can search our Prod CDN and find dozens if not hundreds of wrong Parameters in minutes, and often nobody even knows what the right value should be. And then at best Production isn't operating how it's supposed to; at worst, the CDN goes down.

Tags don't solve any of these problems. We should never, ever change behavior or act on Tags within Traffic Control. If a behavior change or configuration is needed, we already have mechanisms for that, structured fields on objects like Servers and Delivery Services, or if we don't want to do the right thing, we already have unstructured and unverified Parameters.

Moreover, Traffic Control shouldn't store data unrelated to CDN. It makes the control plane confusing, storing data irrelevant to CDN operation in the control plane itself. Worse, when it does become relevant, someone says "well just change the behavior based on that data," and now we have obscure, usually undocumented behavior based on data fields that are "mostly" unrelated to control.

CDN Operators should store data unrelated to control in an Asset Management System, not the Traffic Ops control database. I would fully support us integrating with popular Asset Management Systems, for example in the Traffic Portal UI. But adding that data to the ATC control database is confusing and wrong, and dangerous when people later decide to act on it.

If we shouldn't be acting on the data, and we shouldn't be storing data we don't act on, it shouldn't exist.

This feature makes operating the CDN more dangerous, more error-prone, and more confusing.

I know "tags" are popular lately, and serve a good purpose on many platforms. But ATC isn't Github or a blog, it's a system for operating a CDN, and tags aren't beneficial in this context, they're dangerous and confusing.

But I realize I'm in the minority, so I won't -1.

@rawlinp
Copy link
Contributor

rawlinp commented May 18, 2021

Tags are meant to be purely informational only. If anything within TO actually acts upon or changes behavior based on tag data, that is a bug. Think "long_desc_1" and "long_desc_2" instead of profiles/parameters. The former is purely informational from ATC's perspective, whereas the latter actually affects system behavior.

I think there is tremendous power in being able to provide arbitrary data on ATC resources (servers, DSes, phys_locations, etc) that ATC is not itself responsible for whatsoever (other than providing the ability to CRUD that arbitrary data). People can add whatever arbitrary data they want, ensure that they conform to whatever external schemas they want, and make external systems act on that data however they want. As long as it's purely informational from ATC's perspective, I don't really have an issue with tags. It's not really profiles/parameters++.

@jhg03a
Copy link
Contributor

jhg03a commented May 18, 2021

Question: Is the same tag allowed to be applied multiple times with different values?

There's no notion of "the same tag... with different values". A Tag is uniquely identified by its value and has no other properties, so that wouldn't make sense.

Ok, so we're not talking actual KVP; just single string values. You might expect a lot of tags with magic prefixes like email=.... or contact=...; not that it's your concern as the implementer since tags have no internal TO semantic meaning. As such it might make data entry more error-prone with typos like emial=... and overall requiring operators to know magic word prefixes to do things in their external tooling.

As an operator, I don't intend to use tags as purely informational. They become a means through which I can supplement ATC with information necessary to automate processes required to operate a CDN.

@ocket8888
Copy link
Contributor Author

Tags don't solve any of these problems.

Tags aren't trying to solve those problems, because:

We should never, ever change behavior or act on Tags within Traffic Control.

... is what's specified in the blueprint:

"A Tag is nothing more than a unique name, which can be associated with ... objects..."

and

"Tags have no operational meaning..."

I can certainly make it more clear if you like, since those statements are pretty far apart in the document.

"Moreover, Traffic Control shouldn't store data unrelated to CDN. It makes the control plane confusing, storing data irrelevant to CDN operation in the control plane itself."

I think you could make the same argument for any description fields that exist, or lastUpdated or ILO interface details etc. etc.

Giving users a way to tag things is certainly meta data, but if it's beneficial to operators of the CDN then I wouldn't call it "irrelevant".

@ocket8888
Copy link
Contributor Author

As an operator, I don't intend to use tags as purely informational. They become a means through which I can supplement ATC with information necessary to automate processes required to operate a CDN.

That's your prerogative, but ATC components are forbidden from interpreting any tag or lack thereof with semantic meaning.

@jhg03a
Copy link
Contributor

jhg03a commented May 19, 2021

As an operator, I don't intend to use tags as purely informational. They become a means through which I can supplement ATC with information necessary to automate processes required to operate a CDN.

That's your prerogative, but ATC components are forbidden from interpreting any tag or lack thereof with semantic meaning.

That's fine. It's not expected that ATC itself knows anything from their use, but rather external supplementary code around ATC that would. It is admittedly dangerous because there's no safety in doing so since tags have no enforceable structure (other than uniqueness to the object to which they are attached).

@genzgd
Copy link

genzgd commented Jun 1, 2021

Maple/analytics is the quintessential use case for It's not expected that ATC itself knows anything from their use, but rather external supplementary code around ATC that would. Delivery services currently need certain groupings for specific types of operational and analytics queries, and there's not an obvious system (like an asset management system) to keep that data. Perhaps some of Rob's concerns could be alleviated by controlling the allowed tags for a particular TC installation? At least in that case if there was semantic meaning it could be described in the enumerated tag list, and typos/etc would be somewhat mitigated.

As for semantic meaning within TC, it basically becomes a rehash of continuing epic battle of dynamic versus static types. I would actually come down on the side of allowing semantic meaning, mostly because first (as Jeremy pointed out) we already have a huge number of columns in the database, and second, actually getting new fields into production takes absolutely forever (serviceCategory -- didn't that take something like a year?). If we'd had tags we could have done that without a code change. While I appreciate the theoretical goal of have precise and fixed meaning for every TC related field, I think the cost in flexibility and "agility" is too high.

@rawlinp
Copy link
Contributor

rawlinp commented Jun 1, 2021

I would actually come down on the side of allowing semantic meaning, mostly because first (as Jeremy pointed out) we already have a huge number of columns in the database, and second, actually getting new fields into production takes absolutely forever (serviceCategory -- didn't that take something like a year?). If we'd had tags we could have done that without a code change. While I appreciate the theoretical goal of have precise and fixed meaning for every TC related field, I think the cost in flexibility and "agility" is too high.

I've been kicking around the idea in my head of just adding some kind of JSON extensions field to things like delivery services. Contrary to tags, these fields would actually have semantic meaning (with full API validation), but they'd exist within a jsonb column instead of a more static type. For example, when adding new fields to delivery services, if we chose to add them to the extensions jsonb column, we would not need to:

  • rev the API minor version
  • create/run a DB migration

Clients would have to preserve all "unknown" fields within the extensions JSON, which would make the Go client experience a little tricky, but the Go client's existing behavior of basically dropping all unknown fields when it unmarshals JSON responses is one of the big reasons why we need to follow Semantic Versioning in the first place.

This would definitely increase our agility in terms of adding new fields, but our release agility is still a bit hampered by a number of other things, one of which being TO API major versions (which an extensions jsonb column wouldn't really fix).

That said, there does seem to be a growing movement of "just add new DS fields as DS profile parameters, then make them real DB columns later" in order to get a more immediate deployment of features to things like t3c. Had we done that for Service Categories, it could've been deployed and used much sooner.

@genzgd
Copy link

genzgd commented Jun 1, 2021

I like the flexibility of a JSON column, but in my experience such things are really subject to abuse very quickly, leading to a spaghetti mess of client code that has to parse and interpret lots of nested JSON. Unfortunately I don't think there's one right answer to the perfect balance between flexibility and control.

I currently hate profile parameters in large part because profiles have no include/nesting/combine functionality so each permutation requires a new profile name, and though I haven't kept up on the details it would be super nice to get Rob's proposal to fix that into TC.

@jhg03a
Copy link
Contributor

jhg03a commented Jun 1, 2021

Clients would have to preserve all "unknown" fields within the extensions JSON, which would make the Go client experience a little tricky, but the Go client's existing behavior of basically dropping all unknown fields when it unmarshals JSON responses is one of the big reasons why we need to follow Semantic Versioning in the first place.

That's not unique to Go. Any client implementation has these same problems, they just may have better tools to adapt.

I'll also just throw out that this is one of the reasons I'm a fan of GraphQL over REST. These rigid all-inclusive objects aren't a problem with GraphQL because as a client you request only what you want and not the entire thing. If 30 arbitrary fields are added to an object, it doesn't change any of my one-way requests, and I'm aware of them at runtime whenever the published graphql schema is updated.

@smalenfant
Copy link
Contributor

Seems like this have stopped. This would be the #1 feature needed. This solves so many issues about automation and inventory use cases.

@ocket8888
Copy link
Contributor Author

I'm certainly still open to implementing this and responding to blueprint feedback

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

blueprint feature requirements / implementation details Traffic Ops related to Traffic Ops Traffic Portal v1 related to Traffic Portal version 1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants