
MSC3845: Draft: Expanding policy rooms to reputation #3845

Open
Yoric wants to merge 4 commits into matrix-org:main from Yoric:yoric/opinions

Conversation

@Yoric (Contributor) commented Jul 12, 2022

@Yoric Yoric marked this pull request as draft July 12, 2022 11:17
- a mechanism to store, publish and share actual *actions* against an entity (such as kicking or muting).

The current proposal builds upon policies introduced in MSC2313 to serve as the former, letting
communities share their opinion of an entity as a number in [0, 100). Further tools may be developed
@thibaultamartin (Contributor) Jul 12, 2022
I'm a little concerned about the user-friendliness of this number. Would this number be directly exposed to people trying to build opinion lists on entities? Or would it be hidden by the client in some way (e.g. by never displaying the number itself, but asking whether an entity seems friendly, is close to infringing the CoC, or has actually infringed the CoC)?

@Yoric (Contributor, Author) Jul 12, 2022

I assumed that users would have some kind of slider ranging from "toxic" to "pillar of the community" or similar.

More advanced moderation tools would probably define some well-known values.
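Such well-known values could amount to a simple mapping from the numeric opinion to a label. A minimal TypeScript sketch, where both the labels and the cut-off values are purely illustrative (none of them are defined by the MSC):

```typescript
// Hypothetical client-side mapping from an opinion in [-100, 100] to a
// user-facing label. Thresholds and label names are illustrative only.
type OpinionLabel =
  | "toxic"
  | "suspicious"
  | "neutral"
  | "friendly"
  | "pillar of the community";

function labelForOpinion(opinion: number): OpinionLabel {
  if (opinion <= -50) return "toxic";
  if (opinion < 0) return "suspicious";
  if (opinion === 0) return "neutral";
  if (opinion < 50) return "friendly";
  return "pillar of the community";
}
```

A client could then render the slider position from the same thresholds, keeping the raw number hidden.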

@uhoreg added the labels proposal (a Matrix spec change proposal), kind:feature (MSC for not-core and not-maintenance stuff) and needs-implementation (this MSC does not have a qualifying implementation for the SCT to review; the MSC cannot enter FCP) Jul 12, 2022
Comment thread proposals/XXXX-expanding-policy-rooms-to-reputation.md Outdated
Comment thread proposals/XXXX-expanding-policy-rooms-to-reputation.md Outdated
Comment thread proposals/XXXX-expanding-policy-rooms-to-reputation.md Outdated
@turt2live turt2live changed the title Draft: Expanding policy rooms to reputation MSC3845: Draft: Expanding policy rooms to reputation Jul 12, 2022
@@ -0,0 +1,132 @@
# MSCXXXX: Expanding Policy Rooms towards Distributed Reputation

The Matrix network and protocol are open. However, some users, rooms or servers (let's call them
Contributor


I would personally use the word "party", since it's various parties that have opinions about each other in the case of reputation. Parties can be individual persons or groups/communities. "Persons" in this context means not just natural persons but also legal persons, so companies etc.

I mean, it's just my preferred language for this, but "entities" could also be a choice. In my MSC3784 I like to use the word "stakeholders", but it doesn't apply here in the same way, since we are now talking about interactions between various parties instead of how this affects the stakeholders.

Contributor Author


That makes sense.

In this case, we're building upon MSC2313, which uses "entity", so I'm reusing it because it's the simplest path.

- a mechanism to store, publish and share actual *actions* against an entity (such as kicking or muting).

The current proposal builds upon policies introduced in MSC2313 to serve as the former, letting
communities share their opinion of an entity as a number in [-100, 100]. Further tools may be developed
Contributor


So I personally don't have a good experience with using numbers to rate things. People tend to pick 1-3 numbers and apply them to everyone, and different people use numbers completely differently. E.g. on polls, some people like me never give a score of 10/10, but always 9/10, 5/10 or 1/10.

What we currently do instead is maintain different severity lists. E.g. we have a ToS list and a CoC list, where the ToS list contains unambiguous bans, while CoC violations are often treated differently by different communities. That way you can agree with another community's CoC or ToS on a case-by-case basis, instead of trying to get a whole network to rate things on the same global scale.

TL;DR I don't think numbers are a good way to rate things.

Contributor


I think the idea is that clients would present some kind of label mapping to these ratings, but it probably is something that should be standardized.

Contributor Author


I understand the point. The problem we're trying to solve is that ToS and CoC are sometimes (not always!) too binary.

Here, a few typical cases are:

  1. "first strike" (lose 15 points on your CoC m.opinion), "second strike" (lose 15 more points on your CoC m.opinion), etc. — and program Mjölnir to put users who are below -50 m.opinion on a CoC m.ban;
  2. program Mjölnir to combine the opinion of community A and community B — if they both believe that Marvin is a bad user (for instance they both have an opinion of -50 or worse on Marvin), then ban Marvin.

For these uses m.ban is the (almost) end result, but the m.opinion is an intermediate step to reach that decision.
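The two cases above could be sketched as follows; the function names and the exact numbers (-15 per strike, ban below -50) come from the examples in this comment, not from the MSC:

```typescript
// Hypothetical Mjölnir-style bookkeeping for the two cases above.
// STRIKE_PENALTY and BAN_THRESHOLD are the illustrative values from the text.
const STRIKE_PENALTY = 15;
const BAN_THRESHOLD = -50;

// Each CoC strike lowers the m.opinion value, clamped to the scale's floor.
function applyStrike(opinion: number): number {
  return Math.max(-100, opinion - STRIKE_PENALTY);
}

// Case 1: ban once a single community's opinion sinks below the threshold.
function shouldBan(opinion: number): boolean {
  return opinion < BAN_THRESHOLD;
}

// Case 2: ban only when both communities agree the user is bad (-50 or worse).
function shouldBanCombined(opinionA: number, opinionB: number): boolean {
  return opinionA <= -50 && opinionB <= -50;
}
```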

@deepbluev7 (Contributor) Jul 18, 2022

I do think being able to share your opinion on a user in a more nuanced manner is a reasonable goal. What I am struggling with is how to put a scale on those numbers.

If you want mjolnir to automatically ban when the opinion reaches -50 and you want to automatically ban when the opinion reaches -50 in 2 other communities, you can already reach something very similar to the latter by banning a user only if they are banned in A and B. However if a different community or application uses a different scale for bans and strikes (i.e. -1 per strike and ban on -3), how do you combine 2 of those lists? Since this MSC doesn't put any concrete moderation actions on a scale, you now have a lot of room for ambiguity and to add incompatibilities beween implementations. And it also doesn't solve how to distinguish a ban for "just being annoying" from a ban for something serious like sending CSAM. In the latter case the opinion might be -100, but there isn't really a way to tell in this MSC.

It also leaves a lot of room for new issues. If someone sent CSAM in a room, got -100, but you combine the score by averaging 3 policy lists, they wouldn't cross the -50 threshold. On the other hand, if you naively add the 3 rooms, they might get banned far too early, since a single strike applied by several communities might still add up to a ban. There is also no decay, so any account might be banned eventually, since the -30 score you racked up 5 years ago is still there and a later misunderstanding suddenly puts you over the threshold.

Basically, what I want to say is: while numbers have more states than 2, without attaching concrete units that have a proper meaning at certain intervals, they will need a lot of undocumented knowledge to be useful. Meanwhile, this MSC doesn't really address aging an opinion. Usually an opinion isn't binary, but it also isn't one-dimensional. I think providing an explicit strike system or temporary bans might be more useful than an overly generic mechanism that clients will use in incompatible ways.
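The averaging-vs-adding hazard described in this comment can be made concrete with a small sketch (the thresholds and scores are the illustrative ones from this thread, nothing here is specified anywhere):

```typescript
// Two naive combination operators over three policy lists.
function average(opinions: number[]): number {
  return opinions.reduce((a, b) => a + b, 0) / opinions.length;
}

function sum(opinions: number[]): number {
  return opinions.reduce((a, b) => a + b, 0);
}

const BAN_THRESHOLD = -50;

// A serious offence (-100) noticed by one list, unnoticed by two others:
// averaging keeps the user above the ban threshold.
const severeCase = [-100, 0, 0];

// A single mild strike (-20) on each of three lists: summing pushes
// the user below the threshold far too early.
const minorCase = [-20, -20, -20];
```

average(severeCase) is about -33, so no ban; sum(minorCase) is -60, so an immediate ban: each operator fails one of the two cases.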

Contributor Author


Those are all good points; I'll try to address each of them.

If you want mjolnir to automatically ban when the opinion reaches -50 and you want to automatically ban when the opinion reaches -50 in 2 other communities, you can already reach something very similar to the latter by banning a user only if they are banned in A and B.

That is true. Now what about more nuanced policies such as "you can only start posting links and images if we have reached a state in which we believe that we can trust you"?

However if a different community or application uses a different scale for bans and strikes (i.e. -1 per strike and ban on -3), how do you combine 2 of those lists? Since this MSC doesn't put any concrete moderation actions on a scale, you now have a lot of room for ambiguity and to add incompatibilities between implementations.
[...]
It also leaves a lot of room for new issues. If someone sent CSAM in a room, got -100, but you combine the score by averaging 3 policy lists, they wouldn't cross the -50 threshold. On the other hand, if you naively add the 3 rooms, they might get banned far too early, since a single strike applied by several communities might still add up to a ban.

I agree that, in many cases, neither adding nor averaging is a good operator. My hope is to experiment with operators during prototyping. I have (just) started writing a few experimental patches for Mjölnir to implement opinions, and I expect that this will inform further evolutions of this MSC.

I suspect that, fairly soon, we'll end up needing a full range of operators and that will require an MSC on how to combine policy lists, something that is so far pretty much unspecified.

Basically, what I want to say is: while numbers have more states than 2, without attaching concrete units that have a proper meaning at certain intervals, they will need a lot of undocumented knowledge to be useful.

That makes complete sense. What units would you use?

Meanwhile this MSC doesn't really address aging an opinion. Usually an opinion isn't binary, but it also isn't onedimensional.

I agree that aging is something that needs to be solved, probably both for opinions and for bans. I hope that this can be done further down the line by a subsequent MSC, because I'd like to limit the scope of this MSC.

That being said, you've got me thinking. One way to implement aging could be:

  • add a duration to policies (including m.ban policies);
  • instead of an opinion being an absolute number, make it an operation (for the time being +x or -x, possibly other operations down the line).

Would this make sense to you?
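Concretely, that aging idea could look roughly like this; all field and function names are hypothetical, not part of any MSC:

```typescript
// A policy carries a relative delta and a validity window instead of an
// absolute opinion; the effective opinion is the sum of currently-valid deltas.
interface OpinionDelta {
  delta: number;       // e.g. -15 for a CoC strike
  issuedAtMs: number;  // when the policy was issued
  durationMs: number;  // how long the delta keeps counting
}

function effectiveOpinion(deltas: OpinionDelta[], nowMs: number): number {
  return deltas
    .filter((d) => nowMs >= d.issuedAtMs && nowMs < d.issuedAtMs + d.durationMs)
    .reduce((acc, d) => acc + d.delta, 0);
}
```

An old -30 then simply expires, instead of lingering until a later misunderstanding pushes the account over the threshold.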

And it also doesn't solve how to distinguish a ban for "just being annoying" from a ban for something serious like sending CSAM. In the latter case the opinion might be -100, but there isn't really a way to tell in this MSC.

It also leaves a lot of room for new issues. If someone sent CSAM in a room, got -100, but you combine the score by averaging 3 policy lists, they wouldn't cross the -50 threshold. On the other hand, if you naively add the 3 rooms, they might get banned far too early, since a single strike applied by several communities might still add up to a ban. There is also no decay, so any account might be banned eventually, since the -30 score you racked up 5 years ago is still there and a later misunderstanding suddenly puts you over the threshold.

Your example on CSAM makes complete sense. I believe that the best way to do things would be to have both opinions and absolute bans (the latter could be issued from opinions in some cases, but not all). In this specific example, CSAM would deserve an immediate absolute ban in addition to a low opinion.

I think providing an explicit strike system or temporary bans might be more useful than an overly generic mechanism that clients will use in incompatible ways.

I would really like to have temporary bans and I have a few ideas for them, but they're mostly orthogonal to this work.

Contributor


I think I will just have to experiment with that for a bit. My current idea would be to instead allow others to label content posted in or by an entity, and then let my moderation bot take that data and either put it to a vote by cross-checking multiple lists, or assign a score based on those labels, and then decide to ban based on that instead of some number. I.e. to allow describing a malicious entity in a manner that lets me form an opinion, instead of relying on a numeric opinion. I currently just don't see pure numbers working as well, but that might change once I have some implementation experience with a few different approaches to this.

Thank you for reading through my messy feedback. I think you have some good ideas to address it, but I really just need to try it, I guess. So far the ban process in my communities has been "what was posted" -> "cross-check whether others can confirm or deny that" -> "ban or no ban", rather than giving opinions that can be put into numbers, so I have a hard time picturing that in practice.

@Yoric Yoric marked this pull request as ready for review July 20, 2022 09:42

This MSC does not specify how to combine opinions from two trusted groups. If group A assigns
an opinion of -20 to Marvin and group B assigns an opinion of -10 to Marvin, does this mean
that Marvin should have a total opinion of -30? -15?


It's worth considering the potential drawbacks of a single, all-deciding reputation score. Instead, it may be more useful to allow clients to interpret reputation in a way that is most relevant to their needs.

For example, instead of a single score, you could consider categorizing users as "controversial" or "bad" based on the balance of positive and negative scores they receive. This approach might be particularly useful for mitigating direct message spam, as it would allow clients to identify users who are considered "generally bad" or "mostly good with a few haters" or "bad because one person has rated them so far" and take appropriate action if necessary.
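One way to read those categories is as a classification over the distribution of received scores rather than over a single aggregate. A sketch in which the rules (and the category names borrowed from the comment) are guesses, not anything specified:

```typescript
// Classify a user from the list of scores other entities gave them.
// Category strings follow the comment above; the rules are illustrative.
function categorize(scores: number[]): string {
  const negative = scores.filter((s) => s < 0).length;
  const positive = scores.filter((s) => s > 0).length;
  if (scores.length === 1 && negative === 1) {
    return "bad because one person has rated them so far";
  }
  if (negative > positive) return "generally bad";
  if (negative > 0) return "mostly good with a few haters";
  return "good";
}
```

A client screening direct messages could then treat only the "generally bad" category as actionable.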

The community of Alicites may publish a positive opinion on users that they appreciate. If,
however, the activities of Alicites are illegal in some regions, authorities may decide to
use the opinions published by the Alicites to try and locate users friendly to Alicites.
Similarly, a malicious group of Marvinites may use the opinions published by the Alicites


Although this might not solve the issue of bullying, it's worth considering the potential for malicious actors to manipulate a reputation system, for instance by getting "bad" users to report on others. Users should be discouraged from taking the reputation scores assigned by "bad" users into account, as these could destabilize the reputation system.

Ignoring the reputation scores assigned by negatively scored users can help to prevent situations where a malicious user, such as Marvin in your example, can unjustifiably hurt the reputation of a positively scored user, such as Alice, by introducing a large number of bots that highly recommend her and then misbehave.
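That mitigation (discarding ratings issued by negatively scored raters before aggregating) could be sketched as follows; the types, names and the zero cut-off are assumptions for illustration:

```typescript
// A rating pairs the rater's own standing with the value they assign.
interface Rating {
  raterOpinion: number; // the community's opinion of the rater
  value: number;        // the rater's opinion of the target
}

// Average only the ratings whose rater is not negatively scored.
function trustedMean(ratings: Rating[]): number {
  const trusted = ratings.filter((r) => r.raterOpinion >= 0);
  if (trusted.length === 0) return 0;
  return trusted.reduce((acc, r) => acc + r.value, 0) / trusted.length;
}
```

Marvin's bot farm (raterOpinion of -100) can then praise Alice as loudly as it likes without moving her aggregate score.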

or perhaps be muted for 15 minutes, or lose the ability to post links and images, etc.

To achieve this, we need two mechanisms:
- a mechanism to store, publish and share the *opinion* of a community (or a single user) on an entity;


It's important to consider how users will be able to determine the source of reputation scores and whether they reflect the views of a single user or a broader community. For example, Alice might have difficulty determining whether a reputation score is coming from Bob or a wider consensus, especially if Bob moderates many of the rooms that Alice is a part of.


8 participants