-
Notifications
You must be signed in to change notification settings - Fork 30
Add :strict context for stricter sanitization
#46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@yujinakayama review please |
|
The name |
|
Typos in the commit message |
| '<i class="fa fa-user"></i>user' | ||
| end | ||
|
|
||
| it "sanitized them" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description should be does not sanitize them or allows them
| '<i class="fa fa-user"></i>user' | ||
| end | ||
|
|
||
| it "sanitized them" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sanitizes
|
Could you extract the RuboCop and CI commits into another PR? 🙏 |
4ec6839 to
961668f
Compare
|
@yujinakayama Updated |
yujinakayama
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a regression where the UserInputSanitizer breaks Markdown blockquotes (> foo) by confusing the angle brackets as HTML tags, since the sanitizer has no knowledge of Markdown and is run against raw Markdown source before the core Markdown renderer (greenmat). I've added a pending spec for it.
1) Qiita::Markdown::Processor#call with strict context with blockquote syntax does not confuse it with HTML tag angle brackets
# No reason given
Failure/Error: should eq "<blockquote>\n<p>foo</p>\n</blockquote>\n"
expected: "<blockquote>\n<p>foo</p>\n</blockquote>\n"
got: "<p>> foo</p>\n"
(compared using ==)
Diff:
@@ -1,4 +1,2 @@
-<blockquote>
-<p>foo</p>
-</blockquote>
+<p>> foo</p>
# ./spec/qiita/markdown/processor_spec.rb:1048:in `block (5 levels) in <top (required)>'
|
Maybe we should merge the Edit: We need not necessarily to handle it right now in this PR. |
Edit: We decided to not handle the problem in this PR for simplicity. |
|
Greenmat filter inserts
We have to fix them ahead. 🏃 |
... so that we can sanitize `class` attribute inputted by user in post-process. Ref: increments/qiita-markdown#46
... so that we can sanitize `class` attribute inputted by user in post-process. Ref: increments/qiita-markdown#46
... so that we can sanitize `class` attribute inputted by user in post-process. Ref: increments/qiita-markdown#46
| "rev" => %w[footnote], | ||
| }, | ||
| "sup" => { | ||
| "id" => /^fnref\d+$/, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We always want to use \A instead of ^, and \z instead of $ when possible.
| private | ||
|
|
||
| def transform_attribute(attr, pattern) | ||
| node.attributes[attr].value = node.attributes[attr].value.split.select do |value| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can write as:
node.attributes[attr].value->node[attr]node.attributes[attr].value =->node[attr] =
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Omitting argument of String#split might be dangerous, since it depends on a special global variable $;:
If pattern is nil, the value of
$; is used. If $ ; is nil (which is the default), str is split on whitespace as if ‘ ’ were specified.
Since we're doing security things here, it's better to always be defensive 😉
|
@yujinakayama Updated 🙇 |
yujinakayama
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
:strict context for stricter stanitization:strict context for stricter sanitization
What
This patch introduces
:strictcontext. When it is given,UserSanitizerfilter sanitizes malicious HTML tags right after core Markdown renderer and before any other filters.Why
Since
Sanitize(renamed asFinalSanitizer) filter is applied at the end of html-pipeline, its rules are intentionally weakened to allow elements and attributes which are generated by other filters.Intentionally weakened?
For example, in previous,
<div>element andclassattribute was allowed so that users can emulate almost all UI components of Qiita in their articles. Since articles are hosted on qiita.com, there is a potential risk of being abused as a phishing site.With
:strictcontext, they are not permitted.