@westonruter
Owner

Add ability to restrict the set of languages which are used for auto-detection.

Usage:

add_filter( 'syntax_highlighting_code_block_auto_detect_languages', function() {
    return [ 'ruby', 'python', 'perl' ];
} );

Fixes #34.

@westonruter
Owner Author

@allejo Thoughts on this?

Contributor

@allejo allejo left a comment

I haven't tested this code myself, but it looks logically sound to me.

Comment on lines +281 to +283
if ( ! empty( $auto_detect_languages ) ) {
    $highlighter->setAutodetectLanguages( $auto_detect_languages );
}
Contributor

This is good! Without this check, setting autodetect to an empty array would still work; however, it would trigger the highlighter to reindex all of its languages.

Owner Author

That's right.

What is the behavior when the array is empty? Does it iterate over all languages in alphabetical order? If so, perhaps a default should be devised based on how popular the languages are?

Contributor

@allejo allejo Feb 17, 2020

When the highlighter is configured to have 0 auto-detect languages, either by calling setAutodetectLanguages explicitly or setting the subset to null in highlightAuto, it will use every language registered (all ~185 languages) via brute-force.

See: https://github.com/scrivo/highlight.php/blob/master/Highlight/Highlighter.php#L783-L791
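The fallback described above can be modelled in a few lines. This is only a rough sketch, not highlight.php's actual implementation: `resolve_candidate_languages()` and its parameters are hypothetical names, and skipping unregistered names is an assumption made for the illustration.

```php
<?php
/**
 * Hypothetical model of the auto-detect fallback; NOT highlight.php's own code.
 *
 * @param string[]      $registered All languages registered with the highlighter.
 * @param string[]|null $subset     Requested auto-detect subset (may be empty or null).
 * @return string[] Languages that auto-detection would try.
 */
function resolve_candidate_languages( array $registered, ?array $subset ): array {
    // An empty or null subset means "no restriction": every registered
    // language is tried, which is the brute-force case.
    if ( empty( $subset ) ) {
        return $registered;
    }

    // Assumption: names that are not registered are dropped. The order of the
    // requested subset is kept, and keys are reindexed from 0.
    return array_values( array_intersect( $subset, $registered ) );
}
```

Under this model, passing `[]` or `null` yields the full registered list, while passing `[ 'ruby', 'python', 'perl' ]` limits detection to whichever of those three are registered.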

Contributor

@allejo allejo Feb 17, 2020

As for having a default set of languages to autodetect:

Owner Author

> When the highlighter is configured to have 0 auto-detect languages, either by calling setAutodetectLanguages explicitly or setting the subset to null in highlightAuto, it will use every language registered (all ~185 languages) via brute-force.

And this brute force of all 185 languages is the default behavior?

> As for having a default set of languages to autodetect:

Yeah, I suppose that would belong in this plugin. If we could somehow determine the relative popularity of each language, for example by frequency of use on GitHub, then that ranking could be the default value used here when calling setAutodetectLanguages. Users could then filter it to modify the order.
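As a rough illustration of the popularity idea, the ordering step might look like the sketch below. Everything in it is hypothetical: `sort_languages_by_popularity()` is an invented helper, and the popularity scores a caller passes in would have to come from some real data source, which this thread does not identify.

```php
<?php
/**
 * Hypothetical helper: order language names by a popularity score.
 *
 * @param string[]         $languages  Language names to order.
 * @param array<string,int> $popularity Assumed score per language; higher means more popular.
 * @return string[] Languages sorted from most to least popular.
 */
function sort_languages_by_popularity( array $languages, array $popularity ): array {
    usort(
        $languages,
        function ( $a, $b ) use ( $popularity ) {
            // Higher score first; languages without a score count as 0.
            $cmp = ( $popularity[ $b ] ?? 0 ) <=> ( $popularity[ $a ] ?? 0 );

            // Break ties alphabetically so the result is deterministic.
            return 0 !== $cmp ? $cmp : strcmp( $a, $b );
        }
    );

    return $languages;
}
```

A list produced this way could serve as the default value that the `syntax_highlighting_code_block_auto_detect_languages` filter receives, leaving site owners free to reorder or trim it.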

Contributor

> And this brute force of all 185 languages is the default behavior?

So the default behavior for highlight.php is to use "xml", "json", "javascript", "css", "php", "http" as the default auto-detect languages. However, this is legacy behavior and will change, since highlight.js does not have any default languages (I should clarify this in the README). highlight.js will brute-force its way through all of its registered languages, but it ships different default builds with only some languages, so the brute-force approach doesn't typically go through all 185 languages unless you use a full build. highlight.php only has a full build.

But yes, if you give the autodetection an empty array, the default behavior will be to use all 185 languages.

@westonruter westonruter merged commit a0af262 into master Feb 17, 2020
@allejo allejo deleted the add/auto-detect-languages-filter branch July 4, 2020 04:14

Development

Successfully merging this pull request may close these issues.

Language Auto Detection & Styling

3 participants