Added in new MissingValueReplacing method. #5205

Merged
michaelgsharp merged 8 commits into dotnet:master from michaelgsharp:mode-missing-values on Jun 10, 2020

@michaelgsharp
Contributor

Adds Mode as a missing-value replacement method. It replaces missing values with the most frequent value in a column. If multiple values share the highest count, the first one encountered is returned.
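As an illustration of the behavior described above, here is a minimal, self-contained sketch of mode replacement on a single `float` column. It is not the code from this PR: the class `ModeSketch` and its methods are hypothetical names, `NaN` is assumed to mark a missing value, and "first one encountered" is read here as first appearance in the column, which may differ from the actual implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Microsoft.ML.
public static class ModeSketch
{
    // Returns the most frequent non-NaN value in the column.
    // Ties go to the value that appears first in the column.
    // Returns NaN when every entry is missing (an assumption of this sketch).
    public static float FindMode(IReadOnlyList<float> values)
    {
        // First pass: count occurrences of each non-missing value.
        var counts = new Dictionary<float, int>();
        foreach (float v in values)
        {
            if (float.IsNaN(v))
                continue; // NaN marks a missing value in this sketch
            counts.TryGetValue(v, out int c);
            counts[v] = c + 1;
        }

        // Second pass: walk the column in order. The strict '>' keeps the
        // earliest value when several values share the highest count.
        float mode = float.NaN;
        int best = 0;
        foreach (float v in values)
        {
            if (float.IsNaN(v))
                continue;
            if (counts[v] > best)
            {
                best = counts[v];
                mode = v;
            }
        }
        return mode;
    }

    // Returns a copy of the column with each missing value replaced by the mode.
    public static float[] ReplaceMissingWithMode(IReadOnlyList<float> values)
    {
        float mode = FindMode(values);
        var result = new float[values.Count];
        for (int i = 0; i < values.Count; i++)
            result[i] = float.IsNaN(values[i]) ? mode : values[i];
        return result;
    }
}
```

For example, calling `ModeSketch.ReplaceMissingWithMode` on the column `{ 1, NaN, 2, 2, 1 }` fills the gap with `1`: both `1` and `2` occur twice, and `1` appears first.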

This also moves a test helper method from OnnxConversionTest.cs into the BaseTestBaseline class so that every test class can use it.

@michaelgsharp michaelgsharp requested review from a team and harishsk June 3, 2020 22:27
@michaelgsharp michaelgsharp self-assigned this Jun 3, 2020
Comment thread test/Microsoft.ML.Tests/OnnxConversionTest.cs Outdated
Comment thread src/Microsoft.ML.Transforms/MissingValueReplacingUtils.cs Outdated
@codecov

codecov Bot commented Jun 4, 2020

Codecov Report

Merging #5205 into master will increase coverage by 0.49%.
The diff coverage is 96.12%.

@@            Coverage Diff             @@
##           master    #5205      +/-   ##
==========================================
+ Coverage   73.08%   73.57%   +0.49%     
==========================================
  Files        1004     1016      +12     
  Lines      187398   190214    +2816     
  Branches    20212    20456     +244     
==========================================
+ Hits       136952   139952    +3000     
+ Misses      44929    44687     -242     
- Partials     5517     5575      +58     
| Flag | Coverage Δ | |
|---|---|---|
| #Debug | 73.57% <96.12%> (+0.49%) | ⬆️ |
| #production | 69.37% <91.73%> (+0.49%) | ⬆️ |
| #test | 87.53% <100.00%> (+0.30%) | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...c/Microsoft.ML.Transforms/MissingValueReplacing.cs | 77.53% <ø> (+0.17%) | ⬆️ |
| ...rosoft.ML.Transforms/MissingValueReplacingUtils.cs | 54.15% <91.73%> (+15.20%) | ⬆️ |
| ...est/Microsoft.ML.TestFramework/BaseTestBaseline.cs | 77.23% <100.00%> (+4.53%) | ⬆️ |
| test/Microsoft.ML.Tests/OnnxConversionTest.cs | 96.62% <100.00%> (-0.19%) | ⬇️ |
| .../Microsoft.ML.Tests/Transformers/NAReplaceTests.cs | 100.00% <100.00%> (ø) | |
| ....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs | 83.19% <0.00%> (-3.37%) | ⬇️ |
| src/Microsoft.ML.AutoML/Sweepers/Parameters.cs | 84.32% <0.00%> (-0.85%) | ⬇️ |
| ...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs | 40.00% <0.00%> (-0.68%) | ⬇️ |
| ...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs | 89.29% <0.00%> (-0.16%) | ⬇️ |
| ....ML.Tests/Transformers/CountTargetEncodingTests.cs | 100.00% <0.00%> (ø) | |
| ... and 39 more | | |

Comment thread src/Microsoft.ML.Transforms/MissingValueReplacingUtils.cs
Comment thread test/Microsoft.ML.Tests/Transformers/NAReplaceTests.cs Outdated
Contributor

@harishsk harishsk left a comment

:shipit:

Contributor

@harishsk harishsk left a comment

🕐

@harishsk
Contributor

harishsk commented Jun 5, 2020

            Append(mlContext.Transforms.NormalizeMinMax("Features")).

Can you please add a separate ONNX test for ReplaceMissingValues with all the supported types of replacements?

Refers to: test/Microsoft.ML.Tests/OnnxConversionTest.cs:581 in 701d9d8.

Comment thread src/Microsoft.ML.Transforms/MissingValueReplacingUtils.cs
Comment thread test/Microsoft.ML.Tests/OnnxConversionTest.cs Outdated
Contributor

@harishsk harishsk left a comment

:shipit:

/// <summary>
/// Replace with the most frequent value of the column.
/// </summary>
Mode = 5
Member

Why did we skip 4 here? It went from 0, 1, 2, 3 and then jumped to 5.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
3 participants