@jnothman
Member

#3994 handled the min_samples boundary case differently to the prior DBSCAN implementation. This is now clarified in the docs. Unfortunately, when properly testing boundary cases, I found the inconsistency reported at #4072. I fix it here for 'brute' search without tests, pending a complete patch for #4072.
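To make the boundary cases concrete, here is a minimal pure-Python sketch for 1-D points. It is not the scikit-learn implementation; `core_sample_indices` is a hypothetical helper, and it assumes the convention that a point's neighborhood includes the point itself and that a neighbor at distance exactly `eps` counts as inside.

```python
def core_sample_indices(points, eps, min_samples):
    """Return indices of 1-D points that qualify as core samples.

    Assumed convention (hypothetical helper, not the library code):
    the neighborhood includes the point itself, and a neighbor at
    distance exactly eps is inside the neighborhood (<=, not <).
    """
    core = []
    for i, p in enumerate(points):
        # Count every point within eps of p, including p itself.
        n_neighbors = sum(1 for q in points if abs(p - q) <= eps)
        if n_neighbors >= min_samples:
            core.append(i)
    return core


# min_samples counts the point itself, so two points suffice for min_samples=2.
inclusive_of_self = core_sample_indices([0.0, 1.0], eps=2.0, min_samples=2)

# A neighbor at distance exactly eps is inside the neighborhood.
on_boundary = core_sample_indices([0.0, 1.0, 1.0], eps=1.0, min_samples=2)

# With eps slightly smaller, point 0 has only itself as a neighbor.
just_short = core_sample_indices([0.0, 1.0, 1.0], eps=0.99, min_samples=2)
```

Under these assumptions, point 0 is a core sample in the first two calls and not in the third, which is the kind of behavior a boundary test would pin down.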

Member

typo: boundaries

Member Author

fixed

@amueller
Member

amueller commented Mar 3, 2015

This should now be testable, right?

@ogrisel ogrisel changed the title from "[MRG pending #4072] FIX/TST boundary cases in dbscan" to "[MRG] FIX/TST boundary cases in dbscan" Mar 4, 2015
@jnothman
Member Author

jnothman commented Mar 4, 2015

> This should now be testable, right?

Rather, the tests were written elsewhere, in a different patch.

This is now rebased and ready for review.

@amueller
Member

amueller commented Mar 4, 2015

the case with no core samples fails...

@jnothman
Member Author

jnothman commented Mar 5, 2015

Of course I reviewed #4052, but what basis did we have for thinking `X = rng.rand(40, 10); X[X < .8] = 0` would generate data without core samples for `eps=.5, min_samples=5`? I get:

>>> np.bincount((pairwise_distances(X) <= .5).sum(axis=1))
[ 0 18 10  3  4  5]

I've made that test more certain.
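The check above can be sketched without NumPy: count, for each sample, how many samples lie within `eps` (itself included), then histogram those counts. The helpers `neighborhood_size_histogram` and `has_core_samples` and the toy data are hypothetical, chosen only for illustration; they assume the same inclusive `<= eps` convention as the comparison above.

```python
import math


def neighborhood_size_histogram(points, eps):
    """hist[k] is the number of points with exactly k points within eps.

    Each point counts itself, so hist[0] is always 0.
    """
    sizes = [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]
    hist = [0] * (max(sizes) + 1)
    for size in sizes:
        hist[size] += 1
    return hist


def has_core_samples(points, eps, min_samples):
    """True if any point has at least min_samples points within eps."""
    hist = neighborhood_size_histogram(points, eps)
    return any(count > 0 for count in hist[min_samples:])


# Toy data: a tight group of three, a pair, and one isolated point.
points = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.3), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
hist = neighborhood_size_histogram(points, eps=0.5)
core_at_3 = has_core_samples(points, eps=0.5, min_samples=3)
core_at_4 = has_core_samples(points, eps=0.5, min_samples=4)
```

A test that is meant to exercise the "no core samples" path is only reliable when `min_samples` exceeds the largest neighborhood size in the histogram, as with `min_samples=4` on this toy data.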

@amueller
Member

amueller commented Mar 5, 2015

Sorry, that was a hacky test. It probably came from some example that was failing at the time.

Member

Maybe a stupid question but why do you need [1] twice?

@jnothman jnothman closed this in 15c9c0f Mar 5, 2015
alexsavio pushed a commit to alexsavio/scikit-learn that referenced this pull request Mar 9, 2015
rasbt pushed a commit to rasbt/scikit-learn that referenced this pull request Apr 6, 2015
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 11, 2015
* tag '0.16b1': (1589 commits)
  0.16.X branching, version 0.16b1
  Fix scikit-learn#4351. Rendering of docs in MinMaxScaler.
  Fix rebase conflict
  MAINT use canonical PEP-440 dev version consistently
  Adding fix for issue scikit-learn#4297, isotonic infinite loop
  DOC deprecate random_state for DBSCAN
  FIX/TST boundary cases in dbscan (closes scikit-learn#4073)
  Do not shuffle in DBSCAN (warn if `random_state` is used).
  Update docstring predict_proba()
  Update documentation of predict_proba in tree module
  add scipy2013 tutorial links to presentations on website.
  TST boundary handling in LSHForest.radius_neighbors
  ENH improve docstrings and test for radius_neighbors models
  use a pipeline for pre-processing feature selection, as per best practise
  DOC remove unnecessary backticks in CONTRIBUTING.
  ENH no need for tie breaking jitter in calibration
  Implement "secondary" tie strategy in isotonic.
  Adding unit test to cover ties/duplicate x values in Isotonic Regression re: issue scikit-learn#4184
  MAINT fix typo pyagm -> pygamg in SkipTest
  STYLE trailing spaces
  ...