
Fix broken links in the documentation #23631

@lesteve


Below is the list of broken links in the documentation from a make linkcheck run, together with the file each link appears in and the error message.

If you want to work on this, please:

  • do one Pull Request per link
  • add a comment in this issue saying which link you want to tackle so that different people can work on this issue in parallel
  • mention this issue (#23631) in your Pull Request description so that progress on this issue can more easily be tracked

Possible solutions for a broken link include:

  • Find a replacement for the broken link. For links to articles, prefer a resource where the article is openly accessible rather than behind a paywall.
  • Add the link to the linkcheck_ignore variable in the Sphinx configuration. This is the only option when, for example:
    • the link is broken with no replacement (for example, in testimonials, some companies were acquired and their website no longer exists)
    • the link works fine in a browser but is flagged as broken by the make linkcheck tool. This may happen because some websites try to prevent bots from scraping their content
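
As a rough illustration of the linkcheck_ignore option above, here is a minimal sketch of what such a configuration fragment could look like. Sphinx treats each entry as a regular expression matched against the start of the URL. The two entries are examples taken from the list in this issue, the broad ResearchGate pattern is an assumption rather than a recommendation, and `is_ignored` is a hypothetical helper for checking patterns locally, not part of Sphinx.

```python
# Sketch of a Sphinx conf.py fragment; entries are illustrative only.
import re

linkcheck_ignore = [
    # Works in a browser but times out under `make linkcheck`.
    r"http://users\.jyu\.fi/~samiayr/pdf/ayramo_eurogen05\.pdf",
    # Assumption: ignore every ResearchGate publication page, since the
    # site answers 403 Forbidden to automated requests.
    r"https://www\.researchgate\.net/publication/.*",
]


def is_ignored(url, patterns=linkcheck_ignore):
    """Hypothetical helper: True if any pattern matches the start of url."""
    return any(re.match(pattern, url) for pattern in patterns)
```

Escaping the dots keeps a pattern from matching unrelated hosts, and a quick call such as `is_ignored("https://www.researchgate.net/publication/4974606_Hedonic_housing_prices_and_the_demand_for_clean_air")` can confirm that a pattern covers the link you intend before rerunning make linkcheck.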

In complicated cases, it may help to search for the broken link on the Internet Archive. Looking at the old content may help you find an appropriate replacement link.
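
The Internet Archive lookup can also be scripted. The sketch below assumes the Wayback Machine availability endpoint at https://archive.org/wayback/available and the JSON layout it is understood to return (an "archived_snapshots" object with a "closest" entry); the function names are hypothetical, and the response format should be checked against the Internet Archive documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def wayback_query_url(broken_url):
    """Build the availability query for a broken link."""
    return WAYBACK_API + "?" + urlencode({"url": broken_url})


def closest_snapshot(payload):
    """Return the closest archived snapshot URL from a decoded response,
    or None if the payload reports no available snapshot."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(broken_url, timeout=10):
    """Query the API over the network and return a snapshot URL or None."""
    with urlopen(wayback_query_url(broken_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

For example, `find_snapshot("https://pythonhosted.org/joblib/memory.html")` would return an archived copy's URL if one exists. An archived page is mainly a way to identify what the link used to point to; a live replacement is still preferable in the documentation.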

  • http://blanche.polytechnique.fr/~mallat/papiers/MallatPursuit93.pdf modules/generated/sklearn.linear_model.OrthogonalMatchingPursuit.rst
    403 Client Error: Forbidden for url: http://blanche.polytechnique.fr/~mallat/papiers/MallatPursuit93.pdf
    
  • http://scgroup.hpclab.ceid.upatras.gr/faculty/stratis/Papers/HPCLAB020107.pdf modules/decomposition.rst
    404 Client Error: Not Found for url: https://scgroup.hpclab.ceid.upatras.gr/faculty/stratis/Papers/HPCLAB020107.pdf
    
  • http://seat.massey.ac.nz/personal/s.r.marsland/Code/10/lle.py modules/generated/sklearn.datasets.make_swiss_roll.rst
    403 Client Error: Forbidden for url: http://seat.massey.ac.nz/personal/s.r.marsland/Code/10/lle.py
    
  • http://users.jyu.fi/~samiayr/pdf/ayramo_eurogen05.pdf modules/linear_model.rst (see #23679: DOC Link works fine, added it to linkcheck_ignore)
    HTTPConnectionPool(host='users.jyu.fi', port=80): Max retries exceeded with url: /~samiayr/pdf/ayramo_eurogen05.pdf (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f02da35c340>, 'Connection to users.jyu.fi timed out. (connect timeout=10)'))
    
  • http://www.ats.ucla.edu/stat/r/dae/rreg.htm modules/linear_model.rst (see #23660: Fixes Robust Regression Example Link From UCLA issue #23631)
    HTTPConnectionPool(host='www.ats.ucla.edu', port=80): Max retries exceeded with url: /stat/r/dae/rreg.htm (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f02dfd53a60>, 'Connection to www.ats.ucla.edu timed out. (connect timeout=10)'))
    
  • http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html datasets/real_world.rst
    404 Client Error: Not Found for url: https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
    
  • http://www.columbia.edu/~jwp2128/Papers/HoffmanBleiWangPaisley2013.pdf modules/decomposition.rst
    404 Client Error: Not Found for url: http://www.columbia.edu/~jwp2128/Papers/HoffmanBleiWangPaisley2013.pdf
    
  • http://www.iucnredlist.org/apps/redlist/details/3038/0 auto_examples/neighbors/plot_species_kde.rst
    404 Client Error: Not Found for url: https://www.iucnredlist.org/apps/redlist/details/3038/0
    
  • http://www.recognition.mccme.ru/pub/papers/SVM/sch99estimating.pdf modules/outlier_detection.rst
    HTTPSConnectionPool(host='www.recognition.mccme.ru', port=443): Max retries exceeded with url: /pub/papers/SVM/sch99estimating.pdf (Caused by SSLError(SSLCertVerificationError("hostname 'www.recognition.mccme.ru' doesn't match 'kvant.ras.ru'")))
    
  • http://www.ttic.edu/sigml/symposium2011/papers/Moore+DeNero_Regularization.pdf modules/generated/sklearn.metrics.hinge_loss.rst
    404 Client Error: Not Found for url: https://www.ttic.edu/sigml/symposium2011/papers/Moore+DeNero_Regularization.pdf
    
  • https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.214.6398&rep=rep1&type=pdf modules/decomposition.rst
    HTTPSConnectionPool(host='citeseerx.ist.psu.edu', port=443): Max retries exceeded with url: /viewdoc/download?doi=10.1.1.214.6398&rep=rep1&type=pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
    
  • https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.227.1802&rep=rep1&type=pdf modules/kernel_approximation.rst
    HTTPSConnectionPool(host='citeseerx.ist.psu.edu', port=443): Max retries exceeded with url: /viewdoc/download?doi=10.1.1.227.1802&rep=rep1&type=pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
    
  • https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.392.8794&rep=rep1&type=pdf modules/linear_model.rst
    HTTPSConnectionPool(host='citeseerx.ist.psu.edu', port=443): Max retries exceeded with url: /viewdoc/download?doi=10.1.1.392.8794&rep=rep1&type=pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
    
  • https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.5164&rep=rep1&type=pdf modules/decomposition.rst
    HTTPSConnectionPool(host='citeseerx.ist.psu.edu', port=443): Max retries exceeded with url: /viewdoc/download?doi=10.1.1.68.5164&rep=rep1&type=pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
    
  • https://dev.pandas.io/docs/development/maintaining.html developers/bug_triaging.rst
    HTTPSConnectionPool(host='dev.pandas.io', port=443): Max retries exceeded with url: /docs/development/maintaining.html (Caused by SSLError(SSLCertVerificationError("hostname 'dev.pandas.io' doesn't match either of '*.numericable.fr', 'numericable.fr'")))
    
  • https://docs.scipy.org/doc/scipy/reference/dev/contributor/development_workflow.html developers/contributing.rst
    404 Client Error: Not Found for url: https://docs.scipy.org/doc/scipy/reference/dev/contributor/development_workflow.html
    
  • https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.reciprocal.html modules/grid_search.rst (see #23697: DOC Fix scipy broken link)
    404 Client Error: Not Found for url: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.reciprocal.html
    
  • https://doi.org/10.13140/RG.2.2.35280.02565 modules/generated/sklearn.cluster.spectral_clustering.rst (see #23739: [MRG] DOC added link to linkcheck_ignore)
    403 Client Error: Forbidden for url: https://www.researchgate.net/publication/354448354?channel=doi&linkId=6138e932a3a397270a8f1300&showFulltext=true
    
  • https://imageio.readthedocs.io/en/latest/userapi.html datasets/loading_other_datasets.rst
    404 Client Error: Not Found for url: https://imageio.readthedocs.io/en/latest/userapi.html
    
  • https://newcircle.com/s/post/1152/scikit-learn_machine_learning_in_python presentations.rst
    HTTPSConnectionPool(host='newcircle.com', port=443): Max retries exceeded with url: /s/post/1152/scikit-learn_machine_learning_in_python (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f02da1007c0>, 'Connection to newcircle.com timed out. (connect timeout=10)'))
    
  • https://pythonhosted.org/joblib/memory.html modules/compose.rst
    404 Client Error: Not Found for url: https://pythonhosted.org/joblib/memory.html
    
  • https://staff.washington.edu/jakevdp presentations.rst
    404 Client Error:  for url: https://staff.washington.edu/jakevdp
    
  • https://trevorhastie.github.io modules/generated/sklearn.metrics.d2_absolute_error_score.rst
    404 Client Error: Not Found for url: https://trevorhastie.github.io/
    
  • https://users.soe.ucsc.edu/~optas/papers/jl.pdf modules/generated/sklearn.random_projection.SparseRandomProjection.rst
    404 Client Error: Not Found for url: https://users.soe.ucsc.edu/~optas/papers/jl.pdf
    
  • https://www.cs.technion.ac.il/~mic/doc/skl-ip.pdf modules/generated/sklearn.decomposition.IncrementalPCA.rst
    HTTPSConnectionPool(host='mic.net.technion.ac.il', port=443): Max retries exceeded with url: //doc/skl-ip.pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))
    
  • https://www.datascience-paris-saclay.fr/ about.rst
    HTTPSConnectionPool(host='www.datascience-paris-saclay.fr', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
    
  • https://www.frs-fnrs.be/-fnrs about.rst
    404 Client Error: Not Found for url: https://www.frs-fnrs.be/fr/-fnrs
    
  • https://www.jstor.org/stable/2984099 modules/generated/sklearn.impute.IterativeImputer.rst
    403 Client Error: Forbidden for url: https://www.jstor.org/stable/2984099
    
  • https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf modules/svm.rst (this link works in a browser; it should be added to linkcheck_ignore, similarly to what was done in #23737: DOC added link to linkcheck_ignore)
    HTTPSConnectionPool(host='www.microsoft.com', port=443): Read timed out. (read timeout=10)
    
  • https://www.numfocus.org/support-numfocus.html about.rst
    403 Client Error: Forbidden for url: https://www.flipcause.com/secure/cause_pdetails/MjM2OA==
    
  • https://www.researchgate.net/publication/233096619_A_Dendrite_Method_for_Cluster_Analysis modules/clustering.rst
    403 Client Error: Forbidden for url: https://www.researchgate.net/publication/233096619_A_Dendrite_Method_for_Cluster_Analysis
    
  • https://www.researchgate.net/publication/4974606_Hedonic_housing_prices_and_the_demand_for_clean_air modules/generated/sklearn.datasets.load_boston.rst (this link works in a browser; it should be added to linkcheck_ignore, similarly to what was done in #23737: DOC added link to linkcheck_ignore)
    403 Client Error: Forbidden for url: https://www.researchgate.net/publication/4974606_Hedonic_housing_prices_and_the_demand_for_clean_air
    
  • https://www.sri.com/sites/default/files/publications/ransac-publication.pdf modules/generated/sklearn.linear_model.RANSACRegressor.rst
    404 Client Error: Not Found for url: https://www.sri.com/sites/default/files/publications/ransac-publication.pdf
    
  • https://www.stat.washington.edu/research/reports/2000/tr371.pdf modules/cross_decomposition.rst
    HTTPSConnectionPool(host='www.stat.washington.edu', port=443): Max retries exceeded with url: /research/reports/2000/tr371.pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))
    
