[INFRA] Use linkchecker to verify URLs #79
yarikoptic wants to merge 15 commits into bids-standard:master
|
Hm, did I screw up CircleCI, or is it not enabled at all? I see only the Travis build, which doesn't run mkdocs, so I was enhancing the CircleCI configuration |
|
I enabled it on forks. Push something new; maybe that will trigger it. |
|
OK, quick-and-dirty fixes for the linkchecker findings are proposed now, and the fresh finding now is (edits from maintainers are allowed, so you are welcome to push the proper fix) |
ATM it reveals way too many problems to deal with at once, e.g.:
URL        'http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b'
Name       'http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b'
Parent URL file:///home/yoh/proj/bids/bids-specification/site/04-modality-specific-files/02-magnetoencephalography.html, line 639, col 184
Real URL   http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b
Check time 1.214 seconds
Result     Error: 404 Not Found
I am not sure whether there is any way other than this config file; I could not find one.
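When a report like the one above gets too long to deal with at once, it can help to pull out just the failing entries. Below is a minimal sketch that parses linkchecker-style text output into records and keeps those whose result starts with "Error". The field labels are taken from the excerpt above; the function names (`parse_records`, `broken`) are made up for illustration and are not part of linkchecker.

```python
# Field labels as they appear in the report excerpt (assumed to be stable).
FIELDS = ("URL", "Name", "Parent URL", "Real URL", "Check time", "Result")


def parse_records(text):
    """Split linkchecker-style text output into one dict per reported URL."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        for field in FIELDS:
            if line.startswith(field + " "):
                value = line[len(field):].strip()
                if field == "URL":  # a "URL" line opens a new record
                    records.append({})
                if records:
                    records[-1][field] = value
                break
    return records


def broken(records):
    """Keep only the records whose result is reported as an error."""
    return [r for r in records if r.get("Result", "").startswith("Error")]


SAMPLE = """\
URL        'http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b'
Name       'http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b'
Parent URL file:///home/yoh/proj/bids/bids-specification/site/04-modality-specific-files/02-magnetoencephalography.html, line 639, col 184
Real URL   http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b
Check time 1.214 seconds
Result     Error: 404 Not Found
"""

for record in broken(parse_records(SAMPLE)):
    print(record["Real URL"], "->", record["Result"])
```

Grouping the failing records by "Parent URL" would be a natural next step if the goal is to fix one page at a time.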
|
Does anyone have a clue what is up with CircleCI? I hoped the current run would show the correctly detected anchor, but got an error that is IMHO unrelated to my changes. I will now push the fix for the detected wrong anchor. |
|
Not sure what is going on. This error should not have affected pip 18.0, and master is not failing. Try committing an updated Pipfile.lock |
|
The time will come when I lose my superpowers of breaking things:
$> pipenv lock
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
Traceback (most recent call last):
  File "/usr/bin/pipenv", line 11, in <module>
    load_entry_point('pipenv==11.9.0', 'console_scripts', 'pipenv')()
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pipenv/cli.py", line 512, in lock
    verbose=verbose, clear=clear, pre=pre, keep_outdated=keep_outdated
  File "/usr/lib/python3/dist-packages/pipenv/core.py", line 1140, in do_lock
    vcs_deps = convert_deps_to_pip(project.vcs_packages, project, r=False)
  File "/usr/lib/python3/dist-packages/pipenv/utils.py", line 672, in convert_deps_to_pip
    extra = '{0}+{1}'.format(vcs, deps[dep][vcs])
TypeError: string indices must be integers
Mild wild guess -- probably due to the git+https URL I introduced |
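For what it's worth, the last frame of the traceback indexes `deps[dep][vcs]`, and that raises this kind of `TypeError` whenever `deps[dep]` is a plain string rather than a mapping. The sketch below reproduces that class of error in isolation. It assumes the dependency entry reached that code as a string, which the thread does not confirm. The helper `vcs_requirement` and the example.org URL are hypothetical and only mimic the failing expression; this is not pipenv's code.

```python
def vcs_requirement(deps, dep, vcs="git"):
    """Mimic the failing expression from the traceback: deps[dep][vcs]."""
    return "{0}+{1}".format(vcs, deps[dep][vcs])


# Hypothetical entries; the package name and URL are illustrative only.
as_table = {"linkchecker": {"git": "https://example.org/linkchecker.git"}}
as_string = {"linkchecker": "git+https://example.org/linkchecker.git"}

# A mapping value can be indexed by the VCS name.
print(vcs_requirement(as_table, "linkchecker"))

# A string value cannot be indexed by "git", hence the TypeError.
try:
    vcs_requirement(as_string, "linkchecker")
except TypeError as exc:
    print("TypeError:", exc)
```

If this is what happened, declaring the dependency in table form (or letting `pipenv install` write the entry) would sidestep it, but that remains a guess.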
…he "file"
Got it after manually doing pipenv install http:/... ; pipenv lock
For some reason they aren't installed automagically
|
I almost won! But the damn thing doesn't give up ;) |
|
Oh, Python 3 is not supported by linkchecker yet... Overall this is starting to look uglier rather than more beautiful. |
|
I think that this PR would be a valuable addition. What is holding you back from making it work, @yarikoptic? |
|
hiccup:
singledispatch was not installed
Probably could be easily mitigated by adjusting the pipenv setup to include it explicitly (for py2 only). But I already forgot how to use that pipenv beast... help would be welcome! ;) |
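As a rough illustration of the "py2 only" condition: `functools.singledispatch` ships with Python 3.4 and later, so the third-party `singledispatch` backport is only needed on older interpreters. The sketch below shows that condition on the code side. On the dependency side it would correspond to an environment marker along the lines of `python_version < "3.4"`; how exactly to put that into the pipenv setup is not shown here. The helper name `needs_singledispatch_backport` and the toy `describe` function are made up for the example.

```python
import sys


def needs_singledispatch_backport(version_info=sys.version_info):
    """True when functools.singledispatch is unavailable (Python < 3.4)."""
    return tuple(version_info[:2]) < (3, 4)


if needs_singledispatch_backport():
    # Third-party backport from PyPI; must be installed explicitly.
    from singledispatch import singledispatch
else:
    from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback for types without a registered implementation."""
    return "object"


@describe.register(int)
def _(value):
    return "int"


print(describe(3), describe("x"))
```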
|
Okay, thanks for the summary. Not sure whether linkchecker will support Python 3 anytime soon, though --> linkchecker/linkchecker#40 |
|
Yep, but bids-standard AFAIK can be built under Python 2, right?
I'd rather keep it under Python 3, but that might be my irrational dislike for Python 2.
Right... we could perhaps have a second "environment" or "workflow", whatever one would call it. |
|
Although true, it would then entail setting up the full website-building env on Travis too. |
Ah, I had hoped that linkchecker could work on the source data (Markdown). Okay, then CircleCI might work just as well in a separate py2 workflow.
Okay. It'll have to wait a bit longer then. |
|
Small update: There is an ongoing active effort to make linkchecker py3 compatible (linkchecker/linkchecker#210) so I expect to come back to this one as soon as there is a version to try. |
|
FWIW note -- not yet, e.g. here is the most recent encounter linkchecker/linkchecker#230 (comment) |
|
is this MkDocs plugin perhaps already sufficient? --> https://github.com/manuzhang/mkdocs-htmlproofer-plugin |
|
Who knows? It might well be, and enabling it shouldn't hurt regardless if it makes the build less buggy!
I am just surprised that the plugin has a class with three methods (fewer than 100 lines of code), whereas linkchecker is a huge software project, yet both should serve the same purpose (identifying URLs / links that lead to bad pages). Makes me suspicious 🤔 |
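To give a feel for why a basic check fits in well under 100 lines, here is a minimal sketch of the core idea: collect the `href` values from built HTML and flag the ones whose HTTP status is 400 or higher. It is not the plugin's or linkchecker's actual implementation. `find_bad_links` and the `fetch_status` callable are hypothetical names; the status lookup is injected so the example runs without network access. A larger tool presumably spends its remaining code on things like recursion, anchors, robots.txt, retries, and many URL schemes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_bad_links(html, base_url, fetch_status):
    """Return (url, status) for http(s) links whose status is >= 400.

    fetch_status is any callable mapping a URL to an integer status code.
    """
    collector = LinkCollector()
    collector.feed(html)
    bad = []
    for href in collector.links:
        url = urljoin(base_url, href)
        if urlparse(url).scheme not in ("http", "https"):
            continue  # skip mailto:, file:, etc.
        status = fetch_status(url)
        if status >= 400:
            bad.append((url, status))
    return bad


# Canned statuses stand in for real HTTP requests (illustrative data only).
PAGE = (
    '<a href="/ok">ok</a> '
    '<a href="http://example.org/missing">gone</a> '
    '<a href="mailto:a@b.c">mail</a>'
)
STATUSES = {"http://example.org/ok": 200, "http://example.org/missing": 404}
print(find_bad_links(PAGE, "http://example.org/", STATUSES.get))
```

In real use, `fetch_status` could wrap an HTTP HEAD or GET request; that part is left out on purpose.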
|
I just stumbled over this: https://github.com/davidtheclark/remark-lint-no-dead-urls. It would probably be easy to implement, because we are using remark already: bids-specification/.travis.yml, lines 1 to 10 in e151e41 (and lines 1 to 9 in e151e41). |
|
Shouldn't hurt but seems to only care about external URLs. I wanted to check all the |
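Assuming the concern here is internal links and anchors (the comment above is cut off, so this is a guess), a same-page anchor check can be sketched with the standard library alone: collect every `id` (and legacy `<a name=...>`), collect every `href="#..."`, and report fragments that have no target. `AnchorAudit` and `missing_anchors` are hypothetical names, and cross-page fragments such as `other.html#section` are deliberately not handled.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect anchor targets and same-page fragment links from one page."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        if tag == "a":
            if attrs.get("name"):  # legacy named anchors
                self.ids.add(attrs["name"])
            href = attrs.get("href") or ""
            if href.startswith("#") and len(href) > 1:
                self.fragments.append(href[1:])


def missing_anchors(html):
    """Return the sorted fragment names that have no matching target."""
    audit = AnchorAudit()
    audit.feed(html)
    return sorted(set(audit.fragments) - audit.ids)


# Illustrative page: one valid anchor link and one dangling one.
PAGE = (
    '<h2 id="intro">Intro</h2>'
    '<a href="#intro">back to intro</a>'
    '<a href="#meg-files">MEG files</a>'
)
print(missing_anchors(PAGE))
```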
|
I have worked out a Docker-based recipe locally (since Docker should already be present on Travis/CircleCI), only to realize that CircleCI environments are already Docker environments! D'oh! Learning how to chain the jobs now... A replacement PR might come shortly, or I will comment here that I give up again ;) |
More TODOs outside the scope of this PR:
- "Anchor Link Checking Only Checks One Link Per Thread" (wummel/linkchecker#513) and "link anchors checks broken" (linkchecker/linkchecker#179), so it would be very nice if someone looked into it. Should be something really fixable.
- --check-extern, because apparently quite a number of URLs seem to be broken, including the ones to Cognitive Atlas (e.g. http://www.cognitiveatlas.org/term/id/trm_54e69c642d89b)