Skip to content

Comments

nixpkgs: poor mans doc blocks wip / rfc / etc#27394

Closed
grahamc wants to merge 18 commits intoNixOS:masterfrom
grahamc:fake-doc-blocks
Closed

nixpkgs: poor mans doc blocks wip / rfc / etc#27394
grahamc wants to merge 18 commits intoNixOS:masterfrom
grahamc:fake-doc-blocks

Conversation

@grahamc
Copy link
Member

@grahamc grahamc commented Jul 15, 2017

There has been a lot of chatter around doc blocks in Nix. Since Nix is almost purely expression based, it is likely very challenging to actually make that work. We might look in to how JSDoc handles it, as they have similar challenges I think.

However, that is complicated. I decided to get creative and try something simple out. Maybe you'll like it, maybe you'll hate it ;)

Basically, where we have:

  /* id returns its parameter
     Example:

    id x
    => x
*/
  id = x: x;

I changed it to:

  docs.id = mkDoc {
    description = "id returns its parameter";
    examples = [
      { title = "basic example";
        body = ''
          id x
          => x
        '';
    ];
  };
  id = x: x;

which get dumped to a docbook file and included in the manual. Currently very WIP, I spent an hour on this, looking for ways to solve this that are easier and more actionable than "we need a better parser."

Example result:
screenshot-2017-7-15 nixpkgs contributors guide

What do y'all think?

cc @joepie91 @domenkozar @edolstra @copumpkin @Profpatsch @zimbatm @vcunat @everyoneelse.

@mention-bot
Copy link

@grahamc, thanks for your PR! By analyzing the history of the files in this pull request, we identified @zimbatm, @edolstra and @Profpatsch to be potential reviewers.

@bjornfor
Copy link
Contributor

Tough one. It does make the generated documentation nice to read, but less so the source itself. On the other hand, it also makes the doc available in nix-repl, which is nice. I'm leaning towards +1 :-)

lib/lists.nix Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type could be his own attribute, it could be re-used for runtime checks as well

lib/lists.nix Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would you think of using a markdown syntax instead?

@zimbatm
Copy link
Member

zimbatm commented Jul 15, 2017

It would be nice if the function could embed the metadata somehow, that would allow to query any function and not have to worry of it's location or where it came from. But that would require a language change.

Even in it's current implementation I think that it's an improvement over writing XML.

@vcunat
Copy link
Member

vcunat commented Jul 16, 2017

Theoretically, nix could be extended to parse documentation comments and expose them via some builtins.getDoc. I don't expect that to be a very difficult change, but certainly not as easy as what you did.

Syntax within the docs seem a partially independent topic. Maybe it would be simpler to read if we didn't solve syntax at nix layer (i.e. write just string lines) and left parsing to the docs generator, perhaps some standard format (markdown, asciidoc, ...).

@grahamc
Copy link
Member Author

grahamc commented Jul 16, 2017

I quite like the idea of doing it without language features, so the functionality can evolve on its own. That said, I'm not particularly attached to any specific method.

I do actually like #23505 better than this, if we could find a way to make it more performant... and of course, joepie thinks in-code docs are not good at all.

@Profpatsch
Copy link
Member

I started thinking about a comment-parsing implementation for the hnix parser of the nix language a few days back.

I do actually like #23505 better than this, if we could find a way to make it more performant...

This indeed looks awesome. Not sure why self is passed into __functor, looks to me like it makes optimizations kind of hard.

@zimbatm
Copy link
Member

zimbatm commented Jul 19, 2017

This indeed looks awesome. Not sure why self is passed into __functor, looks to me like it makes optimizations kind of hard.

nix doesn't bind the function to the object's context so there wouldn't be a way to parameterize the function. It's more obvious with __toString where a reference to self is necessary most of the time. It's a bit like python in that regard.

@zimbatm
Copy link
Member

zimbatm commented Jul 19, 2017

I started thinking about a comment-parsing implementation for the hnix parser of the nix language a few days back.

If comments can be introspected they become part of the code and need to be tested like the rest of the code. It's probably better to keep comments free-form and add a notion of introspectable object annotation to the language. That way comments are reserved to the developer to better understand the code and it's possible to extend the language with more information. Nix values are already a big union struct so more data could be bolted-on.

@zimbatm
Copy link
Member

zimbatm commented Jul 19, 2017

For today, I think that @grahamc's proposal is a good solution for technical code that needs to stay in sync with the code. When/if the language gets extended the documentation can get refactored.

@edolstra
Copy link
Member

Maybe an example like

  body = ''
    id x
    => x
  '';

should be

  input = "id 123";
  output = "123";

That way we get more flexibility in rendering the examples, and they can be tested automatically.

@ericsagnes
Copy link
Contributor

For the record, I use a similar approach to #23505 in styx to generate library documentation and automatic tests, so far it works pretty well with no visible performance impact, but styx codebase size is nowhere close to nixpkgs.

It is still very basic and makes function declarations a lot more verbose, but having generated documentation and tests can really help.

Example of generated docs: https://styx-static.github.io/styx-site/documentation/library.html
Exemple of a test run (with a voluntary error in the expected result to show interesting output):

57 tests run.
- 56 success(es).
- 1 failure(s).

Failures details:

====================
lib.utils.sortBy, example number 1:

code:
---
sortBy "priority" "asc" [ { priority = 5; } { priority = 2; } ]
---

expected:
---
[ {
  priority = 2;
} {
  priority = 3;
} ]
---

got:
---
[ {
  priority = 2;
} {
  priority = 5;
} ]
---
====================

@grahamc
Copy link
Member Author

grahamc commented Sep 17, 2017

Bringing this back from the dead, I realize I never commited docs.nix :( I'm rebasing on #27797, as I'm assuming that'll merge.

@grahamc
Copy link
Member Author

grahamc commented Sep 17, 2017

Ok,You won't like how I did this, but:

    examples =[
      (testEx {
        title = "constructing a concat function with foldr";
        test = rec {
          concat = fold (a: b: a + b) "z";
          ensure = concat [ "a" "b" "c" ] == "abcz";
        };
      })

produced:

screenshot-2017-9-17 nixpkgs contributors guide

and if the ensure is wrong, it'll fail to build the manual. ... just an experiment...:

{
  testEx = test:
    assert test.test.ensure;
    let
      posTest = builtins.unsafeGetAttrPos "test" test;
      posVerify = builtins.unsafeGetAttrPos "ensure" test.test;

      fullContent = lib.strings.splitString "\n" (builtins.readFile posTest.file);
      endsAtVerify = lib.lists.take posVerify.line fullContent;
      onlyTestLines = lib.lists.drop posTest.line endsAtVerify;

      onlyTest = lib.strings.concatStringsSep "\n" onlyTestLines;
      asExample = builtins.replaceStrings ["ensure = " " == "] ["" "\n=> "] onlyTest;

      removeWhitespace = text:
        let
          lines = lib.strings.splitString "\n" text;
          firstLine = builtins.head lines;
          normalized = lib.strings.stringAsChars
            (c: if c == " " then " " else ".")
            firstLine;
          normaSplit = lib.strings.splitString "." normalized;
        in
           lib.strings.concatMapStrings
             (line:
               let
                 p = lib.strings.removePrefix (builtins.head normaSplit) line;
               in "${p}\n")
             lines;

    in ({ title, test }: {
      inherit title;
      body = removeWhitespace asExample;
    }) test;
}

I've removed this bit of code from my branch.

@Profpatsch
Copy link
Member

and if the ensure is wrong, it'll fail to build the manual. ... just an experiment :)

Nice, I like that. I’ve been thinking about how/where we should integrate such tests that they fail as fast as possible. We should chat about this.

@grahamc
Copy link
Member Author

grahamc commented Sep 20, 2017

@grahamc
Copy link
Member Author

grahamc commented Sep 20, 2017

here are some totally bogus benchmarks to see if I slowed things down a lot:

master:

nix-instantiate ./nixos/tests/*.nix  23.40s user 0.59s system 132% cpu 18.070 total
nix-instantiate ./nixos/tests/*.nix  24.51s user 0.61s system 146% cpu 17.193 total
nix-instantiate ./nixos/tests/*.nix  24.44s user 0.57s system 147% cpu 17.006 total
note: discarding 10% highest outliers
maximum RSS:        median = 1395356.0000  mean = 1395950.2222  stddev =  66714.8549  min = 1272640.0000  max = 1488616.0000
soft page faults:   median = 354453.0000  mean = 354176.4444  stddev =  17868.0051  min = 321877.0000  max = 377749.0000
system CPU time:    median =      0.5580  mean =      0.5607  stddev =      0.0273  min =      0.5220  max =      0.6100
user CPU time:      median =     17.0440  mean =     17.0872  stddev =      0.7512  min =     16.2310  max =     18.0400
elapsed time:       median =     19.0006  mean =     19.0188  stddev =      0.7837  min =     18.1093  max =     20.0029

this branch:

nix-instantiate ./nixos/tests/*.nix  25.18s user 0.83s system 123% cpu 21.044 total
nix-instantiate ./nixos/tests/*.nix  25.37s user 0.54s system 153% cpu 16.854 total
nix-instantiate ./nixos/tests/*.nix  24.65s user 0.54s system 150% cpu 16.760 total
nix-instantiate ./nixos/tests/*.nix  23.96s user 0.60s system 150% cpu 16.337 total

note: discarding 10% highest outliers
maximum RSS:        median = 1416476.0000  mean = 1412738.6667  stddev =  59387.1624  min = 1344740.0000  max = 1529156.0000
soft page faults:   median = 358098.0000  mean = 359371.6667  stddev =  13884.1354  min = 339940.0000  max = 387896.0000
system CPU time:    median =      0.5700  mean =      0.5689  stddev =      0.0291  min =      0.5110  max =      0.6040
user CPU time:      median =     17.1010  mean =     16.9808  stddev =      0.6232  min =     16.0450  max =     17.6820
elapsed time:       median =     19.0186  mean =     18.8907  stddev =      0.6469  min =     17.8797  max =     19.6176

@zimbatm
Copy link
Member

zimbatm commented Sep 20, 2017

I think that the latest changes go a bit too far in terms of verbosity, the function description now takes a full page in my browser. In most cases it's enough to document the parameter names and types and then expand in the description and examples in case it's not clear what those parameters are doing.

How about changing the output to be lists.findSingle — pred: (a -> Bool) -> default: a -> multiple: a -> [a] -> a? With named arguments it makes it easier to reference them in the function description.

@zimbatm
Copy link
Member

zimbatm commented Sep 20, 2017

I don't know if there is a type notation that could be closer to nix code

lists.findSingle =
  pred: # (a -> Bool)
  default: # a
  multiple: # a
  ; #=> a

@grahamc
Copy link
Member Author

grahamc commented Sep 21, 2017

This might be a bigger debate, but I think there are two issues:

the function description now takes a full page in my browser.

Frankly this doesn't seem very concerning to me. These function documents should probably be split out in to their own HTML pages anyway so as to not make the Nixpkgs manual massively long. I think a function's documentation shouldn't be held to an arbitrary length requirement.

I think that the latest changes go a bit too far in terms of verbosity,

Anecdotally why I disagree

Based on feedback I've received, I'd ask you to reconsider. I guess it depends on who your audience is. Some of my team-members are novice Nix programmers and when shown simply:

lists.findSingle — pred: (a -> Bool) -> default: a -> multiple: a -> [a] -> a

or

lists.findSingle =
  pred: # (a -> Bool)
  default: # a
  multiple: # a
  ; #=> a

balked.

Having the parameters explicitly broken out in a list with individual descriptions of their role made it much easier for them to understand and use the docs.

I'd ask you to reconsider who is needing these docs. Clearly it isn't the users we have now, we're all evidently happy enough to just go read lib/*.nix. I'd really like our docs to be thorough enough to pass to my coworkers and feel confident they'll be able to use them without needing me to do a 30 minute 1:1 to supplement them.

Syntactically why I disagree

lists.findSingle — pred: (a -> Bool) -> default: a -> multiple: a -> [a] -> a

In this case the code isn't valid nix and requires readers to understand type signatures, a concept that Nix doesn't actually have. (As an aside, note the real type signature of findSingle is (a -> Bool) -> b -> c -> [a] -> a|b|c.)

or in this case:

lists.findSingle =
  pred: # (a -> Bool)
  default: # a
  multiple: # a
   #=> a

Which is almost valid Nix, I think is not sufficient for docs aimed at all levels of Nix users.

That said, I really like this formatting for actual code!

{
  findSingle =
    /*
      Find the sole element in the list matching the specified
      predicate, returns `default' if no such element exists, or
      `multiple' if there are multiple matching elements.
    */
    pred:     # a -> Bool, the selecting function
    default:  # a, The default if pred never returns true
    multiple: # a, The value if pred returns true multiple times 
    list:     # [a], The list of values to search with pred
      let found = filter pred list; len = length found;
      in if len == 0 then default
        else if len != 1 then multiple
        else head found;
}

@grahamc
Copy link
Member Author

grahamc commented Sep 21, 2017

Note that the approach I've taken here of making docs in code makes the lib functions much too hard to read, and the output too hard to make right. With this proof of concept, I'm throwing it away and am going to explore just writing docbook.

@grahamc grahamc closed this Sep 21, 2017
@grahamc
Copy link
Member Author

grahamc commented Sep 21, 2017

@zimbatm please feel free to discuss further with me, here :)

@zimbatm
Copy link
Member

zimbatm commented Sep 21, 2017

hmm. I hope you didn't get discouraged because of my comment. I still think it's a great idea and improves what we currently have.

How about skipping the types altogether for now? Most of the time, a description with a few well selected examples are enough for the reader to understand the purpose of a function.

@grahamc
Copy link
Member Author

grahamc commented Sep 23, 2017

No, I wasn't discouraged by your comment. I think the output I have is good. I don't like how it is done in code.

@vcunat
Copy link
Member

vcunat commented Sep 24, 2017

Well, for exceedingly simple combinators like findSingle, I look at three-line text description and if I'm not certain I look at the simple five-line implementation. A one-page html would seem an overkill here, though there certainly are functions in lib where I wouldn't say so.

Speed: if I understand this correctly, docs.* will only be parsed and not evaluated normally, so the performance impact should be negligible.

Overall, the approach seems OK to me. I didn't delve into any details (yet). I would personally fear moving source of the docs too far from the code, even if lib is relatively stable (i.e. unchanging).

@ixmatus
Copy link
Contributor

ixmatus commented Jan 24, 2018

I'd like to add support to this effort by saying the lack of easily discoverable docs for lib functions is a frequent gripe for people who don't know what's in nixpkgs/lib/* (it's a gripe for me too, honestly, but I have so much in memory now it's less painful.)

@grahamc how could I help with this effort?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants