Initial Python formatting support#3349
Conversation
|
This is awesome! I wouldn’t be opposed to merging it in the prettier repo so that other people can help with the effort, if you are interested. Down the road, it would be nice to compile python to js via emscriptem so that we don’t need the binary dependency and we can show it in the playground. Here’s a build of python 2.7 as an example: https://github.com/replit/empythoned |
|
If there's a python parser written in JS that we can leverage, that'd be our first choice (will work in the browser and etc), but if that's not available, we might want to find a way to vendor ASTExport with prettier. We vendor all of our other parsers so that we don't have to worry about the AST format differing when the user has a different parser version installed. |
|
@vjeux oh, I'll have a look at empython! thanks. @suchipi vendorizing ASTExport shouldn't be a problem, it is a simple script :) |
The HTML support in the repo is super experimental as well :-) |
|
We had a couple of issues where people used the experimental HTML support and were surprised it didn't work, though. We'd want to be careful to ensure that doesn't happen again. |
|
Yeah, you're right. I thought that was fixed with |
|
Experimental parser should not be added in Lines 176 to 189 in fbbfa52 |
|
If that's fine with all of you ok :) but I'd wait to have a decision on the parser as well, just to prevent having to change some pieces related to the current AST structure! |
|
can I just add that this would change my life for the much better 👍 AWESOME |
|
Very neat. People have run Python on the client with emscripten. Otherwise, the Python binary can be optional, no? Note that the standard AST parser in Python is part of the core stdlib: https://docs.python.org/3/library/ast.html, so there should be a strong presumption in favor of using that instead of a separate parser. |
|
Python provides a way to query the ast as part of the language itself, so there’s very little incentive for third party implementations to build competing ones. Building a faithful parser is not an easy task and it is a critical requirement if we want to reprint the code without introducing different semantics. So the only realistic choice here is to use the one provided by default, like you did. It’s fine to require a binary to be installed on the machine for now, it’ll take a while for the whole language to be supported anyway. If we can get it to run using emscriptem, then that’s perfect. Otherwise we’ll have to invest in getting third party dependencies like this one installed and have a server for the playground. We’ve been lucky not to have had to do it so far, but it’s not the end of the world. As for merging it in, if you don’t feel like it yet, that’s ok. The most important thing right now is that you churn code as fast as possible to print as much as the language as possible. Then there’s going to be a phase to make it actually correct and one to make it pretty. There’s basically a clock where you won’t be motivated anymore / won’t be able to spend the time anymore, so we should do whatever you feel best to get the maximum amount of work done :) |
|
That said, it might also be worth taking a look at Baron and RedBaron. They're Python CSTs (instead of ASTs), but it may be unnecessary to fully duplicate the Python code printing logic, especially as the language adds new features. |
|
You may need to vendor in something like astexport, though – the release on PyPI only supports Python 3 (despite having a Python 2 wheel for some reason). Like I imagine this will work just fine with whatever virtualenv you're in, and will use that Python binary, but not supporting Python 2 might not be ideal. |
src/printer-python.js
Outdated
| parts.push(concat(["**", path.call(print, "kwarg")])); | ||
| } | ||
|
|
||
| if (n.kwonlyargs.length > 0) { |
There was a problem hiding this comment.
kwonlyargs need to go before **kwargs
|
Looking at this some more, this might be really tricky without using something like Baron. The Python AST is a pure AST, as with ESTree. Unlike Babylon, it doesn't include information about comments at all. |
|
@taion comments are handled in a different way with prettier. We take the raw list of comments and positions and attach them to nodes ourself. All we need to do is to implement a small lexer to extract out comments. This is super easy to do, even if python tooling doesn't provide it for us. |
|
Ah, cool. Looks like |
eab07d8 to
32fec6b
Compare
|
This is really cool, but I'm going to play devil's advocate and list some of my concerns:
All that being said, I'm fine with merging this in an experimental status, provided we address (5.) As @vjeux mentioned, being able to compile the parser down to JavaScript would be a huge step towards getting this released. |
|
What do you mean by "Python doesn't have a specification"? The Python 2/Python 3 support thing is trivial. astexport basically does very little, and it's a few little syntax things that make it not support Python 2. There's not much necessary to get it to work with Python 2, and otherwise it's just using The AST parser here is literally built in to Python and exposed as such, so it's not an extra binary dependency unless you think people are going to try to format Python without having Python installed somehow. In some sense you actually can't neatly bundle it here, though, because the AST parser is somewhat specific to individual versions of Python, and as such the correct version of Python to use would be the system (or virtualenv Python). Though I think the biggest question here is – "what's the cost of bridging over the Python AST to JS, versus porting over Prettier's printer to Python?" Is most of the value-add here a set of language-agnostic heuristics that are complicated and subtle, or is there a lot to handle JS-specific idioms? If it's the latter, then perhaps there's not much presumptive reason to implement Python tooling in Node. |
|
@azz thanks for the comments!
|
JavaScript has a spec, which informs us of the rules in writing valid JavaScript, as well as the runtime semantics of the code. I forgot that Python has a grammar reference, which helps with this.
This is the case with JavaScript, too, but we make sure we can handle all JavaScript language features (including a lot of experimental features). This way the output of Prettier doesn't vary based on which version of Node.js, or which browser you are using.
The doc-building/printing portion of Prettier is mostly independent of any programming language. There have been tweaks to support certain features, like quote blocks in Markdown, and template strings in JavaScript, but those might be useful for printing Python too (e.g. multi-line strings) |
|
I've met a guy who did a talk on batavia. It's a cool project. I might have a play with it and try and get |
|
Python AST parsing is version-specific though. It's not like how all syntactically-valid ES5 code is also syntactically-valid "modern JS". If you run In general, the Python backward compat policy doesn't say "no backward-incompatible syntax changes": https://www.python.org/dev/peps/pep-0387/. If you want to use |
|
The trade-off ends up being language-specific printing stuff. The astor docs for example specifically flag that many attempts to print the Python AST to source have failed on corner cases: https://github.com/berkerpeksag/astor#id2. |
|
If it lowers the dev/support burden, maybe just support Python3? Many projects will be removing support for Python2 at the end of next year anyway. |
|
Are there syntax breaking changes from Python 3.0 to 3.7? |
|
@azz none that I can't think of, I previously said that f-string will be supported in py 3.7, but they are already in (since 3.6) |
086c322 to
029b7a9
Compare
|
@azz can you do another review? :) |
|
I've started another repo with (one for now) some unit tests that I would use to verify to validity of the formatted files, I had a bug where it was printing correct python code but it was doing a different thing. I still have to add the steps to download prettier, prettify the code and then run the tests :) |
8ca22fe to
c461605
Compare
|
I'll do another soon, but I'm considering the advice from @jlongster in #3503 (comment) and thinking about how we can share Prettier as a tool, while having separate maintainable codebases. To that end I'm working on a plugin API in #3536, and once that's done we can move this PR to a new repository ( This way we can achieve:
All the while keeping:
We can do the same for the PHP PR in #3521, and potentially the HTML support in #3534. Thoughts? |
|
Sounds good to me, especially for this:
I can see it being a problem when having many languages baked into prettier. Let's see what the other folks think :) |
|
Yes @azz thank you for working on that! I'd like to start watching prettier's issues again but I can't imagine trying to lump a whole new language into here and the amount of issues that it would add on top of everything. I really don't think another language should be included here. Markdown is passable because it's a big part of the JS ecosystem, and it's probably easy enough. But we should keep prettier as a JS tool. |
|
I'm wondering about that community aspect, though. Like, as set up, the only people who can actually contribute to this are people who are familiar with Python AST manipulation and who can work in Node. So I'd ask then how this matches @vjeux's points in #3349 (comment). |
|
What I want is for people to stop wasting time formatting their code. The reality is that people use many different programming languages on a daily basis and not just JS. It turns out that prettier is a really good foundation for building a pretty printer in any language so I feel like it would be a shame for it not to expand to more places now that it's covering the core use case very well. I also found out that a big reason that languages build silos around themselves is that all the tooling is completely different. One of the reason I joined the Nuclide team was to break silos by not having to learn a completely new IDE when from switching languages, which is one less ramp up time. I believe that having the same pretty printer which shares similar decisions and design philosophy is going to be one more stone in that direction. Now, this is --not-- going to be easy as forces to push language ecosystems away from each other are strong, but I don't think it's impossible and I believe that having PHP and Python pretty printers as part of prettier is something that can work. Having a separate repo to manage issues and PR is a good idea. |
|
I definitely agree with what you're saying. My concern is maybe more around the API boundaries here. If the idea per #3536 is a plugin API, then, in that world where there is a level of separation, it would be great to implement the AST token printing in Python rather than in JS. That way Prettier could just leverage e.g. astor for Python, &c. The core printing API, the platform API, the CLI stuff – that's all great. To me, especially if prettier-python lives in a different package entirely, it just looks weird that BTW, what's that about IDEs? Maybe I'm spoiled, but I use JetBrains IDEs for just about everything... they seem to support pretty much everything under the sun. |
|
@taion Interesting idea. One of the important components of Prettier is the actual logic that figures out where to break lines. Maybe the Python component could generate the document using translated builders, then serialize the result to JSON and had that over to Prettier’s JS so that it can print the prettified code. |
|
@j-f1 Yeah – I mentioned this upthread, but it's not a super-compelling thing for part of Prettier proper. As a Prettier plugin that's in a separate repo aimed at Python people, it's a better idea IMO. It looks like |
|
I've copied all the commits from this PR to a new repository 🎉 https://github.com/prettier/prettier-python. @patrick91 I've made you an admin of the new repo. |


Hi everyone! I've started hacking around prettier to add support for Python files since I wasn't happy with autopep8 and yapf (and I'm using prettier at work).
I'm working on this in my free time so I can't say when it'll be ready, there's a lot of things to
do and to figure out, I'll update this issue while working on it :)
TODO:
#3354