This project is a source code transpiler for commutative diagrams.
The aim is to translate to and from any format; most of them are LaTeX DSLs.
Here is the progress on the planned ones:

| Target | Import | Export |
|---|---|---|
| amscd | ██████████ | ██████████ |
| amscdx | ██████░░░░ | ███████░░░ |
| CoDi | ░░░░░░░░░░ | ░░░░░░░░░░ |
| quiver | ████████░░ | ███████░░░ |
| tikz-cd | ░░░░░░░░░░ | ░░░░░░░░░░ |
| xymatrix | ██░░░░░░░░ | ░░░░░░░░░░ |
| ... | | |

Private repo: https://github.com/paolobrasolin/ouroboros
A transpiler like this could be realized with many technologies. I have a few end goals:
- integrating this in quiver;
- creating a conversion service with no backend infrastructure;
- using a pleasant language, with good libraries;
- learning something new.
TypeScript therefore looks like the best choice. On top of it, two outstanding libraries that trivialize a lot of groundwork are nearley.js for grammar-based parsing and Superstruct for data validation and coercion.
Freely transpiling among many DSLs requires a transpilation procedure for each ordered source/target language pair we want to connect.
How many transpilers do we need in total?
- If we connect `n` DSLs directly, then we need two times `n(n-1)/2` (i.e. twice the number of edges of a `Kₙ` graph).
- If we connect `n` DSLs through an artificial Universal Language, then we need two times `n` (i.e. twice the number of edges of the `Sₙ` graph).
Implementing a Universal Language (UL for short) is clearly the winning strategy as soon as more than three DSLs are involved, since `n(n-1) > 2n` exactly when `n > 3`.
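The two counts can be compared with a few lines of TypeScript. This is only an illustrative sketch: the function names `directTranspilers` and `hubTranspilers` are made up for the example and are not part of the project.

```typescript
// One-way transpilers needed to connect n DSLs pairwise:
// two per edge of the complete graph K_n, i.e. 2 * n(n-1)/2 = n(n-1).
function directTranspilers(n: number): number {
  return n * (n - 1);
}

// One-way transpilers needed when every DSL only talks to the UL:
// two per edge of the star graph S_n, i.e. 2n.
function hubTranspilers(n: number): number {
  return 2 * n;
}

for (const n of [2, 3, 6, 10]) {
  console.log(`n=${n}: direct=${directTranspilers(n)}, via UL=${hubTranspilers(n)}`);
}
```

For the six formats listed in the progress table this gives 30 direct transpilers against 12 through the UL; the two approaches break even at `n = 3`.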
Each DSL will have a dedicated folder. It will contain a few components allowing the DSL to be transpiled to and from the UL.
- `schema` describes the AST with `superstruct` structures.
  - Optional fragments of the DSL are accounted for by using `optional`. Anything which is valid for the original processor must `validate`.
  - Implicit defaults of the DSL are accounted for by using `defaulted`. The schema must be the single source of truth about the DSL defaults: consumers of the AST must simply trust coercion (e.g. via `create`) to make the defaults explicit.
  - In nested objects, `defaulted`s must be on children `Struct`s, while `optional`s should be on the parent `Struct`s. This allows `assert`s to be a simple way (after coercion) to get rid of the `... | undefined` from the signatures of `optional` parts when processing the AST.
- `grammar` describes the DSL with a `nearley` grammar.
  - It is an optional component which might be used by the `parser`.
  - It must not be ambiguous.
- `parser` implements a `parse` function to transform source code into an AST.
  - `parse` is responsible for performing any extra decoding/deserialization the input needs.
  - `parse` outputs a bona fide object respecting `schema`, meaning that the signatures are correct but no explicit validation (and especially no coercion) is done at this time.
  - `parse` may output an array to account for ambiguity and simplify testing, but it should only contain a single object as we ban ambiguous grammars.
- `injector` implements an `inject` function mapping the DSL AST into the UL AST.
  - `inject` must assume schema coercion has been done, so it needs no knowledge of the DSL defaults and can simply perform a few `assert`s to check for presence and circumvent the inconvenient `* | undefined` signatures.
  - `inject` must `create` its output, so it only needs to reason about the features being actively used. This allows targeted testing with `toMatchObject` and avoids the need for backtracking when adding new features to the UL, all while keeping the UL fully explicit.
- `projector` implements a `project` function mapping the UL AST onto the DSL AST.
  - `project` maps only features available in the target DSL.
  - TODO: a policy for approximating missing features and collecting warnings for unsupported ones must be established.
- `renderer` implements a `render` function to transform an AST to source code.
  - TODO: perhaps `render` should include a minification process to produce the minimal code leveraging implicit defaults of the DSL. Maybe avoiding coercion is enough, but I haven't made up my mind yet.
- `index` ties together all components into a simple API.
  - It implements `read = inject ∘ coerce ∘ parse`, which translates DSL source into its representation in the universal language.
  - It implements `write = render ∘ project ∘ coerce`, which translates a universal language representation into DSL source. (NOTE: coercion here can be omitted as long as we keep the UL completely explicit.)
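To make the flow of the components above concrete, here is a minimal, self-contained sketch for a made-up toy DSL with a single arrow such as `A -> B : f`. Everything in it is an assumption for illustration: the toy syntax, the `ToyAst` and `UlAst` types, and the hand-written `coerce`, which merely stands in for what a `superstruct` schema with `defaulted` and `create` would do. The UL is kept fully explicit, so the coercion step in `write` is omitted.

```typescript
// Hypothetical DSL AST as produced by `parse`: the label may be omitted in the source.
type ToyAst = { source: string; target: string; label?: string };
// The same AST after coercion: implicit defaults have been made explicit.
type CoercedToyAst = Required<ToyAst>;
// Hypothetical, fully explicit UL AST.
type UlAst = { nodes: string[]; arrows: { from: number; to: number; label: string }[] };

// parser: source code -> DSL AST (no validation, no coercion).
function parse(src: string): ToyAst {
  const match = /^\s*(\w+)\s*->\s*(\w+)\s*(?::\s*(\w+))?\s*$/.exec(src);
  if (!match) throw new Error(`cannot parse: ${src}`);
  const [, source, target, label] = match;
  return label === undefined ? { source, target } : { source, target, label };
}

// Stand-in for schema coercion: the only place that knows the DSL defaults.
function coerce(ast: ToyAst): CoercedToyAst {
  return { ...ast, label: ast.label ?? "" };
}

// injector: coerced DSL AST -> UL AST, with no knowledge of the defaults.
function inject(ast: CoercedToyAst): UlAst {
  return {
    nodes: [ast.source, ast.target],
    arrows: [{ from: 0, to: 1, label: ast.label }],
  };
}

// projector: UL AST -> DSL AST, mapping only what the toy DSL supports.
function project(ul: UlAst): ToyAst {
  if (ul.arrows.length !== 1) throw new Error("the toy DSL supports exactly one arrow");
  const arrow = ul.arrows[0];
  return { source: ul.nodes[arrow.from], target: ul.nodes[arrow.to], label: arrow.label };
}

// renderer: DSL AST -> source code; an empty label is simply left out.
function render(ast: ToyAst): string {
  const base = `${ast.source} -> ${ast.target}`;
  return ast.label ? `${base} : ${ast.label}` : base;
}

// index: the two compositions exposed as the public API.
const read = (src: string): UlAst => inject(coerce(parse(src)));
const write = (ul: UlAst): string => render(project(ul));

console.log(write(read("A  ->  B")));
console.log(write(read("A -> B : f")));
```

In this sketch a round trip normalizes whitespace and drops the default label, which hints at how leaving defaults implicit in `render` could double as a simple form of minification.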
The UL will also have its own folder. It will contain much less than other DSLs, since it's used only for internal representation.
- `schema` describes the AST with `superstruct` structures.
  - This is the only component of the UL and is used only for internal representation.
A few more words should be spent about the design of the UL, as two very different approaches can be followed for the usage of optional structures.
- Everything is optional (except topology).
  - PRO: injectors' output can be limited to the used attributes
  - CON: projectors' input needs to be `assert`ed to circumvent partial signatures (after the input has been coerced externally, of course)
- Everything is mandatory (and has reasonable defaults).
  - CON: injectors' output must be `create`d as the injector must not know about defaults and all properties are mandatory; this also avoids breakage on UL extensions
  - PRO: projectors' input has simple signatures (no `* | undefined`) and can be destructured right away while simply ignoring unsupported features of the target DSL
It's a matter of balance, but ultimately the latter alternative has slightly better ergonomics, and a fully explicit UL schema should be simpler to reason about.
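The difference in ergonomics can be sketched in plain TypeScript. The arrow attributes below (`label`, `dashed`) and all function names are hypothetical, since the real UL schema is not shown here, and `createArrow` is only a hand-written stand-in for coercing through a `superstruct` schema with `create`.

```typescript
// Hypothetical arrow in an "everything is optional" UL (topology stays mandatory).
type OptionalArrow = { from: number; to: number; label?: string; dashed?: boolean };
// The same arrow in an "everything is mandatory" UL.
type MandatoryArrow = Required<OptionalArrow>;

// Optional UL: the projector must check for presence before using an attribute.
function projectOptional(arrow: OptionalArrow): string {
  if (arrow.label === undefined) throw new Error("label missing: input was not coerced");
  return `${arrow.from} -> ${arrow.to} : ${arrow.label}`;
}

// Mandatory UL: the projector destructures right away and simply ignores
// `dashed`, which this imaginary target DSL does not support.
function projectMandatory({ from, to, label }: MandatoryArrow): string {
  return `${from} -> ${to} : ${label}`;
}

// Mandatory UL, injector side: defaults live in one place and the injector
// only mentions the attributes it actually uses.
const arrowDefaults = { label: "", dashed: false };
function createArrow(partial: OptionalArrow): MandatoryArrow {
  return { ...arrowDefaults, ...partial };
}

const arrow = createArrow({ from: 0, to: 1, label: "f" });
console.log(projectMandatory(arrow));
console.log(arrow.dashed);
```

Adding a new attribute to the hypothetical UL would then only require a new entry in `arrowDefaults`; existing injectors and projectors would keep compiling unchanged.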
