This document (notebook) demonstrates the functions of “Graph::RandomMaze”, [AAp1], for generating and displaying random mazes. The methodology and implementations of maze creation based on random rectangular and hexagonal grid graphs are described in detail in the blog post “Day 24 – Maze Making Using Graphs”, [AA1], and in the Wolfram notebook “Maze Making Using Graphs”, [AAn1].
This blog post (notebook) presents various visualizations related to the Collatz conjecture, [WMW1, Wk1], using Raku.
The Collatz conjecture, a renowned, unsolved mathematical problem, questions whether iteratively applying two basic arithmetic operations will lead every positive integer to ultimately reach the value of 1.
In this notebook the so-called “shortcut” version of the Collatz function is used:
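In symbols, the shortcut map sends even n to n/2 and odd n to (3n + 1)/2. Here is a minimal Raku sketch of it (the sub names collatz-step and collatz-sequence are mine, not from a package):

```raku
# Shortcut Collatz step: n/2 for even n, (3n + 1)/2 for odd n
sub collatz-step(UInt:D $n) { $n %% 2 ?? $n div 2 !! (3 * $n + 1) div 2 }

# Iterate the step down to 1, collecting the trajectory
sub collatz-sequence(UInt:D $n is copy) {
    my @seq = $n;
    @seq.push($n = collatz-step($n)) while $n != 1;
    @seq
}

say collatz-sequence(12);  # [12 6 3 5 8 4 2 1]
```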
That function is used repeatedly to form a sequence, beginning with any positive integer, and taking the result of each step as the input for the next.
The Collatz conjecture is that this process will eventually reach the number 1, regardless of which positive integer is chosen initially.
Raku-wise, subs for the Collatz sequences are easy to define. The visualizations are done with the packages “Graph”, [AAp1], “JavaScript::D3”, [AAp2], and “Math::NumberTheory”, [AAp3].
There are many articles, blog posts, and videos dedicated to visualizations of the Collatz conjecture. (For example, [KJR1, PZ1, Vv1]).
Remark: Using simple sampling like the code block below would generally produce sequences with very non-uniform lengths and maximum values. Hence, we do the filtering above.
Here is a comparison with the corresponding Benford’s law values:
#% html
sub benford-law(UInt:D $d, UInt:D $b = 10) { log($d + 1, $b) - log($d, $b) };
my @dsDigitTally =
%digitTally.sort(*.key.Int).map({%(
digit => $_.key,
value => round($_.value / %digitTally.values.sum, 10 ** -7),
benford => round(benford-law($_.key.Int), 10 ** -7)) })
==> to-html(field-names => <digit value benford>)
| digit | value | benford |
|-------|-----------|-----------|
| 1 | 0.2836276 | 0.30103 |
| 2 | 0.1886912 | 0.1760913 |
| 3 | 0.1194717 | 0.1249387 |
| 4 | 0.1176338 | 0.09691 |
| 5 | 0.0797422 | 0.0791812 |
| 6 | 0.0615111 | 0.0669468 |
| 7 | 0.0458833 | 0.0579919 |
| 8 | 0.0607828 | 0.0511525 |
| 9 | 0.0426562 | 0.0457575 |
Good adherence is observed for a relatively modest number of sequences. Here is a corresponding bar chart:
#% js
my @data =
|@dsDigitTally.map({ <x y group>.Array Z=> [|$_<digit value>, 'Collatz'] })».Hash,
|@dsDigitTally.map({ <x y group>.Array Z=> [|$_<digit benford>, 'Benford'] })».Hash;
js-d3-bar-chart(
@data,
title => "First digits frequencies (up to $m)",
:$title-color,
x-label => 'digit',
y-label => 'frequency',
:!grid-lines,
:$background,
:700width,
:400height,
margins => { :50left }
)
Sunflower embedding
A certain concentric pattern emerges in the spiral embedding plots of the Collatz sequence lengths mod 8. (Using mod 3 makes the pattern clearer.) Similarly, a clear spiral pattern is seen for the maximum values.
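The "sunflower" embedding is the phyllotaxis layout: the k-th point is placed at radius √k and angle k times the golden angle. A minimal sketch of the coordinate computation (the sub name sunflower-point is illustrative, not a package function):

```raku
# Golden angle in radians, ≈ 2.39996
my $golden-angle = pi * (3 - sqrt(5));

# Sunflower (phyllotaxis) embedding: point k at radius sqrt(k),
# angle k * golden-angle
sub sunflower-point(UInt:D $k) {
    my $a = $k * $golden-angle;
    (sqrt($k) * cos($a), sqrt($k) * sin($a))
}

say sunflower-point(10);
```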
It is easy to make a simple Rock-Paper-Scissors (RPS) game graph using the Raku package “Graph”, [AAp1]. Here is such a graph in which the arrow directions indicate which item (vertex) wins:
In this post (notebook) we show how to do all of the above points.
Remark: Interesting analogies of the presented graphs can be made with warfare graphs, [AAv1]. For example, the graph tanks-infantry-guerrillas is analogous to RPS.
TL;DR
LLMs “know” the RPS game and its upgrades.
LLMs know how to (mostly reliably) translate to emojis.
The package “Graph” (via Graphviz DOT) can produce SVG plots that are readily rendered in different environments.
And the graphs of hand-games like RPS look good.
The class Graph has handy methods and attributes that make the creation and modification of graphs smooth(er).
Setup
This notebook is a Raku-chatbook, hence, its Jupyter session preloads certain packages and LLM-personas.
# Preloaded in any chatbook
# use LLM::Functions;
# use LLM::Prompts;
# Preloaded in a user init file
# use Graph;
# For this concrete session
use Text::Emoji;
LLM configurations:
my $conf4o = llm-configuration('chat-gpt', model => 'gpt-4o', :4096max-tokens, temperature => 0.4);
my $conf4o-mini = llm-configuration('chat-gpt', model => 'gpt-4o-mini', :4096max-tokens, temperature => 0.4);
($conf4o, $conf4o-mini)».Hash».elems
Raku-chatbooks, [AAp4], can have initialization Raku code and specified preloaded LLM-personas. One such LLM-persona is “raku”. Here we use the “raku” chat object to get Raku code for the edges of the RPS extension Rock-Paper-Scissors-Lizard-Spock, [Wv1].
#% chat raku
Make an array of the edges of a graph for the game Rock-Paper-Scissors-Lizard-Spock.
Each edge is represented as a hash with the keys "from", "to", "label".
The label corresponds to the action taken with the edge, like, "Paper covers Rock", "Paper disproves Spock".
my @edges = (
{ from => 'Rock', to => 'Scissors', label => 'Rock crushes Scissors' },
{ from => 'Rock', to => 'Lizard', label => 'Rock crushes Lizard' },
{ from => 'Paper', to => 'Rock', label => 'Paper covers Rock' },
{ from => 'Paper', to => 'Spock', label => 'Paper disproves Spock' },
{ from => 'Scissors', to => 'Paper', label => 'Scissors cuts Paper' },
{ from => 'Scissors', to => 'Lizard', label => 'Scissors decapitates Lizard' },
{ from => 'Lizard', to => 'Spock', label => 'Lizard poisons Spock' },
{ from => 'Lizard', to => 'Paper', label => 'Lizard eats Paper' },
{ from => 'Spock', to => 'Scissors', label => 'Spock smashes Scissors' },
{ from => 'Spock', to => 'Rock', label => 'Spock vaporizes Rock' },
);
We use the generated code in the next section.
Plain text graph
Here we create the Rock-Paper-Scissors-Lizard-Spock graph generated with the LLM-magic cell above:
my @edges =
{ from => 'Rock', to => 'Scissors', label => 'Rock crushes Scissors' },
{ from => 'Scissors', to => 'Paper', label => 'Scissors cuts Paper' },
{ from => 'Paper', to => 'Rock', label => 'Paper covers Rock' },
{ from => 'Rock', to => 'Lizard', label => 'Rock crushes Lizard' },
{ from => 'Lizard', to => 'Spock', label => 'Lizard poisons Spock' },
{ from => 'Spock', to => 'Scissors', label => 'Spock smashes Scissors' },
{ from => 'Scissors', to => 'Lizard', label => 'Scissors decapitates Lizard' },
{ from => 'Lizard', to => 'Paper', label => 'Lizard eats Paper' },
{ from => 'Paper', to => 'Spock', label => 'Paper disproves Spock' },
{ from => 'Spock', to => 'Rock', label => 'Spock vaporizes Rock' }
;
my $g = Graph.new(@edges, :directed);
Remark: Currently the class Graph does not “deal” with edge labels, but some of its methods (like dot) do.
Convenient LLM functions
Graph edges
Instead of using chat-cells, we can define an LLM function that provides the graph edges dataset for different RPS variants. Here is such an LLM function using “LLM::Functions”, [AAp1], and “LLM::Prompts”, [AAv2]:
my sub rps-edge-dataset($description, Str:D $game-name = 'Rock-Paper-Scissors', *%args) {
llm-synthesize([
"Give the edges of the graph for this $game-name variant description.",
'Give the edges as an array of dictionaries. Each dictionary has the keys "from", "to", "label",',
'where "label" has the action of "from" over "to".',
$description,
llm-prompt('NothingElse')('JSON')
],
e => %args<llm-evaluator> // %args<e> // %args<conf> // $conf4o-mini,
form => sub-parser('JSON'):drop
)
}
Remark: Both “LLM::Functions” and “LLM::Prompts” are pre-loaded in Raku chatbooks.
Emoji translations
We can translate the plain-text vertex labels of RPS graphs into emojis in several ways:
Again, let us define an LLM function that does the emojification. (I.e. for option 3.)
One way is to do a simple application of the prompt “Emojify” and process its result into a dictionary:
my $res = llm-synthesize( llm-prompt("Emojify")($g.vertex-list), e => $conf4o-mini );
$res.split(/\s+/, :skip-empty)».trim.Hash
It is better to have a function that provides a more “immediate” result:
my sub emoji-rules($words, *%args) {
llm-synthesize( [
llm-prompt("Emojify")($words),
'Make a JSON dictionary of the original words as keys and the emojis as values',
llm-prompt('NothingElse')('JSON')
],
e => %args<llm-evaluator> // %args<e> // %args<conf> // $conf4o-mini,
form => sub-parser('JSON'):drop
)
}
Emoji graph
Let us remake the game graph using suitable emojis. Here are the corresponding edges:
my @edges-emo =
{ from => '🪨', to => '✂️', label => 'crushes' },
{ from => '✂️', to => '📄', label => 'cuts' },
{ from => '📄', to => '🪨', label => 'covers' },
{ from => '🪨', to => '🦎', label => 'crushes' },
{ from => '🦎', to => '🖖', label => 'poisons' },
{ from => '🖖', to => '✂️', label => 'smashes' },
{ from => '✂️', to => '🦎', label => 'decapitates' },
{ from => '🦎', to => '📄', label => 'eats' },
{ from => '📄', to => '🖖', label => 'disproves' },
{ from => '🖖', to => '🪨', label => 'vaporizes' }
;
my $g-emo = Graph.new(@edges-emo, :directed);
Here is the interactions table of the upgraded game:
#% html
game-table($g-chuck)
|  | ✊🏻 | ✋🏻 | ✌🏻 | 🖖🏻 | 🤏🏻 | 🦶🏻 |
|---|---|---|---|---|---|---|
| ✊🏻 |  | – | + | – | + | – |
| ✋🏻 | + |  | – | + | – | – |
| ✌🏻 | – | + |  | – | + | – |
| 🖖🏻 | + | – | + |  | – | – |
| 🤏🏻 | – | + | – | + |  | – |
| 🦶🏻 | + | + | + | + | + |  |
In order to ensure that we get an “expected” graph plot, we can take the vertex coordinates of a wheel graph or compute them by hand. Here we do the latter:
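For instance, n vertices can be placed uniformly on the unit circle, starting at 12 o'clock. A hand-rolled sketch (the variable names are mine, and the rounding scale is an arbitrary choice):

```raku
# Place the game items uniformly on the unit circle, clockwise from the top
my @items = <✊🏻 ✋🏻 ✌🏻 🖖🏻 🤏🏻 🦶🏻>;
my %coords = @items.kv.map: -> $i, $v {
    my $a = pi / 2 - $i * 2 * pi / @items.elems;
    $v => (cos($a).round(0.001), sin($a).round(0.001))
};

say %coords{'✊🏻'};  # the first item lands near (0, 1)
```

Such a hash of vertex-name-to-coordinates pairs can then be passed to the graph-plotting routine.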
We can use “LLM vision” to get the colors of the original image:
my $url = 'https://www.merchandisingplaza.us/40488/2/T-shirts-Chuck-Norris-Chuck-Norris-Rock-Paper-Scissors-Lizard-Spock-TShirt-l.jpg';
llm-vision-synthesize('What are the dominant colors in this image? Give them in hex code.', $url)
The dominant colors in the image are:
- Olive Green: #5B5D4A
- Beige: #D0C28A
- White: #FFFFFF
- Black: #000000
Graph generating with LLMs
Instead of specifying the graph edges by hand, we can use LLM-vision and suitable prompting. The results are not that good, but YMMV.
my $res2 =
llm-vision-synthesize([
'Give the edges of the graph for this image of Rock-Paper-Scissors-Lizard-Spock-Chuck -- use relevant emojis.',
'Give the edges as an array of dictionaries. Each dictionary with keys "from" and "to".',
llm-prompt('NothingElse')('JSON')
],
$url,
e => $conf4o,
form => sub-parser('JSON'):drop
)
# [{from => ✋, to => ✌️} {from => ✌️, to => ✊} {from => ✊, to => 🦎} {from => 🦎, to => 🖖} {from => 🖖, to => ✋} {from => ✋, to => ✊} {from => ✊, to => ✋} {from => ✌️, to => 🦎} {from => 🦎, to => ✋} {from => 🖖, to => ✌️} {from => ✌️, to => 🖖} {from => 🖖, to => ✊}]
#% html
Graph.new($res2, :directed).dot(:5graph-size, engine => 'neato', arrow-size => 0.5):svg
Rock-Paper-Scissors-Fire-Water
One notable variant is Rock-Paper-Scissors-Fire-Water. Here is its game table:
#% html
my @edges = |('🔥' X=> $g0.vertex-list), |($g0.vertex-list X=> '💦'), '💦' => '🔥';
my $g-fire-water = $g0.clone.edge-add(@edges, :directed);
game-table($g-fire-water)
|  | ✂️ | 💦 | 📄 | 🔥 | 🪨 |
|---|---|---|---|---|---|
| ✂️ |  | + | + | – | – |
| 💦 | – |  | – | + | – |
| 📄 | – | + |  | – | + |
| 🔥 | + | – | + |  | + |
| 🪨 | + | + | – | – |  |
Here is the graph:
#% html
$g-fire-water.dot(|%opts, engine => 'neato'):svg
my $txt = data-import('https://www.umop.com/rps9.htm', 'plaintext');
text-stats($txt)
# (chars => 2143 words => 355 lines => 46)
Extract the game description:
my ($start, $end) = 'relationships in RPS-9:', 'Each gesture beats out';
my $txt-rps9 = $txt.substr( $txt.index($start) + $start.chars .. $txt.index($end) - 1 )
ROCK POUNDS OUT
FIRE, CRUSHES SCISSORS, HUMAN &
SPONGE.
FIRE MELTS SCISSORS,
BURNS PAPER, HUMAN & SPONGE.
SCISSORS SWISH THROUGH AIR,
CUT PAPER, HUMAN & SPONGE.
HUMAN CLEANS WITH SPONGE,
WRITES PAPER, BREATHES
AIR, DRINKS WATER.
SPONGE SOAKS PAPER, USES
AIR POCKETS, ABSORBS WATER,
CLEANS GUN.
PAPER FANS AIR,
COVERS ROCK, FLOATS ON WATER,
OUTLAWS GUN.
AIR BLOWS OUT FIRE,
ERODES ROCK, EVAPORATES WATER,
TARNISHES GUN.
WATER ERODES ROCK, PUTS OUT
FIRE, RUSTS SCISSORS & GUN.
GUN TARGETS ROCK,
FIRES, OUTCLASSES SCISSORS, SHOOTS HUMAN.
Here we invoke the defined LLM function to get the edges of the corresponding graph:
my @rps-edges = |rps-edge-dataset($txt-rps9)
# [{from => ROCK, label => POUNDS OUT, to => FIRE} {from => ROCK, label => CRUSHES, to => SCISSORS} {from => ROCK, label => CRUSHES, to => HUMAN}, ..., {from => GUN, label => FIRES, to => FIRE}]
Here we translate the plaintext vertices into emojis:
my %emojied = emoji-rules(@rps-edges.map(*<from to>).flat.unique.sort)
{AIR => 🌬️, FIRE => 🔥, GUN => 🔫, HUMAN => 👤, PAPER => 📄, ROCK => 🪨, SCISSORS => ✂️, SPONGE => 🧽, WATER => 💧}
In the (very near) future I plan to use the built-up RPS graph-making know-how to make military-forces interaction graphs. (Discussed in [AJ1, SM1, NM1, AAv1].)
Sparse matrices are an essential tool in computational mathematics, allowing us to efficiently represent and manipulate large matrices that are predominantly composed of zero elements. In this blog post, we will delve into a few intriguing examples of sparse matrix utilization, specifically in the Raku programming language.
Examples Covered:
Random Graph:
We will explore the adjacency matrix of a random graph generated from a model of social interactions.
Additionally, we will overlay adjacency matrices with a shortest path within the graph.
Movie-Actor Bipartite Graph:
This example involves ingesting data about relationships between actors and movies.
We will demonstrate how sparse matrix algebra can facilitate certain information retrieval tasks.
Sparse Matrices Visualization:
We will discuss techniques for visualizing sparse matrices.
Support for sparse matrix linear algebra is a hallmark of a mature computational system. Here’s a brief timeline of when some popular systems introduced sparse matrices:
In the code above, we create a random graph with 20 vertices and a connection probability of 0.06. We also find the shortest path between vertices '0' and '12'.
Corresponding Matrix
The adjacency matrix of this graph is a sparse matrix, where non-zero elements indicate the presence of an edge between vertices.
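The storage idea in miniature: only the non-zero entries need to be kept, e.g. as (row, column, value) tuples. Here is a toy sketch with plain arrays (not the actual “Math::SparseMatrix” representation):

```raku
# A 4-vertex directed cycle stored as (row, col, value) tuples --
# only 4 entries instead of a full 4x4 grid
my @tuples = (0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1);

# The dense equivalent, for comparison: mostly zeros
my @dense = [0 xx 4] xx 4;
@dense[$_[0]][$_[1]] = $_[2] for @tuples;
say @dense;  # [[0 1 0 0] [0 0 1 0] [0 0 0 1] [1 0 0 0]]
```

For the random graph above, only a few dozen of the 400 adjacency-matrix cells are non-zero, so the tuple form is far more economical.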
# 1 => [X2 => 1 The Lord of the Rings: The Fellowship of the Ring => 1 Pirates of the Caribbean: The Curse of the Black Pearl => 1 The Lord of the Rings: The Return of the King => 1 Pirates of the Caribbean: At World's End => 1 X-Men: The Last Stand => 1 The Lord of the Rings: The Two Towers => 1 Pirates of the Caribbean: Dead Man's Chest => 1]
# 0 => [Sean Astin => 0 Patrick Stewart => 0 Elijah Wood => 0 Rebecca Romijn => 0 Ian McKellen => 0 Keira Knightley => 0 Orlando Bloom => 0 Famke Janssen => 0 Bill Nighy => 0 Johnny Depp => 0 Jack Davenport => 0 Hugh Jackman => 0 Liv Tyler => 0 Halle Berry => 0 Andy Serkis => 0 Geoffrey Rush => 0 Stellan Skarsgård => 0 Anna Paquin => 0 Viggo Mortensen => 0]
We create a sparse matrix representing the movie-actor relationships:
my @allVertexNames = [|@dsMovieRecords.map(*<Movie>).unique.sort, |@dsMovieRecords.map(*<Actor>).unique.sort];
my %h = @allVertexNames Z=> ^@allVertexNames.elems;
# {Andy Serkis => 8, Anna Paquin => 9, Bill Nighy => 10, Elijah Wood => 11, Famke Janssen => 12, Geoffrey Rush => 13, Halle Berry => 14, Hugh Jackman => 15, Ian McKellen => 16, Jack Davenport => 17, Johnny Depp => 18, Keira Knightley => 19, Liv Tyler => 20, Orlando Bloom => 21, Patrick Stewart => 22, Pirates of the Caribbean: At World's End => 0, Pirates of the Caribbean: Dead Man's Chest => 1, Pirates of the Caribbean: The Curse of the Black Pearl => 2, Rebecca Romijn => 23, Sean Astin => 24, Stellan Skarsgård => 25, The Lord of the Rings: The Fellowship of the Ring => 3, The Lord of the Rings: The Return of the King => 4, The Lord of the Rings: The Two Towers => 5, Viggo Mortensen => 26, X-Men: The Last Stand => 6, X2 => 7}
The row and column names are sorted, with movie titles first, followed by actor names:
.say for @allVertexNames
# Pirates of the Caribbean: At World's End
# Pirates of the Caribbean: Dead Man's Chest
# Pirates of the Caribbean: The Curse of the Black Pearl
# The Lord of the Rings: The Fellowship of the Ring
# The Lord of the Rings: The Return of the King
# The Lord of the Rings: The Two Towers
# X-Men: The Last Stand
# X2
# Andy Serkis
# Anna Paquin
# Bill Nighy
# Elijah Wood
# Famke Janssen
# Geoffrey Rush
# Halle Berry
# Hugh Jackman
# Ian McKellen
# Jack Davenport
# Johnny Depp
# Keira Knightley
# Liv Tyler
# Orlando Bloom
# Patrick Stewart
# Rebecca Romijn
# Sean Astin
# Stellan Skarsgård
# Viggo Mortensen
The sparse matrix of the bipartite graph is constructed:
my $m = Math::SparseMatrix.new(edge-dataset => $g.edges(:dataset))
The matrix plot now clearly indicates a bipartite graph:
#%js
$m.Array ==> js-d3-matrix-plot(width=>400)
For an alternative visualization, we can create an HTML “pretty print” of the sparse matrix:
#% html
$m
.to-html(:v)
.subst('<td>1</td>', '<td><b>●</b></td>', :g)
Fundamental Information Retrieval Operation
Sparse matrices are particularly useful for information retrieval operations. Here, we demonstrate how to retrieve data about an actor, such as Orlando Bloom.
Retrieve the row/vector corresponding to the actor and transpose it:
#%html
my $m-actor = $m['Orlando Bloom'].transpose;
$m-actor.to-html.subst('<td>0</td>','<td> </td>'):g
Multiply the incidence matrix with the actor-vector to find other actors who starred in the same movies:
#% html
$m.dot($m-actor).to-html
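The algebra behind this retrieval, in miniature: with a 0/1 incidence matrix M (movies × actors) and a 0/1 column vector v selecting one actor, M·v marks the movies that actor appears in; a further multiplication by the transpose would count shared movies per actor. A toy dense sketch (the 2×3 data is illustrative, not the movie-actor matrix above):

```raku
# Movies × actors incidence (toy data): 2 movies, 3 actors
my @M = [1, 1, 0],
        [0, 1, 1];

# Column vector selecting actor 1 (the middle actor)
my @v = 0, 1, 0;

# M · v : per-movie count of the selected actor's appearances
my @mv = @M.map({ [+] ($_.list Z* @v) });
say @mv;  # [1 1] -- actor 1 is in both movies
```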
Matrix Plot (Details)
There are two primary methods for plotting sparse matrices.
Via Tuples
This method uses a heatmap plot specification:
#% js
my @ds3D = $m.tuples.map({ <x y z tooltip>.Array Z=> [|$_.Array, "⎡{$m.row-names[$_[0]]}⎦ : ⎡{$m.column-names[$_[1]]}⎦ : {$_.tail}"] })».Hash;
js-d3-matrix-plot(
@ds3D,
:$tooltip-background-color,
:$tooltip-color,
:$background,
width => 400)
Here is the corresponding (“coordinates”) list plot:
This post encapsulates the essence of that presentation, offering a walk-through of how these packages can be leveraged to create good, informative geographic visualizations.
The primary focus of our exploration is on two Raku packages:
Data::Geographics: This package provides comprehensive country and city data, which is crucial for geographic data visualization and analysis.
JavaScript::Google::Charts: This package interfaces with Google Charts, an established framework for creating various types of charts, including geographic plots.
Geographic Data Visualization
Data::Geographics: The Protagonist
The “Data::Geographics” package is the star of the presentation. It provides extensive data on countries and cities, which is essential for geographic data visualization and analysis. Initially, I attempted to create geographic plots using JavaScript freehand, but it proved challenging. Instead, I found it more practical to use the “JavaScript::Google::Charts” package, which offers a more structured framework for creating pre-defined chart types.
Creating Geographic Plots
Using the “JavaScript::Google::Charts” package, I demonstrated how to generate geographic plots. For instance, we visualized country data with a simple plot highlighting countries known to the “Data::Geographics” package in shades of green, while unknown regions were depicted in gray. (That is presentation’s “opening image.”)
Beyond simple visualization, certain analytical tasks can be done using the country data in “Data::Geographics”. For example, I conducted a rudimentary analysis of gross domestic product (GDP) and electricity production using linear regression.
The package also includes city data, enabling us to perform proximity searches and create neighbor graphs.
Practical Demonstrations
Country Data
Currently, “Data::Geographics” knows about 29 countries (≈195 data elements for each). Here are the countries:
#% html use Data::Geographics; country-data().keys.sort ==> to-html(:multicolumn, columns => 3)
|  |  |  |
|---|---|---|
| Botswana | Hungary | Serbia |
| Brazil | Iran | Slovakia |
| Bulgaria | Iraq | SouthAfrica |
| Canada | Japan | SouthKorea |
| China | Mexico | Spain |
| CzechRepublic | NorthKorea | Sweden |
| Denmark | Poland | Turkey |
| Finland | Romania | Ukraine |
| France | Russia | UnitedStates |
| Germany | SaudiArabia |  |
Name Recognition
The package “DSL::Entity::Geographics” was specially made to recognize city and country names, which is particularly useful for conversational agents.
Here is a named entity recognition example:
use DSL::Entity::Geographics; entity-city-and-state-name('Las Vegas, Nevada', 'Raku::System')
# United_States.Nevada.Las_Vegas
Correlation Plots
We created correlation plots to analyze the relationship between GDP and electricity production. Using Google Charts’ built-in functionality, we plotted regression lines to visualize trends. However, Google Charts’ otherwise very nice “trend lines” functionality has certain limitations with logarithmic plots. Hence, that gave us an excuse to do linear regression with “Math::Fitting”:
City Data Tabulation and Visualization
City data visualization was another highlight. We filtered city data to display information such as population and location. By integrating Google Maps links, we provided an interactive way to explore city locations.
Tabulation
#% html @dsCityData.pick(12) ==> { .sort(*<ID>) }() ==> to-html(field-names => <State City Population LocationLink>) ==> { $_.subst(:g, / <?after '<td>'> ('http' .*?) <before '</td>'> /, { "<a href=\"$0\">link</a>" }) }()
Here are city locations plotted with “JavaScript::D3”:
Here are city locations plotted with “JavaScript::Google::Charts”:
Remark: In both plots above, Las Vegas, Nevada, and cities close to it are given focus.
Proximity Searches
Using the “Math::Nearest” package, we performed proximity searches to find the nearest neighbors of a given city. This feature is particularly useful for geographic analysis and planning.
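The nearest-neighbor idea can be sketched without the package, using squared Euclidean distance over (latitude, longitude) pairs. This is a simplification: the city names and coordinates below are illustrative, and “Math::Nearest” provides proper metrics and efficient search structures.

```raku
# Toy nearest-neighbor search over city coordinates (illustrative data)
my %city-coords =
    'Las Vegas' => (36.17, -115.14),
    'Henderson' => (36.04, -114.98),
    'Reno'      => (39.53, -119.81),
    'Phoenix'   => (33.45, -112.07);

sub nearest-cities(Str:D $city, UInt:D $k = 2) {
    my ($lat, $lon) = |%city-coords{$city};
    %city-coords.grep(*.key ne $city)
        .sort({ ($_.value[0] - $lat) ** 2 + ($_.value[1] - $lon) ** 2 })
        .head($k)».key
}

say nearest-cities('Las Vegas');  # (Henderson Phoenix)
```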
Graph Visualization
For visualizing neighbor graphs, we used the packages “WWW::MermaidInk” and “JavaScript::D3”. The former interfaces with a web service to generate graph diagrams. The latter has its own built-in graph plotting functionalities. (Based on the force-directed graph plotting component of D3.js.)
Both approaches allow the creation of appealing visual representations of city connections.
Here is a Nearest Neighbor Graph plotted with “JavaScript::D3”:
Here is a Nearest Neighbor Graph plotted with “WWW::MermaidInk”:
Future Plans and Enhancements
While the current capabilities of “Data::Geographics” and “JavaScript::Google::Charts” are impressive, there is always room for improvement. Future plans include:
Enhancing the “Math::Fitting” package to support multidimensional regression.
Exploring the potential of “JavaScript::D3” for more flexible and advanced visualizations.
Conclusion
In summary, the combination of “Data::Geographics” and “JavaScript::Google::Charts” in Raku provides a powerful toolkit for geographic data visualization and analysis. “JavaScript::D3” is also very applicable for exploratory data analysis. The function objects (functors) created by “Math::Nearest” and “Math::Fitting” make them very convenient to use.
Remark: Mermaid-JS specs are automatically rendered in GitHub Markdown files, and have plug-in support in Integrated Development Environments (IDEs) like IntelliJ IDEA, VS Code, Emacs, etc.
Examples using Large Language Models (LLMs) are provided. (Via “WWW::OpenAI” and “WWW::PaLM”, [AAp4, AAp5].)
The document has the following structure:
Reflections and observations: I.e. “conclusions first.” These make the anecdotal examples below more scientific, not just conjecture-worthy anecdotes.
Packages and interactions: A list of the packages utilized and how they can be used in a coherent way.
Generating Mermaid diagrams for EBNFs: Two simple, similar EBNF grammars with graphs for instructive comparison.
Generating graphs with LLMs: Can LLMs replace Raku programming (of grammar-graph making)?
Generating Mermaid diagrams for Raku grammars: The same exercise, but over Raku grammars instead of EBNFs.
LLM grammars for sentence collections: An interesting way to derive EBNFs is to “just ask.” Well, some questions might get really long (and their answers really expensive.)
Random LLM grammars: Why ask for EBNFs of sentence collections, if we can just say “Give me a random EBNF grammar (or five).”
More complicated grammar: A larger grammar example, in order to illustrate the usefulness (or uselessness) of grammar-graphs.
Reflections and observations
I consider grammar graph representation “neat” and “cool” for small enough grammars, but I am not sure how useful it is for large grammars.
Especially, large grammars with recursive dependencies between the production rules.
I made a fair amount of experiments with relatively small grammars, and fewer experiments with a few large grammars.
The ability to make the grammar graphs, of course, has at least didactic utility.
Another utility of the examples given below is to show coherent interaction between the packages:
This Markdown document is “executed” with the package “Text::CodeProcessing”, which allows generation of Markdown specs that are, say, automatically processed by Web browsers, IDEs, etc.
Like Mermaid-JS charts and graphs.
Visualizing grammars generated by Large Language Models (LLMs) — like, ChatGPT and PaLM — is both didactic and “neat.”
One of my primary motivations for making the packages “FunctionalParsers” and “EBNF::Grammar” was to be able to easily (automatically or semi-automatically) process grammars generated with LLMs.
It is not trivial to parse EBNF hallucinations by LLMs. (More details below.)
Generating Raku grammars with ChatGPT-3.5 or PaLM often produces “non-working” grammars. That is why I focused on EBNF grammars.
My assumption is that EBNF has been around for a longer period of time, hence, LLMs are “better trained for it.”
This Markdown document can be converted into a Mathematica notebook using “Markdown::Grammar”, [AAp6]. Mathematica notebooks in RakuMode, [AAp7], make the experiments with diagram generation and LLM utilization much easier. (And more fun!)
Packages and interactions
Here we load the packages used below:
use FunctionalParsers;
use FunctionalParsers::EBNF;
use EBNF::Grammar;
use Grammar::TokenProcessing;
use WWW::OpenAI;
use WWW::PaLM;
# (Any)
Here are flowcharts that summarize use cases, execution paths, and interaction between the packages:
Generating Mermaid diagrams for EBNFs
The function fp-ebnf-parse can produce Mermaid-JS diagrams corresponding to grammars with the target “MermaidJS::Graph”. Here is an example:
The order of parsing in sequences is indicated with integer labels
Pick-left and pick-right sequences use the labels “L” and “R” for the corresponding branches
Remark: The Markdown cell above has the parameters output-lang=mermaid, output-prompt=NONE which allow for direct diagram rendering of the obtained Mermaid code in various Markdown viewers (GitHub, IntelliJ, etc.)
Compare the following EBNF grammar and corresponding diagram with the ones above:
It is interesting to see whether LLMs do better at producing (Mermaid-JS) graph representations.
More importantly, we want to answer the question:
Can we generate graph-specs (like, Mermaid-JS) without the need of programming the corresponding interpreters?
Here is a LLM request for a Mermaid-JS spec generation for one of the simple grammars above:
my $request = "Make a Mermaid JS diagram for the EBNF grammar:\n$ebnfCode1";
#my $mmdLLM = openai-completion($request, max-tokens => 600, format => 'values', temperature => 1.2);
my $mmdLLM = palm-generate-text($request, max-tokens => 600, format => 'values', temperature => 0.9);
# ```mermaid
# graph LR
# top[top]
# a[a] --> top
# b[b] --> top
# a --> "a"
# a --> "{"
# a --> "A"
# a --> "}"
# a --> "["
# a --> "1"
# a --> "]"
# b --> "b"
# b --> "("
# b --> "B"
# b --> ")"
# b --> "2"
# ```
Here is the corresponding graph:
$mmdLLM
Remark: After multiple experiments I can say that the obtained Mermaid-JS code is either:
Simple, somewhat relevant, and wrong
Closer to correct after suitable manual editing
As for the question above — the answer is “No”. But the LLM answers provide (somewhat) good initial versions for manual (human) building of graph specifications.
Generating Mermaid diagrams for Raku grammars
In order to generate graphs for Raku grammars we use the following steps:
Translate Raku-grammar code into EBNF code
Translate EBNF code into graph code (Mermaid-JS or WL)
Consider a grammar for parsing proclaimed feeling toward different programming languages:
Here is an EBNF grammar generated with ChatGPT, [AAp4], over a list of chemical formulas:
#my @sentences = <BrI BrClH2Si CCl4 CH3I C2H5Br H2O H2O4S AgBr AgBrO AgBrO2 AgBrO3 AgBrO4 AgCL>;
my @sentences = <AgBr AgBrO AgBrO2 AgBrO3 AgBrO4 AgCL>;
my $request = "Generate EBNF grammar for the sentences:\n{@sentences.map({ $_.comb.join(' ')}).join("\n")}";
#my $ebnfLLM = openai-completion($request, max-tokens => 600, format => 'values');
my $ebnfLLM = palm-generate-text($request, max-tokens => 600, format => 'values');
# ```
# <sentence> ::= <element> <element>
# <element> ::= <letter> | <letter> <element>
# <letter> ::= A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
# ```
Often LLM requests like the ones above return code as Markdown code cells; hence, we try to remove the code-cell markings:
# <sentence> ::= <element> <element>
# <element> ::= <letter> | <letter> <element>
# <letter> ::= A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
fp-grammar-graph($ebnfLLM, style => Whatever)
Another way to verify a grammar is to generate random sentences with it:
.say for ebnf-random-sentence($ebnfLLM, 12, style => Whatever)
# O
# A
# G H
# I
# D N
# I L
# J L
# P Y
# J F
# M I
# K
# P J
Remark: Random sentences can be also generated with the function fp-random-sentence provided by “FunctionalParsers”.
Remark: The function ebnf-random-sentence uses fp-random-sentence, but ebnf-random-sentence first (parses and) standardizes the given EBNF grammar, and then gives the standardized grammar to fp-random-sentence.
Remark: It is not trivial to parse EBNF hallucinations by LLMs. For the same EBNF-making request a given LLM can produce different EBNF grammars, each having “its own” EBNF style. Hence, both “FunctionalParsers” and “EBNF::Grammar” have parsers for different EBNF styles. With the spec style => Whatever, parsing with all of the “anticipated” styles is attempted.
Random LLMs grammars
Why ask for EBNF grammars of sentence collections, if we can just say:
Give me a random EBNF grammar. (Or five.)
Remark: Note the implications of testing the parsers in that way: we can produce extensive parser tests using multiple randomly obtained grammars from different LLMs. (Using different LLM parameters, like temperature, etc.)
Here is another example using a random (hopefully small) EBNF grammar:
my $request2 = "Give an example of simple EBNF grammar.";
#my $ebnfLLM2 = openai-completion($request2, max-tokens => 600, format => 'values', temperature => 1.2);
my $ebnfLLM2 = palm-generate-text($request2, max-tokens => 600, format => 'values', temperature => 0.9);
$ebnfLLM2 = $ebnfLLM2.subst(/ ^ '`' ** 3 <-[\v]>* \n | '`' ** 3 \h* $ /,''):g;