Give me a 5-step implementation plan for adding authentication to a FastAPI app. VERY CONCISE.
Magic cell parameter values can be assigned using the equal sign (“=”):
#%chat assistant1 > markdown
Now rewrite step 2 with test-first details.
Default chat object (NONE)
#%chat
Does vegetarian sushi exist?
# Yes, vegetarian sushi definitely exists! It's a popular option for those who avoid fish or meat. Instead of raw fish, vegetarian sushi typically includes ingredients like:
- Avocado
- Cucumber
- Carrots
- Pickled radish (takuan)
- Asparagus
- Sweet potato
- Mushrooms (like shiitake)
- Tofu or tamago (Japanese omelette)
- Seaweed salad
These ingredients are rolled in sushi rice and nori seaweed, just like traditional sushi. Vegetarian sushi can be found at many sushi restaurants and sushi bars, and it's also easy to make at home.
Gist: LLM::Functions::Chat(chat-id = assistant1, llm-evaluator.conf.name = ChatGPT, messages.elems = 6, last.message = ${:content("2. Write tests to verify user data retrieval and password verification; then define user model and fake user database accordingly."), :role("assistant"), :timestamp(DateTime.new(2026,3,14,9,23,6.901396036148071,:timezone(-14400)))})
Clear message history of one persona (keep persona)
Give me three Linux troubleshooting tips. VERY CONCISE.
Remark: In order to run the magic cell above, you have to have a llamafile program/model running on your computer. (For example, ./google_gemma-3-12b-it-Q4_K_M.llamafile.)
A dark-mode digital painting of a lighthouse in stormy weather.
Here we use a DALL-E meta cell to see how many images were generated in a notebook session:
#% dalle meta
elems
# 3
Here we export the second image — using the index 1 — into a file named “stormy-weather-lighthouse-2.png”:
#% dalle export, index=1
stormy-weather-lighthouse-2.png
# stormy-weather-lighthouse-2.png
Here we show all generated images:
#% dalle meta
show
Here we export all images (into file names with the prefix “cheatsheet”):
#% dalle export, index=all, prefix=cheatsheet
6) LLM provider access facilitation
API keys can be passed inline (api-key) or through environment variables.
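For example, an API key can be passed inline as a magic cell parameter; the key value below is a placeholder:

```
#%chat, api-key=YOUR_OPENAI_KEY
Does vegetarian sushi exist?
```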
Notebook-session environment setup
%*ENV<OPENAI_API_KEY> = "YOUR_OPENAI_KEY";
%*ENV<GEMINI_API_KEY> = "YOUR_GEMINI_KEY";
%*ENV<OLLAMA_API_KEY> = "YOUR_OLLAMA_KEY";
Ollama-specific defaults:
OLLAMA_HOST (default host fallback is http://localhost:11434)
OLLAMA_MODEL (default model if model=... not given)
The magic cells take a base-url argument, which allows using LLMs that have ChatGPT-compatible APIs. The argument base-url is a synonym of host for the magic cell #%ollama.
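For illustration, here is a hypothetical magic cell targeting a locally hosted, OpenAI-compatible endpoint; both the URL and the model name are placeholders:

```
#%chat, base-url=http://localhost:8080/v1, model=local-model
Give me three Linux troubleshooting tips. VERY CONCISE.
```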
7) Notebook/chatbook session initialization with custom code + personas JSON
Initialization runs when the extension is loaded.
A) Custom Raku init code
Env var override: RAKU_CHATBOOK_INIT_FILE
If not set, first existing file is used in this order:
~/.config/raku-chatbook/init.raku
~/.config/init.raku
Use this for imports/helpers you always want in chatbook sessions.
B) Pre-load personas from JSON
Env var override: RAKU_CHATBOOK_LLM_PERSONAS_CONF
If not set, first existing file is used in this order:
~/.config/raku-chatbook/llm-personas.json
~/.config/llm-personas.json
The supported JSON shape is an array of dictionaries:
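For illustration, here is a minimal sketch of such a file. The key names below ("name", "conf", "prompt") are assumptions for illustration only; consult the "Jupyter::Chatbook" documentation for the actual schema:

```json
[
  {
    "name": "raku",
    "conf": "ChatGPT",
    "prompt": "You are an expert Raku programmer. Answer with Raku code."
  },
  {
    "name": "snarky",
    "conf": "Gemini",
    "prompt": "You answer in a snarky tone."
  }
]
```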
Interesting analogies of Rock-Paper-Scissors (RPS) hand games can be made with military forces interactions; see [AAv1]. Those analogies are easily seen using graphs. For example, the extension of the graph of Rock-Paper-Scissors-Lizard-Spock, [Wv1], into the graph “Chuck Norris defeats all” is analogous to the extension of “older” (say, WWII) military forces interactions graphs with drones.
Here is the graph of Rock-Paper-Scissors-Lizard-Spock-ChuckNorris, [AA1]:
In this document (notebook), we use Raku to create graphs that show how military forces interact. We apply the know-how for making graphs for RPS-games detailed in the blog post “Rock-Paper-Scissors extensions”, [AA1].
We can define an LLM function that provides the graph edges dataset for different RPS variants. Here is such an LLM function using “LLM::Functions”, [AAp1], and “LLM::Prompts”, [AAv2]:
my sub rps-edge-dataset($description, Str:D $game-name = 'Rock-Paper-Scissors', *%args) {
llm-synthesize([
"Give the edges of the graph for this $game-name variant description",
'Give the edges as an array of dictionaries. Each dictionary with keys "from", "to", "label",',
'where "label" has the action of "from" over "to".',
$description,
llm-prompt('NothingElse')('JSON')
],
e => %args<llm-evaluator> // %args<e> // %args<conf> // $conf4o-mini,
form => sub-parser('JSON'):drop
)
}
Remark: We reuse the sub definition rps-edge-dataset from [AA1].
Rock-Paper-Scissors and its Lizard-Spock extensions
Here is the graph of the standard RPS game and its “Lizard-Spock” extension:
#% html
# Graph edges: LLM-generated and LLM-translated
my @edges-emo =
{ from => '🪨', to => '✂️', label => 'crushes' },
{ from => '✂️', to => '📄', label => 'cuts' },
{ from => '📄', to => '🪨', label => 'covers' },
{ from => '🪨', to => '🦎', label => 'crushes' },
{ from => '🦎', to => '🖖', label => 'poisons' },
{ from => '🖖', to => '✂️', label => 'smashes' },
{ from => '✂️', to => '🦎', label => 'decapitates' },
{ from => '🦎', to => '📄', label => 'eats' },
{ from => '📄', to => '🖖', label => 'disproves' },
{ from => '🖖', to => '🪨', label => 'vaporizes' }
;
# Edge-label rules
my %edge-labels-emo;
@edges-emo.map({ %edge-labels-emo{$_<from>}{$_<to>} = $_<label> });
# RPS-3 Lizard-Spock extension
my $g-emo = Graph.new(@edges-emo, :directed);
# Standard RPS-3 as a subgraph
my $g-rps = $g-emo.subgraph(<🪨 ✂️ 📄>);
# Plot the graphs together
$g-rps.dot(|%opts, edge-labels => %edge-labels-emo, :svg)
~
$g-emo.dot(|%opts, edge-labels => %edge-labels-emo, :svg)
Simple analogy
We consider the following military analogy with RPS:
Tanks attack (and defeat) Infantry
Guerillas defend against Tanks
Infantry attacks Guerillas
Here we obtain the corresponding graph edges using an LLM:
my $war-game = rps-edge-dataset('tanks attack infantry, guerillas defend against tanks, infantry attacks guerillas')
# [{from => Tanks, label => attack, to => Infantry} {from => Guerillas, label => defend, to => Tanks} {from => Infantry, label => attack, to => Guerillas}]
Plotting the graphs together:
#% html
my %edge-labels = Empty;
for |$war-game -> %r { %edge-labels{%r<from>}{%r<to>} = %r<label> };
Graph.new($war-game, :directed).dot(|%opts-plain, :%edge-labels, :svg)
~
$g-rps.dot(|%opts, edge-labels => %edge-labels-emo, :svg)
Military forces interaction
Here is a Mermaid-JS-made graph of a more complicated military forces interactions diagram; see [NM1]:
Using the diagram’s Mermaid code, here the graph edges are LLM-generated:
#% html
my $mmd-descr = q:to/END/;
graph TD
AT[Anti-tank weapons] --> |defend|Arm[Armor]
Arm --> |attack|IA[Infantry and Artillery]
Air[Air force] --> |attack|Arm
Air --> |attack|IA
M[Missiles] --> |defend|Air
IA --> |attack|M
IA --> |attack|AT
END
my $war-game2 = rps-edge-dataset($mmd-descr);
$war-game2 ==> to-html(field-names => <from label to>)
Direct assignment (instead of using LLMs):
my $war-game2 = $[
{:from("Anti-tank weapons"), :label("defend"), :to("Armor")}, {:from("Armor"), :label("attack"), :to("Infantry and Artillery")},
{:from("Air force"), :label("attack"), :to("Armor")}, {:from("Air force"), :label("attack"), :to("Infantry and Artillery")},
{:from("Missiles"), :label("defend"), :to("Air force")}, {:from("Infantry and Artillery"), :label("attack"), :to("Missiles")},
{:from("Infantry and Artillery"), :label("attack"), :to("Anti-tank weapons")}
];
The diagram does not correspond to modern warfare — it is taken from a doctoral thesis, [NM1], discussing reconstruction of historical military data. The corresponding graph can be upgraded with drones in a similar way as the Chuck-Norris-defeats-all upgrade in [AA1].
my $war-forces = Graph.new($war-game2, :directed);
my $drone = "Air drones";
my $war-game-d = $war-game2.clone.append( $war-forces.vertex-list.map({ %( from => $drone, to => $_, label => 'attack' ) }) );
$war-game-d .= append( ['Missiles', 'Air force'].map({ %(from => $_, to => $drone, label => 'defend') }) );
my $war-forces-d = Graph.new($war-game-d, :directed);
It is easy to make a simple Rock-Paper-Scissors (RPS) game graph using the Raku package “Graph”, [AAp1]. Here is such a graph in which the arrow directions indicate which item (vertex) wins:
In this post (notebook) we show how to do all of the above points.
Remark: Interesting analogies of the presented graphs can be made with warfare graphs, [AAv1]. For example, the graph tanks-infantry-guerillas is analogous to RPS.
TL;DR
LLMs “know” the RPS game and its upgrades.
LLMs know how to (mostly, reliably) translate to emojis.
The package “Graph” (via Graphviz DOT) can produce SVG plots that are readily rendered in different environments.
And the graphs of hand-games like RPS look good.
The class Graph has handy methods and attributes that make the creation and modification of graphs smooth(er).
Setup
This notebook is a Raku-chatbook, hence, its Jupyter session preloads certain packages and LLM-personas.
# Preloaded in any chatbook
# use LLM::Functions;
# use LLM::Prompts;
# Preloaded in a user init file
# use Graph;
# For this concrete session
use Text::Emoji;
LLM configurations:
my $conf4o = llm-configuration('chat-gpt', model => 'gpt-4o', :4096max-tokens, temperature => 0.4);
my $conf4o-mini = llm-configuration('chat-gpt', model => 'gpt-4o-mini', :4096max-tokens, temperature => 0.4);
($conf4o, $conf4o-mini)».Hash».elems
Raku-chatbooks, [AAp4], can have initialization Raku code and specified preloaded LLM-personas. One such LLM-persona is “raku”. Here we use the “raku” chat object to get Raku code for the edges of the RPS extension Rock-Paper-Scissors-Lizard-Spock, [Wv1].
#% chat raku
Make an array the edges of a graph for the game Rock-Paper-Scissors-Lizard-Spock.
Each edge is represented with a hash with the keys "from", "to", "label".
The label corresponds to the action taken with the edge, like, "Paper covers Rock", "Paper disproves Spock".
my @edges = (
{ from => 'Rock', to => 'Scissors', label => 'Rock crushes Scissors' },
{ from => 'Rock', to => 'Lizard', label => 'Rock crushes Lizard' },
{ from => 'Paper', to => 'Rock', label => 'Paper covers Rock' },
{ from => 'Paper', to => 'Spock', label => 'Paper disproves Spock' },
{ from => 'Scissors', to => 'Paper', label => 'Scissors cuts Paper' },
{ from => 'Scissors', to => 'Lizard', label => 'Scissors decapitates Lizard' },
{ from => 'Lizard', to => 'Spock', label => 'Lizard poisons Spock' },
{ from => 'Lizard', to => 'Paper', label => 'Lizard eats Paper' },
{ from => 'Spock', to => 'Scissors', label => 'Spock smashes Scissors' },
{ from => 'Spock', to => 'Rock', label => 'Spock vaporizes Rock' },
);
We use the generated code in the next section.
Plain text graph
Here we create the Rock-Paper-Scissors-Lizard-Spock graph generated with the LLM-magic cell above:
my @edges =
{ from => 'Rock', to => 'Scissors', label => 'Rock crushes Scissors' },
{ from => 'Scissors', to => 'Paper', label => 'Scissors cuts Paper' },
{ from => 'Paper', to => 'Rock', label => 'Paper covers Rock' },
{ from => 'Rock', to => 'Lizard', label => 'Rock crushes Lizard' },
{ from => 'Lizard', to => 'Spock', label => 'Lizard poisons Spock' },
{ from => 'Spock', to => 'Scissors', label => 'Spock smashes Scissors' },
{ from => 'Scissors', to => 'Lizard', label => 'Scissors decapitates Lizard' },
{ from => 'Lizard', to => 'Paper', label => 'Lizard eats Paper' },
{ from => 'Paper', to => 'Spock', label => 'Paper disproves Spock' },
{ from => 'Spock', to => 'Rock', label => 'Spock vaporizes Rock' }
;
my $g = Graph.new(@edges, :directed);
Remark: Currently the class Graph does not “deal” with edge labels, but some of its methods (like dot) do.
Convenient LLM functions
Graph edges
Instead of using chat-cells, we can define an LLM function that provides the graph edges dataset for different RPS variants. Here is such an LLM function using “LLM::Functions”, [AAp1], and “LLM::Prompts”, [AAv2]:
my sub rps-edge-dataset($description, Str:D $game-name = 'Rock-Paper-Scissors', *%args) {
llm-synthesize([
"Give the edges of the graph for this $game-name variant description",
'Give the edges as an array of dictionaries. Each dictionary with keys "from", "to", "label",',
'where "label" has the action of "from" over "to".',
$description,
llm-prompt('NothingElse')('JSON')
],
e => %args<llm-evaluator> // %args<e> // %args<conf> // $conf4o-mini,
form => sub-parser('JSON'):drop
)
}
Remark: Both “LLM::Functions” and “LLM::Prompts” are pre-loaded in Raku chatbooks.
Emoji translations
We can translate to emojis the plain-text vertex labels of RPS graphs in several ways:
Again, let us define an LLM function that does the emojification (i.e., for option 3).
One way is to do a simple application of the prompt “Emojify” and process its result into a dictionary:
my $res = llm-synthesize( llm-prompt("Emojify")($g.vertex-list), e => $conf4o-mini );
$res.split(/\s+/, :skip-empty)».trim.Hash
It is better to have a function that provides a more “immediate” result:
my sub emoji-rules($words, *%args) {
llm-synthesize( [
llm-prompt("Emojify")($words),
'Make a JSON dictionary of the original words as keys and the emojis as values',
llm-prompt('NothingElse')('JSON')
],
e => %args<llm-evaluator> // %args<e> // %args<conf> // $conf4o-mini,
form => sub-parser('JSON'):drop
)
}
Emoji graph
Let us remake the game graph using suitable emojis. Here are the corresponding edges:
my @edges-emo =
{ from => '🪨', to => '✂️', label => 'crushes' },
{ from => '✂️', to => '📄', label => 'cuts' },
{ from => '📄', to => '🪨', label => 'covers' },
{ from => '🪨', to => '🦎', label => 'crushes' },
{ from => '🦎', to => '🖖', label => 'poisons' },
{ from => '🖖', to => '✂️', label => 'smashes' },
{ from => '✂️', to => '🦎', label => 'decapitates' },
{ from => '🦎', to => '📄', label => 'eats' },
{ from => '📄', to => '🖖', label => 'disproves' },
{ from => '🖖', to => '🪨', label => 'vaporizes' }
;
my $g-emo = Graph.new(@edges-emo, :directed);
Here is the interactions table of the upgraded game:
#% html
game-table($g-chuck)
|      | ✊🏻 | ✋🏻 | ✌🏻 | 🖖🏻 | 🤏🏻 | 🦶🏻 |
|------|------|------|------|------|------|------|
| ✊🏻 |      | –    | +    | –    | +    | –    |
| ✋🏻 | +    |      | –    | +    | –    | –    |
| ✌🏻 | –    | +    |      | –    | +    | –    |
| 🖖🏻 | +    | –    | +    |      | –    | –    |
| 🤏🏻 | –    | +    | –    | +    |      | –    |
| 🦶🏻 | +    | +    | +    | +    | +    |      |
In order to ensure that we get an “expected” graph plot, we can take the vertex coordinates of a wheel graph or compute them by hand. Here we do the latter:
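For example, here is a minimal, illustrative sketch of computing such coordinates by hand: the five standard vertices are spaced evenly on a unit circle and the extra vertex is placed at the center. (The variable names are illustrative, not from the session above.)

```raku
# Illustrative sketch: evenly spaced circle coordinates for the five
# standard vertices; the "Chuck Norris" vertex goes at the center.
my @vs = <✊🏻 ✋🏻 ✌🏻 🖖🏻 🤏🏻>;
my %coords = @vs.kv.map(-> $i, $v {
    $v => [cos(pi/2 + $i * 2 * pi / 5), sin(pi/2 + $i * 2 * pi / 5)]
});
%coords{'🦶🏻'} = [0, 0];
```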
We can use “LLM vision” to get the colors of the original image:
my $url = 'https://www.merchandisingplaza.us/40488/2/T-shirts-Chuck-Norris-Chuck-Norris-Rock-Paper-Scissors-Lizard-Spock-TShirt-l.jpg';
llm-vision-synthesize('What are the dominant colors in this image? Give them in hex code.', $url)
The dominant colors in the image are:
- Olive Green: #5B5D4A
- Beige: #D0C28A
- White: #FFFFFF
- Black: #000000
Graph generating with LLMs
Instead of specifying the graph edges by hand, we can use LLM-vision and suitable prompting. The results are not that good, but YMMV.
my $res2 =
llm-vision-synthesize([
'Give the edges of the graph for this image of Rock-Paper-Scissors-Lizard-Spock-Chuck -- use relevant emojis.',
'Give the edges as an array of dictionaries. Each dictionary with keys "from" and "to".',
llm-prompt('NothingElse')('JSON')
],
$url,
e => $conf4o,
form => sub-parser('JSON'):drop
)
# [{from => ✋, to => ✌️} {from => ✌️, to => ✊} {from => ✊, to => 🦎} {from => 🦎, to => 🖖} {from => 🖖, to => ✋} {from => ✋, to => ✊} {from => ✊, to => ✋} {from => ✌️, to => 🦎} {from => 🦎, to => ✋} {from => 🖖, to => ✌️} {from => ✌️, to => 🖖} {from => 🖖, to => ✊}]
#% html
Graph.new($res2, :directed).dot(:5graph-size, engine => 'neato', arrow-size => 0.5):svg
Rock-Paper-Scissors-Fire-Water
One notable variant is Rock-Paper-Scissors-Fire-Water. Here is its game table:
#% html
my @edges = |('🔥' X=> $g0.vertex-list), |($g0.vertex-list X=> '💦'), '💦' => '🔥';
my $g-fire-water = $g0.clone.edge-add(@edges, :directed);
game-table($g-fire-water)
|      | ✂️ | 💦 | 📄 | 🔥 | 🪨 |
|------|-----|-----|-----|-----|-----|
| ✂️  |     | +   | +   | –   | –   |
| 💦  | –   |     | –   | +   | –   |
| 📄  | –   | +   |     | –   | +   |
| 🔥  | +   | –   | +   |     | +   |
| 🪨  | +   | +   | –   | –   |     |
Here is the graph:
#% html
$g-fire-water.dot(|%opts, engine => 'neato'):svg
my $txt = data-import('https://www.umop.com/rps9.htm', 'plaintext');
text-stats($txt)
# (chars => 2143 words => 355 lines => 46)
Extract the game description:
my ($start, $end) = 'relationships in RPS-9:', 'Each gesture beats out';
my $txt-rps9 = $txt.substr( $txt.index($start) + $start.chars .. $txt.index($end) - 1 )
ROCK POUNDS OUT
FIRE, CRUSHES SCISSORS, HUMAN &
SPONGE.
FIRE MELTS SCISSORS,
BURNS PAPER, HUMAN & SPONGE.
SCISSORS SWISH THROUGH AIR,
CUT PAPER, HUMAN & SPONGE.
HUMAN CLEANS WITH SPONGE,
WRITES PAPER, BREATHES
AIR, DRINKS WATER.
SPONGE SOAKS PAPER, USES
AIR POCKETS, ABSORBS WATER,
CLEANS GUN.
PAPER FANS AIR,
COVERS ROCK, FLOATS ON WATER,
OUTLAWS GUN.
AIR BLOWS OUT FIRE,
ERODES ROCK, EVAPORATES WATER,
TARNISHES GUN.
WATER ERODES ROCK, PUTS OUT
FIRE, RUSTS SCISSORS & GUN.
GUN TARGETS ROCK,
FIRES, OUTCLASSES SCISSORS, SHOOTS HUMAN.
Here we invoke the defined LLM function to get the edges of the corresponding graph:
my @rps-edges = |rps-edge-dataset($txt-rps9)
# [{from => ROCK, label => POUNDS OUT, to => FIRE} {from => ROCK, label => CRUSHES, to => SCISSORS} {from => ROCK, label => CRUSHES, to => HUMAN}, ..., {from => GUN, label => FIRES, to => FIRE}]
Here we translate the plaintext vertices into emojis:
my %emojied = emoji-rules(@rps-edges.map(*<from to>).flat.unique.sort)
# {AIR => 🌬️, FIRE => 🔥, GUN => 🔫, HUMAN => 👤, PAPER => 📄, ROCK => 🪨, SCISSORS => ✂️, SPONGE => 🧽, WATER => 💧}
In the (very near) future I plan to use the built-up RPS graph making know-how to make military forces interaction graphs. (Discussed in [AJ1, SM1, NM1, AAv1].)
The Doomsday Clock is a symbolic timepiece maintained by the Bulletin of the Atomic Scientists (BAS) since 1947. It represents how close humanity is perceived to be to global catastrophe, primarily nuclear war but also including climate change and biological threats. The clock’s hands are set annually to reflect the current state of global security; midnight signifies theoretical doomsday.
We take text data from the past announcements, and extract the Doomsday Clock reading statements.
Evolution of Doomsday Clock times
- We extract relevant Doomsday Clock timeline data from the corresponding Wikipedia page (instead of using a page from BAS).
- We show how timeline data from that Wikipedia page can be processed with LLMs.
- The resulting plot shows the evolution of the minutes to midnight.
- The plot could show trends, highlighting significant global events that influenced the clock setting; hence, we put in informative callouts and tooltips.
The data extraction and visualization in the post (notebook) serve educational purposes or provide insights into historical trends of global threats as perceived by experts. We try to make the ingestion and processing code universal and robust, suitable for multiple evaluations now or in the (near) future.
Remark: Keep in mind that the Doomsday Clock is a metaphor and its settings are not just data points but reflections of complex global dynamics (by certain experts and a board of sponsors.)
Remark: This post (notebook) is the Raku-version of the Wolfram Language (WL) notebook with the same name, [AAn1]. That is why the “standard” Raku-grammar approach is not used. (Although, in the preliminary versions of this work relevant Raku grammars were generated via both LLMs and Raku packages.)
I was very impressed by the looks and tune-ability of WL’s ClockGauge, so, I programmed a similar clock gauge in Raku’s package “JavaScript::D3” (which is based on D3.js.)
Setup
use LLM::Functions;
use LLM::Prompts;
use LLM::Configurations;
use Text::SubParsers;
use Data::Translators;
use Data::TypeSystem;
use Data::Importers;
use Data::Reshapers;
use Hash::Merge;
use FunctionalParsers :ALL;
use FunctionalParsers::EBNF;
use Math::DistanceFunctions::Edit;
use Lingua::NumericWordForms;
JavaScript::D3
my $background = 'none';
my $stroke-color = 'Ivory';
my $fill-color = 'none';
JavaScript::Google::Charts
my $format = 'html';
my $titleTextStyle = { color => 'Ivory' };
my $backgroundColor = '#1F1F1F';
my $legendTextStyle = { color => 'Silver' };
my $legend = { position => "none", textStyle => {fontSize => 14, color => 'Silver'} };
my $hAxis = { title => 'x', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'scientific'};
my $vAxis = { title => 'y', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'scientific'};
my $annotations = {textStyle => {color => 'Silver', fontSize => 10}};
my $chartArea = {left => 50, right => 50, top => 50, bottom => 50, width => '90%', height => '90%'};
my $background = '#1F1F1F';
Functional parsers
my sub parsing-test-table(&parser, @phrases) {
my @field-names = ['statement', 'parser output'];
my @res = @phrases.map({ @field-names Z=> [$_, &parser($_.words).raku] })».Hash.Array;
to-html(@res, :@field-names)
}
Data ingestion
Here we ingest the Doomsday Clock timeline page and show corresponding statistics:
my $url = "https://thebulletin.org/doomsday-clock/past-announcements/";
my $txtEN = data-import($url, "plaintext");
text-stats($txtEN)
# (chars => 73722 words => 11573 lines => 756)
By observing the (plain) text of that page, we see that the Doomsday Clock time setting can be extracted from the sentence(s) that begin with the following phrase:
my $start-phrase = 'Bulletin of the Atomic Scientists';
my $sentence = $txtEN.lines.first({ / ^ $start-phrase /})
# Bulletin of the Atomic Scientists, with a clock reading 90 seconds to midnight
Remark: The EBNF grammar above can be obtained with LLMs using a suitable prompt with example sentences. (We do not discuss that approach further in this notebook.)
Here the parsing functions are generated from the EBNF string above:
my @defs = fp-ebnf-parse($ebnf, <CODE>, name => 'Doomed2', actions => 'Raku::Code').head.tail;
.say for @defs.reverse
# my &pINTEGER = apply(&{ $_.Int }, symbol('_Integer'));
# my &pSECONDS = sequence-pick-left(&pINTEGER, (alternatives(symbol('second'), symbol('seconds'))));
# my &pMINUTES = sequence-pick-left(&pINTEGER, (alternatives(symbol('minute'), symbol('minutes'))));
# my &pANY = symbol('_String');
# my &pOPENING = sequence(option(many(&pANY)), sequence(symbol('clock'), sequence(option(symbol('is')), symbol('reading'))));
# my &pCLOCK-READING = sequence(&pOPENING, sequence((alternatives(&pMINUTES, sequence(option(sequence(&pMINUTES, option(alternatives(symbol('and'), symbol(','))))), &pSECONDS))), sequence(symbol('to'), symbol('midnight'))));
# my &pTOP = &pCLOCK-READING;
Remark: The function fp-ebnf-parse has a variety of actions for generating code from EBNF strings. For example, with actions => 'Raku::Class' the generation above would produce a class, which might be more convenient for further development (via inheritance or direct changes.)
Here the imperative code above — assigned to @defs — is re-written using the infix form of the parser combinators:
We must redefine the parser pANY (corresponding to the EBNF rule “<any>”) in order to prevent pANY from gobbling the word “clock” and thereby making the parser pOPENING fail.
&pANY = satisfy({ $_ ne 'clock' && $_ ~~ /\w+/});
Here are random sentences generated with the grammar:
.say for fp-random-sentence($ebnf, 12).sort;
# clock reading 681 minutes to midnight
# clock reading 788 minutes to midnight
# clock is reading 584 seconds to midnight
# clock is reading 721 second to midnight
# clock is reading 229 minute and 631 second to midnight
# clock is reading 458 minutes to midnight
# clock is reading 727 minute to midnight
# F3V; clock is reading 431 minute to midnight
# FXK<GQ 3RJJJ clock is reading 369 seconds to midnight
# NRP FNSEE K0EQO OPE clock is reading 101 minute to midnight
# QJDV; R<K7S; JMQ>HD AA31 clock is reading 369 minute 871 second to midnight
# QKQGK FZJ@BB M8C1BD BPI;C: clock reading 45 minute 925 second to midnight
Here is a verification table with correct and incorrect spellings:
#% html
my @phrases =
"doomsday clock is reading 2 seconds to midnight",
"dooms day cloc is readding 2 minute and 22 sekonds to mildnight";
parsing-test-table(shortest(&pCLOCK-READING), @phrases)
| statement | parser output |
|-----------|---------------|
| doomsday clock is reading 2 seconds to midnight | (((), {:second(2)}),) |
| dooms day cloc is readding 2 minute and 22 sekonds to mildnight | (((), {:minute(2), :second(22)}),) |
Parsing of numeric word forms
One way to make the parsing more robust is to implement the ability to parse integer names (or numeric word forms) not just integers.
Remark: &pINTEGER has to be evaluated before the definitions of the rest of the parsers programmed in the previous subsection.
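The conversion of numeric word forms is provided by the package “Lingua::NumericWordForms”, loaded in the setup section. Here is a minimal, illustrative use of its from-numeric-word-form sub:

```raku
use Lingua::NumericWordForms;

# Convert English numeric word forms into integers
say from-numeric-word-form('forty five');
say from-numeric-word-form('two');
```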
Let us try the new parser using integer names for the clock time:
my $str = "the doomsday clock is reading two minutes and forty five seconds to midnight";
$str.words
==> take-first(&pCLOCK-READING)()
# ((() {minute => 2, second => 45}))
Enhance with LLM parsing
There are multiple ways to employ LLMs for extracting “clock readings” from arbitrary statements for Doomsday Clock readings, readouts, and measures. Here we use LLM few-shot training:
my &flop = llm-example-function([
"the doomsday clock is reading two minutes and forty five seconds to midnight" => '{"minute":2, "second": 45}',
"the clock of the doomsday gives 92 seconds to midnight" => '{"minute":0, "second": 92}',
"The bulletin atomic scientist maybe is set to a minute an 3 seconds." => '{"minute":1, "second": 3}'
],
e => $conf4o,
form => sub-parser('JSON')
)
Here is an example invocation:
&flop("Maybe the doomsday watch is at 23:58:03")
# {minute => 1, second => 57}
The following function combines the parsing with the grammar and the LLM example function — the latter is used for fallback parsing:
my sub get-clock-reading(Str:D $st) {
my $op = just(&pCLOCK-READING)($st.words);
my %h = $op.elems > 0 && $op.head.head.elems == 0 ?? $op.head.tail !! &flop( $st );
return Date.today.DateTime.earlier(seconds => (%h<minute> // 0) * 60 + (%h<second> // 0) )
}
# &get-clock-reading
Robust parser demo
Here is the application of the combined function above over a certain “random” Doomsday Clock statement:
my $s = "You know, sort of, that dooms-day watch is 1 and half minute be... before the big boom. (Of doom...)";
$s.&get-clock-reading
# 2024-12-31T23:58:30Z
Remark: The same type of robust grammar-and-LLM combination is explained in more detail in the video “Robust LLM pipelines (Mathematica, Python, Raku)”, [AAv1]. (See, also, the corresponding notebook [AAn1].)
Timeline
In this section we extract Doomsday Clock timeline data and make a corresponding plot.
Parsing page data
Instead of using the official Doomsday clock timeline page we use Wikipedia. We can extract the Doomsday Clock timeline using LLMs. Here we get the plaintext of the Wikipedia page and show statistics:
my $url = "https://en.wikipedia.org/wiki/Doomsday_Clock";
my $txtWk = data-import($url, "plaintext");
text-stats($txtWk)
my $res;
if False {
$res = llm-synthesize([
"Give the time table of the doomsday clock as a time series that is a JSON array.",
"Each element of the array is a dictionary with keys 'Year', 'MinutesToMidnight', 'Time', 'Summary', 'Description'.",
"Do not shorten or summarize the descriptions -- use their full texts.",
"The column 'Summary' should have summaries of the descriptions, each summary no more than 10 words.",
$txtWk,
llm-prompt("NothingElse")("JSON")
],
e => $conf4o,
form => sub-parser('JSON'):drop
);
} else {
my @field-names = <Year MinutesToMidnight Time Summary Description>;
my $url = 'https://raw.githubusercontent.com/antononcube/RakuForPrediction-blog/refs/heads/main/Data/doomsday-clock-timeline-table.csv';
$res = data-import($url, headers => 'auto');
$res = $res.map({ my %h = $_.clone; %h<Year> = %h<Year>.Int; %h<MinutesToMidnight> = %h<MinutesToMidnight>.Num; %h }).Array
}
deduce-type($res)
#% html
my @field-names = <Year MinutesToMidnight Time Summary Description>;
$res ==> to-html(:@field-names, align => 'left')
Remark: The LLM-derived summaries in the table above are based on the descriptions in the column “Reason” of the Wikipedia data table. The tooltips of the plot below use the summaries.
Timeline plot
In order to have an informative Doomsday Clock evolution plot, we obtain and partition the dataset’s time series into step-function pairs:
my @dsDoomsdayTimes = |$res;
my @ts0 = @dsDoomsdayTimes.map({ <Year MinutesToMidnight role:tooltip> Z=> $_<Year MinutesToMidnight Summary> })».Hash;
my @ts1 = @dsDoomsdayTimes.rotor(2=>-1).map({[
%( <Year MinutesToMidnight mark role:tooltip> Z=> $_.head<Year MinutesToMidnight MinutesToMidnight Summary>),
%( <Year MinutesToMidnight mark role:tooltip> Z=> [$_.tail<Year>, $_.head<MinutesToMidnight>, NaN, ''])
]}).map(*.Slip);
@ts1 = @ts1.push( merge-hash(@ts0.tail, {mark => @ts0.tail<MinutesToMidnight>}) );
deduce-type(@ts1):tally
#% html
js-google-charts('ComboChart',
@ts2,
column-names => <Year MinutesToMidnight mark role:annotation role:tooltip>,
width => 1200,
height => 500,
title => "Doomsday clock: minutes to midnight, {@dsDoomsdayTimes.map(*<Year>).Array.&{ (.min, .max).join('-') }}",
series => {
0 => {type => 'line', lineWidth => 4, color => 'DarkOrange'},
1 => {type => 'scatter', pointSize => 10, opacity => 0.1, color => 'Blue'},
},
hAxis => { title => 'Year', format => '####', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'},
viewWindow => { min => 1945, max => 2026}
},
vAxes => {
0 => { title => 'Minutes to Midnight', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'} },
1 => { titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, ticks => (^18).map({ [ v => $_, f => ($_ ?? "23::{60-$_}" !! '00:00' ) ] }).Array }
},
:$annotations,
:$titleTextStyle,
:$backgroundColor,
:$legend,
:$chartArea,
:$format,
div-id => 'DoomsdayClock',
:!png-button
)
Remark: The plot should be piecewise constant — simple linear interpolation between the blue points would suggest gradual change of the clock times.
Remark: By hovering with the mouse over the blue points the corresponding descriptions can be seen. We considered using clock-gauges as tooltips, but showing clock-settings reasons is more informative.
This can be remedied with the (complicated) HTML tooltip procedure described in [AA1].
But I decided to just make the LLM data extraction produce short summaries of the descriptions.
No right vertical axis ticks
The Doomsday Clock timeline plot in Wikipedia and its reproduction in [AAn1] have the “HH::MM” time axis.
I gave up smoothing out those deficiencies after attempting to fix or address each of them a few times. (It is not that important to figure out Google Charts interface settings for that kind of plots.)
Conclusion
As expected, parsing, plotting, or otherwise processing the Doomsday Clock settings and statements are excellent didactic subjects for textual analysis (or parsing) and temporal data visualization. The visualization could serve educational purposes or provide insights into historical trends of global threats as perceived by experts. (Remember, the clock’s settings are not just data points but reflections of complex global dynamics.)
One possible application of the code in this notebook is to make a “web service“ that gives clock images with Doomsday Clock readings. For example, click on this button:
In this blog post (notebook), we showcase the “magic” cells recently added (May 2024) to the notebooks of “Jupyter::Chatbook”, [AA1, AAp5, AAv1].
“Jupyter::Chatbook” gives “LLM-ready” notebooks and is built on “Jupyter::Kernel”, [BDp1], created by Brian Duggan. A general principle of “Jupyter::Chatbook” is that the Raku packages used to implement the interactive service access cells are also pre-loaded into the notebooks’ Raku contexts. (I.e. at the beginning of the notebooks’ Raku sessions.)
Here is a mind-map that shows the Raku packages that are “pre-loaded” and the available interactive cells:
#% mermaid, format=svg, background=SlateGray
mindmap
  (**Chatbook**)
    (Direct **LLM** access)
      OpenAI
        ChatGPT
        DALL-E
      Google
        PaLM
        Gemini
      MistralAI
      LLaMA
    (Direct **DeepL** access)
      Plain text result
      JSON result
    (**Notebook-wide chats**)
      Chat objects
        Named
        Anonymous
      Chat meta cells
      Prompt DSL expansion
    (Direct **MermaidInk** access)
      SVG result
      PNG result
    (Direct **Wolfram|Alpha** access)
      wa1["Plain text result"]
      wa2["Image result"]
      wa3["Pods result"]
    (**Pre-loaded packages**)
      LLM::Functions
      LLM::Prompts
      Text::SubParsers
      Data::Translators
      Data::TypeSystem
      Clipboard
      Text::Plot
      Image::Markup::Utilities
      WWW::LLaMA
      WWW::MermaidInk
      WWW::OpenAI
      WWW::PaLM
      WWW::Gemini
      WWW::WolframAlpha
      Lingua::Translation::DeepL
Remark: A recent improvement is that Mermaid-JS cells take arguments for the output format and background. Since two months ago (the beginning of March, 2024) the default output format is SVG. In that way diagrams are obtained 2-3 times faster. Before March 9, 2024, “PNG” was the default format (and the only one available.)
The structure of the rest of the notebook:
DeepL Translation from multiple languages into multiple other languages
In this section we show magic cells for direct access to the translation service DeepL. The API key can be set as a magic cell argument; if no key is given, the env variable DEEPL_AUTH_KEY is used. See “Lingua::Translation::DeepL”, [AAp1], for more details.
#% deepl, to-lang=German, formality=less, format=text
I told you to get the frames from the other warehouse!
# Ich habe dir gesagt, du sollst die Rahmen aus dem anderen Lager holen!
#% deepl, to-lang=Russian, formality=more, format=text
I told you to get the frames from the other warehouse!
# Я же просил Вас взять рамки с другого склада!
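The same translations can be made in a regular code cell with the underlying package “Lingua::Translation::DeepL”. Here is a sketch; it assumes the package’s deepl-translation function and a valid key in the env variable DEEPL_AUTH_KEY:

```raku
use Lingua::Translation::DeepL;

# Assumes the env variable DEEPL_AUTH_KEY holds a valid DeepL API key.
my $res = deepl-translation(
    'I told you to get the frames from the other warehouse!',
    to-lang   => 'German',
    formality => 'less');

say $res;
```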
DeepL’s source languages:
#% html
deepl-source-languages().pairs>>.Str.sort.List ==> to-html(:multicolumn, columns => 4)
bulgarian BG    finnish FI       japanese JA     slovak SK
chinese ZH      french FR        latvian LV      slovenian SL
czech CS        german DE        lithuanian LT   spanish ES
danish DA       greek EL         polish PL       swedish SV
dutch NL        hungarian HU     portuguese PT   turkish TR
english EN      indonesian ID    romanian RO     ukrainian UK
estonian ET     italian IT       russian RU      (Any)
DeepL’s target languages:
#% html
deepl-target-languages().pairs>>.Str.sort.List ==> to-html(:multicolumn, columns => 4)
Gemini
In this section we show magic cells for direct access to the LLM service Gemini by Google. The API key can be set as a magic cell argument; if no key is given, the env variable GEMINI_API_KEY is used. See “WWW::Gemini”, [AAp2], for more details.
Using the default model
#% gemini
Which LLM you are and what is your model?
I am Gemini, a multi-modal AI language model developed by Google.
#% gemini
Up to which date you have been trained?
I have been trained on a massive dataset of text and code up until April 2023. However, I do not have real-time access to the internet, so I cannot access information beyond that date. If you have any questions about events or information after April 2023, I recommend checking a reliable, up-to-date source.
Using a specific model
In this subsection we repeat the questions above, and redirect the output to be formatted as Markdown.
#% gemini > markdown, model=gemini-1.5-pro-latest
Which LLM are you? What is the name of the model you use?
I'm currently running on the Gemini Pro model.
I can't share private information that could identify me specifically, but I can tell you that I am a large language model created by Google AI.
#% gemini > markdown, model=gemini-1.5-pro-latest
Up to which date you have been trained?
I can access pretty up-to-date information, which means I don't really have a "knowledge cut-off" date like some older models.
However, it’s important to remember:
I am not constantly updating. My knowledge is based on a snapshot of the internet taken at a certain point in time.
I don’t have access to real-time information. I can’t tell you what happened this morning, or what the stock market is doing right now.
The world is constantly changing. Even if I had information up to a very recent date, things would still be outdated quickly!
If you need very specific and current information, it’s always best to consult reliable and up-to-date sources.
Wolfram|Alpha
In this section we show magic cells for direct access to Wolfram|Alpha (W|A) by Wolfram Research, Inc. The API key can be set as a magic cell argument; if no key is given, the env variable WOLFRAM_ALPHA_API_KEY is used. See “WWW::WolframAlpha”, [AAp3], for more details.
W|A provides different API endpoints. Currently, “WWW::WolframAlpha” gives access to three of them: simple, result, and query. In a W|A magic cell the endpoint can be specified with the argument “type” or its synonym “path”.
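Here is a sketch of calling the three endpoints from a regular code cell, using the functions employed elsewhere in this post (wolfram-alpha and wolfram-alpha-query); a key in the env variable WOLFRAM_ALPHA_API_KEY is assumed:

```raku
use WWW::WolframAlpha;

# 'result' endpoint: short plaintext answer
say wolfram-alpha('Biggest province in China', path => 'result');

# 'simple' endpoint: one image of the whole answer
my $img = wolfram-alpha('Calories in 5 servings of potato salad.',
        path => 'simple', format => 'md-image');

# 'query' endpoint: full pods, here as JSON text
my $pods = wolfram-alpha-query('GDP of China vs USA in 2023', format => 'json');
```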
#% wolfram-alpha
Calories in 5 servings of potato salad.
Here is how the image above can be generated and saved in a regular code cell:
my $imgWA = wolfram-alpha('Calories in 5 servings of potato salad.', path => 'simple', format => 'md-image'); image-export('WA-calories.png', $imgWA)
WA-calories.png
Result (plaintext output)
#% w|a, type=result
Biggest province in China
the biggest administrative division in China by area is Xinjiang, China. The area of Xinjiang, China is about 629869 square miles
Pods (Markdown output)
#% wa, path=query
GDP of China vs USA in 2023
Input interpretation
scanner: Data
China United States | GDP | nominal 2023
Results
scanner: Data
China         | $17.96 trillion per year
United States | $25.46 trillion per year
(2022 estimates)
Relative values
scanner: Data
              | visual | ratios |        | comparisons
United States |        | 1.417  | 1      | 41.75% larger
China         |        | 1      | 0.7055 | 29.45% smaller
GDP history
scanner: Data
Economic properties
scanner: Data
                         | China | United States
GDP at exchange rate     | $17.96 trillion per year (world rank: 2nd) | $25.46 trillion per year (world rank: 1st)
GDP at parity            | $30.33 trillion per year (world rank: 1st) | $25.46 trillion per year (world rank: 2nd)
real GDP                 | $16.33 trillion per year (price-adjusted to year-2000 US dollars) (world rank: 2nd) | $20.95 trillion per year (price-adjusted to year-2000 US dollars) (world rank: 1st)
GDP in local currency    | ¥121 trillion per year | $25.46 trillion per year
GDP per capita           | $12720 per year per person (world rank: 93rd) | $76399 per year per person (world rank: 12th)
GDP real growth          | +2.991% per year (world rank: 131st) | +2.062% per year (world rank: 158th)
consumer price inflation | +1.97% per year (world rank: 175th) | +8% per year (world rank: 91st)
unemployment rate        | 4.89% (world rank: 123rd highest) | 3.61% (world rank: 157th highest)
(2022 estimate)
GDP components
scanner: Data
                                       | China | United States
final consumption expenditure          | $9.609 trillion per year (53.49%) (world rank: 2nd) (2021) | $17.54 trillion per year (68.88%) (world rank: 1st) (2019)
gross capital formation                | $7.688 trillion per year (42.8%) (world rank: 1st) (2021) | $4.504 trillion per year (17.69%) (world rank: 2nd) (2019)
external balance on goods and services | $576.7 billion per year (3.21%) (world rank: 1st) (2022) | -$610.5 billion per year (-2.4%) (world rank: 206th) (2019)
GDP                                    | $17.96 trillion per year (100%) (world rank: 2nd) (2022) | $25.46 trillion per year (100%) (world rank: 1st) (2022)
Value added by sector
scanner: Data
               | China | United States
agriculture    | $1.311 trillion per year (world rank: 1st) (2022) | $223.7 billion per year (world rank: 3rd) (2021)
industry       | $7.172 trillion per year (world rank: 1st) (2022) | $4.17 trillion per year (world rank: 2nd) (2021)
manufacturing  | $4.976 trillion per year (world rank: 1st) (2022) | $2.497 trillion per year (world rank: 2nd) (2021)
services, etc. | $5.783 trillion per year (world rank: 2nd) (2016) | $13.78 trillion per year (world rank: 1st) (2015)
Download and export pods images
W|A’s query-pods contain URLs to images (which expire within a day.) We might want to download and save those images. Here is a way to do it:
# Pods as JSON text -- easier to extract links from
my $pods = wolfram-alpha-query('GDP of China vs USA in 2023', format => 'json');
# Extract URLs
my @urls = do with $pods.match(/ '"src":' \h* '"' (<-["]>+) '"'/, :g) {
$/.map({ $_[0].Str })
};
# Download images as Markdown images (that can be shown in Jupyter notebooks or Markdown files)
my @imgs = @urls.map({ image-import($_, format => 'md-image') });
# Export images
for ^@imgs.elems -> $i { image-export("wa-$i.png", @imgs[$i] ) }
This blog post proclaims the Raku package “WWW::WolframAlpha” that provides access to the answer engine Wolfram|Alpha, [WA1, Wk1]. For more details of the Wolfram|Alpha’s API usage see the documentation, [WA2].
Remark: To use the Wolfram|Alpha API one has to register and obtain an authorization key.
Installation
Package installations from both sources use the zef installer (which should be bundled with the “standard” Rakudo installation file.)
To install the package from Zef ecosystem use the shell command:
zef install WWW::WolframAlpha
To install the package from the GitHub repository use the shell command:
Remark: When the authorization key, auth-key, is specified to be Whatever then the functions wolfram-alpha* attempt to use the env variable WOLFRAM_ALPHA_API_KEY.
≈ 1.6 × mass of a Good Delivery gold bar ( 400 oz t )
Interpretations
scanner: Unit
mass
Corresponding quantities
scanner: Unit
Relativistic energy E from E = mc^2: | 1.794×10^18 J (joules) | 1.12×10^37 eV (electronvolts)
Weight w of a body from w = mg: | 44 lbf (pounds-force) | 1.4 slugf (slugs-force) | 196 N (newtons) | 1.957×10^7 dynes | 19958 ponds
Volume V of water from V = m/ρ_(H_2O): | 5.3 gallons | 42 pints | 20 L (liters) | 19958 cm^3 (cubic centimeters) | (assuming conventional water density ≈ 1000 kg/m^3)
Command Line Interface
Playground access
The package provides a Command Line Interface (CLI) script:
wolfram-alpha --help
# Usage:
# wolfram-alpha [<words> ...] [--path=<Str>] [--output-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [-f|--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#
# --path=<Str> Path, one of 'result', 'simple', or 'query'. [default: 'result']
# --output-format=<Str> The format in which the response is returned. [default: 'Whatever']
# -a|--auth-key=<Str> Authorization key (to use WolframAlpha API.) [default: 'Whatever']
# --timeout[=UInt] Timeout. [default: 10]
# -f|--format=<Str> Format of the result; one of "json", "hash", "values", or "Whatever". [default: 'Whatever']
# --method=<Str> Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']
Remark: When the authorization key argument “auth-key” is set to “Whatever” then wolfram-alpha attempts to use the env variable WOLFRAM_ALPHA_API_KEY.
Mermaid diagram
The following flowchart corresponds to the steps in the package function wolfram-alpha-query:
This blog post proclaims and describes the Raku package “ML::NLPTemplateEngine”, which aims to create (nearly) executable code for various computational workflows.
The current version of the NLP-TE of the package heavily relies on Large Language Models (LLMs) for its Question Answering System (QAS) component.
Future plans involve incorporating other types of QAS implementations.
The Raku package implementation closely follows the Wolfram Language (WL) implementations in “NLP Template Engine”, [AAr1, AAv1], and the WL paclet “NLPTemplateEngine”, [AAp2, AAv2].
An alternative, more comprehensive approach to building workflows code is given in [AAp2].
Problem formulation
We want to have a system (i.e. TE) that:
Generates relevant, correct, executable programming code based on natural language specifications of computational workflows
Can automatically recognize the workflow types
Can generate code for different programming languages and related software packages
The points above are given in order of importance; the most important are placed first.
Reliability of results
One of the main reasons to re-implement the WL NLP-TE, [AAr1, AAp1], in Raku is to have a more robust way of utilizing LLMs to generate code. That goal is more or less achieved with this package, but YMMV — if incomplete or wrong results are obtained, run the NLP-TE with different LLM parameter settings or different LLMs.
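For example, here is a minimal retry sketch; the llm argument is used the same way in the Gemini example further below, and the adequacy check is a hypothetical placeholder:

```raku
use ML::NLPTemplateEngine;

my $spec = 'Compute quantile regression with probabilities 0.4 and 0.6 for dfTempBoston.';

# First attempt with the default LLM service:
my $code = concretize($spec);

# Hypothetical adequacy check; fall back to another LLM service if it fails:
$code = concretize($spec, llm => 'gemini') unless $code ~~ / 'quantile' /;

say $code;
```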
use ML::NLPTemplateEngine;
my $qrCommand = q:to/END/;
Compute quantile regression with probabilities 0.4 and 0.6, with interpolation order 2, for the dataset dfTempBoston.
END
concretize($qrCommand);
Remark: In the code above the template type, “QuantileRegression”, was determined using an LLM-based classifier.
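To bypass the LLM-based classifier, the workflow type can be given explicitly with the template argument (a sketch reusing $qrCommand from above; the same argument is used in the next example):

```raku
# Same spec, but with the workflow type given directly:
concretize($qrCommand, template => 'QuantileRegression');
```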
Latent Semantic Analysis (R)
my $lsaCommand = q:to/END/;
Extract 20 topics from the text corpus aAbstracts using the method NNMF.
Show statistical thesaurus with the words neural, function, and notebook.
END
concretize($lsaCommand, template => 'LatentSemanticAnalysis', lang => 'R');
my $command = q:to/END/;
Make random table with 6 rows and 4 columns with the names <A1 B2 C3 D4>.
END
concretize($command, template => 'RandomTabularDataset', lang => 'Raku', llm => 'gemini');
Remark: In the code above it was specified to use Google’s Gemini LLM service.
How does it work?
The following flowchart describes how the NLP Template Engine involves a series of steps for processing a computation specification and executing code to obtain results:
Here’s a detailed narration of the process:
Computation Specification:
The process begins with a “Computation spec”, which is the initial input defining the requirements or parameters for the computation task.
Workflow Type Decision:
A decision step asks if the workflow type is specified.
Guess Workflow Type:
If the workflow type is not specified, the system utilizes a classifier to guess the relevant workflow type.
Raw Answers:
Regardless of how the workflow type is determined (directly specified or guessed), the system retrieves “raw answers”, crucial for further processing.
Processing and Templating:
The raw answers undergo processing (“Process raw answers”) to organize or refine the data into a usable format.
Processed data is then utilized to “Complete computation template”, preparing for executable operations.
Executable Code and Results:
The computation template is transformed into “Executable code”, which when run, produces the final “Computation results”.
LLM-Based Functionalities:
The classifier and the answers finder are LLM-based.
Data and Templates:
Code templates are selected based on the specifics of the initial spec and the processed data.
Bring your own templates
0. Load the NLP-Template-Engine package (and others):
use ML::NLPTemplateEngine;
use Data::Importers;
use Data::Summarizers;
1. Get the “training” templates data (from a CSV file you have created or changed) for a new workflow (“SendMail”):
my $url = 'https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/TemplateData/dsQASParameters-SendMail.csv';
my @dsSendMail = data-import($url, headers => 'auto');
records-summary(@dsSendMail, field-names => <DataType WorkflowType Group Key Value>);
# +-----------------+----------------+-----------------------------+----------------------------+----------------------------------------------------------------------------------+
# | DataType | WorkflowType | Group | Key | Value |
# +-----------------+----------------+-----------------------------+----------------------------+----------------------------------------------------------------------------------+
# | Questions => 48 | SendMail => 60 | All => 9 | ContextWordsToRemove => 12 | 0.35 => 9 |
# | Defaults => 7 | | Who the email is from => 4 | Threshold => 12 | {_String..} => 8 |
# | Templates => 3 | | What it the content => 4 | TypePattern => 12 | to => 4 |
# | Shortcuts => 2 | | What it the body => 4 | Parameter => 12 | _String => 4 |
# | | | What it the title => 4 | Template => 3 | {"to", "email", "mail", "send", "it", "recipient", "addressee", "address"} => 4 |
# | | | What subject => 4 | body => 1 | None => 4 |
# | | | Who to send it to => 4 | Emailing => 1 | body => 3 |
# | | | (Other) => 27 | (Other) => 7 | (Other) => 24 |
# +-----------------+----------------+-----------------------------+----------------------------+----------------------------------------------------------------------------------+
2. Add the ingested data for the new workflow (from the CSV file) into the NLP-Template-Engine:
In this blog post we describe a series of different (computational) notebook transformations using different tools. We are using a series of recent articles and notebooks for processing the English and Russian texts of a recent 2-hour long interview. The workflows given in the notebooks are in Raku and Wolfram Language (WL).
Remark: Wolfram Language (WL) and Mathematica are used as synonyms in this document.
Remark: Using notebooks with Large Language Model (LLM) workflows is convenient because the WL LLM functions are also implemented in Python and Raku, [AA1, AAp1, AAp2].
We can say that this blog post attempts to advertise the Raku package “Markdown::Grammar”, [AAp3], demonstrated in the videos:
Here is the corresponding Mermaid-JS diagram (using the package “WWW::MermaidInk”, [AAp6]):
use WWW::MermaidInk;
my $diagram = q:to/END/;
graph TD
A[Make the Raku Jupyter notebook] --> B[Convert the Jupyter notebook into Markdown]
B --> C[Publish to WordPress]
C --> D[Convert the Markdown file into a Mathematica notebook]
D --> E[Publish that to Wolfram Community]
E --> F[Make the corresponding Mathematica notebook using WL functions]
F --> G[Publish to Wolfram Community]
G --> H[Make the Russian version with the Russian transcript]
H --> I[Publish to Wolfram Community]
I --> J[Convert the Mathematica notebook to Markdown]
J --> K[Publish to WordPress]
K --> L[Convert the Markdown file to Jupyter]
L --> M[Re-make the workflows using Raku]
M --> N[Re-make the workflows using Python]
N -.-> Nen([English])
N -.-> Nru([Russian])
C -.-> WordPress{{Word Press}}
K -.-> WordPress
E -.-> |Deleted:<br>features Raku| WolframCom{{Wolfram Community}}
G -.-> WolframCom
I -.-> |"Deleted:<br>not in English"|WolframCom
D -.-> MG[[Markdown::Grammar]]
B -.-> Ju{{Jupyter}}
L -.-> jupytext[[jupytext]]
J -.-> M2MD[[M2MD]]
E -.-> RakuMode[[RakuMode]]
END
say mermaid-ink($diagram, format => 'md-image');
Clarifications
Russian versions
The first Carlson-Putin interview that is processed in the notebooks was held both in English and Russian. I think just doing the English study is “half-baked.” Hence, I did the workflows with the Russian text and translated the related explanations into Russian.
Remark: The Russian versions are done in all three programming languages: Python, Raku, Wolfram Language. See [AAn4, AAn5, AAn7].
Using different programming languages
From my point of view, having a Raku-enabled Mathematica / WL notebook is a strong statement about WL. A fair amount of coding was required for the paclet “RakuMode”, [AAp4].
When we compare WL, Python, and R over Machine Learning (ML) projects, WL always appears to be the best choice for ML. (Overall.)
I do use these sets of comparison posts at Wolfram Community to support my arguments in discussions regarding which programming language is better. (Or bigger.)
Example comparison: WL workflows
The following three Wolfram Community posts are more or less the same content — “Workflows with LLM functions” — but in different programming languages:
Remark: The movie, [AAv1], linked in those notebooks also shows a comparison with the LSA workflow in R.
Using Raku with LLMs
I generally do not like using Jupyter notebooks, but using Raku with LLMs is very convenient [AAv2, AAv3, AAv4]. WL is clunkier when it comes to pre- or post-processing of LLM results.
Also, the Raku chatbooks, [AAp5], provide a better environment for displaying the often Markdown-formatted results of LLMs. (Like the ones in the notebooks discussed here.)
… aka “How to use software manuals effectively without reading them”
Introduction
In this blog post (generated from this Jupyter notebook) we use Large Language Model (LLM) functions, [AAp1, AA1], for generating (hopefully) executable, correct, and harmless code for Operating System resource management.
In order to be concrete and useful, we take the Markdown files of the articles “It’s time to rak!”, [EM1], that explain the motivation and usage of the Raku module “App::Rak”, [EMp1], and we show how meaningful file-finding shell commands can be generated via LLMs exposed to the code-with-comments from those articles.
In other words, we prefer to apply the attitude Too Long; Didn’t Read (TLDR) to the articles and related Raku module README (or user guide) file. (Because “App::Rak” is useful, but it has too many parameters that we prefer not to learn that much about.)
Remark: We say that “App::Rak” uses a Domain Specific Language (DSL), which is done with Raku’s Command Line Interface (CLI) features.
Get comment-and-code line pairs from the code blocks
Using Raku text manipulation capabilities
(After observing code examples)
Generate from the comment-and-code pairs LLM few-shot training rules
Use the LLM example function to translate natural language commands into (valid and relevant) “App::Rak” DSL commands
With a few or a dozen natural language commands
Use LLMs to generate natural language commands in order to test LLM-TLDR-er further
Step 6 says how we do our TLDR — we use LLM-translations of natural language commands.
Alternative procedure
Instead of using Raku to process text we can make LLM functions for extracting the comment-and-code pairs. (That is also shown below.)
Extensions
Using LLMs to generate:
Stress tests for “App::Rak”
Variants of the gathered commands
And make new training rules with them
EBNF grammars for gathered commands
Compare OpenAI and PaLM and/or their different models
Which one produces the best results?
Which ones produce better results for which subsets of commands?
Article’s structure
The exposition below follows the outline of the procedure subsections above.
The stress-testing and EBNF-generation extensions have their own sections: “Translating randomly generated commands” and “Grammar generation” respectively.
Remark: The article/document/notebook was made with the Jupyter framework, using the Raku package “Jupyter::Kernel”, [BD1].
Setup
use Markdown::Grammar;
use Data::Reshapers;
use Data::Summarizers;
use LLM::Functions;
use Text::SubParsers;
Workflow
File names
my $dirName = $*HOME ~ '/GitHub/lizmat/articles';
my @fileNames = dir($dirName).grep(*.Str.contains('time-to-rak'));
@fileNames.elems
4
Texts ingestion
Here we ingest the text of each file:
my %texts = @fileNames.map({ $_.basename => slurp($_) });
%texts.elems
With the function md-section-tree we extract code blocks from Markdown documentation files into data structures amenable for further programmatic manipulation (in Raku.) Here we get code blocks from each text:
# %docTrees is assumed to be obtained beforehand by applying md-section-tree to each text in %texts
my @blocks = %docTrees.values.Array.&flatten;
@blocks.elems
52
Extract comment-and-code line pairs
Here from each code block we parse-extract comment-and-code pairs and we form the LLM training rules:
my @rules;
@blocks.map({
    given $_ {
        for m:g/ '#' $<comment>=(\V+) \n '$' $<code>=(\V+) \n / -> $m {
            @rules.push( ($m<comment>.Str.trim => $m<code>.Str.trim) )
        }
    }
}).elems
52
Here is the number of rules:
@rules.elems
69
Here is a sample of the rules:
.say for @rules.pick(4)
save --after-context as -A, requiring a value => rak --after-context=! --save=A
Show all directory names from current directory down => rak --find --/file
Reverse the order of the characters of each line => rak '*.flip' twenty
Show number of files / lines authored by Scooby Doo => rak --blame-per-line '*.author eq "Scooby Doo"' --count-only
Nice tabulation with LLM function
In order to tabulate “nicely” the rules in the Jupyter notebook, we make an LLM function to produce an HTML table and then specify the corresponding “magic cell.” (This relies on the Jupyter-magics features of [BDp1].) Here is an LLM conversion function, [AA1]:
my &ftbl = llm-function({"Convert the $^a table $^b into an HTML table."}, e => llm-configuration('PaLM', max-tokens => 800))
Find files that have “lib” in their name from the current dir
rak lib --find
Look for strings containing y or Y
rak --type=contains --ignorecase Y twenty
Show all directory names from current directory down
rak --find --/file
Show all lines with numbers between 1 and 65
rak '/ \d+ /'
Show the lines that contain “six” as a word
rak §six twenty
look for “Foo”, while taking case into account
rak Foo
look for “foo” in all files
rak foo
produce extensive help on filesystem filters
rak --help=filesystem --pager=less
save --context as -C, setting a default of 2
rak --context='[2]' --save=C
save searching in Rakudo’s committed files as --rakudo
rak --paths='~/Github/rakudo' --under-version-control --save=rakudo
search for “foo” and show 4 lines of context
rak foo -C=4
start rak with configuration file at /usr/local/rak-config.json
RAK_CONFIG=/usr/local/rak-config.json rak foo
Remark: Of course, in order to program the above sub we need to know how to use “Markdown::Grammar”. Producing HTML tables with LLMs is much easier — only knowledge of “spoken English” is required.
Code generation examples
Here we define an LLM function for generating “App::Rak” shell commands:
my &frak = llm-example-function(@rules, e => llm-evaluator('PaLM'))
my @cmds = ['Find files that have ".nb" in their names', 'Find files that have ".nb" or ".wl" in their names',
'Show all directories of the parent directory', 'Give me files without extensions and that contain the phrase "notebook"',
'Show all that have extension raku or rakumod and contain Data::Reshapers'];
my @tbl = @cmds.map({ %( 'Command' => $_, 'App::Rak' => &frak($_) ) }).Array;
@tbl.&dimensions
(5 2)
Here is a table showing the natural language commands and the corresponding translations to the “App::Rak” CLI DSL:
Find files that have “.nb” or “.wl” in their names
rak --find --extensions=nb,wl
Show all directories of the parent directory
rak --find --/file --parent
Give me files without extensions and that contain the phrase “notebook”
rak --extensions= --type=contains notebook
Show all that have extension raku or rakumod and contain Data::Reshapers
rak '/ Data::Reshapers /' --extensions=raku,rakumod
Verification
Of course, the obtained “App::Rak” commands have to be verified to:
Work
Produce expected results
We can program this verification with Raku or with the Jupyter framework, but we are not doing that here. (We do the verification manually outside of this notebook.)
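A minimal sketch of automating the first check, “Work” (it assumes “App::Rak” is installed so that rak is on the PATH):

```raku
# Run each generated command and record whether it exits cleanly.
my @generated = 'rak --find --/file', 'rak --extensions=txt';

for @generated -> $cmd {
    my $proc = shell($cmd, :out, :err);
    # Drain the streams so the exit code is reliably available:
    $proc.out.slurp(:close);
    $proc.err.slurp(:close);
    say $cmd, ' => ', $proc.exitcode == 0 ?? 'works' !! 'fails';
}
```

Checking the second item, “Produce expected results”, still needs a human (or an LLM judge) in the loop.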
Remark: I tried a dozen generated commands. Most worked. One did not work because of the current limitations of “App::Rak”. Others needed appropriate nudging to produce the desired results.
Here is an example of command that produces code that “does not work”:
&frak("Give all files that have extensions .nd and contain the command Classify")
rak '*.nd <command> Classify' --extensions=nd
Here are a few more:
&frak("give the names of all files in the parent directory")
rak --find --/file --/directory
&frak("Find all directories in the parent directory")
rak --find --/file --parent
Here is a generated command that exposes an “App::Rak” limitation:
&frak("Find all files in the parent directory")
rak --find ..
Translating randomly generated commands
Consider testing the applicability of the approach by generating a “good enough” sample of natural language commands for finding files or directories.
We can generate such commands via an LLM. Here we define an LLM function with two parameters that returns a Raku list:
my &fcg = llm-function({"Generate $^a natural language commands for finding $^b in a file system. Give the commands as a JSON list."}, form => sub-parser('JSON'))
["Find all files in the current directory", "Find all files with the .txt extension in the current directory", "Search for all files with the word 'report' in the file name", "Search for all files with the word 'data' in the file name in the Documents folder"]
Here are the corresponding translations to the “App::Rak” DSL:
Find all files with the .txt extension in the current directory
rak --extensions=txt
Search for all files with the word ‘report’ in the file name
rak report --find
Search for all files with the word ‘data’ in the file name in the Documents folder
rak data Documents
Let us redo the generation and translation using different specs:
my @gCmds2 = &fcg(4, 'files that have certain extensions or contain certain words').flat;
@gCmds2.raku
["Find all files with the extension .txt", "Locate all files that have the word 'project' in their name", "Show me all files with the extension .jpg", "Find all files that contain the word 'report'"]
Locate all files that have the word ‘project’ in their name
rak --find project
Show me all files with the extension .jpg
rak --extensions=jpg
Find all files that contain the word ‘report’
rak report --find
Remark: Ideally, there would be an LLM-based system that 1) hallucinates “App::Rak” commands, 2) executes them, and 3) files GitHub issues if it thinks the results are sub-par. (All done automatically.) On a more practical note, we can use a system that has the first two components “only” to stress test “App::Rak”.
Alternative programming with LLM
In this subsection we show how to extract comment-and-code pairs using LLM functions. (Instead of working hard with Raku regexes.)
Here is an LLM function that specifies the extraction:
my &fcex = llm-function({"Extract consecutive line pairs in which the first start with '#' and second with '\$' from the text $_. Group the lines as key-value pairs and put them in JSON format."},
form => 'JSON')
# Look for “ve” at the end of all lines in file “twenty”
Grammar generation
The “right way” of translating natural language DSLs to CLI DSLs like the one of “App::Rak” is to make a grammar for the natural language DSL and the corresponding interpreter. This might be a lengthy process, so we might consider replacing it, or jump-starting it, with LLM-based grammar generation: we ask an LLM to generate a grammar for a collection of DSL sentences. (For example, the keys of the rules above.) In this subsection we make a “teaser” demonstration of the latter approach.
Here we create an LLM function for generating grammars over collections of sentences:
my &febnf = llm-function({"Generate an $^a grammar for the collection of sentences:\n $^b "}, e => llm-configuration("OpenAI", max-tokens=>900))
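Here is a hedged usage sketch, applying the function to a sample of the natural language commands gathered earlier (the keys of @rules):

```raku
# Generate an EBNF grammar over a dozen of the natural language commands:
my $ebnf = &febnf('EBNF', @rules>>.key.pick(12).join("\n"));
say $ebnf;
```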
The Raku package “WWW::OpenAI” provides access to the machine learning service OpenAI, [OAI1]. For more details of the OpenAI’s API usage see the documentation, [OAI2].
The package “WWW::OpenAI” was proclaimed approximately two months ago — see [AA1]. This blog post shows all the improvements and additions since then together with the “original” features.
Remark: The Raku package “WWW::OpenAI” is much “less ambitious” than the official Python package, [OAIp1], developed by OpenAI’s team.
use WWW::OpenAI;
openai-playground('Where is Roger Rabbit?', max-tokens => 64);
# [{finish_reason => stop, index => 0, logprobs => (Any), text =>
#
# Roger Rabbit is a fictional character created by Disney in 1988. He has appeared in several movies and television shows, but is not an actual person.}]
Another one using Bulgarian:
openai-playground('Колко групи могат да се намерят в този облак от точки.', max-tokens => 64);
# [{finish_reason => length, index => 0, logprobs => (Any), text =>
#
# В зависимост от размера на облака от точки, може да бъдат}]
Remark: The function openai-completion can be used instead in the examples above. See the section “Create chat completion” of [OAI2] for more details.
Models
The current OpenAI models can be found with the function openai-models:
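A minimal sketch of such a call (a network request, assuming a valid OpenAI authorization key is available):

```raku
use WWW::OpenAI;

# Retrieve and print the names of the currently available models
openai-models()>>.say;
```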
There are two types of completions: text and chat. Let us illustrate the differences in their usage with Raku code generation. Here is a text completion:
openai-completion(
'generate Raku code for making a loop over a list',
type => 'text',
max-tokens => 120,
format => 'values');
# my @list = <a b c d e f g h i j>;
# for @list -> $item {
# say $item;
# }
Here is a chat completion:
openai-completion(
'generate Raku code for making a loop over a list',
type => 'chat',
max-tokens => 120,
format => 'values');
# Here's an example of how to make a loop over a list in Raku:
#
# ```
# my @list = (1, 2, 3, 4, 5);
#
# for @list -> $item {
# say $item;
# }
# ```
#
# In this code, we define a list `@list` with some values. Then, we use a `for` loop to iterate over each item in the list. The `-> $item` syntax specifies that we want to assign each item to the variable `$item` as we loop through the list. Finally, we use the
Remark: The argument “type” and the argument “model” have to “agree.” (I.e. be found agreeable by OpenAI.) For example:
model => 'text-davinci-003' implies type => 'text'
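Here is a hedged sketch of a matching pair (the model name reflects OpenAI’s API at the time of writing and may have been retired since):

```raku
# 'text-davinci-003' is a text model, so type => 'text' is required;
# chat models such as 'gpt-3.5-turbo' require type => 'chat' instead
openai-completion(
    'generate Raku code for making a loop over a list',
    type => 'text',
    model => 'text-davinci-003',
    max-tokens => 120,
    format => 'values');
```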
Images can be generated with the function openai-create-image — see the section “Images” of [OAI2].
Here is an example:
my $imgB64 = openai-create-image(
"racoon with a sliced onion in the style of Raphael",
response-format => 'b64_json',
n => 1,
size => 'small',
format => 'values',
method => 'tiny');
Here are the options descriptions:
response-format takes the values “url” and “b64_json”
n takes a positive integer, for the number of images to be generated
size takes the values '1024x1024', '512x512', '256x256', 'large', 'medium', 'small'.
Here we generate an image, get its URL, and place (embed) a link to it via the output of the code cell:
my @imgRes = |openai-create-image(
"racoon and onion in the style of Roy Lichtenstein",
response-format => 'url',
n => 1,
size => 'small',
method => 'tiny');
'';
my @modRes = |openai-moderation(
"I want to kill them!",
format => "values",
method => 'tiny');
for @modRes -> $m { .say for $m.pairs.sort(*.value).reverse; }
my $fileName = $*CWD ~ '/resources/HelloRaccoonsEN.mp3';
say openai-audio(
$fileName,
format => 'json',
method => 'tiny');
# {
# "text": "Raku practitioners around the world, eat more onions!"
# }
To do translations use the named argument type:
my $fileName = $*CWD ~ '/resources/HowAreYouRU.mp3';
say openai-audio(
$fileName,
type => 'translations',
format => 'json',
method => 'tiny');
# {
# "text": "How are you, bandits, hooligans? I've lost my mind because of you. I've been working as a guard for my whole life."
# }
Embeddings
Embeddings can be obtained with the function openai-embeddings. Here is an example of finding the embedding vectors for each of the elements of an array of strings:
my @queries = [
'make a classifier with the method RandomForest over the data dfTitanic',
'show precision and accuracy',
'plot True Positive Rate vs Positive Predictive Value',
'what is a good meat and potatoes recipe'
];
my $embs = openai-embeddings(@queries, format => 'values', method => 'tiny');
$embs.elems;
# 4
Here we show:
That the result is an array of four vectors each with length 1536
The distributions of the values of each vector
use Data::Reshapers;
use Data::Summarizers;
say "\$embs.elems : { $embs.elems }";
say "\$embs>>.elems : { $embs>>.elems }";
records-summary($embs.kv.Hash.&transpose);
# $embs.elems : 4
# $embs>>.elems : 1536 1536 1536 1536
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | 3 | 1 | 0 | 2 |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | Min => -0.6049936 | Min => -0.6674932 | Min => -0.5897995 | Min => -0.6316293 |
# | 1st-Qu => -0.0128846505 | 1st-Qu => -0.012275769 | 1st-Qu => -0.013175397 | 1st-Qu => -0.0125476065 |
# | Mean => -0.00075456833016081 | Mean => -0.000762535416627 | Mean => -0.0007618981246602 | Mean => -0.0007296895499115 |
# | Median => -0.00069939 | Median => -0.0003188204 | Median => -0.00100605615 | Median => -0.00056341792 |
# | 3rd-Qu => 0.012142678 | 3rd-Qu => 0.011146013 | 3rd-Qu => 0.012387738 | 3rd-Qu => 0.011868718 |
# | Max => 0.22202122 | Max => 0.22815572 | Max => 0.21172291 | Max => 0.21270473 |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
Here we find the corresponding dot products and (cross-)tabulate them:
use Data::Reshapers;
use Data::Summarizers;
my @ct = (^$embs.elems X ^$embs.elems).map({ %( i => $_[0], j => $_[1], dot => sum($embs[$_[0]] >>*<< $embs[$_[1]])) }).Array;
say to-pretty-table(cross-tabulate(@ct, 'i', 'j', 'dot'), field-names => (^$embs.elems)>>.Str);
Remark: Note that the fourth element (the cooking recipe request) is an outlier. (Judging by the table with dot products.)
Finding textual answers
Here is an example of finding textual answers:
my $text = "Lake Titicaca is a large, deep lake in the Andes
on the border of Bolivia and Peru. By volume of water and by surface
area, it is the largest lake in South America";
openai-find-textual-answer($text, "Where is Titicaca?")
# [Andes on the border of Bolivia and Peru .]
By default openai-find-textual-answer tries to give short answers. If the option “request” is Whatever then, depending on the number of questions, the request is one of these phrases:
“give the shortest answer of the question:”
“list the shortest answers of the questions:”
In the example above the full query given to OpenAI’s models is
Given the text “Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru. By volume of water and by surface area, it is the largest lake in South America” give the shortest answer of the question: Where is Titicaca?
Here we get a longer answer by changing the value of “request”:
openai-find-textual-answer($text, "Where is Titicaca?", request => "answer the question:")
# [Titicaca is in the Andes on the border of Bolivia and Peru .]
Remark: The function openai-find-textual-answer is inspired by the Mathematica function FindTextualAnswer; see [JL1]. Unfortunately, at this time implementing the full signature of FindTextualAnswer with OpenAI’s API is not easy. (Or cheap to execute.)
Multiple questions
If several questions are given to the function openai-find-textual-answer then all questions are spliced with the given text into one query (that is sent to OpenAI).
For example, consider the following text and questions:
my $query = 'Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.';
my @questions =
['What is the dataset?',
'What is the method?',
'Which metrics to show?'
];
Then the query sent to OpenAI is:
Given the text: “Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.” list the shortest answers of the questions:
What is the dataset?
What is the method?
Which metrics to show?
The answers are assumed to be given in the same order as the questions, each answer on a separate line. Hence, by splitting the OpenAI result into lines we get the answers corresponding to the questions.
If the questions lack question marks, the result may well begin with a completion preamble on its first line, followed by the answers. In that situation the answers are not parsed and a warning message is issued.
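The splitting step can be sketched as follows (the response string is a mock, not actual API output):

```raku
# Mock LLM response: one answer per line, in the order of the questions
my $response = "dfTitanic\nRandomForest\nprecision and accuracy";

# Split into lines and trim whitespace to get one answer per question
my @answers = $response.lines>>.trim;
.say for @answers;
```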
CLI
Playground access
The package provides a Command Line Interface (CLI) script:
openai-playground --help
# Usage:
# openai-playground <text> [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-m|--model=<Str>] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
# openai-playground [<words> ...] [-m|--model=<Str>] [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#
# <text> Text to be processed or audio file name.
# --path=<Str> Path, one of 'chat/completions', 'images/generations', 'moderations', 'audio/transcriptions', 'audio/translations', 'embeddings', or 'models'. [default: 'chat/completions']
# -n[=UInt] Number of completions or generations. [default: 1]
# --max-tokens[=UInt] The maximum number of tokens to generate in the completion. [default: 100]
# -m|--model=<Str> Model. [default: 'Whatever']
# -r|--role=<Str> Role. [default: 'user']
# -t|--temperature[=Real] Temperature. [default: 0.7]
# -l|--language=<Str> Language. [default: '']
# --response-format=<Str> The format in which the generated images are returned; one of 'url' or 'b64_json'. [default: 'url']
# -a|--auth-key=<Str> Authorization key (to use OpenAI API.) [default: 'Whatever']
# --timeout[=UInt] Timeout. [default: 10]
# --format=<Str> Format of the result; one of "json" or "hash". [default: 'json']
# --method=<Str> Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']
Remark: When the authorization key argument “auth-key” is set to “Whatever”, then openai-playground attempts to use the environment variable OPENAI_API_KEY.
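For example (the key value below is a placeholder, not a real key):

```
# Placeholder value; use your actual OpenAI key
export OPENAI_API_KEY='sk-...'
openai-playground 'Where is Roger Rabbit?' --max-tokens=64
```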
Finding textual answers
The package provides a CLI script for finding textual answers:
openai-find-textual-answer --help
# Usage:
# openai-find-textual-answer <text> -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
# openai-find-textual-answer [<words> ...] -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#
# <text> Text to be processed or audio file name.
# -q=<Str> Questions separated with '?' or ';'.
# --max-tokens[=UInt] The maximum number of tokens to generate in the completion. [default: 300]
# -m|--model=<Str> Model. [default: 'Whatever']
# -t|--temperature[=Real] Temperature. [default: 0.7]
# -r|--request=<Str> Request. [default: 'Whatever']
# -p|--pairs Should question-answer pairs be returned or not? [default: False]
# -a|--auth-key=<Str> Authorization key (to use OpenAI API.) [default: 'Whatever']
# --timeout[=UInt] Timeout. [default: 10]
# --format=<Str> Format of the result; one of "json" or "hash". [default: 'json']
# --method=<Str> Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']
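Here is a hypothetical invocation (output omitted, since it depends on the model):

```
openai-find-textual-answer 'Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru.' -q='Where is Titicaca?'
```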
Refactoring
Separate files for each OpenAI functionality
The original version of “WWW::OpenAI” had a design and implementation very similar to those of “Lingua::Translation::DeepL”, [AAp1].
Major refactoring of the original code was done — now each OpenAI functionality targeted by “WWW::OpenAI” has its code placed in a separate file.
In order to do the refactoring safely, of course, a sufficiently comprehensive suite of unit tests had to be put in place first. Since running the tests costs money, they are placed in the “./xt” directory.
De-Cro-ing the requesting code
The first implementation of “WWW::OpenAI” used “Cro::HTTP::Client” to access OpenAI’s services. Often when I use “Cro::HTTP::Client” on macOS I get the errors:
Cannot locate symbol ‘SSL_get1_peer_certificate’ in native library
(See longer discussions about this problem here and here.)
Given the problems with “Cro::HTTP::Client”, and the existing implementations based on curl and “HTTP::Tiny”, I decided it is better to make “WWW::OpenAI” more lightweight by removing the code related to “Cro::HTTP::Client”.