{"title":"The Hub of Heliopolis","link":[{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/","rel":"alternate"}},{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/feeds\/all.atom.xml","rel":"self"}}],"id":"https:\/\/p403n1x87.github.io\/","updated":"2022-02-11T14:01:00+00:00","entry":[{"title":"Running C unit tests with pytest","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/running-c-unit-tests-with-pytest.html","rel":"alternate"}},"published":"2022-02-11T14:01:00+00:00","updated":"2022-02-11T14:01:00+00:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2022-02-11:\/running-c-unit-tests-with-pytest.html","summary":"<p>In this post I will describe my approach to C unit testing using pytest. In particular, we get to see how to gracefully handle SIGSEGVs and prevent them from stopping the test runner abruptly. Furthermore, we shall try to write tests in a Pythonic way.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#why\">Why?<\/a><\/li>\n<li><a href=\"#calling-native-code-from-python\">Calling native code from Python<\/a><\/li>\n<li><a href=\"#enter-pytest\">Enter pytest<\/a><\/li>\n<li><a href=\"#in-the-wild\">In the wild<\/a><\/li>\n<li><a href=\"#dead-end\">Dead end?<\/a><\/li>\n<li><a href=\"#pythonic-c-unit-testing\">Pythonic C unit testing<\/a><\/li>\n<li><a href=\"#coverage-please\">Coverage, please!<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"why\">Why?<\/h1>\n<p>That's probably what you are asking right now. Why use a Python testing\nframework to test C code? Why not just use a C testing framework, like <a href=\"https:\/\/github.com\/google\/googletest\">Google\nTest<\/a> or <a href=\"https:\/\/libcheck.github.io\/check\/\">Check<\/a>? 
I can give you an answer with all the\nreasons that led me to adopt <a href=\"https:\/\/github.com\/pytest-dev\/pytest\/\">pytest<\/a> as the testing framework of choice\nfor one of my C projects, <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>. The first is that I spend most of my time\ncoding in Python these days and I have more familiarity with pytest than with any\nother testing framework. Secondly, whilst Austin is a C project, it actually\ntargets Python programs, so Python is already one of the testing dependencies\nthat ends up being installed in CI anyway. Hence, instead of spending time\nlearning an entirely new testing framework, I could quickly write my tests in Python,\nand leverage all the features of pytest, as well as all the packages that are\navailable for Python, should I ever need to. But these are not all the reasons\nfor adopting pytest for running C tests. If you keep reading you will discover a\nfew more that might convince you to use pytest for your C unit tests too!<\/p>\n<h1 id=\"calling-native-code-from-python\">Calling native code from Python<\/h1>\n<p>Testing C code with Python would only make sense if it were easy to call native\ncode from the interpreter. Thankfully, the Python standard library comes with\nthe <a href=\"https:\/\/docs.python.org\/3\/library\/ctypes.html\"><code>ctypes<\/code><\/a> module that allows us to do just that! 
So let's start\nlooking at some C code, for instance,<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\"># file: fact.c<\/span>\n\n<span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"nf\">fact<\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>which we want to compile as a shared object, e.g. with<\/p>\n<div class=\"highlight\"><pre><span><\/span>gcc -shared -o fact.so fact.c\n<\/pre><\/div>\n\n\n<p>How do we test the <code>fact<\/code> function from Python? 
Easy peasy!<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># file: fact.py<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n\n<span class=\"n\">libfact<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.\/fact.so&quot;<\/span><span class=\"p\">)<\/span>\n\n<span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"mi\">6<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">720<\/span>\n<span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span>\n<span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"mi\">42<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span>\n<\/pre><\/div>\n\n\n<p>Assuming we are in the directory where both <code>fact.so<\/code> and <code>fact.py<\/code> reside, we\ncan test the <code>fact<\/code> function inside <code>fact.c<\/code> simply with<\/p>\n<div class=\"highlight\"><pre><span><\/span>python3 fact.py\n<\/pre><\/div>\n\n\n<p>If the test succeeds, the script's return code will be 0.<\/p>\n<blockquote>\n<p>Congratulations! You have now tested some C code with Python! \ud83c\udf89<\/p>\n<\/blockquote>\n<h1 id=\"enter-pytest\">Enter pytest<\/h1>\n<p>We are not here to just play around with bare <code>assert<\/code>s. I promised you the full\npower of Python and <code>pytest<\/code>, so we can't settle for just this simple\nexample. 
Let's add <code>pytest<\/code> to our test dependencies and do this instead<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># file: test_fact.py<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">pytest<\/span>\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">fixture<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">libfact<\/span><span class=\"p\">():<\/span>\n    <span class=\"k\">yield<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.\/fact.so&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">test_fact<\/span><span class=\"p\">(<\/span><span class=\"n\">libfact<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"mi\">6<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">720<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"mi\">42<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span>\n<\/pre><\/div>\n\n\n<p>Now run <code>pytest<\/code> to get<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ pytest\n=========================== test session starts ============================\nplatform linux -- Python 3.10.2, pytest-7.0.0, 
pluggy-1.0.0\nrootdir: \/tmp\ncollected 1 item\n\ntest_fact.py .                                                       [100%]\n\n============================ 1 passed in 0.00s =============================\n<\/pre><\/div>\n\n\n<p>That's more informative output than what a plain Python test script would\ngive us! How about starting to leverage some of the other <code>pytest<\/code> features,\nlike parametrised tests? Let's rewrite our test case like so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># file: test_fact.py<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">pytest<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">fixture<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">libfact<\/span><span class=\"p\">():<\/span>\n    <span class=\"k\">yield<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.\/fact.so&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">mark<\/span><span class=\"o\">.<\/span><span class=\"n\">parametrize<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;n,e&quot;<\/span><span class=\"p\">,<\/span> <span class=\"p\">[(<\/span><span class=\"mi\">6<\/span><span class=\"p\">,<\/span> <span class=\"mi\">720<\/span><span class=\"p\">),<\/span> <span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">),<\/span> <span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"mi\">42<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)])<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">test_fact<\/span><span class=\"p\">(<\/span><span class=\"n\">libfact<\/span><span class=\"p\">,<\/span> <span 
class=\"n\">n<\/span><span class=\"p\">,<\/span> <span class=\"n\">e<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">libfact<\/span><span class=\"o\">.<\/span><span class=\"n\">fact<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"n\">e<\/span>\n<\/pre><\/div>\n\n\n<p>Let's run this again with <code>pytest<\/code>, this time with a more verbose output:<\/p>\n<div class=\"highlight\"><pre><span><\/span>pytest -vv\n=========================== test session starts ============================\nplatform linux -- Python 3.10.2, pytest-7.0.0, pluggy-1.0.0 -- \/tmp\/.venv\/bin\/python3.10\ncachedir: .pytest_cache\nrootdir: \/tmp\ncollected 3 items\n\ntest_fact.py::test_fact[6-720] PASSED                                [ 33%]\ntest_fact.py::test_fact[0-1] PASSED                                  [ 66%]\ntest_fact.py::test_fact[-42-1] PASSED                                [100%]\n\n============================ 3 passed in 0.01s =============================\n<\/pre><\/div>\n\n\n<p>Sweet! \ud83c\udf6f<\/p>\n<h1 id=\"in-the-wild\">In the wild<\/h1>\n<p>Thus far we've got an idea of how to invoke C from Python and how to write some\nsimple tests that we can run with <code>pytest<\/code> while also leveraging features like\nfixtures and parametrised tests. 
Let us now step this up a notch and consider\nthe organisation of sources within an <em>actual<\/em> C project, for instance<\/p>\n<div class=\"highlight\"><pre><span><\/span>my-c-project\/\n\u251c\u2500\u2500 docs\/\n\u251c\u2500\u2500 src\/    &lt;- All *.c and *.h sources, perhaps organised into sub-folders\n\u251c\u2500\u2500 tests\/  &lt;- Our test sources, obviously!\n\u251c\u2500\u2500 ChangeLog\n\u251c\u2500\u2500 configure.ac\n\u251c\u2500\u2500 LICENCE\n\u251c\u2500\u2500 Makefile.am\n\u251c\u2500\u2500 README\n...\n<\/pre><\/div>\n\n\n<p>In the previous example, we built the shared object <code>fact.so<\/code> by hand, but in a\nCI environment we would probably want to automate that step too. What should we\nuse for that? A bash script? A makefile? Python, of course! What else?!? \ud83d\ude00<\/p>\n<p>Let's make our sample C sources slightly more interesting. For example, we could\nborrow a few parts of the <code>cache.c<\/code> and <code>cache.h<\/code> sources from <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>,\nwhich implement a simple LRU cache. 
This is part of the spec<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\">\/\/ file: src\/cache.h<\/span>\n\n<span class=\"cp\">#ifndef CACHE_H<\/span>\n<span class=\"cp\">#define CACHE_H<\/span>\n\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdint.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdlib.h&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"k\">typedef<\/span><span class=\"w\"> <\/span><span class=\"kt\">uintptr_t<\/span><span class=\"w\"> <\/span><span class=\"n\">key_dt<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">typedef<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">value_t<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">typedef<\/span><span class=\"w\"> <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">queue_item_t<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">prev<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">next<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">key_dt<\/span><span class=\"w\"> <\/span><span class=\"n\">key<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">value_t<\/span><span class=\"w\"> <\/span><span class=\"n\">value<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"c1\">\/\/ Takes ownership of a free-able object<\/span>\n<span 
class=\"p\">}<\/span><span class=\"w\"> <\/span><span class=\"n\">queue_item_t<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">typedef<\/span><span class=\"w\"> <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">queue_t<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"n\">capacity<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">front<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">rear<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">deallocator<\/span><span class=\"p\">)(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"> <\/span><span class=\"n\">queue_t<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_item_new<\/span><span class=\"p\">(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">key_dt<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">void<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_item__destroy<\/span><span class=\"p\">(<\/span><span 
class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"p\">)(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">));<\/span><span class=\"w\"><\/span>\n\n<span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_new<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"p\">)(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">));<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">int<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__is_full<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">int<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"n\">value_t<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__dequeue<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__enqueue<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span 
class=\"w\"> <\/span><span class=\"n\">value_t<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">key_dt<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">void<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>and this is the corresponding part of the implementation<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\">\/\/ file: src\/cache.c<\/span>\n\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdbool.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdio.h&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&quot;cache.h&quot;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"cp\">#define isvalid(x) ((x) != NULL)<\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_item_new<\/span><span class=\"p\">(<\/span><span class=\"n\">value_t<\/span><span class=\"w\"> <\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">key_dt<\/span><span class=\"w\"> <\/span><span class=\"n\">key<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">item<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> 
<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">)<\/span><span class=\"n\">calloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"k\">sizeof<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item_t<\/span><span class=\"p\">));<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">item<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">value<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">value<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">item<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">key<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">key<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">item<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"kt\">void<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_item__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">self<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">deallocator<\/span><span class=\"p\">)(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">!<\/span><span class=\"n\">isvalid<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">deallocator<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">value<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">free<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue_new<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">capacity<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">deallocator<\/span><span class=\"p\">)(<\/span><span class=\"n\">value_t<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">queue<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span 
class=\"o\">*<\/span><span class=\"p\">)<\/span><span class=\"n\">calloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"k\">sizeof<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"p\">));<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">capacity<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">capacity<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">deallocator<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">deallocator<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"kt\">int<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__is_full<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">queue<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">count<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span 
class=\"n\">capacity<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"kt\">int<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">queue<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"n\">value_t<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__dequeue<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">queue<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">queue<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span 
class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">temp<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">prev<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">next<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span 
class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">value<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">temp<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">value<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">free<\/span><span class=\"p\">(<\/span><span class=\"n\">temp<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">queue<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">count<\/span><span class=\"o\">--<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">value<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__enqueue<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">self<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">value_t<\/span><span class=\"w\"> <\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">key_dt<\/span><span class=\"w\"> <\/span><span class=\"n\">key<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span 
class=\"p\">(<\/span><span class=\"n\">queue__is_full<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">temp<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">queue_item_new<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">key<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">temp<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">next<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">rear<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">temp<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">else<\/span><span class=\"w\"><\/span>\n<span 
class=\"w\">    <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">prev<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">temp<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">temp<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">count<\/span><span class=\"o\">++<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">temp<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">\/\/ ----------------------------------------------------------------------------<\/span>\n<span class=\"kt\">void<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">queue__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">self<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">!<\/span><span class=\"n\">isvalid<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">))<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">next<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item_t<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">item<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">front<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">isvalid<\/span><span class=\"p\">(<\/span><span class=\"n\">item<\/span><span class=\"p\">);<\/span><span class=\"w\"> <\/span><span class=\"n\">item<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">next<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">next<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">item<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">next<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">queue_item__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">item<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">self<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">deallocator<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span 
class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">free<\/span><span class=\"p\">(<\/span><span class=\"n\">self<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>It's a fair bit of code; however, we are not interested in how the data\nstructures are implemented, but rather in what they actually implement. This\nalready gives us plenty to play with.<\/p>\n<p>The important detail here is that our C application has a component implemented\nin <code>cache.c<\/code> and we want to unit-test it. Before we can run any actual tests, we\nneed to build a binary object that we can invoke from Python. So let's put this\ncode in <code>tests\/cunit\/__init__.py<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">pathlib<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Path<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">subprocess<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span> <span class=\"n\">run<\/span>\n\n<span class=\"n\">HERE<\/span> <span class=\"o\">=<\/span> <span class=\"n\">Path<\/span><span class=\"p\">(<\/span><span class=\"vm\">__file__<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">resolve<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">TEST<\/span> <span class=\"o\">=<\/span> <span class=\"n\">HERE<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">ROOT<\/span> <span class=\"o\">=<\/span> <span class=\"n\">TEST<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">SRC<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ROOT<\/span> <span 
class=\"o\">\/<\/span> <span class=\"s2\">&quot;src&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CompilationError<\/span><span class=\"p\">(<\/span><span class=\"ne\">Exception<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">pass<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">:<\/span> <span class=\"n\">Path<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[],<\/span> <span class=\"n\">ldadd<\/span><span class=\"o\">=<\/span><span class=\"p\">[]):<\/span>\n    <span class=\"n\">binary<\/span> <span class=\"o\">=<\/span> <span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">run<\/span><span class=\"p\">(<\/span>\n        <span class=\"p\">[<\/span><span class=\"s2\">&quot;gcc&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-shared&quot;<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">cflags<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-o&quot;<\/span><span class=\"p\">,<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">binary<\/span><span class=\"p\">),<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">),<\/span> <span class=\"o\">*<\/span><span class=\"n\">ldadd<\/span><span class=\"p\">],<\/span>\n        <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span>\n        <span class=\"n\">stderr<\/span><span class=\"o\">=<\/span><span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span>\n        <span 
class=\"n\">cwd<\/span><span class=\"o\">=<\/span><span class=\"n\">SRC<\/span><span class=\"p\">,<\/span>\n    <span class=\"p\">)<\/span>\n\n    <span class=\"k\">if<\/span> <span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">returncode<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">return<\/span>\n\n    <span class=\"k\">raise<\/span> <span class=\"n\">CompilationError<\/span><span class=\"p\">(<\/span><span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">())<\/span>\n<\/pre><\/div>\n\n\n<p>This simply defines the <code>compile<\/code> utility that allows us to invoke <code>gcc<\/code> to\ncompile a source and generate the <code>.so<\/code> shared object. We can then use it in our\ntest source this way<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">pytest<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">SRC<\/span><span class=\"p\">,<\/span> <span class=\"nb\">compile<\/span>\n\n<span class=\"n\">C<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;libc.so.6&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">fixture<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cache<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">source<\/span> <span class=\"o\">=<\/span> <span class=\"n\">SRC<\/span> <span class=\"o\">\/<\/span> <span class=\"s2\">&quot;cache.c&quot;<\/span>\n    <span class=\"nb\">compile<\/span><span 
class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">yield<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)))<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">test_cache<\/span><span class=\"p\">(<\/span><span class=\"n\">cache<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">lru_cache<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">lru_cache_new<\/span><span class=\"p\">(<\/span><span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">lru_cache<\/span>\n    <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">lru_cache__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">lru_cache<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>At this point, your project folder should have the following structure<\/p>\n<div class=\"highlight\"><pre><span><\/span>my-c-project\/\n...\n\u251c\u2500\u2500 src\/\n|   \u251c\u2500\u2500 cache.c\n|   \u2514\u2500\u2500 cache.h\n\u251c\u2500\u2500 tests\/\n|   \u251c\u2500\u2500 cunit\/\n|   |   \u251c\u2500\u2500 __init__.py\n|   |   \u2514\u2500\u2500 test_cache.py\n|   \u2514\u2500\u2500 __init__.py\n...\n<\/pre><\/div>\n\n\n<p>and when you run <code>pytest<\/code> again, this time the C source will be compiled at\nruntime using <code>gcc<\/code>. 
The tests then run as before, which should produce the same\noutput we saw earlier.<\/p>\n<h1 id=\"dead-end\">Dead end?<\/h1>\n<p>If you're still with me, then things are probably looking interesting to you\ntoo. So let's test a few more of the functions exported by the caching\ncomponent. Let's make a test case for the <code>queue_item_t<\/code> and <code>queue_t<\/code> objects,\nlike so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">pytest<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">SRC<\/span><span class=\"p\">,<\/span> <span class=\"nb\">compile<\/span>\n\n<span class=\"n\">C<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;libc.so.6&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">fixture<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cache<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">source<\/span> <span class=\"o\">=<\/span> <span class=\"n\">SRC<\/span> <span class=\"o\">\/<\/span> <span class=\"s2\">&quot;cache.c&quot;<\/span>\n    <span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">yield<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)))<\/span>\n\n\n<span class=\"n\">NULL<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span>\n\n\n<span class=\"k\">def<\/span> 
<span class=\"nf\">test_queue_item<\/span><span class=\"p\">(<\/span><span class=\"n\">cache<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">value<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">1<\/span>\n    <span class=\"n\">queue_item<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue_item_new<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">queue_item<\/span>\n\n    <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue_item__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item<\/span><span class=\"p\">,<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">mark<\/span><span class=\"o\">.<\/span><span class=\"n\">parametrize<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;qsize&quot;<\/span><span class=\"p\">,<\/span> <span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">100<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1000<\/span><span class=\"p\">])<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">test_queue<\/span><span class=\"p\">(<\/span><span class=\"n\">cache<\/span><span class=\"p\">,<\/span> <span class=\"n\">qsize<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">q<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue_new<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">,<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span 
class=\"p\">)<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">qsize<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span> <span class=\"ow\">or<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__is_full<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__dequeue<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span> <span class=\"ow\">is<\/span> <span class=\"n\">NULL<\/span>\n\n    <span class=\"n\">values<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span class=\"p\">)<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">)]<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"nb\">all<\/span><span class=\"p\">(<\/span><span class=\"n\">values<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">for<\/span> <span class=\"n\">k<\/span><span class=\"p\">,<\/span> <span class=\"n\">v<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">enumerate<\/span><span class=\"p\">(<\/span><span class=\"n\">values<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">assert<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__enqueue<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span 
class=\"p\">,<\/span> <span class=\"n\">v<\/span><span class=\"p\">,<\/span> <span class=\"n\">k<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">qsize<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span> <span class=\"ow\">or<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__is_empty<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__is_full<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__enqueue<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span> <span class=\"ow\">is<\/span> <span class=\"n\">NULL<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">values<\/span> <span class=\"o\">==<\/span> <span class=\"p\">[<\/span><span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue__dequeue<\/span><span class=\"p\">(<\/span><span class=\"n\">q<\/span><span class=\"p\">)<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">)]<\/span>\n<\/pre><\/div>\n\n\n<p>Let's run the new tests with <code>pytest -vv<\/code> and<\/p>\n<div class=\"highlight\"><pre><span><\/span>=============================== test session starts ===============================\nplatform linux -- Python 3.10.2, pytest-7.0.0, pluggy-1.0.0 -- \/home\/gabriele\/Projects\/cunit\/.venv\/bin\/python3.10\ncachedir: 
.pytest_cache\nrootdir: \/home\/gabriele\/Projects\/cunit\ncollected 5 items\n\ntests\/cunit\/test_cache.py::test_queue_item Fatal Python error: Segmentation fault\n\nCurrent thread 0x00007f4016e4f740 (most recent call first):\n  File &quot;\/home\/gabriele\/Projects\/cunit\/tests\/cunit\/test_cache.py&quot;, line 24 in test_queue_item\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/python.py&quot;, line 192 in pytest_pyfunc_call\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_callers.py&quot;, line 39 in _multicall\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_manager.py&quot;, line 80 in _hookexec\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_hooks.py&quot;, line 265 in __call__\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/python.py&quot;, line 1718 in runtest\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 168 in pytest_runtest_call\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_callers.py&quot;, line 39 in _multicall\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_manager.py&quot;, line 80 in _hookexec\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_hooks.py&quot;, line 265 in __call__\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 261 in &lt;lambda&gt;\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 340 in from_call\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 260 in 
call_runtest_hook\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 221 in call_and_report\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 132 in runtestprotocol\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/runner.py&quot;, line 113 in pytest_runtest_protocol\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_callers.py&quot;, line 39 in _multicall\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_manager.py&quot;, line 80 in _hookexec\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_hooks.py&quot;, line 265 in __call__\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/main.py&quot;, line 347 in pytest_runtestloop\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_callers.py&quot;, line 39 in _multicall\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_manager.py&quot;, line 80 in _hookexec\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_hooks.py&quot;, line 265 in __call__\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/main.py&quot;, line 322 in _main\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/main.py&quot;, line 268 in wrap_session\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/main.py&quot;, line 315 in pytest_cmdline_main\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_callers.py&quot;, line 39 in _multicall\n  File 
&quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_manager.py&quot;, line 80 in _hookexec\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/pluggy\/_hooks.py&quot;, line 265 in __call__\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/config\/__init__.py&quot;, line 165 in main\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/lib\/python3.10\/site-packages\/_pytest\/config\/__init__.py&quot;, line 188 in console_main\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/bin\/pytest&quot;, line 8 in &lt;module&gt;\n[1]    337951 segmentation fault  .venv\/bin\/pytest -vv\n<\/pre><\/div>\n\n\n<p>Wait, what?! Where are our tests? A segmentation fault?!? Where did that come\nfrom? Well, there goes all this <code>pytest<\/code> hype! \ud83d\ude21<\/p>\n<p>Now, do you think I would have written this post if this was really the end of\nthe story? <\/p>\n<p>If you want to figure out for yourself where the problem is, pause here. When\nyou are ready to carry on, change line 20 to<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">value<\/span> <span class=\"o\">=<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>and now the tests will all be happy! However, we really want to avoid crashing\nthe <code>pytest<\/code> process when we run into a segmentation fault, which is not so rare\nwhen running arbitrary C code. Not only that, but we would like to get some\nuseful information, like a traceback, that could give us insight as to where the\nproblem might be! One of the many strengths of <code>pytest<\/code> is its <a href=\"https:\/\/docs.pytest.org\/en\/latest\/reference\/reference.html\">extensive\nconfiguration API<\/a>. 
How do we use it to not crash the test runner?\nThe idea is to spawn another <code>pytest<\/code> process that runs just <em>a<\/em> test. Now, if\nthat test causes a segmentation fault, the parent process will keep running the\nother tests. Let's put this into <code>tests\/cunit\/conftest.py<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># file: tests\/cunit\/conftest.py<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">os<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">subprocess<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">run<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">types<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">FunctionType<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">SegmentationFault<\/span><span class=\"p\">(<\/span><span class=\"ne\">Exception<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">pass<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CUnitTestFailure<\/span><span class=\"p\">(<\/span><span class=\"ne\">Exception<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">pass<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">pytest_pycollect_makeitem<\/span><span class=\"p\">(<\/span><span class=\"n\">collector<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">obj<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"p\">(<\/span>\n        <span class=\"ow\">not<\/span> <span class=\"n\">os<\/span><span class=\"o\">.<\/span><span class=\"n\">getenv<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;PYTEST_CUNIT&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"ow\">and<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span 
class=\"n\">obj<\/span><span class=\"p\">,<\/span> <span class=\"n\">FunctionType<\/span><span class=\"p\">)<\/span>\n        <span class=\"ow\">and<\/span> <span class=\"n\">name<\/span><span class=\"o\">.<\/span><span class=\"n\">startswith<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;test_&quot;<\/span><span class=\"p\">)<\/span>\n    <span class=\"p\">):<\/span>\n        <span class=\"n\">obj<\/span><span class=\"o\">.<\/span><span class=\"n\">__cunit__<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">collector<\/span><span class=\"o\">.<\/span><span class=\"n\">fspath<\/span><span class=\"p\">),<\/span> <span class=\"n\">name<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">cunit<\/span><span class=\"p\">(<\/span><span class=\"n\">module<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">,<\/span> <span class=\"n\">full_name<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">_<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">__<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">test<\/span> <span class=\"o\">=<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">module<\/span><span class=\"si\">}<\/span><span class=\"s2\">::<\/span><span class=\"si\">{<\/span><span class=\"n\">name<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span>\n        <span class=\"n\">env<\/span> <span class=\"o\">=<\/span> <span class=\"n\">os<\/span><span class=\"o\">.<\/span><span 
class=\"n\">environ<\/span><span class=\"o\">.<\/span><span class=\"n\">copy<\/span><span class=\"p\">()<\/span>\n        <span class=\"n\">env<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;PYTEST_CUNIT&quot;<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">full_name<\/span>\n\n        <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">run<\/span><span class=\"p\">([<\/span><span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">argv<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">],<\/span> <span class=\"s2\">&quot;-svv&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">test<\/span><span class=\"p\">],<\/span> <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">stderr<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">env<\/span><span class=\"o\">=<\/span><span class=\"n\">env<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"k\">if<\/span> <span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">returncode<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">return<\/span>\n\n        <span class=\"k\">raise<\/span> <span class=\"n\">CUnitTestFailure<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">&quot;<\/span> <span class=\"o\">+<\/span> <span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">())<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"n\">_<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">pytest_collection_modifyitems<\/span><span class=\"p\">(<\/span><span 
class=\"n\">session<\/span><span class=\"p\">,<\/span> <span class=\"n\">config<\/span><span class=\"p\">,<\/span> <span class=\"n\">items<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"kc\">None<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">test_name<\/span> <span class=\"o\">:=<\/span> <span class=\"n\">os<\/span><span class=\"o\">.<\/span><span class=\"n\">getenv<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;PYTEST_CUNIT&quot;<\/span><span class=\"p\">):<\/span>\n        <span class=\"c1\"># We are inside the sandbox process. We select the only test we care about.<\/span>\n        <span class=\"n\">items<\/span><span class=\"p\">[:]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">_<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">items<\/span> <span class=\"k\">if<\/span> <span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"o\">==<\/span> <span class=\"n\">test_name<\/span><span class=\"p\">]<\/span>\n        <span class=\"k\">return<\/span>\n\n    <span class=\"k\">for<\/span> <span class=\"n\">item<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">items<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">if<\/span> <span class=\"nb\">hasattr<\/span><span class=\"p\">(<\/span><span class=\"n\">item<\/span><span class=\"o\">.<\/span><span class=\"n\">_obj<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;__cunit__&quot;<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">item<\/span><span class=\"o\">.<\/span><span class=\"n\">_obj<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cunit<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">item<\/span><span class=\"o\">.<\/span><span class=\"n\">_obj<\/span><span class=\"o\">.<\/span><span 
class=\"n\">__cunit__<\/span><span class=\"p\">,<\/span> <span class=\"n\">full_name<\/span><span class=\"o\">=<\/span><span class=\"n\">item<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Let's re-run our broken test suite and see what happens this time:<\/p>\n<div class=\"highlight\"><pre><span><\/span>================================ test session starts =================================\nplatform linux -- Python 3.10.2, pytest-7.0.0, pluggy-1.0.0 -- \/home\/gabriele\/Projects\/cunit\/.venv\/bin\/python3.10\ncachedir: .pytest_cache\nrootdir: \/home\/gabriele\/Projects\/cunit\ncollected 5 items\n\ntests\/cunit\/test_cache.py::test_queue_item &lt;- tests\/cunit\/conftest.py FAILED   [ 20%]\ntests\/cunit\/test_cache.py::test_queue[0] &lt;- tests\/cunit\/conftest.py PASSED     [ 40%]\ntests\/cunit\/test_cache.py::test_queue[10] &lt;- tests\/cunit\/conftest.py PASSED    [ 60%]\ntests\/cunit\/test_cache.py::test_queue[100] &lt;- tests\/cunit\/conftest.py PASSED   [ 80%]\ntests\/cunit\/test_cache.py::test_queue[1000] &lt;- tests\/cunit\/conftest.py PASSED  [100%]\n\n====================================== FAILURES ======================================\n__________________________________ test_queue_item ___________________________________\n\n_ = ()\n__ = {&#39;cache&#39;: &lt;CDLL &#39;\/home\/gabriele\/Projects\/cunit\/src\/cache.so&#39;, handle 25c8400 at 0x7efd5b5d83d0&gt;}\ntest = &#39;\/home\/gabriele\/Projects\/cunit\/tests\/cunit\/test_cache.py::test_queue_item&#39;\nenv = {&#39;ANDROID_HOME&#39;: &#39;\/home\/gabriele\/.android\/sdk&#39;, &#39;COLORTERM&#39;: &#39;truecolor&#39;, &#39;DBUS_SESSION_BUS_ADDRESS&#39;: &#39;unix:path=\/run\/user\/1000\/bus&#39;, &#39;DEFAULTS_PATH&#39;: &#39;\/usr\/share\/gconf\/ubuntu.default.path&#39;, ...}\nresult = CompletedProcess(args=[&#39;.venv\/bin\/pytest&#39;, &#39;-svv&#39;, 
&#39;\/home\/gabriele\/Projects\/cunit\/tests\/cunit\/test_cache.py::test_queu...__init__.py&quot;, line 188 in console_main\\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/bin\/pytest&quot;, line 8 in &lt;module&gt;\\n&#39;)\n\n    def _(*_, **__):\n        test = f&quot;{module}::{name}&quot;\n        env = os.environ.copy()\n        env[&quot;PYTEST_CUNIT&quot;] = full_name\n\n        result = run([sys.argv[0], &quot;-svv&quot;, test], stdout=PIPE, stderr=PIPE, env=env)\n\n        if result.returncode == 0:\n            return\n\n&gt;       raise CUnitTestFailure(&quot;\\n&quot; + result.stdout.decode())\nE       tests.cunit.conftest.CUnitTestFailure: \nE       ============================= test session starts ==============================\nE       platform linux -- Python 3.10.2, pytest-7.0.0, pluggy-1.0.0 -- \/home\/gabriele\/Projects\/cunit\/.venv\/bin\/python3.10\nE       cachedir: .pytest_cache\nE       rootdir: \/home\/gabriele\/Projects\/cunit\nE       collecting ... collected 1 item\nE       \nE       tests\/cunit\/test_cache.py::test_queue_item\n\ntests\/cunit\/conftest.py:49: CUnitTestFailure\n============================== short test summary info ===============================\nFAILED tests\/cunit\/test_cache.py::test_queue_item - tests.cunit.conftest.CUnitTestF...\n============================ 1 failed, 4 passed in 1.21s =============================\n<\/pre><\/div>\n\n\n<p>That's better, isn't it? Now the first test fails with the segmentation\nfault, but the rest of the test suite still runs and we can see the reason for\nthe failure of the first test in the report, i.e. the segmentation fault.<\/p>\n<p>But what is the <code>conftest.py<\/code> code actually doing? Let's have a look. The\n<code>pytest_pycollect_makeitem<\/code> hook gets invoked when the tests inside\n<code>tests\/cunit<\/code> are being collected by <code>pytest<\/code>. 
At this stage we \"mark\" them as C\nunit tests by giving each collected <code>item<\/code> the <code>__cunit__<\/code> attribute. The value\nis a tuple containing the information about where the item came from\n(<code>collector.fspath<\/code> is the path of the module that defined the test, e.g.\n<code>tests\/cunit\/test_cache.py<\/code>) and the test name (e.g. <code>test_queue_item<\/code>). In our\ncase we only care about items that are of <code>FunctionType<\/code> type and whose names start\nwith <code>test_<\/code>. The environment variable <code>PYTEST_CUNIT<\/code> is used to detect whether\nwe are running in the parent <code>pytest<\/code> process or the \"sandbox\" child. In the\nlatter case we don't need to mark tests because we know exactly what we want\nto run.<\/p>\n<p>Once all the tests have been collected, we use the\n<code>pytest_collection_modifyitems<\/code> hook to actually modify the tests that we\npreviously marked as C unit tests. Again, the behaviour depends on whether we\nare in the parent <code>pytest<\/code> process, or in the sandbox. If <code>PYTEST_CUNIT<\/code> is set,\nthat's the signal that we are in the child <code>pytest<\/code> process. The value, as we\nshall see shortly, contains the information needed to pick the test that we want\nto run. So we use it to reduce the list <code>items<\/code> to just the test that matches\nthe information stored in <code>PYTEST_CUNIT<\/code>. In the parent <code>pytest<\/code> process we\ninstead change what the items that we marked as C unit tests do. Obviously, we\ndon't want them to run the actual test, but rather a new instance of <code>pytest<\/code>\nthat will then run the test on behalf of the parent process. The magic happens\ninside the \"decorator\" <code>cunit<\/code>, which we use to build a closure around the test\nthat we want to run. 
As you can see, it returns a function (with a bit of a\nfunny and unusual signature) which, when called, will set the <code>PYTEST_CUNIT<\/code>\nvariable to the full name of the test (this is to support parametrised tests)\nand then run <code>sys.argv[0]<\/code> (which should be <code>pytest<\/code> if we run the test suite\nwith <code>pytest<\/code>), followed by the switches <code>-svv<\/code> and the node ID of the test we\nare wrapping, that is, the path of the module and the test name joined by <code>::<\/code>.\nThis way, when the process terminates,\neither because the test passed or something really bad happened, we can inspect\nthe return code and the streams, and act accordingly.<\/p>\n<p>So now we have a test runner that can handle segmentation faults gracefully but\nstill doesn't tell us where they actually happened. Can we get some more\ndetailed information in the output? When a Python test fails we get a nice\ntraceback that tells us where things went wrong. Can we do the same with C unit\ntests? The answer is <strong>yes<\/strong>, provided we collect core dumps while tests run. If\nyou are running on Ubuntu, you can do <code>ulimit -c unlimited<\/code> and a <code>core<\/code> dump\nwill be generated in the working directory every time a segmentation fault\noccurs (assuming the kernel's <code>core_pattern<\/code> setting writes core files there).\nWe can then run <code>gdb<\/code> in batch mode to print a nice traceback that will\nhopefully help us investigate the problem. 
So let's add these helpers to the\n<code>conftest.py<\/code> file<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">pathlib<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Path<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">subprocess<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span> <span class=\"n\">check_output<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">gdb<\/span><span class=\"p\">(<\/span><span class=\"n\">cmds<\/span><span class=\"p\">:<\/span> <span class=\"nb\">list<\/span><span class=\"p\">[<\/span><span class=\"nb\">str<\/span><span class=\"p\">],<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"nb\">str<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">check_output<\/span><span class=\"p\">(<\/span>\n        <span class=\"p\">[<\/span><span class=\"s2\">&quot;gdb&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-q&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-batch&quot;<\/span><span class=\"p\">]<\/span>\n        <span class=\"o\">+<\/span> <span class=\"p\">[<\/span><span class=\"n\">_<\/span> <span class=\"k\">for<\/span> <span class=\"n\">cs<\/span> <span class=\"ow\">in<\/span> <span class=\"p\">((<\/span><span class=\"s2\">&quot;-ex&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">_<\/span><span class=\"p\">)<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">cmds<\/span><span class=\"p\">)<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">cs<\/span><span class=\"p\">]<\/span>\n        <span class=\"o\">+<\/span> <span 
class=\"nb\">list<\/span><span class=\"p\">(<\/span><span class=\"n\">args<\/span><span class=\"p\">),<\/span>\n        <span class=\"n\">stderr<\/span><span class=\"o\">=<\/span><span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span>\n    <span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">()<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">bt<\/span><span class=\"p\">(<\/span><span class=\"n\">binary<\/span><span class=\"p\">:<\/span> <span class=\"n\">Path<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"nb\">str<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">Path<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;core&quot;<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">is_file<\/span><span class=\"p\">():<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">gdb<\/span><span class=\"p\">([<\/span><span class=\"s2\">&quot;bt full&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;q&quot;<\/span><span class=\"p\">],<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">binary<\/span><span class=\"p\">),<\/span> <span class=\"s2\">&quot;core&quot;<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">return<\/span> <span class=\"s2\">&quot;No core dump available.&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>and improve the <code>cunit<\/code> decorator like so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">SRC<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">cunit<\/span><span class=\"p\">(<\/span><span class=\"n\">module<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">:<\/span> <span 
class=\"nb\">str<\/span><span class=\"p\">,<\/span> <span class=\"n\">full_name<\/span><span class=\"p\">:<\/span> <span class=\"nb\">str<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">_<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">__<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">test<\/span> <span class=\"o\">=<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">module<\/span><span class=\"si\">}<\/span><span class=\"s2\">::<\/span><span class=\"si\">{<\/span><span class=\"n\">name<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span>\n        <span class=\"n\">env<\/span> <span class=\"o\">=<\/span> <span class=\"n\">os<\/span><span class=\"o\">.<\/span><span class=\"n\">environ<\/span><span class=\"o\">.<\/span><span class=\"n\">copy<\/span><span class=\"p\">()<\/span>\n        <span class=\"n\">env<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;PYTEST_CUNIT&quot;<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">full_name<\/span>\n\n        <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">run<\/span><span class=\"p\">([<\/span><span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">argv<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">],<\/span> <span class=\"s2\">&quot;-svv&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">test<\/span><span class=\"p\">],<\/span> <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">stderr<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">env<\/span><span class=\"o\">=<\/span><span class=\"n\">env<\/span><span 
class=\"p\">)<\/span>\n\n        <span class=\"k\">match<\/span> <span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">returncode<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">case<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n                <span class=\"k\">return<\/span>\n\n            <span class=\"k\">case<\/span> <span class=\"o\">-<\/span><span class=\"mi\">11<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">binary_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">Path<\/span><span class=\"p\">(<\/span><span class=\"n\">module<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">stem<\/span><span class=\"o\">.<\/span><span class=\"n\">replace<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;test_&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;&quot;<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">raise<\/span> <span class=\"n\">SegmentationFault<\/span><span class=\"p\">(<\/span><span class=\"n\">bt<\/span><span class=\"p\">((<\/span><span class=\"n\">SRC<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">binary_name<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)))<\/span>\n\n        <span class=\"k\">raise<\/span> <span class=\"n\">CUnitTestFailure<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">&quot;<\/span> <span class=\"o\">+<\/span> <span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">())<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"n\">_<\/span>\n<\/pre><\/div>\n\n\n<p>Now, when we run our broken test suite we should get this more verbose 
output<\/p>\n<div class=\"highlight\"><pre><span><\/span>================================ test session starts =================================\nplatform linux -- Python 3.10.2, pytest-7.0.0, pluggy-1.0.0 -- \/home\/gabriele\/Projects\/cunit\/.venv\/bin\/python3.10\ncachedir: .pytest_cache\nrootdir: \/home\/gabriele\/Projects\/cunit\ncollected 5 items\n\ntests\/cunit\/test_cache.py::test_queue_item &lt;- tests\/cunit\/conftest.py FAILED   [ 20%]\ntests\/cunit\/test_cache.py::test_queue[0] &lt;- tests\/cunit\/conftest.py PASSED     [ 40%]\ntests\/cunit\/test_cache.py::test_queue[10] &lt;- tests\/cunit\/conftest.py PASSED    [ 60%]\ntests\/cunit\/test_cache.py::test_queue[100] &lt;- tests\/cunit\/conftest.py PASSED   [ 80%]\ntests\/cunit\/test_cache.py::test_queue[1000] &lt;- tests\/cunit\/conftest.py PASSED  [100%]\n\n====================================== FAILURES ======================================\n__________________________________ test_queue_item ___________________________________\n\n_ = ()\n__ = {&#39;cache&#39;: &lt;CDLL &#39;\/home\/gabriele\/Projects\/cunit\/src\/cache.so&#39;, handle fc4ba0 at 0x7f5ec8d50460&gt;}\ntest = &#39;\/home\/gabriele\/Projects\/cunit\/tests\/cunit\/test_cache.py::test_queue_item&#39;\nenv = {&#39;ANDROID_HOME&#39;: &#39;\/home\/gabriele\/.android\/sdk&#39;, &#39;COLORTERM&#39;: &#39;truecolor&#39;, &#39;DBUS_SESSION_BUS_ADDRESS&#39;: &#39;unix:path=\/run\/user\/1000\/bus&#39;, &#39;DEFAULTS_PATH&#39;: &#39;\/usr\/share\/gconf\/ubuntu.default.path&#39;, ...}\nresult = CompletedProcess(args=[&#39;.venv\/bin\/pytest&#39;, &#39;-svv&#39;, &#39;\/home\/gabriele\/Projects\/cunit\/tests\/cunit\/test_cache.py::test_queu...__init__.py&quot;, line 188 in console_main\\n  File &quot;\/home\/gabriele\/Projects\/cunit\/.venv\/bin\/pytest&quot;, line 8 in &lt;module&gt;\\n&#39;)\nbinary_name = &#39;cache&#39;\n\n    def _(*_, **__):\n        test = f&quot;{module}::{name}&quot;\n        env = os.environ.copy()\n        
env[&quot;PYTEST_CUNIT&quot;] = full_name\n\n        result = run([sys.argv[0], &quot;-svv&quot;, test], stdout=PIPE, stderr=PIPE, env=env)\n\n        match result.returncode:\n            case 0:\n                return\n\n            case -11:\n                binary_name = Path(module).stem.replace(&quot;test_&quot;, &quot;&quot;)\n&gt;               raise SegmentationFault(bt((SRC \/ binary_name).with_suffix(&quot;.so&quot;)))\nE               tests.cunit.conftest.SegmentationFault: \nE               warning: core file may not match specified executable file.\nE               [New LWP 548824]\nE               [Thread debugging using libthread_db enabled]\nE               Using host libthread_db library &quot;\/lib\/x86_64-linux-gnu\/libthread_db.so.1&quot;.\nE               Core was generated by `\/home\/gabriele\/Projects\/cunit\/.venv\/bin\/python3.10 .venv\/bin\/pytest -svv \/home\/&#39;.\nE               Program terminated with signal SIGSEGV, Segmentation fault.\nE               #0  raise (sig=&lt;optimised out&gt;) at ..\/sysdeps\/unix\/sysv\/linux\/raise.c:50\nE               50  ..\/sysdeps\/unix\/sysv\/linux\/raise.c: No such file or directory.\nE               #0  raise (sig=&lt;optimised out&gt;) at ..\/sysdeps\/unix\/sysv\/linux\/raise.c:50\nE                       set = {\nE                         __val = {[0] = 0, [1] = 1, [2] = 3, [3] = 140049677169600, [4] = 8, [5] = 4714713, [6] = 140049688588951, [7] = 140049678146624, [8] = 3, [9] = 5586517, [10] = 17917168, [11] = 3, [12] = 3, [13] = 140049678074368, [14] = 140049678074368, [15] = 4663610}\nE                       }\nE                       pid = &lt;optimised out&gt;\nE                       tid = &lt;optimised out&gt;\nE                       ret = &lt;optimised out&gt;\nE               #1  &lt;signal handler called&gt;\nE               No locals.\nE               #2  __GI___libc_free (mem=0x1) at malloc.c:3102\nE                       ar_ptr = &lt;optimised out&gt;\nE                     
  p = &lt;optimised out&gt;\nE                       hook = 0x0\nE               #3  0x00007f5fdbf1b40a in queue_item.destroy () from \/home\/gabriele\/Projects\/cunit\/src\/cache.so\nE               No symbol table info available.\nE               #4  0x00007f5fdb188ff5 in ?? () from \/usr\/lib\/x86_64-linux-gnu\/libffi.so.7\nE               No symbol table info available.\nE               #5  0x00007f5fdb18840a in ?? () from \/usr\/lib\/x86_64-linux-gnu\/libffi.so.7\nE               No symbol table info available.\nE               #6  0x00007f5fdb1a2286 in ?? () from \/usr\/lib\/python3.10\/lib-dynload\/_ctypes.cpython-310-x86_64-linux-gnu.so\nE               No symbol table info available.\nE               #7  0x00007f5fdb196eba in ?? () from \/usr\/lib\/python3.10\/lib-dynload\/_ctypes.cpython-310-x86_64-linux-gnu.so\nE               No symbol table info available.\nE               #8  0x0000000000512f83 in ?? ()\nE               No symbol table info available.\nE               #9  0x00007f5fda9fd160 in ?? ()\nE               No symbol table info available.\nE               #10 0x00007f5fdb1a12d0 in ?? () from \/usr\/lib\/python3.10\/lib-dynload\/_ctypes.cpython-310-x86_64-linux-gnu.so\nE               No symbol table info available.\nE               #11 0x00007f5fdaa1b658 in ?? ()\nE               No symbol table info available.\nE               #12 0x00007f5fda9f9690 in ?? ()\nE               No symbol table info available.\nE               #13 0x00007f5fdabd1f20 in ?? ()\nE               No symbol table info available.\nE               #14 0x00007f5fdaa1b680 in ?? ()\nE               No symbol table info available.\nE               #15 0x00000000011325a0 in ?? ()\nE               No symbol table info available.\nE               #16 0x00007f5fda9f214a in ?? ()\nE               No symbol table info available.\nE               #17 0x00007f5fdaa1b820 in ?? ()\nE               No symbol table info available.\nE               #18 0x00007f5fda9f20f0 in ?? 
()\nE               No symbol table info available.\nE               #19 0x00007f5fda9fd160 in ?? ()\nE               No symbol table info available.\nE               #20 0x000000000057b1ec in ?? ()\nE               No symbol table info available.\nE               #21 0x59586f2afaac8ab6 in ?? ()\nE               No symbol table info available.\nE               #22 0x00007f5fdaa1b7e0 in ?? ()\nE               No symbol table info available.\nE               #23 0x0000000001116534 in ?? ()\nE               No symbol table info available.\nE               #24 0x00007f5fda9d50c0 in ?? ()\nE               No symbol table info available.\nE               #25 0x00007f5fdaa1b808 in ?? ()\nE               No symbol table info available.\nE               #26 0x8000000000000002 in ?? ()\nE               No symbol table info available.\nE               #27 0x00007f5fda949480 in ?? ()\nE               No symbol table info available.\nE               #28 0x00007f5fdaa1b810 in ?? ()\nE               No symbol table info available.\nE               #29 0x00007f5fda96aec0 in ?? ()\nE               No symbol table info available.\nE               #30 0x00007f5fdaa1b800 in ?? ()\nE               No symbol table info available.\nE               #31 0x00000000011164f0 in ?? ()\nE               No symbol table info available.\nE               #32 0x0000000000575f1d in ?? ()\nE               No symbol table info available.\nE               #33 0x42439dfdd749cffb in ?? ()\nE               No symbol table info available.\nE               #34 0x00007f5fdaa1b620 in ?? ()\nE               No symbol table info available.\nE               #35 0x0000000001116534 in ?? ()\nE               No symbol table info available.\nE               #36 0x00007f5fdb4ec070 in ?? ()\nE               No symbol table info available.\nE               #37 0x00007f5fda9fbb80 in ?? ()\nE               No symbol table info available.\nE               #38 0x0000000000000000 in ?? 
()\nE               No symbol table info available.\n\ntests\/cunit\/conftest.py:56: SegmentationFault\n============================== short test summary info ===============================\nFAILED tests\/cunit\/test_cache.py::test_queue_item - tests.cunit.conftest.Segmentati...\n============================ 1 failed, 4 passed in 1.73s =============================\n<\/pre><\/div>\n\n\n<p>Hmm, still not that useful. What if we compile with debug symbols? Let's change\nthe <code>cache<\/code> fixture in <code>test_cache.py<\/code> so that it compiles the caching sources\nwith the <code>-g<\/code> option:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">fixture<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cache<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">source<\/span> <span class=\"o\">=<\/span> <span class=\"n\">SRC<\/span> <span class=\"o\">\/<\/span> <span class=\"s2\">&quot;cache.c&quot;<\/span>\n    <span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;-g&quot;<\/span><span class=\"p\">])<\/span>\n    <span class=\"k\">yield<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)))<\/span>\n<\/pre><\/div>\n\n\n<p>Now that's much, <em>much<\/em> better!<\/p>\n<div class=\"highlight\"><pre><span><\/span>E               #3  0x00007ff51cf8b40a in queue_item__destroy (self=0x104ac60, deallocator=0x7ff51cc68850 &lt;__GI___libc_free&gt;) at \/home\/gabriele\/Projects\/cunit\/src\/cache.c:37\n<\/pre><\/div>\n\n\n<p>This tells us 
<em>exactly<\/em> where the problem occurred. Now that we have this\ninformation we can fix the test and make the test suite happy again! \ud83c\udf89<\/p>\n<h1 id=\"pythonic-c-unit-testing\">Pythonic C unit testing<\/h1>\n<p>OK, running C unit tests is nice and fun, but it doesn't save us much in terms\nof typing. In fact, the tests we wrote so far not only feel quite verbose,\nconsidering we are writing Python code, but don't feel Pythonic at all. Whilst\nthere is, in principle, no reason why non-Python tests written in Python should\nlook and feel Pythonic, can we do something more elegant? The\nidea of using a fixture to wrap around a binary object is perhaps interesting,\nbut can we maybe pretend that <code>cache<\/code> is a Python module instead, so that we can\ndo things like <code>from cache import queue_item_new<\/code> and sweep all this\n<code>ctypes<\/code> business under the carpet? Well, let's give this a try, shall we?<\/p>\n<p>Back in <code>tests\/cunit\/__init__.py<\/code>, let's add the following subtype of\n<code>ModuleType<\/code>:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">types<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">ModuleType<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CModule<\/span><span class=\"p\">(<\/span><span class=\"n\">ModuleType<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span 
class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;Generated from <\/span><span class=\"si\">{<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;.c&#39;<\/span><span class=\"p\">)<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">))<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"o\">.<\/span><span class=\"vm\">__dict__<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__getattr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"nd\">@classmethod<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">compile<\/span><span class=\"p\">(<\/span><span class=\"bp\">cls<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[],<\/span> <span class=\"n\">ldadd<\/span><span 
class=\"o\">=<\/span><span class=\"p\">[]):<\/span>\n        <span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.c&quot;<\/span><span class=\"p\">),<\/span> <span class=\"n\">cflags<\/span><span class=\"p\">,<\/span> <span class=\"n\">ldadd<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">cls<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Now create <code>tests\/cunit\/cache.py<\/code> with the following content<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">pathlib<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Path<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">SRC<\/span><span class=\"p\">,<\/span> <span class=\"n\">CModule<\/span>\n\n<span class=\"n\">CFLAGS<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"s2\">&quot;-g&quot;<\/span><span class=\"p\">]<\/span>\n\n<span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">modules<\/span><span class=\"p\">[<\/span><span class=\"vm\">__name__<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CModule<\/span><span class=\"o\">.<\/span><span class=\"n\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">SRC<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">Path<\/span><span class=\"p\">(<\/span><span class=\"vm\">__file__<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">stem<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"n\">CFLAGS<\/span><span 
class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>and get rid of the <code>cache<\/code> fixture in <code>test_cache.py<\/code> (make sure to remove it\nalso from the test arguments!). Instead, add<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">tests.cunit.cache<\/span> <span class=\"k\">as<\/span> <span class=\"nn\">cache<\/span>\n<\/pre><\/div>\n\n\n<p><em>et voil\u00e0<\/em>! Now <code>cache<\/code> feels like a Python module that exports ordinary\nfunctions that we can call like any other Python function.<\/p>\n<p>Now, what about this code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">test_queue_item<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">value<\/span> <span class=\"o\">=<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">queue_item<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue_item_new<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">queue_item<\/span>\n\n    <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">queue_item__destroy<\/span><span class=\"p\">(<\/span><span class=\"n\">queue_item<\/span><span class=\"p\">,<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Clearly <code>cache.queue_item_new<\/code> is creating a new object. 
The Pythonic way of\nwriting something like this would be<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">test_queue_item<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">value<\/span> <span class=\"o\">=<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">queue_item<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cache<\/span><span class=\"o\">.<\/span><span class=\"n\">QueueItem<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">queue_item<\/span>\n<\/pre><\/div>\n\n\n<p>and we need not even destroy the object ourselves, as we'd love the garbage\ncollector to take care of that for us. Now that's a more Pythonic way of going\nabout our C unit tests! Can we achieve something like this? The answer is once\nagain <em>yes<\/em>, but we can only make this work if we make some further\nassumptions, such as naming conventions. You might have noticed that the data\nstructures defined in <code>cache.h<\/code> are named <code>&lt;adt&gt;_t<\/code>, in an OOP flavour.\nMethods follow the naming conventions <code>&lt;adt&gt;_&lt;staticmethod&gt;<\/code> and\n<code>&lt;adt&gt;__&lt;method&gt;<\/code>. Some of the methods have <em>special<\/em> names, like <code>&lt;adt&gt;_new<\/code>,\n<code>&lt;adt&gt;__destroy<\/code>, etc. So, provided we adhere to <em>some<\/em> naming convention\nlike this one, we can dynamically create Python types at runtime. How? With\nthe secret art of <a href=\"https:\/\/realpython.com\/python-metaclasses\/\">metaprogramming<\/a>. But before we can start\ncreating new types at runtime, we need to be able to parse C header files to\ninfer their definitions. 
That's why our next step is to add <code>pycparser<\/code> to our\ntest dependencies and implement a C AST visitor that can collect all the\nrelevant type and method declarations that we can use to build the spec for\nPython types. Here is what it might look like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">c_char_p<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">pycparser<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">c_ast<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_parser<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">pycparser.plyparser<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">ParseError<\/span>\n\n<span class=\"k\">class<\/span> <span class=\"nc\">DeclCollector<\/span><span class=\"p\">(<\/span><span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">NodeVisitor<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span> <span class=\"o\">=<\/span> <span class=\"p\">{}<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">functions<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">_get_type<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span 
class=\"p\">[<\/span><span class=\"s2\">&quot; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">names<\/span><span class=\"p\">)]<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">visit_Typedef<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">Struct<\/span><span class=\"p\">)<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"o\">.<\/span><span class=\"n\">endswith<\/span><span class=\"p\">(<\/span>\n            <span class=\"s2\">&quot;_t&quot;<\/span>\n        <span class=\"p\">):<\/span>\n            <span class=\"n\">struct<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]]<\/span> <span class=\"o\">=<\/span> <span 
class=\"n\">CTypeDef<\/span><span class=\"p\">(<\/span>\n                <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"p\">,<\/span>\n                <span class=\"p\">[<\/span><span class=\"n\">decl<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"k\">for<\/span> <span class=\"n\">decl<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">struct<\/span><span class=\"o\">.<\/span><span class=\"n\">decls<\/span><span class=\"p\">],<\/span>\n            <span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">visit_Decl<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"s2\">&quot;extern&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">storage<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">return<\/span>\n\n        <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">FuncDecl<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">func_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"n\">ret_type<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span>\n            <span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">None<\/span>\n            <span 
class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">ret_type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">PtrDecl<\/span><span class=\"p\">):<\/span>\n                <span class=\"k\">if<\/span> <span class=\"s2\">&quot;&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">ret_type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">names<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;char&quot;<\/span><span class=\"p\">:<\/span>\n                    <span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">c_char_p<\/span>\n            <span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span><span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"k\">if<\/span> <span class=\"nb\">hasattr<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;name&quot;<\/span><span class=\"p\">)<\/span> <span class=\"k\">else<\/span> <span class=\"kc\">None<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span><span class=\"o\">.<\/span><span class=\"n\">params<\/span><span class=\"p\">]<\/span>\n                <span class=\"k\">if<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span> <span class=\"ow\">is<\/span> <span class=\"ow\">not<\/span> <span 
class=\"kc\">None<\/span>\n                <span class=\"k\">else<\/span> <span class=\"p\">[]<\/span>\n            <span class=\"p\">)<\/span>\n            <span class=\"k\">if<\/span> <span class=\"n\">func_name<\/span><span class=\"o\">.<\/span><span class=\"n\">endswith<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;_new&quot;<\/span><span class=\"p\">):<\/span>\n                <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">func_name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">4<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span>\n                    <span class=\"s2\">&quot;new&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span>\n                <span class=\"p\">)<\/span>\n            <span class=\"k\">elif<\/span> <span class=\"s2\">&quot;__&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">func_name<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">type_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"n\">method_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">func_name<\/span><span class=\"o\">.<\/span><span class=\"n\">partition<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;__&quot;<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">if<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">type_name<\/span><span class=\"p\">:<\/span>\n                    <span class=\"k\">return<\/span>\n           
     <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"n\">type_name<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span>\n                    <span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span><span class=\"n\">method_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">)<\/span>\n                <span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">functions<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span><span class=\"n\">func_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">))<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">collect<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">decl<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">parser<\/span> <span class=\"o\">=<\/span> <span class=\"n\">c_parser<\/span><span class=\"o\">.<\/span><span class=\"n\">CParser<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">ast<\/span> <span class=\"o\">=<\/span> <span class=\"n\">parser<\/span><span class=\"o\">.<\/span><span class=\"n\">parse<\/span><span class=\"p\">(<\/span><span class=\"n\">decl<\/span><span class=\"p\">,<\/span> <span class=\"n\">filename<\/span><span class=\"o\">=<\/span><span 
class=\"s2\">&quot;&lt;preprocessed&gt;&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">except<\/span> <span class=\"n\">ParseError<\/span> <span class=\"k\">as<\/span> <span class=\"n\">e<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">lines<\/span> <span class=\"o\">=<\/span> <span class=\"n\">decl<\/span><span class=\"o\">.<\/span><span class=\"n\">splitlines<\/span><span class=\"p\">()<\/span>\n            <span class=\"n\">line<\/span><span class=\"p\">,<\/span> <span class=\"n\">col<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span>\n                <span class=\"nb\">int<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span><span class=\"p\">)<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">1<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">e<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">partition<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot; &quot;<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">split<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;:&quot;<\/span><span class=\"p\">)[<\/span><span class=\"mi\">1<\/span><span class=\"p\">:<\/span><span class=\"mi\">3<\/span><span class=\"p\">]<\/span>\n            <span class=\"p\">)<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">i<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"nb\">max<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">4<\/span><span class=\"p\">),<\/span> <span class=\"nb\">min<\/span><span 
class=\"p\">(<\/span><span class=\"n\">line<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">lines<\/span><span class=\"p\">))):<\/span>\n                <span class=\"k\">if<\/span> <span class=\"n\">i<\/span> <span class=\"o\">!=<\/span> <span class=\"n\">line<\/span><span class=\"p\">:<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">i<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"si\">:<\/span><span class=\"s2\">5d<\/span><span class=\"si\">}<\/span><span class=\"s2\">  <\/span><span class=\"si\">{<\/span><span class=\"n\">lines<\/span><span class=\"p\">[<\/span><span class=\"n\">i<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">i<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"si\">:<\/span><span class=\"s2\">5d<\/span><span class=\"si\">}<\/span><span class=\"s2\">  <\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[33;1m<\/span><span class=\"si\">{<\/span><span class=\"n\">lines<\/span><span class=\"p\">[<\/span><span class=\"n\">line<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[0m&quot;<\/span><span class=\"p\">)<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot; &quot;<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">col<\/span> <span 
class=\"o\">+<\/span> <span class=\"mi\">5<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"s2\">&quot;<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[31;1m&lt;&lt;^<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[0m&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">raise<\/span>\n\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">visit<\/span><span class=\"p\">(<\/span><span class=\"n\">ast<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"p\">{<\/span>\n            <span class=\"n\">k<\/span><span class=\"p\">:<\/span> <span class=\"n\">v<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">k<\/span><span class=\"p\">,<\/span> <span class=\"n\">v<\/span> <span class=\"ow\">in<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"o\">.<\/span><span class=\"n\">items<\/span><span class=\"p\">()<\/span>\n            <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">v<\/span><span class=\"p\">,<\/span> <span class=\"n\">CTypeDef<\/span><span class=\"p\">)<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">v<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span>\n        <span class=\"p\">}<\/span>\n<\/pre><\/div>\n\n\n<p>This is also a handful, but the behaviour is very simple. We define an AST\nvisitor that listens to <code>typedef<\/code>s and declarations. For the former, we single\nout structure declarations and store the relevant information inside an\ninstance of the custom <code>CTypeDef<\/code> dataclass; as for the latter, we only trap\nfunction declarations (with special handling for functions that return\n<code>char *<\/code>). 
The two intermediate dataclasses <code>CTypeDef<\/code> and <code>CFunctionDef<\/code> are\nmerely defined as (up to you to use the more elegant <code>@dataclass<\/code> decorator)<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">class<\/span> <span class=\"nc\">CFunctionDef<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"n\">args<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">rtype<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CTypeDef<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">fields<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">fields<\/span> <span class=\"o\">=<\/span> <span class=\"n\">fields<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n        
<span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">False<\/span>\n<\/pre><\/div>\n\n\n<p>Armed with these definitions we can enhance our <code>CModule<\/code> class like so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">class<\/span> <span class=\"nc\">CModule<\/span><span class=\"p\">(<\/span><span class=\"n\">ModuleType<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;Generated from <\/span><span class=\"si\">{<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;.c&#39;<\/span><span class=\"p\">)<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">))<\/span>\n\n        <span class=\"n\">collector<\/span> <span class=\"o\">=<\/span> <span class=\"n\">DeclCollector<\/span><span class=\"p\">()<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span 
class=\"n\">ctypedef<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">collector<\/span><span class=\"o\">.<\/span><span class=\"n\">collect<\/span><span class=\"p\">(<\/span>\n            <span class=\"n\">preprocess<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.h&quot;<\/span><span class=\"p\">))<\/span>\n        <span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">items<\/span><span class=\"p\">():<\/span>\n            <span class=\"n\">parts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span><span class=\"o\">.<\/span><span class=\"n\">split<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;_&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"n\">py_name<\/span> <span class=\"o\">=<\/span> <span class=\"s2\">&quot;&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">((<\/span><span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">capitalize<\/span><span class=\"p\">()<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">parts<\/span><span class=\"p\">))<\/span>\n            <span class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">py_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">CMetaType<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctypedef<\/span><span class=\"p\">,<\/span> <span class=\"kc\">None<\/span><span class=\"p\">))<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">cfuncdef<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">collector<\/span><span class=\"o\">.<\/span><span class=\"n\">functions<\/span><span class=\"p\">:<\/span>\n            <span 
class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">cfunc<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CFunction<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">))<\/span>\n                <span class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">except<\/span> <span class=\"ne\">AttributeError<\/span><span class=\"p\">:<\/span>\n                <span class=\"c1\"># Not part of the binary<\/span>\n                <span class=\"k\">pass<\/span>\n\n    <span class=\"nd\">@classmethod<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">compile<\/span><span class=\"p\">(<\/span><span class=\"bp\">cls<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[],<\/span> <span class=\"n\">ldadd<\/span><span class=\"o\">=<\/span><span class=\"p\">[]):<\/span>\n        <span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.c&quot;<\/span><span class=\"p\">),<\/span> <span class=\"n\">cflags<\/span><span class=\"p\">,<\/span> <span class=\"n\">ldadd<\/span><span class=\"p\">)<\/span>\n        <span 
class=\"k\">return<\/span> <span class=\"bp\">cls<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>That is, we add the collected types and functions as attributes to the C module.\nThis is the C type metaclass that we use as a C type factory<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">class<\/span> <span class=\"nc\">CMetaType<\/span><span class=\"p\">(<\/span><span class=\"nb\">type<\/span><span class=\"p\">(<\/span><span class=\"n\">Structure<\/span><span class=\"p\">)):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__new__<\/span><span class=\"p\">(<\/span><span class=\"bp\">cls<\/span><span class=\"p\">,<\/span> <span class=\"n\">cmodule<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctypedef<\/span><span class=\"p\">,<\/span> <span class=\"n\">_<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">ctype<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__new__<\/span><span class=\"p\">(<\/span>\n            <span class=\"bp\">cls<\/span><span class=\"p\">,<\/span>\n            <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">,<\/span>\n            <span class=\"p\">(<\/span><span class=\"n\">CType<\/span><span class=\"p\">,),<\/span>\n            <span class=\"p\">{<\/span><span class=\"s2\">&quot;__cmodule__&quot;<\/span><span class=\"p\">:<\/span> <span class=\"n\">cmodule<\/span><span class=\"p\">},<\/span>\n        <span class=\"p\">)<\/span>\n\n        <span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"n\">cmodule<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span 
class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">_new&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">ctype<\/span><span class=\"o\">.<\/span><span class=\"n\">new<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CStaticMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span><span class=\"p\">,<\/span> <span class=\"n\">constructor<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">method_def<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">method_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">method_def<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"n\">method<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"n\">cmodule<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">__<\/span><span class=\"si\">{<\/span><span class=\"n\">method_name<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n            <span 
class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"n\">ctype<\/span><span class=\"p\">,<\/span> <span class=\"n\">method_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">CMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">method_def<\/span><span class=\"p\">,<\/span> <span class=\"n\">method<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">))<\/span>\n\n        <span class=\"n\">ctype<\/span><span class=\"o\">.<\/span><span class=\"n\">__cname__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n\n        <span class=\"k\">return<\/span> <span class=\"n\">ctype<\/span>\n<\/pre><\/div>\n\n\n<p>This is responsible for creating a new C type as a Python type, with all the\nmethods added as appropriate instances of wrappers around the C functions (note\nthe special handling of the <code>_new<\/code> method!):<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">class<\/span> <span class=\"nc\">CFunction<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfunc<\/span>\n   
     <span class=\"k\">if<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span> <span class=\"ow\">is<\/span> <span class=\"ow\">not<\/span> <span class=\"kc\">None<\/span><span class=\"p\">:<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span class=\"o\">.<\/span><span class=\"n\">restype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span>\n\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_posonly<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">all<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span> <span class=\"ow\">is<\/span> <span class=\"kc\">None<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">check_args<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_posonly<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">raise<\/span> <span class=\"ne\">ValueError<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"si\">}<\/span><span class=\"s2\"> takes only positional arguments&quot;<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"n\">nargs<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">len<\/span><span 
class=\"p\">(<\/span><span class=\"n\">args<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">if<\/span> <span class=\"n\">nargs<\/span> <span class=\"o\">!=<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">):<\/span>\n            <span class=\"k\">raise<\/span> <span class=\"ne\">TypeError<\/span><span class=\"p\">(<\/span>\n                <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"si\">}<\/span><span class=\"s2\"> takes exactly <\/span><span class=\"si\">{<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">)<\/span><span class=\"si\">}<\/span><span class=\"s2\"> arguments (<\/span><span class=\"si\">{<\/span><span class=\"n\">nargs<\/span><span class=\"si\">}<\/span><span class=\"s2\"> given)&quot;<\/span>\n            <span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__call__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">check_args<\/span><span class=\"p\">(<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span 
class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CFunction &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunction<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctype<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__get__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">obj<\/span><span class=\"p\">,<\/span> <span class=\"n\">objtype<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">def<\/span> <span 
class=\"nf\">_<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">cargs<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">obj<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">]<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">check_args<\/span><span class=\"p\">(<\/span><span class=\"n\">cargs<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n            <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">cargs<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">__cmethod__<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">self<\/span>\n\n        <span class=\"k\">return<\/span> <span class=\"n\">_<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CMethod &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39; of CType &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span><span class=\"o\">.<\/span><span 
class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CStaticMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunction<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctype<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CStaticMethod &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39; of CType &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>The <code>CType<\/code> class implementation tries to mimic the behaviour of Python classes,\nwith <code>__cself__<\/code> playing the role of the C analogue of <code>self<\/code>:<\/p>\n<div 
class=\"highlight\"><pre><span><\/span><span class=\"k\">class<\/span> <span class=\"nc\">CType<\/span><span class=\"p\">(<\/span><span class=\"n\">Structure<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">new<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__del__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">destroy<\/span><span class=\"o\">.<\/span><span class=\"n\">__cmethod__<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span><span class=\"p\">:<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">destroy<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span 
class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"si\">}<\/span><span class=\"s2\"> CObject at <\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&gt;&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>We also handle the destructor by overriding the <code>__del__<\/code> special method so that\nthe garbage collector can take care of freeing memory for us.<\/p>\n<p>In general, we need to preprocess the sources, especially headers, before\n<code>pycparser<\/code> can actually parse them. That's why we also need to define\nsomething like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">restrict_re<\/span> <span class=\"o\">=<\/span> <span class=\"n\">re<\/span><span class=\"o\">.<\/span><span class=\"n\">compile<\/span><span class=\"p\">(<\/span><span class=\"sa\">r<\/span><span class=\"s2\">&quot;__restrict \w+&quot;<\/span><span class=\"p\">)<\/span>\n\n<span class=\"n\">_header_head<\/span> <span class=\"o\">=<\/span> <span class=\"sa\">r<\/span><span class=\"s2\">&quot;&quot;&quot;<\/span>\n<span class=\"s2\">#define __attribute__(x)<\/span>\n<span class=\"s2\">#define __extension__<\/span>\n<span class=\"s2\">#define __inline inline<\/span>\n<span class=\"s2\">#define __asm__(x)<\/span>\n<span class=\"s2\">#define __const const<\/span>\n<span class=\"s2\">#define __inline__ inline<\/span>\n<span class=\"s2\">#define __inline inline<\/span>\n<span class=\"s2\">#define __restrict<\/span>\n<span class=\"s2\">#define __signed__ signed<\/span>\n<span class=\"s2\">#define __GNUC_VA_LIST<\/span>\n<span class=\"s2\">#define __gnuc_va_list char<\/span>\n<span class=\"s2\">#define __thread<\/span>\n<span class=\"s2\">&quot;&quot;&quot;<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">preprocess<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">:<\/span> <span 
class=\"n\">Path<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"nb\">str<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">with<\/span> <span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">open<\/span><span class=\"p\">()<\/span> <span class=\"k\">as<\/span> <span class=\"n\">fin<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">code<\/span> <span class=\"o\">=<\/span> <span class=\"n\">_header_head<\/span> <span class=\"o\">+<\/span> <span class=\"n\">fin<\/span><span class=\"o\">.<\/span><span class=\"n\">read<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">restrict_re<\/span><span class=\"o\">.<\/span><span class=\"n\">sub<\/span><span class=\"p\">(<\/span>\n            <span class=\"s2\">&quot;&quot;<\/span><span class=\"p\">,<\/span>\n            <span class=\"n\">run<\/span><span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span><span class=\"s2\">&quot;gcc&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-E&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-P&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-&quot;<\/span><span class=\"p\">],<\/span>\n                <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span>\n                <span class=\"nb\">input<\/span><span class=\"o\">=<\/span><span class=\"n\">code<\/span><span class=\"o\">.<\/span><span class=\"n\">encode<\/span><span class=\"p\">(),<\/span>\n                <span class=\"n\">cwd<\/span><span class=\"o\">=<\/span><span class=\"n\">SRC<\/span><span class=\"p\">,<\/span>\n            <span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">(),<\/span>\n        <span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>The 
<code>_header_head<\/code> prelude and the <code>restrict_re<\/code> pattern are needed to take care of GCC\nextensions, which <code>pycparser<\/code> does not support. Apart from that, all we\ndo is invoke the <code>gcc<\/code> preprocessor on the C sources. Putting everything\ntogether inside <code>tests\/cunit\/__init__.py<\/code>, we end up with something that\nlooks like this:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">ctypes<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">re<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">ctypes<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">,<\/span> <span class=\"n\">POINTER<\/span><span class=\"p\">,<\/span> <span class=\"n\">Structure<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_char_p<\/span><span class=\"p\">,<\/span> <span class=\"n\">cast<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">pathlib<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Path<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">subprocess<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span> <span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span> <span class=\"n\">run<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">types<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">ModuleType<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">typing<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Any<\/span><span class=\"p\">,<\/span> <span class=\"n\">Optional<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">pycparser<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">c_ast<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_parser<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">pycparser.plyparser<\/span> <span class=\"kn\">import<\/span> <span 
class=\"n\">ParseError<\/span>\n\n<span class=\"n\">HERE<\/span> <span class=\"o\">=<\/span> <span class=\"n\">Path<\/span><span class=\"p\">(<\/span><span class=\"vm\">__file__<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">resolve<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">TEST<\/span> <span class=\"o\">=<\/span> <span class=\"n\">HERE<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">ROOT<\/span> <span class=\"o\">=<\/span> <span class=\"n\">TEST<\/span><span class=\"o\">.<\/span><span class=\"n\">parent<\/span>\n<span class=\"n\">SRC<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ROOT<\/span> <span class=\"o\">\/<\/span> <span class=\"s2\">&quot;src&quot;<\/span>\n\n\n<span class=\"n\">restrict_re<\/span> <span class=\"o\">=<\/span> <span class=\"n\">re<\/span><span class=\"o\">.<\/span><span class=\"n\">compile<\/span><span class=\"p\">(<\/span><span class=\"sa\">r<\/span><span class=\"s2\">&quot;__restrict \w+&quot;<\/span><span class=\"p\">)<\/span>\n\n<span class=\"n\">_header_head<\/span> <span class=\"o\">=<\/span> <span class=\"sa\">r<\/span><span class=\"s2\">&quot;&quot;&quot;<\/span>\n<span class=\"s2\">#define __attribute__(x)<\/span>\n<span class=\"s2\">#define __extension__<\/span>\n<span class=\"s2\">#define __inline inline<\/span>\n<span class=\"s2\">#define __asm__(x)<\/span>\n<span class=\"s2\">#define __const const<\/span>\n<span class=\"s2\">#define __inline__ inline<\/span>\n<span class=\"s2\">#define __inline inline<\/span>\n<span class=\"s2\">#define __restrict<\/span>\n<span class=\"s2\">#define __signed__ signed<\/span>\n<span class=\"s2\">#define __GNUC_VA_LIST<\/span>\n<span class=\"s2\">#define __gnuc_va_list char<\/span>\n<span class=\"s2\">#define __thread<\/span>\n<span class=\"s2\">&quot;&quot;&quot;<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">preprocess<\/span><span 
class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">:<\/span> <span class=\"n\">Path<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"nb\">str<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">with<\/span> <span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">open<\/span><span class=\"p\">()<\/span> <span class=\"k\">as<\/span> <span class=\"n\">fin<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">code<\/span> <span class=\"o\">=<\/span> <span class=\"n\">_header_head<\/span> <span class=\"o\">+<\/span> <span class=\"n\">fin<\/span><span class=\"o\">.<\/span><span class=\"n\">read<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">restrict_re<\/span><span class=\"o\">.<\/span><span class=\"n\">sub<\/span><span class=\"p\">(<\/span>\n            <span class=\"s2\">&quot;&quot;<\/span><span class=\"p\">,<\/span>\n            <span class=\"n\">run<\/span><span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span><span class=\"s2\">&quot;gcc&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-E&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-P&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-&quot;<\/span><span class=\"p\">],<\/span>\n                <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span>\n                <span class=\"nb\">input<\/span><span class=\"o\">=<\/span><span class=\"n\">code<\/span><span class=\"o\">.<\/span><span class=\"n\">encode<\/span><span class=\"p\">(),<\/span>\n                <span class=\"n\">cwd<\/span><span class=\"o\">=<\/span><span class=\"n\">SRC<\/span><span class=\"p\">,<\/span>\n            <span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span 
class=\"p\">(),<\/span>\n        <span class=\"p\">)<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">:<\/span> <span class=\"n\">Path<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[],<\/span> <span class=\"n\">ldadd<\/span><span class=\"o\">=<\/span><span class=\"p\">[]):<\/span>\n    <span class=\"n\">binary<\/span> <span class=\"o\">=<\/span> <span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">run<\/span><span class=\"p\">(<\/span>\n        <span class=\"p\">[<\/span><span class=\"s2\">&quot;gcc&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-shared&quot;<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">cflags<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-o&quot;<\/span><span class=\"p\">,<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">binary<\/span><span class=\"p\">),<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">),<\/span> <span class=\"o\">*<\/span><span class=\"n\">ldadd<\/span><span class=\"p\">],<\/span>\n        <span class=\"n\">stdout<\/span><span class=\"o\">=<\/span><span class=\"n\">PIPE<\/span><span class=\"p\">,<\/span>\n        <span class=\"n\">stderr<\/span><span class=\"o\">=<\/span><span class=\"n\">STDOUT<\/span><span class=\"p\">,<\/span>\n        <span class=\"n\">cwd<\/span><span class=\"o\">=<\/span><span class=\"n\">SRC<\/span><span class=\"p\">,<\/span>\n    <span class=\"p\">)<\/span>\n\n    <span class=\"k\">if<\/span> <span class=\"n\">result<\/span><span 
class=\"o\">.<\/span><span class=\"n\">returncode<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">return<\/span>\n\n    <span class=\"k\">raise<\/span> <span class=\"ne\">RuntimeError<\/span><span class=\"p\">(<\/span><span class=\"n\">result<\/span><span class=\"o\">.<\/span><span class=\"n\">stdout<\/span><span class=\"o\">.<\/span><span class=\"n\">decode<\/span><span class=\"p\">())<\/span>\n\n\n<span class=\"n\">C<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;libc.so.6&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CFunctionDef<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"n\">args<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">rtype<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CTypeDef<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">fields<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span 
class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">fields<\/span> <span class=\"o\">=<\/span> <span class=\"n\">fields<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">False<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CType<\/span><span class=\"p\">(<\/span><span class=\"n\">Structure<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">new<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__del__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">destroy<\/span><span class=\"o\">.<\/span><span class=\"n\">__cmethod__<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span 
class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span><span class=\"p\">:<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">destroy<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"si\">}<\/span><span class=\"s2\"> CObject at <\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CFunction<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfunc<\/span>\n        <span class=\"k\">if<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span> <span 
class=\"ow\">is<\/span> <span class=\"ow\">not<\/span> <span class=\"kc\">None<\/span><span class=\"p\">:<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span class=\"o\">.<\/span><span class=\"n\">restype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">rtype<\/span>\n\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_posonly<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">all<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span> <span class=\"ow\">is<\/span> <span class=\"kc\">None<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">check_args<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_posonly<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">raise<\/span> <span class=\"ne\">ValueError<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"si\">}<\/span><span class=\"s2\"> takes only positional arguments&quot;<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"n\">nargs<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">args<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"nb\">len<\/span><span 
class=\"p\">(<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">if<\/span> <span class=\"n\">nargs<\/span> <span class=\"o\">!=<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">):<\/span>\n            <span class=\"k\">raise<\/span> <span class=\"ne\">TypeError<\/span><span class=\"p\">(<\/span>\n                <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"si\">}<\/span><span class=\"s2\"> takes exactly <\/span><span class=\"si\">{<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__args__<\/span><span class=\"p\">)<\/span><span class=\"si\">}<\/span><span class=\"s2\"> arguments (<\/span><span class=\"si\">{<\/span><span class=\"n\">nargs<\/span><span class=\"si\">}<\/span><span class=\"s2\"> given)&quot;<\/span>\n            <span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__call__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">check_args<\/span><span class=\"p\">(<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span 
class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CFunction &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunction<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctype<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__get__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">obj<\/span><span class=\"p\">,<\/span> <span class=\"n\">objtype<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">def<\/span> <span class=\"nf\">_<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span 
class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">cargs<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">obj<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">]<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">check_args<\/span><span class=\"p\">(<\/span><span class=\"n\">cargs<\/span><span class=\"p\">,<\/span> <span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n            <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__cfunc__<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">cargs<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">__cmethod__<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">self<\/span>\n\n        <span class=\"k\">return<\/span> <span class=\"n\">_<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CMethod &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39; of CType &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span 
class=\"nc\">CStaticMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunction<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctype<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"fm\">__repr__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;CStaticMethod &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39; of CType &#39;<\/span><span class=\"si\">{<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__ctype__<\/span><span class=\"o\">.<\/span><span class=\"vm\">__name__<\/span><span class=\"si\">}<\/span><span class=\"s2\">&#39;&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CMetaType<\/span><span class=\"p\">(<\/span><span class=\"nb\">type<\/span><span class=\"p\">(<\/span><span class=\"n\">Structure<\/span><span class=\"p\">)):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__new__<\/span><span class=\"p\">(<\/span><span class=\"bp\">cls<\/span><span 
class=\"p\">,<\/span> <span class=\"n\">cmodule<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctypedef<\/span><span class=\"p\">,<\/span> <span class=\"n\">_<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">ctype<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__new__<\/span><span class=\"p\">(<\/span>\n            <span class=\"bp\">cls<\/span><span class=\"p\">,<\/span>\n            <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">,<\/span>\n            <span class=\"p\">(<\/span><span class=\"n\">CType<\/span><span class=\"p\">,),<\/span>\n            <span class=\"p\">{<\/span><span class=\"s2\">&quot;__cmodule__&quot;<\/span><span class=\"p\">:<\/span> <span class=\"n\">cmodule<\/span><span class=\"p\">},<\/span>\n        <span class=\"p\">)<\/span>\n\n        <span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"n\">cmodule<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">_new&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">ctype<\/span><span class=\"o\">.<\/span><span class=\"n\">new<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CStaticMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span><span class=\"p\">,<\/span> <span 
class=\"n\">constructor<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">method_def<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">method_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">method_def<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"n\">method<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"n\">cmodule<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">__<\/span><span class=\"si\">{<\/span><span class=\"n\">method_name<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"n\">ctype<\/span><span class=\"p\">,<\/span> <span class=\"n\">method_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">CMethod<\/span><span class=\"p\">(<\/span><span class=\"n\">method_def<\/span><span class=\"p\">,<\/span> <span class=\"n\">method<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctype<\/span><span class=\"p\">))<\/span>\n\n        <span class=\"n\">ctype<\/span><span class=\"o\">.<\/span><span class=\"n\">__cname__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ctypedef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n\n        <span class=\"k\">return<\/span> <span 
class=\"n\">ctype<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">DeclCollector<\/span><span class=\"p\">(<\/span><span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">NodeVisitor<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span> <span class=\"o\">=<\/span> <span class=\"p\">{}<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">functions<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">_get_type<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">names<\/span><span class=\"p\">)]<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">visit_Typedef<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span 
class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">Struct<\/span><span class=\"p\">)<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"o\">.<\/span><span class=\"n\">endswith<\/span><span class=\"p\">(<\/span>\n            <span class=\"s2\">&quot;_t&quot;<\/span>\n        <span class=\"p\">):<\/span>\n            <span class=\"n\">struct<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">2<\/span><span class=\"p\">]]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CTypeDef<\/span><span class=\"p\">(<\/span>\n                <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">declname<\/span><span class=\"p\">,<\/span>\n                <span class=\"p\">[<\/span><span class=\"n\">decl<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"k\">for<\/span> <span class=\"n\">decl<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">struct<\/span><span class=\"o\">.<\/span><span class=\"n\">decls<\/span><span class=\"p\">],<\/span>\n            <span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">visit_Decl<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span 
class=\"p\">,<\/span> <span class=\"n\">node<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"s2\">&quot;extern&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">storage<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">return<\/span>\n\n        <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">FuncDecl<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">func_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"n\">ret_type<\/span> <span class=\"o\">=<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span>\n            <span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">None<\/span>\n            <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">ret_type<\/span><span class=\"p\">,<\/span> <span class=\"n\">c_ast<\/span><span class=\"o\">.<\/span><span class=\"n\">PtrDecl<\/span><span class=\"p\">):<\/span>\n                <span class=\"k\">if<\/span> <span class=\"s2\">&quot;&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">ret_type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">names<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;char&quot;<\/span><span class=\"p\">:<\/span>\n                  
  <span class=\"n\">rtype<\/span> <span class=\"o\">=<\/span> <span class=\"n\">c_char_p<\/span>\n            <span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span><span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span> <span class=\"k\">if<\/span> <span class=\"nb\">hasattr<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;name&quot;<\/span><span class=\"p\">)<\/span> <span class=\"k\">else<\/span> <span class=\"kc\">None<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span><span class=\"o\">.<\/span><span class=\"n\">params<\/span><span class=\"p\">]<\/span>\n                <span class=\"k\">if<\/span> <span class=\"n\">node<\/span><span class=\"o\">.<\/span><span class=\"n\">type<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span> <span class=\"ow\">is<\/span> <span class=\"ow\">not<\/span> <span class=\"kc\">None<\/span>\n                <span class=\"k\">else<\/span> <span class=\"p\">[]<\/span>\n            <span class=\"p\">)<\/span>\n            <span class=\"k\">if<\/span> <span class=\"n\">func_name<\/span><span class=\"o\">.<\/span><span class=\"n\">endswith<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;_new&quot;<\/span><span class=\"p\">):<\/span>\n                <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">func_name<\/span><span class=\"p\">[:<\/span><span class=\"o\">-<\/span><span class=\"mi\">4<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span 
class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span>\n                    <span class=\"s2\">&quot;new&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span>\n                <span class=\"p\">)<\/span>\n            <span class=\"k\">elif<\/span> <span class=\"s2\">&quot;__&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">func_name<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">type_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">_<\/span><span class=\"p\">,<\/span> <span class=\"n\">method_name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">func_name<\/span><span class=\"o\">.<\/span><span class=\"n\">partition<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;__&quot;<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">if<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">type_name<\/span><span class=\"p\">:<\/span>\n                    <span class=\"k\">return<\/span>\n                <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"p\">[<\/span><span class=\"n\">type_name<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">methods<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span>\n                    <span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span><span class=\"n\">method_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">)<\/span>\n                <span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span 
class=\"n\">functions<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"n\">CFunctionDef<\/span><span class=\"p\">(<\/span><span class=\"n\">func_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"n\">rtype<\/span><span class=\"p\">))<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">collect<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">decl<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">parser<\/span> <span class=\"o\">=<\/span> <span class=\"n\">c_parser<\/span><span class=\"o\">.<\/span><span class=\"n\">CParser<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">ast<\/span> <span class=\"o\">=<\/span> <span class=\"n\">parser<\/span><span class=\"o\">.<\/span><span class=\"n\">parse<\/span><span class=\"p\">(<\/span><span class=\"n\">decl<\/span><span class=\"p\">,<\/span> <span class=\"n\">filename<\/span><span class=\"o\">=<\/span><span class=\"s2\">&quot;&lt;preprocessed&gt;&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">except<\/span> <span class=\"n\">ParseError<\/span> <span class=\"k\">as<\/span> <span class=\"n\">e<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">lines<\/span> <span class=\"o\">=<\/span> <span class=\"n\">decl<\/span><span class=\"o\">.<\/span><span class=\"n\">splitlines<\/span><span class=\"p\">()<\/span>\n            <span class=\"n\">line<\/span><span class=\"p\">,<\/span> <span class=\"n\">col<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span>\n                <span class=\"nb\">int<\/span><span class=\"p\">(<\/span><span class=\"n\">_<\/span><span class=\"p\">)<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">1<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span 
class=\"ow\">in<\/span> <span class=\"n\">e<\/span><span class=\"o\">.<\/span><span class=\"n\">args<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">partition<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot; &quot;<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">split<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;:&quot;<\/span><span class=\"p\">)[<\/span><span class=\"mi\">1<\/span><span class=\"p\">:<\/span><span class=\"mi\">3<\/span><span class=\"p\">]<\/span>\n            <span class=\"p\">)<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">i<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"nb\">max<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">4<\/span><span class=\"p\">),<\/span> <span class=\"nb\">min<\/span><span class=\"p\">(<\/span><span class=\"n\">line<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">lines<\/span><span class=\"p\">))):<\/span>\n                <span class=\"k\">if<\/span> <span class=\"n\">i<\/span> <span class=\"o\">!=<\/span> <span class=\"n\">line<\/span><span class=\"p\">:<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">i<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"si\">:<\/span><span class=\"s2\">5d<\/span><span class=\"si\">}<\/span><span class=\"s2\">  <\/span><span class=\"si\">{<\/span><span class=\"n\">lines<\/span><span 
class=\"p\">[<\/span><span class=\"n\">i<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">i<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"si\">:<\/span><span class=\"s2\">5d<\/span><span class=\"si\">}<\/span><span class=\"s2\">  <\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[33;1m<\/span><span class=\"si\">{<\/span><span class=\"n\">lines<\/span><span class=\"p\">[<\/span><span class=\"n\">line<\/span><span class=\"p\">]<\/span><span class=\"si\">}<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[0m&quot;<\/span><span class=\"p\">)<\/span>\n                    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot; &quot;<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">col<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">5<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"s2\">&quot;<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[31;1m&lt;&lt;^<\/span><span class=\"se\">\\033<\/span><span class=\"s2\">[0m&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">raise<\/span>\n\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">visit<\/span><span class=\"p\">(<\/span><span class=\"n\">ast<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"p\">{<\/span>\n            <span class=\"n\">k<\/span><span class=\"p\">:<\/span> <span class=\"n\">v<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">k<\/span><span class=\"p\">,<\/span> <span class=\"n\">v<\/span> <span 
class=\"ow\">in<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">types<\/span><span class=\"o\">.<\/span><span class=\"n\">items<\/span><span class=\"p\">()<\/span>\n            <span class=\"k\">if<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">v<\/span><span class=\"p\">,<\/span> <span class=\"n\">CTypeDef<\/span><span class=\"p\">)<\/span> <span class=\"ow\">and<\/span> <span class=\"n\">v<\/span><span class=\"o\">.<\/span><span class=\"n\">constructor<\/span>\n        <span class=\"p\">}<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">CModule<\/span><span class=\"p\">(<\/span><span class=\"n\">ModuleType<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;Generated from <\/span><span class=\"si\">{<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;.c&#39;<\/span><span class=\"p\">)<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CDLL<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.so&quot;<\/span><span 
class=\"p\">))<\/span>\n\n        <span class=\"n\">collector<\/span> <span class=\"o\">=<\/span> <span class=\"n\">DeclCollector<\/span><span class=\"p\">()<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctypedef<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">collector<\/span><span class=\"o\">.<\/span><span class=\"n\">collect<\/span><span class=\"p\">(<\/span>\n            <span class=\"n\">preprocess<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.h&quot;<\/span><span class=\"p\">))<\/span>\n        <span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">items<\/span><span class=\"p\">():<\/span>\n            <span class=\"n\">parts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">name<\/span><span class=\"o\">.<\/span><span class=\"n\">split<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;_&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"n\">py_name<\/span> <span class=\"o\">=<\/span> <span class=\"s2\">&quot;&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">((<\/span><span class=\"n\">_<\/span><span class=\"o\">.<\/span><span class=\"n\">capitalize<\/span><span class=\"p\">()<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">parts<\/span><span class=\"p\">))<\/span>\n            <span class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">py_name<\/span><span class=\"p\">,<\/span> <span class=\"n\">CMetaType<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">ctypedef<\/span><span class=\"p\">,<\/span> <span class=\"kc\">None<\/span><span 
class=\"p\">))<\/span>\n\n        <span class=\"k\">for<\/span> <span class=\"n\">cfuncdef<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">collector<\/span><span class=\"o\">.<\/span><span class=\"n\">functions<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">name<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cfuncdef<\/span><span class=\"o\">.<\/span><span class=\"n\">name<\/span>\n            <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">cfunc<\/span> <span class=\"o\">=<\/span> <span class=\"n\">CFunction<\/span><span class=\"p\">(<\/span><span class=\"n\">cfuncdef<\/span><span class=\"p\">,<\/span> <span class=\"nb\">getattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__binary__<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">))<\/span>\n                <span class=\"nb\">setattr<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">cfunc<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">except<\/span> <span class=\"ne\">AttributeError<\/span><span class=\"p\">:<\/span>\n                <span class=\"c1\"># Not part of the binary<\/span>\n                <span class=\"k\">pass<\/span>\n\n    <span class=\"nd\">@classmethod<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">compile<\/span><span class=\"p\">(<\/span><span class=\"bp\">cls<\/span><span class=\"p\">,<\/span> <span class=\"n\">source<\/span><span class=\"p\">,<\/span> <span class=\"n\">cflags<\/span><span class=\"o\">=<\/span><span class=\"p\">[],<\/span> <span class=\"n\">ldadd<\/span><span class=\"o\">=<\/span><span class=\"p\">[]):<\/span>\n        <span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"o\">.<\/span><span 
class=\"n\">with_suffix<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.c&quot;<\/span><span class=\"p\">),<\/span> <span class=\"n\">cflags<\/span><span class=\"p\">,<\/span> <span class=\"n\">ldadd<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">return<\/span> <span class=\"bp\">cls<\/span><span class=\"p\">(<\/span><span class=\"n\">source<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>So let's use this new technology to make our C unit tests more <em>Pythonesque<\/em>.\nThis is what our new <code>test_cache.py<\/code> can look like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">pytest<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">C<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">tests.cunit.cache<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Queue<\/span><span class=\"p\">,<\/span> <span class=\"n\">QueueItem<\/span>\n\n<span class=\"n\">NULL<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">test_queue_item<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">value<\/span> <span class=\"o\">=<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">queue_item<\/span> <span class=\"o\">=<\/span> <span class=\"n\">QueueItem<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">queue_item<\/span><span class=\"o\">.<\/span><span class=\"n\">__cself__<\/span>\n\n    <span class=\"n\">queue_item<\/span><span class=\"o\">.<\/span><span class=\"n\">destroy<\/span><span class=\"p\">(<\/span><span 
class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"nd\">@pytest<\/span><span class=\"o\">.<\/span><span class=\"n\">mark<\/span><span class=\"o\">.<\/span><span class=\"n\">parametrize<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;qsize&quot;<\/span><span class=\"p\">,<\/span> <span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">100<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1000<\/span><span class=\"p\">])<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">test_queue<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">q<\/span> <span class=\"o\">=<\/span> <span class=\"n\">Queue<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">,<\/span> <span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">free<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">is_empty<\/span><span class=\"p\">()<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">qsize<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span> <span class=\"ow\">or<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">is_full<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">dequeue<\/span><span class=\"p\">()<\/span> <span class=\"ow\">is<\/span> <span class=\"n\">NULL<\/span>\n\n    <span class=\"n\">values<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">C<\/span><span class=\"o\">.<\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">16<\/span><span 
class=\"p\">)<\/span> <span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">)]<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"nb\">all<\/span><span class=\"p\">(<\/span><span class=\"n\">values<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">for<\/span> <span class=\"n\">k<\/span><span class=\"p\">,<\/span> <span class=\"n\">v<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">enumerate<\/span><span class=\"p\">(<\/span><span class=\"n\">values<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">assert<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">enqueue<\/span><span class=\"p\">(<\/span><span class=\"n\">v<\/span><span class=\"p\">,<\/span> <span class=\"n\">k<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">qsize<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span> <span class=\"ow\">or<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">is_empty<\/span><span class=\"p\">()<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">is_full<\/span><span class=\"p\">()<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">enqueue<\/span><span class=\"p\">(<\/span><span class=\"mi\">42<\/span><span class=\"p\">,<\/span> <span class=\"mi\">42<\/span><span class=\"p\">)<\/span> <span class=\"ow\">is<\/span> <span class=\"n\">NULL<\/span>\n\n    <span class=\"k\">assert<\/span> <span class=\"n\">values<\/span> <span class=\"o\">==<\/span> <span class=\"p\">[<\/span><span class=\"n\">q<\/span><span class=\"o\">.<\/span><span class=\"n\">dequeue<\/span><span class=\"p\">()<\/span> <span class=\"k\">for<\/span> <span 
class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"n\">qsize<\/span><span class=\"p\">)]<\/span>\n<\/pre><\/div>\n\n\n<p>We can practically treat our C binary object as if it were an actual Python\nmodule and import the <code>Queue<\/code> and <code>QueueItem<\/code> types from it. We can then use\nthem as if they were actual Python classes and call methods on them. How cool is\nthat? \ud83d\ude04<\/p>\n<h1 id=\"coverage-please\">Coverage, please!<\/h1>\n<p>What about test coverage? If you've used <code>pytest<\/code> before, you are probably\nfamiliar with the <code>pytest-cov<\/code> plugin, which is a handy way of collecting and\nreporting test coverage. GCC supports the <code>-fprofile-arcs<\/code> and <code>-ftest-coverage<\/code>\nflags to emit coverage data. So if we also wanted to get test coverage data while\nrunning C unit tests with <code>pytest<\/code>, all we would have to do is add these flags\nto the <code>CFLAGS<\/code> list in the <code>cache.py<\/code> module. The <a href=\"https:\/\/gcovr.com\/en\/stable\/\"><code>gcovr<\/code><\/a> tool is\nactually inspired by the Python counterpart <code>coverage.py<\/code> and can be used to\ngenerate some nice reports. In particular, it can generate\nCobertura XML reports that can be uploaded to services like\n<a href=\"https:\/\/codecov.io\/\">codecov.io<\/a>.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"testing"}},{"@attributes":{"term":"c"}}]},{"title":"My first close encounter with the Linux kernel source","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/my-first-close-encounter-with-the-linux-kernel-source.html","rel":"alternate"}},"published":"2022-01-15T14:39:00+00:00","updated":"2022-01-15T14:39:00+00:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2022-01-15:\/my-first-close-encounter-with-the-linux-kernel-source.html","summary":"<p>I finally had the chance to set some time aside to touch the Linux kernel source, for fun and profit. This is the story of how I implemented a new BPF helper.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#background\">Background<\/a><\/li>\n<li><a href=\"#getting-things-started\">Getting things started<\/a><\/li>\n<li><a href=\"#adding-new-bells-and-whistles\">Adding new bells and whistles<\/a><\/li>\n<li><a href=\"#use-cases-beyond-austin\">Use cases beyond Austin<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"background\">Background<\/h1>\n<p>As you probably know if you've been on my blog before, I develop and maintain\n<a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>, a frame stack sampler for CPython which, in essence, is the\nmain material for building a zero-instrumentation, minimal impact sampling\nprofiler for Python. Recently I worked on a variant, <a href=\"https:\/\/github.com\/P403n1x87\/austin#native-frame-stack\">austinp<\/a>, which\nsamples not only the Python stacks, but also native ones (optionally including\nLinux kernel stacks as well). The reason why this is just a variant and not part\nof Austin itself is that it violates two of its fundamental principles:\nno dependencies and minimal impact. The <code>austinp<\/code> variant relies on <code>libunwind<\/code> to perform\nnative stack unwinding, which means that a. we have a dependency on a\nthird-party library and b. 
since <code>libunwind<\/code> relies on <code>ptrace<\/code>, <code>austinp<\/code> no\nlonger has the guarantee of minimal impact.<\/p>\n<p>With the rapid and highly active development of <a href=\"https:\/\/ebpf.io\/\">eBPF<\/a> in the Linux kernel,\n<code>ptrace<\/code>-based stack unwinding might as well be considered a thing of the past.\nOne can exploit the BPF support for <code>perf_event<\/code> to implement a simple native\nprofiler <a href=\"https:\/\/github.com\/P403n1x87\/bpf\">in just a few lines of C code<\/a> with the support of\n<a href=\"https:\/\/github.com\/libbpf\/libbpf\">libbpf<\/a>. But for Austin to one day have a BPF variant, native stacks\nalone are not enough. We would need a way to unwind the Python frame stack from\na BPF program, and at the time of writing this is not yet possible. Why?\nBecause the BPF part of the Linux kernel lacks support for accessing the VM\nspace of a remote process. Indeed, the <a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/process_vm_readv.2.html\"><code>process_vm_readv<\/code><\/a>\nsystem call is the core functionality that makes Austin possible on Linux, and\nunfortunately, as I discovered, there is no counterpart exposed to BPF at the\ntime of writing.<\/p>\n<p>How do I know there is no such functionality in BPF-land yet? Back in September\n2021 I sent <a href=\"https:\/\/lore.kernel.org\/bpf\/CAGnuNNt7va4u78rvPmusYnhXAuy5e9aRhEeO6HDqYUsH979QLQ@mail.gmail.com\/T\/\">an email<\/a> to the BPF mailing list, asking if it was\npossible to do what <code>process_vm_readv<\/code> does inside a BPF program. The answer I\nreceived from the maintainer was negative, but it turned out I asked the\nquestion at the right time, because the feature that was required to make this\npossible, i.e. sleepable BPF programs, had just been introduced in the kernel.\nUnfortunately, I couldn't pick up the invite from the maintainers to prepare a\npatch back then. 
I still had no experience with the Linux kernel, nor the time\nto fully commit to the task, so I had to postpone this. The chance to actually\nlook into this came with the January R&amp;D week. This post is a recollection of\nthe events of that week.<\/p>\n<h1 id=\"getting-things-started\">Getting things started<\/h1>\n<p>So January came and along with it the R&amp;D Week. I decided to use this time to\nfinally have a play with the Linux kernel, something that I had wanted to do for a\nlong time. Now I had the chance to do that for fun, but also for profit as I\nactually had a pretty concrete use case. The first step was of course to get the\ndevelopment environment up. Since I didn't want to risk bricking my Linux\npartition, I decided the best approach was to spin up a virtual machine.<\/p>\n<p>I had an Ubuntu VM lying around from previous experiments with BPF and I thought\nof using that. Soon I discovered that, whilst many tutorials use or recommend an\nUbuntu system for Linux kernel development, things are much smoother on Debian.\nSo my personal recommendation for anyone starting on the Linux kernel for the\nfirst time, like me, is to use Debian.<\/p>\n<p>But let's go in order. The first thing we want to do is get our hands on the\nlatest version of the Linux kernel source code. 
So install <code>git<\/code> along with\nthese other dependencies<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo apt-get install vim libncurses5-dev gcc make git libssl-dev bison flex libelf-dev bc dwarves\n<\/pre><\/div>\n\n\n<p>and clone the repository with<\/p>\n<div class=\"highlight\"><pre><span><\/span>mkdir kernel\n<span class=\"nb\">cd<\/span> kernel\ngit clone -b staging-testing git:\/\/git.kernel.org\/pub\/scm\/linux\/kernel\/git\/gregkh\/staging.git\n<span class=\"nb\">cd<\/span> staging\n<\/pre><\/div>\n\n\n<p>These steps are adapted from <a href=\"https:\/\/kernelnewbies.org\/OutreachyfirstpatchSetup?action=show&amp;redirect=OPWfirstpatchSetup\">this KernelNewbies page<\/a>, which\nis the guide that I used to get started.<\/p>\n<p>The Linux kernel is highly (like, <em>very<\/em> highly) configurable, depending on your\nsystem and your needs, but of course the source code comes with no configuration file. The next\nstep then is to make one. The guide recommends starting from the current\nsystem's configuration as a base, which we can get with<\/p>\n<div class=\"highlight\"><pre><span><\/span>cp \/boot\/config-<span class=\"sb\">`<\/span>uname -r<span class=\"sb\">`<\/span>* .config\n<\/pre><\/div>\n\n\n<p>If the VM is running a not-so-recent build of the Linux kernel, it's perhaps\nworth running <code>make olddefconfig<\/code> to get any new configuration options in. And\nto speed the compilation process up a bit, we can use the current module\nconfiguration with <code>make localmodconfig<\/code> to get rid of some of the defaults that\nare not applicable to the current system.<\/p>\n<blockquote>\n<p>This is where things get a bit annoying if you are using Ubuntu instead of\nDebian. It turns out there are some options that involve security certificates\nthat are best set to the empty string to allow the build steps to regenerate\nwhatever is needed. 
So if you are using Ubuntu to experiment on the Linux\nkernel, you might want to set <code>CONFIG_SYSTEM_TRUSTED_KEYS<\/code>,\n<code>CONFIG_MODULE_SIG_KEY<\/code> and <code>CONFIG_SYSTEM_REVOCATION_KEYS<\/code> to <code>\"\"<\/code>.<\/p>\n<\/blockquote>\n<p>Before moving on, it might be a good\nidea to change the value of <code>CONFIG_LOCALVERSION<\/code> to something other than\nthe empty string, in case we end up overwriting the current Linux kernel. For\nmy experiments I have used <code>CONFIG_LOCALVERSION=p403n1x87<\/code>. This will show up\nas a suffix of the kernel version in the bootloader menu<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/linux-first-encounter\/grub.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/linux-first-encounter\/grub.png\"\n    alt=\"The custom Linux kernel build image in GRUB\"\n  \/><\/a>\n<\/p>\n\n<p>To make sure that everything is in order, and that we get GRUB menu entries\nsimilar to the ones shown above, we can try to compile the sources with<\/p>\n<div class=\"highlight\"><pre><span><\/span>make -j2 &gt; \/dev\/null <span class=\"o\">&amp;&amp;<\/span> <span class=\"nb\">echo<\/span> <span class=\"s2\">&quot;OK&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>Feel free to replace the <code>2<\/code> with any other number that might be appropriate to\nyour system. This represents the number of parallel compilation processes that\nare spawned, so it correlates with the number of cores that are available to\nyour VM. The <code>\/dev\/null<\/code> is there to suppress any output that goes to <code>stdout<\/code>,\nwhich is not very instructive. This leaves us with more useful warnings and\nerrors that are spit out to <code>stderr<\/code>. If the compilation succeeds we will\nthen see <code>OK<\/code> at the end of the output. 
If this is the case, we can proceed with\nthe installation using<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo make modules_install install\n<\/pre><\/div>\n\n\n<p>To boot into the newly compiled Linux kernel, simply reboot the VM with<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo reboot\n<\/pre><\/div>\n\n\n<p>We're now all set to start hacking the Linux kernel \ud83c\udf89.<\/p>\n<h1 id=\"adding-new-bells-and-whistles\">Adding new bells and whistles<\/h1>\n<p>I spent the Monday of R&amp;D Week getting my environment up and running and having\na first look at the structure of my local copy of the Linux kernel source\nrepository. The next day I started studying the parts that were relevant to my\ngoal, that is, finding out where the <code>process_vm_readv<\/code> system call is defined and\nhow it works internally. The Linux kernel source comes with <code>ctags<\/code> support for\neasy navigation. However, if you do not want to install additional tools you can\nmake use of the <a href=\"https:\/\/elixir.bootlin.com\/linux\/latest\/source\">Elixir<\/a> project to find where symbols are defined and\nreferenced.<\/p>\n<p>The <code>process_vm_readv<\/code> system call is defined in <code>mm\/process_vm_access.c<\/code>.\nStudying the content of this source carefully, we discover that memory is read\npage by page using <code>copy_page_to_iter<\/code>. The system call is also generic in the\nsense that it performs multiple operations, according to the number of <code>iovec<\/code>\nelements that are passed to it. 
For my use case, I'm seeking to implement a\nfunction with the signature<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"nf\">process_vm_read<\/span><span class=\"p\">(<\/span><span class=\"kt\">pid_t<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>so that a possible implementation might have looked something like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"nf\">process_vm_read<\/span><span class=\"p\">(<\/span><span class=\"kt\">pid_t<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span 
class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">iovec<\/span><span class=\"w\"> <\/span><span class=\"n\">lvec<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">rvec<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">unlikely<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">lvec<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">iovec<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_base<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_len<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">};<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">rvec<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"k\">struct<\/span><span 
class=\"w\"> <\/span><span class=\"nc\">iovec<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_base<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_len<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">};<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">process_vm_rw<\/span><span class=\"p\">(<\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">lvec<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">rvec<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>This is indeed what I started experimenting with. 
I added this function to\n<code>mm\/process_vm_access.c<\/code> and exposed it to BPF with the helper<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">BPF_CALL_4<\/span><span class=\"p\">(<\/span><span class=\"n\">bpf_copy_from_user_remote<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">u32<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"n\">__user<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">pid_t<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">unlikely<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">process_vm_read<\/span><span 
class=\"p\">(<\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">bpf_func_proto<\/span><span class=\"w\"> <\/span><span class=\"n\">bpf_copy_from_user_remote_proto<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">func<\/span><span class=\"w\">     <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">bpf_copy_from_user_remote<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">gpl_only<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">false<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">ret_type<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">RET_INTEGER<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">arg1_type<\/span><span class=\"w\">    <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">ARG_PTR_TO_UNINIT_MEM<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">arg2_type<\/span><span class=\"w\">    <\/span><span 
class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">ARG_CONST_SIZE_OR_ZERO<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">arg3_type<\/span><span class=\"w\">    <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">ARG_ANYTHING<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">.<\/span><span class=\"n\">arg4_type<\/span><span class=\"w\">    <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">ARG_ANYTHING<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"p\">};<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>To test this I wrote a simple C application that takes an integer from the\ncommand line, stores it safely somewhere in its VM space and is kind enough to\ntell us its PID and the VM address where we can find the \"secret\"<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdio.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdlib.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;unistd.h&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"nf\">main<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">argc<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">char<\/span><span class=\"w\"> <\/span><span class=\"o\">**<\/span><span class=\"n\">argv<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span 
class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">argc<\/span><span class=\"w\"> <\/span><span class=\"o\">!=<\/span><span class=\"w\"> <\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">fprintf<\/span><span class=\"p\">(<\/span><span class=\"n\">stderr<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;usage: secret &lt;SECRET&gt;<\/span><span class=\"se\">\\n<\/span><span class=\"s\">&quot;<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">-1<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">secret<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">atoi<\/span><span class=\"p\">(<\/span><span class=\"n\">argv<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">printf<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;I&#39;m process %d and my secret is at %p<\/span><span class=\"se\">\\n<\/span><span class=\"s\">&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">getpid<\/span><span class=\"p\">(),<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">secret<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">printf<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;%d %p<\/span><span class=\"se\">\\n<\/span><span 
class=\"s\">&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">getpid<\/span><span class=\"p\">(),<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">secret<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(;;)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">sleep<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>We can then write a simple BPF program that takes the PID and the address and\npasses them to the new helper in an attempt to retrieve the secret value from the\nremote target process<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">secret<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"p\">...<\/span><span class=\"w\"><\/span>\n\n<span class=\"n\">SEC<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;fentry.s\/__x64_sys_write&quot;<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">BPF_PROG<\/span><span class=\"p\">(<\/span><span class=\"n\">test_sys_write<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">fd<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span 
class=\"n\">buf<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">size_t<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">bpf_get_current_pid_tgid<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;&gt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">32<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">pid<\/span><span class=\"w\"> <\/span><span class=\"o\">!=<\/span><span class=\"w\"> <\/span><span class=\"n\">my_pid<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">bytes<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">bytes<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">bpf_copy_from_user_remote<\/span><span class=\"p\">(<\/span><span class=\"o\">&amp;<\/span><span class=\"n\">secret<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"k\">sizeof<\/span><span class=\"p\">(<\/span><span class=\"n\">secret<\/span><span class=\"p\">),<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span 
class=\"n\">target_pid<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">bpf_printk<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;bpf_copy_from_user_remote: copied %ld bytes&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">bytes<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>We are using <code>fentry<\/code> because it has a sleepable variant, <code>fentry.s<\/code>, and\n<code>sys_write<\/code> as our target system call. Every time this system call is invoked, our\nBPF program is executed. In this case it does its work only if the call was made\nby the associated user-space process.<\/p>\n<p>Unfortunately, when checking the result of the <code>bpf_printk<\/code> call from\n<code>\/sys\/kernel\/debug\/tracing\/trace_pipe<\/code>, I would always see <code>-14<\/code> (<code>-EFAULT<\/code>)\ninstead of the expected <code>4<\/code>. This is an indication that we are passing the wrong\nmemory addresses around. Further investigation led me to the conclusion that the\nproblem was with how I was using <code>process_vm_rw<\/code>. Indeed, the manpage of the\n<code>process_vm_readv<\/code> system call clearly states that<\/p>\n<blockquote>\n<p>These system calls transfer data between the address space of the calling\nprocess (\"the local process\") and the process identified by pid (\"the\nremote process\"). 
The data moves directly between the address spaces of\nthe two processes, without passing through kernel space.<\/p>\n<\/blockquote>\n<p>The problem here is that the destination is a variable with a kernel-space\naddress, which would then fail the checks in <code>import_iovec<\/code> in the attempt to\ncopy the <code>struct iovec<\/code> data over from user-space into kernel-space. Looking at\nhow <code>process_vm_rw<\/code> is actually implemented, one can see that the local vectors\nare turned into a <code>struct iov_iter<\/code>, so I thought that all that was needed was to\ncreate one from a <code>struct kvec<\/code> instead. My next attempt was then<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"nf\">process_vm_read<\/span><span class=\"p\">(<\/span><span class=\"kt\">pid_t<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">ssize_t<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">kvec<\/span><span class=\"w\"> <\/span><span class=\"n\">lvec<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">iovec<\/span><span class=\"w\"> <\/span><span class=\"n\">rvec<\/span><span 
class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">iov_iter<\/span><span class=\"w\"> <\/span><span class=\"n\">iter<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">unlikely<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">lvec<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">kvec<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_base<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_len<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">};<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">rvec<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">iovec<\/span><span 
class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_base<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">.<\/span><span class=\"n\">iov_len<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">};<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">iov_iter_kvec<\/span><span class=\"p\">(<\/span><span class=\"o\">&amp;<\/span><span class=\"n\">iter<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">READ<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">lvec<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">process_vm_rw_core<\/span><span class=\"p\">(<\/span><span class=\"n\">pid<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">iter<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;<\/span><span class=\"n\">rvec<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">);<\/span><span 
class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Progress! The return value of the BPF helper is now <code>4<\/code> as expected, but for\nsome reason the value read into <code>int secret<\/code> was always 0. \ud83d\udca5<\/p>\n<p>While in the middle of my experiments, I sent a <a href=\"https:\/\/lore.kernel.org\/bpf\/CAGnuNNtdvbk+wp8uYDPK3weGm5PVmM7hqEaD=Mg2nBT-dKtNHw@mail.gmail.com\/\">new message<\/a> to the\nBPF mailing list on Wednesday, asking for feedback on my approach. I just wanted\nto see if, generally, I was on the right track and what I was doing made any\nsense. Clearly, my initial use of the details of <code>process_vm_readv<\/code> did not. On\nThursday evening I got a <a href=\"https:\/\/lore.kernel.org\/bpf\/20220113233708.1682225-1-kennyyu@fb.com\/\">reply from Kenny<\/a> at Meta, who showed me their\npatch with a similar change. They used <code>access_process_vm<\/code> from <code>mm\/memory.c<\/code>\ninstead, with the added bonus of requiring a reference to a process descriptor\n<code>struct task_struct<\/code> instead of a <code>pid<\/code>. This has the benefit, as Kenny\nexplained, of removing the ambiguity around <code>pid<\/code> which comes from the use of\nnamespaces. However, for my particular use case, I really need a <code>pid<\/code> (the\nnamespace information might also be given at some point). Looking back at the\nnotes that I had taken throughout the previous three days, I actually realised\nthat I had come across <code>access_process_vm<\/code> while poking around to find what else\nwas there. 
What threw me off was a misinterpretation of the comment that\npreceded its definition<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cm\">\/*<\/span>\n<span class=\"cm\">* Access another process&#39; address space.<\/span>\n<span class=\"cm\">* Source\/target buffer must be kernel space,<\/span>\n<span class=\"cm\">* Do not walk the page table directly, use get_user_pages<\/span>\n<span class=\"cm\">*\/<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Being a total Linux kernel newbie, I interpreted this as meaning that both\ntarget and source should have been kernel-space addresses, which of course was\nnot my case. However, that comment refers to the <code>buf<\/code> argument, which makes a\nlot of sense when you think of it, for the kernel-to-kernel interpretation does\nnot really fit with the name of the function. So I ditched my previous attempts\ninside <code>mm\/process_vm_access.c<\/code> and refactored my new BPF helper into<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">BPF_CALL_4<\/span><span class=\"p\">(<\/span><span class=\"n\">bpf_copy_from_user_remote<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">u32<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"n\">__user<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">pid_t<\/span><span 
class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">pid<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">task_struct<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">task<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">unlikely<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">))<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">task<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">find_get_task_by_vpid<\/span><span class=\"p\">(<\/span><span class=\"n\">pid<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">!<\/span><span class=\"n\">task<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"n\">ESRCH<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">access_process_vm<\/span><span class=\"p\">(<\/span><span class=\"n\">task<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span 
class=\"p\">(<\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">user_ptr<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">dst<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>and success! The simple BPF program that we saw earlier is now able to correctly\nread the secret from the target process! \ud83c\udf89<\/p>\n<blockquote>\n<p>For the full BPF program sources, head over to my <a href=\"https:\/\/github.com\/P403n1x87\/bpf\/tree\/copy-from-user-remote\">bpf<\/a>\nrepository.<\/p>\n<\/blockquote>\n<h1 id=\"use-cases-beyond-austin\">Use cases beyond Austin<\/h1>\n<p>Is this helper useful only for Austin? Not at all. Whilst there clearly is a use\ncase for Austin in the way that I described at the beginning of this post, the\nnew BPF helper opens up many interesting observability applications. At the\nend of the day, what it does is allow BPF programs to read the VM of any\nprocess on the system. Debugging of production systems should then be an idea\npopping into your mind right now, and indeed this is the use case of Kenny and the\nreason why they are proposing a similar BPF helper.<\/p>\n<p>Think of a Python application, for example, running with CPython and native\nextensions. Getting any sort of meaningful observability into it is generally\nhard. This new helper might be able to, say, get the arguments of a (native)\nfunction, interpret them as Python objects, and return a meaningful\nrepresentation. 
The possibilities are virtually endless.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"linux"}},{"@attributes":{"term":"bpf"}},{"@attributes":{"term":"r&d"}}]},{"title":"Increasing Austin accuracy with a double-heap trick","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/increasing-austin-accuracy-with-a-dobule-heap-trick.html","rel":"alternate"}},"published":"2021-12-17T13:35:00+00:00","updated":"2021-12-17T13:35:00+00:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2021-12-17:\/increasing-austin-accuracy-with-a-dobule-heap-trick.html","summary":"<p>The latest version of Austin comes with a heap size option that can be used to increase the accuracy with which invalid samples are detected. In this post I give a brief description of how this works.<\/p>","content":"<p>It has been argued that, in order to collect reliable data with a sampling\nprofiler that peeks at the private VM address space of a process, like\n<a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>, it is <em>necessary<\/em> to pause the tracee. However, one of the\nfundamental principles that Austin is based on is: <em>keep perturbations to a\nminimum<\/em>. Therefore, pausing the tracee to sample its private VM is not an\noption on the table. Does this mean that Austin is bound to report unreliable\ndata? My aim with this post is to convince you that halting the tracee is a\n<em>sufficient<\/em> condition, but not a <em>necessary<\/em> one. That is, we can still get\npretty accurate results without pausing the tracee every time we want to sample\nit, provided we are willing to trade in some physical memory.<\/p>\n<p>So how do we leverage some extra memory consumption for increased accuracy? To\nunderstand this we need to review how Austin works, which in turn forces us to\nreview how Python works when it comes to frame stack management. 
When Python\nevaluates some bytecode, it creates a frame object that references the code that\nis being executed and that holds some useful information. These frames are\nreferenced by a linked list, which means that, in general, there are no\nguarantees as to where in the VM space the frame objects might get allocated.\nHowever, the implementation of <a href=\"https:\/\/github.com\/python\/cpython\/blob\/87539cc716fab47cd4f501f2441c4ab8e80bce6f\/Objects\/frameobject.c#L778\"><code>frame_alloc<\/code><\/a> seems to suggest that\nentries in the list might get reused when they become <em>free<\/em>. So if we keep track\nof the addresses as we discover them we could in principle get all the frames\nwith just a single system call, instead of making a call for each one of them.\nBut in order to read the private VM of the tracee with a single system call\nwe need an address range, which means we need a buffer large enough\nto receive the content. The VM address space is huge on a 64-bit architecture so\nwe cannot really afford to dump an arbitrary range, and therefore we need a\ncompromise: start getting frame objects and increase the range <em>up to<\/em> a given\nthreshold. The hope here is that, with a bit of luck, we start tracking a range\nwhere frame objects are <em>likely<\/em> to be allocated.<\/p>\n<p>In practical terms, the above solution might still not be good enough. Depending\non the allocator that is being used, memory could be allocated <em>anywhere<\/em> in the\nmassive VM address space. However, some patterns generally emerge: sometimes the\naddress is close to the image in memory of the process, sometimes it is closer\nto the upper boundary of the allowed user-space VM range. Since the gap in\nbetween these two regions could be very big, we could double up the above idea\nand have <em>two<\/em> local buffers, one close to the process image in memory, the\nother closer to the upper boundary of the VM address space. 
This effectively\ndoubles our chances of dumping frames with just two system calls. When we need\nto resolve a frame that is not within that range we bite the bullet and make\nanother system call. But, if we are lucky, the overall rate of calls we make\nshould drop considerably with the double-heap trick above, and result accuracy\nshould benefit from it.<\/p>\n<p>So let's see if our theoretical speculations work in reality. The latest release\n3.2.0 of Austin comes with a new option, <code>-h\/--heap<\/code>, which allows us to cap the\ntotal amount of double-heap that Austin can allocate to dump frame objects in\none go. Let's try profiling<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">sum_up_to<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">n<\/span> <span class=\"o\">&lt;=<\/span> <span class=\"mi\">1<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">return<\/span> <span class=\"mi\">1<\/span>\n\n    <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">n<\/span> <span class=\"o\">+<\/span> <span class=\"n\">sum_up_to<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"n\">result<\/span>\n\n\n<span class=\"k\">for<\/span> <span class=\"n\">_<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"mi\">1_000_000<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">N<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">16<\/span>\n    <span class=\"k\">assert<\/span> <span class=\"n\">sum_up_to<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">)<\/span> <span class=\"o\">==<\/span> <span class=\"p\">(<\/span><span class=\"n\">N<\/span> <span 
class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">N<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">1<\/span><span class=\"p\">))<\/span> <span class=\"o\">&gt;&gt;<\/span> <span class=\"mi\">1<\/span>\n<\/pre><\/div>\n\n\n<p>with <code>-h 0<\/code>, i.e. with no heap. This is the picture that we get if we run the\nabove with the <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=p403n1x87.austin-vscode\">Austin VS Code<\/a> extension:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-no-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-no-heap.png\"\n    alt=\"Results with no heap\"\n  \/><\/a>\n<\/p>\n\n<p>Whilst it is true that most of the CPU time is spent in <code>sum_up_to<\/code>, we would\nhave expected a call stack at most 16 frames high in this case. This flame graph\nis then misleading, although it still conveys the correct information that\n<code>sum_up_to<\/code> is the largest consumer of CPU time. Let's try again, but with the\ndefault heap size this time (which is 256 MB):<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-heap.png\"\n    alt=\"Results with the double heap\"\n  \/><\/a>\n<\/p>\n\n<p>This time the recursive call stack has the expected height, and Austin is able\nto tell which samples didn't look quite right. But instead of throwing them\naway, they are displayed starting from a parent <code>INVALID<\/code> frame. Why do we want\nto do that? Two reasons: we still account for the total CPU time and we still\nhave useful information. 
Even if invalid, those were the frames that got\nsampled, so some of them must actually have been running when we sampled.\nCollecting that data is therefore useful for other aggregations, such as a source\ncode heat map like the one produced by the Austin VS Code extension:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-heap-heatmap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/recursive-heap-heatmap.png\"\n    alt=\"Results with the double heap: source heat map\"\n  \/><\/a>\n<\/p>\n\n<p>But before we get too hopeful about the merits of this solution, let's have a\nlook at some other examples. The following flame graph comes from profiling <a href=\"https:\/\/github.com\/P403n1x87\/aoc\/blob\/c309c50503bded668745269e3dbc6273acc76d04\/2021\/12\/code.py\">my\nsolution<\/a> for <a href=\"https:\/\/adventofcode.com\/2021\/day\/12\">Day 12<\/a> of the Advent of Code 2021 challenge\nwith the default heap size:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/day12-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/day12-heap.png\"\n    alt=\"Results for Day 12 with the double heap\"\n  \/><\/a>\n<\/p>\n\n<p>There are clearly invalid frames (those in green that start directly from <code>dfs<\/code>,\nin the middle of the graph) that are not reported as such. 
For comparison, this\nis the resulting graph with no heap:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/day12-no-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/day12-no-heap.png\"\n    alt=\"Results for Day 12 without heap\"\n  \/><\/a>\n<\/p>\n\n<p>The conclusion that we draw from these experiments is that the double-heap trick\nimplemented in Austin 3.2 increases the accuracy of the results, in the sense\nthat Austin is better able to detect invalid samples, but there is still a\nchance of invalid samples being reported as valid in the output. However, one thing to\nnotice is that these are rather artificial examples that involve code that\nruns pretty quickly. In many practical situations you would be using a profiler\nto detect unexpected slow paths, where Austin has a good chance of producing\nfairly accurate results. This is the profiling data generated by this simple\ntest script:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">threading<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">keep_cpu_busy<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">a<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">i<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"mi\">2000000<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">a<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">if<\/span> <span class=\"vm\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;__main__&quot;<\/span><span class=\"p\">:<\/span>\n    <span class=\"n\">threading<\/span><span class=\"o\">.<\/span><span class=\"n\">Thread<\/span><span 
class=\"p\">(<\/span><span class=\"n\">target<\/span><span class=\"o\">=<\/span><span class=\"n\">keep_cpu_busy<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">start<\/span><span class=\"p\">()<\/span>\n    <span class=\"n\">keep_cpu_busy<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>with the heap:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/target34-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/target34-heap.png\"\n    alt=\"Results for the test script with the heap\"\n  \/><\/a>\n<\/p>\n\n<p>and without:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/target34-no-heap.png\" target=\"_blank\"><img\n    src=\"https:\/\/p403n1x87.github.io\/images\/austin-accuracy\/target34-no-heap.png\"\n    alt=\"Results for the test script without heap\"\n  \/><\/a>\n<\/p>\n\n<p>i.e. they're almost identical.<\/p>\n<p>The default value of 256 MB for the maximum combined heap size seems a\nreasonable compromise for getting even more accurate results, but on systems with\nlimited resources it is perhaps advisable to run Austin with a lower value, if\nnot with the more drastic option <code>--heap 0<\/code>, which gives the pre-3.2 behaviour.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"profiling"}},{"@attributes":{"term":"r&d"}}]},{"title":"How I completed the Hacktoberfest 2021 challenge with a profiler","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/how-i-completed-the-hacktoberfest-2021-challenge-with-a-profiler.html","rel":"alternate"}},"published":"2021-12-16T15:18:00+00:00","updated":"2021-12-16T15:18:00+00:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2021-12-16:\/how-i-completed-the-hacktoberfest-2021-challenge-with-a-profiler.html","summary":"<p>I shall reveal to you how I managed to complete the Hacktoberfest 2021 challenge with just a profiler. So read on if you are interested!<\/p>","content":"<p>Remember my post about <a href=\"https:\/\/p403n1x87.github.io\/how-to-bust-python-performance-issues.html\">how to bust performance issues<\/a>? My claim there was\nthat if you picked a project at random from e.g. GitHub, you'd find something\nthat would catch your eye if you ran the code through a profiler. Iterating this\nprocess then seemed like a good strategy to generate PRs, which is what you need\nto do if you want to <a href=\"https:\/\/dev.to\/p403n1x87\">complete the Hacktoberfest challenge<\/a> when that\ntime of the year comes around.<\/p>\n<p>But let's not get the wrong idea. You shouldn't walk away from here thinking\nthat performance analysis is as trivial as turning the profiler on during test\nruns. What my previous post was trying to show is that, in many cases, code is\nnot profiled and therefore it is easy to find some (rather) low-hanging fruit\nthat can be spotted, and fixed easily, simply by looking at profiling data from the\ntest suite. Once these are out of the way, that's when the performance analysis\nbecomes a challenge in itself, and some more serious and structured methodologies\nare required to make further progress.<\/p>\n<p>So how did I actually use a profiler to complete the Hacktoberfest? I started by\nlooking at all the Python projects with the <code>hacktoberfest<\/code> topic on GitHub and\npicked some that looked interesting to me. 
The profiler of choice was (surprise,\nsurprise) <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>, since it requires no instrumentation and has\npractically no impact on the tracee, meaning that I could just sneak an <code>austin<\/code>\ninto the command line used to start the tests to get the data that I needed.<\/p>\n<p>As a concrete example, let's look at how I was able to detect and fix a\nperformance regression in <a href=\"https:\/\/github.com\/bee-san\/pyWhat\">pyWhat<\/a>. I forked the repository, made a\nlocal clone and looked at how the test suite is run. Peeking at the GitHub\nActions, I could see the test suite was triggered with <code>nox<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span>python -m nox\n<\/pre><\/div>\n\n\n<p>Inside the <code>noxfile.py<\/code> we can find the <code>tests<\/code> session, which is the one we are\ninterested in<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nd\">@nox<\/span><span class=\"o\">.<\/span><span class=\"n\">session<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">tests<\/span><span class=\"p\">(<\/span><span class=\"n\">session<\/span><span class=\"p\">:<\/span> <span class=\"n\">Session<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"kc\">None<\/span><span class=\"p\">:<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;Run the test suite.&quot;&quot;&quot;<\/span>\n    <span class=\"n\">session<\/span><span class=\"o\">.<\/span><span class=\"n\">run<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;poetry&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;install&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;--no-dev&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">external<\/span><span class=\"o\">=<\/span><span class=\"kc\">True<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">install_with_constraints<\/span><span class=\"p\">(<\/span>\n        <span 
class=\"n\">session<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-black&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-cov&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-isort&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-flake8&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-mypy&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;types-requests&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;types-orjson&quot;<\/span><span class=\"p\">,<\/span>\n    <span class=\"p\">)<\/span>\n    <span class=\"n\">session<\/span><span class=\"o\">.<\/span><span class=\"n\">run<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;pytest&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-vv&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;--cov=.\/&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;--cov-report=xml&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>So let's create a <code>profile<\/code> session where we run the test suite through Austin.\nAll we have to do is add <code>austin<\/code> at the right place in the arguments to\n<code>session.run<\/code>, plus some additional options, e.g.:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nd\">@nox<\/span><span class=\"o\">.<\/span><span class=\"n\">session<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">profile<\/span><span class=\"p\">(<\/span><span class=\"n\">session<\/span><span class=\"p\">:<\/span> <span class=\"n\">Session<\/span><span class=\"p\">)<\/span> <span class=\"o\">-&gt;<\/span> <span class=\"kc\">None<\/span><span class=\"p\">:<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;Profile the test 
suite.&quot;&quot;&quot;<\/span>\n    <span class=\"n\">session<\/span><span class=\"o\">.<\/span><span class=\"n\">run<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;poetry&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;install&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;--no-dev&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">external<\/span><span class=\"o\">=<\/span><span class=\"kc\">True<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">profile_file<\/span> <span class=\"o\">=<\/span> <span class=\"n\">os<\/span><span class=\"o\">.<\/span><span class=\"n\">environ<\/span><span class=\"o\">.<\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;AUSTIN_FILE&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;tests.austin&quot;<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">install_with_constraints<\/span><span class=\"p\">(<\/span>\n        <span class=\"n\">session<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-black&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-cov&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-isort&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-flake8&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;pytest-mypy&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;types-requests&quot;<\/span><span class=\"p\">,<\/span>\n        <span class=\"s2\">&quot;types-orjson&quot;<\/span><span class=\"p\">,<\/span>\n    <span class=\"p\">)<\/span>\n    <span class=\"n\">session<\/span><span class=\"o\">.<\/span><span class=\"n\">run<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;austin&quot;<\/span><span class=\"p\">,<\/span> <span 
class=\"s2\">&quot;-so&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">profile_file<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;-i&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;1ms&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;pytest&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Here I've actually removed options to <code>pytest<\/code> which I don't care about, like\ncode coverage, as it's not what I want to profile this time. The <code>-s<\/code> option\ntells Austin to give us non-idle samples only, effectively giving us a profile\nof CPU time. I'm also allowing the Austin output file to be specified from the\nenvironment via the <code>AUSTIN_FILE<\/code> variable. This means that, if I want to\nprofile the tests and save the results to <code>tests.austin<\/code>, all I have to do is\ninvoke<\/p>\n<div class=\"highlight\"><pre><span><\/span>pipx install nox  <span class=\"c1\"># if not installed already<\/span>\n<span class=\"nv\">AUSTIN_FILE<\/span><span class=\"o\">=<\/span>tests.austin nox -rs profile\n<\/pre><\/div>\n\n\n<p>Once this completes, the profiling data will be sitting in <code>tests.austin<\/code>, ready\nto be analysed. 
With VS Code open on my local copy of <code>pyWhat<\/code>, I used the\n<a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=p403n1x87.austin-vscode\">Austin VS Code<\/a> extension to visualise the data in the form of a flame\ngraph and, by poking around, this is what caught my eye<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/user-images.githubusercontent.com\/20231758\/138076258-67c0e621-9055-477f-97f8-5754147267aa.png\" target=\"_blank\">\n    <img\n      src=\"https:\/\/user-images.githubusercontent.com\/20231758\/138076258-67c0e621-9055-477f-97f8-5754147267aa.png\"\n      alt=\"pyWhat tests before the fix\"\n    \/>\n  <\/a>\n<\/p>\n\n<p>The suspect here is the chunky <code>deepcopy<\/code> frame stack, which is quite noticeable.\nThe question, of course, is whether the deepcopy is really needed. Clicking on\nthe <code>check<\/code> frame takes us straight into the part of the code where the\n<code>deepcopy<\/code> is triggered. By inspecting the surrounding lines I couldn't really\nsee the need to make a <code>deepcopy<\/code> of the objects. It was originally a shallow\ncopy that was later turned into a deep copy, so I turned it back into a\nshallow copy with <a href=\"https:\/\/github.com\/bee-san\/pyWhat\/pull\/218\/files\">this PR<\/a>,\nran the tests and checked for the expected output. All was looking fine. In fact,\nthings now looked much, much better! Rerunning the profile session with the\nchange produced the following picture:<\/p>\n<p align=\"center\">\n  <a href=\"https:\/\/user-images.githubusercontent.com\/20231758\/138076271-6241b43b-d1f3-439d-9afc-3022ce2e231b.png\" target=\"_blank\">\n    <img\n      src=\"https:\/\/user-images.githubusercontent.com\/20231758\/138076271-6241b43b-d1f3-439d-9afc-3022ce2e231b.png\"\n      alt=\"pyWhat tests after the fix\"\n    \/>\n  <\/a>\n<\/p>\n\n<p>The <code>deepcopy<\/code> stacks have disappeared and the <code>check<\/code> frame is overall much\nslimmer! 
And so, just like that, a performance regression was found and\nfixed in just a few minutes :).<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"profiling"}}]},{"title":"Spy on Python down to the Linux kernel level","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/spy-on-python-down-to-the-linux-kernel-level.html","rel":"alternate"}},"published":"2021-09-27T11:56:00+01:00","updated":"2021-09-27T11:56:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2021-09-27:\/spy-on-python-down-to-the-linux-kernel-level.html","summary":"<p>Observability into native call stacks requires some compromise. In this post I explain what this actually means for a Python tool like Austin.<\/p>","content":"<p>When I first conceived the design of <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>, I swore\nto always adhere to two guiding principles:<\/p>\n<ul>\n<li>no dependencies other than the standard C library (and whatever system calls\n  the OS provides);<\/li>\n<li>minimal impact on the tracee, even under high sampling frequency.<\/li>\n<\/ul>\n<p>Let me elaborate on why I decided to stick to these two <em>rules<\/em>. The first one\nis mostly a choice of simplicity. The workhorse of Austin is the capability\nof reading the private memory of any process, be it a child process or not. Many\nplatforms provide APIs or system calls to do that, some with more security\ngotchas than others. 
Once Austin has access to that information, the rest is\nplain C code that makes sense of that data and provides a meaningful\nrepresentation to the user by merely calling <code>libc<\/code>'s <code>fprintf<\/code> in a loop.<\/p>\n<p>The second guiding principle is what everybody desires from observability tools.\nWe want to be able to extract as much information as possible from a running\nprogram, perturbing it as little as possible so as to avoid skewed data. Austin can\nmake this guarantee because reading VM memory does not require the tracee to be\nhalted. Furthermore, the fact that Python has a <a href=\"https:\/\/realpython.com\/python-gil\/\">GIL<\/a> implies that a simple\nPython application will run on at most one physical core. To be more precise, a\nnormal, pure-Python application would not spend more CPU time than wall-clock\ntime. Therefore, on machines with multiple cores, even if Austin ends up acting\nlike a busy loop at high sampling frequencies and hogging a physical core, there\nwould still be plenty of other cores to run the Python application unperturbed\nand unaware that it is being spied on. Even for <a href=\"https:\/\/docs.python.org\/3\/library\/multiprocessing.html\">multiprocess<\/a> applications,\nthe expected impact is minimal, for if you are running, say, a uWSGI server on a\n64-core machine, you wouldn't lose much if Austin hogs one of them. Besides, you\nprobably don't need to sample at very high frequencies (like once every 50\nmicroseconds), but you could be happy with, e.g. 1000 Hz, which is still pretty\nhigh, but would not cause Austin to require an entire core for itself.<\/p>\n<p>When you put these two principles together you get a tool that compiles down to\na single tiny binary and that has minimal impact on the tracee at runtime. The\nadded bonus is that it doesn't even require any instrumentation! 
These are\nsurely ideal features for an observability tool that make Austin very well\nsuited for running in a production environment.<\/p>\n<p>Unfortunately, Austin's strengths are also its limitations. What if our\napplication has parts written as Python extensions, e.g. native <a href=\"https:\/\/docs.python.org\/3\/extending\/extending.html\">C\/C++\nextensions<\/a>, <a href=\"https:\/\/cython.org\/\">Cython<\/a>, <a href=\"https:\/\/github.com\/PyO3\/pyo3\">Rust<\/a>, or even <a href=\"https:\/\/p403n1x87.github.io\/extending-python-with-assembly.html\">assembly<\/a>? By\nreading a process's private VM, Austin can only reconstruct the pure-Python call\nstacks. To unwind the native call stacks, Austin would need to use some heavier\nmachinery. Never mind that using a third-party library for that would violate\nthe first principle; the more serious issue here is that there is\ncurrently no way of avoiding the use of system calls like <a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/ptrace.2.html\"><code>ptrace(2)<\/code><\/a>\nfrom user-space. This would be a serious violation of the second principle. Why?\nBecause stack unwinding using <code>ptrace<\/code> requires threads to be halted, thus\ncausing a non-negligible impact on the tracee. Besides, stack unwinding is not\nexactly straightforward to implement on every platform.<\/p>\n<p>The compromise is <a href=\"https:\/\/github.com\/P403n1x87\/austin\/tree\/devel#native-frame-stack\">austinp<\/a>, a <em>variant<\/em> of Austin that can do native\nstack unwinding, <em>just<\/em> on Linux, using <a href=\"https:\/\/www.nongnu.org\/libunwind\/\"><code>libunwind<\/code><\/a> and <code>ptrace<\/code>.\nThis tool is to be used when you really need to have observability into native\ncall stacks, as the use of <code>ptrace<\/code> implies that the tracee will be impacted to\nsome extent. 
This is why, by default, <code>austinp<\/code> samples at a much lower rate.\nThis doesn't mean that you cannot use this tool in a production environment, but\nthat you should be aware of the potential penalties that come with it. Many\nobservability tools from the past relied on <code>ptrace<\/code> or similar to achieve their\ngoal, and <code>austinp<\/code> is just a (relatively) new entry into that list. More modern\nsolutions rely on technologies like <a href=\"https:\/\/ebpf.io\/\">eBPF<\/a> to provide efficient\nobservability into the Linux kernel, as well as into user-space.<\/p>\n<p>Speaking of the Linux kernel, eBPF is not the only way to retrieve kernel\nstacks. In the future we might have a variant of Austin that relies on eBPF for\nsome heavy lifting, but for now <code>austinp<\/code> leverages the information exposed by\n<a href=\"https:\/\/man7.org\/linux\/man-pages\/man5\/proc.5.html\"><code>procfs<\/code><\/a> to push stack unwinding down to the Linux kernel level. The\n<code>austinp<\/code> variant has the same CLI as Austin, but with the extra option <code>-k<\/code>,\nwhich can be used to sample kernel stacks alongside native ones. I have yet to\nfind a valid use-case for wanting to obtain kernel observability from a Python\nprogram, but I think this could be an interesting way to see how the interpreter\ninteracts with the kernel; and perhaps someone might find ways of inspecting the\nLinux kernel performance by coding a simple Python script rather than a more\nverbose C equivalent.<\/p>\n<p>You can find some examples of <code>austinp<\/code> in action on my <a href=\"https:\/\/twitter.com\/p403n1x87\">Twitter\naccount<\/a>. 
This, for example, is what you'd get for a simple\n<a href=\"https:\/\/scikit-learn.org\/stable\/\">scikit-learn<\/a> classification model, when you open the collected\nsamples via the <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=p403n1x87.austin-vscode\">Austin VS Code<\/a> extension:<\/p>\n<blockquote class=\"twitter-tweet\" data-theme=\"dark\"><p lang=\"en\" dir=\"ltr\">The latest development builds of <a href=\"https:\/\/twitter.com\/AustinSampler?ref_src=twsrc%5Etfw\">@AustinSampler<\/a>, including the austinp variant for native stack sampling on Linux are now available from <a href=\"https:\/\/twitter.com\/github?ref_src=twsrc%5Etfw\">@github<\/a> releases <a href=\"https:\/\/t.co\/nBfzm3mDng\">https:\/\/t.co\/nBfzm3mDng<\/a>. <a href=\"https:\/\/t.co\/IjVfAm1hRk\">pic.twitter.com\/IjVfAm1hRk<\/a><\/p>&mdash; Gabriele Tornetta \ud83c\uddea\ud83c\uddfa \ud83c\uddee\ud83c\uddf9 \ud83c\uddec\ud83c\udde7 (@p403n1x87) <a href=\"https:\/\/twitter.com\/p403n1x87\/status\/1435569784620470283?ref_src=twsrc%5Etfw\">September 8, 2021<\/a><\/blockquote>\n\n<p><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<p>If you want to give <code>austinp<\/code> a try you can follow the instructions on the\n<a href=\"https:\/\/github.com\/P403n1x87\/austin\/tree\/devel#native-frame-stack\">README<\/a> for compiling from sources, or download the pre-built binary\nfrom the <a href=\"https:\/\/github.com\/P403n1x87\/austin\/releases\/tag\/dev\">Development build<\/a>. 
In the future, <code>austinp<\/code> will be\navailable from ordinary releases too!<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"profiling"}}]},{"title":"How to bust Python performance issues","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/how-to-bust-python-performance-issues.html","rel":"alternate"}},"published":"2021-07-02T17:43:00+01:00","updated":"2021-07-02T17:43:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2021-07-02:\/how-to-bust-python-performance-issues.html","summary":"<p>In this short post I will try to convince you of how easy it is to find performance issues in your Python code and how you should develop the habit of profiling your code before you ship it.<\/p>","content":"<p>In my experience as a software engineer, I think it's still way too common to see\nproduction-ready code being shipped without having been profiled at least once.\nWith the current computing power and the ever increasing number of available\ncores per machine, it feels like a lot of preference is generally given to\nreadable and maintainable code at the cost of those extra microseconds. Whilst\nthis might make sense for extremely complex code-bases in low-level languages,\nthis is perhaps more of an issue with technologies like Python, where in general\nyou can still make some substantial optimisations while still retaining\nreadability and maintainability.<\/p>\n<p>To further prove to myself that profiling is still an important step in the\ndevelopment process that gets overlooked, I did the following experiment. I\ngrabbed a Python project at random, the first one that popped up on my GitHub\nfeed, looked at its test suite and profiled the test runs. 
The day I did this,\n<a href=\"https:\/\/github.com\/willmcgugan\/rich\">Rich<\/a> was sitting at the top of my GitHub\nfeed, so what follows is a trace-back of the steps that led me to contribute\n<a href=\"https:\/\/github.com\/willmcgugan\/rich\/pull\/1253\">this performance PR<\/a> to the\nproject. Besides Python, the other tools that I have used are <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin\n3<\/a> and VS Code with the <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=p403n1x87.austin-vscode\">Austin\nextension<\/a>\ninstalled.<\/p>\n<p>So first of all, let's make sure that our test environment is fully set up. If\nyou want to follow along, make sure that you have Austin and VS Code installed.\nI was using Windows the day I ran this experiment, so I had Austin installed\nwith <a href=\"https:\/\/community.chocolatey.org\/packages\/austin\/\">choco<\/a>, and the VS Code\nextension installed from the Visual Studio Marketplace. Let's get our hands on\nsome code now by cloning Rich and checking out the commit that was the tip of\nmaster for me at the time. Open up a terminal and type<\/p>\n<div class=\"highlight\"><pre><span><\/span>git clone https:\/\/github.com\/willmcgugan\/rich.git\ncd rich\ngit checkout ce4f18c\n<\/pre><\/div>\n\n\n<p>The project uses <a href=\"https:\/\/github.com\/python-poetry\/poetry\">poetry<\/a> so running\nthe test suite is as easy as invoking<\/p>\n<div class=\"highlight\"><pre><span><\/span>poetry install\npoetry run python -m pytest\n<\/pre><\/div>\n\n\n<p>Once we are sure that all the tests pass we are ready to start getting some\nprofiling data to see what's actually running. Version 3 of Austin comes with a\nreworked <code>sleepless<\/code> mode that can be used to get an estimate of CPU time\ninstead of wall time. One big advantage of using a tool like Austin is that we\ndo not have to make any changes to the code in order to get profiling data out\nof it. 
Besides, Austin runs out-of-process, which means that it won't have any\neffect on the code. Getting profiling data is as easy as invoking Austin just\nbefore the test run<\/p>\n<div class=\"highlight\"><pre><span><\/span>poetry run austin -so profile_master.austin python -m pytest -vv tests\n<\/pre><\/div>\n\n\n<blockquote>\n<p><strong>WARNING<\/strong> Here we can let Austin sample the whole <code>pytest<\/code> run because we\nhave checked beforehand that it only takes a few seconds to complete. <strong>DO\nNOT<\/strong> try the same exact thing with long-running test suites or you will end\nup with a massive sample file that would be hard to process. In such cases you\ncan either select a few tests, or run Austin with the <code>-x,--exposure<\/code> option\nto limit sampling to just a few seconds, and adjust the sampling interval with\nthe <code>-i,--interval<\/code> option as best suited.<\/p>\n<\/blockquote>\n<p>The <code>-s<\/code> option turns the <code>sleepless<\/code> mode on, which gives us only the on-CPU\nsamples, whereas the <code>-o<\/code> option specifies the output file. Once the test run\nterminates, our profiling data will be in <code>profile_master.austin<\/code>, ready to be\nanalysed with the Austin VS Code extension. We get the best experience if we\nstart VS Code from within the project's root directory as this allows us to\nbrowse the code while we look at the flame graph. So fire up VS Code from the\nterminal with<\/p>\n<div class=\"highlight\"><pre><span><\/span>code .\n<\/pre><\/div>\n\n\n<p>and activate the Austin extension by clicking on the <code>FLAME GRAPH<\/code> tab in the\nbottom panel. 
Sometimes you have to right-click inside the panel and click\nthe menu entry to fully activate the extension.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/austin-panel.png\"\n    alt=\"The Austin Flame Graph panel within VS Code\"\n  \/>\n<\/p>\n\n<p>At this point we are ready to load the profiling data that we have collected.\nClick on the <code>OPEN<\/code> button or press <kbd>CTRL<\/kbd> + <kbd>SHIFT<\/kbd> +\n<kbd>A<\/kbd> to bring up the open dialog and select <code>profile_master.austin<\/code>. The\nAustin VS Code extension will analyse all the collected samples and generate the\nflame graph.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/open-profile.gif\"\n    alt=\"Open a profile file\"\n  \/>\n<\/p>\n\n<p>The picture we get is certainly overwhelming at first, especially for those who\nare not familiar with how <code>pytest<\/code> works internally. This is perhaps a nice way\nto actually find out how <code>pytest<\/code> collects and runs tests. By poking around we\ndiscover that some of the test runs are under the <code>pytest_pyfunc_call<\/code> frame. If\nyou are struggling to find it, press <kbd>F<\/kbd> inside the flame graph to\nreveal the search input box, type <code>pytest_pyfunc_call<\/code> and hit <kbd>ENTER<\/kbd>.\nThe frames that match the search string will be highlighted in purple. Let's\nscroll until we find the largest one. 
When we click on it, it will expand to the\nfull width of the panel to give us a better idea of what lies underneath it, and\nthe corresponding portion of the source code will also appear in VS Code!<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/pytest_pyfunc_call.gif\"\n    alt=\"pytest_pyfunc_call\"\n  \/>\n<\/p>\n\n<p>We now have a better view of the tests that are being executed under this path.\nAt this point we can start looking for the largest leaf frames and see if we can\nmake any sense of them. When I first looked at this graph, one thing that\nquickly caught my eye was this particular stack.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/test_log.png\"\n    alt=\"test_log\"\n  \/>\n<\/p>\n\n<p>Clicking on the <code>test_log<\/code> frame reveals the test code in VS Code. Surprisingly,\nthe test has just a single call to <code>Console.log<\/code>, and the percent timing\nannotation generated by the Austin extension tells us that, of the whole on-CPU\ntime for the test suite, about 4.9% is spent on that single call!<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/test_log_code.png\"\n    alt=\"The code for the test_log test case\"\n  \/>\n<\/p>\n\n<p>Looking back at the flame graph, we realise that all the time in the <code>log<\/code> frame\nis spent calling <code>stack<\/code>. Clicking on the <code>log<\/code> frame reveals the source code\nfor the <code>Console.log<\/code> method and we can inspect how the information from the\nstack is used to generate the log entry. 
The line we are interested in is 1685,\nwhere we have<\/p>\n<div class=\"highlight\"><pre><span><\/span>            <span class=\"n\">caller<\/span> <span class=\"o\">=<\/span> <span class=\"n\">inspect<\/span><span class=\"o\">.<\/span><span class=\"n\">stack<\/span><span class=\"p\">()[<\/span><span class=\"n\">_stack_offset<\/span><span class=\"p\">]<\/span>\n<\/pre><\/div>\n\n\n<p>So <code>inspect.stack()<\/code> is called, which according to the flame graph does a lot of\nresolutions for each frame (see those calls to <code>findsource<\/code>, <code>getmodule<\/code>\netc...), none of which seems to be used in <code>Console.log<\/code>, and besides we just\npick one of the frames close to the top of the stack and chuck the rest away.\nThat's pretty expensive for a simple log call. Since I had some familiarity with\nthe <a href=\"https:\/\/docs.python.org\/3\/library\/inspect.html\"><code>inspect<\/code><\/a> module with my\nwork on Austin and other stuff, I knew there is (at least for CPython) the lower\nlevel method\n<a href=\"https:\/\/docs.python.org\/3\/library\/inspect.html#inspect.currentframe\"><code>currentframe<\/code><\/a>\nthat would give you essential information about the currently executing frame.\nFrom there you can navigate down the stack and stop at the frame of interest. In\nthis case we just need to take the parent frame of the current one, and we\nalready find all the information needed by the <code>Console.log<\/code> method. 
I made the\nchanges as part of the already mentioned PR\n<a href=\"https:\/\/github.com\/willmcgugan\/rich\/pull\/1253\">#1253<\/a>, so if you check that\ncode out and re-run the tests with<\/p>\n<div class=\"highlight\"><pre><span><\/span>poetry run austin -so profile_pr.austin python -m pytest -vv tests\n<\/pre><\/div>\n\n\n<p>and open the new profiling data in <code>profile_pr.austin<\/code> you will see that the\ntest case <code>test_log<\/code> has pretty much disappeared as it basically takes almost\nzero CPU time now.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/bust-perf-issues\/test_log_pr.png\"\n    alt=\"test_log with the performance change\"\n  \/>\n<\/p>\n\n<p>Instead we see <code>test_log_caller_frame_info<\/code>, which is the test case for the\ncompatibility utility for those Python implementations that do not implement\n<code>currentframe<\/code>. But with CPython, calling <code>Console.log<\/code> is now inexpensive\ncompared to the original implementation.<\/p>\n<p>See how easy it's been to find a performance issue. With the right tool we\ndidn't have to add any instrumentation to the code, especially in one we\nprobably had no familiarity with. In many cases you only understand your code\ntruly if you see it in action. So no more excuses for not profiling your code\nbefore you ship it! ;P<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"profiling"}},{"@attributes":{"term":"optimisation"}}]},{"title":"Looking Back at My Time at Avaloq","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/looking-back-at-my-time-at-avaloq.html","rel":"alternate"}},"published":"2021-02-04T16:37:00+00:00","updated":"2021-02-04T16:37:00+00:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2021-02-04:\/looking-back-at-my-time-at-avaloq.html","summary":"<p>My adventure with Avaloq started in August 2016 and ended in January 2021. 
It's time to look back at what has happened during these years.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#departing-from-academia\">Departing from Academia<\/a><\/li>\n<li><a href=\"#the-move\">The Move<\/a><\/li>\n<li><a href=\"#settling-in\">Settling In<\/a><\/li>\n<li><a href=\"#my-first-project\">My First Project<\/a><\/li>\n<li><a href=\"#wind-of-change\">Wind of Change<\/a><\/li>\n<li><a href=\"#shipping-containers\">Shipping Containers<\/a><\/li>\n<li><a href=\"#the-right-direction\">The Right Direction?<\/a><\/li>\n<li><a href=\"#social-life\">Social Life<\/a><\/li>\n<li><a href=\"#gabrexit\">Gabrexit<\/a><\/li>\n<li><a href=\"#references\">References<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"departing-from-academia\">Departing from Academia<\/h1>\n<p>When I started my job as a Software Engineer at Avaloq in 2016, I had just\ndecided to leave academia for good and move to what many would call a \"proper\"\njob. It's not that I wasn't caressing the idea of pursuing a career in academia,\nbut while I was applying to those few positions available that were compatible\nwith my PhD in <a href=\"http:\/\/theses.gla.ac.uk\/7203\/1\/2016tornettaphd.pdf\">Classification of\nC*-algebras<\/a>, only to\nreceive rejections back, the actual truth was revealed to me: I didn't really\nfancy going down that path. And there was, I found out, a multitude of reasons\nwhy I then decided to veer towards a completely different direction. The first\nsignal was that, rather than being upset, in many cases I was actually happy to\nbe rejected because, in the end, I didn't quite like either the kind of research\nI was signing up for, or the place I would have had to move to. Not only this,\nbut being a researcher in that field meant years and years of relocating from\nplace to place until, if you got that bit of luck to assist you, you got a\npermanent position somewhere. 
This is something I witnessed through some of\nthe people that I met during my PhD. This, I also realised, wasn't what I\nwas longing to go through, so yet another reason to move to a different career\npath that could allow me to settle somewhere and rely on a stable income. In the\nend, you can still do science in your own spare time and actually focus on what\nyou really like to do, rather than what the grant is for. Of course, this is not\nquite like doing research in its proper environment. In fact, it's quite far\nfrom it, since the time availability is limited because of your daily job, you\ndon't have direct access to the experts, either in person or through journal\nsubscriptions etc..., but it's certainly not impossible to do and, in fact, I\nhave been able to do it, as I shall tell you later on.<\/p>\n<h1 id=\"the-move\">The Move<\/h1>\n<p>I have always considered software development a hobby, something that I had\nstarted doing as a kid, out of curiosity and driven by my innate desire to know\nhow things work, and had never really considered it a potential career path.\nHowever, when my interest in academia started to subside, it was something that\nsurely deserved to be reconsidered. I've always been enthusiastic about Science\nin general, and Technology in particular, and a role in a tech company would have\ncertainly given me the chance to be in contact with the latest developments.\nWithin a week of job-hunting, I landed my first role as Software Engineer at\nAvaloq, back in 2016. This is where I started to adjust to a brand new world.<\/p>\n<p>The first thing I had to come to terms with was the stark difference between the\nrigour that I had become accustomed to during my PhD in Pure Mathematics and the\nsomewhat looser attitude in the workplace. There wasn't much attention paid to\nthe technical words used in meetings, which made many things look very confusing\nat first, as the same concepts were referred to by different names. 
But after the\ninitial \"shock\", things moved quite smoothly and I started enjoying what I was\ndoing. I could make use of my honed problem solving skills and prove myself a\ngood asset for my new team. Along with the excitement coming from the new\nchallenges, I got to hone my software design and engineering skills, which had\nbeen growing <em>in the wild<\/em> with my personal experiments up to that point.<\/p>\n<h1 id=\"settling-in\">Settling In<\/h1>\n<p>A few weeks into my new job I came to know that somebody was organising an\ninternal hackathon to work on some internal tooling ideas. I thought it was a\ngood idea to join in, not only to show commitment and score points, but also\nbecause I genuinely thought it was a good chance to get to work with people from\ndifferent teams and learn a good deal about other aspects of the company, the\nproduct, the workflows and the tools and technologies, that I wouldn't have\nknown otherwise. The project I decided to work on was a tool inspired by git\nblame for the sources hosted on the in-house source version control solution.\nThe hackathon took place a couple of weeks later and the team I joined to work\non said tool ended up winning. Not a bad start! But the real win for me was\nactually achieving the goal I was hoping to achieve from the very beginning,\nthat is, learning a good deal about the job quickly. What came out of this\ncompetition was far more than just winning a fancy nerdy t-shirt.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/avaloq\/hacky_tools_prize.jpg\"\n    alt=\"Hacky Tools Prize\"\n  \/>\n<\/p>\n\n<p>With time, the Python glue code that I contributed for the hackathon evolved\ninto an entire Python tooling framework that allowed me to write tools in no\ntime to help me with my daily tasks. But perhaps more importantly, some of those\ntools turned out to be useful to my colleagues as well. 
Branching sources across\ndifferent releases now took seconds rather than hours of manual, tedious and\nerror-prone work.<\/p>\n<p>The other thing that came out of this experience was that I got actively\ninvolved in promoting and organising more of these events for the office. But I\nwill get back to this later on.<\/p>\n<h1 id=\"my-first-project\">My First Project<\/h1>\n<p>When Doug, my only local teammate, realised that I was picking things up rather\nquickly, he challenged me with my very first issue only a few days into the job.\nBy the end of the first week I had fixed it. As a new joiner, I was\nexpected to spend the first three months on \"Education\" and to learn about the\nproduct, the tools, etc., and certainly nobody was expecting me to be\nproductive during that time. There is mandatory training that most of the new\njoiners need to go through, at the end of which there's an exam that is\nfundamental for passing probation. That marks the point when you're supposed to\nactually become useful to your teammates.<\/p>\n<p>Perhaps showing all that eagerness to learn didn't play too well in my favour,\nbecause the next challenge I received was the involvement in a rather big\nproject a few months into the role. Just kidding :). This is when I got to fly\nto Zurich to visit the Avaloq HQ and have planning sessions with the other\nteammate and the solution architect, who were both based there. The task was\nrather ambitious: rethink the customisation API to allow the customer to write\nless code when supporting multiple versions or variants of certain message\ntypes, in a way that is backward-compatible with the existing implementation.\nThe first hurdle for me was to try and make sense of all those requirements\nwhile I was still familiarising myself with the code-base and the tiny corner of the\nproduct I was supposed to look after. 
But this was a test I was not willing to\nfail, so I jumped right into the task and played my part for the team to deliver\nthe final solution to the customers. A few weeks before the end of the planned\nworking days we rolled out the enhanced API to our business teams, who helped\ncatch and fix a few minor bugs before they could get to our customers. Happy\ntimes \ud83c\udf89.<\/p>\n<h1 id=\"wind-of-change\">Wind of Change<\/h1>\n<p>About two years after I joined Avaloq, many things started to change. A\nnew Managing Director was hired to replace the previous one, who had decided to move to Zurich, and he brought some new ideas and a rather strong wind of\nchange. Many things were shaken and turned upside-down, although some of the changes were not always well received by the office. Long story short, he got fired a\nyear or so later for not being totally in line with the people above him, and\npresumably also because some of the change he brought with him didn't quite\nresonate with everybody in the office.<\/p>\n<p>Personally, I think we should give the MD some credit for some of the things he\ncampaigned for. One thing that I appreciated was the idea of introducing\n<em>clans<\/em>. Pick a topic, gather some people that are passionate about it and turn\nthat passion into action to make things better for you and the people around\nyou. This is how I got involved with the <em>Technology and Innovation clan<\/em>, to\nwhich I contributed the idea of regular internal hackathons as a way to give\npeople from different teams more chances to interact, break from their\ndaily routine, learn a new skill and, more importantly, have some fun together!\nOther members suggested introducing a <em>10% time initiative<\/em>, similar to what\nother companies do, with the aim of improving attractiveness and retention.\nIndeed, these were the times when the office population was at its\nmost stable; a sign that, perhaps, the clan had achieved its main goal. 
We were\nable to run three more internal hackathons afterwards, before being forced to\nstop due to the pandemic. Overall, the initiative was very successful, so\nmuch so that Zurich <em>stole<\/em> the idea and organised a similar event for the whole\nEMEA area, before opening to external hackathons. Since not everybody could travel\nto the HQ, every site was asked to contribute a <em>dream-team<\/em>, and\nI was happy to be picked as one of the representatives for the Edinburgh office.\nThat was the time when my functional manager, who was based in Zurich, took me\nto <a href=\"https:\/\/en.wikipedia.org\/wiki\/Fronalpstock_(Schwyz)\">Fronalpstock<\/a> over the\nweekend to give me a taste of the Swiss Alps.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/avaloq\/gab_on_fronalpstock.jpg\"\n    alt=\"Visiting Fronalpstock\"\n  \/>\n<\/p>\n\n<p>From a personal perspective, the 10% initiative gave me the chance to work on\nthe Python tooling framework that I mentioned earlier on, to which\nI also contributed many hours of my personal time. Ultimately, this turned into\n<a href=\"https:\/\/github.com\/P403n1x87\/sibilla\">Sibilla<\/a>, a sort of Python DAO\/ORM for\nOracle databases that allows you to write queries in a Pythonic way. My\nAvaloq-specific tooling framework was built on top of this general abstraction\nlayer and provided the basis for the many tools that I have built over the years\nto make my life easier at work. My Machine Learning tool for the classification\nof incoming issues was also based on this framework, but more on this other\ntopic later.<\/p>\n<h1 id=\"shipping-containers\">Shipping Containers<\/h1>\n<p>The MD wasn't the only major change that we were experiencing around 2018. 
That\nwas also the year when the Board of Architects decided that it was about time to\nstart <a href=\"https:\/\/www.avaloq.com\/en\/-\/avaloq-upgrades-open-software-architecture-designed-to-make-banks-more-agile\">moving towards the world of\nmicro-services<\/a>\nand put bits of the product into their own containers and pods. This is one of\nthe times when I had some disagreement with the way we were asked to\nroll out the changes for the parts that I was responsible for. Needless to say,\nwhat I'm about to tell reflects my very own opinion, to which we are all\nentitled :).<\/p>\n<p>These days, everybody who is maintaining a\n<a href=\"https:\/\/m.signalvnoise.com\/the-majestic-monolith\/\">monolith<\/a> is made to feel\nashamed of it. You either have a zoo of micro-services, or your architects\nshould all be shown the door for manifest incompetency. This new attitude seems\nto have generated a rush in many tech companies to split their products into bits\nand put them into containers, sometimes not even knowing why or how exactly. The\npart I was not in line with was the decision to turn the so-called <em>adapters<\/em>\n(sic), the tools that we provided to the customer to allow them to connect our\nproduct to third-party systems, into micro-services. In a sense, these tools\nwere already containerised; being Java applications, they had their own\ndependencies which were shipped in their own packages. Other big sub-products\nlooked like better candidates for \"demolition\" and containerisation, but after\nsome consideration they were left out of the discussion. 
Because of this\ndecision, integrating new code is still a bit of a struggle at times, since one\nof the many parts can still block integration testing for everybody else.<\/p>\n<p>Of all the projects I have been involved with during my time at Avaloq, the\ncontainerisation initiative is perhaps the one I am the least satisfied with.\nPerhaps it's just a matter of taste, but the result we got in the end didn't\nquite appeal to me. This does not mean that we delivered something broken to the\ncustomer. On the contrary, the adapters have never worked better, thanks also to\nmy personal efforts to include automated testing within the scope of the\nproject. But something about the new state of the code-base didn't quite resonate\nwith me. Had I had the freedom to choose a different architecture, I would\nhave probably gone a different route. But that's that.<\/p>\n<h1 id=\"the-right-direction\">The Right Direction?<\/h1>\n<p>Something that surprised me from the very beginning was the lack of a Data\nScience team within the company. In 2016, Machine Learning was already quite\nubiquitous, with some of the biggest achievements both in software and hardware\ngiving very satisfactory results in many fields. As someone coming from a\ncompletely different world, I knew absolutely nothing about FinTechs, but in my\nhead they were the prime candidates for adopting ML technologies. So while the\nmain focus was declaring war on the monoliths, I tried starting the Data Science\nfire, hoping that somebody would respond to that. What I did was to use the\nReading Circles to introduce myself and my colleagues (in particular those who,\nlike me, came from a different background and didn't know much about the field) to\nthe subject of Machine Learning and Data Science. The ultimate goal was to\nattract enough people to form a small group and put something together that\ncould get somebody on a more managerial level to blow on that flame. 
I thought I\nshould lead by example and so, after going through <a href=\"https:\/\/www.manning.com\/books\/deep-learning-with-python-second-edition\">Deep Learning with\nPython<\/a>\nduring a full session of one of the reading circles, I had an idea: why not use\nall the data about customer issues to assist in opening new ones against the\nright team?<\/p>\n<p>One problem that the Support team complained about from time to time was that\nsome issues would be opened against the wrong team, and therefore they lay\naround for too long before the right people could look at them. If only they had\na smart tool to help them re-assign these issues to the right team. With my\nnewly acquired knowledge about text classification, I thought that perhaps,\nhidden in all the issues that had been fixed in the past, and that had then\nbeen assigned to the right team, there was a statistical signal strong enough to\nmake a classification model work with decent accuracy. My first experiments\nwere based on bi-directional LSTM models that I had to train on my work laptop\nas I had no access to better hardware. As a consequence, I had to pick just a\nsmall chunk of the data so that I could actually see the result of training\nafter a reasonable amount of time. Once it became clear that there was a signal,\nI decided to go for a Naive Bayes approach on the full data, which was the only\nthing I could train with the resources at my disposal. I enjoyed working on\nthis project very much, for various reasons. The first is the fact that the\nmodel I was able to train turned out to produce useful classifications, and the\nprototype application that I ended up building on top of it proved very handy to\nthe Support team. 
Besides the emails in which they thanked me for making such a tool,\nI was also proud of being awarded an <em>Extraordinary Achievement Reward<\/em>, a\nreward that Avaloq gives to those who distinguish themselves within the\ncompany for achieving something out of the ordinary, as part of their bonus\nscheme. Another reason why I enjoyed this project is that it gave me the\nchance to investigate some ML topics in more depth, close to research level,\nwhich takes us to the point I was making at the beginning that, even though I\nhave left academia, I can still do research, and on my very own terms. I started\ninvestigating how the hierarchical structure of the teams within the company\ncould be used to increase the classification accuracy of my models, and\nthat's how I came to know of <a href=\"https:\/\/www.svkir.com\/publications.html\">Kiritchenko's\nwork<\/a> on the topic. This\ninvestigation led me not only to implement a <a href=\"https:\/\/github.com\/P403n1x87\/marvin\/wiki\/Hierarchical-Classification\"><em>local classifier per level<\/em> hierarchical\nclassification\nmodel<\/a>,\nbut also to an interesting explanation of a phenomenon of uncertainty increase\nin the complement approach to Naive Bayes [1].<\/p>\n<p>Secretly, I was hoping that the sort of visibility that I had gained with this\nproject could act as a catalyst for establishing a proper ML team here in\nEdinburgh, the place that has contributed a lot to AI and still does. I created\na <a href=\"https:\/\/en.wikipedia.org\/wiki\/Kaizen\">Kaizen<\/a> initiative (that's the name\nthat was chosen for a Jira board that acts as a sort of incubator for ideas)\nback in March 2020, with the aim of creating a small team to focus on smart\ninternal tooling, a way to unlock the potential of all the data actually owned\nby Avaloq. We had the kick-off meeting a couple of months before my departure\nand I do really hope that things can take off. 
I would regard that as some sort\nof legacy of my time spent there. The conditions, however, didn't quite seem to\nbe there, and this is also one of the reasons that, in the end, led me to take\nthe decision to move on with my life.<\/p>\n<h1 id=\"social-life\">Social Life<\/h1>\n<p>One great thing that I am surely going to miss about Avaloq is the absolutely\nfantastic atmosphere of the social events. The office in general, on a normal\nday, has that really good vibe that I think is rare to find. I have had the\nchance to work with really nice people from the moment I stepped in until the\nvery last day. The local team I was part of has been like a family to me. I like\ncue sports and the fact that we had a pool table in the office allowed me to\ncombine my passion with social activities to promote a nice and relaxed\natmosphere. I will surely miss those few games of pool after lunch with some of\nmy former colleagues.<\/p>\n<iframe width=\"560\" height=\"315\" src=\"https:\/\/www.youtube.com\/embed\/v5omd2llgNc\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe>\n\n<p>I have organised pool events of all sorts to bring people from different teams\ntogether, have a laugh at the table and enjoy a wee break from work. We have had\nfairly regular seasonal and team social events throughout the years, each one\nwith a theme, Xmas events, etc. This is what happened at the Halloween event\nthat I helped organise together with the rest of my local team.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/avaloq\/erica_orville.jpg\"\n    alt=\"CPython data structures\"\n  \/>\n<\/p>\n\n<p>No doubt I am going to miss all of this, especially now that my next role is\nremote. Surely I can find a co-working space where I can re-create some sort of\noffice experience, but it won't be like being in the office with the rest of the\nteam. 
However, on the plus side, I discovered that working from home has its\npros too. Even though Avaloq offers great flexibility in terms of remote\nworking, I have always preferred going to the office for that atmosphere that I\njust mentioned, and I thought I could never work remotely. Of course the\npandemic forced all of us inside, so that I was coerced into experiencing\nworking from home in the long term. I guess, like many other things in life, you\ncan get used to it, and indeed I did. So never say never in life. Perhaps in\nthe future I'll be able to sneak into a board games night at the old office :).<\/p>\n<h1 id=\"gabrexit\">Gabrexit<\/h1>\n<p>It was only when I was getting towards the end of my notice period that I started\nrealising the extent of my contributions at Avaloq. I am generally a modest\nperson and I don't like to brag about my achievements, but I think it's\nimportant that I try to look back at all the work that I have done and give\nmyself a pat on the back for each good thing that I have done. I have already\nmentioned the Python tooling framework. My manager asked me to find a new\ncaretaker of those projects before I left. A colleague of mine had helped me migrate\nthe original code-base from Python 2 to Python 3 and so he was the perfect\ncandidate to take over. This is another thing that never flew as high as I'd\nhoped, mostly because Python is not one of the \"official\" technologies adopted at\nAvaloq. However, many people started relying on some of the tools that I wrote,\nso it only makes sense to ensure that somebody can keep looking after them once\nI'm gone.<\/p>\n<p>The other hand-over meeting that I was asked to attend was about the ML project\nfor the classification of issues. As I said, some people liked the idea, and the\nprototype, very much, especially those that were in charge of designing the new\ncustomer support portal. 
They asked me to go briefly over the general\narchitecture so that they could continue working on it and take the project into\nproduction, so that customers could also start benefiting from recommended\nteams to assign the issue to as they type the problem description in the portal.<\/p>\n<p>Lastly, I have had a chat with the head of the technical writers. I had\nmentioned to the previous head that I had a prototype for a post-processing tool\nthat could simplify the way they keep the documentation in sync with\nsome of the latest developments on the main product. So I have also had a meeting\nto hand that prototype over to the tech writers team.<\/p>\n<p>Thus, as I was getting closer to my last day at Avaloq, I started realising that\nmy contributions, in the end, went beyond the confines of my team and spanned\nareas that were quite orthogonal to my duties as a Software Engineer for the\nmessaging part of the product, going from customer support to documentation. And\nif the Kaizen initiative does take off, that would be yet another seed that I\nwould have planted and that would hopefully bear my name for me to be remembered\nas somebody who didn't just play pool all the time, but who also did at least a\nfew good things for the company.<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>[1] Gabriele N. Tornetta, <em>Entropy methods for the confidence assessment of probabilistic classification models<\/em>, submitted to Statistica, 2020.<\/p>","category":[{"@attributes":{"term":"Work"}},{"@attributes":{"term":"work"}}]},{"title":"The Austin TUI Way to Resourceful Text-based User Interfaces","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/the-austin-tui-way-to-resourceful-text-based-user-interfaces.html","rel":"alternate"}},"published":"2020-10-26T10:08:00+00:00","updated":"2020-10-26T10:08:00+00:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2020-10-26:\/the-austin-tui-way-to-resourceful-text-based-user-interfaces.html","summary":"<p>The latest version of the Austin TUI project makes use of a custom XML resource file to describe and build the actual text-based UI, using <code>curses<\/code> as back-end. In this post we shall see how one can use the same technology to quickly build reactive TUIs using Python.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><\/li>\n<li><a href=\"#the-view-object\">The View Object<\/a><\/li>\n<li><a href=\"#the-widgets\">The Widgets<\/a><\/li>\n<li><a href=\"#updating-the-ui\">Updating the UI<\/a><\/li>\n<li><a href=\"#spicing-things-up-with-colours\">Spicing Things up with Colours<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>Due to the increasing popularity of the sample tools that were included with\nearlier versions of <a href=\"https:\/\/github.com\/p403n1x87\/austin\">Austin<\/a>, I have\ndecided to move them into their own dedicated repositories. The TUI, for\nexample, can now be found at <a href=\"https:\/\/github.com\/p403n1x87\/austin-tui\">Austin\nTUI<\/a>.<\/p>\n<p>Being, as I said, sample tools, the original coding wasn't very pleasant as the\nmain focus was on how to embed Austin into your application, rather than the\napplication itself. So the first step was to come up with a good design so that\nthe code would be tidy and easy to maintain. Austin TUI has also been my very\nfirst attempt at a serious TUI application. The standard approach in Python is\nwith the <a href=\"https:\/\/docs.python.org\/3\/howto\/curses.html\"><code>curses<\/code><\/a> module, but one\nthing that you learn quite quickly is that such a low-level API tends to make\nfor untidy-looking code pretty easily if you're not careful. 
There are some ways\nout of this, in the form of higher-level frameworks which offer you many cool\nfeatures and abstractions like <em>widgets<\/em>. None of them, as nice as they are,\nwere to my taste though.<\/p>\n<p>My previous experiences with UI have almost always revolved around the notion of\n<em>resource files<\/em>. That is, the various bits of the user interface, like the main\nwindow, configuration and about dialogs, were all described by some DSL living\nin the project folder as resource files. You can take\n<a href=\"https:\/\/glade.gnome.org\/\">Glade<\/a> as an example, which is also the tool that\ninspired the solution that I ended up developing and adopting for Austin TUI.\nWith Glade, a GTK UI is described by an XML document, which is then parsed at\nruntime to produce the actual UI. What I wanted for Austin TUI was something\nsimilar so that I could de-clutter the Python code of all the UI definition\nlogic, and focus only on the other aspects of a UI project, like event handling.\nI also wanted something that played nicely with the\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Model%E2%80%93view%E2%80%93controller\">MVC<\/a>\ndesign pattern and, as we shall see shortly, the main concept behind Austin TUI\nis indeed that of a <em>view<\/em>.<\/p>\n<h1 id=\"the-view-object\">The View Object<\/h1>\n<p>The central object of Austin TUI is the <code>View<\/code> object. This is responsible\nfor refreshing the interface as well as exposing the event handlers for events\nlike key presses. The view itself is not the UI, though; it contains a\nreference to it via the <code>root_widget<\/code>. 
The novelty in Austin TUI is that we want\nto partially build a view, or at least the UI part, using a resource file.<\/p>\n<p>So take a look at the following minimalist example.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;<\/span>\n\n<span class=\"nt\">&lt;aui:MinimalView<\/span> <span class=\"na\">xmlns:aui=<\/span><span class=\"s\">&quot;http:\/\/austin.p403n1x87.com\/ui&quot;<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;minimal_view&quot;<\/span><span class=\"nt\">&gt;<\/span>\n  <span class=\"nt\">&lt;aui:Window<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;main&quot;<\/span><span class=\"nt\">&gt;<\/span>\n    <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;label&quot;<\/span>\n      <span class=\"na\">text=<\/span><span class=\"s\">&quot;Hello World&quot;<\/span>\n      <span class=\"na\">align=<\/span><span class=\"s\">&quot;center&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;\/aui:Window&gt;<\/span>\n\n  <span class=\"cm\">&lt;!-- Signal mappings --&gt;<\/span>\n\n  <span class=\"nt\">&lt;aui:signal<\/span> <span class=\"na\">key=<\/span><span class=\"s\">&quot;q&quot;<\/span> <span class=\"na\">handler=<\/span><span 
class=\"s\">&quot;on_quit&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n\n<span class=\"nt\">&lt;\/aui:MinimalView&gt;<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>minimal-view.xml<\/em><\/p>\n<p>The above XML document describes a view with two main components: a <code>Window<\/code>\nelement, which provides the root node for the UI, and a <code>signal<\/code> to bind the\nmethod <code>on_quit<\/code> to the key <code>q<\/code>. The UI itself contains a single label that will\ndisplay the text <code>Hello World<\/code>. When we run an application that uses this UI, we\nexpect to see the text <code>Hello World<\/code> centred on the screen and we also expect to\nexit as soon as we press <code>Q<\/code>.<\/p>\n<p>So what do we need to make the above UI work? With the framework included in\nAustin TUI, all that we need to do is declare a subclass of <code>View<\/code> with the same\nname as the root node of the XML document, that is <code>MinimalView<\/code> in this case,\nand then build the actual view object using the <code>ViewBuilder<\/code> class, like so<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span 
class=\"normal\">24<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">austin_tui.view<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">View<\/span><span class=\"p\">,<\/span> <span class=\"n\">ViewBuilder<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">MinimalView<\/span><span class=\"p\">(<\/span><span class=\"n\">View<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_quit<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">raise<\/span> <span class=\"ne\">KeyboardInterrupt<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;quit signal&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">main<\/span><span class=\"p\">():<\/span>\n    <span class=\"k\">with<\/span> <span class=\"nb\">open<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;minimal-view.xml&quot;<\/span><span class=\"p\">)<\/span> <span class=\"k\">as<\/span> <span class=\"n\">view_stream<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">view<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ViewBuilder<\/span><span class=\"o\">.<\/span><span class=\"n\">from_stream<\/span><span class=\"p\">(<\/span><span class=\"n\">view_stream<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">view<\/span><span class=\"o\">.<\/span><span class=\"n\">open<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span 
class=\"n\">run_forever<\/span><span class=\"p\">()<\/span>\n    <span class=\"k\">except<\/span> <span class=\"ne\">KeyboardInterrupt<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">view<\/span><span class=\"o\">.<\/span><span class=\"n\">close<\/span><span class=\"p\">()<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Bye!&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">if<\/span> <span class=\"vm\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;__main__&quot;<\/span><span class=\"p\">:<\/span>\n    <span class=\"n\">main<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p>This is all the Python code required to build a minimalist TUI that displays a\nlabel on the screen and that quits whenever the user presses <code>Q<\/code>. Things we\nnotice from this example are<\/p>\n<ul>\n<li>we are using <code>asyncio<\/code> to handle user events on the UI; this means that we\n   can schedule our own asynchronous task without making the UI unresponsive;<\/li>\n<li>we use the static method <code>from_stream<\/code> of the <code>ViewBuilder<\/code> class to build a\n   UI from file; if the file resided inside a Python module we could have used\n   the <code>from_resource<\/code> static method instead for convenience;<\/li>\n<li>we call the <code>open<\/code> method on the view object to display the UI;<\/li>\n<li>we call the <code>close<\/code> method on the view object to close the UI and restore the\n   terminal to its original status.<\/li>\n<\/ul>\n<p>In this particular example, we make the <code>on_quit<\/code> event handler simulate a\nkeyboard interrupt and we handle <code>KeyboardInterrupt<\/code> to quit nicely.<\/p>\n<h1 id=\"the-widgets\">The Widgets<\/h1>\n<p>Austin TUI uses the widget abstraction too. 
Elements like <code>Window<\/code> and <code>Label<\/code>\nthat we have seen above are all exposed by the Austin TUI library via the\n<code>austin_tui.widgets.catalog<\/code> sub-module. A window is a simple logical container\nthat can hold a single child, spanning the full content of the window. In the\nexample above, the only child of the window is the <code>Label<\/code> widget identified by\nthe name <code>label<\/code>. If you want to add multiple children to the window, you would\nwant to include an intermediate <code>Box<\/code> container widget which acts as an HTML5\nflex container. Let's see how we can build a simple UI for a minimalist <code>top<\/code>\nutility.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span class=\"normal\">34<\/span>\n<span 
class=\"normal\">35<\/span>\n<span class=\"normal\">36<\/span>\n<span class=\"normal\">37<\/span>\n<span class=\"normal\">38<\/span>\n<span class=\"normal\">39<\/span>\n<span class=\"normal\">40<\/span>\n<span class=\"normal\">41<\/span>\n<span class=\"normal\">42<\/span>\n<span class=\"normal\">43<\/span>\n<span class=\"normal\">44<\/span>\n<span class=\"normal\">45<\/span>\n<span class=\"normal\">46<\/span>\n<span class=\"normal\">47<\/span>\n<span class=\"normal\">48<\/span>\n<span class=\"normal\">49<\/span>\n<span class=\"normal\">50<\/span>\n<span class=\"normal\">51<\/span>\n<span class=\"normal\">52<\/span>\n<span class=\"normal\">53<\/span>\n<span class=\"normal\">54<\/span>\n<span class=\"normal\">55<\/span>\n<span class=\"normal\">56<\/span>\n<span class=\"normal\">57<\/span>\n<span class=\"normal\">58<\/span>\n<span class=\"normal\">59<\/span>\n<span class=\"normal\">60<\/span>\n<span class=\"normal\">61<\/span>\n<span class=\"normal\">62<\/span>\n<span class=\"normal\">63<\/span>\n<span class=\"normal\">64<\/span>\n<span class=\"normal\">65<\/span>\n<span class=\"normal\">66<\/span>\n<span class=\"normal\">67<\/span>\n<span class=\"normal\">68<\/span>\n<span class=\"normal\">69<\/span>\n<span class=\"normal\">70<\/span>\n<span class=\"normal\">71<\/span>\n<span class=\"normal\">72<\/span>\n<span class=\"normal\">73<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;<\/span>\n\n<span class=\"nt\">&lt;aui:MiniTop<\/span> <span class=\"na\">xmlns:aui=<\/span><span class=\"s\">&quot;http:\/\/austin.p403n1x87.com\/ui&quot;<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;tui&quot;<\/span><span class=\"nt\">&gt;<\/span>\n  <span class=\"nt\">&lt;aui:Window<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;main&quot;<\/span><span class=\"nt\">&gt;<\/span>\n    <span class=\"nt\">&lt;aui:Box<\/span> <span 
class=\"na\">name=<\/span><span class=\"s\">&quot;main_box&quot;<\/span> <span class=\"na\">flow=<\/span><span class=\"s\">&quot;v&quot;<\/span><span class=\"nt\">&gt;<\/span>\n      <span class=\"nt\">&lt;aui:Box<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;overview_box&quot;<\/span> <span class=\"na\">flow=<\/span><span class=\"s\">&quot;h&quot;<\/span><span class=\"nt\">&gt;<\/span>\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;nprocs_label&quot;<\/span>\n          <span class=\"na\">text=<\/span><span class=\"s\">&quot;No. of procs.&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;16&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;nprocs&quot;<\/span>\n          <span class=\"na\">align=<\/span><span class=\"s\">&quot;center&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;8&quot;<\/span>\n          <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;cpu_label&quot;<\/span>\n          <span class=\"na\">text=<\/span><span class=\"s\">&quot;Total %CPU&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;16&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;cpu&quot;<\/span>\n          <span class=\"na\">align=<\/span><span class=\"s\">&quot;right&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;6&quot;<\/span>\n          <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n      <span class=\"nt\">&lt;\/aui:Box&gt;<\/span>\n\n      
<span class=\"nt\">&lt;aui:Box<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;table_header&quot;<\/span> <span class=\"na\">flow=<\/span><span class=\"s\">&quot;h&quot;<\/span><span class=\"nt\">&gt;<\/span>\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;own&quot;<\/span>\n          <span class=\"na\">text=<\/span><span class=\"s\">&quot;PID&quot;<\/span>\n          <span class=\"na\">align=<\/span><span class=\"s\">&quot;right&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;8&quot;<\/span>\n          <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span>\n          <span class=\"na\">reverse=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;proc_cpu&quot;<\/span>\n          <span class=\"na\">text=<\/span><span class=\"s\">&quot;%CPU&quot;<\/span>\n          <span class=\"na\">align=<\/span><span class=\"s\">&quot;center&quot;<\/span>\n          <span class=\"na\">width=<\/span><span class=\"s\">&quot;10&quot;<\/span>\n          <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span>\n          <span class=\"na\">reverse=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n        <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;cmdline&quot;<\/span>\n          <span class=\"na\">text=<\/span><span class=\"s\">&quot;COMMAND LINE&quot;<\/span>\n          <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span>\n          <span class=\"na\">reverse=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n      <span class=\"nt\">&lt;\/aui:Box&gt;<\/span>\n\n      <span class=\"nt\">&lt;aui:ScrollView<\/span> <span class=\"na\">name=<\/span><span 
class=\"s\">&quot;proc_view&quot;<\/span><span class=\"nt\">&gt;<\/span>\n        <span class=\"nt\">&lt;aui:Table<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;table&quot;<\/span> <span class=\"na\">columns=<\/span><span class=\"s\">&quot;3&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n      <span class=\"nt\">&lt;\/aui:ScrollView&gt;<\/span>\n\n      <span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;footer&quot;<\/span>\n        <span class=\"na\">text=<\/span><span class=\"s\">&quot;Press Q to exit.&quot;<\/span>\n        <span class=\"na\">align=<\/span><span class=\"s\">&quot;center&quot;<\/span>\n        <span class=\"na\">bold=<\/span><span class=\"s\">&quot;true&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n    <span class=\"nt\">&lt;\/aui:Box&gt;<\/span>\n  <span class=\"nt\">&lt;\/aui:Window&gt;<\/span>\n\n  <span class=\"cm\">&lt;!-- Signal mappings --&gt;<\/span>\n\n  <span class=\"nt\">&lt;aui:signal<\/span> <span class=\"na\">key=<\/span><span class=\"s\">&quot;q&quot;<\/span>         <span class=\"na\">handler=<\/span><span class=\"s\">&quot;on_quit&quot;<\/span>   <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;aui:signal<\/span> <span class=\"na\">key=<\/span><span class=\"s\">&quot;KEY_UP&quot;<\/span>    <span class=\"na\">handler=<\/span><span class=\"s\">&quot;on_up&quot;<\/span>     <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;aui:signal<\/span> <span class=\"na\">key=<\/span><span class=\"s\">&quot;KEY_DOWN&quot;<\/span>  <span class=\"na\">handler=<\/span><span class=\"s\">&quot;on_down&quot;<\/span>   <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;aui:signal<\/span> <span class=\"na\">key=<\/span><span class=\"s\">&quot;KEY_PPAGE&quot;<\/span> <span class=\"na\">handler=<\/span><span class=\"s\">&quot;on_pgup&quot;<\/span>   <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;aui:signal<\/span> <span 
class=\"na\">key=<\/span><span class=\"s\">&quot;KEY_NPAGE&quot;<\/span> <span class=\"na\">handler=<\/span><span class=\"s\">&quot;on_pgdown&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n\n  <span class=\"cm\">&lt;!-- Palette --&gt;<\/span>\n\n  <span class=\"nt\">&lt;aui:palette&gt;<\/span>\n    <span class=\"nt\">&lt;aui:color<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;pid&quot;<\/span>  <span class=\"na\">fg=<\/span><span class=\"s\">&quot;3&quot;<\/span>   <span class=\"nt\">\/&gt;<\/span>\n    <span class=\"nt\">&lt;aui:color<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;opt&quot;<\/span>  <span class=\"na\">fg=<\/span><span class=\"s\">&quot;4&quot;<\/span>   <span class=\"nt\">\/&gt;<\/span>\n    <span class=\"nt\">&lt;aui:color<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;cmd&quot;<\/span>  <span class=\"na\">fg=<\/span><span class=\"s\">&quot;10&quot;<\/span>  <span class=\"nt\">\/&gt;<\/span>\n    <span class=\"nt\">&lt;aui:color<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;args&quot;<\/span> <span class=\"na\">fg=<\/span><span class=\"s\">&quot;246&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n  <span class=\"nt\">&lt;\/aui:palette&gt;<\/span>\n\n<span class=\"nt\">&lt;\/aui:MiniTop&gt;<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>mini-top.xml<\/em><\/p>\n<p>This is a slightly bigger XML document where we have a few more nested widgets,\nas well as a new feature: the <code>palette<\/code> element. Let's start by looking at the\nUI part. We see that the child of the <code>Window<\/code> element is now a <code>Box<\/code> with\nvertical flow. This means that all the children that we add to this box will\nspan the whole width and pile up vertically. 
We use this element to divide the\nwindow into three parts: the top one will hold some summary stats; the middle\nwill hold the process table; the bottom part is just a label telling us how to\nquit the application.<\/p>\n<p>We see that the top part is just another <code>Box<\/code>, this time with horizontal flow.\nInside we have four labels, two of which have fixed content and act as actual\nlabels, describing the values that we will update. In this case we will keep\ntrack of the number of active processes and the total CPU load.<\/p>\n<p>The middle section of the UI is a <code>ScrollView<\/code>, which allows us to create widgets\nwithin it that overflow the actual terminal size. This is an abstraction that\nmakes for easy scrolling of overflowing content. In this example, inside the\n<code>ScrollView<\/code> we have a <code>Table<\/code> object with three columns; these will be the\nprocess ID, the CPU load for the process and its command line.<\/p>\n<p>To make the UI appealing to the eye, we shall make use of colours, and this is\nwhere the new <code>palette<\/code> element in the XML document steps in. This is used to\ngive a name to <code>curses<\/code> colour pairs. In this particular example we are changing\nthe foreground colour only, but in principle we could change the background as\nwell by setting the <code>bg<\/code> attribute. We'll see with the Python code below how to\neasily reference the colours in the palette.<\/p>\n<p>Before moving on though, I appreciate that there isn't much official\ndocumentation for the UI framework used by the Austin TUI project, and this post\nis a way to make up for that as much as possible for now. 
Looking at all the\nsample XML above you might be wondering where all those attributes come from.\nFor example, when we look at a <code>Label<\/code> element, we see<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nt\">&lt;aui:Label<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;label&quot;<\/span>\n  <span class=\"na\">text=<\/span><span class=\"s\">&quot;Hello World&quot;<\/span>\n  <span class=\"na\">align=<\/span><span class=\"s\">&quot;center&quot;<\/span> <span class=\"nt\">\/&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>The attributes <code>name<\/code>, <code>text<\/code> and <code>align<\/code> are precisely the arguments of the\n<code>__init__<\/code> method of the <code>Label<\/code> class. Hence, if you want to find out which\nattributes are available for a certain widget you will have to find it in the\nwidget collection and look at its constructor.<\/p>\n<blockquote>\n<p>Every widget has at least the <code>name<\/code> attribute. Beyond that, each widget requires\nits own set of attributes.<\/p>\n<\/blockquote>\n<p>Widgets are discovered by the view builder in a dynamic way, which means that\nyou could sub-class <code>Widget<\/code> and make your own widgets. If you do so and want\nto reference your custom widget in the XML document, all you have to do is use\nthe class name as the element name. 
For example, if you have something like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">austin_tui.widgets<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Widget<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">MyWidget<\/span><span class=\"p\">(<\/span><span class=\"n\">Widget<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">name<\/span><span class=\"p\">,<\/span> <span class=\"n\">some_attribute<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">super<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"n\">name<\/span><span class=\"p\">)<\/span>\n        <span class=\"o\">...<\/span>\n<\/pre><\/div>\n\n\n<p>then in the XML document you would have something like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nt\">&lt;aui:MyWidget<\/span> <span class=\"na\">name=<\/span><span class=\"s\">&quot;mywidget_instance&quot;<\/span> <span class=\"na\">some-attribute=<\/span><span class=\"s\">&quot;42&quot;<\/span> <span class=\"nt\">&gt;<\/span>\n  <span class=\"cm\">&lt;!-- any potential children here --&gt;<\/span>\n<span class=\"nt\">&lt;\/aui:MyWidget&gt;<\/span>\n<\/pre><\/div>\n\n\n<h1 id=\"updating-the-ui\">Updating the UI<\/h1>\n<p>For the model part of the MVC pattern we don't have much to say here as that\nwill depend upon your application. 
In this post I will show you how to make a\nminimalist top application, so we can have a look at what the model code could look\nlike in this case and then move over to the more interesting bit, which is the <code>C<\/code> in\nMVC.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">psutil<\/span>\n\n\n<span class=\"n\">data<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">sorted<\/span><span class=\"p\">(<\/span>\n    <span class=\"p\">[<\/span>\n        <span class=\"p\">(<\/span>\n            <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;pid&quot;<\/span><span class=\"p\">],<\/span>\n            <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;cpu_percent&quot;<\/span><span class=\"p\">],<\/span>\n            <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;cmdline&quot;<\/span><span class=\"p\">],<\/span>\n        <span class=\"p\">)<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">p<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">psutil<\/span><span class=\"o\">.<\/span><span class=\"n\">process_iter<\/span><span class=\"p\">([<\/span><span class=\"s2\">&quot;pid&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;cpu_percent&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;cmdline&quot;<\/span><span class=\"p\">])<\/span>\n    <span class=\"p\">],<\/span>\n    <span class=\"n\">key<\/span><span class=\"o\">=<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">:<\/span> <span class=\"n\">x<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">],<\/span>\n    <span class=\"n\">reverse<\/span><span class=\"o\">=<\/span><span 
class=\"kc\">True<\/span><span class=\"p\">,<\/span>\n<span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>If you are familiar with the <code>psutil<\/code> module, you will see that we are iterating\nover all active processes to extract some information from them. In this case we\nare interested in the PIDs, the CPU usage and the command line. These three\nvalues will be used to fill in the three columns of the <code>Table<\/code> widget that we\nintroduced in <code>mini-top.xml<\/code>. As we will see, there is some more code that we\ncould put into the model part of our design, but for simplicity we will embed\nthat into the controller. The code below shows you how to update the UI every 2\nseconds with fresh system data.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span 
class=\"normal\">34<\/span>\n<span class=\"normal\">35<\/span>\n<span class=\"normal\">36<\/span>\n<span class=\"normal\">37<\/span>\n<span class=\"normal\">38<\/span>\n<span class=\"normal\">39<\/span>\n<span class=\"normal\">40<\/span>\n<span class=\"normal\">41<\/span>\n<span class=\"normal\">42<\/span>\n<span class=\"normal\">43<\/span>\n<span class=\"normal\">44<\/span>\n<span class=\"normal\">45<\/span>\n<span class=\"normal\">46<\/span>\n<span class=\"normal\">47<\/span>\n<span class=\"normal\">48<\/span>\n<span class=\"normal\">49<\/span>\n<span class=\"normal\">50<\/span>\n<span class=\"normal\">51<\/span>\n<span class=\"normal\">52<\/span>\n<span class=\"normal\">53<\/span>\n<span class=\"normal\">54<\/span>\n<span class=\"normal\">55<\/span>\n<span class=\"normal\">56<\/span>\n<span class=\"normal\">57<\/span>\n<span class=\"normal\">58<\/span>\n<span class=\"normal\">59<\/span>\n<span class=\"normal\">60<\/span>\n<span class=\"normal\">61<\/span>\n<span class=\"normal\">62<\/span>\n<span class=\"normal\">63<\/span>\n<span class=\"normal\">64<\/span>\n<span class=\"normal\">65<\/span>\n<span class=\"normal\">66<\/span>\n<span class=\"normal\">67<\/span>\n<span class=\"normal\">68<\/span>\n<span class=\"normal\">69<\/span>\n<span class=\"normal\">70<\/span>\n<span class=\"normal\">71<\/span>\n<span class=\"normal\">72<\/span>\n<span class=\"normal\">73<\/span>\n<span class=\"normal\">74<\/span>\n<span class=\"normal\">75<\/span>\n<span class=\"normal\">76<\/span>\n<span class=\"normal\">77<\/span>\n<span class=\"normal\">78<\/span>\n<span class=\"normal\">79<\/span>\n<span class=\"normal\">80<\/span>\n<span class=\"normal\">81<\/span>\n<span class=\"normal\">82<\/span>\n<span class=\"normal\">83<\/span>\n<span class=\"normal\">84<\/span>\n<span class=\"normal\">85<\/span>\n<span class=\"normal\">86<\/span>\n<span class=\"normal\">87<\/span>\n<span class=\"normal\">88<\/span>\n<span class=\"normal\">89<\/span>\n<span class=\"normal\">90<\/span>\n<span 
class=\"normal\">91<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">psutil<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">austin_tui.view<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">View<\/span><span class=\"p\">,<\/span> <span class=\"n\">ViewBuilder<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">format_cmdline<\/span><span class=\"p\">(<\/span><span class=\"n\">cmdline<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">cmdline<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">return<\/span> <span class=\"s2\">&quot;&quot;<\/span>\n    <span class=\"n\">cmd<\/span><span class=\"p\">,<\/span> <span class=\"o\">*<\/span><span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"n\">cmdline<\/span>\n    <span class=\"n\">args<\/span> <span class=\"o\">=<\/span> <span class=\"s2\">&quot; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span>\n        <span class=\"p\">[<\/span><span class=\"n\">arg<\/span> <span class=\"k\">if<\/span> <span class=\"n\">arg<\/span><span class=\"o\">.<\/span><span class=\"n\">startswith<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;-&quot;<\/span><span class=\"p\">)<\/span> <span class=\"k\">else<\/span> <span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;b&gt;&lt;opt&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">arg<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/opt&gt;&lt;\/b&gt;&quot;<\/span> <span class=\"k\">for<\/span> <span class=\"n\">arg<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">args<\/span><span class=\"p\">]<\/span>\n    <span class=\"p\">)<\/span>\n    <span class=\"k\">return<\/span> <span class=\"sa\">f<\/span><span 
class=\"s2\">&quot;&lt;cmd&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">cmd<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/cmd&gt; &lt;args&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">args<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/args&gt;&quot;<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">MiniTop<\/span><span class=\"p\">(<\/span><span class=\"n\">View<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_quit<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">raise<\/span> <span class=\"ne\">KeyboardInterrupt<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;quit signal&quot;<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_pgdown<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">scroll_down<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">table<\/span><span class=\"o\">.<\/span><span class=\"n\">height<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">refresh<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"kc\">False<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_pgup<\/span><span 
class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">scroll_up<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">table<\/span><span class=\"o\">.<\/span><span class=\"n\">height<\/span> <span class=\"o\">-<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">refresh<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"kc\">False<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_up<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">scroll_up<\/span><span class=\"p\">()<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">refresh<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"kc\">False<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_down<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span><span class=\"o\">=<\/span><span class=\"kc\">None<\/span><span class=\"p\">):<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span 
class=\"o\">.<\/span><span class=\"n\">scroll_down<\/span><span class=\"p\">()<\/span>\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">proc_view<\/span><span class=\"o\">.<\/span><span class=\"n\">refresh<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"kc\">False<\/span>\n\n    <span class=\"k\">async<\/span> <span class=\"k\">def<\/span> <span class=\"nf\">update<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">while<\/span> <span class=\"kc\">True<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">data<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">sorted<\/span><span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span>\n                    <span class=\"p\">(<\/span>\n                        <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;pid&quot;<\/span><span class=\"p\">],<\/span>\n                        <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;cpu_percent&quot;<\/span><span class=\"p\">],<\/span>\n                        <span class=\"n\">p<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">[<\/span><span class=\"s2\">&quot;cmdline&quot;<\/span><span class=\"p\">],<\/span>\n                    <span class=\"p\">)<\/span>\n                    <span class=\"k\">for<\/span> <span class=\"n\">p<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">psutil<\/span><span class=\"o\">.<\/span><span class=\"n\">process_iter<\/span><span class=\"p\">([<\/span><span class=\"s2\">&quot;pid&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;cpu_percent&quot;<\/span><span 
class=\"p\">,<\/span> <span class=\"s2\">&quot;cmdline&quot;<\/span><span class=\"p\">])<\/span>\n                <span class=\"p\">],<\/span>\n                <span class=\"n\">key<\/span><span class=\"o\">=<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">:<\/span> <span class=\"n\">x<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">],<\/span>\n                <span class=\"n\">reverse<\/span><span class=\"o\">=<\/span><span class=\"kc\">True<\/span><span class=\"p\">,<\/span>\n            <span class=\"p\">)<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">table<\/span><span class=\"o\">.<\/span><span class=\"n\">set_data<\/span><span class=\"p\">(<\/span>\n                <span class=\"p\">[<\/span>\n                    <span class=\"p\">(<\/span>\n                        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">markup<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;pid&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">pid<\/span><span class=\"si\">:<\/span><span class=\"s2\">8d<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/pid&gt;&quot;<\/span><span class=\"p\">),<\/span>\n                        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">markup<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;b&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">cpu<\/span><span class=\"si\">:<\/span><span class=\"s2\">^10.2f<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/b&gt;&quot;<\/span><span class=\"p\">),<\/span>\n                        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">markup<\/span><span class=\"p\">(<\/span><span class=\"n\">format_cmdline<\/span><span class=\"p\">(<\/span><span class=\"n\">cmdline<\/span><span 
class=\"p\">)),<\/span>\n                    <span class=\"p\">)<\/span>\n                    <span class=\"k\">for<\/span> <span class=\"n\">pid<\/span><span class=\"p\">,<\/span> <span class=\"n\">cpu<\/span><span class=\"p\">,<\/span> <span class=\"n\">cmdline<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">data<\/span>\n                <span class=\"p\">]<\/span>\n            <span class=\"p\">)<\/span>\n\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">nprocs<\/span><span class=\"o\">.<\/span><span class=\"n\">set_text<\/span><span class=\"p\">(<\/span><span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">data<\/span><span class=\"p\">)))<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">cpu<\/span><span class=\"o\">.<\/span><span class=\"n\">set_text<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">psutil<\/span><span class=\"o\">.<\/span><span class=\"n\">cpu_percent<\/span><span class=\"p\">()<\/span><span class=\"si\">:<\/span><span class=\"s2\">3.2f<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">table<\/span><span class=\"o\">.<\/span><span class=\"n\">draw<\/span><span class=\"p\">()<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">root_widget<\/span><span class=\"o\">.<\/span><span class=\"n\">refresh<\/span><span class=\"p\">()<\/span>\n\n            <span class=\"k\">await<\/span> <span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">sleep<\/span><span class=\"p\">(<\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">def<\/span> <span 
class=\"nf\">main<\/span><span class=\"p\">():<\/span>\n    <span class=\"k\">with<\/span> <span class=\"nb\">open<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;mini-top.austinui&quot;<\/span><span class=\"p\">)<\/span> <span class=\"k\">as<\/span> <span class=\"n\">austinui<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">view<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ViewBuilder<\/span><span class=\"o\">.<\/span><span class=\"n\">from_stream<\/span><span class=\"p\">(<\/span><span class=\"n\">austinui<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">view<\/span><span class=\"o\">.<\/span><span class=\"n\">open<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"n\">loop<\/span> <span class=\"o\">=<\/span> <span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span>\n\n    <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">loop<\/span><span class=\"o\">.<\/span><span class=\"n\">create_task<\/span><span class=\"p\">(<\/span><span class=\"n\">view<\/span><span class=\"o\">.<\/span><span class=\"n\">update<\/span><span class=\"p\">())<\/span>\n        <span class=\"n\">loop<\/span><span class=\"o\">.<\/span><span class=\"n\">run_forever<\/span><span class=\"p\">()<\/span>\n    <span class=\"k\">except<\/span> <span class=\"ne\">KeyboardInterrupt<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">view<\/span><span class=\"o\">.<\/span><span class=\"n\">close<\/span><span class=\"p\">()<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Bye!&quot;<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">if<\/span> <span class=\"vm\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;__main__&quot;<\/span><span class=\"p\">:<\/span>\n    <span class=\"n\">main<\/span><span 
class=\"p\">()<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p>There are quite a few new things that we need to explain here. First of all we\nsee that we implement <code>MiniTop<\/code> as a sub-class of <code>View<\/code> and we declare all the\nrequired event handlers. We now have an <code>update<\/code> asynchronous method which we\nuse to update the UI every 2 seconds. For this to work, we need to create a task\nusing this method before starting the <code>asyncio<\/code> event loop (line 83). On line 43\nwe have the model logic that we just saw above and the interesting part starts\nafter that. On line 55 we set the collected tabular data to the <code>table<\/code> widget.\nYou would have guessed at this point that the way to reference a widget declared\nby the XML document is via attribute access on the view object. In this case,\nthe <code>Table<\/code> element has the <code>name<\/code> attribute set to <code>table<\/code>. Therefore we can\nreference it inside the <code>MiniTop<\/code> instance via <code>self<\/code>, i.e. <code>self.table<\/code>. On\nlines 66-67 we do a similar thing, i.e. we update the value of the <code>nprocs<\/code> and\n<code>cpu<\/code> labels with the number of processes and the total CPU usage respectively. At\nthis point, no update is displayed on screen and this is for efficiency reasons.\nOnce we have modified all the widgets that needed to be updated we can force a\nredraw by calling the <code>draw<\/code> method. This merely updates some buffers in memory,\nso in order to flush the changes to screen we have to make a call to the\n<code>refresh<\/code> method on a window-like object. In this case the simplest thing is to\njust refresh the whole root widget (line 70).<\/p>\n<p>You might have also noticed that most of the event handlers are explicitly\nreturning <code>False<\/code>. 
That's because we are manually forcing a refresh of the\n<code>ScrollView<\/code> widget (lines 23, 28, 33, 38) and therefore we return <code>False<\/code> to\navoid propagating the refresh request further up the widget hierarchy for\nperformance. Here we only needed to refresh the <code>ScrollView<\/code>; we could have\nomitted the manual refresh and returned <code>True<\/code> with the same result, but the\nwindow that would get refreshed would be the root one.<\/p>\n<h1 id=\"spicing-things-up-with-colours\">Spicing Things up with Colours<\/h1>\n<p>The last thing we need to have a look at are those mysterious calls to the\n<code>markup<\/code> method and the <code>format_cmdline<\/code> helper function. This is where we find\nreferences to the <code>palette<\/code> element in the XML document. Let's have a closer\nlook at line 58 for example.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">markup<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;&lt;pid&gt;<\/span><span class=\"si\">{<\/span><span class=\"n\">pid<\/span><span class=\"si\">:<\/span><span class=\"s2\">8d<\/span><span class=\"si\">}<\/span><span class=\"s2\">&lt;\/pid&gt;&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>The <code>markup<\/code> method of a <code>View<\/code> object is a convenience method for creating\nstrings with multiple attributes, like foreground\/background colour, boldface,\nreversed, using an XML-like markup syntax. In this particular case, we want to\nwrite the PID on screen using the colour pair with name <code>pid<\/code> declared inside the\n<code>palette<\/code> element of the XML document. On line 59 we use the <code>&lt;b&gt;<\/code> tag to make\nthe CPU usage bold. For the command line we do something more complex with the\n<code>format_cmdline<\/code> function. 
If the command line is non-empty, we use the <code>cmd<\/code>\ncolour for the actual command, and the colour <code>args<\/code> for the rest of the\narguments, with the exception of option values, which are highlighted with the\ncolour <code>opt<\/code> and with boldface. This spares us from having to manually split a\nstring into chunks with different formatting.<\/p>\n<p>When you run the code above you should see something similar to the screenshot\nbelow appearing in your terminal.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/mini_top.png\"\n    alt=\"CPython data structures\"\n  \/>\n<\/p>\n\n<p>To quit the application, simply press <code>Q<\/code> as suggested at the bottom.<\/p>\n<p>You can find a working example of this minimalistic top utility on\n<a href=\"https:\/\/github.com\/p403n1x87\/mini-top\">GitHub<\/a> if you want to play around with\nit and familiarise yourself a bit more with the Austin TUI way to resourceful text-based\nuser interfaces.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"tui"}}]},{"title":"An Overview of Monads in Haskell","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/an-overview-of-monads-in-haskell.html","rel":"alternate"}},"published":"2020-09-17T16:18:00+01:00","updated":"2020-09-17T16:18:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2020-09-17:\/an-overview-of-monads-in-haskell.html","summary":"<p>Monads are arguably one of the most important concepts in functional programming. 
In this post we pave the way to understanding how this purely category theoretical object finds its place in a language like Haskell.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><\/li>\n<li><a href=\"#functors\">Functors<\/a><\/li>\n<li><a href=\"#applicative-functors\">Applicative Functors<\/a><\/li>\n<li><a href=\"#monads\">Monads<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>If you are learning a functional programming language like Haskell, sooner or\nlater you will find yourself dealing with the concept of <strong>monad<\/strong>. You probably\nknow already that there is a fair bit of category theory embedded in the\nlanguage, and this comes as no surprise at all. Indeed, functional programming\nlanguages were developed with the goal of making lambda calculus practical and,\nas it turns out, type theory has many relations to category theory.<\/p>\n<p>In this post we make our way to the concept of monads in Haskell starting from\nthe notion of functor. The ultimate goal is to motivate the use of the term\nmonad by showing that we indeed have a monadic structure, but also give a\nsomewhat rigorous justification of the fact that \"all told, monads are just\nmonoids in the category of endofunctors\".<\/p>\n<h1 id=\"functors\">Functors<\/h1>\n<p>A typical pattern of functional programming is that of applying a function to a\nlist element-wise. 
There are many ways of doing this, but the most general one\nis that of using the <code>map<\/code> function<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">map<\/span><span class=\"w\"> <\/span><span class=\"ow\">::<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">a<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"n\">a<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"n\">b<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>That is, <code>map<\/code> takes a function <code>a -&gt; b<\/code> and a list of type <code>[a]<\/code> to produce a\nlist of type <code>[b]<\/code> by simply applying the function to each entry of the first\nlist. Two important properties of the function <code>map<\/code> follow from this\ndefinition. The first involves its interaction with the identity function <code>id ::\na -&gt; a<\/code> which simply evaluates to its sole argument. Indeed,<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">map<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>in the sense that both sides represent the same function. Now, if we have two\nfunctions, <code>f :: a -&gt; b<\/code> and <code>g :: b -&gt; c<\/code>, it doesn't matter whether we first\ncompute the composition <code>g . f<\/code> and apply <code>map<\/code> to it, or compose <code>map g<\/code> with\n<code>map f<\/code>. 
The net result is that we obtain a function that sends a list of type\n<code>[a]<\/code> to a list of type <code>[c]<\/code> whose elements are precisely given by\nthe image of each element of the initial list through the composition <code>g . f<\/code>.\nWe can express this fact with the identity<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">map<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">g<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">map<\/span><span class=\"w\"> <\/span><span class=\"n\">g<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">map<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Let's look back at what we have here. In more abstract terms, we can think of\nthe operation of putting square brackets around a type as some sort of map from\nthe set of Haskell types into itself, namely <span class=\"math\">\\([\\ \\cdot\\ ] : a \\mapsto [a]\\)<\/span>, for\nany Haskell type <span class=\"math\">\\(a\\)<\/span>. Observe further that the type of the function <code>map<\/code> is\nindeed <code>map :: (a -&gt; b) -&gt; ([a] -&gt; [b])<\/code>. That is to say that <code>map<\/code> sends a\nfunction <code>a -&gt; b<\/code> to a function <code>[a] -&gt; [b]<\/code>, and satisfies the two properties\nabove.<\/p>\n<p>We may be tempted, at this point, to introduce the Haskell category\n<span class=\"math\">\\(\\mathsf{Hask}\\)<\/span>, whose objects are the Haskell types, and whose arrows are the\nHaskell functions, i.e. those types that can be put into the form <code>a -&gt; b<\/code> for\nsome Haskell types <code>a<\/code> and <code>b<\/code>. 
If we do so, we then recognise that the pair\n(<code>[]<\/code>, <code>map<\/code>) defines a functor from <span class=\"math\">\\(\\mathsf{Hask}\\)<\/span> into itself. That's because\n<code>[]<\/code> maps objects to objects (lists) and <code>map<\/code> maps functions to functions\n(between lists). That these mappings are functorial follows immediately from the\ntwo properties that we have observed earlier.<\/p>\n<p>The next abstraction step is in realising that there is nothing special in the\n(<code>[]<\/code>, <code>map<\/code>) pair. For if all we want are endofunctors on <span class=\"math\">\\(\\mathsf{Hask}\\)<\/span>, all\nwe really need is a <em>parametrised<\/em> type <code>T<\/code> and a map <code>fmap :: (a -&gt; b) -&gt; T a\n-&gt; T b<\/code> that <em>plays nicely<\/em> with <code>T<\/code>, in the sense that it satisfies the\nfunctoriality properties<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>and<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">fmap<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">g<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">g<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>for any functions <code>f<\/code> and <code>g<\/code>. 
We then see that the list example we have\nanalysed above is just the special case where <code>T a = [a]<\/code> and <code>fmap = map<\/code>.<\/p>\n<p>To get a feel of what functors mean in Haskell (or more generally in functional\nprogramming), we should regard the parametrised type <code>T<\/code> as a sort of\n<em>container<\/em> type. A list, of course, is an example of a container, as it\ncontains multiple instances of a certain type in an ordered fashion. But <code>T<\/code> can\nalso be, e.g., a rooted tree with values of type <code>a<\/code> on each of its nodes. So,\ngenerally, <code>T<\/code> is some sort of container structure that accommodates multiple\nvalues of the parameter type <code>a<\/code>.<\/p>\n<p>Notice now how the definition of <code>fmap<\/code> depends on the parametrised type <code>T<\/code>.\nThis means that the first step in defining an endofunctor over <span class=\"math\">\\(\\mathsf{Hask}\\)<\/span>\nis to produce one such parametrised type <code>T<\/code>. But once we have one, can we find\nan <code>fmap<\/code> such that (<code>T<\/code>, <code>fmap<\/code>) is a functor? The answer to this question very\nmuch depends on the nature of <code>T<\/code>, but what we can be certain of is that once we\nhave found one such <code>fmap<\/code>, then it is <strong>unique<\/strong>. Why? This is a consequence of\nthe so-called <em>parametricity<\/em> result, which derives from parametric\npolymorphism. The signature <code>fmap :: (a -&gt; b) -&gt; T a -&gt; T b<\/code> implies universal\nquantifiers for the types <code>a<\/code> and <code>b<\/code>. That is to say, for any types <code>a<\/code> and\n<code>b<\/code>, <code>fmap<\/code> sends a function <code>a -&gt; b<\/code> to a function <code>T a -&gt; T b<\/code>. 
The key\nobservation is that, because of this parametric dependency on the types <code>a<\/code> and\n<code>b<\/code>, the function <code>fmap<\/code> cannot act in a way that depends on a particular choice\nof the types, but can depend at most on the container structure described by the\nparametrised type <code>T<\/code>.<\/p>\n<p>In order to understand this concept, let's take a step back and consider again\nthe special case of lists. Suppose that we have another candidate <code>fmap'<\/code> for an\n<code>fmap<\/code> other than <code>map<\/code>. Since the types are arbitrary, such a new candidate can\nonly interact with the list structure, i.e. apply the first argument, that is,\nthe function <code>f :: a -&gt; b<\/code>, to each element of the list of type <code>[a]<\/code> to produce\n<code>[b]<\/code>, and perhaps do a reshuffling. Now, if we apply this new candidate to the\nidentity <code>id :: a -&gt; a<\/code>, we get precisely the reshuffling bit <code>s = fmap' id<\/code>. On\nthe other hand, <code>fmap'<\/code> must satisfy the functor properties, and in particular\n<code>fmap' id = id<\/code>, whence <code>s = id<\/code>. 
That is to say that <code>fmap'<\/code> cannot actually do\nany reshuffling, so it must coincide with <code>map<\/code>.<\/p>\n<p>The general argument works in the same way, we only need to replace the list\nstructure with a general one <code>T<\/code> and argue that by the functor properties <code>fmap<\/code>\ncan only map a function to each element of type <code>a<\/code> inside <code>T a<\/code> while\npreserving the container structure described by <code>T<\/code>.<\/p>\n<blockquote>\n<p>More elegantly, this result could have been obtained using the <em>free theorem<\/em>\nassociated to the type <code>(a -&gt; b) -&gt; (T a -&gt; T b)<\/code>; see <a href=\"http:\/\/citeseerx.ist.psu.edu\/viewdoc\/download?doi=10.1.1.38.9875&amp;rep=rep1&amp;type=pdf\">(Wadler,\n1989)<\/a>\nfor the details.<\/p>\n<\/blockquote>\n<h1 id=\"applicative-functors\">Applicative Functors<\/h1>\n<p>We have seen that functors are a generalisation of the common pattern of mapping\na function over the elements of a list. Can we generalise to mapping of\nfunctions with more than one argument? That is to say, given that (<code>T<\/code>, <code>fmap<\/code>)\ndescribes a functor, can we find, e.g., <code>fmap2 :: (a -&gt; b -&gt; c) -&gt; (T a -&gt; T b\n-&gt; T c)<\/code>, that satisfies some reasonable coherence properties? A first\nobservation is that the type <code>a -&gt; b -&gt; c<\/code> is identical to <code>a -&gt; (b -&gt; c)<\/code>, just\nby definition. If we apply <code>fmap<\/code> to any instances of this type we would get\nsomething of type <code>T a -&gt; T (b -&gt; c)<\/code>, which isn't quite what we want. <em>But<\/em>, if\nwe had a function <code>&lt;*&gt; :: T (a -&gt; b) -&gt; (T a -&gt; T b)<\/code> we would get closer, but\nnot quite there yet. 
Indeed, we could now define <code>fmap2<\/code> like so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">fmap2<\/span><span class=\"w\"> <\/span><span class=\"ow\">::<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">a<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">c<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">a<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">c<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">fmap2<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>that is, we apply <code>fmap f<\/code> to the first argument <code>x<\/code> and then\napply <code>&lt;*&gt;<\/code> to the result to obtain a function that we can evaluate on <code>y<\/code> to get\na result of type <code>T c<\/code>. 
Of course, we can now repeat these steps to define\n<code>fmap3 :: (a -&gt; b -&gt; c -&gt; d) -&gt; (T a -&gt; T b -&gt; T c -&gt; T d)<\/code> by using <code>fmap2<\/code> as a\nstarting point, that is<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">fmap3<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"w\"> <\/span><span class=\"n\">z<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">fmap2<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">z<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>and carry on, to get a whole family of maps <code>fmapn<\/code>. 
Using infix notation for\n<code>&lt;*&gt;<\/code> and assuming associativity to the left, we could then write the general\ncase as<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">fmapn<\/span><span class=\"w\"> <\/span><span class=\"ow\">::<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">a1<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">...<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">an<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">a1<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">...<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">an<\/span><span class=\"w\"> <\/span><span class=\"ow\">-&gt;<\/span><span class=\"w\"> <\/span><span class=\"kt\">T<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">fmapn<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x1<\/span><span class=\"w\"> <\/span><span class=\"o\">...<\/span><span class=\"w\"> <\/span><span class=\"n\">xn<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x1<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">...<\/span><span class=\"w\"> <\/span><span 
class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">xn<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>This is now starting to look just like function application, except that we are\nnot making that very explicit with the first argument <code>x1<\/code> of type <code>T a1<\/code>. If we\nwanted to fix that, we would have to find, e.g., a function <code>p<\/code> such that <code>fmap\nf x1 = (p f) &lt;*&gt; x1<\/code>. What's the type of <code>p<\/code>? We see that <code>p :: a -&gt; T a<\/code> would\nwork well here, so if we could find such a function we could push the base of\nthe recursion back and define <code>fmap<\/code> itself in terms of <code>p<\/code> and <code>&lt;*&gt;<\/code>. Then we\ncould easily generalise functors to functions of an arbitrary number of\narguments by simply mapping the function through <code>p<\/code> and applying arguments with <code>&lt;*&gt;<\/code>.<\/p>\n<p>In Haskell, it is customary to call the function <code>p<\/code> with the name <code>pure<\/code>; any\nfunctor for which one can define both <code>pure<\/code> and <code>&lt;*&gt;<\/code> is then called an\n<em>applicative<\/em> functor. The importance of these special objects does not come\nsolely from the fact that they generalise the notion of mapping to functions of\nmore than one argument. Indeed, they represent an important step toward\n<em>effectful<\/em> programming inside a purely functional programming language. Perhaps\nwe can appreciate this better with a concrete example based on the functor\n<code>Maybe<\/code>. Its values can be used to represent the success or failure of an\noperation. For example, when an operation fails, we can simply return\n<code>Nothing<\/code>, else we return <code>Just x<\/code>, where <code>x<\/code> is the valid result of the\noperation. 
Haskell makes <code>Maybe<\/code> into an applicative functor by default by\ndefining<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">pure<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"kt\">Just<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">Nothing<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"kr\">_<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"kt\">Nothing<\/span><span class=\"w\"><\/span>\n<span class=\"p\">(<\/span><span class=\"kt\">Just<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">mx<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">mx<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>We see from the above definitions and the recursive nature of applicative\nfunctors that the occurrence of an invalid value like\n<code>Nothing<\/code> during the computation is automatically propagated to the end result,\nwith no need to put checks in place for each argument of the function <code>f<\/code>.<\/p>\n<p>Before moving on, we should spend some more time looking back at the identity\n<code>(pure f) &lt;*&gt; x = fmap f x<\/code> and recalling that <code>fmap<\/code> must satisfy some\ncoherence properties that make it part of the description of a functor. As we\nhave said earlier, we expect <code>pure<\/code> and <code>&lt;*&gt;<\/code> to satisfy some coherence\nproperties themselves if they are to work as they are supposed to, and we see\nwhy that is. 
One such property comes for free, viz.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"p\">(<\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>since <code>fmap id x = x<\/code>. The other properties that are required of <code>pure<\/code> and\n<code>&lt;*&gt;<\/code> are<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">pure<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"n\">f<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">x<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">$<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">pure<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">.<\/span><span 
class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">y<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">z<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">y<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;*&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>and all ensure that <code>pure<\/code> pretty much delivers what it promises. For example,\nthe first of the above three properties is just a way of ensuring that <code>pure<\/code>\nembeds ordinary function application into the effectful programming realm in an\nunsurprising way.<\/p>\n<blockquote>\n<p>Deeper ties with Category Theory are presented in detail in the original\npaper <a href=\"http:\/\/www.staff.city.ac.uk\/~ross\/papers\/Applicative.html\">(McBride &amp; Paterson,\n2008)<\/a>. There it is\nshown how one can give a symmetric definition of <code>&lt;*&gt;<\/code> and show that an\napplicative functor is just a <em>lax monoidal functor<\/em>.<\/p>\n<\/blockquote>\n<h1 id=\"monads\">Monads<\/h1>\n<p>Monads have been introduced to crystallise yet another common pattern in\neffectful programming that is not quite captured by either the Functor or the\nApplicative \"patterns\".<\/p>\n<p>Consider the case of a program that consists of a series of steps that need to\nbe executed in order, and such that the output of one is used as input for the\nnext one. 
A classical example is a function <code>f<\/code> that requires the result of a\ndivision <code>g x y = x `div` y<\/code>. It is clear that if <code>g<\/code> receives a <code>0<\/code> as second\nargument during the execution of the program, we are in an exceptional situation\nthat somehow we need to handle. But we can hardly do so if <code>g<\/code> is of type, say,\n<code>g :: Int -&gt; Int -&gt; Int<\/code>. Lacking support for catching and reacting to\nexceptions, we need a way to <em>signal<\/em> that something went wrong and propagate\nthat to the end.<\/p>\n<p>The first obvious thing to do is to rewrite <code>g<\/code> in such a way that its type is\n<code>g :: Int -&gt; Int -&gt; Maybe Int<\/code>, so that <code>g _ 0 = Nothing<\/code> and <code>g x y = Just (x `div`\ny)<\/code>. Then, for every function <code>f<\/code> that receives the output of <code>g<\/code> as input, we\nwould have to go through the tedious process of checking whether the value it\nreceived is valid or not. If those functions have been coded already, then we\nhave a problem that is not much fun to solve.<\/p>\n<p>How about we check the value that a function receives from another function,\npropagating any failure and otherwise continuing with the normal execution? With\nthis approach, all we need is a <em>binding<\/em> function\n<code>&gt;&gt;= :: Maybe a -&gt; (a -&gt; Maybe b) -&gt; Maybe b<\/code> that checks its first argument\nand applies the second argument, a function, to the value it holds only if it\nmakes sense to do so, propagating any \"bad\" value otherwise. It is clear, just by looking at the involved types,\nthat applicative functors are not of much use in this case, since a function like\n<code>g<\/code> is not <em>pure<\/em>, given that it can produce a side effect like <code>Nothing<\/code>.\nHence, what we have here is a different pattern: the <strong>monad<\/strong> pattern.<\/p>\n<p>The notion of <em>monad<\/em> comes from category theory and arises from adjunctions. 
But\nbefore we can make contact with the Haskell notion of monads that was given\nabove, we need to replace the bind operator <code>&gt;&gt;= :: M a -&gt; (a -&gt; M b) -&gt; M b<\/code>\n(here <code>M<\/code> is a parametrised type that we want to turn into a monad; hence <code>M<\/code> is\na functor) with the function <code>join :: M (M a) -&gt; M a<\/code>. Together with <code>pure<\/code> (or\n<code>return<\/code> we should say), it provides the structure that makes the endofunctor\n<code>M<\/code> a monad. In a nutshell, what we want to do is prove that <code>(M, pure, join)<\/code>\nhas the monad structure. And we shall also see in what sense a monad can be\ndefined as a \"monoid in the category of endofunctors\", as Mac Lane put it.<\/p>\n<p>First things first, let's see how the <code>join<\/code> function relates to the bind\noperator <code>&gt;&gt;=<\/code>. You can convince yourself that the definition<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">join<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">x<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;&gt;=<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>has the correct type. As to what this is good for from a practical point of\nview, recall the example of the function <code>g :: Int -&gt; Int -&gt; Maybe Int<\/code> that we\nsaw before. We argued that we couldn't apply the applicative pattern because <code>g<\/code>\nis not a pure function and that the issue boiled down to <code>g<\/code> having the \"wrong\"\ntype. For if we used <code>fmap2 :: (a -&gt; b -&gt; c) -&gt; (Maybe a -&gt; Maybe b -&gt; Maybe\nc)<\/code>, we would end up with <code>fmap2 g :: Maybe Int -&gt; Maybe Int -&gt; Maybe (Maybe\nInt)<\/code>. 
But now, if we curry and apply <code>join<\/code> we can <em>cure<\/em> that double boxing\n<code>Maybe (Maybe Int)<\/code> to get back a <code>Maybe Int<\/code> instead.<\/p>\n<p>Let's now look at the coherence properties that must be satisfied by <code>join<\/code>. The\nfirst one we note is its interaction with itself<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">join<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">join<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">join<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">join<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>which is quite easily understood if we look at the case of lists. Here the\n<code>join<\/code> function does what we would expect, considering its name: it flattens the\nlist of lists <code>[[a]]<\/code> into a single list <code>[a]<\/code> obtained by concatenation. 
The\nsecond coherence property that we look at involves the interaction between\n<code>join<\/code> and <code>pure<\/code> (or <code>return<\/code>), viz.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">join<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">join<\/span><span class=\"w\"> <\/span><span class=\"o\">.<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">fmap<\/span><span class=\"w\"> <\/span><span class=\"n\">pure<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"ow\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">id<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>which can be understood as saying that when we embed a value of type <code>M a<\/code> into\nthe \"pure\" part of <code>M (M a)<\/code>, then <code>join<\/code> \"unboxes\" the structure and gives us\nthe initial value back.<\/p>\n<p>Now, thanks to the two properties above, we can easily recognise a monad\nstructure for the triple <code>(M, pure, join)<\/code>, with <code>M<\/code> the endofunctor and <code>pure<\/code>\nand <code>join<\/code> representing the required natural transformations.<\/p>\n<p>In what sense can we regard a monad as a monoid? The general qualitative answer\nis that the map <code>pure<\/code> is akin to a unit and <code>join<\/code> is akin to the binary\noperation on a monoid. But if we look at the type of <code>join<\/code>, which is <code>M (M a)\n-&gt; M a<\/code>, we don't see any sign of a Cartesian product. Instead, we have some\nsort of composition of functors, whence the second part of the qualitative\nanswer, i.e. that the Cartesian product should be replaced with \"composition\".\nBut what kind of composition? 
Let's clarify these points a bit.<\/p>\n<p>Let's abandon Haskell functors for now, since it is pretty clear that the last\n\"pattern\" that we have described has every right to be called a <em>monad<\/em>, and\nlook at the matter from the perspective of category theory. Given functors <span class=\"math\">\\(F, G : C \\to D\\)<\/span>, <span class=\"math\">\\(J, K : D \\to E\\)<\/span> and natural transformations <span class=\"math\">\\(\\alpha : F \\Rightarrow\nG\\)<\/span> and <span class=\"math\">\\(\\beta : J \\Rightarrow K\\)<\/span>, we can construct a new natural transformation\n<span class=\"math\">\\(\\beta \\circ \\alpha\\)<\/span> between the functors <span class=\"math\">\\(J \\circ F\\)<\/span> and <span class=\"math\">\\(K \\circ G\\)<\/span> according\nto the following commutative diagram<\/p>\n<div class=\"math\">$$\\require{AMScd}\n\\begin{CD}\n  F(X) @&gt;\\alpha_X&gt;&gt; G(X)\\\\\n  @VJVV @VJVV\\\\ (J\\circ F)(X) @&gt;J(\\alpha_X)&gt;&gt;\n(J\\circ G)(X) @&gt;\\beta_{G(X)}&gt;&gt; (K\\circ G)(X)\n\\end{CD}$$<\/div>\n<p>That is, we define the natural transformation <span class=\"math\">\\(\\beta\\circ\\alpha : J \\circ F\n\\Rightarrow K \\circ G\\)<\/span> as having components<\/p>\n<div class=\"math\">$$(\\beta \\circ \\alpha)_X = \\beta_{G(X)} \\circ J(\\alpha_X)$$<\/div>\n<p>Using <span class=\"math\">\\(\\eta\\)<\/span> and <span class=\"math\">\\(\\mu\\)<\/span> as short-hand notation for the natural transformations\n<code>pure<\/code> and <code>join<\/code>, we see that the two coherence properties above can be stated\nin the form of the commutative diagrams<\/p>\n<div class=\"math\">$$\\require{AMScd}\n\\begin{CD}\n  M \\circ M \\circ M @&gt;\\iota_M\\circ \\mu&gt;&gt; M \\circ M\\\\\n  @V\\mu \\circ \\iota_MVV @V\\mu VV\\\\\n  M \\circ M @&gt;\\mu&gt;&gt; M\n\\end{CD}$$<\/div>\n<p>and<\/p>\n<div class=\"math\">$$\\require{AMScd}\n\\begin{CD}\n  M \\circ I @&gt; \\iota_M \\circ \\eta &gt;&gt; M \\circ M @&lt; \\eta\\circ\\iota_M 
&lt;&lt; I \\circ\n    M\\\\\n  @| @V \\mu VV @|\\\\\n  M @= M @= M\n\\end{CD}$$<\/div>\n<p>where <span class=\"math\">\\(\\iota_M\\)<\/span> denotes of course the identity natural transformation. Note\nthat, in terms of components, the two diagrams are equivalent to the relations<\/p>\n<div class=\"math\">$$\\forall X,\\ \\mu_X(M(\\mu_X)) = \\mu_X(\\mu_{M(X)})$$<\/div>\n<p>and<\/p>\n<div class=\"math\">$$\\forall X,\\ \\mu_X(M(\\eta_X)) = \\mu_X(\\eta_{M(X)}) = \\mathrm{id}_{M(X)},$$<\/div>\n<p>which we can easily translate back into the original properties for the Haskell\n<code>pure<\/code> and <code>join<\/code>.<\/p>\n<p>The above diagrams should now make more precise the meaning of Mac Lane's sentence<\/p>\n<blockquote>\n<p>All told, a monad in <span class=\"math\">\\(X\\)<\/span> is just a monoid in the category of endofunctors of\n<span class=\"math\">\\(X\\)<\/span>, with product <span class=\"math\">\\(\\times\\)<\/span> replaced by composition of endofunctors and unit\nset by the identity endofunctor.<\/p>\n<\/blockquote>\n<p>Indeed we see that <span class=\"math\">\\(\\mu\\)<\/span> and <span class=\"math\">\\(\\eta\\)<\/span> have diagrams similar to the analogous\nconcepts in monoids, with the difference that functor composition <span class=\"math\">\\(\\circ\\)<\/span> is now\neverywhere we would expect a Cartesian product <span class=\"math\">\\(\\times\\)<\/span>.<\/p>\n<script type=\"text\/javascript\">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {\n    var align = \"center\",\n        indent = \"0em\",\n        linebreak = \"false\";\n\n    if (false) {\n        align = (screen.width < 768) ? \"left\" : align;\n        indent = (screen.width < 768) ? \"0em\" : indent;\n        linebreak = (screen.width < 768) ? 
'true' : linebreak;\n    }\n\n    var mathjaxscript = document.createElement('script');\n    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';\n    mathjaxscript.type = 'text\/javascript';\n    mathjaxscript.src = 'https:\/\/cdnjs.cloudflare.com\/ajax\/libs\/mathjax\/2.7.3\/latest.js?config=TeX-AMS-MML_HTMLorMML';\n\n    var configscript = document.createElement('script');\n    configscript.type = 'text\/x-mathjax-config';\n    configscript[(window.opera ? \"innerHTML\" : \"text\")] =\n        \"MathJax.Hub.Config({\" +\n        \"    config: ['MMLorHTML.js'],\" +\n        \"    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'none' } },\" +\n        \"    jax: ['input\/TeX','input\/MathML','output\/HTML-CSS'],\" +\n        \"    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js'],\" +\n        \"    displayAlign: '\"+ align +\"',\" +\n        \"    displayIndent: '\"+ indent +\"',\" +\n        \"    showMathMenu: true,\" +\n        \"    messageStyle: 'normal',\" +\n        \"    tex2jax: { \" +\n        \"        inlineMath: [ ['\\\\\\\\(','\\\\\\\\)'] ], \" +\n        \"        displayMath: [ ['$$','$$'] ],\" +\n        \"        processEscapes: true,\" +\n        \"        preview: 'TeX',\" +\n        \"    }, \" +\n        \"    'HTML-CSS': { \" +\n        \"        availableFonts: ['STIX', 'TeX'],\" +\n        \"        preferredFont: 'STIX',\" +\n        \"        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} },\" +\n        \"        linebreaks: { automatic: \"+ linebreak +\", width: '90% container' },\" +\n        \"    }, \" +\n        \"}); \" +\n        \"if ('default' !== 'default') {\" +\n            \"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n            \"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n        \"}\";\n\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(configscript);\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);\n}\n<\/script>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"haskell"}},{"@attributes":{"term":"category theory"}},{"@attributes":{"term":"functional programming"}}]},{"title":"Deterministic and Statistical Python Profiling","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/deterministic-and-statistical-python-profiling.html","rel":"alternate"}},"published":"2019-05-05T17:35:00+01:00","updated":"2019-05-05T17:35:00+01:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2019-05-05:\/deterministic-and-statistical-python-profiling.html","summary":"<p>If you want to be sure that your applications are working optimally, then sooner or later you will end up turning to profiling techniques to identify and correct potential issues with your code. In this post, I discuss some of the current profiling tools and techniques for Python. The official documentation has a <a href=\"https:\/\/docs.python.org\/3\/library\/profile.html\">whole section<\/a> on the subject, but we shall go beyond that and have a look at some alternative solutions, especially in the area of sampling profilers.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#brief-introduction-to-profiling\">Brief Introduction to Profiling<\/a><\/li>\n<li><a href=\"#python-profiling\">Python Profiling<\/a><ul>\n<li><a href=\"#standard-python-profiling\">Standard Python Profiling<\/a><\/li>\n<li><a href=\"#a-look-under-the-bonnet\">A Look Under the Bonnet<\/a><\/li>\n<li><a href=\"#statistical-profiling\">Statistical Profiling<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#enter-austin\">Enter Austin<\/a><ul>\n<li><a href=\"#on-your-marks\">On Your Marks<\/a><\/li>\n<li><a href=\"#flame-graphs-with-austin\">Flame Graphs with Austin<\/a><\/li>\n<li><a href=\"#the-tui\">The TUI<\/a><\/li>\n<li><a href=\"#web-austin\">Web Austin<\/a><\/li>\n<li><a href=\"#write-your-own\">Write Your Own<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"brief-introduction-to-profiling\">Brief Introduction to Profiling<\/h1>\n<p>Let's start with a quick introduction to what <em>profiling<\/em> is. <em>Profiling<\/em> is a\nrun-time program analysis technique. Generally, a certain level of\n<em>instrumentation<\/em> is required to retrieve some kind of <em>tracing<\/em> information\nwhile the program is running. 
This is usually in the form of tracing\ninstructions interleaved with the lines of your source code, such as debug\nstatements, usually enriched with timestamp information or other\nrelevant details, like memory usage.<\/p>\n<p>One normally distinguishes between two main categories of profilers:<\/p>\n<ul>\n<li><em>event-based<\/em> (or <em>deterministic<\/em>)<\/li>\n<li><em>statistical<\/em> (or <em>sampling<\/em>).<\/li>\n<\/ul>\n<p>Profilers in the first category make use of <em>hooks<\/em> that allow registering event\ncallbacks. At the lowest level, these hooks are provided directly by the\noperating system and allow you to trace events like function calls and returns.\nVirtual machines and interpreters, like the JVM and CPython, provide <em>software<\/em>\nhooks instead, for generally the same events, but also for language-specific\nfeatures, like class loading for instance. The reason why profilers in this\ncategory are called <em>deterministic<\/em> is that, by listening to the various events,\nyou can get a deterministic view of what is happening inside your application.<\/p>\n<p>In contrast, <em>statistical<\/em> profilers tend to provide approximate figures only,\nobtained by, e.g., sampling the call stack at regular intervals of time. These\nsamples can then be analysed statistically to provide meaningful metrics for the\nprofiled target.<\/p>\n<p>One might get the impression that deterministic profilers are a better choice\nthan statistical profilers. However, both categories come with pros and cons.\nFor example, statistical profilers usually require less instrumentation, if any\nat all, and introduce less overhead in the profiled target program. 
Therefore,\nif a statistical profiler can guarantee a certain accuracy on the metrics that\ncan be derived from its samples, then it is usually a better choice than a more\naccurate deterministic profiler that can introduce higher overhead.<\/p>\n<h1 id=\"python-profiling\">Python Profiling<\/h1>\n<p>There is quite a plethora of profiling tools available for Python, both\ndeterministic and statistical. The official documentation describes the use of\nthe Python profiling interface through two different implementations:<\/p>\n<ul>\n<li><a href=\"https:\/\/docs.python.org\/3\/library\/profile.html#module-profile\"><code>profile<\/code><\/a>,<\/li>\n<li><a href=\"https:\/\/docs.python.org\/3\/library\/profile.html#module-cProfile\"><code>cProfile<\/code><\/a>.<\/li>\n<\/ul>\n<p>The former is a pure Python module and, as such, introduces more overhead than\nthe latter, which is a C extension that implements the same interface as\n<code>profile<\/code>. They both fit into the category of <em>deterministic<\/em> profilers and make\nuse of the Python C API\n<a href=\"https:\/\/docs.python.org\/3\/c-api\/init.html#profiling-and-tracing\"><code>PyEval_SetProfile<\/code><\/a>\nto register event hooks.<\/p>\n<h2 id=\"standard-python-profiling\">Standard Python Profiling<\/h2>\n<p>Let's have a look at how to use <code>cProfile<\/code>, as this will be the standard choice\nfor a deterministic profiler. 
Here is an example that will profile the\ncall-stack of <code>psutil.process_iter<\/code>.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># File: process_iter.py<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">cProfile<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">psutil<\/span>\n\n<span class=\"n\">cProfile<\/span><span class=\"o\">.<\/span><span class=\"n\">run<\/span><span class=\"p\">(<\/span>\n  <span class=\"s1\">&#39;[list(psutil.process_iter()) for i in range(1_000)]&#39;<\/span><span class=\"p\">,<\/span>\n  <span class=\"s1\">&#39;process_iter&#39;<\/span>\n<span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>The above code runs <code>psutil.process_iter<\/code> 1000 times through cProfile and\nsends the output to the <code>process_iter<\/code> file in the current working directory. A\ngood reason to save the result to a file is that one can then use a tool like\n<a href=\"https:\/\/github.com\/jrfonseca\/gprof2dot\">gprof2dot<\/a> to provide a graphical\nrepresentation of the collected data. This tool turns the output of cProfile\ninto a DOT graph which can then be visualised to make better sense of it. For example,\nthese are the commands required to collect the data and visualise it in the form\nof a DOT graph inside a PDF document:<\/p>\n<div class=\"highlight\"><pre><span><\/span>python3 process_iter.py\ngprof2dot -f pstats process_iter <span class=\"p\">|<\/span> dot -Tpdf -o process_iter.pdf\n<\/pre><\/div>\n\n\n<p>This is what the result will look like. The colours help us identify the\nbranches of execution where most of the time is spent.<\/p>\n<p><img alt=\"process_iter\" src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/process_iter.svg\"><\/p>\n<h2 id=\"a-look-under-the-bonnet\">A Look Under the Bonnet<\/h2>\n<p>The output of a tool like gprof2dot can be quite intuitive to understand,\nespecially if you have had some prior experience with profilers. 
However, in\norder to better appreciate what is still to come it is best if we have a quick\nlook at some of the basics of the Python execution model.<\/p>\n<p>Python is an interpreted language and the reference implementation of its\ninterpreter is <a href=\"https:\/\/en.wikipedia.org\/wiki\/CPython\">CPython<\/a>. As the name\nsuggests, it is written in C, and it offers a C API that can be used to write C\nextensions.<\/p>\n<p>One of the fundamental objects of CPython is the interpreter itself, which has a\ndata structure associated with it, namely <code>PyInterpreterState<\/code>. In principle,\nthere can be many instances of <code>PyInterpreterState<\/code> within the same process, but\nfor the sake of simplicity, we shall ignore this possibility here. One of the\nfields of this C data structure is <code>tstate_head<\/code>, which points to the first\nelement of a doubly-linked list of instances of the <code>PyThreadState<\/code> structure.\nAs you can imagine, this other data structure represents the state of a thread\nof execution associated with the referring interpreter instance. We can navigate\nthis list by following the references of its field <code>next<\/code> (and navigate back\nwith <code>prev<\/code>).<\/p>\n<p>Each instance of <code>PyThreadState<\/code> points to the current execution frame, which is\nthe object that bears the information about the execution of a code block via\nthe field <code>frame<\/code>. This is described by the <code>PyFrameObject<\/code> structure, which is\nalso a list. In fact, this is the stack that we are after. Each frame will have,\nin general, a parent frame that can be retrieved by means of the <code>f_back<\/code>\npointer on the <code>PyFrameObject<\/code> structure. 
The picture produced by gprof2dot of\nthe previous section is the graphical representation of this stack of frames.\nThe information contained in the first row of each box comes from the\n<code>PyCodeObject<\/code> structure, which can be obtained from every instance of\n<code>PyFrameObject<\/code> via the <code>f_code<\/code> field. In particular, <code>PyCodeObject<\/code> allows you\nto retrieve the name of the file that contains the Python code being executed in\nthat frame as well as its line number and the name of the context (e.g. the\ncurrent function).<\/p>\n<p>Sometimes the C API changes between releases, but the following image is a\nfairly stable representation of the relations between the above-mentioned\nstructures that are common among many of the major CPython releases.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/python_structs.svg\"\n    alt=\"CPython data structures\"\n  \/>\n<\/p>\n\n<p>The loop around <code>PyFrameObject<\/code>, which represents its field <code>f_back<\/code>, creates\nthe structure of a singly-linked list of frame objects. This is precisely the\nframe stack.<\/p>\n<p>The Python profiling API can be demonstrated with some simple Python code. The\nfollowing example declares a decorator, <code>@profile<\/code>, that can be used to extract\nthe frame stack generated by the execution of a function. 
In this case, we\ndefine the factorial function<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">profile<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">profiler<\/span><span class=\"p\">(<\/span><span class=\"n\">frame<\/span><span class=\"p\">,<\/span> <span class=\"n\">event<\/span><span class=\"p\">,<\/span> <span class=\"n\">arg<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">if<\/span> <span class=\"s2\">&quot;c_&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">event<\/span><span class=\"p\">:<\/span>\n            <span class=\"k\">return<\/span>\n\n        <span class=\"n\">stack<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[]<\/span>\n        <span class=\"k\">while<\/span> <span class=\"n\">frame<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">code<\/span> <span class=\"o\">=<\/span> <span class=\"n\">frame<\/span><span class=\"o\">.<\/span><span class=\"n\">f_code<\/span>\n            <span class=\"n\">stack<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">code<\/span><span class=\"o\">.<\/span><span class=\"n\">co_name<\/span><span class=\"si\">}<\/span><span class=\"s2\">@<\/span><span class=\"si\">{<\/span><span class=\"n\">frame<\/span><span class=\"o\">.<\/span><span class=\"n\">f_lineno<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span><span class=\"p\">)<\/span>\n            <span class=\"n\">frame<\/span> <span class=\"o\">=<\/span> <span class=\"n\">frame<\/span><span class=\"o\">.<\/span><span class=\"n\">f_back<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span 
class=\"s2\">&quot;<\/span><span class=\"si\">{:12}<\/span><span class=\"s2\"> <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">event<\/span><span class=\"o\">.<\/span><span class=\"n\">upper<\/span><span class=\"p\">(),<\/span> <span class=\"s2\">&quot; -&gt; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">stack<\/span><span class=\"p\">[::<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">])))<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">wrapper<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">old_profiler<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">getprofile<\/span><span class=\"p\">()<\/span>\n\n        <span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">setprofile<\/span><span class=\"p\">(<\/span><span class=\"n\">profiler<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">r<\/span> <span class=\"o\">=<\/span> <span class=\"n\">f<\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">args<\/span><span class=\"p\">,<\/span> <span class=\"o\">**<\/span><span class=\"n\">kwargs<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">setprofile<\/span><span class=\"p\">(<\/span><span class=\"n\">old_profiler<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"k\">return<\/span> <span class=\"n\">r<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"n\">wrapper<\/span>\n\n\n<span class=\"nd\">@profile<\/span>\n<span class=\"k\">def<\/span> 
<span class=\"nf\">factorial<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">n<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">return<\/span> <span class=\"mi\">1<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"n\">n<\/span> <span class=\"o\">*<\/span> <span class=\"n\">factorial<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n\n\n<span class=\"k\">if<\/span> <span class=\"vm\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;__main__&quot;<\/span><span class=\"p\">:<\/span>\n    <span class=\"n\">factorial<\/span><span class=\"p\">(<\/span><span class=\"mi\">3<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Note that the coding of the <code>profiler<\/code> function can be simplified considerably\nby using the <code>inspect<\/code> module:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">inspect<\/span>\n\n<span class=\"o\">...<\/span>\n\n  <span class=\"k\">def<\/span> <span class=\"nf\">profiler<\/span><span class=\"p\">(<\/span><span class=\"n\">frame<\/span><span class=\"p\">,<\/span> <span class=\"n\">event<\/span><span class=\"p\">,<\/span> <span class=\"n\">arg<\/span><span class=\"p\">):<\/span>\n      <span class=\"k\">if<\/span> <span class=\"s2\">&quot;c_&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">event<\/span><span class=\"p\">:<\/span>\n          <span class=\"k\">return<\/span>\n\n      <span class=\"n\">stack<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"sa\">f<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{<\/span><span class=\"n\">f<\/span><span class=\"o\">.<\/span><span class=\"n\">function<\/span><span 
class=\"si\">}<\/span><span class=\"s2\">@<\/span><span class=\"si\">{<\/span><span class=\"n\">f<\/span><span class=\"o\">.<\/span><span class=\"n\">lineno<\/span><span class=\"si\">}<\/span><span class=\"s2\">&quot;<\/span> <span class=\"k\">for<\/span> <span class=\"n\">f<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">inspect<\/span><span class=\"o\">.<\/span><span class=\"n\">stack<\/span><span class=\"p\">()[<\/span><span class=\"mi\">1<\/span><span class=\"p\">:]]<\/span>\n      <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;<\/span><span class=\"si\">{:8}<\/span><span class=\"s2\"> <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">event<\/span><span class=\"o\">.<\/span><span class=\"n\">upper<\/span><span class=\"p\">(),<\/span> <span class=\"s2\">&quot; -&gt; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">stack<\/span><span class=\"p\">[::<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">])))<\/span>\n\n<span class=\"o\">...<\/span>\n<\/pre><\/div>\n\n\n<h2 id=\"statistical-profiling\">Statistical Profiling<\/h2>\n<p>For a profiler from the statistical category, we have to look for external\ntools. In this case, the \"standard\" approach is to make use of a system call\nlike <code>setitimer<\/code>, which arms an interval timer that delivers a signal, and hence triggers a signal handler, at\nregular intervals of time. The general idea is to register a callback that gets\na snapshot of the current frame stack when triggered. An example of a profiler\nthat works like this is <a href=\"https:\/\/github.com\/vmprof\/vmprof-python\">vmprof<\/a>.<\/p>\n<p>Some drawbacks of this approach are: 1. the signal handler runs in the same\nprocess as the Python interpreter, and generally in the main thread; 2. 
signals can\ninterrupt system calls, which can cause stalls in the running program.<\/p>\n<p>There are other approaches that can be taken in order to implement a statistical\nprofiler, though. An example is <a href=\"https:\/\/github.com\/uber\/pyflame\">pyflame<\/a>,\nwhich is more in the spirit of a debugging tool and uses <code>ptrace<\/code>-like system\ncalls. The situation is a bit more involved here since the profiler is now an\nexternal process. The general idea is to use <code>ptrace<\/code> to pause the running\nPython program, read its virtual memory and reconstruct the frame stack from it.\nHere, the main challenges are 1. to find the location of the relevant CPython\ndata structures in memory and 2. to parse them to extract the frame stack\ninformation. The differences between Python 2 and Python 3 and the occasional\nchanges of the CPython ABI within the same minor release add to the\ncomplexity of the task.<\/p>\n<p>Once all of this has been taken care of, though, a statistical profiler of this kind has\nthe potential to lower the overhead caused by source instrumentation even\nfurther, so that the payoff is generally worth the effort.<\/p>\n<h1 id=\"enter-austin\">Enter Austin<\/h1>\n<p>We just saw that with a tool like pyflame we can get away with no\ninstrumentation. An objection that can be raised against it, though, is that it\nstill halts the profiled program in order to read the interpreter state. System\ncalls like <code>ptrace<\/code> were designed for debugging tools, for which it is desirable\nto stop the execution at some point, inspect memory, step over one instruction\nor a whole line of source code at a time, etc. Ideally, we would like our\nprofiler to interfere as little as possible with the profiled program.<\/p>\n<p>This is where a tool like <a href=\"https:\/\/github.com\/P403n1x87\/austin\">Austin<\/a> comes\ninto play. 
Austin is, strictly speaking, not a full-fledged profiler on its own.\nIn fact, Austin is merely a frame stack sampler for CPython. Concretely, this\nmeans that all Austin does is sample the frame stack of a running Python\nprogram at (almost) regular intervals of time.<\/p>\n<p>A similar approach is followed by <a href=\"https:\/\/github.com\/benfred\/py-spy\">py-spy<\/a>,\nanother Python profiler, written in Rust and inspired by\n<a href=\"https:\/\/github.com\/rbspy\/rbspy\">rbspy<\/a>. However, Austin tends to provide higher\nperformance in general for two main reasons. One is that it is written in pure\nC, with no external dependencies other than the standard C library. The other is\nthat Austin is just a frame stack sampler. It focuses on dumping the relevant\nparts of the Python interpreter state as quickly as possible and delegates any\ndata aggregation and analysis to external tools. In theory, Austin offers you\nhigher sampling rates at virtually no cost to the profiled\nprocess. This makes Austin the ideal choice for profiling production code at\nrun-time, with not even a single line of instrumentation required!<\/p>\n<p>So, how does Austin read the virtual memory of another process without halting\nit? Many platforms offer system calls to do just that. On Linux, for example,\nthe system call is\n<a href=\"http:\/\/man7.org\/linux\/man-pages\/man2\/process_vm_readv.2.html\"><code>process_vm_readv<\/code><\/a>.\nOnce one has located the instance of <code>PyInterpreterState<\/code>, everything else\nfollows automatically, as we saw with the discussion on some of the details of\nthe CPython execution model.<\/p>\n<h2 id=\"on-your-marks\">On Your Marks<\/h2>\n<p>At this point you might have started, quite understandably, to be a bit\nconcerned with concurrency issues. What can we actually make of a memory dump\nfrom a running process we have no control over? 
What guarantees do we have that,\nthe moment we decide to peek at the Python interpreter state, we find all the\nrelevant data structures in a consistent state? The answer to this question lies\nin the difference in execution speed between C and Python code, the former\nbeing, on average, of the order of 10 times faster than the latter. So what we have\nhere is a race between Austin (which is written in C) and the Python target.\nWhen Austin samples the Python interpreter memory, it does so quite quickly\ncompared to the scale of execution of a Python code block. On the other hand,\nCPython, which is also written in C, can refresh its state pretty quickly too. As a\ncinematic analogy, imagine that we are trying to create an animation by taking\nsnapshots of a moving subject in quick succession. If the motion we are trying\nto capture is not too abrupt (compared to the time it takes to take a snapshot,\nthat is), then we won't spot any motion blur and our images will be perfectly\nclear. This video of the Cassini flyby over Jupiter, Europa and Io, for\ninstance, which was made from still images, visualises this idea clearly.<\/p>\n<p align=\"center\">\n  <iframe\n      class=\"center-image\"\n    width=\"100%\"\n    height=\"315\"\n    src=\"https:\/\/www.youtube.com\/embed\/-0JxkZjwpRg\"\n    frameborder=\"0\"\n    allow=\"accelerometer; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen>\n  <\/iframe>\n<\/p>\n\n<p>With Austin, each frame stack sample is the analogue of a snapshot and the\nmoving subject is the Python code being executed by the interpreter. Of course,\nAustin could be unlucky and decide to sample precisely at the moment CPython\nis in the middle of updating the frame stack. However, based on our previous\nconsiderations, we can expect this to be a rather rare event. 
Sometimes a\npicture is worth a thousand words, so here is an idealised \"CPython vs Austin\"\nexecution timeline comparison.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/timeline.svg\"\n    alt=\"CPython and Austin timeline comparison\"\n  \/>\n<\/p>\n\n<p>Now one could argue that, in order to decrease the error rate, an approach\nsimilar to pyflame's, where we halt the execution before taking a snapshot, would\nbe a better solution. In fact, it makes practically no difference. Indeed, it\ncould happen that the profiler decides to call <code>ptrace<\/code> while CPython is in the\nmiddle of refreshing the frame stack. In this case, it doesn't really matter\nwhether CPython has been halted or not: the frame stack will be in an\ninconsistent state anyway.<\/p>\n<p>As a final wrap-up comment to this digression, statistical profilers for Python\nlike Austin can produce reliable output, as the error rate tends to be very low.\nThis is possible because Austin is written in pure C and therefore offers\noptimal sampling performance.<\/p>\n<h2 id=\"flame-graphs-with-austin\">Flame Graphs with Austin<\/h2>\n<p>The simplest way to turn Austin into a basic profiler is to pipe it to a tool\nlike Brendan Gregg's <a href=\"https:\/\/github.com\/brendangregg\/FlameGraph\">FlameGraph<\/a>.\nFor example, assuming that <code>austin<\/code> is in your <code>PATH<\/code> variable (e.g. 
because you\nhave installed it from the Snap Store with <code>sudo snap install austin --beta\n--classic<\/code>) and that <code>flamegraph.pl<\/code> is installed in <code>\/opt\/flamegraph<\/code>, we can\ndo<\/p>\n<div class=\"highlight\"><pre><span><\/span>austin python3 process_iter.py <span class=\"p\">|<\/span> \/opt\/flamegraph\/flamegraph.pl --countname<span class=\"o\">=<\/span>usec &gt; process_iter.svg\n<\/pre><\/div>\n\n\n<p>We are using <code>--countname=usec<\/code> because Austin samples frame stacks in\nmicroseconds and this information will then be part of the output of the flame\ngraph tool. The following image is the result that I have got from running the\nabove command.<\/p>\n<p><object data=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/process_iter_fg.svg\"\n          type=\"image\/svg+xml\"\n                width=\"100%\"\n                class=\"center-image\" >\n  <img src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/process_iter_fg.svg\" style=\"width:100%;\"\/>\n<\/object><\/p>\n<p>Austin is now included in the official Debian repositories. This means that you\ncan install it with<\/p>\n<div class=\"highlight\"><pre><span><\/span>apt install austin\n<\/pre><\/div>\n\n\n<p>on Linux distributions that are derived from Debian. On Windows, Austin can be\ninstalled from <a href=\"https:\/\/chocolatey.org\/packages\/austin\">Chocolatey<\/a> with the\ncommand<\/p>\n<div class=\"highlight\"><pre><span><\/span>choco install austin --pre\n<\/pre><\/div>\n\n\n<p>Alternatively, you can just head to the\n<a href=\"https:\/\/github.com\/P403n1x87\/austin\/releases\">release<\/a> page on GitHub and\ndownload the appropriate binary release for your platform.<\/p>\n<h2 id=\"the-tui\">The TUI<\/h2>\n<p>The GitHub repository of Austin comes with a TUI application written in Python\nand based on <code>curses<\/code>. 
It provides an example of an application that uses the\noutput from Austin to display <em>live<\/em> top-like profiling statistics of a running\nPython program.<\/p>\n<p>If you want to try it, you can install it with<\/p>\n<div class=\"highlight\"><pre><span><\/span>pip install git+https:\/\/github.com\/P403n1x87\/austin.git\n<\/pre><\/div>\n\n\n<p>and run it with<\/p>\n<div class=\"highlight\"><pre><span><\/span>austin-tui python3 \/path\/to\/process_iter.py\n<\/pre><\/div>\n\n\n<p>By default, the TUI shows only the current frame being executed in the selected\nthread. You can navigate through the different threads with <kbd>\u21de Page Up<\/kbd>\nand <kbd>\u21df Page Down<\/kbd>. You can also view all the collected samples with the\nFull Mode, which can be toggled with <kbd>F<\/kbd>. The currently executing frame\nwill be highlighted and a tree representation of the current frame stack will be\navailable on the right-hand side of the terminal.<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/austin-tui_threads_nav.gif\"\n    alt=\"Austin TUI\"\n  \/>\n<\/p>\n\n<p>If you are a statistician or a data scientist working with Python, you can use\nthe TUI to peek at your model while it is training to see what is going on and\nto identify areas of your code that could potentially be optimised to run\nfaster. 
For example, let's assume that you are training a model on Linux in a\nsingle process using the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>python3 my_model.py\n<\/pre><\/div>\n\n\n<p>You can attach the TUI to your model with the command (as superuser)<\/p>\n<div class=\"highlight\"><pre><span><\/span>austin-tui -p <span class=\"sb\">`<\/span>pgrep -f my_model.py <span class=\"p\">|<\/span> head -n <span class=\"m\">1<\/span><span class=\"sb\">`<\/span> -i <span class=\"m\">10000<\/span>\n<\/pre><\/div>\n\n\n<p>The <code>pgrep<\/code> part is there to select the PID of the Python process that is\nrunning your model, while <code>-i 10000<\/code> sets the sampling interval to 10 ms (the value is given in microseconds).<\/p>\n<h2 id=\"web-austin\">Web Austin<\/h2>\n<p>Web Austin is another example of how to use Austin to build a profiling tool. In\nthis case, we make use of the\n<a href=\"https:\/\/github.com\/spiermar\/d3-flame-graph\">d3-flame-graph<\/a> plugin for\n<a href=\"https:\/\/d3js.org\/\">D3<\/a> to produce a <strong>live<\/strong> flame graph visualisation of the\ncollected samples inside a web browser. This opens the door to <em>remote profiling<\/em>, as\nthe web application can be served on an arbitrary IPv4 address.<\/p>\n<p>Like the TUI, Web Austin can be installed from GitHub with<\/p>\n<div class=\"highlight\"><pre><span><\/span>pip install git+https:\/\/github.com\/P403n1x87\/austin.git\n<\/pre><\/div>\n\n\n<p>Assuming you are still interested in seeing what is happening inside your\nstatistical model while it is training, you can use the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>austin-web -p <span class=\"sb\">`<\/span>pgrep -f my_model.py <span class=\"p\">|<\/span> head -n <span class=\"m\">1<\/span><span class=\"sb\">`<\/span> -i <span class=\"m\">10000<\/span>\n<\/pre><\/div>\n\n\n<p>As with the TUI, the command line arguments are the same as Austin's. 
When Web\nAustin starts up, it creates a simple HTTP server that listens on <code>localhost<\/code> on\nan ephemeral port.<\/p>\n<div class=\"highlight\"><pre><span><\/span># austin-web -p `pgrep -f my_model.py | head -n 1` -i 10000\n_____      ___       __    ______       _______              __________             _____\n____\/|_    __ |     \/ \/_______  \/_      ___    |___  __________  \/___(_)______      ____\/|_\n_|    \/    __ | \/| \/ \/_  _ \\_  __ \\     __  \/| |  \/ \/ \/_  ___\/  __\/_  \/__  __ \\     _|    \/\n\/_ __|     __ |\/ |\/ \/ \/  __\/  \/_\/ \/     _  ___ \/ \/_\/ \/_(__  )\/ \/_ _  \/ _  \/ \/ \/     \/_ __|\n |\/        ____\/|__\/  \\___\/\/_.___\/      \/_\/  |_\\__,_\/ \/____\/ \\__\/ \/_\/  \/_\/ \/_\/       |\/\n\n\n* Sampling process with PID 3711 (python3 my_model.py)\n* Web Austin is running on http:\/\/localhost:34257. Press Ctrl+C to stop.\n<\/pre><\/div>\n\n\n<p>If you then open <code>http:\/\/localhost:34257<\/code> in your browser, you will see a\nweb application that looks like the following<\/p>\n<p align=\"center\">\n  <img\n    src=\"https:\/\/p403n1x87.github.io\/images\/python-profiling\/web-austin.gif\"\n    alt=\"Web Austin\"\n  \/>\n<\/p>\n\n<blockquote>\n<p>Note that an active internet connection is required for the application to\nwork, as the d3-flame-graph plugin, as well as some fonts, are retrieved from\nremote sources.<\/p>\n<\/blockquote>\n<p>If you want to change the host and the port of the HTTP server created by Web\nAustin, you can set the environment variables <code>WEBAUSTIN_HOST<\/code> and\n<code>WEBAUSTIN_PORT<\/code>. 
If you want to run the Web Austin web application on, e.g.,\n<code>0.0.0.0:8080<\/code>, so that it can be accessed from anywhere, use the command<\/p>\n<div class=\"highlight\"><pre><span><\/span># WEBAUSTIN_HOST=&quot;0.0.0.0&quot; WEBAUSTIN_PORT=8080 austin-web -p `pgrep -f my_model.py | head -n 1` -i 10000\n_____      ___       __    ______       _______              __________             _____\n____\/|_    __ |     \/ \/_______  \/_      ___    |___  __________  \/___(_)______      ____\/|_\n_|    \/    __ | \/| \/ \/_  _ \\_  __ \\     __  \/| |  \/ \/ \/_  ___\/  __\/_  \/__  __ \\     _|    \/\n\/_ __|     __ |\/ |\/ \/ \/  __\/  \/_\/ \/     _  ___ \/ \/_\/ \/_(__  )\/ \/_ _  \/ _  \/ \/ \/     \/_ __|\n |\/        ____\/|__\/  \\___\/\/_.___\/      \/_\/  |_\\__,_\/ \/____\/ \\__\/ \/_\/  \/_\/ \/_\/       |\/\n\n\n* Sampling process with PID 3711 (python3 my_model.py)\n* Web Austin is running on http:\/\/0.0.0.0:8080. Press Ctrl+C to stop.\n<\/pre><\/div>\n\n\n<h2 id=\"write-your-own\">Write Your Own<\/h2>\n<p>Austin's Powers (!) reside in its very simplicity. The \"hard\" problem of\nsampling the Python frame stack has been solved for you so that you can focus on\nprocessing the samples to produce the required metrics.<\/p>\n<p>If you decide to write a tool in Python, the Austin project on GitHub comes with\na Python wrapper. Depending on your preferences, you can choose between a\nthread-based approach or an <code>asyncio<\/code> one. 
Just as an example, let's see how to\nuse the <code>AsyncAustin<\/code> class to make a custom profiler based on the samples\ncollected by Austin.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">austin<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">AsyncAustin<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">austin.stats<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">parse_line<\/span>\n\n\n<span class=\"k\">class<\/span> <span class=\"nc\">MyAustin<\/span><span class=\"p\">(<\/span><span class=\"n\">AsyncAustin<\/span><span class=\"p\">):<\/span>\n\n    <span class=\"c1\"># Subclass AsyncAustin and implement this callback. This will be called<\/span>\n    <span class=\"c1\"># every time Austin generates a sample. The convenience method parse_line<\/span>\n    <span class=\"c1\"># will parse the sample and produce the thread name, the stack of contexts<\/span>\n    <span class=\"c1\"># with the corresponding line numbers and the measured duration for the<\/span>\n    <span class=\"c1\"># sample.<\/span>\n\n    <span class=\"k\">def<\/span> <span class=\"nf\">on_sample_received<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">sample<\/span><span class=\"p\">):<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">parse_line<\/span><span class=\"p\">(<\/span><span class=\"n\">sample<\/span><span class=\"o\">.<\/span><span class=\"n\">encode<\/span><span class=\"p\">()))<\/span>\n\n\n<span class=\"k\">if<\/span> <span class=\"vm\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s2\">&quot;__main__&quot;<\/span><span class=\"p\">:<\/span>\n    <span class=\"n\">my_austin<\/span> <span class=\"o\">=<\/span> <span class=\"n\">MyAustin<\/span><span class=\"p\">()<\/span>\n\n    <span 
class=\"n\">my_austin<\/span><span class=\"o\">.<\/span><span class=\"n\">start<\/span><span class=\"p\">(<\/span><span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">argv<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">:])<\/span>\n    <span class=\"k\">if<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">my_austin<\/span><span class=\"o\">.<\/span><span class=\"n\">wait<\/span><span class=\"p\">():<\/span>\n        <span class=\"k\">raise<\/span> <span class=\"ne\">RuntimeError<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Austin failed to start&quot;<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;MyAustin is starting...&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"n\">my_austin<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">()<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;The profiled target has terminated.&quot;<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">except<\/span> <span class=\"ne\">KeyboardInterrupt<\/span><span class=\"p\">:<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;MyAustin has been terminated from keyboard.&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>As the example above shows, it is enough to inherit from <code>AsyncAustin<\/code> and\ndefine the <code>on_sample_received<\/code> callback. This will get called every time Austin\nproduces a sample. You can then do whatever you like with it. 
Here we simply\npass the <code>sample<\/code>, which is just a binary string in the format <code>Thread\n[tid];[func] ([mod]);#[line no];[func] ...;L[line no] [usec]<\/code> to the\n<code>parse_line<\/code> function, which conveniently splits the string into its main\ncomponents, i.e. the thread identifier, the stack of frames and the sample\nduration. We then print the resulting triple to screen.<\/p>\n<p>The rest of the code is there to create an instance of this custom Austin\napplication. We call <code>wait<\/code> to ensure that Austin has been started successfully.\nThe optional argument is a timeout, which defaults to 1 second. If Austin is not\nstarted within that time, <code>wait<\/code> returns <code>False<\/code>. If we do not wish to do\nanything else with the event loop, we can then simply call the <code>join<\/code> method,\nwhich schedules the main read loop that calls the <code>on_sample_received<\/code> callback\nwhenever a sample is read from Austin's <code>stdout<\/code> file descriptor.<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>In this post, we have seen a few profiling options for Python. We have argued\nthat some statistical profilers, like Austin, can prove valuable tools. Whilst\nthey provide approximate figures, their accuracy is in general quite high and the\nerror rate very low. Furthermore, no instrumentation is required and the\noverhead introduced is minimal, all aspects that make a tool like Austin a\nperfect choice for many Python profiling needs.<\/p>\n<p>A feature that distinguishes Austin from the rest is its extreme simplicity,\nwhich implies great flexibility. 
By just sampling the frame stack of the Python\ninterpreter, the user is left with the option of using the collected samples to\nderive the metrics that best suit the problem at hand.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"profiling"}},{"@attributes":{"term":"optimisation"}}]},{"title":"What Actually Are Containers?","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/what-actually-are-containers.html","rel":"alternate"}},"published":"2018-08-04T18:42:00+01:00","updated":"2018-08-04T18:42:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2018-08-04:\/what-actually-are-containers.html","summary":"<p>Containers are the big thing of the moment. It is quite common to find blog posts and articles that explain what containers are <em>not<\/em>:  \"containers are not virtual machines\". Just what <em>are<\/em> they then? In this post we embark on a journey across some of the features of the Linux kernel to unveil the mystery.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><ul>\n<li><a href=\"#containers-defined\">Containers Defined<\/a><\/li>\n<li><a href=\"#a-note-on-operating-systems\">A Note on Operating Systems<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#creating-jails-with-chroot\">Creating Jails With chroot<\/a><ul>\n<li><a href=\"#a-minimal-chroot-jail\">A Minimal chroot Jail<\/a><\/li>\n<li><a href=\"#a-more-interesting-example\">A More Interesting Example<\/a><\/li>\n<li><a href=\"#leaky-containers\">Leaky Containers<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#control-groups\">Control Groups<\/a><ul>\n<li><a href=\"#a-hierarchy-of-cgroups\">A Hierarchy of cgroups<\/a><\/li>\n<li><a href=\"#how-to-work-with-control-groups\">How to Work with Control Groups<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#linux-namespaces\">Linux Namespaces<\/a><ul>\n<li><a 
href=\"#some-implementation-details\">Some Implementation Details<\/a><\/li>\n<li><a href=\"#how-to-work-with-namespaces\">How to Work with Namespaces<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#putting-it-all-together\">Putting It All Together<\/a><ul>\n<li><a href=\"#process-containment-for-chroot-jails\">Process Containment for chroot Jails<\/a><\/li>\n<li><a href=\"#wall-fortification\">Wall Fortification<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>When I first heard about containers, I turned to my favourite search engine to\nfind out more about them and what they are. Most of the resources I have read\nthrough, though, seemed to put a great emphasis on what containers are <strong>not<\/strong>.\nContainers are like virtual machines, but are <strong>not<\/strong> virtual machines.<\/p>\n<p>So, what actually <strong>are<\/strong> they? After many unhelpful reads, the first good blog\npost that I've come across that actually explains what containers are is <a href=\"https:\/\/jvns.ca\/blog\/2016\/10\/10\/what-even-is-a-container\/\">What\neven is a container<\/a>\nby Julia Evans. If you go and read through that post (and I do recommend that\nyou do!), you will immediately learn that a container is like a cauldron where\nyou mix in the essential ingredients for a magic potion. Only in this case, the\ningredients are Linux kernel features.<\/p>\n<p>If many posts on containers make it sound like they are some sort of black\nmagic (how can you have a <em>lightweight<\/em> virtual machine?!), the aim of this post\nis to show that the idea behind them is quite simple and made possible by a few\nLinux kernel features, like <strong>control groups<\/strong>, <strong>chroot<\/strong> and <strong>namespaces<\/strong>. 
I\nwill discuss each of them in turn in this post, but you should also be aware\nthat there are other kernel features involved in containers to make them robust\nand secure. These other aspects, however, will be part of a separate post. Here\nwe shall just focus on the essential ingredients that can allow you to literally\nhandcraft and run something that you may call a container, in the sense that is\ncommonly used these days.<\/p>\n<h2 id=\"containers-defined\">Containers Defined<\/h2>\n<p>Before we progress any further, I believe that we should take a moment to agree\non the meaning that we should attach to the word <em>container<\/em>. Much of the\nconfusion, in my opinion, arises from the many different definitions that are\nout there. According to\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Operating-system-level_virtualization\">Wikipedia<\/a>,\n<em>containers<\/em> ...<\/p>\n<blockquote>\n<p>... may look like real computers from the point of view of programs running in\nthem. A computer program running on an ordinary operating system can see all\nresources ... of that computer. However, programs running inside a container\ncan only see the container's contents and devices assigned to the container.<\/p>\n<\/blockquote>\n<p>My way of paraphrasing this definition is the following: a container is a main\nprocess running in user space that gives you the impression that you are\nrunning an operating system with its own view of the file system, processes,\netc., on top of the operating system that is installed on the machine. In this\nsense, a container <em>contains<\/em> part of the host resources and hosts its own\nsystem and user applications.<\/p>\n<h2 id=\"a-note-on-operating-systems\">A Note on Operating Systems<\/h2>\n<p>Another cause of confusion, sometimes, is the definition of <em>operating system<\/em>\nitself, so before moving on, I want to make sure we agree on this too. An\noperating system can be thought of as a <em>nut<\/em>. 
At its core we have, well, the\nkernel, which is in direct control of the hardware. On its own, the kernel is a\n<em>passive<\/em> component of an operating system. When an operating system is booted,\nthe kernel is the first part that gets loaded into memory and it quietly sits\nthere. Its purpose is to provide many \"buttons and levers\" (the <em>ABI<\/em>) that just\nwait to be pushed and pulled to operate the hardware and provide services to\nsystem and user applications. Around the kernel one usually finds, surprise\nsurprise, a shell. You might be familiar with Bash, Ksh, Zsh, etc., which allow\nyou to manipulate the file system (create, copy, move, delete files from disk),\nlaunch applications, etc. Some of these applications are included with the\noperating system and build on top of the kernel services to provide basic\nfunctionalities (e.g. most if not all the standard Unix tools). Such\napplications are known as <em>system applications<\/em>. Other software, like text\neditors, games, web browsers and the like, are <em>user applications<\/em>. In some cases,\nit is hard to decide between system and user applications, as the line between\nthem is not very clear and is open to debate. However, once you decide on what\nworks for you in terms of <em>system applications<\/em>, an operating system becomes the\ncombination of them and the kernel. Thus, Linux is just a <em>kernel<\/em> and not an\noperating system. On the other hand, Ubuntu <em>is<\/em> an example of a (Linux-based)\noperating system, since a typical Ubuntu installation includes the compiled code\nof the Linux kernel together with system applications.<\/p>\n<p>How do we tell which operating system we are currently running? Most Linux-based\noperating systems have some files in the <code>\/etc<\/code> folder that contain information\nabout the distribution name and the installed version. For example, on\nDebian-based distributions, this file is typically named <code>os-release<\/code>. 
In my\ncase, this is what I get if I peek at its content with <code>cat<\/code>:<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ cat \/etc\/os-release\nNAME=&quot;Ubuntu&quot;\nVERSION=&quot;18.04 LTS (Bionic Beaver)&quot;\nID=ubuntu\nID_LIKE=debian\nPRETTY_NAME=&quot;Ubuntu 18.04 LTS&quot;\nVERSION_ID=&quot;18.04&quot;\nHOME_URL=&quot;https:\/\/www.ubuntu.com\/&quot;\nSUPPORT_URL=&quot;https:\/\/help.ubuntu.com\/&quot;\nBUG_REPORT_URL=&quot;https:\/\/bugs.launchpad.net\/ubuntu\/&quot;\nPRIVACY_POLICY_URL=&quot;https:\/\/www.ubuntu.com\/legal\/terms-and-policies\/privacy-policy&quot;\nVERSION_CODENAME=bionic\nUBUNTU_CODENAME=bionic\n<\/pre><\/div>\n\n\n<h1 id=\"creating-jails-with-chroot\">Creating Jails With <code>chroot<\/code><\/h1>\n<p>One of the earliest examples of \"containers\" was provided by the use of\n<code>chroot<\/code>. This is a system call that was introduced in BSD in 1982 and all\nit does is change the apparent root directory for the process it is called\nfrom, and all its descendant processes.<\/p>\n<p>How can we use such a feature to create a container? Suppose that you have the\nroot file system of a Linux-based operating system in a sub-folder in your file\nsystem. For example, the new version of your favourite distribution came out and\nyou want to try the applications it comes with. You can use the <code>chroot<\/code> wrapper\napplication that ships with most if not all Unix-based operating systems these\ndays to launch the default shell with the apparent root set to\n<code>~\/myfavedistro-latest<\/code>. Assuming that your favourite distribution comes with\nmost of the standard Unix tools, you will now be able to launch applications\nfrom its latest version, using the services provided by the Linux kernel of the\nhost machine. 
Effectively, you are now running an instance of a different\noperating system that is using the kernel loaded at boot time from the host\noperating system (some sort of Frankenstein OS if you like).<\/p>\n<p>Does what we have just described fit into the above definition of <em>container<\/em>?\nSurely the default shell has its own view of the file system, which is a proper\nrestriction of the full file system of the host system. As for other resources,\nlike peripherals, etc., they happen to coincide with the host system, but at\nleast something is different. If you now look at the content of the <code>os-release<\/code>\nfile in the <code>\/etc<\/code> folder (or the equivalent for the distribution of your\nchoice), you will quite likely see something different from before, so indeed we\nhave a running instance of a different operating system.<\/p>\n<p>The term that is usually associated with <code>chroot<\/code> is <em>jail<\/em> rather than\n<em>container<\/em> though. Indeed, a process that is running within a new apparent root\nfile system cannot see the content of the parent folders and therefore it is\nconfined in a corner of the full, actual file system on the physical host. The\nmodified environment that we see from a shell started with <code>chroot<\/code> is sometimes\nreferred to as a <em>chroot jail<\/em>. 
But perhaps another reason why the term <em>jail<\/em>\nis being used is that, without the due precautions, it is relatively easy to\nbreak out of one (well, OK, maybe that's not an official reason).<\/p>\n<p>If the above discussion sounds a bit too abstract to you then don't worry\nbecause we are about to get our hands dirty with <code>chroot<\/code>.<\/p>\n<h2 id=\"a-minimal-chroot-jail\">A Minimal <code>chroot<\/code> Jail<\/h2>\n<p>Since a <code>chroot<\/code> jail is pretty much like a <em>Bring Your Own System Application<\/em>\nparty, with the kernel kindly offered by the host, a minimal <code>chroot<\/code> jail can\nbe obtained with just the binary of a shell, and just a few other binary files.\nLet's try and create one with just <code>bash<\/code> in it then. Under the assumption\nthat you have it installed on your Linux system, we can determine all the\nshared objects the <code>bash<\/code> shell depends on with <code>ldd<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span>$ ldd `which bash`\n        linux-vdso.so.1 =&gt;  (0x00007ffca3bca000)\n        libtinfo.so.5 =&gt; \/lib\/x86_64-linux-gnu\/libtinfo.so.5 (0x00007f9605411000)\n        libdl.so.2 =&gt; \/lib\/x86_64-linux-gnu\/libdl.so.2 (0x00007f960520d000)\n        libc.so.6 =&gt; \/lib\/x86_64-linux-gnu\/libc.so.6 (0x00007f9604e2d000)\n        \/lib64\/ld-linux-x86-64.so.2 (0x00007f960563a000)\n<\/pre><\/div>\n\n\n<p>So let's create a folder that will serve as the new root file system, e.g.\n<code>~\/minimal<\/code>, and copy the bash executable into it, together with all its\ndependencies. 
Copy the <code>bash<\/code> executable inside <code>~\/minimal\/bin<\/code>, the libraries\nfrom <code>\/lib<\/code> into <code>~\/minimal\/lib<\/code> and those from <code>\/lib64<\/code> into <code>~\/minimal\/lib64<\/code>.\nThen start the <code>chroot<\/code> jail with<\/p>\n<div class=\"highlight\"><pre><span><\/span>chroot ~\/minimal\n<\/pre><\/div>\n\n\n<p>You should now have a running bash session with a vanilla prompt format that\nlooks like this<\/p>\n<div class=\"highlight\"><pre><span><\/span>bash-4.4#\n<\/pre><\/div>\n\n\n<p>Note that <code>chroot<\/code> is being executed as the <code>root<\/code> user. This is because, under normal\ncircumstances, only <code>root<\/code> has the POSIX <em>capability<\/em>, namely\n<code>CAP_SYS_CHROOT<\/code>, that is required to call the <code>chroot<\/code> system call.<\/p>\n<blockquote>\n<p>To see the current capabilities of a user one can use the <code>capsh --print<\/code>\n  command. The <code>Bounding set<\/code> line shows the capabilities that have been\n  inherited and that can be granted to a process from the current user.\n  Capabilities represent another feature that is relevant for containers. They\n  will be discussed in a separate post.<\/p>\n<\/blockquote>\n<p>If you now play around a bit with this bash session, you will realise pretty\nquickly that there isn't much that you can do. Most of the standard Unix tools\nare not available, not even <code>ls<\/code>. This container that we created as a <code>chroot<\/code>\njail is indeed minimal.<\/p>\n<h2 id=\"a-more-interesting-example\">A More Interesting Example<\/h2>\n<p>Ubuntu has released base images of the operating system since version 12.04.\nThese are just root file system images in the format of a compressed tarball.\nSuppose that a new stable version has come out and you want to give it a try\nbefore you upgrade your system. 
One thing you can do is to go to the <a href=\"http:\/\/cdimage.ubuntu.com\/ubuntu-latest\/releases\/\">Ubuntu\nBase releases<\/a> page and\ndownload the image that you want to test. Extract the content of the tarball\nsomewhere, e.g. <code>~\/ubuntu-latest<\/code>, and \"run\" it with <code>chroot<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span>chroot ~\/ubuntu-latest\n<\/pre><\/div>\n\n\n<p>We are now running an instance of a new version of Ubuntu. To check that this is\nindeed the case, look at the output of <code>cat \/etc\/os-release<\/code>. Furthermore, we\nnow have access to all the basic tools that make up the Ubuntu operating system.\nFor instance, you could use aptitude to download and install new packages, which\ncould be useful to test the latest version of an application.<\/p>\n<p>If you intend to do some serious work with these kinds of <code>chroot<\/code> jails, keep\nin mind that some of the pseudo-file systems won't be available from within the\njail. That's why you would have to mount them manually with<\/p>\n<div class=\"highlight\"><pre><span><\/span>mount -t proc proc proc\/\nmount -t sysfs sys sys\/\nmount -o <span class=\"nb\">bind<\/span> \/dev dev\/\n<\/pre><\/div>\n\n\n<p>This way you will be able to use, e.g., <code>ps<\/code> to look at the currently running\nprocesses.<\/p>\n<h2 id=\"leaky-containers\">Leaky Containers<\/h2>\n<p>With the simplicity of <code>chroot<\/code> jails come many issues that make these kinds of\n\"containers\" <em>leaky<\/em>. What do I mean by this? Suppose that you want to\n<em>containerise<\/em> two resource-intensive applications into two different <code>chroot<\/code>\njails (for example, the two applications, for some reason, require different\nLinux-based operating systems). A typical example these days is that of\nmicroservices that we would like to run on the same host machine. 
When the first\nmicroservice fires up, it starts taking all the system resources (like CPU time\nfor instance), leaving no resources for the second microservice. The same can\nhappen for network bandwidth utilisation or disk I\/O rates.<\/p>\n<p>Unfortunately, this issue cannot be addressed within <code>chroot<\/code> jails, and their\nusefulness is somewhat restricted. Whilst we can use it to create some sort of\n\"ancestral\" containers, this is not the solution we would turn to in the long\nrun.<\/p>\n<p>Another serious issue with a poorly implemented <code>chroot<\/code> jail is the dreaded\nS-word: <em>security<\/em>. If nothing is done to prevent the user of the jail from\ncalling certain system calls (e.g. <code>chroot<\/code> itself), it is relatively\nstraightforward to <em>break out<\/em> of it. Recall how the <code>chroot<\/code> wrapper utility\nrequires <code>root<\/code> privileges to be executed. When we launched a bash session\nwithin the Ubuntu Base root file system, we were logged in as the root user.\nWithout any further configuration, nothing will prevent us from coding a simple\napplication that performs the following steps from within the jail:<\/p>\n<ol>\n<li>Create a folder with the <code>mkdir<\/code> system call or Unix wrapper tool.<\/li>\n<li>Call the <code>chroot<\/code> system call to change the apparent root to the newly\ncreated folder.<\/li>\n<li>Attempt to navigate sufficiently many levels up to hit the actual file system\nroot.<\/li>\n<li>Launch a shell.<\/li>\n<\/ol>\n<p>Why does this work? A simple call to the <code>chroot<\/code> system call only changes the\napparent root file system, but doesn't actually change the current working\ndirectory. 
The Unix <code>chroot<\/code> wrapper tool performs a combination of <code>chdir<\/code>\n<em>followed<\/em> by <code>chroot<\/code> to actually put the calling process inside the jail.<\/p>\n<blockquote>\n<p>A minimal version of the <code>chroot(2)<\/code> utility written in x86-64 assembly code\n  can be found in the\n  <a href=\"https:\/\/github.com\/P403n1x87\/asm\/blob\/master\/chroot\/minichroot.asm\"><code>minichroot.asm<\/code><\/a>\n  source file within the GitHub repository linked to this post.<\/p>\n<\/blockquote>\n<p>A call to <code>chroot<\/code> which is not preceded by a call to <code>chdir<\/code> moves the jail\nboundary <em>over<\/em> the current location down another level, so that we are\neffectively out of the jail. This means that we can <code>chdir<\/code> up many times now to\ntry and hit the actual root of the host file system. Now run a shell session and\nbang! We have full control of the host file system under the root user! Scary,\nisn't it?<\/p>\n<blockquote>\n<p>If you want to give this method a try, have a look at the\n  <a href=\"https:\/\/github.com\/P403n1x87\/asm\/blob\/master\/chroot\/jailbreak.asm\"><code>jailbreak.asm<\/code><\/a>\n  source file within the GitHub repository linked to this post.<\/p>\n<\/blockquote>\n<p>A less serious matter, but still something that you might want to address, is\nthat, after we have mounted the <code>proc<\/code> file system within the jail, the view of\nthe running processes from within the jail is the same as the one from the host\nsystem. Again, if we do nothing to strip down capabilities from the <code>chroot<\/code>\njail user, any process on the host machine can easily be killed (in the best\ncase) by the jail user. Indeed, <code>chroot<\/code> containers really require a lot\nof care to prevent unwanted information from leaking. 
That is why present-day\ncontainers make use of a different approach to guarantee \"airtight\" walls, as we\nshall soon see.<\/p>\n<h1 id=\"control-groups\">Control Groups<\/h1>\n<p>As we have argued above, when we make use of containers we might want to run\nmultiple instances of them on the same machine. The problem that we face is\nphysical resource sharing among the containers. How can we make sure that a\nrunning instance of a containerised process doesn't eat up all the available\nresources from the host machine?<\/p>\n<p>The answer is provided by a feature of the Linux kernel known as <strong>control\ngroups<\/strong>. Usually abbreviated as <code>cgroups<\/code>, control groups were initially\nreleased in 2007, based on earlier work of Google engineers, and originally\nnamed <em>process containers<\/em>.<\/p>\n<p>Roughly speaking, <em>cgroups<\/em> allow you to limit, account for and isolate system\nresource usage among running processes. As a simple example, consider the\nscenario where one of your applications has a bug and starts leaking memory in\nan infinite loop. Slowly but inevitably, your process ends up using all the\nphysical memory available on the machine it is running on, causing some of the\nother processes to be killed at random by the OOM (Out of Memory) killer or, in\nthe worst case, crashing the entire system. If only you could assign a slice of\nmemory to the process that you want to test, then the OOM killer would get rid of\nonly your faulty process, thus preventing your entire system from collapsing and\nallowing the other applications to run smoothly without consequences. Well, this\nis exactly one of the problems that <em>cgroups<\/em> allow you to solve.<\/p>\n<p>But physical memory is only one of the aspects (or <em>subsystems<\/em>, in the language\nof <em>cgroups<\/em>; another term that is used interchangeably is <em>controller<\/em>) that\ncan be limited with the use of control groups. 
CPU cycles, network bandwidth and\ndisk I\/O rate are other examples of resources that can be accounted for with\ncontrol groups. This way you can have two or more CPU-bound applications running\nhappily on the same machine, just by splitting the physical computing power\namong them.<\/p>\n<h2 id=\"a-hierarchy-of-cgroups\">A Hierarchy of cgroups<\/h2>\n<p>Linux processes are organised in a hierarchical structure. At boot, the <code>init<\/code>\nprocess, with PID 1, is spawned, and every other process originates from it as a\nchild process. This hierarchical structure is visible from the virtual file\nsystem mounted at <code>\/proc<\/code>.<\/p>\n<p>Cgroups have a similar hierarchical structure but, contrary to processes (also\nknown as <em>tasks<\/em> in <em>cgroup<\/em>-speak), there may be <em>many<\/em> such hierarchies of\ncgroups. This is the case for cgroups v1, but starting with version 2,\nintroduced in 2015, cgroups follow a unified hierarchical structure. It is\npossible to use the two at the same time, thus having a hybrid cgroups resource\nmanagement, even though this is discouraged.<\/p>\n<p>Every cgroup inherits features from its parent cgroup and in general they can\nget more restrictive the further you move down the hierarchy, without the\npossibility of having overrides. Processes are then spawned or moved\/assigned to\ncgroups so that each process is in exactly one cgroup at any given time.<\/p>\n<p>This is, in a nutshell, what cgroups and cgroups2 are. A full treatment of\ncgroups would require a post of its own and it would take us off-topic. 
If you\nare curious to find out more about their history and their technical details,\nyou can have a look at the official documentation\n<a href=\"https:\/\/www.kernel.org\/doc\/Documentation\/cgroup-v1\/cgroups.txt\">here<\/a> and\n<a href=\"https:\/\/www.kernel.org\/doc\/Documentation\/cgroup-v2.txt\">here<\/a>.<\/p>\n<h2 id=\"how-to-work-with-control-groups\">How to Work with Control Groups<\/h2>\n<p>Let's have a look at how to use cgroups to limit the total amount of physical\n(or resident) memory that a process is allowed to use. The example is based on\ncgroups v1 since they are still in use today even though cgroups v2 are\nreplacing them and there currently is an on-going effort of migrating from v1 to\nv2.<\/p>\n<p>Since the introduction of cgroups in the Linux kernel, <em>every<\/em> process belongs\nto one and only one cgroup at any given time. By default, there is only one\ncgroup, the <em>root<\/em> cgroup, and every other process, together with its children,\nis in it.<\/p>\n<p>Control groups are manipulated with the use of file system operations on the\ncgroup mount-point (usually <code>\/sys\/fs\/cgroup<\/code>). For example, a new cgroup can be\ncreated with the <code>mkdir<\/code> command. Values can be set by writing on the files that\nthe kernel will create in the subfolder, and the simplest way is to just use\n<code>echo<\/code>. When a cgroup is no longer needed, it can be removed with <code>rmdir<\/code> (<code>rm\n-r<\/code> should not be used as an alternative!). 
This will effectively deactivate the\ncgroup only when the last process in it has terminated, or if it only contains\nzombie processes.<\/p>\n<p>As an example, let's see how to create a cgroup that restricts the amount of\ntotal physical memory processes can use.<\/p>\n<div class=\"highlight\"><pre><span><\/span>mkdir \/sys\/fs\/cgroup\/memory\/mem_cg\n<span class=\"nb\">echo<\/span> 100m &gt; \/sys\/fs\/cgroup\/memory\/mem_cg\/memory.limit_in_bytes\n<span class=\"nb\">echo<\/span> 100m &gt; \/sys\/fs\/cgroup\/memory\/mem_cg\/memory.memsw.limit_in_bytes\n<\/pre><\/div>\n\n\n<blockquote>\n<p>If <code>memory.memsw.*<\/code> is not present in <code>\/sys\/fs\/cgroup\/memory<\/code>, you might need\n  to enable it in the kernel by adding the parameters <code>cgroup_enable=memory\n  swapaccount=1<\/code> to, e.g., GRUB's <em>kernel line<\/em>. To do so, open\n  <code>\/etc\/default\/grub<\/code> and append these parameters to\n  <code>GRUB_CMDLINE_LINUX_DEFAULT<\/code>.<\/p>\n<\/blockquote>\n<p>Any process running in the <code>mem_cg<\/code> cgroup will be constrained to a total amount\n(that is, physical plus swap) of memory equal to 100 MB. When a process gets\nabove the limit, the OOM killer will get rid of it. To add a process to the\n<code>mem_cg<\/code> cgroup we have to write its PID to the <code>tasks<\/code> file, e.g. with<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nb\">echo<\/span> <span class=\"nv\">$$<\/span> &gt; \/sys\/fs\/cgroup\/memory\/mem_cg\/tasks\n<\/pre><\/div>\n\n\n<p>This will put the currently running shell into the <code>mem_cg<\/code> cgroup. 
When we want\nto remove the cgroup, we can just delete its folder with<\/p>\n<div class=\"highlight\"><pre><span><\/span>rmdir \/sys\/fs\/cgroup\/memory\/mem_cg\n<\/pre><\/div>\n\n\n<p>Note that, even if fully removed from the virtual file system, any removed\ncgroups remain active until all the associated processes have terminated or have\nbecome zombies.<\/p>\n<p>Alternatively, one can work with cgroups by using the tools provided by\n<code>libcgroup<\/code> (Red Hat), or <code>cgroup-tools<\/code> (Debian). Once installed with the\ncorresponding package managers, the above commands can be replaced with the\nfollowing, perhaps more intuitive ones:<\/p>\n<div class=\"highlight\"><pre><span><\/span>cgcreate -g memory:mem_cg\ncgset -r memory.limit_in_bytes<span class=\"o\">=<\/span>100m mem_cg\ncgset -r memory.memsw.limit_in_bytes<span class=\"o\">=<\/span>100m mem_cg\ncgclassify -g memory:mem_cg <span class=\"nv\">$$<\/span>\ncgdelete memory:mem_cg\n<\/pre><\/div>\n\n\n<p>One can use <code>cgexec<\/code> as an alternative to start a new process directly within a\ncgroup:<\/p>\n<div class=\"highlight\"><pre><span><\/span>cgexec -g memory:mem_cg \/bin\/bash\n<\/pre><\/div>\n\n\n<p>We can test that the memory cgroup we have created works with the following\nsimple C program<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdlib.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdio.h&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;string.h&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"nf\">main<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span 
class=\"kt\">int<\/span><span class=\"w\">    <\/span><span class=\"n\">a<\/span><span class=\"w\">      <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">char<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">buffer<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">while<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">++<\/span><span class=\"n\">a<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">buffer<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">char<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">20<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">buffer<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"n\">printf<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;Allocation #%d<\/span><span class=\"se\">\\n<\/span><span class=\"s\">&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">a<\/span><span 
class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"n\">memset<\/span><span class=\"p\">(<\/span><span class=\"n\">buffer<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">20<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>We have created an infinite loop in which we allocate a 1 MB chunk of memory\nat each iteration. The call to <code>memset<\/code> is a trick to force the Linux kernel to\nactually allocate the requested memory: under its copy-on-write strategy, pages\nare only backed by physical memory once they are written to.<\/p>\n<p>Once compiled, we can run it in the <code>mem_cg<\/code> cgroup with<\/p>\n<div class=\"highlight\"><pre><span><\/span>cgexec -g memory:mem_cg .\/a.out\n<\/pre><\/div>\n\n\n<p>We expect to see about 100 successful allocations, after which the OOM killer\nintervenes to stop our process, since it will have reached the allocated\nmemory quota by then.<\/p>\n<p>Imagine now launching a <code>chroot<\/code> jail inside a memory cgroup like the one we\ncreated above. Every application that we launch from within it is automatically\ncreated inside the same cgroup. This way we can run, e.g., a microservice and we\ncan be assured that it won't eat up all the available memory from the host\nmachine. 
With a similar approach, we could also make sure that it won't reserve\nall the CPU and its cores to itself, thus allowing other processes (perhaps in\ndifferent jails\/containers) to run simultaneously and smoothly on the same\nphysical machine.<\/p>\n<h1 id=\"linux-namespaces\">Linux Namespaces<\/h1>\n<p>The description of Linux namespaces given by the dedicated manpage sums up the\nconcept pretty well:<\/p>\n<blockquote>\n<p>A namespace wraps a global system resource in an abstraction that\n  makes it appear to the processes within the namespace that they have\n  their own isolated instance of the global resource.  Changes to the\n  global resource are visible to other processes that are members of\n  the namespace, but are invisible to other processes.  One use of\n  namespaces is to implement containers.<\/p>\n<\/blockquote>\n<p>So no questions asked about why Linux namespaces were introduced in the first\nplace. As the description says, they are used to allow processes to have their\nown copy of a certain physical resource. For example, the most recent versions\nof the Linux kernel allow us to define a namespace of the <strong>network<\/strong> kind, and\nevery application that we run under it will have its own copy of the full\nnetwork stack. This is pretty much a lightweight way of virtualising an\nentire network!<\/p>\n<p>Linux namespaces represent a relatively new feature that made its first\nappearance in 2002 with the <strong>mount<\/strong> kind. Since there were no plans to have\ndifferent kinds of namespaces, at that time the term <em>namespace<\/em> was synonymous with\n<em>mount<\/em> namespace. Beginning in 2006, more kinds were added and, presently,\nthere are plans for new kinds to be developed and included in future releases of\nthe Linux kernel.<\/p>\n<p>If you really want to identify a single feature that makes modern Linux\ncontainers possible, namespaces are arguably the candidate. 
Let's try to see why.<\/p>\n<h2 id=\"some-implementation-details\">Some Implementation Details<\/h2>\n<p>In order to introduce namespaces in Linux, a new system call, <code>unshare<\/code>, has\nbeen added to the kernel. Its use is \"to allow a process to control its shared\nexecution context without creating a new process.\" (quoted verbatim from the\nmanpage of <code>unshare(2)<\/code>). What does this mean? Suppose that, at a certain point,\nyou want the current process to be moved to a new network namespace so that it\nhas its own \"private\" network stack. All you have to do is make a call to the\n<code>unshare<\/code> system call with the appropriate flag set.<\/p>\n<p>What if we do want to spawn a new process in a new namespace instead? With the\nintroduction of namespaces, the existing <code>clone<\/code> system call has been extended\nwith new flags. When <code>clone<\/code> is called with some of these flags set, new\nnamespaces of the corresponding kinds are created and the new process is\nautomatically made a member of them.<\/p>\n<p>The namespace information of the currently running processes is stored in the\n<code>proc<\/code> file system, under the new <code>ns<\/code> subfolder of every PID folder (i.e.\n<code>\/proc\/[pid]\/ns\/<\/code>). This as well as other details of how namespaces are\nimplemented can be found in the\n<a href=\"http:\/\/man7.org\/linux\/man-pages\/man7\/namespaces.7.html\"><code>namespaces(7)<\/code><\/a>\nmanpage.<\/p>\n<h2 id=\"how-to-work-with-namespaces\">How to Work with Namespaces<\/h2>\n<p>As with cgroups, an in-depth description of namespaces would require a post on\nits own. So we will have a look at just one simple example. Since networks are\nubiquitous these days, let's try to launch a process that has its own\nvirtualised network stack and that is capable of communicating with the host\nsystem via a network link.<\/p>\n<p>This is the plan:<\/p>\n<ol>\n<li>Create a linked pair of virtual ethernet devices, e.g. 
<code>veth0<\/code> and <code>veth1<\/code>.<\/li>\n<li>Move <code>veth1<\/code> to a new network namespace.<\/li>\n<li>Assign IP addresses to the virtual NICs and bring them up.<\/li>\n<li>Test that we can transfer data between them.<\/li>\n<\/ol>\n<p>Here is a simple bash script that does exactly this. Note that the creation of a\nnetwork namespace requires a capability that normal Unix users don't usually\nhave, so this is why you will need to run it as, e.g., root.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># Create a new network namespace<\/span>\nip netns add <span class=\"nb\">test<\/span>\n\n<span class=\"c1\"># Create a pair of virtual ethernet interfaces<\/span>\nip link add veth0 <span class=\"nb\">type<\/span> veth peer name veth1\n\n<span class=\"c1\"># Configure the host virtual interface<\/span>\nip addr add <span class=\"m\">10<\/span>.0.0.1\/24 dev veth0\nip link <span class=\"nb\">set<\/span> veth0 up\n\n<span class=\"c1\"># Move the guest virtual interface to the test namespace<\/span>\nip link <span class=\"nb\">set<\/span> veth1 netns <span class=\"nb\">test<\/span>\n\n<span class=\"c1\"># Configure the guest 
virtual interface in the test namespace<\/span>\nip netns <span class=\"nb\">exec<\/span> <span class=\"nb\">test<\/span> bash\nip addr add <span class=\"m\">10<\/span>.0.0.2\/24 dev veth1\nip link <span class=\"nb\">set<\/span> veth1 up\n\n<span class=\"c1\"># Start listening for TCP packets on port 2000<\/span>\nnc -l <span class=\"m\">2000<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p>On line 2 we use the extended, namespace-capable version of <code>ip<\/code> to create a new\nnamespace of the network kind, called <code>test<\/code>. We then create the pair of virtual\nethernet devices with the command on line 5. On line 12 we move the <code>veth1<\/code>\ndevice to the <code>test<\/code> namespace and, in order to configure it, we launch a bash\nsession inside <code>test<\/code> with the command on line 15. Once in the new namespace we\ncan see the <code>veth1<\/code> device again, which has now disappeared from the default\n(also known as <em>global<\/em>) namespace. You can check that by opening a new terminal\nand typing <code>ip link list<\/code>. The <code>veth1<\/code> device should have disappeared after the\nexecution of the command on line 12.<\/p>\n<p>We can then use <code>netcat<\/code> to listen to TCP packets being sent on port 2000 from\nwithin the new namespace (line 20). In a new bash session in the default\nnamespace, we can start <code>netcat<\/code> with<\/p>\n<div class=\"highlight\"><pre><span><\/span>nc 10.0.0.2 2000\n<\/pre><\/div>\n\n\n<p>to start sending packets to the new namespace <code>test<\/code> via the link between\n<code>veth0<\/code> and <code>veth1<\/code>. 
Everything that you type should now be echoed by the bash\nsession in the <code>test<\/code> namespace after you press Enter.<\/p>\n<h1 id=\"putting-it-all-together\">Putting It All Together<\/h1>\n<p>Now let's see how to put all the stuff we have discussed thus far together to\nhandcraft some more (better) containers.<\/p>\n<h2 id=\"process-containment-for-chroot-jails\">Process Containment for <code>chroot<\/code> Jails<\/h2>\n<p>With our first attempt at manually crafting a container with <code>chroot<\/code>, we\ndiscovered a few weaknesses of a different nature that made the result quite\nleaky. Let's try to address some of those issues, for instance the fact that all\nthe processes running on the host system are visible from within the container.\nTo this end, we shall make use of the Ubuntu Base image that we used in the\n<code>chroot<\/code> section. We then combine <code>chroot<\/code> with namespaces in the following way.\nAssuming that you have created the <code>test<\/code> network namespace as described in the\nprevious section, run<\/p>\n<div class=\"highlight\"><pre><span><\/span>unshare --fork -p -u ip netns <span class=\"nb\">exec<\/span> <span class=\"nb\">test<\/span> chroot ubuntu-latest\n<\/pre><\/div>\n\n\n<p>The <code>--fork<\/code> switch is required by the <code>-p<\/code> switch because we want to spawn a\nnew bash session with PID 1, rather than within the calling process. The <code>-u<\/code>\nswitch will give us a new UTS namespace, so that we are free to change the\nhostname without affecting the host system. We then use <code>ip<\/code> to start the\nUbuntu Base <code>chroot<\/code> jail inside the <code>test<\/code> network namespace.<\/p>\n<p>The first improvement is now evident. 
From inside the <code>chroot<\/code> jail, mount the\n<code>proc<\/code> file system with, e.g.<\/p>\n<div class=\"highlight\"><pre><span><\/span>mount -t proc proc \/proc\n<\/pre><\/div>\n\n\n<p>and then look at the output of <code>ps -ef<\/code>:<\/p>\n<div class=\"highlight\"><pre><span><\/span># ps -ef\nUID        PID  PPID  C STIME TTY          TIME CMD\nroot         1     0  0 11:54 ?        00:00:00 \/bin\/bash -i\nroot         8     1  0 11:56 ?        00:00:00 ps -ef\n<\/pre><\/div>\n\n\n<p>The bash session that we started inside the <code>chroot<\/code> jail has PID 1 and the <code>ps<\/code>\ntool from the Ubuntu Base distribution has PID 8 and parent PID 1, i.e. the\n<code>chroot<\/code> jail. Those are all the processes that we can see from here! If we try to\nidentify this bash shell from the global namespace we find something like this<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ ps -ef | grep unshare | grep -v grep\nroot      5829  4957  0 12:54 pts\/1    00:00:00 sudo unshare --fork -p -u ip netns exec test chroot ubuntu-latest\nroot      5830  5829  0 12:54 pts\/1    00:00:00 unshare --fork -p -u ip netns exec test chroot ubuntu-latest\n<\/pre><\/div>\n\n\n<p>PIDs in your case will quite likely be different, but the point here is that,\nwith namespaces, we have broken the assumption that a process has a <em>unique<\/em>\nprocess ID.<\/p>\n<h2 id=\"wall-fortification\">Wall Fortification<\/h2>\n<p>Whilst the process view problem has been solved (we can no longer kill the host\nprocesses since we cannot see them), the fact that the <code>chroot<\/code> jail runs as\nroot still leaves us with the <em>jailbreak<\/em> issue. To fix this we just use\nnamespaces again the way they were meant to be used originally. Recall that,\nwhen they were introduced, namespaces were of just one kind: mount. 
In fact,\nback then, <em>namespace<\/em> was a synonym for <em>mount<\/em> namespace.<\/p>\n<p>The other ingredient that is needed to actually secure against jail breaking is\nthe <code>pivot_root<\/code> system call. At first sight it might look like <code>chroot<\/code>, but it\nis quite different. It allows you to move the old root to a new location and use\na new mount point as the new root for the calling process.<\/p>\n<p>The key here is the combination of <code>pivot_root<\/code> and a namespace of the mount\nkind, which allows us to specify a new root and manipulate the mount points that\nare visible inside the container that we want to create, without messing about\nwith the host mount points. So here is the general idea and the steps required:<\/p>\n<ol>\n<li>Start a shell session from a shell executable inside the root file system in\na mount namespace.<\/li>\n<li>Unmount all the current mount points, including that of type <code>proc<\/code>.<\/li>\n<li>Turn the Ubuntu Base root file system into a (bind) mount point.<\/li>\n<li>Use <code>pivot_root<\/code> and <code>chroot<\/code> to swap the new root with the old one.<\/li>\n<li>Unmount the new location of the old root to conceal the full host file\nsystem.<\/li>\n<\/ol>\n<p>The above steps can be performed with the following initialisation script.<\/p>\n<div class=\"highlight\"><pre><span><\/span>umount -a\numount \/proc\nmount --bind ubuntu-latest\/ ubuntu-latest\/\n<span class=\"nb\">cd<\/span> ubuntu-latest\/\n<span class=\"nb\">test<\/span> -d old-root <span class=\"o\">||<\/span> mkdir old-root\npivot_root . old-root\/\n<span class=\"nb\">exec<\/span> chroot . \/bin\/bash --init-file &lt;<span class=\"o\">(<\/span>mount -t proc proc \/proc <span class=\"o\">&amp;&amp;<\/span> umount -l \/old-root<span class=\"o\">)<\/span>\n<\/pre><\/div>\n\n\n<p>Copy and paste these lines in a file, e.g. 
<code>cnt-init.sh<\/code>, and then run<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo unshare --fork -p -u -m ubuntu-latest\/bin\/bash --init-file cnt-init.sh\n<\/pre><\/div>\n\n\n<p>You can now check that the <code>\/old-root<\/code> folder is empty, meaning that we now have\nno way of accessing the full host file system, but only the corner that\ncorresponds to the content of the new root, i.e. the content of the\n<code>ubuntu-latest<\/code> folder. Furthermore, you can go on and check that our recipe for\nbreaking out of a vanilla <code>chroot<\/code> jail does not work in this case, because the\njail itself is now an effective, rather than apparent, root!<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>We have come to the end of this journey across the features of the Linux kernel\nthat make containers possible. I hope this has given you a better understanding\nof what many people mean when they say that containers are like virtual\nmachines, but are <em>not<\/em> virtual machines.<\/p>\n<p>Whilst spinning up containers by hand can be fun, and quite likely an interesting\neducational experience for many, actually producing something that is robust\nand secure enough requires some effort. Even in our last examples there are many\nthings that need to be improved, starting from the fact that we would want to\navoid giving control of our containers to users as root. Despite all our efforts\nto improve containment of resources, a user logged in as root can still do some\nnasty things (open lower-numbered ports and that sort of business...). 
The\npoint here is that, if you need containers for production environments, you\nshould turn to well tested and established technologies, like LXC, Docker etc...\n.<\/p>","category":[{"@attributes":{"term":"Technology"}},{"@attributes":{"term":"containers"}},{"@attributes":{"term":"linux"}}]},{"title":"Extending Python with Assembly","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/extending-python-with-assembly.html","rel":"alternate"}},"published":"2018-03-24T00:32:00+01:00","updated":"2018-03-24T00:32:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2018-03-24:\/extending-python-with-assembly.html","summary":"<p>What's a better way to fill an empty evening if not by reading about how to extend Python with Assembly? I bet you don't even know where to start to answer this question :P. But if you're curious to know how you can use another language to extend Python, and if you happen to like Assembly programming, you might end up actually enjoying this post (I hope!).<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><\/li>\n<li><a href=\"#the-code\">The Code<\/a><ul>\n<li><a href=\"#shared-object\">Shared Object<\/a><\/li>\n<li><a href=\"#the-cpython-headers\">The CPython Headers<\/a><\/li>\n<li><a href=\"#exporting-global-symbols\">Exporting Global Symbols<\/a><\/li>\n<li><a href=\"#immutable-strings\">Immutable Strings<\/a><\/li>\n<li><a href=\"#cpython-data-structures\">CPython Data Structures<\/a><\/li>\n<li><a href=\"#local-and-global-functions\">Local and Global Functions<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#installation\">Installation<\/a><ul>\n<li><a href=\"#assembling-and-linking\">Assembling and Linking<\/a><\/li>\n<li><a href=\"#how-to-test-the-module\">How to Test the Module<\/a><\/li>\n<li><a href=\"#distributing-the-module\">Distributing the Module<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a 
href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>If you have landed on this page, you must have had one of only two possible\nreactions to the title of this post, either \"Hmm, this sounds interesting\" or\n\"Just, why?\". The straight answer is, well, \"just, because\". A slightly more\narticulate answer is: because the people in the first category probably\nenjoy this kind of thing :).<\/p>\n<p>Reactions aside, the subject of this post is the coding of an extension for\nPython written in pure Assembly for the Intel x86-64 architecture on a\nLinux-based operating system. If you are familiar with assembly in general but have\nnever coded for the architecture that we are targeting, it is perhaps worth\nreading through my previous post \"<a href=\"https:\/\/p403n1x87.github.io\/getting-started-with-x86-64-assembly-on-linux.html\">Getting Started with x86-64 Assembly on\nLinux<\/a>\".<\/p>\n<p>I will also assume that you are somewhat familiar with extending Python with C.\nIf not, then it is probably a good idea to go through the <a href=\"https:\/\/docs.python.org\/3\/extending\/extending.html\">official\ndocumentation<\/a> before\nreading on, or some things might not make too much sense. The approach of this\npost is by example and builds on knowledge about C to transition to Assembly. My\nfavourite assembler on Linux is NASM, since it supports the Intel syntax, the\none that I am most comfortable with. Therefore the only dependencies for\nfollowing along are the NASM assembler and the GNU linker <code>ld<\/code>. Optionally, we\ncan make use of a <code>Makefile<\/code> to assemble and link our code, and perhaps <code>docker<\/code>\nto test it in a clean environment. 
You will find all the relevant files in the\nlinked <a href=\"https:\/\/github.com\/P403n1x87\/asm\/tree\/master\/python\">GitHub repository<\/a>.<\/p>\n<p>Now it's time to jump straight into the code.<\/p>\n<h1 id=\"the-code\">The Code<\/h1>\n<p>There isn't much more to say before we can see the code really, so here it is.\nThis is the content of my <code>asm.asm<\/code> source file.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span class=\"normal\">34<\/span>\n<span class=\"normal\">35<\/span>\n<span class=\"normal\">36<\/span>\n<span class=\"normal\">37<\/span>\n<span class=\"normal\">38<\/span>\n<span class=\"normal\">39<\/span>\n<span class=\"normal\">40<\/span>\n<span class=\"normal\">41<\/span>\n<span class=\"normal\">42<\/span>\n<span class=\"normal\">43<\/span>\n<span 
class=\"normal\">44<\/span>\n<span class=\"normal\">45<\/span>\n<span class=\"normal\">46<\/span>\n<span class=\"normal\">47<\/span>\n<span class=\"normal\">48<\/span>\n<span class=\"normal\">49<\/span>\n<span class=\"normal\">50<\/span>\n<span class=\"normal\">51<\/span>\n<span class=\"normal\">52<\/span>\n<span class=\"normal\">53<\/span>\n<span class=\"normal\">54<\/span>\n<span class=\"normal\">55<\/span>\n<span class=\"normal\">56<\/span>\n<span class=\"normal\">57<\/span>\n<span class=\"normal\">58<\/span>\n<span class=\"normal\">59<\/span>\n<span class=\"normal\">60<\/span>\n<span class=\"normal\">61<\/span>\n<span class=\"normal\">62<\/span>\n<span class=\"normal\">63<\/span>\n<span class=\"normal\">64<\/span>\n<span class=\"normal\">65<\/span>\n<span class=\"normal\">66<\/span>\n<span class=\"normal\">67<\/span>\n<span class=\"normal\">68<\/span>\n<span class=\"normal\">69<\/span>\n<span class=\"normal\">70<\/span>\n<span class=\"normal\">71<\/span>\n<span class=\"normal\">72<\/span>\n<span class=\"normal\">73<\/span>\n<span class=\"normal\">74<\/span>\n<span class=\"normal\">75<\/span>\n<span class=\"normal\">76<\/span>\n<span class=\"normal\">77<\/span>\n<span class=\"normal\">78<\/span>\n<span class=\"normal\">79<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"nf\">DEFAULT<\/span><span class=\"w\">                 <\/span><span class=\"nv\">rel<\/span><span class=\"w\"><\/span>\n\n<span class=\"cp\">%include                &quot;asm\/python.inc&quot;<\/span>\n\n<span class=\"k\">GLOBAL<\/span><span class=\"w\">                  <\/span><span class=\"nv\">PyInit_asm<\/span><span class=\"p\">:<\/span><span class=\"nv\">function<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span 
class=\"nv\">.rodata<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nf\">l_sayit_name<\/span><span class=\"w\">            <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;sayit&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">l_sayit_doc<\/span><span class=\"w\">             <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;This method has something important to say.&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">l_sayit_msg<\/span><span class=\"w\">             <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;Assembly is great fun! :)&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"w\"><\/span>\n<span class=\"no\">l_sayit_msg_len<\/span><span class=\"w\">         <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg<\/span><span class=\"w\"><\/span>\n\n<span class=\"nf\">l_module_name<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;asm&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; 
---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">l_asm_methods:<\/span><span class=\"w\">              <\/span><span class=\"c1\">;; struct PyMethodDef[] *<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">ISTRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_name<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_name<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_meth<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">asm_sayit<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_flags<\/span><span class=\"w\">   <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">METH_NOARGS<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_doc<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_doc<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">IEND<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">NullMethodDef<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">l_asm_module:<\/span><span class=\"w\">                <\/span><span class=\"c1\">;; struct PyModuleDef 
*<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">ISTRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_base<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef_HEAD_INIT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_name<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_module_name<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_doc<\/span><span class=\"w\">      <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_size<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_methods<\/span><span class=\"w\">  <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_asm_methods<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_slots<\/span><span class=\"w\">    <\/span><span 
class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_traverse<\/span><span class=\"w\"> <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_clear<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_free<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">IEND<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">asm_sayit:<\/span><span class=\"w\"> <\/span><span class=\"c1\">;; ----------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">push<\/span><span class=\"w\">  <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">               
         <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rsp<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">                  <\/span><span class=\"c1\">; SYS_WRITE<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">                  <\/span><span class=\"c1\">; STDOUT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdx<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">Py_None<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">inc<\/span><span class=\"w\">   <\/span><span class=\"kt\">QWORD<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span 
class=\"nb\">rax<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyObject.ob_refcnt<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">pop<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">ret<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; end asm_sayit<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"nl\">PyInit_asm:<\/span><span class=\"w\"> <\/span><span class=\"c1\">;; --------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">push<\/span><span class=\"w\">  <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rsp<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">PYTHON_API_VERSION<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_asm_module<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">call<\/span><span class=\"w\">  <\/span><span class=\"nv\">PyModule_Create2<\/span><span class=\"w\"> <\/span><span class=\"ow\">WRT<\/span><span class=\"w\"> <\/span><span class=\"nv\">..plt<\/span><span class=\"w\"><\/span>\n\n<span 
class=\"w\">                        <\/span><span class=\"nf\">pop<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">ret<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; end PyInit_asm<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p>If you have never written a C extension for Python before, this might look a bit\nmysterious to you, although the general structure, at least, should be quite\nclear after you've glanced through the official Python documentation on\nextending Python with C.<\/p>\n<p>We shall now analyse every single part of the above code sample in detail to\nsee what each block of code does.<\/p>\n<h2 id=\"shared-object\">Shared Object<\/h2>\n<p>The very first line of the source reads<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nf\">DEFAULT<\/span><span class=\"w\">                 <\/span><span class=\"nv\">rel<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Our goal is to assemble and link our code into an ELF64 <em>shared object<\/em> file.\nContrary to ordinary program code, shared object files are dynamically loaded\nat arbitrary memory addresses. It is therefore important that all our code is\n<em>position-independent<\/em>. One way of doing this is to make sure that any memory\nreference is not absolute, but relative to the value of the <code>RIP<\/code> register,\nwhich points to the current instruction being executed. This guarantees that, no\nmatter where the shared object is loaded into memory, references to local\nvariables are correct. In 64-bit mode, NASM defaults to absolute addresses,\ntherefore the above line is necessary to switch to <code>RIP<\/code>-relative addresses.<\/p>\n<h2 id=\"the-cpython-headers\">The CPython Headers<\/h2>\n<p>On line 3 we include a file into our main Assembly source. 
Given the simplicity of\nthis example, we could have included the content of the <code>python.inc<\/code> file within\n<code>asm.asm<\/code> itself. However, for larger projects it is perhaps good practice to\nseparate declarations and actual code, as is usually done in C, with <code>.h<\/code>\nand <code>.c<\/code> files. In fact, the <code>python.inc<\/code> file includes the equivalent of\nstructures and macros as declared in the CPython header files. As far as I'm\naware, there are no assembly-specific include files provided by the maintainers\nof CPython, so we have to go through the extra effort of typing them ourselves.\nWe will get back to the content of this file later on.<\/p>\n<h2 id=\"exporting-global-symbols\">Exporting Global Symbols<\/h2>\n<p>Line 5 is an important one. It exports the symbol <code>PyInit_asm<\/code>, of type\n<code>function<\/code>, and makes it available to external programs. This is the function\nthat CPython calls the moment we load the <code>asm<\/code> module with <code>import asm<\/code> from\nthe Python interpreter. If we do not export this symbol, then CPython won't be\nable to find the code necessary to initialise the module. 
In analogy with C,\nthis is equivalent to declaring a non-static function.<\/p>\n<h2 id=\"immutable-strings\">Immutable Strings<\/h2>\n<p>Next we have the read-only data section<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span class=\"nv\">.rodata<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nf\">l_sayit_name<\/span><span class=\"w\">            <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;sayit&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">l_sayit_doc<\/span><span class=\"w\">             <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;This method has something important to say.&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">l_sayit_msg<\/span><span class=\"w\">             <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;Assembly is great fun! 
:)&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"w\"><\/span>\n<span class=\"no\">l_sayit_msg_len<\/span><span class=\"w\">         <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg<\/span><span class=\"w\"><\/span>\n\n<span class=\"nf\">l_module_name<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;asm&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Here we initialise the strings that we will need later on. As they won't change\nduring the course of the code execution, we put them in a read-only section of\nthe shared object. The GNU C compiler does just the same thing with every\nliteral string that you use in C code. 
You will notice references to their\naddress in the following section, that of (read-write) initialised data.<\/p>\n<h2 id=\"cpython-data-structures\">CPython Data Structures<\/h2>\n<p>Next is the section of initialised data.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">l_asm_methods:<\/span><span class=\"w\">              <\/span><span class=\"c1\">;; struct PyMethodDef[] *<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">ISTRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_name<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_name<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_meth<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">asm_sayit<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_flags<\/span><span class=\"w\">   <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">METH_NOARGS<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  
<\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef.ml_doc<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_doc<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">IEND<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">NullMethodDef<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">l_asm_module:<\/span><span class=\"w\">               <\/span><span class=\"c1\">;; struct PyModuleDef *<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">ISTRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_base<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef_HEAD_INIT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_name<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_module_name<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_doc<\/span><span class=\"w\">      <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_size<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span 
class=\"mi\">1<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_methods<\/span><span class=\"w\">  <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_asm_methods<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_slots<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_traverse<\/span><span class=\"w\"> <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_clear<\/span><span class=\"w\">    <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">at<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef.m_free<\/span><span class=\"w\">     <\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">dq<\/span><span class=\"w\"> <\/span><span class=\"nv\">NULL<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">IEND<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Here is where things start to get interesting, and the content of the\n<code>python.inc<\/code> file comes into play. The first two labels point to the beginning\nof CPython-specific structures. 
The first is an array of <code>PyMethodDef<\/code>\nstructures. As the name suggests, each instance of this structure is used to\nhold information about a method that should be made available to the Python\ninterpreter from within our module. To find out in which header file it is\ndefined, we can use the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>grep -nr \/usr\/include\/python3.6 -e <span class=\"s2\">&quot;struct PyMethodDef&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>In my case, I get that the structure is defined in\n<code>\/usr\/include\/python3.6\/methodobject.h<\/code>, starting from line 54. Inside the\n<code>python.inc<\/code> we then have the equivalent structure declaration<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">STRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyMethodDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ml_name<\/span><span class=\"w\">              <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; const char *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ml_meth<\/span><span class=\"w\">              <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; PyCFunction<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ml_flags<\/span><span class=\"w\">             <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; int<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ml_doc<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; const char *<\/span><span 
class=\"w\"><\/span>\n<span class=\"k\">ENDSTRUC<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>The <code>NullMethodDef<\/code> is a NASM macro that conveniently defines the <em>sentinel<\/em>\n<code>PyMethodDef<\/code> structure, which is used to mark the end of the <code>PyMethodDef<\/code>\narray pointed to by <code>l_asm_methods<\/code>. Its definition is also in the <code>python.inc<\/code>\nfile and, as you can see, simply initialises a new instance of the structure\nwith all the fields set to <code>NULL<\/code> or 0, depending on their semantics, i.e.\nwhether they are memory pointers or general integers.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">%define NullMethodDef         dq NULL, NULL, 0, NULL<\/span>\n<\/pre><\/div>\n\n\n<blockquote>\n<p>Note that <code>NULL<\/code> is not a native NASM value. To align the coding conventions\nwith C, I have defined NULL as a constant in <code>python.inc<\/code> and assigned the\nvalue of 0 to it. The idea is that, like in C, it makes the intent of the code\nclearer, since any occurrence of <code>NULL<\/code> indicates a null pointer rather than\njust the literal value 0.<\/p>\n<\/blockquote>\n<p>The next label, <code>l_asm_module<\/code>, points to an instance of the <code>PyModuleDef<\/code>\nstructure, which is pretty much the core data structure of our Python module. It\ncontains all the relevant metadata that is then passed to CPython for correct\ninitialisation and use of the module. 
Its definition is in the <code>moduleobject.h<\/code>\nheader file and, at first sight, looks a bit complicated, with some references\nto other structures and C macros.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">typedef<\/span><span class=\"w\"> <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">PyModuleDef<\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">PyModuleDef_Base<\/span><span class=\"w\"> <\/span><span class=\"n\">m_base<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"kt\">char<\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">m_name<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">const<\/span><span class=\"w\"> <\/span><span class=\"kt\">char<\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">m_doc<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">Py_ssize_t<\/span><span class=\"w\"> <\/span><span class=\"n\">m_size<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">PyMethodDef<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"n\">m_methods<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">struct<\/span><span class=\"w\"> <\/span><span class=\"nc\">PyModuleDef_Slot<\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">m_slots<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">traverseproc<\/span><span class=\"w\"> <\/span><span class=\"n\">m_traverse<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  
<\/span><span class=\"n\">inquiry<\/span><span class=\"w\"> <\/span><span class=\"n\">m_clear<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">freefunc<\/span><span class=\"w\"> <\/span><span class=\"n\">m_free<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"> <\/span><span class=\"n\">PyModuleDef<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>So let's take our time to figure out what its byte content looks like. The first\nfield is an instance of the <code>PyModuleDef_Base<\/code> structure, which is defined in\nthe same header file, just a few lines above. The non-trivial bit in this new\nstructure is the first part, <code>PyObject_HEAD<\/code>, which looks like a C macro. As the\nname suggests, its definition is quite likely to be found in <code>object.h<\/code>. Indeed,\nthere we find<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#define PyObject_HEAD                   PyObject ob_base;<\/span>\n<\/pre><\/div>\n\n\n<p>so our chase continues. The definition of the <code>PyObject<\/code> structure can be found\na few lines below. Again, all the fields are quite simple, i.e. just integer\nvalues or memory pointers, except for the macro <code>_PyObject_HEAD_EXTRA<\/code>. We then\nhave to jump back up a few lines, to find that this macro is conditionally\ndefined as either nothing or a pair of extra pointer fields. 
By default, the macro <code>Py_TRACE_REFS<\/code> is\nnot defined, so in our case <code>_PyObject_HEAD_EXTRA<\/code> evaluates to nothing.\nBacktracking from our macro chase in CPython headers, we see that we can define\nthe following structures in <code>python.inc<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">STRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyObject<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ob_refcnt<\/span><span class=\"w\">            <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; Py_ssize_t<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ob_type<\/span><span class=\"w\">              <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; struct _typeobject *<\/span><span class=\"w\"><\/span>\n<span class=\"k\">ENDSTRUC<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">STRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef_Base<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.ob_base<\/span><span class=\"w\">              <\/span><span class=\"nv\">resb<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyObject_size<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_init<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; PyObject *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_index<\/span><span class=\"w\">              <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; Py_ssize_t<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_copy<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; PyObject *<\/span><span class=\"w\"><\/span>\n<span class=\"k\">ENDSTRUC<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">STRUC<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_base<\/span><span class=\"w\">               <\/span><span class=\"nv\">resb<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyModuleDef_Base_size<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_name<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; const char *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_doc<\/span><span class=\"w\">                <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; const char *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_size<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; Py_ssize_t<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_methods<\/span><span class=\"w\">            <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; PyMethodDef *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_slots<\/span><span class=\"w\">              <\/span><span 
class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; struct PyModuleDef_Slot *<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_traverse<\/span><span class=\"w\">           <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; traverseproc<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_clear<\/span><span class=\"w\">              <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; inquiry<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nf\">.m_free<\/span><span class=\"w\">               <\/span><span class=\"nv\">resq<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">    <\/span><span class=\"c1\">; freefunc<\/span><span class=\"w\"><\/span>\n<span class=\"k\">ENDSTRUC<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>As you can easily guess, NASM generates the constants <code>PyObject_size<\/code> etc...\nautomatically so that they can be used to reserve enough memory to hold the\nentire structure in the definition of other structures. 
This makes nesting quite\neasy to implement in NASM.<\/p>\n<h2 id=\"local-and-global-functions\">Local and Global Functions<\/h2>\n<p>Finally we get to the actual code that will get executed by CPython when our\nmodule is loaded and initialised, and when the methods that it provides are\ncalled.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\">                 <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; ---------------------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">asm_sayit:<\/span><span class=\"w\"> <\/span><span class=\"c1\">;; ----------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">push<\/span><span class=\"w\">  <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rsp<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">                  <\/span><span class=\"c1\">; SYS_WRITE<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">                  <\/span><span class=\"c1\">; STDOUT<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdx<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_sayit_msg_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">Py_None<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">inc<\/span><span class=\"w\">   <\/span><span class=\"kt\">QWORD<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"nb\">rax<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"nv\">PyObject.ob_refcnt<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">pop<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">ret<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; end asm_sayit<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"nl\">PyInit_asm:<\/span><span class=\"w\"> <\/span><span class=\"c1\">;; --------------------------------------------------------------<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">push<\/span><span class=\"w\">  <\/span><span 
class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rsp<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">PYTHON_API_VERSION<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">mov<\/span><span class=\"w\">   <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">l_asm_module<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">call<\/span><span class=\"w\">  <\/span><span class=\"nv\">PyModule_Create2<\/span><span class=\"w\"> <\/span><span class=\"ow\">WRT<\/span><span class=\"w\"> <\/span><span class=\"nv\">..plt<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">                        <\/span><span class=\"nf\">pop<\/span><span class=\"w\">   <\/span><span class=\"nb\">rbp<\/span><span class=\"w\"><\/span>\n<span class=\"w\">                        <\/span><span class=\"nf\">ret<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;; end PyInit_asm<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>In fact, we have a total of two functions, one local and one global. The first\none, <code>asm_sayit<\/code>, is the only method contained in our module. All it does is to\nwrite a string, <code>l_sayit_msg<\/code>, to standard output by invoking the <code>SYS_WRITE<\/code>\nsystem call. Perhaps the most interesting bit of this function is the code on\nlines 61-62. This is the idiom for any function that wishes to return <code>None<\/code> in\nPython. 
Recall that, in Python, <code>None<\/code> is an object instantiated by CPython. As\nsuch, our shared library needs to import it as an external symbol. This is why\nyou will find the macro <code>Py_None<\/code> defined as<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">%define Py_None               _Py_NoneStruct<\/span>\n<\/pre><\/div>\n\n\n<p>together with the line<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">EXTERN<\/span><span class=\"w\">    <\/span><span class=\"nv\">_Py_NoneStruct<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>in the <code>python.inc<\/code> file. This is equivalent to the two lines<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">PyAPI_DATA<\/span><span class=\"p\">(<\/span><span class=\"n\">PyObject<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">_Py_NoneStruct<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"cm\">\/* Don&#39;t use this directly *\/<\/span><span class=\"w\"><\/span>\n<span class=\"cp\">#define Py_None (&amp;_Py_NoneStruct)<\/span>\n<\/pre><\/div>\n\n\n<p>in the <code>object.h<\/code> header file, where the <code>None<\/code> object is defined. All of this\nexplains line 61, but what about line 62? This has to do with <a href=\"https:\/\/docs.python.org\/3\/c-api\/refcounting.html\">Reference\nCounting<\/a>. In a nutshell,\nevery object created in Python comes with a counter that keeps track of all the\nreferences attached to it. When the counter gets down to 0, the object can be\nde-allocated from memory and resources freed for other objects to use. This is\nhow Python, which heavily relies on <code>malloc<\/code>, can keep memory leaks at bay. It\nis therefore very important to properly maintain reference counts in Python\nextensions. As <code>None<\/code> is a Python object like any other, when we return a\nreference to it, we have to bump its reference count. 
In C, this is conveniently\ndone with the <code>Py_INCREF<\/code> macro. Its definition is in <code>object.h<\/code> and, as is\neasy to guess, it just increases the <code>ob_refcnt<\/code> field of the <code>PyObject<\/code>\nstructure. This is precisely what we do on line 62.<\/p>\n<blockquote>\n<p><strong>Stack Frames Matter!<\/strong> You might be wondering why we are taking care of\ncreating a stack frame on function entry, and cleaning up after ourselves on\nexit. The reason is a pretty obvious one: we don't know what code will call\nours, so it is safe to make sure that stack alignment is preserved across\ncalls by doing what every function is expected to do. When I was laying down\nthe code for this post, I was getting a SIGSEGV, and the debugger\nrevealed that the instruction <code>movaps<\/code> was trying to store the value of the\n<code>xmm0<\/code> register at a memory address that was not a multiple of 16. The\nproblem was solved by the extra 8 bytes from <code>push rbp<\/code>.<\/p>\n<\/blockquote>\n<p>The second and last function is our global exported symbol <code>PyInit_asm<\/code>. It gets\ncalled by CPython as soon as we <code>import<\/code> the module with <code>import asm<\/code>. In this\nsimple case, we don't have to do much here. In fact, all we have to do is call a\nstandard CPython function and pass it the instance of <code>PyModuleDef<\/code> allocated at\n<code>l_asm_module<\/code>. As we have briefly seen, this contains all the information about\nour module, from the documentation to the list of methods.<\/p>\n<p>Now, if you have read through the official documentation on how to extend Python\nwith C, you might be wondering why we are calling <code>PyModule_Create2<\/code> instead of\n<code>PyModule_Create<\/code> (is there a typo?), and why we are passing it two arguments\ninstead of one. 
If you are starting to smell a C macro, then you are correct!\nLong story short, <code>PyModule_Create<\/code> is a macro defined as<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#define PyModule_Create(module) PyModule_Create2(module, PYTHON_API_VERSION)<\/span>\n<\/pre><\/div>\n\n\n<p>with <code>PYTHON_API_VERSION<\/code> defined as the literal 1013. So the actual function to\ncall is indeed <code>PyModule_Create2<\/code>, and it takes two arguments.<\/p>\n<blockquote>\n<p>Did you notice that weird <code>WRT ..plt<\/code>? Remember the discussion about ensuring\nposition-independent code? Since we have no clue of where the\n<code>PyModule_Create2<\/code> function resides in memory, we have to rely on some sort of\nindirection. This is provided by the so-called <em>Procedure Linkage Table<\/em>, or\n<em>PLT<\/em> for short, which is some code that is part of our shared library. When\nwe call <code>PyModule_Create2 WRT ..plt<\/code>, we are jumping to the PLT section of our\nobject file in memory, which contains the necessary code to make the actual\njump to the function that we want to call.<\/p>\n<\/blockquote>\n<h1 id=\"installation\">Installation<\/h1>\n<p>Once our assembly code is ready, it needs to be assembled and linked into a\nshared object file. We will now see how to perform these steps, and how to test\nand install our Python extension.<\/p>\n<h2 id=\"assembling-and-linking\">Assembling and Linking<\/h2>\n<p>Once the code is ready, it needs to be assembled and linked into the final\nshared object file. The NASM assembler is invoked with minimal arguments as<\/p>\n<div class=\"highlight\"><pre><span><\/span>nasm -f elf64 -o asm\/asm.o asm\/asm.asm\n<\/pre><\/div>\n\n\n<p>This creates an intermediate object file <code>asm.o<\/code>. 
To create the final shared\nobject file, we use the GNU linker with the following arguments<\/p>\n<div class=\"highlight\"><pre><span><\/span>ld -shared -o asm\/asm.so asm\/asm.o -I\/lib64\/ld-linux-x86-64.so.2\n<\/pre><\/div>\n\n\n<p>Note the use of the <code>-shared<\/code> switch, which instructs the linker to create a\nshared object file.<\/p>\n<h2 id=\"how-to-test-the-module\">How to Test the Module<\/h2>\n<p>The first thing that you might want to do is to manually test that the shared\nobject file works fine with Python. For CPython to be able to find the module,\nwe need to ensure that its location is included in the search path. One way is\nto add it to the <code>PYTHONPATH<\/code> environment variable. For example, from within the\nproject folder, we can launch Python with<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nv\">PYTHONPATH<\/span><span class=\"o\">=<\/span>.\/asm python3\n<\/pre><\/div>\n\n\n<p>and from the interactive session we should be able to import the module with<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"kn\">import<\/span> <span class=\"nn\">asm<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>Alternatively, we can add the search path to <code>sys.path<\/code> with these few lines of\nPython code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">path<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;.\/asm&quot;<\/span><span class=\"p\">)<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"kn\">import<\/span> <span class=\"nn\">asm<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>Once we have successfully imported our module 
in Python, we can test that its\nmethod <code>sayit<\/code> works as expected<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"n\">asm<\/span><span class=\"o\">.<\/span><span class=\"n\">sayit<\/span><span class=\"o\">.<\/span><span class=\"vm\">__doc__<\/span>\n<span class=\"s1\">&#39;This method has something important to say.&#39;<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"n\">asm<\/span><span class=\"o\">.<\/span><span class=\"n\">sayit<\/span><span class=\"p\">()<\/span>\n<span class=\"n\">Assembly<\/span> <span class=\"ow\">is<\/span> <span class=\"n\">great<\/span> <span class=\"n\">fun<\/span><span class=\"err\">!<\/span> <span class=\"p\">:)<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>I hope that you would agree :).<\/p>\n<h2 id=\"distributing-the-module\">Distributing the Module<\/h2>\n<p>The simplicity of our sample module wouldn't justify the use of <code>setuptools<\/code> for\ndistribution. In this case, an old-fashioned Makefile is the simplest\nsolution to go for. Even for larger projects, you would probably still delegate\nthe build job of your code to a Makefile anyway, which would then get called\nfrom your <code>setup.py<\/code> at some phase, perhaps during <code>build<\/code>. However, the\nrecommended standard is that you build <em>wheels<\/em> instead of <em>eggs<\/em>, and the\nrequirement is that you provide pre-built binaries with your package.<\/p>\n<p>This being said, let's see how to distribute the module. As we have seen in the\nprevious section, the shared object needs to reside in one of the Python search\npaths. The easiest way to find out what these paths are on the platform that you\nare targeting is to launch the Python interpreter and print the value of\n<code>sys.path<\/code>. 
On my platform, I get the following output<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">Python<\/span> <span class=\"mf\">3.6.3<\/span> <span class=\"p\">(<\/span><span class=\"n\">default<\/span><span class=\"p\">,<\/span> <span class=\"n\">Oct<\/span>  <span class=\"mi\">3<\/span> <span class=\"mi\">2017<\/span><span class=\"p\">,<\/span> <span class=\"mi\">21<\/span><span class=\"p\">:<\/span><span class=\"mi\">45<\/span><span class=\"p\">:<\/span><span class=\"mi\">48<\/span><span class=\"p\">)<\/span>\n<span class=\"p\">[<\/span><span class=\"n\">GCC<\/span> <span class=\"mf\">7.2.0<\/span><span class=\"p\">]<\/span> <span class=\"n\">on<\/span> <span class=\"n\">linux<\/span>\n<span class=\"n\">Type<\/span> <span class=\"s2\">&quot;help&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;copyright&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s2\">&quot;credits&quot;<\/span> <span class=\"ow\">or<\/span> <span class=\"s2\">&quot;license&quot;<\/span> <span class=\"k\">for<\/span> <span class=\"n\">more<\/span> <span class=\"n\">information<\/span><span class=\"o\">.<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span> <span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">path<\/span>\n<span class=\"p\">[<\/span><span class=\"s1\">&#39;&#39;<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;\/usr\/lib\/python36.zip&#39;<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;\/usr\/lib\/python3.6&#39;<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;\/usr\/lib\/python3.6\/lib-dynload&#39;<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;\/usr\/local\/lib\/python3.6\/dist-packages&#39;<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;\/usr\/lib\/python3\/dist-packages&#39;<\/span><span class=\"p\">,<\/span> <span 
class=\"s1\">&#39;\/usr\/lib\/python3.6\/dist-packages&#39;<\/span><span class=\"p\">]<\/span>\n<span class=\"o\">&gt;&gt;&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>The Makefile could then contain the following line inside the <code>install<\/code> rule:<\/p>\n<div class=\"highlight\"><pre><span><\/span>install: default\n    cp asm\/asm.so \/usr\/lib\/python<span class=\"si\">${<\/span><span class=\"nv\">PYTHON_TARGET<\/span><span class=\"si\">}<\/span>\/asm.so\n<\/pre><\/div>\n\n\n<p>with the environment variable <code>PYTHON_TARGET<\/code> set to <code>3.6<\/code>.<\/p>\n<p>To automate the building and testing process of the module, we could use Docker\nto build an image out of the target platform and trigger a build, and perhaps\nexecute some unit tests too. A simple Dockerfile that does the minimum work to\nbuild and test would look something like the following<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">FROM<\/span><span class=\"w\">  <\/span><span class=\"s\">ubuntu:latest<\/span>\n<span class=\"k\">USER<\/span><span class=\"w\">  <\/span><span class=\"s\">root<\/span>\n<span class=\"k\">ADD<\/span><span class=\"w\">   <\/span>. 
asm\n<span class=\"k\">RUN<\/span><span class=\"w\">   <\/span>apt-get update              <span class=\"o\">&amp;&amp;<\/span><span class=\"se\">\\<\/span>\n      apt-get install -y            <span class=\"se\">\\<\/span>\n        nasm                        <span class=\"se\">\\<\/span>\n        python3-pytest              <span class=\"se\">\\<\/span>\n        build-essential\n<span class=\"k\">ENV<\/span><span class=\"w\">   <\/span><span class=\"nv\">PYTHON_TARGET<\/span><span class=\"o\">=<\/span><span class=\"m\">3<\/span>.5\n<span class=\"k\">RUN<\/span><span class=\"w\">   <\/span><span class=\"nb\">cd<\/span> asm                      <span class=\"o\">&amp;&amp;<\/span><span class=\"se\">\\<\/span>\n      make                        <span class=\"o\">&amp;&amp;<\/span><span class=\"se\">\\<\/span>\n      make install                <span class=\"o\">&amp;&amp;<\/span><span class=\"se\">\\<\/span>\n      python3 -m pytest -s\n<\/pre><\/div>\n\n\n<p>As you can see, we are targeting the latest stable version of Ubuntu, which\ncomes with <code>python3.5<\/code>. We make sure we install all the required dependencies,\nthe assembler and the standard build tools, along with <code>python3-pytest<\/code> to\nperform unit testing once our module builds successfully.<\/p>\n<p>The bare minimum that we can test is that the import of the module works fine\nand that we can call its method. 
So a possible <code>test_asm.py<\/code> test script would\nlook like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">asm<\/span>\n\n\n<span class=\"k\">def<\/span> <span class=\"nf\">test_asm<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">asm<\/span><span class=\"o\">.<\/span><span class=\"n\">sayit<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>Whilst I appreciate that the cases where you'd want to seriously consider\nextending Python with Assembly code are rare, it is undoubtedly the case that,\nif you enjoy experimenting with code, this could be a fun and instructive\nexperience. In my case, this has forced me to go look into the CPython header\nfiles, which I probably wouldn't have done had I been using C. I now know more about\nthe internal workings of Python and have a clearer idea of how CPython is structured.<\/p>\n<p>As always, I hope you have enjoyed the read. Happy Assembly coding! :)<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"assembly"}}]},{"title":"IoT with WebSockets and Python's AsyncIO","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/iot-with-websockets-and-pythons-asyncio.html","rel":"alternate"}},"published":"2018-03-03T22:49:00+01:00","updated":"2018-03-03T22:49:00+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2018-03-03:\/iot-with-websockets-and-pythons-asyncio.html","summary":"<p>After <a href=\"https:\/\/p403n1x87.github.io\/a-gentle-introduction-to-iot.html\">a gentle introduction to the concept of IoT<\/a> and what it entails, we take a dive into WebSockets and Asynchronous I\/O in Python to explore other ways of controlling devices over a network. 
This post uses a simple two LED circuit to introduce WebSockets, and how to use them in Python together with the <code>asyncio<\/code> module.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><\/li>\n<li><a href=\"#connecting-devices-over-a-network-websockets\">Connecting Devices over a Network: WebSockets<\/a><ul>\n<li><a href=\"#coroutines\">Coroutines<\/a><\/li>\n<li><a href=\"#the-asyncio-module\">The asyncio module<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#setting-things-up\">Setting Things Up<\/a><ul>\n<li><a href=\"#the-circuitry\">The Circuitry<\/a><\/li>\n<li><a href=\"#the-server-code\">The Server Code<\/a><\/li>\n<li><a href=\"#the-client-code\">The Client Code<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>In this post, we will get back to the topic of IoT to introduce two new\ntechnologies by example: <strong>WebSockets<\/strong> and <strong>Asynchronous I\/O<\/strong> in Python. The\nproject that will allow us to explore them is a simple system of two LEDs that\nwe will control with the gravity sensor of an Android device via a Raspberry Pi\nover a network.<\/p>\n<p>The post is divided into two parts. The first one is theoretical in nature and\nwill cover the essential technical details of the two main subjects, that is\n<em>WebSockets<\/em> and Python's <em>asyncio<\/em>. In the second part we will have a look at\nthe circuit that we are going to control over the network and discuss the server\nand client code.<\/p>\n<p>But before we dive into the study of the topics of this post, it is perhaps best\nto first have a high level view of how all the pieces fit together, thus\nmotivating the choices of technologies mentioned above. 
The idea of the project\nstems from the following scenario: suppose you want to build a device that can\nbe controlled over the network by, e.g., a hand-held device like a phone. As we\nhave seen in <a href=\"https:\/\/p403n1x87.github.io\/a-gentle-introduction-to-iot.html\">a previous post<\/a>, one way of\nachieving this is by using a single-board computer like a Raspberry Pi and\ncontrolling it via a web server.<\/p>\n<p>Now, what if you wanted more control over the connection method to the device,\nand perhaps a bi-directional channel to let data from, e.g., sensors on the\ndevice flow upstream to the controlling device? An elegant way of achieving\nthis goal these days is with <strong>WebSockets<\/strong>, which allow (full) duplex\ncommunication between pairs of connected devices. So, for instance, we could\nhave a WebSocket server running on a Raspberry Pi, and have a native application\non an Android device run a WebSocket client. Control commands can then flow\nfrom the Android device to the single-board computer, while the server can feed\ndata from any sensors that the device is equipped with back to the client, with\njust a single connection.<\/p>\n<p>The code samples that we will look at towards the end of the post implement a\nPython WebSocket server that will run on, e.g., a Raspberry Pi, and a native\nAndroid application that will act as a WebSocket client to control a pair of\nLEDs mounted on a breadboard. 
The data that we will transfer from the Android\ndevice to our circuit comes from a gravity sensor, so that when we twist the\nhand-held device clockwise or anti-clockwise, a different LED will turn on.\nFurthermore, the brightness will depend on how much the device is tilted.<\/p>\n<p>So, read on to find out more!<\/p>\n<h1 id=\"connecting-devices-over-a-network-websockets\">Connecting Devices over a Network: WebSockets<\/h1>\n<p>In <a href=\"https:\/\/p403n1x87.github.io\/a-gentle-introduction-to-iot.html\">A Gentle Introduction to IoT<\/a>, we saw\nhow to control a single LED over a network by running a web server hosting the\ncontrolling web application. This was a simple web page that displayed a button\nthat not only showed the current state of the LED, but also allowed us to control it by\nturning it on or off. All we needed to do was to point a web browser to the web\nserver running on the Raspberry Pi and play around with the only web page on it.<\/p>\n<p>There might be situations where we are not happy with a web server and\nactually want more control over the way devices connect with each other. In a\nsimple client-server relationship, surely TCP\/IP sockets spring to mind, but\nwhat if we want to allow for data to flow in both directions?<\/p>\n<p>Contrary to ordinary sockets, WebSockets offer a full-duplex communication\nchannel over a single TCP connection. They are compatible with the HTTP\nprotocol, but make use of their own protocol (the <em>WebSocket<\/em> protocol), which\nis switched to by including an HTTP Upgrade header in the HTTP handshake. By\ndefault, they are supposed to operate on the standard HTTP ports (80 for HTTP\nand 443 for secured HTTP), but a different port can be used for custom purposes, as we\nwill see in our example. 
The advantage over other solutions is that\nWebSockets have been designed to allow for fast bi-directional communication\nbetween client and server via a TCP connection that is kept open until manually\nclosed (or dropped for other reasons), but without the overheads of HTTP\nheaders.<\/p>\n<p>By design, WebSockets are then the perfect tool for exchanging short <em>messages<\/em>\nbetween devices. As to what kind of messages we can send back and forth, we will\nsee that we can send either <em>text<\/em> messages or raw <em>binary<\/em> data, by\noperating the WebSocket with either <em>text<\/em> or <em>binary<\/em> frames.<\/p>\n<p>Having justified the use of WebSockets for our project, we now have to motivate\nthe use of the other mentioned technology: <code>asyncio<\/code>.<\/p>\n<p>Like in <a href=\"https:\/\/p403n1x87.github.io\/a-gentle-introduction-to-iot.html\">A Gentle Introduction to IoT<\/a>,\nthe choice of Python is motivated by the use of the Raspberry Pi. In Python,\nWebSocket support is provided by the\n<a href=\"https:\/\/pypi.python.org\/pypi\/websockets\"><code>websockets<\/code><\/a> module, which is built\non top of <a href=\"https:\/\/docs.python.org\/3\/library\/asyncio.html\"><code>asyncio<\/code><\/a>.<\/p>\n<p>Here is where we encounter the first constraint though, since <code>asyncio<\/code> was\nintroduced in Python 3.4. We then have to ensure that we are using\nPython 3.4 or later. But our code can also vary depending\non whether we are using Python 3.5 or later. That is because version 3.5 of\nPython introduces the <code>async<\/code> and <code>await<\/code> syntax for natively defining\ncoroutines (<a href=\"https:\/\/www.python.org\/dev\/peps\/pep-0492\/\">PEP 492<\/a>).<\/p>\n<p>In the last part of this post we will have a look at Python code samples written\nwith the standard <code>asyncio<\/code> syntax, as well as the new one introduced by PEP\n492. 
For now, we shall have a quick overview of the new features that are\nbrought to Python 3 by the <code>asyncio<\/code> module, so that we can familiarise ourselves with it and\nits usage.<\/p>\n<h2 id=\"coroutines\">Coroutines<\/h2>\n<p>I believe that the best way to understand <code>asyncio<\/code> is to first look at the\nbasic concepts involved and recall the notion of <code>coroutine<\/code>. Python has had\nthe concept of <em>Generators<\/em> since version 2.2. Generators look and feel like\nnormal functions, but rather than <em>returning<\/em> a value, they <em>yield<\/em> one, and are\nnormally used in loops to provide iterators.<\/p>\n<p>The typical scenario where you'd want to opt for a generator rather than a\nfunction is when you have to keep track of some state in between the different\nvalues returned. A simple example is a generator that generates the first <span class=\"math\">\\(n\\)<\/span>\nFibonacci numbers:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">fibonacci<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">n<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span> <span class=\"k\">return<\/span>\n\n    <span class=\"k\">yield<\/span> <span class=\"mi\">0<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">n<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">1<\/span><span class=\"p\">:<\/span> <span class=\"k\">return<\/span>\n\n    <span class=\"k\">yield<\/span> <span class=\"mi\">1<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">n<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">2<\/span><span class=\"p\">:<\/span> <span class=\"k\">return<\/span>\n\n    <span class=\"n\">a0<\/span><span class=\"p\">,<\/span> <span class=\"n\">a1<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span 
class=\"mi\">1<\/span>\n    <span class=\"n\">i<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">2<\/span>\n\n    <span class=\"k\">while<\/span> <span class=\"n\">i<\/span> <span class=\"o\">&lt;<\/span> <span class=\"n\">n<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">a2<\/span> <span class=\"o\">=<\/span> <span class=\"n\">a0<\/span> <span class=\"o\">+<\/span> <span class=\"n\">a1<\/span>\n        <span class=\"k\">yield<\/span> <span class=\"n\">a2<\/span>\n        <span class=\"n\">a0<\/span><span class=\"p\">,<\/span> <span class=\"n\">a1<\/span> <span class=\"o\">=<\/span> <span class=\"n\">a1<\/span><span class=\"p\">,<\/span> <span class=\"n\">a2<\/span>\n        <span class=\"n\">i<\/span> <span class=\"o\">+=<\/span> <span class=\"mi\">1<\/span>\n<\/pre><\/div>\n\n\n<p>If <span class=\"math\">\\(n=0\\)<\/span>, the generator doesn't yield any number and we can then return. When\n<span class=\"math\">\\(n=1\\)<\/span>, the generator must yield the first Fibonacci number only, which is 0, and\nso on. The difference between <code>return<\/code> and <code>yield<\/code> is that the former terminates\nthe iteration, while the latter allows it to continue. To better understand what\nis going on here, observe that a call to <code>fibonacci<\/code>, like <code>f = fibonacci(10)<\/code>\ndoesn't return a Fibonacci number (you might at first expect this to return the\nfirst Fibonacci number), but a <em>generator object<\/em> instead, that is, something\nthat we can use with a <code>for<\/code> loop, or any function that expects an iterable\nobject, like <code>sum<\/code>. 
The first two Fibonacci numbers can then be generated with<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">f<\/span> <span class=\"o\">=<\/span> <span class=\"n\">fibonacci<\/span><span class=\"p\">(<\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span>\n<span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">)<\/span>\n<span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"nb\">next<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">))<\/span>\n<span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"nb\">next<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">))<\/span>\n<span class=\"c1\"># Expected output<\/span>\n<span class=\"c1\"># &lt;generator object fibonacci at 0x7f2e0940baf0&gt;<\/span>\n<span class=\"c1\"># 0<\/span>\n<span class=\"c1\"># 1<\/span>\n<\/pre><\/div>\n\n\n<p>but if we now try to print yet another Fibonacci number, we get a\n<code>StopIteration<\/code> exception, which signals that the generator has returned and\nthat there are no more values to be generated:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"nb\">next<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">))<\/span>\n<span class=\"c1\"># Expected output<\/span>\n<span class=\"c1\"># Traceback (most recent call last):<\/span>\n<span class=\"c1\">#   File &quot;&lt;stdin&gt;&quot;, line 5, in &lt;module&gt;<\/span>\n<span class=\"c1\">#     print(next(f))<\/span>\n<span class=\"c1\">#<\/span>\n<span class=\"c1\"># StopIteration<\/span>\n<\/pre><\/div>\n\n\n<p>The following code snippet illustrates the use of a generator with a <code>for<\/code> loop<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">test_cases<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span 
class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">,<\/span> <span class=\"mi\">2<\/span><span class=\"p\">,<\/span> <span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">20<\/span><span class=\"p\">]<\/span>\n\n<span class=\"k\">for<\/span> <span class=\"n\">t<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">test_cases<\/span><span class=\"p\">:<\/span>\n    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Test case: n = <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">t<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">f<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">fibonacci<\/span><span class=\"p\">(<\/span><span class=\"n\">t<\/span><span class=\"p\">):<\/span> <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>I am omitting the expected output to avoid cluttering the page, but it should be\nquite clear what the above code is supposed to do. 
To produce a more compact\nresult we can do something like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">sfibonacci<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">f<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">fibonacci<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span> <span class=\"k\">yield<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">f<\/span><span class=\"p\">)<\/span>\n\n<span class=\"k\">for<\/span> <span class=\"n\">t<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">test_cases<\/span><span class=\"p\">:<\/span>\n    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Test case: n = <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">t<\/span><span class=\"p\">))<\/span>\n    <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot; &quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">sfibonacci<\/span><span class=\"p\">(<\/span><span class=\"n\">t<\/span><span class=\"p\">)))<\/span>\n\n<span class=\"c1\"># Expected output:<\/span>\n<span class=\"c1\">#<\/span>\n<span class=\"c1\"># Test case: n = 0<\/span>\n<span class=\"c1\">#<\/span>\n<span class=\"c1\"># Test case: n = 1<\/span>\n<span class=\"c1\"># 0<\/span>\n<span class=\"c1\"># Test case: n = 2<\/span>\n<span class=\"c1\"># 0 1<\/span>\n<span class=\"c1\"># Test case: n = 3<\/span>\n<span class=\"c1\"># 0 1 1<\/span>\n<span class=\"c1\"># Test case: n = 10<\/span>\n<span class=\"c1\"># 0 1 1 2 3 5 8 13 21 34<\/span>\n<span class=\"c1\"># Test case: n = 20<\/span>\n<span class=\"c1\"># 0 1 1 2 3 5 8 13 21 34 55 89 144 233 
377 610 987 1597 2584 4181<\/span>\n<\/pre><\/div>\n\n\n<p>which also shows how to construct a generator out of another generator.<\/p>\n<p>From the above example we deduce that the <code>yield<\/code> keyword generates a value\nwhile also \"<em>pausing<\/em>\" the execution of the function, until another value is\nrequested from it. When that happens, the execution resumes from the instruction that\ncomes right after the <code>yield<\/code>, with all the values of local variables (the\n<em>state<\/em>) preserved.<\/p>\n<p><em>Generators<\/em> are also known as <em>semicoroutines<\/em>, and this leads us to talk about\n<em>coroutines<\/em>. This is a more general concept, because it encompasses those code\nelements that not only pass a value to the caller, but also receive and process\na value passed to them by another code element (e.g. a generator, or another\ncoroutine). This definition looks a bit circular, and this is due to the fact\nthat, with coroutines, the relation is not one of caller and callee, but symmetric. In\nmore concrete terms, we are talking about functions that can retain a state in\nbetween invocations, and that can call other functions, suspending and\nresuming execution from certain points of the code.<\/p>\n<p>Starting with Python 2.5, coroutines have become an integral part of the\nlanguage. They can be easily constructed with <code>yield<\/code>, which has been turned\ninto an <em>expression<\/em>. 
The value that <code>yield<\/code> evaluates to is passed to the coroutine\nwith a call to <code>send<\/code> on the generator object, like so<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">coroutine<\/span><span class=\"p\">():<\/span>\n    <span class=\"n\">c<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span>\n    <span class=\"k\">while<\/span> <span class=\"kc\">True<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">text<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield<\/span>\n        <span class=\"n\">c<\/span> <span class=\"o\">+=<\/span> <span class=\"mi\">1<\/span>\n        <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;#<\/span><span class=\"si\">{}<\/span><span class=\"s2\"> : <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">c<\/span><span class=\"p\">,<\/span> <span class=\"n\">text<\/span><span class=\"p\">))<\/span>\n\n<span class=\"n\">coro<\/span> <span class=\"o\">=<\/span> <span class=\"n\">coroutine<\/span><span class=\"p\">()<\/span>\n<span class=\"nb\">next<\/span><span class=\"p\">(<\/span><span class=\"n\">coro<\/span><span class=\"p\">)<\/span>  <span class=\"c1\"># Prime the coroutine: run it up to the first yield<\/span>\n<span class=\"n\">coro<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Hello&quot;<\/span><span class=\"p\">)<\/span>\n<span class=\"n\">coro<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;World&quot;<\/span><span class=\"p\">)<\/span>\n<span class=\"c1\"># Expected output<\/span>\n<span class=\"c1\"># #1 : Hello<\/span>\n<span class=\"c1\"># #2 : World<\/span>\n<\/pre><\/div>\n\n\n<p>If you are looking at code like the above for the first time, you might be wondering\nwhy I have included a call to <code>next<\/code>. 
Remember from our discussion on generators\nthat, contrary to normal functions, a call to <code>coroutine<\/code> returns a generator\nobject and not the first generated value. After creation, a coroutine needs to\nbe <em>primed<\/em>, that is, it needs to be started so that it can execute its code\nuntil the first occurrence of the <code>yield<\/code> expression. The coroutine then halts\nand waits for values to be sent to it. In Python, there are two equivalent ways\nof priming a coroutine: either call <code>next<\/code>, as we have done in the code above,\nor send it a <code>None<\/code> with <code>coro.send(None)<\/code>.<\/p>\n<blockquote>\n<p>Since it is quite easy to forget to prime a coroutine, a good idea is to\ndefine a decorator, e.g. <code>@coroutine<\/code>, that creates, primes and returns a\ncoroutine.<\/p>\n<\/blockquote>\n<p>If you want to know more about coroutines, I recommend that you have a look at\nDavid Beazley's <a href=\"http:\/\/www.dabeaz.com\/coroutines\/\">A Curious Course on Coroutines and\nConcurrency<\/a>. Here, I have just stated the\nessential details that we are going to need for our project. These should be\nenough to convince you that, with coroutines in Python, we can implement\nsingle-threaded concurrency, which offers an ideal ground for asynchronous I\/O\noperations, without the overhead of many context switches between different\nthreads.<\/p>\n<h2 id=\"the-asyncio-module\">The <code>asyncio<\/code> module<\/h2>\n<p>If you have ever done any sort of I\/O before, like reading from\/writing to a\nfile, or creating a socket and waiting for a connection, etc., you will surely\nknow that most I\/O operations are blocking. For example, if you are\ntrying to read from a file descriptor by making a <code>SYS_READ<\/code> system call, your\ncode will hand control over to the OS until your request can be honoured. 
The\nnormal execution flow then resumes.<\/p>\n<p>The problem with this wait is that, in most cases, you don't know when there\nwill be enough data available from the file descriptor to read. Your application\nthen halts while it could be doing something useful instead.<\/p>\n<p>The typical workaround is to poll the file descriptor periodically, and only\nread from it when data is actually available. As you can easily imagine, the\nsolutions come in different patterns, and every time you have to deal with this\nthere is some boilerplate code that you would have to write. This amounts to\nwriting your own event loop to cycle through your tasks, which include I\/O polling.\nWouldn't it be nice if we were provided with such boilerplate code encapsulated\nin a module that we can use whenever we need to perform asynchronous I\/O\noperations?<\/p>\n<p>This is what the people behind <code>asyncio<\/code> must have thought, and that's why\nthe essential element that this Python module offers is an <strong>event loop<\/strong>. This\nis designed to register and schedule <strong>Tasks<\/strong> which, in just a few words, can\nbe described as <em>objects decorating coroutines<\/em> (hence, in practice, they are\ncoroutines). Some of these tasks might involve I\/O operations, and some of these\noperations might be blocking. By continuously polling for the I\/O status of file\ndescriptors, sockets, etc., <code>asyncio<\/code> allows you to write single-threaded\nconcurrent code that performs I\/O.<\/p>\n<p>In this post, we are interested in working with WebSockets so, as an example of\nwhat we have just seen, we can play around with some WebSocket servers and\nclients. In Python, we can make use of the <code>websockets<\/code> module, which builds\nits functionality on top of <code>asyncio<\/code>. 
The <a href=\"https:\/\/pypi.python.org\/pypi\/websockets\">Getting\nStarted<\/a> page on PyPI shows how simple\nit is to create a WebSocket client, and an echo server to test it. The\nfollowing examples are based on them, with some slight modifications\nand a twist: the client sends a string read from STDIN and the echo server\nreverses it.<\/p>\n<p>So here is the server code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"ch\">#!\/usr\/bin\/env python3<\/span>\n\n<span class=\"c1\"># Echo server. Will reverse everything we throw at it.<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">websockets<\/span>\n\n<span class=\"k\">async<\/span> <span class=\"k\">def<\/span> <span class=\"nf\">echo<\/span><span class=\"p\">(<\/span><span class=\"n\">websocket<\/span><span class=\"p\">,<\/span> <span class=\"n\">path<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">async<\/span> <span class=\"k\">for<\/span> <span class=\"n\">message<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">websocket<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">await<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"n\">message<\/span><span class=\"p\">[::<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">])<\/span>\n\n<span class=\"n\">loop<\/span> <span class=\"o\">=<\/span> <span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span>\n<span class=\"n\">loop<\/span><span class=\"o\">.<\/span><span class=\"n\">run_until_complete<\/span><span class=\"p\">(<\/span><span class=\"n\">websockets<\/span><span class=\"o\">.<\/span><span class=\"n\">serve<\/span><span class=\"p\">(<\/span><span class=\"n\">echo<\/span><span class=\"p\">,<\/span> <span 
class=\"s1\">&#39;localhost&#39;<\/span><span class=\"p\">,<\/span> <span class=\"mi\">8765<\/span><span class=\"p\">))<\/span>\n<span class=\"n\">loop<\/span><span class=\"o\">.<\/span><span class=\"n\">run_forever<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>and here is the client code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"ch\">#!\/usr\/bin\/env python3<\/span>\n\n<span class=\"c1\"># Client. Sends stuff from STDIN to the server.<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">websockets<\/span>\n\n<span class=\"k\">async<\/span> <span class=\"k\">def<\/span> <span class=\"nf\">hello<\/span><span class=\"p\">(<\/span><span class=\"n\">uri<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">async<\/span> <span class=\"k\">with<\/span> <span class=\"n\">websockets<\/span><span class=\"o\">.<\/span><span class=\"n\">connect<\/span><span class=\"p\">(<\/span><span class=\"n\">uri<\/span><span class=\"p\">)<\/span> <span class=\"k\">as<\/span> <span class=\"n\">websocket<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">await<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Hello world!&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">async<\/span> <span class=\"k\">for<\/span> <span class=\"n\">message<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">websocket<\/span><span class=\"p\">:<\/span>\n            <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">message<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">await<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"nb\">input<\/span><span class=\"p\">())<\/span>\n\n<span class=\"n\">asyncio<\/span><span 
class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">run_until_complete<\/span><span class=\"p\">(<\/span><span class=\"n\">hello<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;ws:\/\/localhost:8765&#39;<\/span><span class=\"p\">))<\/span>\n<\/pre><\/div>\n\n\n<p>And that's where we make our first encounter with the new <code>async<\/code>\/<code>await<\/code>\nkeywords, introduced by the already cited PEP 492. The new syntax <code>async def<\/code> is\nused to declare a <em>native coroutine<\/em> in Python 3.5 and later versions. Perhaps\nmore interesting are <code>async with<\/code> and <code>async for<\/code>. The former introduces native\nasynchronous context managers for classes that define the new magic methods\n<code>__aenter__<\/code> and <code>__aexit__<\/code>, but apart from this, its usage is analogous to the\nsynchronous counterpart <code>with<\/code>. The new syntax <code>async for<\/code> is used to consume\nasynchronous iterables, i.e. instances of classes that implement the <code>__aiter__<\/code>\nmagic method. When the <code>await<\/code> keyword is on its own, it defines an expression\nthat suspends the coroutine it appears in until the awaitable passed as its\nargument completes. In the case of the code above, we just wait for the\n<code>websocket<\/code> object to complete the task of sending the data.<\/p>\n<p>After the above discussion, the code for the client application should be quite\nclear. The first call to <code>websocket.send<\/code> is used to <em>prime<\/em> the socket, so that\n<code>async for message in websocket<\/code> won't hang indefinitely, waiting for something\nto show up on the socket's reading end.<\/p>\n<p>Before we look at how to rewrite the above code snippets for Python 3.4, where\nwe do not have native coroutines, I would like to briefly comment on the last\ntwo lines of code of the server application. 
The first time I came across those\nlines, they left me wondering why we need to call both\n<code>run_until_complete<\/code> and <code>run_forever<\/code>. Wouldn't the former alone suffice? If\nyou recall our discussion about blocking I\/O operations and the necessity of\nconstantly polling in order not to halt the execution of our application, you\nwill realise that the call to <code>run_until_complete<\/code> will register the socket with the\nI\/O polling task. Hence, apart from this task, we have nothing else running, and\nif we do not keep the event loop running, the newly created socket won't be checked and\nthe application simply terminates. The last line is there to ensure that we keep\nmonitoring the socket for new incoming connections. When a client connects, a\nnew task is scheduled to serve the connection with the passed handler, which\nmust be a coroutine. This can be verified by peeking at the source code of both\n<a href=\"https:\/\/github.com\/python\/cpython\/tree\/a19fb3c6aaa7632410d1d9dcb395d7101d124da4\/Lib\/asyncio\"><code>asyncio<\/code><\/a>\nand <a href=\"https:\/\/github.com\/aaugustin\/websockets\"><code>websockets<\/code><\/a>.<\/p>\n<p>And now to the Python 3.4 version of the above code snippets. There are a few\nrules of thumb that we can use to convert from 3.5 and later to 3.4, where we\ndon't have <code>await<\/code> and <code>async<\/code>. The first is the use of the decorator\n<code>@asyncio.coroutine<\/code> instead of <code>async def<\/code>. The situation is a bit more\ncomplicated for <code>async with<\/code>, since it requires a replacement for <code>await<\/code>, namely\n<code>yield from<\/code>. 
The latter is substantially equivalent to<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"k\">for<\/span> <span class=\"n\">a<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">foo<\/span><span class=\"p\">():<\/span> <span class=\"k\">yield<\/span> <span class=\"n\">a<\/span>    <span class=\"c1\"># Same as: yield from foo()<\/span>\n<\/pre><\/div>\n\n\n<p>With this in mind, <code>async with<\/code> can be coded with a more traditional <code>try ...\nfinally<\/code> block as<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># async with coro() as foo:<\/span>\n<span class=\"c1\">#     # &lt;code&gt;<\/span>\n<span class=\"c1\">#     pass<\/span>\n<span class=\"n\">foo<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">coro<\/span><span class=\"p\">()<\/span>\n<span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n    <span class=\"c1\"># &lt;code&gt;<\/span>\n    <span class=\"k\">pass<\/span>\n<span class=\"k\">finally<\/span><span class=\"p\">:<\/span>\n    <span class=\"c1\"># Whatever coro().__aexit__() would have done.<\/span>\n<\/pre><\/div>\n\n\n<p>The <code>async for<\/code> loop would translate to something like the following<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># async for foo in bar():<\/span>\n<span class=\"c1\">#     # &lt;code&gt;<\/span>\n<span class=\"c1\">#     pass<\/span>\n<span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n    <span class=\"k\">while<\/span> <span class=\"kc\">True<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">foo<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">bar<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"fm\">__anext__<\/span><span class=\"p\">()<\/span>\n        <span class=\"c1\"># &lt;code&gt;<\/span>\n<span class=\"k\">except<\/span> <span class=\"ne\">StopAsyncIteration<\/span><span class=\"p\">:<\/span>\n    
<span class=\"k\">return<\/span>\n<\/pre><\/div>\n\n\n<p>This coding is not strict, since you could, or you might have to replace the\ncode of <code>bar().__anext__()<\/code> with the actual coding inside this coroutine. This\nis the case for the code that we are about to see, in which we perform the above\ntranslations. The server code for Python 3.4 is the following<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"ch\">#!\/usr\/bin\/env python3<\/span>\n\n<span class=\"c1\"># Echo server. Will reverse everything we throw at it.<\/span>\n<span class=\"c1\"># For Python 3.4<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">websockets<\/span>\n\n<span class=\"nd\">@asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">coroutine<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">echo<\/span><span class=\"p\">(<\/span><span class=\"n\">websocket<\/span><span class=\"p\">,<\/span> <span class=\"n\">path<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">while<\/span> <span class=\"kc\">True<\/span><span class=\"p\">:<\/span>\n        <span class=\"n\">message<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">recv<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"n\">message<\/span><span class=\"p\">[::<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">])<\/span>\n\n<span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">run_until_complete<\/span><span class=\"p\">(<\/span><span class=\"n\">websockets<\/span><span class=\"o\">.<\/span><span 
class=\"n\">serve<\/span><span class=\"p\">(<\/span><span class=\"n\">echo<\/span><span class=\"p\">,<\/span> <span class=\"s1\">&#39;localhost&#39;<\/span><span class=\"p\">,<\/span> <span class=\"mi\">8765<\/span><span class=\"p\">))<\/span>\n<span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">run_forever<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>while the client code now looks like this<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"ch\">#!\/usr\/bin\/env python3<\/span>\n\n<span class=\"c1\"># Client. Sends stuff from STDIN to the server.<\/span>\n<span class=\"c1\"># For Python 3.4<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">asyncio<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">websockets<\/span>\n\n<span class=\"nd\">@asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">coroutine<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">hello<\/span><span class=\"p\">(<\/span><span class=\"n\">uri<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">websocket<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">websockets<\/span><span class=\"o\">.<\/span><span class=\"n\">connect<\/span><span class=\"p\">(<\/span><span class=\"n\">uri<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Hello world!&quot;<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">while<\/span> <span class=\"kc\">True<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">message<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span 
class=\"o\">.<\/span><span class=\"n\">recv<\/span><span class=\"p\">()<\/span>\n            <span class=\"nb\">print<\/span><span class=\"p\">(<\/span><span class=\"n\">message<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">send<\/span><span class=\"p\">(<\/span><span class=\"nb\">input<\/span><span class=\"p\">())<\/span>\n    <span class=\"k\">finally<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">close<\/span><span class=\"p\">()<\/span>\n\n<span class=\"n\">asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">get_event_loop<\/span><span class=\"p\">()<\/span><span class=\"o\">.<\/span><span class=\"n\">run_until_complete<\/span><span class=\"p\">(<\/span><span class=\"n\">hello<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;ws:\/\/localhost:8765&#39;<\/span><span class=\"p\">))<\/span>\n<\/pre><\/div>\n\n\n<p>Note how a native coroutine, or one decorated with\n<code>@asyncio.coroutine<\/code> in Python 3.4, needs to be executed by an event loop from\nthe <code>asyncio<\/code> module.<\/p>\n<h1 id=\"setting-things-up\">Setting Things Up<\/h1>\n<p>We shall now present the circuit that makes up the device that we want to\ncontrol over the network, and both the server and client code that will allow us\nto use the gravity sensor of an Android device to operate it.<\/p>\n<h2 id=\"the-circuitry\">The Circuitry<\/h2>\n<p>This part of the post will be brief, because we are going to recycle part of the\nprevious post on IoT, A Gentle Introduction to IoT. In fact, we are going to\ndouble it up by making a circuit with two LEDs, each one following the same\ndesign as in that post. The idea is to turn either one or the other\non, depending on the rotation angle of our Android device. 
For example, if we\ntilt our device to the right, the green LED will become brighter, while when we\ntilt it to the left, the red LED will become brighter.<\/p>\n<p>So, based on the knowledge acquired with the previous post on IoT, the circuit\nthat we want to build will look like this.<\/p>\n<p><img alt=\"Double LED configuration\" src=\"https:\/\/p403n1x87.github.io\/images\/ws_asyncio\/ws_asyncio_bb.png\"><\/p>\n<p>For this project I am using a Raspberry Pi 3 Model B. As you can see from the\npicture above, we need two LEDs, preferably of different colours, e.g. green and\nred, and two 220 \u03a9 resistors. The green LED will be controlled via the BCM 18\n(8) pin (green jumper wire), while the red one is controlled via the BCM 5 (29)\npin (red jumper wire) of the Raspberry Pi. The black jumper wires indicate a\nconnection to a ground pin.<\/p>\n<h2 id=\"the-server-code\">The Server Code<\/h2>\n<p>The server code will run on the Raspberry Pi and will open a WebSocket server,\nlistening for connection requests. At the time of writing, Raspbian Jessie has\nPython 3.4.2 in its repository, so we cannot benefit from the native\ncoroutines offered by Python 3.5, unless we install this version manually.<\/p>\n<p>All the server code can be found inside the\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/tree\/master\/gravity_led\/server\">gravity_led\/server<\/a>\nfolder of the <a href=\"https:\/\/github.com\/P403n1x87\/iot\">iot<\/a> repository. Due to its\nlength I will refrain from embedding it in this post, but I will comment on the\nessential aspects.<\/p>\n<p>The code is based on an abstract WebSocket Server class, <code>WSServer<\/code>, contained\nin\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/blob\/master\/gravity_led\/server\/lib\/wss.py\">lib\/wss.py<\/a>.\nAn actual WebSocket server has to inherit from this class and implement the\n<code>handler<\/code> method, which is the one that bootstraps the logic of the server\napplication. 
This will be called as soon as the WebSocket server starts serving\nconnection requests.<\/p>\n<p>On creation, we can specify the address and port the server is to listen on, as\nwell as a possible limit on the number of simultaneous connections that the\nserver is allowed to serve. Why would such a limit ever be necessary? The reason\nis that, in this particular example, we only have one device, and this can only\nbe controlled by one client at a time. It makes no sense to accept more than one\nconnection. By setting the limit to 1, we prevent other clients from\nconnecting only to find out that the device is already being controlled by\nanother client. The way it is implemented, though, the class is flexible\nenough to allow changing this limit at run-time with a call to the\n<code>set_server_limit<\/code> method.<\/p>\n<p>After instantiation, we can run the server with a call to the <code>start<\/code> method,\nand we can check whether it is running at any time with <code>is_running<\/code>. Most of\nthe code in the former method provides logging information and a graceful\nshutdown when the user presses Ctrl+C to halt the server.<\/p>\n<p>The code contained in\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/blob\/master\/gravity_led\/server\/main.py\">main.py<\/a>\nis used to bootstrap the class <code>GyroListener<\/code> contained in\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/blob\/master\/gravity_led\/server\/gyro_listener.py\">gyro_listener.py<\/a>,\nwhich holds the actual logic of the server. 
As you can see, all that we have to\ndo is inherit from <code>WSServer<\/code> and implement the <code>handler<\/code> coroutine, as shown\nhere.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nd\">@asyncio<\/span><span class=\"o\">.<\/span><span class=\"n\">coroutine<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">handler<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">websocket<\/span><span class=\"p\">,<\/span> <span class=\"n\">path<\/span><span class=\"p\">,<\/span> <span class=\"n\">conn_id<\/span><span class=\"p\">):<\/span>\n    <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">logger<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Connection ID <\/span><span class=\"si\">{}<\/span><span class=\"s2\"> established [path <\/span><span class=\"si\">{}<\/span><span class=\"s2\">]&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">conn_id<\/span><span class=\"p\">,<\/span> <span class=\"n\">path<\/span><span class=\"p\">))<\/span>\n\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setmode<\/span><span class=\"p\">(<\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">BOARD<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setup<\/span><span class=\"p\">(<\/span><span class=\"n\">CHR<\/span><span class=\"p\">,<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">OUT<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setup<\/span><span class=\"p\">(<\/span><span class=\"n\">CHL<\/span><span class=\"p\">,<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">OUT<\/span><span 
class=\"p\">)<\/span>\n\n    <span class=\"n\">pr<\/span> <span class=\"o\">=<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">PWM<\/span><span class=\"p\">(<\/span><span class=\"n\">CHR<\/span><span class=\"p\">,<\/span> <span class=\"n\">FREQ<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">pl<\/span> <span class=\"o\">=<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">PWM<\/span><span class=\"p\">(<\/span><span class=\"n\">CHL<\/span><span class=\"p\">,<\/span> <span class=\"n\">FREQ<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">pr<\/span><span class=\"o\">.<\/span><span class=\"n\">start<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">pl<\/span><span class=\"o\">.<\/span><span class=\"n\">start<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">while<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">is_running<\/span><span class=\"p\">:<\/span>\n        <span class=\"k\">try<\/span><span class=\"p\">:<\/span>\n            <span class=\"n\">gyro_data<\/span> <span class=\"o\">=<\/span> <span class=\"k\">yield from<\/span> <span class=\"n\">websocket<\/span><span class=\"o\">.<\/span><span class=\"n\">recv<\/span><span class=\"p\">()<\/span>\n            <span class=\"n\">g<\/span>         <span class=\"o\">=<\/span> <span class=\"n\">gyro_data<\/span><span class=\"o\">.<\/span><span class=\"n\">split<\/span><span class=\"p\">()<\/span>\n            <span class=\"n\">val<\/span>       <span class=\"o\">=<\/span> <span class=\"nb\">int<\/span><span class=\"p\">(<\/span><span class=\"nb\">float<\/span><span class=\"p\">(<\/span><span class=\"n\">g<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]))<\/span>\n\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span 
class=\"n\">logger<\/span><span class=\"o\">.<\/span><span class=\"n\">debug<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Received datum <\/span><span class=\"si\">{}<\/span><span class=\"s2\">&quot;<\/span><span class=\"o\">.<\/span><span class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">val<\/span><span class=\"p\">))<\/span>\n            <span class=\"k\">if<\/span> <span class=\"n\">val<\/span> <span class=\"o\">&gt;<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">pr<\/span><span class=\"o\">.<\/span><span class=\"n\">ChangeDutyCycle<\/span><span class=\"p\">(<\/span><span class=\"n\">val<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">pl<\/span><span class=\"o\">.<\/span><span class=\"n\">ChangeDutyCycle<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">pr<\/span><span class=\"o\">.<\/span><span class=\"n\">ChangeDutyCycle<\/span><span class=\"p\">(<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">pl<\/span><span class=\"o\">.<\/span><span class=\"n\">ChangeDutyCycle<\/span><span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"n\">val<\/span><span class=\"p\">)<\/span>\n\n        <span class=\"k\">except<\/span> <span class=\"n\">websockets<\/span><span class=\"o\">.<\/span><span class=\"n\">exceptions<\/span><span class=\"o\">.<\/span><span class=\"n\">ConnectionClosed<\/span><span class=\"p\">:<\/span>\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">logger<\/span><span class=\"o\">.<\/span><span class=\"n\">info<\/span><span class=\"p\">(<\/span><span class=\"s2\">&quot;Connection ID <\/span><span class=\"si\">{}<\/span><span class=\"s2\"> closed.&quot;<\/span><span class=\"o\">.<\/span><span 
class=\"n\">format<\/span><span class=\"p\">(<\/span><span class=\"n\">conn_id<\/span><span class=\"p\">))<\/span>\n            <span class=\"n\">pr<\/span><span class=\"o\">.<\/span><span class=\"n\">stop<\/span><span class=\"p\">()<\/span>\n            <span class=\"k\">break<\/span>\n\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">cleanup<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>Here we initialise the Pi's GPIO and start listening for data over the socket\nfor as long as the server is running, or until the connection is closed by the client. The two\npins that we have chosen to use to control the LEDs (<code>CHR<\/code> and <code>CHL<\/code>) are set to\nPWM (Pulse Width Modulation) mode so that we can control the brightness: the\nmore the Android device is tilted in one direction, the brighter the LED for\nthat direction.<\/p>\n<p>Here we also notice the \"<em>contract<\/em>\" between the server and the client: the\nlatter will send the three coordinates of the gravity sensor as a\nspace-separated list of three floating point values. The server will split this\nstring and use the second component (the <em>y<\/em>-axis of the gravity sensor) to\ncontrol the LEDs. 
When the value is positive, the LED on the <code>CHR<\/code> pin will turn\non, with a duty cycle proportional to the value passed by the client; when the\nvalue is negative, the LED on the <code>CHL<\/code> pin will turn on instead, with a duty\ncycle proportional to its absolute value.<\/p>\n<p>To launch the server, run <code>main.py<\/code> and pass the IPv4 address to bind to and a port number,\nfor example<\/p>\n<div class=\"highlight\"><pre><span><\/span>python3 main.py <span class=\"m\">0<\/span>.0.0.0 <span class=\"m\">5678<\/span>\n<\/pre><\/div>\n\n\n<p>to listen on <em>any<\/em> IP address on the Raspberry Pi and accept connections from anywhere.<\/p>\n<h2 id=\"the-client-code\">The Client Code<\/h2>\n<p>For the client, we are going to develop a minimalist Android application, with\nthe same project structure that we have encountered in the previous post on\n<a href=\"https:\/\/p403n1x87.github.io\/android-development-from-the-command-line.html\">Android Development from the Command\nLine<\/a>. Again, the code is quite extensive,\nbut you can find it in the\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/tree\/master\/gravity_led\/client\">gravity_led\/client<\/a>\nfolder. Apart from the minimal Android project setup, with the\n<code>AndroidManifest.xml<\/code> and the resource files for the UI, all that we need is a\nsingle activity where we can specify the IP address and the port of the server\nto connect to, and a toggle button to start and close a connection to the\nserver.<\/p>\n<p>For dealing with WebSockets in Java, we make use of the <a href=\"https:\/\/mvnrepository.com\/artifact\/org.java-websocket\/Java-WebSocket\">Java\nWebSockets<\/a>\nlibrary, which offers the <code>WebSocketClient<\/code> abstract class. 
As you can see from\nthe code in\n<a href=\"https:\/\/github.com\/P403n1x87\/iot\/blob\/master\/gravity_led\/client\/src\/main\/java\/MainActivity.java\">MainActivity.java<\/a>,\nall that we have to do is extend the <code>WebSocketClient<\/code> class and implement a few\nmethods, which we just use for logging purposes. The actual signalling is done\nvia callbacks triggered by the on-board gravity sensor on the Android device:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"nd\">@Override<\/span>\n<span class=\"kd\">public<\/span> <span class=\"kt\">void<\/span> <span class=\"nf\">onSensorChanged<\/span><span class=\"p\">(<\/span><span class=\"kd\">final<\/span> <span class=\"n\">SensorEvent<\/span> <span class=\"n\">event<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n  <span class=\"kt\">float<\/span> <span class=\"n\">x<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">0<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n  <span class=\"kt\">float<\/span> <span class=\"n\">y<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">1<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n  <span class=\"kt\">float<\/span> <span class=\"n\">z<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">2<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n  <span class=\"kt\">float<\/span> <span class=\"n\">g<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"kt\">float<\/span><span class=\"p\">)<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">x<\/span><span class=\"o\">*<\/span><span 
class=\"n\">x<\/span> <span class=\"o\">+<\/span> <span class=\"n\">y<\/span><span class=\"o\">*<\/span><span class=\"n\">y<\/span> <span class=\"o\">+<\/span> <span class=\"n\">z<\/span><span class=\"o\">*<\/span><span class=\"n\">z<\/span><span class=\"p\">);<\/span>\n  <span class=\"k\">try<\/span> <span class=\"p\">{<\/span>\n    <span class=\"n\">webSocketClient<\/span><span class=\"p\">.<\/span><span class=\"na\">send<\/span><span class=\"p\">(<\/span><span class=\"n\">Float<\/span><span class=\"p\">.<\/span><span class=\"na\">toString<\/span><span class=\"p\">(<\/span><span class=\"n\">x<\/span><span class=\"o\">\/<\/span><span class=\"n\">g<\/span><span class=\"o\">*<\/span><span class=\"mi\">100<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"s\">&quot; &quot;<\/span> <span class=\"o\">+<\/span> <span class=\"n\">Float<\/span><span class=\"p\">.<\/span><span class=\"na\">toString<\/span><span class=\"p\">(<\/span><span class=\"n\">y<\/span><span class=\"o\">\/<\/span><span class=\"n\">g<\/span><span class=\"o\">*<\/span><span class=\"mi\">100<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"s\">&quot; &quot;<\/span> <span class=\"o\">+<\/span> <span class=\"n\">Float<\/span><span class=\"p\">.<\/span><span class=\"na\">toString<\/span><span class=\"p\">(<\/span><span class=\"n\">z<\/span><span class=\"o\">\/<\/span><span class=\"n\">g<\/span><span class=\"o\">*<\/span><span class=\"mi\">100<\/span><span class=\"p\">));<\/span>\n  <span class=\"p\">}<\/span> <span class=\"k\">catch<\/span> <span class=\"p\">(<\/span><span class=\"n\">WebsocketNotConnectedException<\/span> <span class=\"n\">e<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n    <span class=\"n\">Log<\/span><span class=\"p\">.<\/span><span class=\"na\">w<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;ctrl_ws_client&quot;<\/span><span class=\"p\">,<\/span> <span class=\"s\">&quot;Sensor updated but socket not 
connected&quot;<\/span><span class=\"p\">);<\/span>\n  <span class=\"p\">}<\/span>\n<span class=\"p\">}<\/span>\n<\/pre><\/div>\n\n\n<p>This method is part of the <code>MainActivity<\/code> class since we decided that this\nshould implement the <code>SensorEventListener<\/code> interface. When a connection is\nsuccessfully started, we register the instance of this class with the sensor, so\nthat when the value changes, a whitespace-separated list of the three\ncomponents of the gravity vector is sent in text form to the server via the\nWebSocket.<\/p>\n<p>We can use Gradle to build and deploy the application for testing on a device\nwith the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/gradlew installDebug\n<\/pre><\/div>\n\n\n<p>assuming that you have an Android device in developer mode connected to your\nbuild machine via USB. This is what it looks like on my Nexus 5<\/p>\n<p><img alt=\"ctrl_ws_client\" class=\"center-image\" src=\"https:\/\/p403n1x87.github.io\/images\/ws_asyncio\/ctrl_ws_client.png\"><\/p>\n<p>Before moving to the conclusion, just a quick note on the contents of the\n<code>AndroidManifest.xml<\/code> file. If you have had a look at it, you might have noticed\nthe attribute <code>android:configChanges=\"orientation|screenSize\"<\/code> on the <code>activity<\/code>\nelement. This allows the application to rotate with the device orientation,\nwithout the activity being restarted. Without this attribute, any\nprevious connection would be closed and we would have to restart it by pressing\non the toggle button. The use of this attribute is not mandatory, but bear in\nmind that if you decide not to use it, then you would have to make sure that the\nsame WebSocket connection is persisted across every screen rotation.<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>In the introduction, I mentioned that WebSockets can be operated in two\nmodes: <em>text<\/em> and <em>binary<\/em>. 
In the code that we have seen in this post we have\nsent <em>text<\/em> frames between the server and the client. This is because we have\nused data of type <code>String<\/code> in the Java client code, which is then received as\n<code>str<\/code> on the Python server end.<\/p>\n<p>In most practical cases, it is more convenient to send a stream of bytes\ninstead, thus increasing throughput. On the Java side, this is achieved by\npassing an array of bytes (<code>byte[]<\/code>) to the <code>send<\/code> method. On the Python end,\ndata will then be received as <code>bytes<\/code> (that is, the return value of the <code>recv<\/code>\ncoroutine is of type <code>bytes<\/code>) that one can iterate over, or process in whatever\nway is needed to make sense of the received information.<\/p>\n<p>The same holds in reverse: if we send an array of bytes from Python, we will\nreceive a binary frame in Java. In this case, though, we will have to implement\nthe <code>ByteBuffer<\/code> version of the <code>onMessage<\/code> callback of the <code>WebSocketClient<\/code>\nclass, i.e. the one with signature<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kd\">public<\/span> <span class=\"kt\">void<\/span> <span class=\"nf\">onMessage<\/span><span class=\"p\">(<\/span> <span class=\"n\">ByteBuffer<\/span> <span class=\"n\">bytes<\/span> <span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<script type=\"text\/javascript\">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {\n    var align = \"center\",\n        indent = \"0em\",\n        linebreak = \"false\";\n\n    if (false) {\n        align = (screen.width < 768) ? \"left\" : align;\n        indent = (screen.width < 768) ? \"0em\" : indent;\n        linebreak = (screen.width < 768) ? 
'true' : linebreak;\n    }\n\n    var mathjaxscript = document.createElement('script');\n    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';\n    mathjaxscript.type = 'text\/javascript';\n    mathjaxscript.src = 'https:\/\/cdnjs.cloudflare.com\/ajax\/libs\/mathjax\/2.7.3\/latest.js?config=TeX-AMS-MML_HTMLorMML';\n\n    var configscript = document.createElement('script');\n    configscript.type = 'text\/x-mathjax-config';\n    configscript[(window.opera ? \"innerHTML\" : \"text\")] =\n        \"MathJax.Hub.Config({\" +\n        \"    config: ['MMLorHTML.js'],\" +\n        \"    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'none' } },\" +\n        \"    jax: ['input\/TeX','input\/MathML','output\/HTML-CSS'],\" +\n        \"    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js'],\" +\n        \"    displayAlign: '\"+ align +\"',\" +\n        \"    displayIndent: '\"+ indent +\"',\" +\n        \"    showMathMenu: true,\" +\n        \"    messageStyle: 'normal',\" +\n        \"    tex2jax: { \" +\n        \"        inlineMath: [ ['\\\\\\\\(','\\\\\\\\)'] ], \" +\n        \"        displayMath: [ ['$$','$$'] ],\" +\n        \"        processEscapes: true,\" +\n        \"        preview: 'TeX',\" +\n        \"    }, \" +\n        \"    'HTML-CSS': { \" +\n        \"        availableFonts: ['STIX', 'TeX'],\" +\n        \"        preferredFont: 'STIX',\" +\n        \"        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} },\" +\n        \"        linebreaks: { automatic: \"+ linebreak +\", width: '90% container' },\" +\n        \"    }, \" +\n        \"}); \" +\n        \"if ('default' !== 'default') {\" +\n            \"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n            \"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n        \"}\";\n\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(configscript);\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);\n}\n<\/script>","category":[{"@attributes":{"term":"IoT"}},{"@attributes":{"term":"python"}},{"@attributes":{"term":"android"}},{"@attributes":{"term":"raspberry pi"}},{"@attributes":{"term":"electronics"}}]},{"title":"Android Development from the Command Line","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/android-development-from-the-command-line.html","rel":"alternate"}},"published":"2017-10-14T22:42:00+01:00","updated":"2017-10-14T22:42:00+01:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2017-10-14:\/android-development-from-the-command-line.html","summary":"<p>Do you like your development tools to be as simple as a text editor to write the code and a bunch of CLI application to build your projects? Do you feel like you are in a cage when you use an IDE? Or perhaps your PC or laptop is a bit dated and all the cores spin like crazy when you fire up Android Studio? Then read on to learn how you can develop Android application with just the text editor of your choice and the standard Android SDK CLI tools.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#introduction\">Introduction<\/a><\/li>\n<li><a href=\"#pre-requisites\">Pre-requisites<\/a><ul>\n<li><a href=\"#android-sdk-tools\">Android SDK Tools<\/a><\/li>\n<li><a href=\"#android-sdk-platform-tools\">Android SDK Platform Tools<\/a><\/li>\n<li><a href=\"#android-sdk-platforms\">Android SDK Platforms<\/a><\/li>\n<li><a href=\"#android-sdk-build-tools\">Android SDK Build Tools<\/a><\/li>\n<li><a href=\"#android-emulator\">Android Emulator<\/a><\/li>\n<li><a href=\"#gradle\">Gradle<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#creating-the-gradle-project\">Creating the Gradle Project<\/a><\/li>\n<li><a href=\"#writing-the-application\">Writing the Application<\/a><ul>\n<li><a href=\"#the-application-manifest\">The Application Manifest<\/a><\/li>\n<li><a href=\"#the-main-activity\">The Main Activity<\/a><\/li>\n<li><a href=\"#the-layout-resource\">The Layout Resource<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#running-the-application\">Running the Application<\/a><ul>\n<li><a href=\"#building-with-gradle\">Building with Gradle<\/a><\/li>\n<li><a href=\"#creating-a-virtual-device\">Creating a Virtual Device<\/a><\/li>\n<li><a href=\"#installing-the-apk\">Installing the APK<\/a><\/li>\n<li><a href=\"#the-lint-tasks\">The lint tasks<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a 
href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"introduction\">Introduction<\/h1>\n<p>Yes, you guessed right, I'm not a huge fan of IDEs. Don't get me wrong though,\nI am fully aware of how powerful modern IDEs are, and all the magic that they\ncan do to assist you while you are coding your application. But this is\nalso why I don't like them, especially when I'm picking up something new.<\/p>\n<p>At the time of writing, I am not an experienced Android developer. When I\nstarted developing applications (there was a time when I believed that no matter\nhow big a project could be, you could always find the time to code it in\nassembly language), a plain old text editor was my friend, together with some\ncommand line tools, like assemblers, linkers, compilers, debuggers etc. With\nno support at all, you had to know exactly what you were doing, you had to know\nthe syntax, the APIs and where they were located. You also needed a minimum\nknowledge of what a linker is for and when and why you need one.<\/p>\n<p>Whenever I tried an IDE, e.g. Android Studio, I always felt like I didn't really\nneed to know much about the frameworks I was using, as all the built-in tools\nwould come to my rescue. As a consequence, I started feeling like I wasn't\nreally mastering anything and put the project I was working on aside for future\nme to one day resume working on it. Rather than me using the IDE, it kind of was\nthe other way around: the IDE was using me to magically generate code.<\/p>\n<p>IDEs also tend to hide all the machinery involved in the build process from the\ndeveloper. 
In most cases, everything goes well, but what would you do if\nyou suddenly came across a problem and you had no clue at which stage of the\nbuild process it is happening?<\/p>\n<p>Surely, if you work on a big company project, it would be crazy to give up\nIDEs entirely, as your life might be a bit harder in everyday maintenance of\nyour code, but for smaller projects this argument is somewhat weak, and opting\nfor a plain text editor has its advantages. For one, you are in\ntotal control of the code that is going into your final product. And then again,\nthere is also the educational aspect, which can give you the right amount of\nexperience to tackle unexpected issues that could pop up during any stage of the\ndevelopment life-cycle.<\/p>\n<p>All this being said, in this post we shall see how to develop an Android\napplication by only relying on a text editor of your choice and the standard CLI\ntools provided by the Android SDK. The focus is on the steps required to install\nthe Android SDK command line tools and how to organise your source code, rather\nthan on the details of the application itself.<\/p>\n<p>As with a standard Android project created with Android Studio, we are going to\nrely on Gradle and the Android Gradle plugin for the build process. You may\nrightfully think that this somehow partly defeats the point of this post, but,\nhey, in the end Gradle is just a command line tool, and quite a standard way to\nbuild and deploy Java projects these days.<\/p>\n<p>This post is targeted at Linux users, but there is a good chance that the steps\nthat we will go through have an equivalent on other platforms, like Windows. 
I'm\nafraid this is something that you will have to find out on your own.<\/p>\n<p>Code very similar to the one presented in this post can be found in the GitHub\nrepository <a href=\"https:\/\/github.com\/P403n1x87\/androtest\">androtest<\/a>.<\/p>\n<h1 id=\"pre-requisites\">Pre-requisites<\/h1>\n<p>Before embarking on an adventure, it is wise to check that we are taking all\nthat we need along the way with us. As we are trying to keep things as simple as\npossible, we won't need much, but there are a few preliminary steps that we need\nto perform in order to set up our development environment.<\/p>\n<p>The first few steps couldn't be simpler: pick your favourite text editor (my\nlaptop can still handle an application like Atom) and terminal application, and\nwe already have almost half of what we need! The rest of the tools are provided\nby the Java Development Kit, the Android SDK Tools and Gradle. More details in\ndue time.<\/p>\n<p>The JDK is usually available from your distro's repositories. On Ubuntu, it can\nbe installed with, e.g.<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo apt install openjdk-9-jdk\n<\/pre><\/div>\n\n\n<p>The Android SDK might not be available from the official repositories. In\nprinciple it could be installed along with Android Studio, but if you are not\ngoing to use Google's official IDE then it is a bit of a waste of space. The\n<em>cleaner<\/em> alternative is to just download the Android SDK Tools from the\n<a href=\"https:\/\/developer.android.com\/studio\/index.html#command-tools\">Android Studio<\/a>\ndownload page. For simplicity, I will split the installation process of the\nAndroid SDK into different steps.<\/p>\n<h2 id=\"android-sdk-tools\">Android SDK Tools<\/h2>\n<p>Download the zip archive and extract it somewhere, e.g. 
<code>~\/.android\/sdk<\/code>, then\nupdate your <code>.bashrc<\/code> file to define the <code>ANDROID_HOME<\/code> environment variable and\ninclude the SDK tools binaries in the <code>PATH<\/code> variable by adding the following\nlines<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># Android SDK Tools<\/span>\n<span class=\"nb\">export<\/span> <span class=\"nv\">ANDROID_HOME<\/span><span class=\"o\">=<\/span><span class=\"nv\">$HOME<\/span>\/.android\/sdk\n<span class=\"nb\">export<\/span> <span class=\"nv\">PATH<\/span><span class=\"o\">=<\/span><span class=\"nv\">$PATH<\/span>:<span class=\"nv\">$ANDROID_HOME<\/span>\/tools:<span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/bin\n<\/pre><\/div>\n\n\n<p>Note that most of the tools installed in <code>$ANDROID_HOME\/tools<\/code> are deprecated\nand one should use the dedicated ones provided in the <code>$ANDROID_HOME\/tools\/bin<\/code>\nfolder. These include the fundamental\n<a href=\"https:\/\/developer.android.com\/studio\/command-line\/sdkmanager.html\"><code>sdkmanager<\/code><\/a>\nand the\n<a href=\"https:\/\/developer.android.com\/studio\/command-line\/avdmanager.html\"><code>avdmanager<\/code><\/a>\ntools for respectively creating and managing different SDK versions (and other\npackages too) and virtual devices (the emulators).<\/p>\n<h2 id=\"android-sdk-platform-tools\">Android SDK Platform Tools<\/h2>\n<p>Throughout the Android development life-cycle, you are likely to need to\ninterface with the Android platform for testing your progress. In concrete terms\nthis means that you might want to compile your project as you develop it for\ntesting on an actual Android device. In order to connect to the device and look\nat the log you will need the Android Debug Bridge, which is provided with the\nAndroid Platform Tools. 
To install them we can use the <code>sdkmanager<\/code> CLI tool to\npull the latest released version with the following command<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager <span class=\"s2\">&quot;platform-tools&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>This will install the Platform Tools into the <code>$ANDROID_HOME\/platform-tools<\/code>\nfolder. We can then add it to the <code>PATH<\/code> variable for easy invocation by simply\nadding the following lines to <code>.bashrc<\/code><\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># Android Platform Tools<\/span>\n<span class=\"nb\">export<\/span> <span class=\"nv\">PATH<\/span><span class=\"o\">=<\/span><span class=\"nv\">$PATH<\/span>:<span class=\"nv\">$ANDROID_HOME<\/span>\/platform-tools\n<\/pre><\/div>\n\n\n<h2 id=\"android-sdk-platforms\">Android SDK Platforms<\/h2>\n<p>In order to compile a project we need a certain API revision to be installed.\nThis provides all the functionalities that our application can use and are\nprovided in the form of Java packages and classes and other useful components.\nFor example, if our project targets Marshmallow, then we need to install the\nAndroid SDK Platform API Level 23. You can find out more at <a href=\"https:\/\/developer.android.com\/guide\/topics\/manifest\/uses-sdk-element.html#ApiLevels\">this\npage<\/a>.\nTo find out about all the packages available to download we can use the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager --list --verbose\n<\/pre><\/div>\n\n\n<p>At the moment we are interested in Android platforms, i.e. 
those packages that\nare prefixed with <code>platforms;<\/code>, so we can filter the output of the above command\nas follows<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager --list --verbose <span class=\"p\">|<\/span> grep -A <span class=\"m\">3<\/span> platforms<span class=\"se\">\\;<\/span>\n<\/pre><\/div>\n\n\n<p>Among the matching results, you should see something similar to the following<\/p>\n<div class=\"highlight\"><pre><span><\/span>platforms;android-23\n    Description:        Android SDK Platform 23\n    Version:            3\n<\/pre><\/div>\n\n\n<p>We can then proceed to installing the Android SDK Platform API Level 23 with the\ncommand<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager <span class=\"s2\">&quot;platforms;android-23&quot;<\/span>\n<\/pre><\/div>\n\n\n<blockquote>\n<p>When you use the <code>sdkmanager<\/code> tool, you might see the following warning message<\/p>\n<p><code>Warning: File \/home\/user\/.android\/repositories.cfg could not be loaded.<\/code><\/p>\n<p>In order to get rid of it you can simply create this file with no content.<\/p>\n<\/blockquote>\n<h2 id=\"android-sdk-build-tools\">Android SDK Build Tools<\/h2>\n<p>Now that we have the API to compile against, we need the tools to actually be\nable to build a project: the build tools. They provide utilities like\n<code>apksigner<\/code>, Jack and Jill etc..., but for the moment we don't have to worry\nabout the details of this package, as they will be invoked behind the scenes by\nGradle.<\/p>\n<p>The <a href=\"https:\/\/developer.android.com\/studio\/releases\/build-tools.html\">Android Studio User\nGuide<\/a>\nrecommends that you keep the build tools updated to the latest version. 
To find\nout all the versions available for download, run the following command<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager --list --verbose <span class=\"p\">|<\/span> grep -A <span class=\"m\">3<\/span> build-tools<span class=\"se\">\\;<\/span>\n<\/pre><\/div>\n\n\n<p>and locate the latest version. At the moment of writing this is 26.0.1, so the\ncommand to use in this case is<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager <span class=\"s2\">&quot;build-tools;26.0.1&quot;<\/span>\n<\/pre><\/div>\n\n\n<h2 id=\"android-emulator\">Android Emulator<\/h2>\n<p>The installation of the Android Emulator package is not mandatory, as you can\ndecide to test your application on an actual Android device. However, there are\nmany reasons why you might want to use an emulator: you probably don't own a\nhuge variety of Android devices, differing not only in physical size, but also\nin the API version (Kit Kat, Lollipop, Marshmallow, just to name a few of the\nmost recent code-names). 
The package can be installed through the <code>sdkmanager<\/code>\nwith the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager <span class=\"s2\">&quot;emulator&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>Again, we can add the emulator folder to the <code>PATH<\/code> variable for easy access.\nHowever, the Android SDK Tools provides a deprecated set of tools, <code>emulator<\/code>\nand <code>emulator-check<\/code>, that would collide with the ones we have just installed.\nTo solve this problem we can strip the executable bit from the deprecated executables and rename them with<\/p>\n<div class=\"highlight\"><pre><span><\/span>chmod -x <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator\nchmod -x <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator-check\nmv <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator.dep\nmv <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator-check <span class=\"nv\">$ANDROID_HOME<\/span>\/tools\/emulator-check.dep\n<\/pre><\/div>\n\n\n<p>and then add the new ones to the <code>PATH<\/code> variable by appending the following lines\nto the <code>~\/.bashrc<\/code> file<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># Android Emulator<\/span>\n<span class=\"nb\">export<\/span> <span class=\"nv\">PATH<\/span><span class=\"o\">=<\/span><span class=\"nv\">$PATH<\/span>:<span class=\"nv\">$ANDROID_HOME<\/span>\/emulator\n<\/pre><\/div>\n\n\n<h2 id=\"gradle\">Gradle<\/h2>\n<p>Let's now proceed to the installation of Gradle, a build automation system that\nis also the default in Android Studio. Google has developed a dedicated Android\nplugin to assist with the most common tasks. 
The ones that I personally tend to\nrun most frequently are collected in the following table<\/p>\n<table>\n<thead>\n<tr>\n<th><strong>Task<\/strong><\/th>\n<th><strong>Description<\/strong><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><code>compileDebugJavaWithJavac<\/code><\/td>\n<td>Compiles the debug version of the Java sources. Useful to check for syntax errors while coding.<\/td>\n<\/tr>\n<tr>\n<td><code>installDebug<\/code><\/td>\n<td>Compiles and installs the debug version on all the devices discovered by the ADB.<\/td>\n<\/tr>\n<tr>\n<td><code>lint<\/code><\/td>\n<td>Runs a lint on the sources, producing a report in <code>build\/reports<\/code>.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Gradle is quite popular so it should be available from your distro's official\nrepositories. On Ubuntu 17.04 though, only a rather old version of Gradle is available\nfrom them, so I would recommend that you add <a href=\"https:\/\/launchpad.net\/~cwchien\/+archive\/ubuntu\/gradle\">Cheng-Wei Chien's\nPPA<\/a> to your software\nsources and install Gradle from there<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo add-apt-repository ppa:cwchien\/gradle\nsudo apt update <span class=\"o\">&amp;&amp;<\/span> sudo apt install gradle\n<\/pre><\/div>\n\n\n<p>Gradle is quite a powerful tool, but you might find that it has a rather steep\nlearning curve to master all of its features, especially if you are not familiar\nwith Groovy. In this post we shall only scratch the very surface and look at\nonly the closures that we need for this project, as a discussion on Gradle\nsurely deserves a dedicated post on its own.<\/p>\n<p>Getting back to business, the installation of Gradle was the last step that we\nneeded to perform in order to set up the development environment, and we can now\nmove on to creating an Android project from scratch.<\/p>\n<h1 id=\"creating-the-gradle-project\">Creating the Gradle Project<\/h1>\n<p>The first thing to do is to create a Gradle project of Java type. 
This involves\nsetting up a directory structure in the project's parent folder and creating the\nGradle build script <code>build.gradle<\/code>. With Gradle installed, these steps can be\nautomated with the\n<a href=\"https:\/\/docs.gradle.org\/current\/userguide\/build_init_plugin.html\"><code>init<\/code><\/a> task<\/p>\n<div class=\"highlight\"><pre><span><\/span>gradle init --type java-library\n<\/pre><\/div>\n\n\n<p>If you are contributing to a project that is the work of many hands, it will\nprobably be the case that everybody is using the same build tools. As everybody\ncan have a different version of Gradle on their local machine, the Gradle\nproject recommends that one uses a\n<a href=\"https:\/\/docs.gradle.org\/current\/userguide\/gradle_wrapper.html\">wrapper<\/a> to\nbuild a project, rather than invoke the local installation of Gradle directly.\nBy sharing the wrapper along with your project, every other developer working on\nthe same project will be able to use the same version of Gradle as everybody\nelse, thus getting rid of problems caused by switching between different\nversions. Even though this is a sample project, we will nonetheless create and\nuse a Gradle wrapper to build our project. The previous command should have\ncreated a <code>gradlew<\/code> shell script in the project's folder. If not, run the\nfollowing command<\/p>\n<div class=\"highlight\"><pre><span><\/span>gradle wrapper\n<\/pre><\/div>\n\n\n<p>You should also have a folder <code>src\/<\/code> containing all the sub-folders where Gradle\nexpects the sources and the resources that make up your project. But, most\nimportantly, you should also have the <code>build.gradle<\/code> and <code>settings.gradle<\/code> files\ncontaining some sample build settings. 
This structure is slightly different from\nthe one generated by Android Studio, and documented on the <a href=\"https:\/\/developer.android.com\/studio\/build\/index.html\">developer\nportal<\/a>, where you can\nnotice a nested Gradle project, with the topmost one used to import the actual\nproject as a module, and configure the global build settings. For the case at\nhand, we can do without this nested structure and only define a single build\nscript, since our project is made up of only one module.<\/p>\n<p>The following is the content of the <code>build.gradle<\/code> file.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">buildscript<\/span> <span class=\"o\">{<\/span>\n    <span class=\"n\">repositories<\/span> <span class=\"o\">{<\/span>\n        <span class=\"n\">jcenter<\/span><span class=\"o\">()<\/span>\n    <span class=\"o\">}<\/span>\n    <span class=\"n\">dependencies<\/span> <span class=\"o\">{<\/span>\n        <span class=\"n\">classpath<\/span> <span class=\"s1\">&#39;com.android.tools.build:gradle:2.3.3&#39;<\/span>\n    <span class=\"o\">}<\/span>\n<span class=\"o\">}<\/span>\n\n<span class=\"n\">apply<\/span> <span class=\"nl\">plugin:<\/span> <span class=\"s1\">&#39;com.android.application&#39;<\/span>\n\n<span class=\"n\">android<\/span> <span class=\"o\">{<\/span>\n    <span class=\"n\">compileSdkVersion<\/span> <span class=\"mi\">23<\/span>\n    <span class=\"n\">buildToolsVersion<\/span> <span class=\"s1\">&#39;26.0.1&#39;<\/span>\n<span class=\"o\">}<\/span>\n<\/pre><\/div>\n\n\n<p>You can find a detailed explanation of the meaning of each closure in the\n<a href=\"https:\/\/developer.android.com\/studio\/build\/index.html#top-level\">Configure Your\nBuild<\/a> page of\nthe Android Developer portal. Briefly, the <code>buildscript<\/code> closure is used to\nconfigure Gradle itself so that it knows where to find the Android-specific\nGradle tools that we want to use. 
We can then import the Android Gradle plugin\nand use the extensions to the DSL that it provides to configure the\nAndroid-specific build process. In this case we specify that the compilation SDK\nversion that we want to use is Level 23 (Marshmallow), and that we want to use\nversion <code>26.0.1<\/code> of the build tools.<\/p>\n<p>The <code>gradle.properties<\/code> file is used to configure the project-wide Gradle\nsettings, such as the Gradle daemon's maximum heap size. In fact, this is all we\nwill use it for in our case. Open it with your text editor and put the following\ncontent in it<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">org<\/span><span class=\"o\">.<\/span><span class=\"na\">gradle<\/span><span class=\"o\">.<\/span><span class=\"na\">jvmargs<\/span><span class=\"o\">=-<\/span><span class=\"n\">Xmx1536m<\/span>\n<\/pre><\/div>\n\n\n<p>Our Gradle project is now created and configured! All the magic of the building\nprocess is hidden from us by the Android Gradle plugin so that we don't have to\nworry about anything else and can just focus on our application code.<\/p>\n<h1 id=\"writing-the-application\">Writing the Application<\/h1>\n<p>Two essential ingredients for an Android application are the <strong>Main Activity<\/strong>\nand the <a href=\"https:\/\/developer.android.com\/guide\/topics\/manifest\/manifest-intro.html\"><strong>App\nManifest<\/strong><\/a>.\nLet's start with the latter.<\/p>\n<h2 id=\"the-application-manifest\">The Application Manifest<\/h2>\n<p>The Android Application Manifest is an XML manifest file that is used to provide\nthe Android system with essential information about an application, like the name\nof the Java package, the activities provided, the permissions required, the\nminimum SDK version supported, etc. 
Within a Gradle project, this file must\nbe located in the <code>src\/main<\/code> folder and named rigorously <code>AndroidManifest.xml<\/code>.\nIn our case, this is what such manifest file would look like<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;<\/span>\n<span class=\"p\">&lt;<\/span><span class=\"nt\">manifest<\/span> <span class=\"na\">xmlns:android<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;http:\/\/schemas.android.com\/apk\/res\/android&quot;<\/span>\n  <span class=\"na\">package<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;com.p403n1x87.androtest&quot;<\/span><span class=\"p\">&gt;<\/span>\n\n  <span class=\"p\">&lt;<\/span><span class=\"nt\">uses-sdk<\/span>\n    <span class=\"na\">android:minSdkVersion<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;21&quot;<\/span>\n    <span class=\"na\">android:targetSdkVersion<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;23&quot;<\/span> <span class=\"p\">\/&gt;<\/span>\n\n  <span class=\"p\">&lt;<\/span><span class=\"nt\">application<\/span>\n    <span class=\"na\">android:label<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;AndroTest&quot;<\/span>\n    <span class=\"na\">android:theme<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;@android:style\/Theme.Material&quot;<\/span><span class=\"p\">&gt;<\/span>\n\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">activity<\/span> <span class=\"na\">android:name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;MainActivity&quot;<\/span><span class=\"p\">&gt;<\/span>\n      <span class=\"p\">&lt;<\/span><span class=\"nt\">intent-filter<\/span><span class=\"p\">&gt;<\/span>\n        <span class=\"p\">&lt;<\/span><span class=\"nt\">action<\/span> <span class=\"na\">android:name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;android.intent.action.MAIN&quot;<\/span> <span 
class=\"p\">\/&gt;<\/span>\n        <span class=\"p\">&lt;<\/span><span class=\"nt\">category<\/span> <span class=\"na\">android:name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;android.intent.category.LAUNCHER&quot;<\/span> <span class=\"p\">\/&gt;<\/span>\n      <span class=\"p\">&lt;\/<\/span><span class=\"nt\">intent-filter<\/span><span class=\"p\">&gt;<\/span>\n    <span class=\"p\">&lt;\/<\/span><span class=\"nt\">activity<\/span><span class=\"p\">&gt;<\/span>\n  <span class=\"p\">&lt;\/<\/span><span class=\"nt\">application<\/span><span class=\"p\">&gt;<\/span>\n<span class=\"p\">&lt;\/<\/span><span class=\"nt\">manifest<\/span><span class=\"p\">&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>The structure of the Android Application Manifest file is described <a href=\"https:\/\/developer.android.com\/guide\/topics\/manifest\/manifest-intro.html#filestruct\">on the\ndeveloper\nportal<\/a>\nin the form of a skeleton, as it is not based on any XML schema like DTD or XSD.<\/p>\n<p>The root element is <code>manifest<\/code>, which accepts the <code>package<\/code> attribute. Here we\nspecify the name of the Java package of our application. We also set our\ntarget API level to 23. The minimum level supported is set to 21 for reasons\nthat relate to the output of the <code>lint<\/code> Android Gradle task that we will look at\nlater on. It is recommended that you set this attribute to a reasonable value.\nFor example, if you do not set it, Android will automatically add the legacy\noverflow <em>three-dot<\/em> button, even though there are no actions to show. Starting\nfrom level 11, Android does not add this button by default.<\/p>\n<p>The <code>manifest<\/code> element must contain a unique <code>application<\/code> element that\ndescribes our application in many aspects, from the theme to use, to the activity\ncontent. 
In this case we set the application name to <code>AndroTest<\/code> and the theme\nto <code>Material<\/code> and we expose only one activity, i.e. the <code>MainActivity<\/code>, which is\nthe entry point of our application, and the one that would be fired up when we\nlaunch it.<\/p>\n<h2 id=\"the-main-activity\">The Main Activity<\/h2>\n<p>This takes us now to the main activity, i.e. the Java class that contains the\ncode to be executed when our application is launched. Within a Gradle project,\nthe main Java code should reside in the <code>src\/main\/java<\/code> folder. Since the\npackage name is <code>com.p403n1x87.androtest<\/code>, the <code>MainActivity.java<\/code> source file\nshould be created within the <code>src\/main\/java\/com\/p403n1x87\/androtest<\/code> folder.\nHere is the code<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span 
class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span class=\"normal\">34<\/span>\n<span class=\"normal\">35<\/span>\n<span class=\"normal\">36<\/span>\n<span class=\"normal\">37<\/span>\n<span class=\"normal\">38<\/span>\n<span class=\"normal\">39<\/span>\n<span class=\"normal\">40<\/span>\n<span class=\"normal\">41<\/span>\n<span class=\"normal\">42<\/span>\n<span class=\"normal\">43<\/span>\n<span class=\"normal\">44<\/span>\n<span class=\"normal\">45<\/span>\n<span class=\"normal\">46<\/span>\n<span class=\"normal\">47<\/span>\n<span class=\"normal\">48<\/span>\n<span class=\"normal\">49<\/span>\n<span class=\"normal\">50<\/span>\n<span class=\"normal\">51<\/span>\n<span class=\"normal\">52<\/span>\n<span class=\"normal\">53<\/span>\n<span class=\"normal\">54<\/span>\n<span class=\"normal\">55<\/span>\n<span class=\"normal\">56<\/span>\n<span class=\"normal\">57<\/span>\n<span class=\"normal\">58<\/span>\n<span class=\"normal\">59<\/span>\n<span class=\"normal\">60<\/span>\n<span class=\"normal\">61<\/span>\n<span class=\"normal\">62<\/span>\n<span class=\"normal\">63<\/span>\n<span class=\"normal\">64<\/span>\n<span class=\"normal\">65<\/span>\n<span class=\"normal\">66<\/span>\n<span class=\"normal\">67<\/span>\n<span class=\"normal\">68<\/span>\n<span class=\"normal\">69<\/span>\n<span class=\"normal\">70<\/span>\n<span class=\"normal\">71<\/span>\n<span class=\"normal\">72<\/span>\n<span class=\"normal\">73<\/span>\n<span class=\"normal\">74<\/span>\n<span class=\"normal\">75<\/span>\n<span class=\"normal\">76<\/span>\n<span class=\"normal\">77<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"kn\">package<\/span> <span class=\"nn\">com.p403n1x87.androtest<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.app.Activity<\/span><span class=\"p\">;<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.os.Bundle<\/span><span 
class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.content.Context<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.hardware.Sensor<\/span><span class=\"p\">;<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.hardware.SensorEvent<\/span><span class=\"p\">;<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.hardware.SensorEventListener<\/span><span class=\"p\">;<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.hardware.SensorManager<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.view.ViewGroup.LayoutParams<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.widget.TextView<\/span><span class=\"p\">;<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">android.widget.LinearLayout<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import<\/span> <span class=\"nn\">java.util.List<\/span><span class=\"p\">;<\/span>\n\n<span class=\"kn\">import static<\/span> <span class=\"nn\">java.lang.Math.sqrt<\/span><span class=\"p\">;<\/span>\n\n\n<span class=\"kd\">public<\/span> <span class=\"kd\">class<\/span> <span class=\"nc\">MainActivity<\/span> <span class=\"kd\">extends<\/span> <span class=\"n\">Activity<\/span>\n<span class=\"p\">{<\/span>\n  <span class=\"kd\">private<\/span> <span class=\"n\">SensorManager<\/span> <span class=\"n\">mSensorManager<\/span><span class=\"p\">;<\/span>\n  <span class=\"kd\">private<\/span> <span class=\"n\">Sensor<\/span>        <span class=\"n\">mSensor<\/span><span class=\"p\">;<\/span>\n\n  <span class=\"kd\">private<\/span> <span class=\"n\">TextView<\/span>      <span class=\"n\">text<\/span>       <span class=\"o\">=<\/span> <span class=\"kc\">null<\/span><span class=\"p\">;<\/span>\n  <span class=\"kd\">private<\/span> <span class=\"n\">LinearLayout<\/span>  <span 
class=\"n\">layoutMain<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">null<\/span><span class=\"p\">;<\/span>\n\n  <span class=\"nd\">@Override<\/span>\n  <span class=\"kd\">public<\/span> <span class=\"kt\">void<\/span> <span class=\"nf\">onCreate<\/span><span class=\"p\">(<\/span><span class=\"n\">Bundle<\/span> <span class=\"n\">savedInstanceState<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n    <span class=\"kd\">super<\/span><span class=\"p\">.<\/span><span class=\"na\">onCreate<\/span><span class=\"p\">(<\/span><span class=\"n\">savedInstanceState<\/span><span class=\"p\">);<\/span>\n\n    <span class=\"c1\">\/\/ UI<\/span>\n    <span class=\"n\">setContentView<\/span><span class=\"p\">(<\/span><span class=\"n\">R<\/span><span class=\"p\">.<\/span><span class=\"na\">layout<\/span><span class=\"p\">.<\/span><span class=\"na\">main_layout<\/span><span class=\"p\">);<\/span>\n    <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">text<\/span> <span class=\"o\">==<\/span> <span class=\"kc\">null<\/span><span class=\"p\">)<\/span>       <span class=\"n\">text<\/span>       <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"n\">TextView<\/span><span class=\"p\">)<\/span>     <span class=\"n\">findViewById<\/span><span class=\"p\">(<\/span><span class=\"n\">R<\/span><span class=\"p\">.<\/span><span class=\"na\">id<\/span><span class=\"p\">.<\/span><span class=\"na\">text<\/span><span class=\"p\">);<\/span>\n    <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">layoutMain<\/span> <span class=\"o\">==<\/span> <span class=\"kc\">null<\/span><span class=\"p\">)<\/span> <span class=\"n\">layoutMain<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"n\">LinearLayout<\/span><span class=\"p\">)<\/span> <span class=\"n\">findViewById<\/span><span class=\"p\">(<\/span><span class=\"n\">R<\/span><span class=\"p\">.<\/span><span class=\"na\">id<\/span><span 
class=\"p\">.<\/span><span class=\"na\">layout_main<\/span><span class=\"p\">);<\/span>\n\n    <span class=\"c1\">\/\/ Sensors<\/span>\n    <span class=\"n\">mSensorManager<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"n\">SensorManager<\/span><span class=\"p\">)<\/span> <span class=\"n\">getSystemService<\/span><span class=\"p\">(<\/span><span class=\"n\">Context<\/span><span class=\"p\">.<\/span><span class=\"na\">SENSOR_SERVICE<\/span><span class=\"p\">);<\/span>\n    <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">mSensorManager<\/span><span class=\"p\">.<\/span><span class=\"na\">getDefaultSensor<\/span><span class=\"p\">(<\/span><span class=\"n\">Sensor<\/span><span class=\"p\">.<\/span><span class=\"na\">TYPE_GRAVITY<\/span><span class=\"p\">)<\/span> <span class=\"o\">!=<\/span> <span class=\"kc\">null<\/span><span class=\"p\">){<\/span>\n      <span class=\"n\">List<\/span><span class=\"o\">&lt;<\/span><span class=\"n\">Sensor<\/span><span class=\"o\">&gt;<\/span> <span class=\"n\">gravSensors<\/span> <span class=\"o\">=<\/span> <span class=\"n\">mSensorManager<\/span><span class=\"p\">.<\/span><span class=\"na\">getSensorList<\/span><span class=\"p\">(<\/span><span class=\"n\">Sensor<\/span><span class=\"p\">.<\/span><span class=\"na\">TYPE_GRAVITY<\/span><span class=\"p\">);<\/span>\n      <span class=\"kt\">int<\/span> <span class=\"n\">nSensors<\/span> <span class=\"o\">=<\/span> <span class=\"n\">gravSensors<\/span><span class=\"p\">.<\/span><span class=\"na\">size<\/span><span class=\"p\">();<\/span>\n      <span class=\"n\">text<\/span><span class=\"p\">.<\/span><span class=\"na\">setText<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;Detected gravity sensors: &quot;<\/span> <span class=\"o\">+<\/span> <span class=\"n\">Integer<\/span><span class=\"p\">.<\/span><span class=\"na\">toString<\/span><span class=\"p\">(<\/span><span class=\"n\">nSensors<\/span><span 
class=\"p\">));<\/span>\n\n      <span class=\"k\">for<\/span> <span class=\"p\">(<\/span><span class=\"kt\">int<\/span> <span class=\"n\">i<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span><span class=\"p\">;<\/span> <span class=\"n\">i<\/span> <span class=\"o\">&lt;<\/span> <span class=\"n\">nSensors<\/span><span class=\"p\">;<\/span> <span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n        <span class=\"kd\">final<\/span> <span class=\"n\">TextView<\/span> <span class=\"n\">tvSensor<\/span> <span class=\"o\">=<\/span> <span class=\"k\">new<\/span> <span class=\"n\">TextView<\/span><span class=\"p\">(<\/span><span class=\"k\">this<\/span><span class=\"p\">);<\/span>\n        <span class=\"kd\">final<\/span> <span class=\"kt\">int<\/span>      <span class=\"n\">j<\/span>        <span class=\"o\">=<\/span> <span class=\"n\">i<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">1<\/span><span class=\"p\">;<\/span>\n\n        <span class=\"n\">tvSensor<\/span><span class=\"p\">.<\/span><span class=\"na\">setLayoutParams<\/span><span class=\"p\">(<\/span><span class=\"k\">new<\/span> <span class=\"n\">LayoutParams<\/span><span class=\"p\">(<\/span><span class=\"n\">LayoutParams<\/span><span class=\"p\">.<\/span><span class=\"na\">WRAP_CONTENT<\/span><span class=\"p\">,<\/span> <span class=\"n\">LayoutParams<\/span><span class=\"p\">.<\/span><span class=\"na\">WRAP_CONTENT<\/span><span class=\"p\">));<\/span>\n\n        <span class=\"n\">mSensor<\/span> <span class=\"o\">=<\/span> <span class=\"n\">gravSensors<\/span><span class=\"p\">.<\/span><span class=\"na\">get<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">);<\/span>\n        <span class=\"n\">mSensorManager<\/span><span class=\"p\">.<\/span><span class=\"na\">registerListener<\/span><span class=\"p\">(<\/span><span class=\"k\">new<\/span> <span class=\"n\">SensorEventListener<\/span><span 
class=\"p\">()<\/span> <span class=\"p\">{<\/span>\n          <span class=\"nd\">@Override<\/span>\n          <span class=\"kd\">public<\/span> <span class=\"kt\">void<\/span> <span class=\"nf\">onSensorChanged<\/span><span class=\"p\">(<\/span><span class=\"kd\">final<\/span> <span class=\"n\">SensorEvent<\/span> <span class=\"n\">event<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n                <span class=\"kt\">float<\/span> <span class=\"n\">x<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">0<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n                <span class=\"kt\">float<\/span> <span class=\"n\">y<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">1<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n                <span class=\"kt\">float<\/span> <span class=\"n\">z<\/span> <span class=\"o\">=<\/span> <span class=\"n\">event<\/span><span class=\"p\">.<\/span><span class=\"na\">values<\/span><span class=\"o\">[<\/span><span class=\"mi\">2<\/span><span class=\"o\">]<\/span><span class=\"p\">;<\/span>\n                <span class=\"kt\">float<\/span> <span class=\"n\">g<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"kt\">float<\/span><span class=\"p\">)<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">x<\/span><span class=\"o\">*<\/span><span class=\"n\">x<\/span> <span class=\"o\">+<\/span> <span class=\"n\">y<\/span><span class=\"o\">*<\/span><span class=\"n\">y<\/span> <span class=\"o\">+<\/span> <span class=\"n\">z<\/span><span class=\"o\">*<\/span><span class=\"n\">z<\/span><span class=\"p\">);<\/span>\n\n            <span class=\"n\">tvSensor<\/span><span class=\"p\">.<\/span><span 
class=\"na\">setText<\/span><span class=\"p\">(<\/span><span class=\"n\">String<\/span><span class=\"p\">.<\/span><span class=\"na\">format<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;Sensor %d: %f m\/s^2&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">j<\/span><span class=\"p\">,<\/span> <span class=\"n\">g<\/span><span class=\"p\">));<\/span>\n          <span class=\"p\">}<\/span>\n\n          <span class=\"nd\">@Override<\/span>\n          <span class=\"kd\">public<\/span> <span class=\"kt\">void<\/span> <span class=\"nf\">onAccuracyChanged<\/span><span class=\"p\">(<\/span><span class=\"n\">Sensor<\/span> <span class=\"n\">sensor<\/span><span class=\"p\">,<\/span> <span class=\"kt\">int<\/span> <span class=\"n\">a<\/span><span class=\"p\">)<\/span> <span class=\"p\">{}<\/span>\n        <span class=\"p\">},<\/span> <span class=\"n\">mSensor<\/span><span class=\"p\">,<\/span> <span class=\"n\">SensorManager<\/span><span class=\"p\">.<\/span><span class=\"na\">SENSOR_DELAY_NORMAL<\/span><span class=\"p\">);<\/span>\n\n        <span class=\"n\">layoutMain<\/span><span class=\"p\">.<\/span><span class=\"na\">addView<\/span><span class=\"p\">(<\/span><span class=\"n\">tvSensor<\/span><span class=\"p\">);<\/span>\n      <span class=\"p\">}<\/span>\n    <span class=\"p\">}<\/span>\n    <span class=\"k\">else<\/span> <span class=\"p\">{<\/span>\n      <span class=\"n\">text<\/span><span class=\"p\">.<\/span><span class=\"na\">setText<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;We DO NOT have gravity! :(&quot;<\/span><span class=\"p\">);<\/span>\n    <span class=\"p\">}<\/span>\n  <span class=\"p\">}<\/span>\n\n<span class=\"p\">}<\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p>The code is fairly simple and quite self-explanatory. We override the <code>onCreate<\/code>\nmethod of the <code>Activity<\/code> class, from which <code>MainActivity<\/code> inherits, to set up\nthe UI. 
We use a layout resource as a basis, which we then dynamically extend\nwith extra <code>TextView<\/code> widgets to hold the value of each gravity sensor that gets\ndiscovered at run-time.<\/p>\n<h2 id=\"the-layout-resource\">The Layout Resource<\/h2>\n<p>In this project we have a mixture of static layout resources and dynamic\ncreation of <code>TextView<\/code> elements. This offers us the chance to see how to make\nresources available to the Java code, i.e. by placing them where\nGradle, and the Android Gradle plugin, would expect them. Files placed in the\n<code>src\/main\/res<\/code> folder are treated as resources and can be referenced in the way\ndescribed on the Android Developer portal. In the previous code block, on line\n36 we have<\/p>\n<div class=\"highlight\"><pre><span><\/span>    <span class=\"n\">setContentView<\/span><span class=\"p\">(<\/span><span class=\"n\">R<\/span><span class=\"p\">.<\/span><span class=\"na\">layout<\/span><span class=\"p\">.<\/span><span class=\"na\">main_layout<\/span><span class=\"p\">);<\/span>\n<\/pre><\/div>\n\n\n<p>The expression <code>R.layout.main_layout<\/code> refers to the resource <code>main_layout.xml<\/code>\nlocated in the <code>layout<\/code> sub-folder of <code>src\/main\/res<\/code>. The various groupings of\nresources are detailed on the page <a href=\"https:\/\/developer.android.com\/guide\/topics\/resources\/providing-resources.html\">Providing\nResources<\/a>.<\/p>\n<p>Later on, in dealing with the output of the <code>lint<\/code> task, we will have the chance\nto look at the <code>strings.xml<\/code> resources in the <code>values<\/code> sub-folder. 
For the time\nbeing, let's have a look at what the <code>main_layout.xml<\/code> looks like in this case:<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;<\/span>\n<span class=\"p\">&lt;<\/span><span class=\"nt\">LinearLayout<\/span> <span class=\"na\">xmlns:android<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;http:\/\/schemas.android.com\/apk\/res\/android&quot;<\/span>\n              <span class=\"na\">android:id<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;@+id\/layout_main&quot;<\/span>\n              <span class=\"na\">android:layout_width<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;match_parent&quot;<\/span>\n              <span class=\"na\">android:layout_height<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;match_parent&quot;<\/span>\n              <span class=\"na\">android:padding<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;32px&quot;<\/span>\n              <span class=\"na\">android:gravity<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;center&quot;<\/span>\n              <span class=\"na\">android:orientation<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;vertical&quot;<\/span> <span class=\"p\">&gt;<\/span>\n\n  <span class=\"p\">&lt;<\/span><span class=\"nt\">TextView<\/span> <span class=\"na\">android:id<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;@+id\/text&quot;<\/span>\n            <span class=\"na\">android:layout_width<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;wrap_content&quot;<\/span>\n            <span class=\"na\">android:layout_height<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;wrap_content&quot;<\/span>\n            <span class=\"na\">android:paddingBottom<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;12px&quot;<\/span>\n            <span class=\"na\">android:text<\/span><span class=\"o\">=<\/span><span 
class=\"s\">&quot;Gravity sensors&quot;<\/span> <span class=\"p\">\/&gt;<\/span>\n\n  <span class=\"cm\">&lt;!--<\/span>\n<span class=\"cm\">  A new TextView element will be added programmatically for each on-board<\/span>\n<span class=\"cm\">  gravity sensor detected.<\/span>\n<span class=\"cm\">  --&gt;<\/span>\n\n<span class=\"p\">&lt;\/<\/span><span class=\"nt\">LinearLayout<\/span><span class=\"p\">&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>This is a rather simple layout. We create a vertical <code>LinearLayout<\/code> container to\ndisplay a vertical stack of <code>TextView<\/code> elements. The first one, statically\nincluded in the XML resource file, will hold the count of the discovered\ngravity sensors. More <code>TextView<\/code> elements are added at run-time, one for each of\nthe sensors, to display their updated value.<\/p>\n<h1 id=\"running-the-application\">Running the Application<\/h1>\n<p>Now that we have set up the Gradle project and coded our application, it is time\nto build it and install it on an Android device to run it. In this section we\nshall see how to invoke the Gradle <code>build<\/code> task to generate the APK (the Android\nPackage Kit) package, and how to install it on our devices, be they\nvirtual or physical, in two different ways. Finally, we will add a finishing touch\nto our sample application by fixing a few of the issues reported by the <code>lint<\/code>\ntask.<\/p>\n<h2 id=\"building-with-gradle\">Building with Gradle<\/h2>\n<p>The process of building an Android application involves many steps and tools. If\nwe rely on a Gradle project, as we have done in this case, and as would be the case if\nyou were using Android Studio, all of the details of this process are hidden from\nus. In a post where we are trying to make use of only simple tools, you may\nthink that Gradle would defeat the point. Whilst I'd agree with you, I also\nbelieve that we will have to draw a line at some point, and Gradle feels like\nthe right place. 
If, at any point, you feel the need to manually build your\nproject, you can refer to these two references for more details:<\/p>\n<ul>\n<li><a href=\"https:\/\/spin.atomicobject.com\/2011\/08\/22\/building-android-application-bundles-apks-by-hand\/\">Building Android Application Bundles (APKs) by Hand<\/a><\/li>\n<li><a href=\"http:\/\/czak.pl\/2016\/05\/31\/handbuilt-android-project.html\">Jack, Jill &amp; building Android apps by hand<\/a><\/li>\n<\/ul>\n<p>This being said, let's see how to build our sample project with Gradle. We can\nlist all the available tasks with<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/gradlew tasks --all\n<\/pre><\/div>\n\n\n<p>Among all the tasks listed by the previous command, we should see the following\nones<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"gh\">Build tasks<\/span>\n<span class=\"gh\">-----------<\/span>\nassemble - Assembles all variants of all applications and secondary packages.\nassembleAndroidTest - Assembles all the Test applications.\nassembleDebug - Assembles all Debug builds.\nassembleRelease - Assembles all Release builds.\nbuild - Assembles and tests this project.\nbuildDependents - Assembles and tests this project and all projects that depend on it.\nbuildNeeded - Assembles and tests this project and all projects it depends on.\nclean - Deletes the build directory.\ncleanBuildCache - Deletes the build cache directory.\ncompileDebugAndroidTestSources\ncompileDebugSources\ncompileDebugUnitTestSources\ncompileReleaseSources\ncompileReleaseUnitTestSources\nmockableAndroidJar - Creates a version of android.jar that&#39;s suitable for unit tests.\n<\/pre><\/div>\n\n\n<p>We could decide to <code>assemble<\/code> our project, or <code>build<\/code> it in case we also wrote\ntests and want them built too. 
As it doesn't make much of a difference in our\ncase, we can run the <code>build<\/code> task with<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/gradlew build\n<\/pre><\/div>\n\n\n<blockquote>\n<p>The <code>build<\/code> task actually does a bit more than assemble and execute tests. It\nalso runs the <code>lint<\/code> task. More on this later.<\/p>\n<\/blockquote>\n<p>If the build task was successful, you should now have the <code>androtest-debug.apk<\/code>\nfile in the <code>build\/outputs\/apk<\/code> folder, ready to be deployed on an Android device.<\/p>\n<h2 id=\"creating-a-virtual-device\">Creating a Virtual Device<\/h2>\n<p>If you want to install the application on a physical device, you can skip\nthis section and go straight to the next one. Under some circumstances, you\nmight want to test your application on diverse hardware configurations, and the best\nway is probably to make use of a <em>virtual device<\/em>. If you have installed the\n<code>emulator<\/code> package as previously described, you can create a virtual device\nwith the following commands. But first we need to install a <em>system image<\/em>, for\nexample the <strong>Intel x86 Atom_64 System Image<\/strong>. Since our application doesn't\nmake use of the Google APIs we can opt for the default image:<\/p>\n<div class=\"highlight\"><pre><span><\/span>sdkmanager <span class=\"s2\">&quot;system-images;android-23;default;x86_64&quot;<\/span>\n<\/pre><\/div>\n\n\n<p>The download and installation might take some time, so just wait for\n<code>sdkmanager<\/code> to complete. You can then proceed to create a virtual device with<\/p>\n<div class=\"highlight\"><pre><span><\/span>avdmanager create avd -n <span class=\"nb\">test<\/span> -k <span class=\"s2\">&quot;system-images;android-23;default;x86_64&quot;<\/span> -d <span class=\"m\">8<\/span>\n<\/pre><\/div>\n\n\n<p>This will create a virtual device named <code>test<\/code> with the system image that we\nhave just downloaded. 
The <code>-d<\/code> switch specifies which device we want to emulate.\nIn this case the value 8 represents a \"Nexus 5\" device. A complete list of the\ndevices that can be emulated can be obtained with<\/p>\n<div class=\"highlight\"><pre><span><\/span>avdmanager list device\n<\/pre><\/div>\n\n\n<p>Finally, to run the newly created virtual device we use the <code>emulator<\/code> tool,\njust like this:<\/p>\n<div class=\"highlight\"><pre><span><\/span>emulator @test -skin 768x1280\n<\/pre><\/div>\n\n\n<p>where <code>@test<\/code> specifies the name of the virtual device we want to run, and the\n<code>-skin<\/code> switch defines the resolution we want to run the emulator at. Note that\na bug in the <code>emulator<\/code> tools prevents you from running this command from an arbitrary\nworking directory. Instead, you need to navigate to <code>$ANDROID_HOME\/emulator<\/code> in\norder to start it without errors.<\/p>\n<p>For more details on how to manage virtual devices, have a look at the\n<a href=\"https:\/\/developer.android.com\/studio\/command-line\/avdmanager.html#global_options\"><code>avdmanager<\/code><\/a>\ndocumentation page.<\/p>\n<h2 id=\"installing-the-apk\">Installing the APK<\/h2>\n<p>There are at least two ways of installing the APK on an Android device. One is\nvia a Gradle task, and the other is more manual and involves the Android Debug\nBridge tool <code>adb<\/code>. 
Let's have a look at both.<\/p>\n<p>From the previous run of the <code>.\/gradlew tasks --all<\/code>, you should have seen the\nfollowing tasks<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"gh\">Install tasks<\/span>\n<span class=\"gh\">-------------<\/span>\ninstallDebug - Installs the Debug build.\ninstallDebugAndroidTest - Installs the android (on device) tests for the Debug build.\nuninstallAll - Uninstall all applications.\nuninstallDebug - Uninstalls the Debug build.\nuninstallDebugAndroidTest - Uninstalls the android (on device) tests for the Debug build.\nuninstallRelease - Uninstalls the Release build.\n<\/pre><\/div>\n\n\n<p>The <code>installDebug<\/code> task will look for all the Android devices in debugging mode\nconnected to your machine and attempt to install the APK on <em>all<\/em> of them. For\ninstance, if you have a physical device in debug mode connected to your\nPC\/laptop while also running a virtual device, this Gradle task will install the\nAPK on both of them.<\/p>\n<p>If you want to install the APK on only one of the currently connected devices,\nyou can do so with the <code>adb<\/code> tool. First of all, determine the identifier of\neach of the connected devices with<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ adb devices\nList of devices attached\n* daemon not running. starting it now at tcp:5037 *\n* daemon started successfully *\n05426e2409d30434    device\nemulator-5554   device\n<\/pre><\/div>\n\n\n<p>The device with ID <code>05426e2409d30434<\/code> is a physical Nexus 5 in debug mode\nconnected via USB to my laptop, while <code>emulator-5554<\/code>, as the name indicates, is\na running instance of the <code>test<\/code> virtual device we created before. We can\ninstall the APK on e.g. 
the emulator with<\/p>\n<div class=\"highlight\"><pre><span><\/span>adb -s <span class=\"s2\">&quot;emulator-5554&quot;<\/span> install build\/outputs\/apk\/androtest-debug.apk\n<\/pre><\/div>\n\n\n<p>after which the application will be ready to run on the virtual device.<\/p>\n<h2 id=\"the-lint-tasks\">The lint tasks<\/h2>\n<p>If you have ever used Android Studio you might wonder whether it is possible to\nget all the useful code analysis and hints that the IDE brings up while you code\nyour application. While this feature is probably not supported by the text\neditor of your choice, we can remedy this by using the lint tool provided by the\nAndroid SDK. A convenient way to invoke it is through a dedicated Gradle task.\nIn the output of <code>.\/gradlew tasks --all<\/code> you should also see the following\ntasks<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"gh\">Verification tasks<\/span>\n<span class=\"gh\">------------------<\/span>\ncheck - Runs all checks.\nconnectedAndroidTest - Installs and runs instrumentation tests for all flavors on connected devices.\nconnectedCheck - Runs all device checks on currently connected devices.\nconnectedDebugAndroidTest - Installs and runs the tests for debug on connected devices.\ndeviceAndroidTest - Installs and runs instrumentation tests using all Device Providers.\ndeviceCheck - Runs all device checks using Device Providers and Test Servers.\nlint - Runs lint on all variants.\nlintDebug - Runs lint on the Debug build.\nlintRelease - Runs lint on the Release build.\ntest - Run unit tests for all variants.\ntestDebugUnitTest - Run unit tests for the debug build.\ntestReleaseUnitTest - Run unit tests for the release build.\n<\/pre><\/div>\n\n\n<p>Notice how we can invoke the lint tool in multiple ways. As we saw before, the\n<code>build<\/code> task, as opposed to <code>assemble<\/code>, will also run the lint tool. 
But we can\nalso choose to run it as part of a full check with the <code>check<\/code> task, or just on\nits own, on every variant of our project or simply on the debug build. As an\nexample, let's run the <code>lint<\/code> task with<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/gradlew lint\n<\/pre><\/div>\n\n\n<p>The output should contain something that looks like the following<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"o\">...<\/span><span class=\"w\"><\/span>\n<span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">Task<\/span><span class=\"w\"> <\/span><span class=\"p\">:<\/span><span class=\"n\">lint<\/span><span class=\"w\"><\/span>\n<span class=\"n\">Ran<\/span><span class=\"w\"> <\/span><span class=\"n\">lint<\/span><span class=\"w\"> <\/span><span class=\"n\">on<\/span><span class=\"w\"> <\/span><span class=\"n\">variant<\/span><span class=\"w\"> <\/span><span class=\"n\">debug<\/span><span class=\"p\">:<\/span><span class=\"w\"> <\/span><span class=\"mi\">13<\/span><span class=\"w\"> <\/span><span class=\"n\">issues<\/span><span class=\"w\"> <\/span><span class=\"n\">found<\/span><span class=\"w\"><\/span>\n<span class=\"n\">Ran<\/span><span class=\"w\"> <\/span><span class=\"n\">lint<\/span><span class=\"w\"> <\/span><span class=\"n\">on<\/span><span class=\"w\"> <\/span><span class=\"n\">variant<\/span><span class=\"w\"> <\/span><span class=\"n\">release<\/span><span class=\"p\">:<\/span><span class=\"w\"> <\/span><span class=\"mi\">13<\/span><span class=\"w\"> <\/span><span class=\"n\">issues<\/span><span class=\"w\"> <\/span><span class=\"n\">found<\/span><span class=\"w\"><\/span>\n<span class=\"n\">Wrote<\/span><span class=\"w\"> <\/span><span class=\"n\">HTML<\/span><span class=\"w\"> <\/span><span class=\"n\">report<\/span><span class=\"w\"> <\/span><span class=\"n\">to<\/span><span class=\"w\"> <\/span><span class=\"n\">file<\/span><span class=\"p\">:<\/span><span class=\"o\">\/\/\/<\/span><span 
class=\"n\">home<\/span><span class=\"o\">\/<\/span><span class=\"n\">gabriele<\/span><span class=\"o\">\/<\/span><span class=\"n\">Projects<\/span><span class=\"o\">\/<\/span><span class=\"n\">androtest<\/span><span class=\"o\">\/<\/span><span class=\"n\">build<\/span><span class=\"o\">\/<\/span><span class=\"n\">reports<\/span><span class=\"o\">\/<\/span><span class=\"n\">lint<\/span><span class=\"o\">-<\/span><span class=\"n\">results<\/span><span class=\"o\">.<\/span><span class=\"n\">html<\/span><span class=\"w\"><\/span>\n<span class=\"n\">Wrote<\/span><span class=\"w\"> <\/span><span class=\"n\">XML<\/span><span class=\"w\"> <\/span><span class=\"n\">report<\/span><span class=\"w\"> <\/span><span class=\"n\">to<\/span><span class=\"w\"> <\/span><span class=\"n\">file<\/span><span class=\"p\">:<\/span><span class=\"o\">\/\/\/<\/span><span class=\"n\">home<\/span><span class=\"o\">\/<\/span><span class=\"n\">gabriele<\/span><span class=\"o\">\/<\/span><span class=\"n\">Projects<\/span><span class=\"o\">\/<\/span><span class=\"n\">androtest<\/span><span class=\"o\">\/<\/span><span class=\"n\">build<\/span><span class=\"o\">\/<\/span><span class=\"n\">reports<\/span><span class=\"o\">\/<\/span><span class=\"n\">lint<\/span><span class=\"o\">-<\/span><span class=\"n\">results<\/span><span class=\"o\">.<\/span><span class=\"n\">xml<\/span><span class=\"w\"><\/span>\n\n\n<span class=\"n\">BUILD<\/span><span class=\"w\"> <\/span><span class=\"n\">SUCCESSFUL<\/span><span class=\"w\"> <\/span><span class=\"ow\">in<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"n\">s<\/span><span class=\"w\"><\/span>\n<span class=\"mi\">25<\/span><span class=\"w\"> <\/span><span class=\"n\">actionable<\/span><span class=\"w\"> <\/span><span class=\"n\">tasks<\/span><span class=\"p\">:<\/span><span class=\"w\"> <\/span><span class=\"mi\">7<\/span><span class=\"w\"> <\/span><span class=\"n\">executed<\/span><span class=\"p\">,<\/span><span class=\"w\"> 
<\/span><span class=\"mi\">18<\/span><span class=\"w\"> <\/span><span class=\"n\">up<\/span><span class=\"o\">-<\/span><span class=\"n\">to<\/span><span class=\"o\">-<\/span><span class=\"n\">date<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>This tells us that the lint tool has found 13 issues in both the debug and the\nrelease builds of our project, and that a report is available as both an HTML and\nan XML file in <code>build\/reports<\/code>.<\/p>\n<p>Let's open the HTML report and have a look at the reported issues. As\nexpected, the largest number of issues is about <em>internationalisation<\/em>, as we\nhave used many hard-coded strings in our source code. So let's try to fix some\nof these, starting with the <code>HardcodedText<\/code> one. By clicking on it, we get sent\nto a more detailed part of the page that suggests that we use a <code>@string<\/code> resource\ninstead of the hard-coded string <code>Gravity sensors<\/code>. Since our code will change\nthe string at run-time in any case, the simplest solution is to replace it with\nan empty string. If we now run the lint task again, we should see only 12\nissues. Good! :)<\/p>\n<p>Let's now tackle a couple of <em>TextView Internationalization<\/em> issues. The first\none is about the string <code>\"Detected gravity sensors: \" +\nInteger.toString(nSensors)<\/code> that we have used in the <code>MainActivity.java<\/code> source\nfile. The problem here is two-fold. We should use a resource string and favour\nplaceholders rather than concatenation. So let's create a <a href=\"https:\/\/developer.android.com\/guide\/topics\/resources\/string-resource.html\">string\nresource<\/a>\nfile. 
Recall from before that this should go in an XML file, say <code>strings.xml<\/code>,\nwithin the <code>src\/main\/res\/values<\/code> folder.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;<\/span>\n<span class=\"p\">&lt;<\/span><span class=\"nt\">resources<\/span><span class=\"p\">&gt;<\/span>\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">string<\/span> <span class=\"na\">name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;sensors_no&quot;<\/span><span class=\"p\">&gt;<\/span>Detected gravity sensors: %d<span class=\"p\">&lt;\/<\/span><span class=\"nt\">string<\/span><span class=\"p\">&gt;<\/span>\n<span class=\"p\">&lt;\/<\/span><span class=\"nt\">resources<\/span><span class=\"p\">&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>We can now replace line 44 in <code>MainActivity.java<\/code> with<\/p>\n<div class=\"highlight\"><pre><span><\/span>      <span class=\"n\">text<\/span><span class=\"p\">.<\/span><span class=\"na\">setText<\/span><span class=\"p\">(<\/span><span class=\"n\">getString<\/span><span class=\"p\">(<\/span><span class=\"n\">R<\/span><span class=\"p\">.<\/span><span class=\"na\">string<\/span><span class=\"p\">.<\/span><span class=\"na\">sensors_no<\/span><span class=\"p\">,<\/span> <span class=\"n\">nSensors<\/span><span class=\"p\">));<\/span>\n<\/pre><\/div>\n\n\n<p>Run the lint task again, and the count should now be down to only 9 issues. This\nshould give you the idea of how to continue fixing the remaining findings.<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>In this post we saw that, if we want to develop Android applications by simply\nrelying on a text editor and some command line tools, the initial set up of the\ndevelopment environment requires a few preliminary steps, some of which are\nperformed when you install Android Studio. 
However, some of the other steps need\nto be performed regardless, with the only difference that, in Android Studio,\nyou would have a GUI to assist you. The bonus of going IDE-free is\nthat we now know what operations are being performed by the IDE when,\ne.g., we install an Android SDK that targets a certain API level.<\/p>\n<p>Creating a Gradle project is quite easy, and in this we are assisted by Gradle\nitself. The Android Gradle plugin takes care of invoking the right tools to\nperform the most common tasks, like building our project, or even installing it\non an Android device. One handy feature that we might miss out on with a plain text\neditor is the live linting of our code. However, at any stage of the\ndevelopment, we can run the <code>lint<\/code> Gradle task to generate a report with all the\nissues that have been discovered within the whole project.<\/p>\n<p>This post should also have shown you that, whilst Android Studio is a very\nuseful tool, there is practically nothing that we would miss out on by using the\ncommand line tools of the SDK instead. Development might at first be less\nsmooth, and the learning curve steeper, but it surely has its positives too.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"android"}},{"@attributes":{"term":"gradle"}},{"@attributes":{"term":"java"}}]},{"title":"A Gentle Introduction to IoT","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/a-gentle-introduction-to-iot.html","rel":"alternate"}},"published":"2017-07-31T19:30:15+01:00","updated":"2017-07-31T19:30:15+01:00","author":{"name":"Gabriele N. Tornetta"},"id":"tag:p403n1x87.github.io,2017-07-31:\/a-gentle-introduction-to-iot.html","summary":"<p>The IoT revolution has started. But what is it exactly? Is it hard to take part in it? In this post I present you with all the details of a very simple and fairly inexpensive <i>Internet of Things<\/i> project. 
Read on as we go from assembling the required hardware, to coding the software that will drive it, exploring some of the most modern free technologies that are on offer today. At the end we will be able to take control of some LEDs over the internet, from wherever we are. Consider this a launch pad for more complex and exciting IoT projects.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#preamble\">Preamble<\/a><ul>\n<li><a href=\"#whats-iot\">What's IoT?<\/a><\/li>\n<li><a href=\"#the-project\">The Project<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#the-hardware\">The Hardware<\/a><ul>\n<li><a href=\"#the-gpio-pins\">The GPIO Pins<\/a><\/li>\n<li><a href=\"#the-circuit\">The Circuit<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#the-software\">The Software<\/a><ul>\n<li><a href=\"#the-rpi-python-module\">The RPi Python Module<\/a><\/li>\n<li><a href=\"#the-wsgi-specification\">The WSGI Specification<\/a><\/li>\n<li><a href=\"#writing-the-web-application\">Writing the Web Application<\/a><\/li>\n<li><a href=\"#configuring-apache\">Configuring Apache<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h1 id=\"preamble\">Preamble<\/h1>\n<p>No, I'm afraid this post is not on how to build your army of toy soldiers. If\nthe disappointment hasn't stopped you from reading any further, what I will\ndiscuss here is something simpler and more peaceful, involving general\nelectronics, single-board computers, LEDs, Python, web servers and web\napplications. All in just one place!<\/p>\n<p>I will try to introduce the fundamental concepts of IoT by example, and I will\ndo so by sharing my recent hands-on experience with my first Raspberry Pi, and\nhow this has led me to discover fascinating technologies of the modern\nera. 
If your main objective is still to build an army of toy soldiers, this is\ndefinitely a first step (however you should consider using technology for better\npurposes, really!).<\/p>\n<p>I will try not to take much for granted, but of course I will have to draw a\nline at some point, or this post would never have been finished! The subject is\nsoftware, but it is also hardware, for the former would make little sense\nwithout the latter.<\/p>\n<p>The approach will be very practical. We will start with a concrete problem, and\nwe shall see how to solve it, both from a hardware and a software perspective.<\/p>\n<h2 id=\"whats-iot\">What's IoT?<\/h2>\n<p>Can you eat it? Well, although you are free to munch on a breadboard if you\nreally want to, IoT, as you probably know already, stands for <em>Internet of\nThings<\/em>. This term is used to indicate the inter-networking of physical devices\n(the <em>things<\/em>) that are equipped with electronics, sensors, motors, etc., and\nthat communicate information with each other, sometimes taking actions based on\nthe received inputs. In this sense, everyday devices, like the fridges or\nthe washing machines in our homes, or vending machines, or cars even, become\n<em>smart<\/em>.<\/p>\n<p>Even though the first example of <em>smart things<\/em> appeared in 1982 (and you\ncan surely consider the toy army of Toy Story as an example of IoT in early\nanimation movies), it is around 2016 that the IoT has evolved greatly, and the\nkey aspect is the almost ubiquitous availability of wireless networks that allow\nan increasing pool of diverse devices to communicate with one another.<\/p>\n<p>The IoT example of this post is somewhat of a classic, but one that will hopefully give\nyou a rough idea of what the IoT is about, in case this is the first time\nthat you have come across it. 
It is also quite simple and inexpensive, but\nnonetheless it will spawn many interesting connections with some fascinating modern\ntechnologies.<\/p>\n<h2 id=\"the-project\">The Project<\/h2>\n<p>So what exactly is the project described in this post about? The idea is to turn\nsome LEDs on and off by sending commands to a Raspberry Pi through a web page.\nThis might seem quite a trivial project, but it has given me the opportunity to\nrefresh some rusty knowledge of electronics that dated back to my undergraduate\nyears, as well as learn a great deal of new things from the software side too.<\/p>\n<h1 id=\"the-hardware\">The Hardware<\/h1>\n<p>In this first part I will describe all the hardware that is necessary for the\nproject. The main component is, I would say, a single-board computer like a Raspberry\nPi. In my case, I'm using a Raspberry Pi 3 Model B running the Raspbian OS. To\nturn an LED on and off we will use one of the GPIO pins. For convenience, I will\nalso use a T-cobbler to connect all the GPIO pins to a breadboard, where the\nrest of the circuit will be assembled. All that we still need are a couple of\njumper wires, a 220 \u03a9 resistor and an LED of your favourite colour. To\nsummarise, here is the minimum required:<\/p>\n<ul>\n<li>1x Raspberry Pi<\/li>\n<li>1x breadboard<\/li>\n<li>1x T-cobbler and a bus cable (alternatively 2x male-to-female jumper wires)<\/li>\n<li>2x male-to-male jumper wires<\/li>\n<li>1x 220 \u03a9 resistor<\/li>\n<li>1x LED (about 20 mA max current)<\/li>\n<\/ul>\n<h2 id=\"the-gpio-pins\">The GPIO Pins<\/h2>\n<p>Before looking at the circuit, it is probably best to mention a few facts about\nthe Raspberry Pi. In particular, the key part here is the set of GPIO pins that\nit provides.<\/p>\n<p>The General Purpose Input-Output pins represent, as the name itself suggests, a\nway to connect to external devices in order to send signals to and receive signals from the\nRaspberry Pi. 
Some of these can be set to work as input or output, and can be\nset either high (3.3 V or a logic 1) or low (0 V or a logic 0). Some pins can be\nused together for other purposes, like communicating with another device through\na Serial Peripheral Interface (SPI), attaching an LCD screen through the Display\nParallel Interface (DPI), etc. Good references for the Raspberry Pi GPIO pins\nare <a href=\"https:\/\/pinout.xyz\/\">this website<\/a> and the <a href=\"https:\/\/www.raspberrypi.org\/documentation\/hardware\/raspberrypi\/bcm2835\/BCM2835-ARM-Peripherals.pdf\">Broadcom BCM2835 ARM\nPeripherals<\/a>\nmanual.<\/p>\n<p>From the software side, the pins can be conveniently configured and controlled\nby means of the RPi Python module. The thing to be mindful of is that there are\na bunch of different naming conventions for the pins on the Raspberry Pi. The\nmain ones, to use the terminology of the RPi module, are <code>BOARD<\/code> and <code>BCM<\/code>.\nThe former is a pin numbering that reflects the physical location of the pins on\nthe PCB. Pin number one is in the top-left corner and gives a 3.3 V fixed\noutput. Pin 2 is the one to its right, Pin 3 is the one below Pin 1, and so forth.\nThe latter is the numbering convention used in the Broadcom manual.<\/p>\n<h2 id=\"the-circuit\">The Circuit<\/h2>\n<p>Here is the schematic of the circuit that we want to build.<\/p>\n<p><img alt=\"The circuit schematics\" class=\"center-image\" src=\"https:\/\/p403n1x87.github.io\/images\/iot\/iot_schem.png\">\n<em>The schematic of the circuit, showing all the components used for this project.<\/em><\/p>\n<p>As I have already mentioned, I prefer using a T-cobbler to connect all the GPIO\npins to the breadboard. 
In case you are not using one, your\nbreadboard should look like the one in this picture.<\/p>\n<p><img alt=\"The physical components\" class=\"center-image\" src=\"https:\/\/p403n1x87.github.io\/images\/iot\/iot_bb.png\">\n<em>Another schematic representation of the circuit, showing how the components are physically connected with each other on the breadboard and the Raspberry Pi 3 Model B.<\/em><\/p>\n<p>Where did the magic number 220 \u03a9 come from? The explanation is very simple and\nessentially based on Ohm's law. For a resistor <span class=\"math\">\\(R\\)<\/span> across which a voltage\ndifference <span class=\"math\">\\(V\\)<\/span> is applied, the current flowing through it is given by<\/p>\n<div class=\"math\">$$I = \\frac VR.$$<\/div>\n<p>An LED is a diode, i.e. a p-n junction, that is capable of emitting light. In\n<em>forward bias<\/em>, a first-order approximation of a diode is given by a small\nresistor (order of 10 \u03a9) in series with a voltage generator (of about -0.67 V).\nThe manufacturer of the LED usually provides the maximum current that the diode\ncan withstand. In the case of common LEDs, this value is around 20 mA.\nConsidering that, when a GPIO pin is on, it will provide 3.3 V to our circuit,\nin order not to burn our LED we need to use a resistor of resistance <span class=\"math\">\\(R\\)<\/span> given\nby the inequality<\/p>\n<div class=\"math\">$$\\frac{3.3\\text{ V} - 0.67\\text{ V}} R \\leq 20\\text{ mA},$$<\/div>\n<p>which yields<\/p>\n<div class=\"math\">$$R\\geq 130\\ \\Omega.$$<\/div>\n<p>For a better estimate, we can look at the V-I chart provided by the\nmanufacturer, which would probably give us a minimum value closer to 200 \u03a9.\nAnything below that and you risk frying your LED. Using a much bigger resistor,\nhowever, would starve it of current and it would not turn on at all. 
But with\n220 \u03a9 we should be perfectly fine (and safe!).<\/p>\n<h1 id=\"the-software\">The Software<\/h1>\n<p>Now that we have assembled the required hardware, it is time to see how to\ncontrol it. What we have done so far is to connect the Raspberry Pi to a very\nsimple one-wire device, i.e. an LED. The IoT is about the possibility of controlling a\ndevice from the internet, anywhere in the world, with the actual device that we\nwant to control possibly miles and miles away from us.<\/p>\n<p>What we now need is then a simple interface, accessible from the internet, that\nallows us to control the LED. Let's see how to create such an interface,\nstep by step.<\/p>\n<p>But before we go any further, let's have a look, like we did with the hardware\npart, at all that we will need.<\/p>\n<ul>\n<li>Python (2.7 or later; note that the project has been tested with Python 2.7\n   and it might have to be adapted slightly to work with Python 3)<\/li>\n<li>a web server, e.g. Apache2<\/li>\n<li>a text editor of your choice.<\/li>\n<\/ul>\n<p>Yep, that's all.<\/p>\n<h2 id=\"the-rpi-python-module\">The RPi Python Module<\/h2>\n<p>On Raspbian there is a pretty simple way to control the GPIO pins that comes\nalready bundled with the OS. I'm talking about the\n<a href=\"https:\/\/pypi.python.org\/pypi\/RPi.GPIO\">RPi<\/a> Python module. For the simple task\nthat we want to achieve here, RPi exposes all the features that we need.\nHowever, keep in mind that more advanced tasks, like real-time applications,\ncannot be solved by this module. This is not just because it is a Python module,\nbut a \"limitation\" of any OS based on the Linux kernel, which is multitasking in\nnature. 
This means that the kernel can decide on its own how to allocate\nresources for all the running processes, potentially giving rise to jitter in\nyour applications.<\/p>\n<p>Another important limitation that should push you towards other approaches, like\n<a href=\"http:\/\/wiringpi.com\/\">wiringPi<\/a>, is the lack of hardware PWM. PWM stands for\nPulse-Width Modulation and is a technique used to encode a message in a pulsing\nsignal. Some of the common uses are that of dimming an LED (recall that an LED\nresponse is exponential, even though we treated it as linear in our first-order\napproximation discussed above), or controlling a motor, but this is a topic that\nwould take us away from the main focus of this post, and it might be the subject\nof a future one.<\/p>\n<p>Returning to our project, let's have a look at how to turn our LED on. Recall\nfrom the schematic above that we are using the GPIO Pin 16 (according to the\nBroadcom convention), which is the 36th physical pin on the GPIO.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">import<\/span> <span class=\"nn\">RPi.GPIO<\/span> <span class=\"k\">as<\/span> <span class=\"nn\">G<\/span>     <span class=\"c1\"># Import the GPIO component from the RPi module<\/span>\n\n<span class=\"n\">LED<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">36<\/span>                 <span class=\"c1\"># Define a constant with the pin number where the LED<\/span>\n                         <span class=\"c1\"># is connected<\/span>\n\n<span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setmode<\/span><span class=\"p\">(<\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">BOARD<\/span><span class=\"p\">)<\/span>       <span class=\"c1\"># Set the pin numbering mode to the physical number<\/span>\n\n<span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setup<\/span><span class=\"p\">(<\/span><span class=\"n\">LED<\/span><span 
class=\"p\">,<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">OUT<\/span><span class=\"p\">)<\/span>      <span class=\"c1\"># Set the LED pin to output mode<\/span>\n\n<span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"n\">LED<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>         <span class=\"c1\"># Set the pin to high (3.3 V)<\/span>\n<\/pre><\/div>\n\n\n<p>We can type the above lines of Python code directly into the Python interpreter.\nTo turn the LED off we can either set the value of pin 36 back to 0<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"n\">LED<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>or clean up the GPIO configuration with<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">cleanup<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>If we ever want to use more than just one LED on our breadboard, we can\nencapsulate most of the above code inside a Python class so that it can be\nreused instead of having to type it every time we set up a new LED. 
A\nminimalist Python class that would represent a physical LED would be something\nlike<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># File: led.py<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">RPi.GPIO<\/span> <span class=\"k\">as<\/span> <span class=\"nn\">G<\/span>\n\n<span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setmode<\/span><span class=\"p\">(<\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">BOARD<\/span><span class=\"p\">)<\/span>\n\n<span class=\"k\">class<\/span> <span class=\"nc\">LED<\/span><span class=\"p\">(<\/span><span class=\"nb\">object<\/span><span class=\"p\">):<\/span>\n  <span class=\"k\">def<\/span> <span class=\"fm\">__init__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">ch<\/span><span class=\"p\">):<\/span>\n    <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ch<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">setup<\/span><span class=\"p\">(<\/span><span class=\"n\">ch<\/span><span class=\"p\">,<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">OUT<\/span><span class=\"p\">)<\/span>\n\n  <span class=\"k\">def<\/span> <span class=\"nf\">on<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n\n  <span class=\"k\">def<\/span> <span class=\"nf\">off<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span 
class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n\n  <span class=\"nd\">@property<\/span>\n  <span class=\"k\">def<\/span> <span class=\"nf\">state<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">return<\/span> <span class=\"nb\">bool<\/span><span class=\"p\">(<\/span><span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">input<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span><span class=\"p\">))<\/span>\n\n  <span class=\"nd\">@state<\/span><span class=\"o\">.<\/span><span class=\"n\">setter<\/span>\n  <span class=\"k\">def<\/span> <span class=\"nf\">state<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">,<\/span> <span class=\"n\">value<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span><span class=\"p\">,<\/span> <span class=\"nb\">bool<\/span><span class=\"p\">(<\/span><span class=\"n\">value<\/span><span class=\"p\">))<\/span>\n\n  <span class=\"k\">def<\/span> <span class=\"nf\">toggle<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">output<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">_ch<\/span><span class=\"p\">,<\/span> <span class=\"ow\">not<\/span> <span class=\"n\">G<\/span><span class=\"o\">.<\/span><span class=\"n\">input<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span 
class=\"n\">_ch<\/span><span class=\"p\">))<\/span>\n<\/pre><\/div>\n\n\n<p>Our code for turning an LED on and off would then reduce to the following few\nlines<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kn\">from<\/span> <span class=\"nn\">led<\/span>  <span class=\"kn\">import<\/span> <span class=\"n\">LED<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">time<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">sleep<\/span>\n\n<span class=\"n\">led_red<\/span> <span class=\"o\">=<\/span> <span class=\"n\">LED<\/span><span class=\"p\">(<\/span><span class=\"mi\">36<\/span><span class=\"p\">)<\/span>\n\n<span class=\"n\">led_red<\/span><span class=\"o\">.<\/span><span class=\"n\">on<\/span><span class=\"p\">()<\/span>\n\n<span class=\"n\">sleep<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n\n<span class=\"n\">led_red<\/span><span class=\"o\">.<\/span><span class=\"n\">off<\/span><span class=\"p\">()<\/span>\n<\/pre><\/div>\n\n\n<p>The extra method <code>toggle<\/code> can be used, as the name suggests, to toggle the LED\nstate from on to off and vice versa. In the above example we could then replace\nboth <code>led_red.on()<\/code> and <code>led_red.off()<\/code> with <code>led_red.toggle()<\/code>. The property\n<code>state<\/code> is used to get the state of the LED as a Boolean value (<code>True<\/code> for on\nand <code>False<\/code> for off), and it can also be used to set it. For instance, something\nlike<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">led_red<\/span><span class=\"o\">.<\/span><span class=\"n\">state<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span>\n<\/pre><\/div>\n\n\n<p>would turn the LED on, since <code>[1]<\/code> evaluates to <code>True<\/code> when converted to a\nBoolean. 
Analogously, the following line of code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"n\">led_red<\/span><span class=\"o\">.<\/span><span class=\"n\">state<\/span> <span class=\"o\">=<\/span> <span class=\"p\">{}<\/span>\n<\/pre><\/div>\n\n\n<p>would turn the LED off, since <code>bool({})<\/code> is <code>False<\/code> in Python.<\/p>\n<p>Sweet! We now know how to control our LED with code and all that's left to do is\nbuild a nice web interface that will execute this code on demand.<\/p>\n<h2 id=\"the-wsgi-specification\">The WSGI Specification<\/h2>\n<p>Given that we already have some code in Python to control our LED, it would be\ngood if the web interface could use that code directly. Is this possible? The\n(short) answer is <em>yes<\/em>!<\/p>\n<p>A more detailed answer to the above question leads us into the realm of the\nWeb Server Gateway Interface specification, or WSGI for short. It is a universal\nspecification that was born out of the necessity of putting some order among all\nthe Python frameworks for developing web applications. Before the specification,\neach of said frameworks would be compatible with just a few web servers.\nChoosing one of them would then restrict your choice of a web server to go with\nit, and vice versa. To overcome this limitation, the WSGI specification was\nproposed in <a href=\"https:\/\/www.python.org\/dev\/peps\/pep-0333\/\">PEP 333<\/a>, which\ndates back to 2003. The technical details can be found in the linked page. Here\nwe just limit ourselves to the essential details of the specification that will\nallow us to write a simple web application to control the LED over the internet.<\/p>\n<p>In very simple terms, a Python web application consists of a <em>bootstrap\ncallable<\/em> object that is called by the web server. The Python code contained in\nthe callable is executed, and the web server expects a response consisting of a\ncode (200 for OK, 403 for Forbidden, 404 for Not Found, etc.) 
and a stream of\nbytes (usually the HTML code to be rendered by the browser). A callable can be\nany Python object that exposes a <code>__call__<\/code> function like, e.g., classes and\nfunctions.<\/p>\n<p>My favourite web server is Apache2 and it will be the one that I will discuss in\nthis post. Its functionalities can be extended with modules, and the\n<a href=\"https:\/\/modwsgi.readthedocs.io\/en\/develop\/\">mod_wsgi<\/a> project provides a\nWSGI-compliant module for Apache. The documentation is very detailed and covers\nall the aspects, from the installation to the configuration and debugging\ntechniques.<\/p>\n<p>Regarding the installation process, this can be carried out in two modes. Either\non the Apache-side, or on the Python-side. If you are into the IoT, chances are\nyou will have your web server running on a Raspberry Pi. For this reason, I will\ndiscuss how to install the mod_wsgi module for the Apache web server. On\nRaspbian, this can be done with<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo apt install libapache2-mod-wsgi\n<\/pre><\/div>\n\n\n<p>After the installation, the module should already be enabled. If this isn't the\ncase, you can enable it with<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo a2enmod wsgi\n<\/pre><\/div>\n\n\n<h2 id=\"writing-the-web-application\">Writing the Web Application<\/h2>\n<p>The next steps are to actually write our web application and configure Apache to\nrun it when a request comes in. In the configuration process we will specify our\nbootstrap Python script. By default, Apache expects to find a callable object\ninside it with the name <code>application<\/code>. The simplest thing that we could do is\nthen to create a Python script and define a function with such a name. 
The code\nof our application would then be contained in this function, or called by it, in\ncase we decide for a more modular approach.<\/p>\n<p>In case things would go wrong while we develop our web application, we might\nwant to be able to have a look at the Python stack trace to see where the\nproblems are. Normally we would have to look in the Apache error log (usually in\n<code>\/var\/log\/apache2\/error.log<\/code>). However, we can make use of a middleware from the\n<a href=\"https:\/\/paste.readthedocs.io\/en\/latest\/\">paste<\/a> Python package, which is\nspecifically designed for WSGI applications. Our bootstrap script will then look\nlike this.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># File: bootstrap.wsgi<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">sys<\/span><span class=\"o\">,<\/span> <span class=\"nn\">os<\/span>\n<span class=\"n\">sys<\/span><span class=\"o\">.<\/span><span class=\"n\">path<\/span><span class=\"o\">.<\/span><span class=\"n\">append<\/span><span class=\"p\">(<\/span><span class=\"n\">os<\/span><span class=\"o\">.<\/span><span class=\"n\">path<\/span><span class=\"o\">.<\/span><span class=\"n\">dirname<\/span><span class=\"p\">(<\/span><span class=\"vm\">__file__<\/span><span class=\"p\">))<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">index<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">Index<\/span>\n\n<span class=\"k\">def<\/span> <span class=\"nf\">main<\/span><span class=\"p\">(<\/span><span class=\"n\">env<\/span><span class=\"p\">,<\/span> <span class=\"n\">start_response<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">status<\/span> <span class=\"o\">=<\/span> <span class=\"s1\">&#39;200 OK&#39;<\/span>\n\n    <span class=\"n\">output<\/span> <span class=\"o\">=<\/span> <span class=\"n\">Index<\/span><span class=\"o\">.<\/span><span class=\"n\">main<\/span><span class=\"p\">(<\/span><span class=\"n\">env<\/span><span 
class=\"p\">)<\/span>\n\n    <span class=\"n\">response_headers<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span>\n        <span class=\"p\">(<\/span><span class=\"s1\">&#39;Content-type&#39;<\/span>   <span class=\"p\">,<\/span> <span class=\"s1\">&#39;text\/html&#39;<\/span>     <span class=\"p\">)<\/span>\n       <span class=\"p\">,(<\/span><span class=\"s1\">&#39;Content-Length&#39;<\/span> <span class=\"p\">,<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">output<\/span><span class=\"p\">)))<\/span>\n    <span class=\"p\">]<\/span>\n    <span class=\"n\">start_response<\/span><span class=\"p\">(<\/span><span class=\"n\">status<\/span><span class=\"p\">,<\/span> <span class=\"n\">response_headers<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"k\">return<\/span> <span class=\"p\">[<\/span><span class=\"n\">output<\/span><span class=\"p\">]<\/span>\n\n<span class=\"kn\">from<\/span> <span class=\"nn\">paste.exceptions.errormiddleware<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">ErrorMiddleware<\/span>\n<span class=\"n\">application<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ErrorMiddleware<\/span><span class=\"p\">(<\/span><span class=\"n\">main<\/span><span class=\"p\">,<\/span> <span class=\"n\">debug<\/span> <span class=\"o\">=<\/span> <span class=\"kc\">True<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>The first two lines are necessary if we want to be able to import modules and\npackages that are in the same folder as the bootstrap script.<\/p>\n<p>We then import the class <code>Index<\/code> from the <code>index<\/code> module. This is just a design\nchoice. 
The bootstrap script contains the essential code to get us to the main\npage (by means of the <code>Index<\/code> class), and returns us the full stack trace in\ncase of errors (by means of the <code>ErrorMiddleware<\/code> class from the paste package).\nTo make sense of the rest of the code, have a look at the already referenced\ndocumentation of the <code>mod_wsgi<\/code> module for Apache2.<\/p>\n<p>The core code of our application is contained in the <code>Index<\/code> class from the\n<code>index<\/code> module.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c1\"># File index.py<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">util<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">templ<\/span><span class=\"p\">,<\/span> <span class=\"n\">qs<\/span>\n<span class=\"kn\">from<\/span> <span class=\"nn\">led<\/span>  <span class=\"kn\">import<\/span> <span class=\"n\">LED<\/span>\n\n<span class=\"n\">led<\/span> <span class=\"o\">=<\/span> <span class=\"n\">LED<\/span><span class=\"p\">(<\/span><span class=\"mi\">15<\/span><span class=\"p\">)<\/span>\n\n<span class=\"k\">class<\/span> <span class=\"nc\">Index<\/span><span class=\"p\">(<\/span><span class=\"nb\">object<\/span><span class=\"p\">):<\/span>\n\n    <span class=\"nd\">@staticmethod<\/span>\n    <span class=\"k\">def<\/span> <span class=\"nf\">main<\/span><span class=\"p\">(<\/span><span class=\"n\">env<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">params<\/span> <span class=\"o\">=<\/span> <span class=\"n\">qs<\/span><span class=\"p\">(<\/span><span class=\"n\">env<\/span><span class=\"p\">)<\/span>\n        <span class=\"k\">if<\/span> <span class=\"s2\">&quot;action&quot;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">params<\/span> <span class=\"ow\">and<\/span> <span class=\"s1\">&#39;toggle&#39;<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">params<\/span><span class=\"p\">[<\/span><span 
class=\"s1\">&#39;action&#39;<\/span><span class=\"p\">]:<\/span>\n            <span class=\"n\">led<\/span><span class=\"o\">.<\/span><span class=\"n\">toggle<\/span><span class=\"p\">()<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">templ<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;index&#39;<\/span><span class=\"p\">,<\/span> <span class=\"n\">state<\/span> <span class=\"o\">=<\/span> <span class=\"s2\">&quot;on&quot;<\/span> <span class=\"k\">if<\/span> <span class=\"n\">led<\/span><span class=\"o\">.<\/span><span class=\"n\">state<\/span> <span class=\"k\">else<\/span> <span class=\"s2\">&quot;off&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>In fact, this module looks more like a view rather than a controller, as the\nactual code for controlling the LED is buried in the <code>led<\/code> module that we have\nanalysed previously. The <code>util.py<\/code> module contains some helper functions to\nconveniently deal with HTML templates and query strings. 
We refrain from showing\nthe code in this post, but you can find it in the <a href=\"https:\/\/github.com\/P403n1x87\/led_app\">dedicated GitHub\nrepository<\/a>.<\/p>\n<p>The HTML template contained in the <code>index.html<\/code> file is very simple and looks\nlike this.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">&lt;!DOCTYPE html&gt;<\/span>\n<span class=\"p\">&lt;<\/span><span class=\"nt\">html<\/span> <span class=\"na\">lang<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;en-us&quot;<\/span><span class=\"p\">&gt;<\/span>\n  <span class=\"p\">&lt;<\/span><span class=\"nt\">head<\/span><span class=\"p\">&gt;<\/span>\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">meta<\/span> <span class=\"na\">name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;viewport&quot;<\/span> <span class=\"na\">content<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;width=device-width, initial-scale=1.0&quot;<\/span><span class=\"p\">&gt;<\/span>\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">meta<\/span> <span class=\"na\">name<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;theme-color&quot;<\/span> <span class=\"na\">content<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;#666666&quot;<\/span> <span class=\"p\">\/&gt;<\/span>\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">link<\/span> <span class=\"na\">rel<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;stylesheet&quot;<\/span> <span class=\"na\">type<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;text\/css&quot;<\/span> <span class=\"na\">href<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;led_app\/led.css&quot;<\/span><span class=\"p\">&gt;<\/span>\n  <span class=\"p\">&lt;\/<\/span><span class=\"nt\">head<\/span><span class=\"p\">&gt;<\/span>\n  <span class=\"p\">&lt;<\/span><span class=\"nt\">body<\/span><span class=\"p\">&gt;<\/span>\n    <span class=\"p\">&lt;<\/span><span class=\"nt\">a<\/span> 
<span class=\"na\">href<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;?action=toggle&quot;<\/span><span class=\"p\">&gt;&lt;<\/span><span class=\"nt\">div<\/span> <span class=\"na\">class<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;button {state}&quot;<\/span><span class=\"p\">&gt;&lt;\/<\/span><span class=\"nt\">div<\/span><span class=\"p\">&gt;&lt;\/<\/span><span class=\"nt\">a<\/span><span class=\"p\">&gt;<\/span>\n  <span class=\"p\">&lt;\/<\/span><span class=\"nt\">body<\/span><span class=\"p\">&gt;<\/span>\n<span class=\"p\">&lt;\/<\/span><span class=\"nt\">html<\/span><span class=\"p\">&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>The body is essentially just a link, wrapping a placeholder <code>div<\/code> element, that\nissues a request with the parameter <code>action<\/code> set to <code>toggle<\/code>. The\nlook-and-feel of a 3D button is then provided by the classes contained in the\nlinked stylesheet, which you can find in the GitHub repository. Note how we use\nthe <code>{state}<\/code> placeholder in the class attribute of the <code>div<\/code> element. This\nallows us to set the button's appearance according to the LED's current state. In\nthe stylesheet we have two classes, <code>.on<\/code> and <code>.off<\/code>, the former giving a bright\nred colour to the button and the latter a darker shade. 
The value is\npassed by the line<\/p>\n<div class=\"highlight\"><pre><span><\/span>        <span class=\"k\">return<\/span> <span class=\"n\">templ<\/span><span class=\"p\">(<\/span><span class=\"s1\">&#39;index&#39;<\/span><span class=\"p\">,<\/span> <span class=\"n\">state<\/span> <span class=\"o\">=<\/span> <span class=\"s2\">&quot;on&quot;<\/span> <span class=\"k\">if<\/span> <span class=\"n\">led<\/span><span class=\"o\">.<\/span><span class=\"n\">state<\/span> <span class=\"k\">else<\/span> <span class=\"s2\">&quot;off&quot;<\/span><span class=\"p\">)<\/span>\n<\/pre><\/div>\n\n\n<p>in the static method <code>main<\/code> of the <code>Index<\/code> class of <code>index.py<\/code>.<\/p>\n<h2 id=\"configuring-apache\">Configuring Apache<\/h2>\n<p>We are almost ready to start playing with our LED over the internet. The last\nstep is to put our application up and running on the Apache web server. To this\nend there is a tiny bit of configuration that we need to do. We can start by\nmaking a copy of the default web site Apache2 comes with. On Raspbian, this is\nlocated in <code>\/etc\/apache2\/sites-available<\/code> and is contained in the\n<code>000-default.conf<\/code> configuration file. Make a copy of this file in, say,\n<code>led.conf<\/code> and modify it to look like the following one.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"c\"># File: led.conf<\/span><span class=\"w\"><\/span>\n<span class=\"nt\">&lt;VirtualHost<\/span><span class=\"w\"> <\/span><span class=\"s\">*:80<\/span><span class=\"nt\">&gt;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nb\">ServerAdmin<\/span><span class=\"w\"> <\/span>webmaster@localhost<span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"c\"># Required by static data storage access (e.g. 
css files)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nb\">DocumentRoot<\/span><span class=\"w\"> <\/span><span class=\"sx\">\/home\/pi\/Projects\/www<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"nb\">ErrorLog<\/span><span class=\"w\"> <\/span>${APACHE_LOG_DIR}\/error.log<span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nb\">CustomLog<\/span><span class=\"w\"> <\/span>${APACHE_LOG_DIR}\/access.log<span class=\"w\"> <\/span>combined<span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"c\"># WSGI Configuration<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nb\">WSGIScriptAlias<\/span><span class=\"w\"> <\/span><span class=\"sx\">\/led<\/span><span class=\"w\"> <\/span><span class=\"sx\">\/home\/pi\/Projects\/www\/led_app\/bootstrap.wsgi<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"nt\">&lt;Directory<\/span><span class=\"w\"> <\/span><span class=\"s\">\/home\/pi\/Projects\/www\/led_app<\/span><span class=\"nt\">&gt;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nb\">Require<\/span><span class=\"w\"> <\/span><span class=\"k\">all<\/span><span class=\"w\"> <\/span>granted<span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"nt\">&lt;\/Directory&gt;<\/span><span class=\"w\"><\/span>\n\n<span class=\"nt\">&lt;\/VirtualHost&gt;<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Even though we are implementing a WSGI application, the <code>DocumentRoot\n\/home\/pi\/Projects\/www<\/code> is needed because we are importing a css file in\n<code>index.html<\/code>. The above configuration file assumes that the web application\nresides in the <code>\/home\/pi\/Projects\/www\/led_app<\/code> folder. This way, static files\ncan be accessed with a relative path referring to the parent folder\n<code>\/home\/pi\/Projects\/www<\/code>. 
This explains why we are importing the <code>led.css<\/code> file\nas<\/p>\n<div class=\"highlight\"><pre><span><\/span>    <span class=\"p\">&lt;<\/span><span class=\"nt\">link<\/span> <span class=\"na\">rel<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;stylesheet&quot;<\/span> <span class=\"na\">type<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;text\/css&quot;<\/span> <span class=\"na\">href<\/span><span class=\"o\">=<\/span><span class=\"s\">&quot;led_app\/led.css&quot;<\/span><span class=\"p\">&gt;<\/span>\n<\/pre><\/div>\n\n\n<p>The section below the <code># WSGI Configuration<\/code> comment is the WSGI part of the\nconfiguration. The first line of this section tells Apache which script to use\nto bootstrap the web application. This is the script that is assumed to contain\nthe callable object named <code>application<\/code>. The rest of the WSGI configuration\nsection is required to grant read access to the bootstrap script and the Python\nscripts contained in the web application folder.<\/p>\n<p>That's it! All we have to do now is deactivate any other website on port 80 with\n<code>a2dissite<\/code> (this is only required if the web sites are not named), and enable\n<code>led.conf<\/code> with <code>a2ensite led.conf<\/code>. Restart the Apache2 web server with<\/p>\n<div class=\"highlight\"><pre><span><\/span>sudo service apache2 restart\n<\/pre><\/div>\n\n\n<p>and, from any device on the same local network as the Pi, point the browser\nto the Pi's local IP address followed by <code>\/led<\/code> (e.g.\n<code>http:\/\/192.168.0.203\/led<\/code>). You should now see the web application, with a\nred button that lets you control the LED state over the LAN.<\/p>\n<p><img alt=\"The final look of the web application\" class=\"center-image\" src=\"https:\/\/p403n1x87.github.io\/images\/iot\/website.png\">\n<em>The final look of the web application. 
The red button in the middle is used to toggle the LED state.<\/em><\/p>\n<p>If you are behind a router, of course you will need to forward port 80 to\nthe Pi's local address before you can access your web application\nfrom the internet, outside of your local network. In this case you will have to\nuse the router's public IP instead of the local IP.<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>We have come to the end of this post on a simple IoT project. Its main purpose,\nlike most of the other posts in this blog, is two-fold. On the one hand, it is a way\nfor me to take note of new things that I have discovered and that I can later\ncome back to if I need to. All the information gathered from the cited sources\nis collected in a single place, which makes it more convenient than having to go\nthrough them separately. On the other hand, it is a way to share my experience\nwith others, in the hope that it could be useful somehow. Even though the\nproject in itself, as I have remarked many times now, is quite simple, the post\nincludes references to many topics, e.g. electronics, programming, web servers\nand applications, and it shows you how all these different aspects can be\norganically combined together to create something.<\/p>\n<script type=\"text\/javascript\">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {\n    var align = \"center\",\n        indent = \"0em\",\n        linebreak = \"false\";\n\n    if (false) {\n        align = (screen.width < 768) ? \"left\" : align;\n        indent = (screen.width < 768) ? \"0em\" : indent;\n        linebreak = (screen.width < 768) ? 
'true' : linebreak;\n    }\n\n    var mathjaxscript = document.createElement('script');\n    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';\n    mathjaxscript.type = 'text\/javascript';\n    mathjaxscript.src = 'https:\/\/cdnjs.cloudflare.com\/ajax\/libs\/mathjax\/2.7.3\/latest.js?config=TeX-AMS-MML_HTMLorMML';\n\n    var configscript = document.createElement('script');\n    configscript.type = 'text\/x-mathjax-config';\n    configscript[(window.opera ? \"innerHTML\" : \"text\")] =\n        \"MathJax.Hub.Config({\" +\n        \"    config: ['MMLorHTML.js'],\" +\n        \"    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'none' } },\" +\n        \"    jax: ['input\/TeX','input\/MathML','output\/HTML-CSS'],\" +\n        \"    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js'],\" +\n        \"    displayAlign: '\"+ align +\"',\" +\n        \"    displayIndent: '\"+ indent +\"',\" +\n        \"    showMathMenu: true,\" +\n        \"    messageStyle: 'normal',\" +\n        \"    tex2jax: { \" +\n        \"        inlineMath: [ ['\\\\\\\\(','\\\\\\\\)'] ], \" +\n        \"        displayMath: [ ['$$','$$'] ],\" +\n        \"        processEscapes: true,\" +\n        \"        preview: 'TeX',\" +\n        \"    }, \" +\n        \"    'HTML-CSS': { \" +\n        \"        availableFonts: ['STIX', 'TeX'],\" +\n        \"        preferredFont: 'STIX',\" +\n        \"        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} },\" +\n        \"        linebreaks: { automatic: \"+ linebreak +\", width: '90% container' },\" +\n        \"    }, \" +\n        \"}); \" +\n        \"if ('default' !== 'default') {\" +\n            \"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n            \"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n        \"}\";\n\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(configscript);\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);\n}\n<\/script>","category":[{"@attributes":{"term":"IoT"}},{"@attributes":{"term":"raspberry pi"}},{"@attributes":{"term":"wsgi"}},{"@attributes":{"term":"electronics"}}]},{"title":"Prime Numbers, Algorithms and Computer Architectures","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/prime-numbers-algorithms-and-computer-architectures.html","rel":"alternate"}},"published":"2017-03-07T00:15:00+01:00","updated":"2017-03-07T00:15:00+01:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2017-03-07:\/prime-numbers-algorithms-and-computer-architectures.html","summary":"<p>What does the principle of locality of reference have to do with prime numbers? This is what we will discover in this post. We will use the segmented version of the Sieve of Eratosthenes to see how hardware specifications can (read <em>should<\/em>) be used to fix design parameters for our routines.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#counting-primes\">Counting Primes<\/a><\/li>\n<li><a href=\"#segmented-sieve\">Segmented Sieve<\/a><\/li>\n<\/ul>\n<\/div>\n<p>A natural number <span class=\"math\">\\(p\\in\\mathbb N\\)<\/span> greater than 1 is said to be <em>prime<\/em> if its only divisors are 1 and <span class=\"math\">\\(p\\)<\/span> itself. Any natural number greater than 1 that does not have this property is called <em>composite<\/em>. The discovery that there are infinitely many prime numbers dates back to c. 300 BC and is due to Euclid. His argument by contradiction is very simple: suppose that there are only finitely many primes, say <span class=\"math\">\\(p_1,\\ldots,p_n\\)<\/span>. The natural number<\/p>\n<div class=\"math\">$$m=p_1p_2\\cdots p_n + 1$$<\/div>\n<p>is larger than each of these primes and, by construction, is not divisible by any of them, since dividing it by any <span class=\"math\">\\(p_k\\)<\/span> leaves a remainder of 1. Since every natural number greater than 1 has a prime divisor, <span class=\"math\">\\(m\\)<\/span> must itself be prime. However, being larger than any of the <span class=\"math\">\\(p_k\\)<\/span>s, <span class=\"math\">\\(m\\)<\/span> cannot be one of the finitely many primes, and so we reach a contradiction.<\/p>\n<p>Prime numbers play a fundamental role in <em>Number Theory<\/em>, a branch of Mathematics that deals with the properties of the natural numbers. 
Everybody gets to know about the prime factorisation of the natural numbers, a result so important that it has been given the name of <em>Fundamental Theorem of Arithmetic<\/em>.<\/p>\n<h1 id=\"counting-primes\">Counting Primes<\/h1>\n<p>Even though we saw that there are infinitely many prime numbers, one might still want to know how many prime numbers there are below a certain upper bound. As numbers become bigger, the help of a computer becomes crucial to tackle this problem and therefore it makes sense to think of algorithms that would get us to the answer efficiently.<\/p>\n<p>A simple and efficient way to find, and hence count, all the primes up to a given upper bound <span class=\"math\">\\(n\\)<\/span> is by means of an ancient algorithm known as the <em>Sieve of Eratosthenes<\/em>. The idea is to start with the sequence of all the numbers from 0 up to <span class=\"math\">\\(n\\)<\/span> and discard\/mark the composite numbers as they are discovered. By definition, 0 and 1 are not prime, so they are removed. The number 2 is prime, but all its multiples are not, so we proceed by removing all the multiples of 2. We keep the first number that remains after 2, which is 3, and proceed to remove its multiples (starting from its square, since smaller multiples have already been removed at the previous steps). The process is repeated with the next remaining number until the assigned bound <span class=\"math\">\\(n\\)<\/span> is reached. 
It is clear that it is enough to get up to at most <span class=\"math\">\\(\\lceil\\sqrt n\\rceil\\)<\/span>.<\/p>\n<p>The following is a simple implementation in C++.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;iostream&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;vector&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;cmath&gt;<\/span><span class=\"cp\"><\/span>\n<span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;cassert&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"k\">using<\/span><span class=\"w\"> <\/span><span class=\"k\">namespace<\/span><span class=\"w\"> <\/span><span class=\"nn\">std<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">class<\/span><span class=\"w\"> <\/span><span class=\"nc\">Sieve<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">private<\/span><span class=\"o\">:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">bool<\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">int<\/span><span class=\"o\">&gt;<\/span><span class=\"w\">  <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">public<\/span><span class=\"o\">:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span 
class=\"n\">Sieve<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"p\">(<\/span><span class=\"nb\">NULL<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">sieve<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"k\">new<\/span><span class=\"w\"> <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">bool<\/span><span class=\"o\">&gt;<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">true<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">false<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span class=\"p\">)[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">false<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span 
class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;=<\/span><span class=\"w\"> <\/span><span class=\"n\">ceil<\/span><span class=\"p\">(<\/span><span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">));<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">((<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span class=\"p\">)[<\/span><span class=\"n\">i<\/span><span class=\"p\">])<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">j<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">j<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">j<\/span><span class=\"w\"> <\/span><span class=\"o\">+=<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">          <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span 
class=\"p\">)[<\/span><span class=\"n\">j<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">false<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"o\">~<\/span><span class=\"n\">Sieve<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"w\"> <\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">int<\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">get_primes<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">primes<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"n\">primes<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"k\">new<\/span><span class=\"w\"> <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span 
class=\"kt\">int<\/span><span class=\"o\">&gt;<\/span><span class=\"p\">();<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">      <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">((<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span class=\"p\">)[<\/span><span class=\"n\">i<\/span><span class=\"p\">])<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">push_back<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span 
class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">());<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">primes<\/span><span class=\"p\">)[<\/span><span class=\"n\">i<\/span><span class=\"p\">];<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">primes<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"nb\">NULL<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">get_primes<\/span><span class=\"p\">();<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">primes<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">();<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"p\">};<\/span><span class=\"w\"><\/span>\n\n<span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"nf\">main<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  
<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">cin<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">Sieve<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">Sieve<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">cout<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;There are &quot;<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">.<\/span><span class=\"n\">count<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot; primes between 0 and &quot;<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">endl<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"cp\">#ifdef VERBOSE<\/span>\n<span class=\"w\">  <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span 
class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">.<\/span><span class=\"n\">count<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">cout<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">.<\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">endl<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"cp\">#endif<\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<blockquote>\n<p>A vector of booleans is typically implemented in C++ as a packed array of bits rather than one byte per element. Apart from enabling possible compiler optimisations, at the hardware level this more compact data structure is more cache-friendly. Here is a first link between a software implementation of a prime search and the computer architecture the code runs on.<\/p>\n<\/blockquote>\n<p>With an input of the order of <span class=\"math\">\\(10^6\\)<\/span> the sieve is still quite fast. However, the memory requirements are substantial: up to <span class=\"math\">\\(10^9\\)<\/span> we are still able to use integers, but the memory consumption is of the order of the GB. 
The amount of memory on the system can then pose a serious limitation to the input parameter.<\/p>\n<h1 id=\"segmented-sieve\">Segmented Sieve<\/h1>\n<p>If we want to list and\/or count all the primes between two given (and possibly quite large) integers, we need a <em>Segmented Sieve<\/em>. If we are interested in all the primes between <span class=\"math\">\\(a\\)<\/span> and <span class=\"math\">\\(b\\)<\/span> we could, in principle, use the sieve of Eratosthenes to find all the primes up to <span class=\"math\">\\(b\\)<\/span> and then list\/count all the primes larger than <span class=\"math\">\\(a\\)<\/span>. But with <span class=\"math\">\\(b\\)<\/span> of the order of, say, <span class=\"math\">\\(10^{15}\\)<\/span>, a lot of memory is required to hold the result. Instead we can split the interval <span class=\"math\">\\([a,b]\\)<\/span> into chunks and process them separately.<\/p>\n<p>The two main questions that we need to answer are: how do we adapt the sieve algorithm to start from <span class=\"math\">\\(a\\)<\/span> rather than 0, and how do we fix the chunk size? Let us deal with the latter question first. The reason why we need a segmented sieve in the first place is memory limitations. So an upper bound for the chunk size is given by the available memory. However, for large values of the inputs, the sieve might need to jump between memory locations that are further apart. But how do we quantify this \"further apart\"? The answer, again, is in the system architecture, which quite likely includes a hierarchy of cache memory. In order not to violate the locality principle we should choose a chunk size which is comparable to the cache size. 
Assuming this to be of the order of a megabyte, that is about <span class=\"math\">\\(8\\times 10^6\\)<\/span> bits, and recalling that <code>vector&lt;bool&gt;<\/code> is an array of bits, a possible chunk size is of the order of <span class=\"math\">\\(10^7\\)<\/span>.<\/p>\n<p>Coming to the question of how to implement a segmented sieve, all we need to do is mark\/remove all the composite numbers in the range. Of course we would need to start by removing all the even numbers, then all the multiples of 3, then of 5 and so on. Therefore we still need the knowledge of the primes starting from 2 and going upwards. But how far? Since our upper limit is <span class=\"math\">\\(b\\)<\/span>, we need all the prime numbers up to <span class=\"math\">\\(\\lceil\\sqrt b\\rceil\\)<\/span>, which can be obtained with the standard sieve discussed earlier. These prime numbers can then be used to discover all the primes in the range <span class=\"math\">\\([a,b]\\)<\/span>. We start by removing the first even number greater than or equal to <span class=\"math\">\\(a\\)<\/span>, together with all the numbers obtained by repeatedly adding 2 to it until we are out of bounds. More generally, to find the first multiple of the prime <span class=\"math\">\\(p\\)<\/span> in <span class=\"math\">\\([a,b]\\)<\/span> we use the formula<\/p>\n<div class=\"math\">$$s = \\left\\lceil\\frac ap\\right\\rceil\\cdot p$$<\/div>\n<p>However, recall that, for the standard sieve, we really have to start from <span class=\"math\">\\(p^2\\)<\/span>, since lower multiples of <span class=\"math\">\\(p\\)<\/span> have already been removed at the previous iterations. 
Therefore, as our starting point we pick the <em>maximum<\/em> between <span class=\"math\">\\(s\\)<\/span> and <span class=\"math\">\\(p^2\\)<\/span> (equivalently, we take the maximum between <span class=\"math\">\\(\\lceil a\/p\\rceil\\)<\/span> and <span class=\"math\">\\(p\\)<\/span>, and multiply it by <span class=\"math\">\\(p\\)<\/span>).<\/p>\n<p>The following is a simple implementation of the Segmented Sieve in C++.<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#define CHUNK 10000000 <\/span><span class=\"c1\">\/\/ 10^7<\/span>\n\n<span class=\"k\">class<\/span><span class=\"w\"> <\/span><span class=\"nc\">SSieve<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">private<\/span><span class=\"o\">:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">Sieve<\/span><span class=\"w\">        <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"p\">;<\/span><span class=\"w\">   <\/span><span class=\"c1\">\/\/ Standard sieve<\/span>\n<span class=\"w\">  <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">bool<\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">ssieve<\/span><span class=\"p\">;<\/span><span class=\"w\">  <\/span><span class=\"c1\">\/\/ Segmented Sieve<\/span>\n<span class=\"w\">  <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">int<\/span><span class=\"o\">&gt;<\/span><span class=\"w\">  <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">seg_c<\/span><span class=\"p\">;<\/span><span class=\"w\">   <\/span><span class=\"c1\">\/\/ Cumulative prime count after each segment<\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\">      <\/span><span class=\"n\">a<\/span><span class=\"p\">,<\/span><span class=\"w\"> 
<\/span><span class=\"n\">b<\/span><span class=\"p\">;<\/span><span class=\"w\">    <\/span><span class=\"c1\">\/\/ Bounds<\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">int<\/span><span class=\"w\">            <\/span><span class=\"n\">c<\/span><span class=\"p\">;<\/span><span class=\"w\">       <\/span><span class=\"c1\">\/\/ Cached primes count<\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">int<\/span><span class=\"w\">            <\/span><span class=\"n\">seg<\/span><span class=\"p\">;<\/span><span class=\"w\">     <\/span><span class=\"c1\">\/\/ Current segment<\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\">      <\/span><span class=\"n\">size<\/span><span class=\"p\">;<\/span><span class=\"w\">    <\/span><span class=\"c1\">\/\/ Total numbers in interval<\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">int<\/span><span class=\"w\">            <\/span><span class=\"n\">max_seg<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"c1\">\/\/ Total number of segments<\/span>\n\n<span class=\"w\">  <\/span><span class=\"kt\">void<\/span><span class=\"w\"> <\/span><span class=\"nf\">do_segment<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">seg<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"k\">return<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"c1\">\/\/ Do not regenerate the current segment<\/span>\n\n<span class=\"w\">    <\/span><span 
class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">max_seg<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">seg<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Determine segment bounds (64-bit, since a and b may exceed the int range)<\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">l<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">a<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">CHUNK<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">h<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">l<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"n\">min<\/span><span class=\"p\">((<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">CHUNK<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">CHUNK<\/span><span class=\"p\">)<\/span><span 
class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Allocate the new segmented sieve<\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">ssieve<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">ssieve<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">ssieve<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"k\">new<\/span><span class=\"w\"> <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">bool<\/span><span class=\"o\">&gt;<\/span><span class=\"p\">(<\/span><span class=\"n\">h<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"n\">l<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">true<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Remove composite numbers in segment<\/span>\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">sieve<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">get_primes<\/span><span class=\"p\">()))<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span 
class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">((<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">h<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"k\">break<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">max<\/span><span class=\"p\">(<\/span><span class=\"n\">l<\/span><span class=\"w\"> <\/span><span class=\"o\">\/<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">l<\/span><span class=\"w\"> <\/span><span class=\"o\">%<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"> <\/span><span class=\"o\">?<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">),<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">p<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"n\">l<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span 
class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">ssieve<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">+=<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">ssieve<\/span><span class=\"p\">)[<\/span><span class=\"n\">i<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"nb\">false<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">public<\/span><span class=\"o\">:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"n\">SSieve<\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">low<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">high<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"n\">c<\/span><span class=\"p\">(<\/span><span class=\"mi\">-1<\/span><span class=\"p\">),<\/span><span class=\"w\"> <\/span><span class=\"n\">ssieve<\/span><span class=\"p\">(<\/span><span class=\"nb\">NULL<\/span><span class=\"p\">),<\/span><span class=\"w\"> <\/span><span class=\"n\">seg<\/span><span class=\"p\">(<\/span><span class=\"mi\">-1<\/span><span class=\"p\">),<\/span><span class=\"w\"> <\/span><span 
class=\"n\">seg_c<\/span><span class=\"p\">(<\/span><span class=\"nb\">NULL<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">low<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;=<\/span><span class=\"w\"> <\/span><span class=\"n\">high<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">seg_c<\/span><span class=\"w\">   <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"k\">new<\/span><span class=\"w\"> <\/span><span class=\"n\">vector<\/span><span class=\"o\">&lt;<\/span><span class=\"kt\">int<\/span><span class=\"o\">&gt;<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">high<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"n\">c<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">return<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">low<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span 
class=\"w\"> <\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">low<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">2<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"n\">a<\/span><span class=\"w\">       <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">low<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">b<\/span><span class=\"w\">       <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">high<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">size<\/span><span class=\"w\">    <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">b<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"n\">a<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">sieve<\/span><span class=\"w\">   <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"k\">new<\/span><span class=\"w\"> <\/span><span class=\"n\">Sieve<\/span><span class=\"p\">(<\/span><span class=\"n\">ceil<\/span><span class=\"p\">(<\/span><span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">b<\/span><span class=\"p\">)));<\/span><span class=\"w\"> <\/span><span class=\"c1\">\/\/ The standard sieve<\/span>\n<span class=\"w\">    <\/span><span class=\"n\">max_seg<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">\/<\/span><span class=\"w\"> <\/span><span class=\"n\">CHUNK<\/span><span 
class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"w\"> <\/span><span class=\"o\">%<\/span><span class=\"w\"> <\/span><span class=\"n\">CHUNK<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"> <\/span><span class=\"o\">?<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"o\">~<\/span><span class=\"n\">SSieve<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">sieve<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">ssieve<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">delete<\/span><span class=\"w\"> <\/span><span class=\"n\">seg_c<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">c<\/span><span class=\"w\"> <\/span><span 
class=\"o\">==<\/span><span class=\"w\"> <\/span><span class=\"mi\">-1<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"n\">c<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">unsigned<\/span><span class=\"w\"> <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">max_seg<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"n\">do_segment<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">bool<\/span><span class=\"w\"> <\/span><span class=\"n\">p<\/span><span class=\"w\"> <\/span><span class=\"o\">:<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">ssieve<\/span><span class=\"p\">))<\/span><span class=\"w\"> <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">p<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">c<\/span><span 
class=\"o\">++<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">        <\/span><span class=\"c1\">\/\/ Keep track of the number of primes in segments<\/span>\n<span class=\"w\">        <\/span><span class=\"c1\">\/\/ This is used by SSieve::get to retrieve the primes<\/span>\n<span class=\"w\">        <\/span><span class=\"n\">seg_c<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">push_back<\/span><span class=\"p\">(<\/span><span class=\"n\">c<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">c<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\"> <\/span><span class=\"o\">&amp;&amp;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">count<\/span><span class=\"p\">());<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span 
class=\"n\">s<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">k<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Determine which segment the requested prime belongs to<\/span>\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">seg_c<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">s<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">((<\/span><span class=\"o\">*<\/span><span class=\"n\">seg_c<\/span><span class=\"p\">)[<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"k\">break<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Reconstruct the segmented sieve if 
necessary<\/span>\n<span class=\"w\">    <\/span><span class=\"n\">do_segment<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Translate into the actual prime<\/span>\n<span class=\"w\">    <\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">j<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"o\">*<\/span><span class=\"n\">seg_c<\/span><span class=\"p\">)[<\/span><span class=\"n\">s<\/span><span class=\"p\">];<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"n\">k<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span class=\"n\">k<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">ssieve<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">size<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">k<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">if<\/span><span class=\"w\"> <\/span><span class=\"p\">((<\/span><span class=\"o\">*<\/span><span class=\"n\">ssieve<\/span><span class=\"p\">)[<\/span><span class=\"n\">k<\/span><span class=\"p\">])<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"o\">++<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">      <\/span><span class=\"k\">if<\/span><span class=\"w\"> 
<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">j<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"k\">break<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">\/\/ Widen before multiplying to avoid int overflow on large intervals<\/span>\n<span class=\"w\">    <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">k<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"p\">)<\/span><span class=\"n\">s<\/span><span class=\"w\"> <\/span><span class=\"o\">*<\/span><span class=\"w\"> <\/span><span class=\"n\">CHUNK<\/span><span class=\"w\"> <\/span><span class=\"o\">+<\/span><span class=\"w\"> <\/span><span class=\"n\">a<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<span class=\"p\">};<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>This can be tested with a slightly modified <code>main<\/code> procedure, for example<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"nf\">main<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"kt\">long<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">m<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">cin<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">&gt;&gt;<\/span><span class=\"w\"> <\/span><span class=\"n\">m<\/span><span 
class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">SSieve<\/span><span class=\"w\"> <\/span><span class=\"n\">ss<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"n\">SSieve<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"n\">m<\/span><span class=\"p\">);<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"n\">cout<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot;There are &quot;<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">ss<\/span><span class=\"p\">.<\/span><span class=\"n\">count<\/span><span class=\"p\">()<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot; primes between &quot;<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">n<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"s\">&quot; and &quot;<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">m<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">endl<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">  <\/span><span class=\"cp\">#ifdef VERBOSE<\/span>\n<span class=\"w\">  <\/span><span class=\"k\">for<\/span><span class=\"w\"> <\/span><span class=\"p\">(<\/span><span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">=<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"> <\/span><span 
class=\"n\">i<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">ss<\/span><span class=\"p\">.<\/span><span class=\"n\">count<\/span><span class=\"p\">();<\/span><span class=\"w\"> <\/span><span class=\"n\">i<\/span><span class=\"o\">++<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"n\">cout<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">ss<\/span><span class=\"p\">.<\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">&lt;&lt;<\/span><span class=\"w\"> <\/span><span class=\"n\">endl<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"cp\">#endif<\/span>\n\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<script type=\"text\/javascript\">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {\n    var align = \"center\",\n        indent = \"0em\",\n        linebreak = \"false\";\n\n    if (false) {\n        align = (screen.width < 768) ? \"left\" : align;\n        indent = (screen.width < 768) ? \"0em\" : indent;\n        linebreak = (screen.width < 768) ? 'true' : linebreak;\n    }\n\n    var mathjaxscript = document.createElement('script');\n    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';\n    mathjaxscript.type = 'text\/javascript';\n    mathjaxscript.src = 'https:\/\/cdnjs.cloudflare.com\/ajax\/libs\/mathjax\/2.7.3\/latest.js?config=TeX-AMS-MML_HTMLorMML';\n\n    var configscript = document.createElement('script');\n    configscript.type = 'text\/x-mathjax-config';\n    configscript[(window.opera ? 
\"innerHTML\" : \"text\")] =\n        \"MathJax.Hub.Config({\" +\n        \"    config: ['MMLorHTML.js'],\" +\n        \"    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'none' } },\" +\n        \"    jax: ['input\/TeX','input\/MathML','output\/HTML-CSS'],\" +\n        \"    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js'],\" +\n        \"    displayAlign: '\"+ align +\"',\" +\n        \"    displayIndent: '\"+ indent +\"',\" +\n        \"    showMathMenu: true,\" +\n        \"    messageStyle: 'normal',\" +\n        \"    tex2jax: { \" +\n        \"        inlineMath: [ ['\\\\\\\\(','\\\\\\\\)'] ], \" +\n        \"        displayMath: [ ['$$','$$'] ],\" +\n        \"        processEscapes: true,\" +\n        \"        preview: 'TeX',\" +\n        \"    }, \" +\n        \"    'HTML-CSS': { \" +\n        \"        availableFonts: ['STIX', 'TeX'],\" +\n        \"        preferredFont: 'STIX',\" +\n        \"        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} },\" +\n        \"        linebreaks: { automatic: \"+ linebreak +\", width: '90% container' },\" +\n        \"    }, \" +\n        \"}); \" +\n        \"if ('default' !== 'default') {\" +\n            \"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n            \"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {\" +\n                \"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;\" +\n                \"VARIANT['normal'].fonts.unshift('MathJax_default');\" +\n                \"VARIANT['bold'].fonts.unshift('MathJax_default-bold');\" +\n                \"VARIANT['italic'].fonts.unshift('MathJax_default-italic');\" +\n                \"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');\" +\n            \"});\" +\n        \"}\";\n\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(configscript);\n    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);\n}\n<\/script>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"c++"}},{"@attributes":{"term":"number theory"}},{"@attributes":{"term":"optimisation"}}]},{"title":"Getting Started with x86-64 Assembly on Linux","link":{"@attributes":{"href":"https:\/\/p403n1x87.github.io\/getting-started-with-x86-64-assembly-on-linux.html","rel":"alternate"}},"published":"2016-08-10T15:48:37+01:00","updated":"2016-08-10T15:48:37+01:00","author":{"name":"Gabriele N. 
Tornetta"},"id":"tag:p403n1x87.github.io,2016-08-10:\/getting-started-with-x86-64-assembly-on-linux.html","summary":"<p>Do you have experience with x86 assembly and wonder what the fundamental architectural differences from the 64-bit Intel architecture are? Then this post might be what you are looking for. Here we'll see how to use the Netwide Assembler (NASM) to write a simple Hello World application in x86_64 assembly. Along the way, we will also have the chance to see how to use some standard tools to optimise the final executable by stripping out unnecessary debug symbols.<\/p>","content":"<div class=\"toc\"><span class=\"toctitle\">Table of contents:<\/span><ul>\n<li><a href=\"#overview\">Overview<\/a><\/li>\n<li><a href=\"#tools\">Tools<\/a><\/li>\n<li><a href=\"#hello-syscalls\">Hello Syscalls!<\/a><\/li>\n<li><a href=\"#hello-libc\">Hello libc!<\/a><\/li>\n<li><a href=\"#conclusions\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<p>In this post we will learn how to assemble and link a simple \"Hello World\"\napplication written in x86-64 assembly for the Linux operating system. If you\nhave experience with Intel IA-32 assembly and you want to quickly adjust\nto the x86-64 world then this post is for you. If you're trying to learn\nassembly language from scratch then I'm afraid this post is not for you. There\nare many great resources online on 32-bit assembly. One of my favourite\ndocuments is Paul Carter's PC Assembly Language, which I highly recommend if\nyou're taking your first steps in assembly language. If you then decide to\ncome back to this post, you should be able to read it with no problems, since\nthe tools that I will employ here are the same as those used in Carter's book.<\/p>\n<h1 id=\"overview\">Overview<\/h1>\n<p>This post is organised as follows. In the next section, I gather some details\nabout the tools that we will use to code, assemble, link and execute the\napplications. 
As already mentioned above, most of the tools are the same as\nthose used in Carter's book. Our assembler (and hence the syntax) will be NASM.\nI will make use of two linkers, <code>ld<\/code> and the one that comes with <code>gcc<\/code>, the GNU\nC Compiler, for reasons that will be explained later. The first x64 application\nthat we will code will give us the chance to get familiar with the new system\ncalls and how they differ from their 32-bit counterparts. With the second one we\nwill make use of the Standard C Library. Both examples will give us the chance\nto explore the x86-64 calling convention as set out in the <a href=\"http:\/\/www.x86-64.org\/documentation\/abi.pdf\">System V Application\nBinary Interface<\/a>.<\/p>\n<p>All the code shown in this post is also available from the <a href=\"https:\/\/github.com\/P403n1x87\/asm\/tree\/master\/hello64\">dedicated asm\nGitHub repository<\/a>.<\/p>\n<h1 id=\"tools\">Tools<\/h1>\n<p>The Netwide Assembler is arguably the most popular assembler for the Linux\noperating system and it is an open-source project. Its documentation is nicely\nwritten and explains all the features of the language and of the (dis)assembler.\nThis post will try to be as self-contained as possible, but whenever you\nfeel the need to explore something a bit more, the NASM documentation will\nprobably be the right place. To assemble a 64-bit application we will need to\nuse the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>nasm -f elf64 -o myapp.o myapp.asm\n<\/pre><\/div>\n\n\n<p>The flag <code>-f elf64<\/code> tells NASM that we want to create a 64-bit ELF object\nfile. 
The flag <code>-o myapp.o<\/code> tells the assembler that we want the output object\nfile to be <code>myapp.o<\/code> in the current directory, whereas <code>myapp.asm<\/code> specifies the\nname of the source file containing the NASM code to be assembled.<\/p>\n<p>When an application calls functions from shared libraries it is necessary to\n<em>link<\/em> our object file to them so that it knows where to find them. Even if we\nare not using any external libraries, we still need to invoke the linker in\norder to obtain a valid executable file. The typical usage of <code>ld<\/code> that we will\nencounter in this post is<\/p>\n<div class=\"highlight\"><pre><span><\/span>ld -o myapp myapp.o\n<\/pre><\/div>\n\n\n<p>This is enough to produce a valid executable when we are not linking our object\nfile <code>myapp.o<\/code> against any external shared library or any other object file.\nOccasionally, depending on your distribution, you will have to specify which\ninterpreter you want to use. This is a library which, for ELF executables, acts\nas a loader. It loads the application into memory, as well as the required linked\nshared libraries. On Ubuntu 16.04, the right 64-bit interpreter is at\n<code>\/lib64\/ld-linux-x86-64.so.2<\/code> and therefore my invocation of <code>ld<\/code> will look like<\/p>\n<div class=\"highlight\"><pre><span><\/span>ld -o myapp myapp.o -I\/lib64\/ld-linux-x86-64.so.2\n<\/pre><\/div>\n\n\n<p>Some external shared libraries are designed to work with C. It is then advisable\nto include a <code>main<\/code> function in the assembly source code since the Standard C\nLibrary will take care of some essential cleanup steps when the execution\nreturns from it. Cases where one might want to opt for this approach are when\nthe application works with file descriptors and\/or spawns child processes. 
We\nwill see an example of this situation in a future tutorial on assembly and Gtk+.\nFor the time being, we shall limit ourselves to seeing how to use the GNU C\nCompiler to link our object file with other object files (and in particular with\n<code>libc<\/code>). The typical usage of <code>gcc<\/code> will be something like<\/p>\n<div class=\"highlight\"><pre><span><\/span>gcc -o myapp myapp.o\n<\/pre><\/div>\n\n\n<p>which is very similar to the usage of <code>ld<\/code>.<\/p>\n<h1 id=\"hello-syscalls\">Hello Syscalls!<\/h1>\n<p>In this first example we will make use of the Linux system calls to print the\nstring <code>Hello World!<\/code> to the screen. Here is where we encounter a major\ndifference between the 32-bit and the 64-bit Linux world.<\/p>\n<p>But before we get to that, let's have a look at what is probably the most\nimportant difference between the 32-bit and the 64-bit architecture: the\nregisters. The number of general purpose registers (GPRs for short) has\ndoubled, and each now has a maximum size of ... well ... 64 bits. The old <code>EAX<\/code>,\n<code>EBX<\/code>, <code>ECX<\/code> etc... are now the low 32 bits of the larger <code>RAX<\/code>, <code>RBX<\/code>, <code>RCX<\/code>\netc... respectively, while the 8 new GPRs are named <code>R8<\/code> to <code>R15<\/code>. The prefix\n<code>R<\/code> stands for, surprise, surprise, <em>register<\/em>. This seems like a sensible\ndecision, since this is in line with many other CPU manufacturers. Further\ndetails can be found in <a href=\"http:\/\/www.nasm.us\/doc\/nasmdo11.html\">Chapter 11<\/a> of\nthe NASM documentation and in <a href=\"https:\/\/software.intel.com\/sites\/default\/files\/m\/d\/4\/1\/d\/8\/Introduction_to_x64_Assembly.pdf\">this Intel\nwhite paper<\/a>.<\/p>\n<p>Let's now move back to system calls. Unix systems and derivatives do not make\nuse of software interrupts, with the only exception of <code>INT 0x80<\/code>, which on\n32-bit systems is used to make system calls. 
A system call is a way to request a\nservice from the kernel of the operating system. Most C programmers don't need\nto worry about them, as the Standard C Library provides wrappers around them.\nThe x86_64 architecture introduced the dedicated instruction <code>syscall<\/code> in order\nto make system calls. You can still use interrupts to make system calls, but\n<code>syscall<\/code> will be faster as it does not access the interrupt descriptor table.<\/p>\n<p>The purpose of this section is to explore this new opcode with an example.\nWithout further ado, let's dive into some assembly code. The following is the\ncontent of my <code>hello64.asm<\/code> file.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span class=\"normal\">34<\/span>\n<span 
class=\"normal\">35<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"k\">global<\/span><span class=\"w\"> <\/span><span class=\"nv\">_start<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; CONSTANTS<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"no\">SYS_WRITE<\/span><span class=\"w\">   <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\"><\/span>\n<span class=\"no\">SYS_EXIT<\/span><span class=\"w\">    <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"mi\">60<\/span><span class=\"w\"><\/span>\n<span class=\"no\">STDOUT<\/span><span class=\"w\">      <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Initialised data goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">hello<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\">  <\/span><span class=\"s\">&quot;Hello World!&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"w\">      <\/span><span class=\"c1\">; char *<\/span><span class=\"w\"><\/span>\n<span class=\"no\">hello_len<\/span><span class=\"w\">       <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"o\">-<\/span><span class=\"nv\">hello<\/span><span class=\"w\">                 <\/span><span class=\"c1\">; size_t<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span 
class=\"c1\">; Code goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">_start:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"c1\">; syscall(SYS_WRITE, STDOUT, hello, hello_len);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">SYS_WRITE<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">STDOUT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdx<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">push<\/span><span class=\"w\">    <\/span><span class=\"nb\">rax<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">; syscall(SYS_EXIT, &lt;sys_write return value&gt; - hello_len);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">SYS_EXIT<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">pop<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">sub<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>hello64.asm<\/em><\/p>\n<p>Lines 1, 13, 20 and 22 are part of the skeleton of any NASM source code. With\nline 1 we export the symbol <code>_start<\/code>, which defines the entry point for the\napplication, i.e. the point where execution starts. The actual\nsymbol is declared on line 22, and line 24 will be the first one to be executed.<\/p>\n<p>In lines 6 to 8 we define some constants to increase the readability of the\ncode. The price to pay is that NASM will export these symbols as well, thus\nincreasing the size of the final executable file. I will discuss how to deal\nwith this later on in this post. For the time being, let's focus on the rest of\nthe code.<\/p>\n<p>Line 13 marks the beginning of the initialised data section. Here we define\nstrings and other initialised data. In this case we only need to define the\n<code>\"Hello World!\\n\"<\/code> string (<code>10<\/code> is the ASCII code for the newline character\n<code>\\n<\/code>) and label it <code>hello<\/code>. Line 15 defines a constant equal to the length of\nthe string, and this is accomplished by subtracting the address of the label\n<code>hello<\/code> from the current address, given by <code>$<\/code> in NASM syntax.<\/p>\n<p>The <code>.text<\/code> section, where the actual code resides, starts at line 20. Here we\ndeclare the <code>_start<\/code> symbol, i.e. the entry point of the application, followed\nby the code to be executed. 
In this simple example, all we need to do is print\nthe string to screen and then terminate the application. This means that we need\nto call the <code>sys_write<\/code> system call, followed by a call to <code>sys_exit<\/code>, perhaps\nwith an exit code that will tell us whether the call to <code>sys_write<\/code> has been\nsuccessful or not.<\/p>\n<p>Here is our first encounter with the new syscall opcode and the x86_64 calling\nconvention. There isn't much to say about syscall. It does what you would expect\nit to do, i.e. make a system call. The system call to make is specified by the\nvalue of the rax register, whereas the parameters are passed according to the\nalready mentioned x86_64 calling convention. It is recommended that you have a\nlook at the official documentation to fully grasp it, especially when it comes\nto complex calls. In a nutshell, some of the parameters are passed through\nregisters and the rest go to the stack.<\/p>\n<blockquote>\n<p>The order of the registers is: <code>rdi<\/code>, <code>rsi<\/code>, <code>rdx<\/code>, <code>r10<\/code>, <code>r8<\/code>, <code>r9<\/code>.<\/p>\n<\/blockquote>\n<p>We shall see in the next code example that, when we call a C function, we should\nuse <code>rcx<\/code> instead of <code>r10<\/code>. Indeed, the latter is only used for the Linux kernel\ninterface, while the former is used in all the other cases.<\/p>\n<p>On line 23 we have a comment that shows us the equivalent C code for a call to\n<code>sys_write<\/code>. Its \"signature\" is the following.<\/p>\n<div class=\"highlight\"><pre><span><\/span>1. 
SYS_WRITE\n\nParameters\n----------\n\nunsigned int    file descriptor\nconst char *    pointer to the string\nsize_t          number of bytes to write\n\n\nReturn value\n------------\nThe number of bytes of the string written to the file descriptor.\n<\/pre><\/div>\n\n\n<p>The number that appears in the top left corner is the code associated with the\nsystem call (compare this with line 6 above), and by convention this goes into\nthe <code>rax<\/code> register (see line 24). Since <code>sys_write<\/code> requires 3 integer\nparameters we only need the registers <code>rdi<\/code>, <code>rsi<\/code> and <code>rdx<\/code>, in this order.\nTherefore, the file descriptor, the standard output in this case, will go in\n<code>rdi<\/code>, the address of the first byte of the string will go in <code>rsi<\/code> while its\nlength will be loaded into <code>rdx<\/code> (lines 25 to 27).<\/p>\n<p>In order to make the actual system call we can now use the new opcode <code>syscall<\/code>.\nThe return value, namely the number of bytes written by <code>sys_write<\/code> in this\ncase, is returned in the <code>rax<\/code> register. With line 29 we save the return value\non the stack in order to use it as an exit code to be passed to <code>sys_exit<\/code>.<\/p>\n<p>Since the application has done everything that needed to be done, i.e. print a\nstring to standard output, we are ready to terminate the execution of the main\nprocess. This is achieved by making the exit system call, whose \"signature\" is\nthe following.<\/p>\n<div class=\"highlight\"><pre><span><\/span>60. SYS_EXIT\n\nParameters\n----------\n\nint     error code\n\n\nReturn value\n------------\nThis system call does not return.\n<\/pre><\/div>\n\n\n<p>With line 32 we load the code of <code>sys_exit<\/code> into the <code>rax<\/code> register in\npreparation for the system call. As the error code, we might want to return <code>0<\/code> if\n<code>sys_write<\/code> has done its job properly, i.e. 
if it has written the expected\nnumber of bytes, and something else otherwise. The simplest way to achieve this\nis by subtracting the string length from the return value of <code>sys_write<\/code>.\nRemember that we stored the latter on the stack, so it is now time to retrieve\nit. The first and only argument of <code>sys_exit<\/code> must go in <code>rdi<\/code>, so we might as\nwell pop the <code>sys_write<\/code> return value into it directly, and this is precisely\nwhat line 33 does. On line 34 we subtract the length of the string from <code>rdi<\/code>,\nso that if <code>sys_write<\/code> has written the expected number of bytes, <code>rdi<\/code> will\nnow be <code>0<\/code>. The last instruction on line 35 is the <code>syscall<\/code> opcode that will\nmake the system call and terminate the execution.<\/p>\n<p>All right, time now to assemble, link and execute the above code.<\/p>\n<div class=\"highlight\"><pre><span><\/span>nasm -f elf64 -o hello64.o hello64.asm\nld -o hello64 hello64.o -I\/lib64\/ld-linux-x86-64.so.2\n<\/pre><\/div>\n\n\n<p>This will assemble the source code of <code>hello64.asm<\/code> into the object file\n<code>hello64.o<\/code>, while the linker will finish off the job by linking the interpreter\nto the object file and producing the ELF64 executable. 
To run the application,\nsimply type<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/hello64\n<\/pre><\/div>\n\n\n<p>If you also want to display the exit code to make sure the executable is\nbehaving as expected you can use<\/p>\n<div class=\"highlight\"><pre><span><\/span>.\/hello64<span class=\"p\">;<\/span> <span class=\"nb\">echo<\/span> <span class=\"s2\">&quot;exit code:&quot;<\/span> <span class=\"nv\">$?<\/span>\n<\/pre><\/div>\n\n\n<p>and, on screen, you should now see<\/p>\n<div class=\"highlight\"><pre><span><\/span>Hello World!\nexit code: 0\n<\/pre><\/div>\n\n\n<p>Apart from the fun, another reason to write assembly code is that you can shrink\nthe size of the executable file. Let's check how big <code>hello64<\/code> is at this stage<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ wc -c &lt; hello64\n1048\n<\/pre><\/div>\n\n\n<p>A kilobyte seems a bit excessive for an assembly application that only prints a\nshort string on screen. The reason for such a bloated executable lies in the symbol\ntable created by NASM. This plays an important role inside our ELF file in case\nwe need to link it with other object files. 
You can see all the symbols stored\nin the ELF file with<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ objdump -t hello64\n\nhello64:     file format elf64-x86-64\n\nSYMBOL TABLE:\n00000000004000b0 l    d  .text 0000000000000000 .text\n00000000006000d8 l    d  .data 0000000000000000 .data\n0000000000000000 l    df *ABS* 0000000000000000 hello64.asm\n0000000000000001 l       *ABS* 0000000000000000 SYS_WRITE\n000000000000003c l       *ABS* 0000000000000000 SYS_EXIT\n0000000000000001 l       *ABS* 0000000000000000 STDOUT\n00000000006000d8 l       .data 0000000000000000 hello\n000000000000000d l       *ABS* 0000000000000000 hello_len\n00000000004000b0 g       .text 0000000000000000 _start\n00000000006000e5 g       .data 0000000000000000 __bss_start\n00000000006000e5 g       .data 0000000000000000 _edata\n00000000006000e8 g       .data 0000000000000000 _end\n<\/pre><\/div>\n\n\n<p>Assuming that we are not planning to link our simple Hello World example with\nany other object file, we can strip the symbol table off <code>hello64<\/code> with<\/p>\n<div class=\"highlight\"><pre><span><\/span>strip -s hello64\n<\/pre><\/div>\n\n\n<p>If we now check the file size again, this is what we get<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ wc -c &lt; hello64\n512\n<\/pre><\/div>\n\n\n<p>i.e. less than half the original size. Looking at the symbol table again, this\nis what we get now:<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ objdump -t hello64\n\nhello64:     file format elf64-x86-64\n\nSYMBOL TABLE:\nno symbols\n<\/pre><\/div>\n\n\n<p>Observe that we can obtain the same result by passing the <code>-s<\/code> switch to the linker we\ndecide to use, that is, either <code>ld<\/code> or <code>gcc<\/code>. 
Thus, for example,<\/p>\n<div class=\"highlight\"><pre><span><\/span>ld -s -o hello64 hello64.o\n<\/pre><\/div>\n\n\n<p>will produce an ELF executable that lacks the symbol table completely.<\/p>\n<p>The possibility of removing symbols from an ELF file gives us the chance of\ndefining the constants for the system calls once and for all. In my GitHub\nrepository you can find the file\n<a href=\"https:\/\/github.com\/P403n1x87\/asm\/blob\/master\/syscalls\/syscalls.inc\"><code>syscalls.inc<\/code><\/a>\nwhere I have defined all the system calls together with their associated ID, and\nthe \"signature\" of each on a comment line. With the help of this file, our\nsource code would look like this<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span>\n<span class=\"normal\">30<\/span>\n<span class=\"normal\">31<\/span>\n<span class=\"normal\">32<\/span>\n<span class=\"normal\">33<\/span>\n<span class=\"normal\">34<\/span>\n<span 
class=\"normal\">35<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"k\">global<\/span><span class=\"w\"> <\/span><span class=\"nv\">_start<\/span><span class=\"w\"><\/span>\n\n<span class=\"cp\">%include &quot;..\/syscalls.inc&quot;<\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; CONSTANTS<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"no\">STDOUT<\/span><span class=\"w\">      <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Initialised data goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">hello<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\">  <\/span><span class=\"s\">&quot;Hello World!&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"w\">      <\/span><span class=\"c1\">; char *<\/span><span class=\"w\"><\/span>\n<span class=\"no\">hello_len<\/span><span class=\"w\">       <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"o\">-<\/span><span class=\"nv\">hello<\/span><span class=\"w\">                 <\/span><span class=\"c1\">; size_t<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Code goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">_start:<\/span><span 
class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"c1\">; syscall(SYS_WRITE, STDOUT, hello, hello_len);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">SYS_WRITE<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">STDOUT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">lea<\/span><span class=\"w\">     <\/span><span class=\"nb\">rsi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"nv\">hello<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdx<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">push<\/span><span class=\"w\">    <\/span><span class=\"nb\">rax<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">; syscall(SYS_EXIT, &lt;sys_write return value&gt; - hello_len);<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">SYS_EXIT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">pop<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span 
class=\"nf\">sub<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>hello64_inc.asm<\/em><\/p>\n<p>Note the inclusion of the file <code>syscalls.inc<\/code> at line 3, assumed to be stored in\nthe parent folder of the one containing the assembly source code, and the only\nremaining constant, <code>STDOUT<\/code>, at line 8.<\/p>\n<p>If you do not need symbols in the final ELF file, you can just remove the symbol\ntable completely with the previous command. However, if you want to retain some,\nbut get rid of the ones associated with constants that are meaningful to just your\nsource code, you can add a <code>-N &lt;symbol name&gt;<\/code> switch (e.g. <code>strip -N STDOUT hello64<\/code>)\nto <code>strip<\/code> for each symbol you want dropped. 
To automate this when using\n<code>syscalls.inc<\/code>, one can execute the following (rather long) command<\/p>\n<div class=\"highlight\"><pre><span><\/span>strip <span class=\"sb\">`<\/span><span class=\"k\">while<\/span> <span class=\"nv\">IFS<\/span><span class=\"o\">=<\/span><span class=\"s1\">&#39;&#39;<\/span> <span class=\"nb\">read<\/span> -r line <span class=\"o\">||<\/span> <span class=\"o\">[[<\/span> -n <span class=\"s2\">&quot;<\/span><span class=\"nv\">$line<\/span><span class=\"s2\">&quot;<\/span> <span class=\"o\">]]<\/span><span class=\"p\">;<\/span> <span class=\"k\">do<\/span> <span class=\"nb\">read<\/span> s _ <span class=\"o\">&lt;&lt;&lt;<\/span> <span class=\"nv\">$line<\/span><span class=\"p\">;<\/span> <span class=\"nb\">echo<\/span> -n <span class=\"s2\">&quot;-N <\/span><span class=\"nv\">$s<\/span><span class=\"s2\"> &quot;<\/span><span class=\"p\">;<\/span> <span class=\"k\">done<\/span> &lt; &lt;<span class=\"o\">(<\/span>tail -n +5 ..\/syscalls.inc<span class=\"o\">)<\/span><span class=\"sb\">`<\/span> hello64\n<\/pre><\/div>\n\n\n<p>on the ELF executable.<\/p>\n<p>Finally, let's verify that all we really have is pure assembly code, i.e. that\nour application doesn't depend on external shared objects:<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ ldd hello64\n        not a dynamic executable\n<\/pre><\/div>\n\n\n<p>In this case, this output is telling us that <code>hello64<\/code> is not linked to any\nshared object files.<\/p>\n<h1 id=\"hello-libc\">Hello libc!<\/h1>\n<p>We shall now rewrite the above Hello World! example and let the Standard C\nLibrary take care of the output operation. That is, we won't deal with system\ncalls directly; we shall instead let a higher abstraction layer, the\nStandard C Library, do that for us. Furthermore, with this approach, we will\nalso delegate some basic clean-up involving, e.g., open file descriptors, child\nprocesses etc..., which we would have to deal with otherwise. 
For a simple\napplication like a Hello World! this last point is pretty much immaterial, but\nwe will see in another post on GUIs with Gtk+ 3 the importance of waiting for\nchild processes to terminate an application gracefully.<\/p>\n<p>So the code we want to write is the assembly analogue of the following C code<\/p>\n<div class=\"highlight\"><pre><span><\/span><span class=\"cp\">#include<\/span><span class=\"w\"> <\/span><span class=\"cpf\">&lt;stdio.h&gt;<\/span><span class=\"cp\"><\/span>\n\n<span class=\"kt\">int<\/span><span class=\"w\"> <\/span><span class=\"nf\">main<\/span><span class=\"p\">()<\/span><span class=\"w\"><\/span>\n<span class=\"p\">{<\/span><span class=\"w\"><\/span>\n<span class=\"w\">  <\/span><span class=\"k\">return<\/span><span class=\"w\"> <\/span><span class=\"n\">printf<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;Hello World!<\/span><span class=\"se\">\\n<\/span><span class=\"s\">&quot;<\/span><span class=\"p\">)<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">13<\/span><span class=\"p\">;<\/span><span class=\"w\"><\/span>\n<span class=\"p\">}<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n\n\n<p>Inside the <code>main<\/code> function, we call <code>printf<\/code> to print the string on screen and\nthen use its return value, decreased by the string length, as exit code. Thus,\nif <code>printf<\/code> writes all the bytes of our string, we get 0 as exit code, meaning\nthat the call has been successful.<\/p>\n<p>The didactic importance of this example resides in the use of the variadic\nfunction <code>printf<\/code>. The System V ABI specifies that, when calling a variadic\nfunction, the register <code>rax<\/code> should hold the number of XMM registers used for\nparameter passing. 
In this case, since we are just printing a string, we are not\npassing any other arguments apart from the location of the first character of\nthe string, and therefore we need to set <code>rax<\/code> to zero. With all these\nconsiderations, the assembly analogue of the above C code will look like this<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"k\">global<\/span><span class=\"w\"> <\/span><span class=\"nv\">main<\/span><span class=\"w\"><\/span>\n\n<span class=\"k\">extern<\/span><span class=\"w\"> <\/span><span class=\"nv\">printf<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Initialised data goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">hello<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\">  <\/span><span class=\"s\">&quot;Hello World!&quot;<\/span><span 
class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\">   <\/span><span class=\"c1\">; const char *<\/span><span class=\"w\"><\/span>\n<span class=\"no\">hello_len<\/span><span class=\"w\">       <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">           <\/span><span class=\"c1\">; size_t<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Code goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">; int main ()<\/span><span class=\"w\"><\/span>\n<span class=\"nl\">main:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"c1\">; return printf(hello) - hello_len;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">lea<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"nv\">hello<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">xor<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rax<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">call<\/span><span class=\"w\">    <\/span><span class=\"nv\">printf<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">sub<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span 
class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">ret<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>hello64_libc.asm<\/em><\/p>\n<p>On line 1 we export the main symbol, which will get called by the <code>libc<\/code>\nframework. On line 3 we instruct NASM that our application uses an external\nsymbol, i.e. the variadic function <code>printf<\/code>. The <code>.data<\/code> section, which starts at\nline 8, differs in two details: the string is now null-terminated, as <code>printf<\/code>\nexpects, and <code>hello_len<\/code> excludes the terminator. The code, however, is quite\ndifferent. On line 17 we declare the label <code>main<\/code>, which marks the entry point\nof the C main function. We need neither local variables nor access to the standard\narguments of <code>main<\/code>, namely <code>argc<\/code> and <code>argv<\/code>, so we do not create a local stack\nframe. Instead, we go straight to calling <code>printf<\/code>. We load the string address\nin the <code>rdi<\/code> register (line 19), set the <code>rax<\/code> register to zero (line 20), since\nwe are not passing any arguments in the XMM registers, and finally call the\n<code>printf<\/code> function. On line 22 we subtract the string length, <code>hello_len<\/code>,\nfrom the return value of <code>printf<\/code>, and on the last line we return the result,\nleft in <code>rax<\/code>, to the caller.<\/p>\n<p>Assuming the above code resides in the file <code>hello64_libc.asm<\/code>, we can assemble\nand link it with<\/p>\n<div class=\"highlight\"><pre><span><\/span>nasm -f elf64 -o hello64_libc.o hello64_libc.asm\ngcc -o hello64_libc hello64_libc.o\n<\/pre><\/div>\n\n\n<p>The ELF executable I get on my machine is 8696 bytes in size, and 6328 without\nthe symbol table. If we thought 1048 bytes was too much for a simple Hello World\napplication, the libc example is about 8 times bigger. Even without symbols, you can\nsee that we are spending roughly 6K by relying on the Standard C Library.<\/p>\n<p>A somewhat intermediate approach is to drop the main function and only use the\n<code>printf<\/code> function from <code>libc<\/code>. The 
The advantage is a reduced file size, since our\nexecutable only depends on the Standard C Library. However, as discussed above,\nwe lose an important clean-up process that can be convenient, if not necessary,\nat times.<\/p>\n<table class=\"highlighttable\"><tr><td class=\"linenos\"><div class=\"linenodiv\"><pre><span class=\"normal\"> 1<\/span>\n<span class=\"normal\"> 2<\/span>\n<span class=\"normal\"> 3<\/span>\n<span class=\"normal\"> 4<\/span>\n<span class=\"normal\"> 5<\/span>\n<span class=\"normal\"> 6<\/span>\n<span class=\"normal\"> 7<\/span>\n<span class=\"normal\"> 8<\/span>\n<span class=\"normal\"> 9<\/span>\n<span class=\"normal\">10<\/span>\n<span class=\"normal\">11<\/span>\n<span class=\"normal\">12<\/span>\n<span class=\"normal\">13<\/span>\n<span class=\"normal\">14<\/span>\n<span class=\"normal\">15<\/span>\n<span class=\"normal\">16<\/span>\n<span class=\"normal\">17<\/span>\n<span class=\"normal\">18<\/span>\n<span class=\"normal\">19<\/span>\n<span class=\"normal\">20<\/span>\n<span class=\"normal\">21<\/span>\n<span class=\"normal\">22<\/span>\n<span class=\"normal\">23<\/span>\n<span class=\"normal\">24<\/span>\n<span class=\"normal\">25<\/span>\n<span class=\"normal\">26<\/span>\n<span class=\"normal\">27<\/span>\n<span class=\"normal\">28<\/span>\n<span class=\"normal\">29<\/span><\/pre><\/div><\/td><td class=\"code\"><div class=\"highlight\"><pre><span><\/span><span class=\"k\">global<\/span><span class=\"w\"> <\/span><span class=\"nv\">_start<\/span><span class=\"w\"><\/span>\n\n<span class=\"cp\">%include &quot;..\/syscalls.inc&quot;<\/span>\n\n<span class=\"k\">extern<\/span><span class=\"w\"> <\/span><span class=\"nv\">printf<\/span><span class=\"w\"><\/span>\n\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Initialised data goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span 
class=\"nv\">.data<\/span><span class=\"w\"><\/span>\n<span class=\"nf\">hello<\/span><span class=\"w\">           <\/span><span class=\"nv\">db<\/span><span class=\"w\">  <\/span><span class=\"s\">&quot;Hello World!&quot;<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">10<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"mi\">0<\/span><span class=\"w\">   <\/span><span class=\"c1\">; const char *<\/span><span class=\"w\"><\/span>\n<span class=\"no\">hello_len<\/span><span class=\"w\">       <\/span><span class=\"kd\">equ<\/span><span class=\"w\"> <\/span><span class=\"kc\">$<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello<\/span><span class=\"w\"> <\/span><span class=\"o\">-<\/span><span class=\"w\"> <\/span><span class=\"mi\">1<\/span><span class=\"w\">           <\/span><span class=\"c1\">; size_t<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">; Code goes here<\/span><span class=\"w\"><\/span>\n<span class=\"c1\">;<\/span><span class=\"w\"><\/span>\n<span class=\"k\">SECTION<\/span><span class=\"w\"> <\/span><span class=\"nv\">.text<\/span><span class=\"w\"><\/span>\n\n<span class=\"nl\">_start:<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"c1\">; printf(hello) - hello_len;<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">lea<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"p\">[<\/span><span class=\"nv\">hello<\/span><span class=\"p\">]<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">xor<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nb\">rax<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    
<\/span><span class=\"nf\">call<\/span><span class=\"w\">    <\/span><span class=\"nv\">printf<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">sub<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">hello_len<\/span><span class=\"w\"><\/span>\n\n<span class=\"w\">    <\/span><span class=\"c1\">; syscall(SYS_EXIT, rax)<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">push<\/span><span class=\"w\">    <\/span><span class=\"nb\">rax<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">mov<\/span><span class=\"w\">     <\/span><span class=\"nb\">rax<\/span><span class=\"p\">,<\/span><span class=\"w\"> <\/span><span class=\"nv\">SYS_EXIT<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">pop<\/span><span class=\"w\">     <\/span><span class=\"nb\">rdi<\/span><span class=\"w\"><\/span>\n<span class=\"w\">    <\/span><span class=\"nf\">syscall<\/span><span class=\"w\"><\/span>\n<\/pre><\/div>\n<\/td><\/tr><\/table>\n\n<p><em>hello64_libc2.asm<\/em><\/p>\n<p>Note how, on lines 1 and 18, we removed the main function and reintroduced the\n<code>_start<\/code> symbol to tell the linker where the entry point is. Thus, execution of our\napplication now starts at line 20. Here we prepare to call the <code>printf<\/code>\nfunction from <code>libc<\/code> (lines 20 to 22), we compute the exit code (line 23) and we\nstore it on the stack. 
Now there is no Standard C Library framework to terminate\nthe execution for us, since we cannot return from the non-existent main\nfunction, and therefore we have to make a call to <code>SYS_EXIT<\/code> ourselves (lines 26\nto 29).<\/p>\n<p>Assuming this code resides in the file <code>hello64_libc2.asm<\/code>, we assemble and link\nwith the commands<\/p>\n<div class=\"highlight\"><pre><span><\/span>nasm -f elf64 -o hello64_libc2.o hello64_libc2.asm\nld -s -o hello64_libc2 hello64_libc2.o -lc -I\/lib64\/ld-linux-x86-64.so.2\n<\/pre><\/div>\n\n\n<p>Checking the file size, this is what I get on my machine now<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ wc -c &lt; hello64_libc2\n2056\n<\/pre><\/div>\n\n\n<p>i.e. about a third of the \"full\" <code>libc<\/code> example above. There is something we can\nstill do with <code>strip<\/code>, namely remove the sections that are not needed. After\nlinking with <code>ld<\/code>, the ELF I get has the following sections<\/p>\n<div class=\"highlight\"><pre><span><\/span>$ readelf -S hello64_libc2 | grep [.]\n  [ 1] .interp           PROGBITS         0000000000400158  00000158\n  [ 2] .hash             HASH             0000000000400178  00000178\n  [ 3] .dynsym           DYNSYM           0000000000400190  00000190\n  [ 4] .dynstr           STRTAB           00000000004001c0  000001c0\n  [ 5] .gnu.version      VERSYM           00000000004001de  000001de\n  [ 6] .gnu.version_r    VERNEED          00000000004001e8  000001e8\n  [ 7] .rela.plt         RELA             0000000000400208  00000208\n  [ 8] .plt              PROGBITS         0000000000400220  00000220\n  [ 9] .text             PROGBITS         0000000000400240  00000240\n  [10] .eh_frame         PROGBITS         0000000000400260  00000260\n  [11] .dynamic          DYNAMIC          0000000000600260  00000260\n  [12] .got.plt          PROGBITS         00000000006003a0  000003a0\n  [13] .data             PROGBITS         00000000006003c0  000003c0\n  [14] .shstrtab         STRTAB 
          0000000000000000  000003ce\n<\/pre><\/div>\n\n\n<p>By trial and error, I have discovered that I can get rid of <code>.hash<\/code>,\n<code>.gnu.version<\/code> and <code>.eh_frame<\/code> while still getting a valid ELF executable that\ndoes its job. To get rid of these sections one can use the command<\/p>\n<div class=\"highlight\"><pre><span><\/span>strip -R .hash -R .gnu.version -R .eh_frame hello64_libc2\n<\/pre><\/div>\n\n\n<p>which yields an executable of 1832 bytes.<\/p>\n<h1 id=\"conclusions\">Conclusions<\/h1>\n<p>With the above examples, we have seen that, if our real goal is to code a\nHello World application meant to run on an architecture with the x86_64\ninstruction set, assembly is the best shot we have. Chances are, if you are\ncoding an application, it is more complex than just printing a string on screen.\nEven pretending for a moment that you don't care about the portability of your\ncode, there are certainly some benefits to linking your application with gcc\nand letting the Standard C Library do some clean-up work for you. We will have\nthe chance to see this last point from a close-up perspective in a future post.\nSo take this post as a reference point to look back on when you\nneed to recall the basics of writing a 64-bit assembly application for the Linux\nOS.<\/p>","category":[{"@attributes":{"term":"Programming"}},{"@attributes":{"term":"assembly"}}]}]}