Low Level RPython

David Beazley

February 09, 2012

Transcript

  1. PyPy Overview
     • PyPy is Python implemented in Python
     • [Diagram: CPython = Python Program on Interpreter (ANSI C); PyPy = Python Program on Interpreter (Python)]
     • Take the C version of the interpreter and rewrite it as a Python program.
  2. rpython
     • PyPy is actually implemented in "rpython"
     • rpython is not an "interpreter", but a restricted subset of the Python language
     • [Diagram: Python, rpython]
     • It can run as valid Python code, but that's about the only similarity
  3. rpython
     • rpython is a completely different language
     • Python syntax, yes.
     • Must be compiled (like C, C++, etc.)
     • Static typing via type inference
     • Very different from anything you're used to
  4. A Simple Example
     • Example rpython code:

         # fib.py
         def fib(n):
             if n < 2:
                 return 1
             else:
                 return fib(n-1) + fib(n-2)

         # entry point. Like C main()
         def main(argv):
             print fib(int(argv[1]))
             return 0

         def target(*args):
             return main, None
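Because rpython code is also valid Python, the same functions can be exercised in an ordinary interpreter before translating them. A minimal sketch, assuming Python 3 (so `print` is called as a function, unlike the Python 2 print statement on the slide); the input value 10 is chosen only to keep the run short:

```python
# fib.py logic run untranslated, as ordinary Python.
# Assumption: Python 3 syntax; the slide's code uses the Python 2 print statement.

def fib(n):
    if n < 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

def main(argv):
    # argv[0] is the program name, as in C
    print(fib(int(argv[1])))
    return 0

def target(*args):
    # Returns the entry point, as on the slide
    return main, None

if __name__ == "__main__":
    entry_point, _ = target()
    entry_point(["fib", "10"])   # prints 89
```

Running the code this way exercises only the logic; it says nothing about whether the translator will accept the program.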
  5. Translation (Compilation)
     • rpython translation

         bash % pypy/translator/goal/translate.py fib.py
         [platform:msg] Setting platform to 'host' cc=None
         [translation:info] Translating target as defined by hello
         [platform:execute] gcc-4.0 -c -arch x86_64 -O3 -fomit-frame-pointer -mdynamic-no-pic /var/folders/- \
         ... lots of additional output ...

     • Creates and compiles a C program into an exe

         bash % ./fib-c 38
         63245986
         bash %
  6. Performance
     • It runs pretty fast

         CPython 2.7      95.4s
         pypy             17.0s
         rpython           2.6s
         ANSI C (-O2)      2.1s

     • Almost as fast as ANSI C
  7. R is for Restricted
     • rpython allows no dynamic typing

         def add(x,y):
             return x+y

         def main(argv):
             r1 = add(2,3)                # Ok
             r2 = add("Hello","World")    # Error
             return 0

     • Functions can only have one type signature
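The single-signature rule can be imitated at run time in plain Python. The decorator below is a toy, not how rpython's type inference works (rpython checks this statically during translation), and `single_signature` is a hypothetical name introduced here for illustration:

```python
import functools

def single_signature(func):
    """Toy runtime check: remember the argument types of the first call
    and reject any later call that uses different types."""
    seen = []  # holds at most one tuple of argument types

    @functools.wraps(func)
    def wrapper(*args):
        sig = tuple(type(a) for a in args)
        if not seen:
            seen.append(sig)          # first use fixes the signature
        elif seen[0] != sig:
            raise TypeError("%s already used with %r, got %r"
                            % (func.__name__, seen[0], sig))
        return func(*args)
    return wrapper

@single_signature
def add(x, y):
    return x + y

r1 = add(2, 3)                      # Ok: fixes the signature to (int, int)
try:
    r2 = add("Hello", "World")      # Error: (str, str) conflicts with (int, int)
except TypeError as e:
    r2 = None
    print("rejected:", e)
```

The difference that matters: this check fires only when the conflicting call actually executes, whereas the slide's error is reported at translation time.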
  8. R is for Restricted
     • Containers can only have a single type

         numbers = [1,2,3,4,5]            # Ok
         items = [1, "Hello", 3.5]        # Error
         names = {                        # Ok
             'dabeaz' : 'David Beazley',
             'gaynor' : 'Alex Gaynor',
         }
         record = {                       # Error
             'name' : 'ACME',
             'shares' : 100
         }

     • Think C, not Python.
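A rough way to spot such containers in ordinary Python is to collect the types of the stored values. This is only an approximation for illustration: the helper names are hypothetical, and rpython's actual rules for which types may be combined are not modeled here.

```python
def element_types(container):
    """Return the set of value types stored in a list or dict."""
    values = container.values() if isinstance(container, dict) else container
    return {type(v) for v in values}

def is_homogeneous(container):
    # More than one value type is what the slide marks as an error.
    # Empty containers are treated as fine in this sketch.
    return len(element_types(container)) <= 1

numbers = [1, 2, 3, 4, 5]
items = [1, "Hello", 3.5]
names = {'dabeaz': 'David Beazley', 'gaynor': 'Alex Gaynor'}
record = {'name': 'ACME', 'shares': 100}

for label, obj in [("numbers", numbers), ("items", items),
                   ("names", names), ("record", record)]:
    print(label, "Ok" if is_homogeneous(obj) else "Error")
```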
  9. R is for Restricted
     • Attributes can only be a single type

         class Pair(object):
             def __init__(self,x,y):
                 self.x = x
                 self.y = y

         a = Pair(2,3)                # OK (first use)
         b = Pair("Hello","World")    # Error

     • Again, think C
  10. Today's Topic
     • Going deeper into the generated C
     • Looking at the code
     • Studying efficiency
     • Accessing C libraries
     • ???
  11. Looking at the C Code
     • rpython generates C code and places it into a temporary directory

         ...
         [translation:info] usession directory: /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
         [translation:info] created: /Users/beazley/Desktop/PyPyResearch/fib-c
         [Timer] Timings:
         [Timer] annotate                       --- 2.2 s
         [Timer] rtype_lltype                   --- 1.8 s
         [Timer] backendopt_lltype              --- 1.2 s
         [Timer] stackcheckinsertion_lltype     --- 0.0 s
         [Timer] database_c                     --- 16.7 s
         [Timer] source_c                       --- 2.8 s
         [Timer] compile_c                      --- 2.2 s
         [Timer] =========================================
         [Timer] Total:                         --- 26.8 s
         bash %
  12. Looking at the C Code
     • Go look for the "testing_1" directory

         bash % cd /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
         bash % cd testing_1
         bash % ls *.c
         data_objspace_flow_specialcase.c
         data_rlib_rdtoa.c
         data_rlib_rposix.c
         data_rlib_rstack.c
         data_rlib_rstack_1.c
         data_rpython_lltypesystem_rffi.c
         data_rpython_lltypesystem_rlist.c
         data_rpython_memory_gc_env.c
         data_rpython_memory_gc_minimark.c
         data_rpython_memory_gc_minimark_1.c
         data_rpython_memory_gctransform_framework.c
         debug_print.c
         implement.c
         nonfuncnodes.c
         objspace_flow_specialcase.c
         profiling.c
         ...
  13. Essential Files
     • Here's where most of the generated code from your program gets placed
     • implement.c (functions)
     • nonfuncnodes.c (globals)
     • structdef.h (data structures)
     • Look at them if you dare... yes.
  14. Example Code

         long pypy_g_fib(long l_n_0) {
             bool_t l_v280; bool_t l_v283; bool_t l_v286; bool_t l_v289;
             long l_v278; long l_v279; long l_v284; long l_v287; long l_v290;
             ...
             goto block0;

         block0:
             OP_INT_LT(l_n_0, 2L, l_v280);
             if (l_v280) {
                 l_v294 = 1L;
                 goto block5;
             }
             goto block1;

         block1:
             pypy_g_stack_check___();
             l_v282 = (&pypy_g_ExcData)->ed_exc_type;
             l_v283 = (l_v282 == NULL);
             if (!l_v283) {
                 goto block8;
             }
             goto block2;

         block2:
             OP_INT_SUB(l_n_0, 1L, l_v284);
             l_v278 = pypy_g_fib(l_v284);
             PYPY_INHIBIT_TAIL_CALL();
             l_v285 = (&pypy_g_ExcData)->ed_exc_type;
  15. Example Code
     • Same listing as slide 14, continuing one line further:

         l_v286 = (l_v285 == NULL);

     • It's a literal translation of flow-graphs into C (rather cryptic)
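To read the generated C, it can help to see the same control flow written as explicit blocks in Python. The sketch below hand-translates `fib` into a block-dispatch loop. The block names mirror the labels visible in the listing, but everything beyond what the slide shows is an assumption: `block5` is taken to be the common return block, and the stack check (`block1`) and the exception checks are left out.

```python
def fib_blocks(n):
    """fib() written as labelled blocks with explicit jumps (a sketch)."""
    block = "block0"
    result = 0
    while True:
        if block == "block0":
            # OP_INT_LT(l_n_0, 2L, l_v280)
            if n < 2:
                result = 1            # l_v294 = 1L
                block = "block5"
            else:
                block = "block2"      # the real code passes through block1 first
        elif block == "block2":
            # OP_INT_SUB, then the recursive calls and the add
            a = fib_blocks(n - 1)
            b = fib_blocks(n - 2)
            result = a + b
            block = "block5"
        elif block == "block5":
            # assumed: the shared return block
            return result

print(fib_blocks(10))   # prints 89
```

Each `goto` in the C becomes an assignment to `block`; the structured `if`/`else` of the original source has been flattened away, which is what makes the generated code hard to read.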
  16. Experimental Coding
     • You can try different things in rpython and go look at the output C code
     • A bit of a challenge
     • But interesting to study what happens
  17. Example: Objects
     • rpython:

         class Stock(object):
             def __init__(self,name,shares,price):
                 self.name = name
                 self.shares = shares
                 self.price = price

         s = Stock('ACME',50,123.45)

     • Generated C:

         // structdef.h
         ...
         struct pypy_cls_Stock0 {
             struct pypy_object0 s_super;
             struct pypy_rpy_string0 *s_inst_name;
             double s_inst_price;
             long s_inst_shares;
         };
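The layout of the generated struct can be mimicked with ctypes to get a feel for it. A rough sketch only: the `s_super` header is not reproduced and a plain `c_char_p` stands in for `struct pypy_rpy_string0 *`, so this is not binary-compatible with the generated code.

```python
import ctypes

class Stock(ctypes.Structure):
    # Field order follows the generated struct: name, price, shares
    _fields_ = [
        ("name", ctypes.c_char_p),    # stand-in for struct pypy_rpy_string0 *
        ("price", ctypes.c_double),   # double s_inst_price
        ("shares", ctypes.c_long),    # long s_inst_shares
    ]

# Positional arguments are assigned in field order
s = Stock(b"ACME", 123.45, 50)
print(s.name, s.price, s.shares)
print("size in bytes:", ctypes.sizeof(Stock))
```

Each attribute gets exactly one fixed C type, which is why the slides insist that an attribute can only ever hold a single type.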
  18. Accessing C Code
     • If everything in PyPy is written in rpython, how does it access low-level C libraries?
     • os modules
     • time functions
     • math functions
     • General question: How would you access C code from any Python program?
  19. ctypes (CPython)
     • Perhaps you've used ctypes before...

         import ctypes
         mlib = ctypes.cdll.LoadLibrary("libm.dylib")
         sin = mlib.sin
         sin.argtypes = (ctypes.c_double,)
         sin.restype = ctypes.c_double
         ...
         x = sin(2)

     • rpython is kind of similar
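The slide hardcodes `libm.dylib`, which is the macOS name of the math library. A slightly more portable variant of the same ctypes example is sketched below; it assumes a POSIX system, and the fallback to the symbols already loaded in the process is an assumption that may not hold everywhere:

```python
import ctypes
import ctypes.util
import math

# find_library("m") locates the C math library where one exists;
# otherwise fall back to the symbols already loaded into the process (POSIX only).
path = ctypes.util.find_library("m")
mlib = ctypes.CDLL(path) if path else ctypes.CDLL(None)

sin = mlib.sin
sin.argtypes = (ctypes.c_double,)   # declare the C argument types
sin.restype = ctypes.c_double       # declare the C return type

x = sin(2)                          # the int 2 is converted to a C double
print(x, math.sin(2))
```

Without the `argtypes` and `restype` declarations, ctypes would treat the result as a C `int`, so the type declarations are not optional here.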
  20. rpython rffi
     • Foreign Function Interface

         from pypy.rpython.lltypesystem import rffi

         sin = rffi.llexternal("sin", [rffi.DOUBLE], rffi.DOUBLE)
         cos = rffi.llexternal("cos", [rffi.DOUBLE], rffi.DOUBLE)
         ...

     • Declares external C functions with types
     • Can use in your rpython program

         y = sin(x) + cos(x)
         ...

     • Instructive to look at low-level C code
  21. rffi Commentary
     • The rpython rffi is highly developed
     • Most C primitive datatypes
     • Arrays
     • Structures
     • Pointers
     • Memory management
     • (More advanced example shortly)
  22. Configuration System
     • There is a C compilation/configuration system

         from pypy.translator.tool.cbuild import ExternalCompilationInfo
         from pypy.rpython.tool import rffi_platform as platform

         class CConfig:
             _compilation_info_ = ExternalCompilationInfo(
                 includes = ['math.h'],
                 libraries = ['m'],
             )
             M_PI = platform.DefinedConstantDouble('M_PI')
             M_E = platform.DefinedConstantDouble('M_E')

         config = platform.configure(CConfig)
         M_PI = config['M_PI']
         M_E = config['M_E']
  23. Configuration System
     • Same code as slide 22, with the annotation: C compiler specification (includes, libraries, paths, etc.)
  24. Configuration System
     • Same code as slide 22, with the annotation: Some "queries" for things you want to know from C
  25. Configuration System
     • Same code as slide 22, with the annotation: Run the C compiler and get results back
  26. Configuration Comments
     • There is no centralized "configuration"
     • Individual program modules simply request information from the C compilation environment whenever they need it
     • Somehow (magically), the system will invoke the C compiler as needed.
     • It hurts my head...
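The general idea of asking the C compiler for a value can be sketched in a few lines of ordinary Python: write a tiny C program that prints the constant, compile it, run it, and parse the output. This is not how `rffi_platform` is implemented; `query_double_constant` is a hypothetical helper, and the sketch assumes a POSIX system with a `cc`, `gcc`, or `clang` on the path (it returns `None` otherwise).

```python
import os
import shutil
import subprocess
import tempfile

def query_double_constant(name, includes):
    """Ask the C compiler for the value of a double constant or macro.
    Returns a float, or None if no compiler is available or the build fails."""
    cc = shutil.which("cc") or shutil.which("gcc") or shutil.which("clang")
    if cc is None:
        return None
    lines = ["#include <stdio.h>"]
    lines += ["#include <%s>" % h for h in includes]
    lines += ["int main(void) {",
              '    printf("%%.17g\\n", (double)(%s));' % name,
              "    return 0;",
              "}"]
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "query.c")
        exe = os.path.join(tmp, "query")
        with open(src, "w") as f:
            f.write("\n".join(lines) + "\n")
        try:
            # Compile the probe program, then run it and capture its output
            build = subprocess.run([cc, src, "-o", exe], capture_output=True)
            if build.returncode != 0:
                return None
            run = subprocess.run([exe], capture_output=True, text=True)
        except OSError:
            return None
        if run.returncode != 0:
            return None
        return float(run.stdout.strip())

M_PI = query_double_constant("M_PI", ["math.h"])
M_E = query_double_constant("M_E", ["math.h"])
print(M_PI, M_E)
```

Because any module can call such a helper on its own, there is no need for a central configuration step, which matches the "no centralized configuration" point above.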
  27. Now What?!?
     • We haven't even talked about PyPy yet!
     • .... or the JIT
     • Implemented in rpython
     • So, all of this is just a starting point
     • More talks? (maybe)