{"title":"Cryptech Project - FutureWork","link":[{"@attributes":{"href":"https:\/\/wiki.cryptech.is\/","rel":"alternate"}},{"@attributes":{"href":"https:\/\/wiki.cryptech.is\/feeds\/futurework.atom.xml","rel":"self"}}],"id":"https:\/\/wiki.cryptech.is\/","updated":"2017-07-27T19:02:00+00:00","entry":[{"title":"Secure Channel","link":{"@attributes":{"href":"https:\/\/wiki.cryptech.is\/SecureChannel","rel":"alternate"}},"published":"2017-07-27T00:24:00+00:00","updated":"2017-07-27T19:02:00+00:00","author":{"name":"Rob Austein"},"id":"tag:wiki.cryptech.is,2017-07-27:\/SecureChannel","summary":"<p>This is a sketch of a design for the secure channel that we want to\nhave between the Cryptech HSM and the client libraries which talk to\nit.  Work in progress, and not implemented yet because a few of the\npieces are still missing.<\/p>\n<h2>Design goals and constraints<\/h2>\n<p>Basic design \u2026<\/p>","content":"<p>This is a sketch of a design for the secure channel that we want to\nhave between the Cryptech HSM and the client libraries which talk to\nit.  Work in progress, and not implemented yet because a few of the\npieces are still missing.<\/p>\n<h2>Design goals and constraints<\/h2>\n<p>Basic design goals:<\/p>\n<ul>\n<li>\n<p>End-to-end between client library and HSM.<\/p>\n<\/li>\n<li>\n<p>Not require yet another presentation layer if we can avoid it (so,\n    reuse XDR if possible, unless we have some strong desire to switch\n    to something else).<\/p>\n<\/li>\n<li>\n<p>Provide end-to-end message integrity between client library and HSM.<\/p>\n<\/li>\n<li>\n<p>Provide end-to-end message confidentiality between client library\n    and HSM.  We only need this for a few operations, but between PINs\n    and private keys it would be simpler just to provide it all the\n    time than to be selective.<\/p>\n<\/li>\n<li>\n<p>Provide some form of mutual authentication between client library\n    and HSM.  
This is tricky, since it requires either configuration\n    (of the other party's authenticator) or leap-of-faith.\n    Leap-of-faith is probably good enough for most of what we really\n    care about (ensuring that we're talking to the same dog now as we\n    were earlier).<\/p>\n<p>Not 100% certain we need this at all, but if we're going to leave\nourselves wide open to monkey-in-the-middle attacks, there's not\nmuch point in having a secure channel at all.<\/p>\n<\/li>\n<li>\n<p>Use boring simple crypto that we already have (or almost have) and\n    which runs fast.<\/p>\n<\/li>\n<li>\n<p>Continue to support the multiplexer.  Taken together with end-to-end\n    message confidentiality, this may mean two layers of headers: an\n    outer set which the multiplexer is allowed to mutate, then an\n    inner set which is protected.  Better, though, would be if the\n    multiplexer could work just by reading the outer headers without\n    modifying anything.<\/p>\n<\/li>\n<li>\n<p>Simple enough that we can implement it easily in HSM, PKCS #11\n    library, and Python library.<\/p>\n<\/li>\n<\/ul>\n<h2>Why not TLS?<\/h2>\n<p>We could, of course, Just Use TLS.  Might end up doing that, if it\nturns out to be easier, but TLS is a complicated beast, with far more\noptions than we need, and doesn't provide all of what we want, so a\nfair amount of the effort would be, not wasted exactly, but a giant\nstep sideways.  
Absent sane alternatives, I'd just suck it up and do\nthis, with a greatly restricted ciphersuite, but I think we have a\nbetter option.<\/p>\n<h2>Design<\/h2>\n<p>Basic design lifted from \"Cryptography Engineering: Design Principles\nand Practical Applications\" (ISBN 978-0-470-47424-2,\nhttp:\/\/www.wiley.com\/WileyCDA\/WileyTitle\/productCd-0470474246.html),\ntweaked in places to fit tools we have readily available.<\/p>\n<p>Toolkit:<\/p>\n<ul>\n<li>AES<\/li>\n<li>SHA-2<\/li>\n<li>ECDH<\/li>\n<li>ECDSA<\/li>\n<li>XDR<\/li>\n<\/ul>\n<p>As in the book, there are two layers here: the basic secure channel,\nmoving encrypted-and-authenticated frames back and forth, and a higher\nlevel which handles setup, key agreement, and endpoint authentication.<\/p>\n<p>Chapter 7 outlines a simple lower layer using AES-CTR and\nHMAC-SHA-256.  I don't see any particular reason to change any of\nthis, AES-CTR is easy enough.  I suppose it might be worth looking\ninto AES-CCM and AES-GCM, but they're somewhat more complicated;\nsection 7.5 (\"Alternatives\") discusses these briefly, we also know\nsome of the authors.<\/p>\n<p>For key agreement we probably want to use ECDH.  We don't quite have\nthat yet, but in theory it's relatively minor work to generalize our\nexisting ECDSA code to cover that too, and, again in theory, it should\nbe possible to generalize our existing ECDSA fast base point multiplier\nVerilog cores into fast point multiplier cores (sic: limitation of the\ncurrent cores is that they only compute scalar times the base point,\nnot scalar times an arbitrary point, which is fine for ECDSA but\ndoesn't work for ECDH).<\/p>\n<p>For signature (mutual authentication) we probably want to use ECDSA,\nagain because we have it and it's fast.  
The more interesting question\nis the configuration vs leap-of-faith discussion, figuring out under\nwhich circumstances we really care about the peer's identity, and\nfiguring out how to store state.<\/p>\n<p>Chapter 14 (key negotiation) of the same book covers the rest of the\nprotocol, substituting ECDH and ECDSA for DH and RSA, respectively.\nAs noted in the text, we could use a shared secret key and a MAC\nfunction instead of public key based authentication.<\/p>\n<p>Alternatively, the Station-to-Station protocol described in 4.6.1 of\n\"Guide to Elliptic Curve Cryptography\" (ISBN 978-0-387-95273-4,\nhttps:\/\/link.springer.com\/book\/10.1007\/b97644) appears to do what\nwe want, straight out of the box.<\/p>\n<p>Interaction with multiplexer is slightly interesting.  The multiplexer\nreally only cares about one thing: being able to match responses from\nthe HSM to queries sent into the HSM, so that the multiplexer can send\nthe responses back to the right client.  At the moment, it does this\nby seizing control of the client_handle field in the RPC frame, which\nit can get away with doing because there's no end-to-end integrity\ncheck at all (yuck).  We could add an outer layer of headers for the\nmultiplexer, but would rather not.<\/p>\n<p>The obvious \"real\" identity for clients to use would be the public\nkeys (ECDSA in the above discussion) they use to authenticate to the\nHSM, or a hash (perhaps truncated) thereof.  That's good as far as it\ngoes, and may suffice if we can assume that clients always have unique\nkeys, but if client keys are something over which the client has any\ncontrol (which includes selecting where they're stored, which we may\nnot be able to avoid), we have to consider the possibility of multiple\nclients using the same key (yuck).  
So a candidate replacement for the\nclient_handle for multiplexer purposes would be some combination of a\npublic key hash and a process ID, both things the client could provide\nwithout the multiplexer needing to do anything.<\/p>\n<p>The one argument in favor of leaving control of this to the\nmultiplexer (rather than the endpoints) is that it would (sort of)\nprotect against one client trying to masquerade as another -- but\nthat's really just another reason why clients should have their own\nkeys to the extent possible.<\/p>\n<p>As a precaution, perhaps the multiplexer should check for duplicate\nidentifiers, then do, um, something? if it finds duplicates.  This\nkind of violates Steinbach's Guideline for Systems Programming (\"Never\ntest for an error condition you don't know how to handle\").  Obvious\nanswer is to break all connections old and new using the duplicate\nidentity, minor questions about how to reset from that, whether worth\ndoing at all, etc.  Maybe clients just shouldn't do that.<\/p>\n<h2>Open issues<\/h2>\n<ul>\n<li>\n<p>Does the resulting design pass examination by clueful people?<\/p>\n<\/li>\n<li>\n<p>Does this end up still being significantly simpler than TLS?<\/p>\n<\/li>\n<li>\n<p>The Cryptography Engineering protocols include a hack to work\n    around a length extension weakness in SHA-2 (see section 5.4.2).\n    Do we need this?  Would we be better off using SHA-3 instead?  The\n    book claims that SHA-3 was expected to fix this, but that was\n    before NIST pissed away their reputation by getting too cosy with\n    the NSA again.  
Over my head, ask somebody with more clue.<\/p>\n<\/li>\n<\/ul>","category":{"@attributes":{"term":"FutureWork"}}},{"title":"Development of a Cryptech ASIC Implementation","link":{"@attributes":{"href":"https:\/\/wiki.cryptech.is\/ASICImplementations","rel":"alternate"}},"published":"2016-12-15T22:44:00+00:00","updated":"2016-12-15T22:44:00+00:00","author":{"name":"Cryptech Core Team"},"id":"tag:wiki.cryptech.is,2016-12-15:\/ASICImplementations","summary":"<h2>Introduction<\/h2>\n<p>The aim of the Cryptech project is to develop an open, free, and\nauditable HSM.  The Cryptech HSM includes both SW and HW parts.  In at\nleast the first iteration of the Cryptech HSM, the HW parts are\nimplemented using FPGA devices.  However, the ability to implement the\nHW \u2026<\/p>","content":"<h2>Introduction<\/h2>\n<p>The aim of the Cryptech project is to develop an open, free, and\nauditable HSM.  The Cryptech HSM includes both SW and HW parts.  In at\nleast the first iteration of the Cryptech HSM, the HW parts are\nimplemented using FPGA devices.  However, the ability to implement the\nHW parts in a Cryptech ASIC device in a future iteration is anticipated\nin the design.  This text provides a short description of what the HW\npart of the Cryptech HSM contains, the design style used, and what would\nhave to change in order to implement the HW part in an ASIC.<\/p>\n<h2>General digital functions and internal memories<\/h2>\n<p>The Cryptech digital functionality cores, such as the SHA-256 core, are\nwritten in generic RTL (Register Transfer Level) Verilog code.  The code\nis written in a fairly conservative coding style and uses language\nfeatures from IEEE 1364-2001 (aka Verilog 2001).<\/p>\n<p>All RTL code is divided into modules that contain one process for register updates and reset (<em>reg_update<\/em>) and one or more combinational processes for datapath and support logic such as counters. 
Finally, if needed, each module has a separate process that implements the logic for the finite state machine that controls the behaviour of the module.<\/p>\n<p>Each core is divided into a core module, for example <em>sha256_core.v<\/em>, and a number of submodules the core instantiates. The core provides raw, wide ports (a 256 bit wide key for AES, for example) that are not suitable for use in a standalone system. Instead, each core comes with a top level wrapper, for example <em>sha256.v<\/em>. This top level wrapper contains all registers and logic needed to provide all functionality of the core via a simple 32-bit memory-like interface. If the core is going to be used as a tightly integrated submodule, the wrapper can be discarded. Similarly, if the core is going to be used in a bus system that uses a specific bus standard such as AMBA AHB, CoreConnect or WISHBONE, only the top level wrapper needs to be replaced or modified to match the desired bus standard.<\/p>\n<p>The RTL code does not explicitly instantiate any hard macros such as\nmemories, multipliers, etc.  Instead all such functions are left to the\nsynthesis tool to infer based on the code. All memories are placed in separate modules to allow easy modification of the design. In an ASIC setting any memories not automatically mapped will be replaced by instantiation of specific macros.<\/p>\n<p>Some of the memories in the designs have combinational read (i.e. the read\ndata is not held by an output register, which would add a one cycle read\nlatency). For some FPGA technologies these memories are not compatible with the available physical memories. The synthesis tools therefore implement these memories\nusing separate registers rather than selecting a memory instance.  
In an ASIC\nimplementation these memories would likely become real memory macros to allow for a faster and more compact implementation.<\/p>\n<h2>Interfaces<\/h2>\n<p>External interfaces such as GPIO, Ethernet GMII, UART, etc., will always\nrequire some modification for the Cryptech design to be implemented in a\ngiven technology, whether it is a specific FPGA type or an ASIC.  The\nimportant thing is that the Cryptech design does not use technology\nspecific macros to implement the interfaces.  But pin assignments,\ntiming, and electrical requirements will always require adjustment and\nwork.<\/p>\n<h2>Clocking and reset<\/h2>\n<p>The design style used in the Cryptech Verilog code currently follows the\nguidelines from the FPGA vendors Altera and Xilinx.  This means that we\nuse synchronous reset.  For an ASIC implementation this will also work,\neven though asynchronous reset is far more common in ASIC designs.  Changing\nto asynchronous reset is not a very big undertaking however, as the\nregister reset and update clocking are separated into easily\nidentifiable processes (<em>reg_update<\/em>) in the modules.<\/p>\n<p>Most if not all registers in the Cryptech Verilog code have a defined\nreset state.  Most registers also have a write enable signal that\ncontrols the update.  This corresponds well with the registers available\nin FPGA technologies from Altera and Xilinx and with the design strategy those vendors recommend. This is also in line with common\nand good design styles for ASICs, which allows for compact code and low\npower implementations. The design does not currently use any clock gating. In future revisions this might be added if power consumption needs to be reduced and the change does not add side channel issues.<\/p>\n<h2>External memories<\/h2>\n<p>The Cryptech hardware design will use external persistent memories for\nprotected key storage as well as external SRAM for protected master key\nstorage.  
In an ASIC implementation the master key memory would probably\nbe integrated to further enhance security.<\/p>\n<p>Just like other external interfaces (see above), the interfaces for the\nexternal memories do not use any explicitly instantiated hard macros in\nthe FPGAs.<\/p>\n<h2>Entropy sources<\/h2>\n<p>The current Cryptech design contains two separate physical entropy\nsources.<\/p>\n<p>1: An avalanche noise based entropy source placed outside the FPGA.  The\nentropy source signal is sampled by the FPGA using an edge (flank) detection\nmechanism.<\/p>\n<p>An ASIC implementation would be able to use the external entropy source just like the FPGA. Furthermore, depending on the process options, it might be\npossible to have an internal avalanche diode based on ESD structures commonly used in I\/O pin implementations. In a power management capable process, it might also be possible to use functionality available in step-up converters as an internal avalanche noise source.<\/p>\n<p>Note that integrating the avalanche noise source does not mean that an off-chip noise source is excluded. The Cryptech RNG is modular and having both an internal and an external avalanche noise source is quite possible.<\/p>\n<p>2: A ring oscillator based entropy source placed inside the FPGA. The ring oscillator used in the FPGA is based on carry chain feedback through adders. An ASIC implementation of this ring oscillator should work and produce noise with similar characteristics. 
However, the specific circuit will have to be characterized with explicit layout and qualified for the given process.<\/p>\n<h2>Toolchain<\/h2>\n<p>Cryptech currently uses Verilog simulators for functional verification and commercial FPGA tools for implementation, including timing analysis.<\/p>\n<p>An ASIC implementation will require several new tools, including tools for synthesis, place &amp; route and static timing analysis that are accepted as sign-off tools by the chip process vendor.<\/p>\n<h2>Conclusions<\/h2>\n<p>The HW designed for the first iteration of Cryptech is not specifically\ndesigned for FPGA implementation, but is in fact designed in a generic\nway to allow for easy implementation using different technologies such\nas ASICs.<\/p>\n<p>There are however parts of the design that will have to be updated or\nmodified in order to create a good ASIC implementation.  The Cryptech\nproject is confident that we know what those parts are and what they\nwould entail.<\/p>\n<p>Developing an ASIC will however require new tools which will incur costs.<\/p>","category":{"@attributes":{"term":"FutureWork"}}},{"title":"Issues of an Assured Tool-Chain","link":{"@attributes":{"href":"https:\/\/wiki.cryptech.is\/AssuredTooChain","rel":"alternate"}},"published":"2016-12-15T22:44:00+00:00","updated":"2016-12-15T22:44:00+00:00","author":{"name":"Cryptech Core Team"},"id":"tag:wiki.cryptech.is,2016-12-15:\/AssuredTooChain","summary":"<p>We do not have any assurance that our basic tools are not compromised.<\/p>\n<ul>\n<li>Compilers<\/li>\n<li>Operating Systems<\/li>\n<li>Hardware Platforms<\/li>\n<li>Verilog and Other Tools to Produce Chips<\/li>\n<\/ul>\n<p>At the base, is the compiler.  
The fear was first formally expressed in\nKen Thompson's 1984 Turing Award Lecture\n<a href=\"http:\/\/www.ece.cmu.edu\/~ganger\/712.fall02\/papers\/p761-thompson.pdf\">Reflections on Trusting Trust<\/a>.<\/p>\n<p>David A \u2026<\/p>","content":"<p>We do not have any assurance that our basic tools are not compromised.<\/p>\n<ul>\n<li>Compilers<\/li>\n<li>Operating Systems<\/li>\n<li>Hardware Platforms<\/li>\n<li>Verilog and Other Tools to Produce Chips<\/li>\n<\/ul>\n<p>At the base, is the compiler.  The fear was first formally expressed in\nKen Thompson's 1984 Turing Award Lecture\n<a href=\"http:\/\/www.ece.cmu.edu\/~ganger\/712.fall02\/papers\/p761-thompson.pdf\">Reflections on Trusting Trust<\/a>.<\/p>\n<p>David A. Wheeler's PhD thesis, <a href=\"http:\/\/www.dwheeler.com\/trusting-trust\/\">Fully Countering Trusting Trust through Diverse Double-Compiling<\/a>\noutlines how we might deal with the compiler trust conundrum.<\/p>","category":{"@attributes":{"term":"FutureWork"}}}]}