The Jabberwolk has moved

April 29, 2013

I’ve moved my blog to tommd.github.io which currently includes my exploration of image processing as well as a run-down of commsec-keyexchange.

For those of you who are as lazy as me about fixing and reinstalling broken GHC packages, I’ve ripped off the well-known and loved ghc-pkg-clean script and made it recursive.  No promises of termination:
 
function ghc-pkg-supercleaner() {
    b="0"
    for p in `ghc-pkg check $* 2>&1  | grep problems | awk '{print $6}' | sed -e 's/:$//'`
    do
        echo unregistering $p; ghc-pkg $* unregister $p;
        cabal install $p; b="1"
    done
    if [ $b == "1" ];
      then ghc-pkg-supercleaner
    fi
}

HacPDX II is set for July 22-24, 2011 (Friday – Sunday)! Be sure to register or else, exactly like HacPDX 1, you might not get network access.

I hope to spend most of my time working on Hackage (see here and here) with other interested Haskellers both locally and via IRC (#hacpdx on freenode). Still, I might sqeeze in some time for my seemingly never-complete side projects… we’ll see!

Hope to see you there!

Announcing a new version of crypto-api and the first release of both
DRBG and MonadCryptoRandom. Links can be found at the end of the e-mail
along with acknowledgments.

Crypto-API

Crypto-api, a generic interface for cryptographic algorithms, has hit
version 0.2. Notable additions include:

If you use cryptographic operations then crypto-api is here to give you
an easy implementation and algorithm independent interface.

  • The “OS-independent” random source (System.Crypto.Random) now works on
    Windows. (thanks Stuart Dootson)

  • No more MonoLocalBinds or ScopedTypeVariables, GHC < 6.12 should now
    work.

  • No more “instance Monad Either” what-so-ever. It got in the way other
    other instances more than being helpful.

  • The order of arguments are swapped in CryptoRandomGen so it fits
    standard convention.

  • Build deps updated so GHC-7 works (Paulo Tanimoto)

If you maintain block ciphers, hashes, asymmetric ciphers, stream
ciphers, or random number generators then I encourage you to make an
instance for the matching Crypto-API class. Also, make use of the
testing infrastructure provided if possible. AES, SHA*, and Twofish
have lots of known-answer tests and a simple module for parsing NIST
test files is also included.

MonadCryptoRandom

Much like MonadRandom, this has a class for random value generation
(CRandom), a class for monadic operation (MonadCryptoRandom), a
transformer (CRandT), and instances to make all this useful. The main
difference between MonadCryptoRandom from MonadRandom is the use of
CryptoRandomGen instead of RandomGen and the possibility for failure by
way of MonadError.

DRBG

DRBG, a cryptographically secure pseudo random number generator based on
the NIST SP 800-90 standard, is now on hackage. This uses the
crypto-api CryptoMonadRandom class and cryptohash >= 0.6.1.

pureMD5

Incidently, pureMD5’s deps have been bumped to use the new crypto-api.

Future Work

Once we have a cryptocipher like package with BlockCipher instance for
AES, I imagine DRBG will include cipher-based bit generators. Aside
from that, more testing is certainly an order.

Disclaimer

Just so there’s no confusion about my part in all this.

  1. Crypto-API: I started discussion and hacked up the code for this interface (and that’s all it is, an interface for algorithm producers and algorithm consumers to meet in the middle)
  2. MonadCryptoRandom: This is just a rip-off of the motivating idea behind MonadRandom but it allows failure. I tried to match even the names as closely as was sensible (CRand instead of Rand, etc).
  3. DRBG: Its a translation of a NIST spec. I never thought I’d do translator work, but here I am doing English to Haskell transcription.

Links

Crypto-API
Hackage: http://hackage.haskell.org/package/crypto-api
Home: http://trac.haskell.org/crypto-api/wiki

DRBG
Hackage: http://hackage.haskell.org/package/DRBG
Home: None, will share home and bug tracker with crypto-api

MonadCryptoRandom
Hackage: http://hackage.haskell.org/package/monadcryptorandom
Home: None, will share home and bug tracker with crypto-api

Thanks go to Vincent Hanquez for his cryptohash package including
crypto-api instances, Stuart Dootson for getting the Windows code in
System.Crypto.Random working, Paulo Tanimoto for pointing out/patching
an issue with GHC-7

Ben Boeckel – I haven’t forgot about your patch but it seems
cabal/hackage rejects such a .cabal file. I’ll have to investigate or
talk to Duncan.

GPS, a toy library for basic Latitude/Longitude based calculations, has just received a small face lift. I now leverage the GPX library and its classes/data types of Lat, Lon, and Time so GPS no longer needs to declare Latitude, Longitude, Coordinate, or Location.

In addition to the old addVector, restLocations, heading and divideArea functions GPS now also includes a convexHull function. Here’s how to get a convex hull from a gpx trace produced by a logger:

import Text.XML.HXT.Arrow
import Data.GPS
import Data.Geo.GPX
import System.Environment
import Control.Monad

main = do
        fs <- getArgs
        g >> xpickleDocument xpickle [(a_indent, v_1)] "convexHull.gpx")
        return ()

And comparing the traces this looks right.

Crypto-API Released

September 7, 2010

Crypto-API (hackage, haddock) 0.0.0.1 is now on Hackage.

Crypto-API is a generic interface for cryptographic operations, platform independent quality Entropy, property tests and known-answer tests (KATs) for common algorithms, and a basic benchmark infrastructure. Maintainers of hash and cipher implementations are encouraged to add instances for the classes defined in Crypto.Classes. Crypto users are similarly encouraged to use the interfaces defined in the Classes module.

Previous blogs on crypto-api have discussed its design and the RNG interface. These were to aid design discussion, so note the code there won’t work without minor changes.

Example: Hashes

An example class instance:

 instance Hash MD5Context MD5Digest where
        outputLength = Tagged 128
        blockLength  = Tagged 512
        initialCtx   = md5InitialContext
        updateCtx    = md5Update
        finalize     = md5Finalize

The hash user can remain agnostic about which type of hash is used:

 authMessage :: Hash ctx dgst => B.ByteString -> MacKey -> dgst -> Bool
 authMessage msg k = (==) (hmac' k msg)

 hashFile :: Hash c d => FilePath -> IO d
 hashFile = liftM hash L.readFile

Example: Block Cipher

Users of block cipher instances probably want to use Crypto.Modes:

 import Crypto.Classes
 import Crypto.Modes (cbc)
 import Data.Serialize (encode)

 cipherMsgAppendIV :: (BlockCipher k) => k -> B.ByteString -> IO B.ByteString,
 cipherMsgAppendIV msg = do
     iv <- getIVIO
     return $ B.append (encode iv) (cbc k iv msg)

Example RNG

Its easy to get a DRBG (aka PRNG) that can be used for generating seed material for keys, building asymmetric keys, obtaining initialization vectors, nonces, or many other uses. See Crypto.Random (which users System.Crypto.Random for entropy):

newGenIO :: CryptoRandomGen g => IO g
genBytes :: (CryptoRandomGen g) => g -> ByteLength -> Either GenError (ByteString, g)
getIV :: (CryptoRandomGen g, BlockCipher k) => g -> Either GenError (IV k, g)
buildKeyPair :: CryptoRandomGen g => g -> BitLength -> Maybe ((p, p), g)

Tests

A quick peek in the Test.Crypto module will show you that testing is decent (particularly for AES) and getting better all the time.

Given a BlockCipher instance the entire test code for an AES implementation is:

-- Omitting hack-job instances for SimpleAES in this snippet
 main = do
         ts <- makeAESTests (AESKey $ B.replicate 16 0)
         runTests ts

This automatically reads in hundreds of NIST Known Answer Tests (KATs) and checks the implementation. A lesser infrastructure exists for testing Hashes. Cipher property tests are still needed.

Example: Benchmarking

As with tests, benchmarking is quite simple:

 import Data.Digest.Pure.MD5
 import Benchmark.Crypto
 import Criterion.Main

 main = defaultMain [benchmarkHash (undefined :: MD5Digest) "pureMD5"]

Closing

So please, if you maintain a hash, cipher, or other cryptographic primitive please add instances for the crypto-api classes. If you need these primitives then consider using the crypto-api interfaces, allowing you to remain algorithm and implementation agnostic in all your low level code.

RandomGen – The Old Solution

Mathematicians talk about random bits and many programmers talk about streams of random bytes (ex: /dev/urandom, block cipher counter RNGs), so its a bit odd that Haskell adopted the RandomGen class, which only generates random Ints. Several aspects of RandomGen that are non-ideal include:

  • Only generates Ints (Ints need to be coerced to obtain other types)
  • By virtue of packaging it is often paired with StdGen, a sub-par generator
  • Mandates a ‘split’ operation, which is non-sense or unsafe for some generators (as BOS pointed out in a comment on my last post)
  • Doesn’t allow for generator failure (too much output without a reseed) – this is important for cryptographically secure RNGs
  • Doesn’t allow any method for additional entropy to be included upon request for new data (used at least in NIST SP 800-90 and there are obvious default implementations for all other generators)

Building Something Better

For these reasons I have been convinced that building the new crypto-api package on RandomGen would be a mistake. I’ve thus expanded the scope of crypto-api to include a decent RandomGenerator class. The proposal below is slightly more complex than the old RandomGen, but I consider it more honest (doesn’t hide error conditions / necessitate exceptions).

class RandomGenerator g where
        -- |Instantiate a new random bit generator
        newGen :: B.ByteString -> Either GenError g

        -- |Length of input entropy necessary to instantiate or reseed a generator
        genSeedLen :: Tagged g Int

        -- |Obtain random data using a generator
        genBytes        :: g -> Int -> Either GenError (B.ByteString, g)

        -- |'genBytesAI g i entropy' generates 'i' random bytes and use the
        -- additional input 'entropy' in the generation of the requested data.
        genBytesAI      :: g -> Int -> B.ByteString -> Either GenError (B.ByteString, g)
        genBytesAI g len entropy =
                ... default implementation ...

        -- |reseed a random number generator
        reseed          :: g -> B.ByteString -> Either GenError g

Compared to the old RandomGen class we have:

  1. Random data comes in Bytestrings. RandomGen only gave Ints (what is that? 29 bits? 32 bits? 64? argh!), and depended on another class (Random) to build other values. We can still have a ‘Random’ class built for RandomGenerator – should we have that in this module?
  2. Constructing and reseeding generators is now part of the class.
  3. Splitting the PRNG is now a separate class (not shown)
  4. Generators can accept additional input (genBytesAI). Most generators probably won’t use this, so there is a reasonable default implementation (fmap (xor additionalInput) genBytes).
  5. The possibility to fail – this is not new! Even in the old RandomGen class the underlying PRNGs can fail (the PRNG has hit its period and needs a reseed to avoid repeating the sequence), but RandomGen gave no failure mechanism. I feel justified in forcing all PRNGs to use the same set of error messages because many errors are common to all generators (ex: ReseedRequred) and the action necessary to fix such errors is generalized too.

    In Closing

    The full Data.Crypto.Random module is online and I welcome comments, complaints and patches. This is the class I intend to force users of the Crypto API block cipher modes and Asymmetric Cipher instances to use, so it’s important to get right!

Haskell has a moderate history and collection of cryptographically related libraries. For simple hashes and short-message encryption the Crypto library filled many needs. Higher-performing needs for SHA2 and MD5 were supported by pureMD5 and SHA. Gradually the AES, SimpleAES, TwoFish, RSA, ECC, and cryptohash packages appeared, most providing FFI to C implementations, which seemed to solve most users needs for individual low-level algorithms. Unfortunately, none of these gives developers a uniform interface with which to access any of a class of algorithms. To fill this gap I’ve been discussing / developing the crypto-api package.

Crypto-API is an interface to four classes of algorithms plus related helper functions. The four classes include hashes, block ciphers, stream cipher, and asymmetric cipher while related modules includes testing, benchmarking, platform-independent rng, cipher modes, and hash based message authentication codes (hmac).

NOTE: Crypto-API isn’t on Hackage yet, but will be soon. This post is intended to facilitate discussion and motivate package maintainers to write instances.

Hashes

The BlockCipher and Hash classes are the most stable. The interface for Hash is:

class (Binary d, Serialize d, Eq d, Ord d) => Hash ctx d | d -> ctx, ctx -> d where
    outputLength  :: Tagged d BitLength         -- ^ The size of the digest when encoded
    blockLength   :: Tagged d BitLength         -- ^ The size of data operated on in each round of the digest computation
    initialCtx    :: ctx                        -- ^ An initial context, provided with the first call to 'updateCtx'
    updateCtx     :: ctx -> B.ByteString -> ctx -- ^ Used to update a context, repeatedly called until all data is exhausted
                                                                         --   must operate correctly for imputs of n*blockLength bytes for n `elem` [0..]
    finalize      :: ctx -> B.ByteString -> d   -- ^ Finializing a context, plus any message data less than the block size, into a digest

That is, the hash algorithm developer only needs to build the most basic definition of a hash including initial context, update routine, and finalize. It is the responsibility of the higher level routine to obey certain semantics, such as only providing bytestrings that are a multiple of the block length to the update function. Users don’t need to know any of this – all they should care about is:

hash :: (Hash ctx d) => L.ByteString -> d
hash' :: (Hash ctx d) => B.ByteString -> d

… hashing strict or lazy bytestrings.

hashFunc :: Hash c d => d -> (L.ByteString -> d)
hashFunc' :: Hash c d => d -> (B.ByteString -> d)

… obtaining the function that produced a digest.

hmac :: Hash c d => B.ByteString -> L.ByteString -> d
hmac' :: (Hash c d) => B.ByteString -> B.ByteString -> d

… or computing an HMAC of a key + message.

I’d call this a simple interface and one that satisfies the majority of users. There was a comment about including ‘hash’ and associates in the class interface so FFI implementations could override the default for performance reasons. A few optimizations closed the gap significantly which is why these functions remain separate so far. The gap could probably be closed further if ByteString.Lazy would read in chunks of a size modulo 1024 bits (instead of 32KB – 8 bytes, which is a piddly multiple of 64).

Hash instances were made for cryptohash and pureMD5. So far consumers include DRBG and the algorithm specific tests.

Block Ciphers

The BlockCipher class is:

class (Binary k, Serialize k) => BlockCipher k where
    blockSize     :: Tagged k BitLength
    encryptBlock  :: k -> B.ByteString -> B.ByteString
    decryptBlock  :: k -> B.ByteString -> B.ByteString
    buildKey      :: B.ByteString -> Maybe k
    keyLength     :: k -> BitLength       -- ^ keyLength may inspect its argument to return the length

Again, this is intended to capture the essence of block ciphers. Also, a smart constructor ‘buildKey’ is provided so the implementation can weed out weak keys. A non-ideal instance for SimpleAES (see appendix to this blog) was made so I could run benchmarks and mode tests. Crypto-API includes an extensive test framework for AES + modes which is built around parsing NIST KAT files. Note the modes are not finished, not optimized, and only ECB CBC and OFB are tested (I’ve been programming during cocktail hour…).

I’ve yet to include modes as overridable routines of BlockCipher (see above cited comment). This is partly due to a lack of evidence showing a (very likely) performance gain that generalized routines can’t match. Once I see that evidence then I’ll be more likely to make the change.

As with hashes, most users won’t use the class interface but rather the higher level functions provided by Modes.hs (getIV, cbc, unCbc, etc).

RNG

The platform independent RNG is backed by urandom on *nix and the WinCrypt API on windows. My thinking here is any user of /dev/random (on *nix) must be so concerned about security they are carefully controlling most aspects of the platform, thus the non-portability of directly reading /dev/random is inconsequential; e.g. there’s no need to bother with a library to access /dev/random.

The interface: (untested on Windows! If you care about windows please test and debug!)

getEntropy :: ByteLength -> IO B.ByteString
openHandle :: IO CryptHandle
hGetEntropy :: CryptHandle -> Int -> IO B.ByteString
closeHandle :: CryptHandle -> IO ()

If you rarely need quality entropy (ex: just for a quality seed to a PRNG) then use ‘getEntropy’. Frequent users can amortize some handle opening costs by explictly managing their resources and calling the other three functions.

Stream Ciphers

Stream ciphers are assumed to be much like a block cipher in 1-bit CFB mode:

class (Binary k, Serialize k) => StreamCipher k iv | k -> iv where
    buildStreamKey        :: B.ByteString -> Maybe k
    encryptStream         :: k -> iv -> B.ByteString -> (B.ByteString, iv)
    decryptStream         :: k -> iv -> B.ByteString -> (B.ByteString, iv)
    streamKeyLength       :: k -> BitLength

A simple instance would be:

data Xor = Xor B.ByteString

instance Bin.Binary Xor where
    get = undefined
    put = undefined

instance Ser.Serialize Xor where
    get = undefined
    put = undefined

instance StreamCipher Xor Int where
    buildStreamKey = Just . Xor
    encryptStream (Xor k) iv msg = (ct, (B.length msg + iv) `rem` B.length k)
      where
      ct = B.pack $ zipWith xor (B.unpack msg) (drop iv $ cycle $ B.unpack k)
    decryptStream = encryptStream
    streamKeyLength (Xor k) = 8 * (B.length k)

Asymmetric Ciphers

The asymmetric cipher instance currently doesn’t fit any of the available algorithms as it is generalized over random generators. It also is the most likely to change – there are
things I’d change about it right now, but its best to leave the more irk-some aspects to motivate some of you readers to contribute / comment ;-)

class (Binary p, Serialize p) => AsymCipher p where
    generateKeypair :: RandomGen g => g -> BitLength -> Maybe ((p,p),g)
    encryptAsym     :: p -> B.ByteString -> B.ByteString
    decryptAsym     :: p -> B.ByteString -> B.ByteString
    asymKeyLength       :: p -> BitLength

In Closing

1) If you use or develop cryptographic algorithms then join the discussion. I might not use your input but I will carefully consider all comments. Discussion has lead to substantial changes already (thanks guys!). I’m particularly keen on input from stream or asymmetric cipher users.

2) If you maintain any crypto packages then please update to include the correct crypto-api instances. If your package is a block cipher then make sure you’re exporting a pure interface in addition to particular modes.

3) If you use Windows then please help shore up the System.Crypto.Random module – I know it needs work!

4) If you use crypto packages please don’t make an instance or only do so to submit them upstream! Instance belong with the algorithm implementation!

5) Everyone else who wants to help feel free to write modes (XTS, GCM, CTR, etc), make fixes & optimizations, add tests (cipher properties, known answer tests), fix ByteString.Lazy.Internal.defaultChunkSize or export hGetContentsN, and add Data.Crypto.Padding (ex: pkcs5). If none of that interests you but the general topic of cryptography in Haskell does then consider working to improve hecc, add TLS or digest-auth to HappStack, write an IPSec implementation, make a pfkey2 package, improve GHC optimization of the algorithms, or make more fitting primitives!

Appendix on SimpleAES:

SimpleAES exported sufficient constructs with which to build an instance but it isn’t very clean. The main issues are:
1) Building a key can throw exceptions (when it should use Maybe or Either) and the result of key expansion (a costly operation in AES) isn’t stored but recomputed each time.
2) A properly sized IV is required even for ECB mode – which doesn’t actually use an IV. Worse, the “encryptMsg'” function will actually expand the size of data even when using ECB mode.
3) The key isn’t it’s own type, which is a good practice in addition to being needed to make an instance. This ties back to the smart constructor concept of #1.

I always assumed my run was about 5k, but being out of shape it felt more like 7km.  Eventually this bugged me enough that I spent a whole ten minutes at a coffee shop to learn the GPX library and make a program that converts my GPX traces (I carry a GPS logger on jogs) to a distance.  Thank you hackage, thank you Tony.

module Main where

import Data.Geo.GPX
import Data.GPS
import Control.Monad
import System.Environment (getArgs)


main = do
 file <- liftM head getArgs
 run <- readGpxFile file
 let cs = map (degreePairToDMS. latlon) . trkpts . head . trksegs . head . trks . head $ run
     pairs = zip cs (drop 1 cs)
     dist = sum (map (uncurry distance) pairs)
 print dist

Introductions

January 26, 2010

Some introductions are an order.  Megan, meet the world.  World, meet Megan.