A Better Foundation for Random Values in Haskell
September 2, 2010
RandomGen – The Old Solution
Mathematicians talk about random bits and many programmers talk about streams of random bytes (ex: /dev/urandom, block cipher counter RNGs), so its a bit odd that Haskell adopted the RandomGen class, which only generates random Ints. Several aspects of RandomGen that are non-ideal include:
- Only generates Ints (Ints need to be coerced to obtain other types)
- By virtue of packaging it is often paired with StdGen, a sub-par generator
- Mandates a ‘split’ operation, which is non-sense or unsafe for some generators (as BOS pointed out in a comment on my last post)
- Doesn’t allow for generator failure (too much output without a reseed) – this is important for cryptographically secure RNGs
- Doesn’t allow any method for additional entropy to be included upon request for new data (used at least in NIST SP 800-90 and there are obvious default implementations for all other generators)
Building Something Better
For these reasons I have been convinced that building the new crypto-api package on RandomGen would be a mistake. I’ve thus expanded the scope of crypto-api to include a decent RandomGenerator class. The proposal below is slightly more complex than the old RandomGen, but I consider it more honest (doesn’t hide error conditions / necessitate exceptions).
class RandomGenerator g where -- |Instantiate a new random bit generator newGen :: B.ByteString -> Either GenError g -- |Length of input entropy necessary to instantiate or reseed a generator genSeedLen :: Tagged g Int -- |Obtain random data using a generator genBytes :: g -> Int -> Either GenError (B.ByteString, g) -- |'genBytesAI g i entropy' generates 'i' random bytes and use the -- additional input 'entropy' in the generation of the requested data. genBytesAI :: g -> Int -> B.ByteString -> Either GenError (B.ByteString, g) genBytesAI g len entropy = ... default implementation ... -- |reseed a random number generator reseed :: g -> B.ByteString -> Either GenError g
Compared to the old RandomGen class we have:
- Random data comes in Bytestrings. RandomGen only gave Ints (what is that? 29 bits? 32 bits? 64? argh!), and depended on another class (Random) to build other values. We can still have a ‘Random’ class built for RandomGenerator – should we have that in this module?
- Constructing and reseeding generators is now part of the class.
- Splitting the PRNG is now a separate class (not shown)
- Generators can accept additional input (genBytesAI). Most generators probably won’t use this, so there is a reasonable default implementation (fmap (xor additionalInput) genBytes).
- The possibility to fail – this is not new! Even in the old RandomGen class the underlying PRNGs can fail (the PRNG has hit its period and needs a reseed to avoid repeating the sequence), but RandomGen gave no failure mechanism. I feel justified in forcing all PRNGs to use the same set of error messages because many errors are common to all generators (ex: ReseedRequred) and the action necessary to fix such errors is generalized too.
The full Data.Crypto.Random module is online and I welcome comments, complaints and patches. This is the class I intend to force users of the Crypto API block cipher modes and Asymmetric Cipher instances to use, so it’s important to get right!