hsXenCtrl and pureMD5

August 7, 2008 by tommd

On vacation I found some time to upload the new hsXenCtrl library (0.0.7) and pureMD5 (0.2.4).

The new hsXenCtrl includes the System.Xen module, which is a WriterT ErrorT transformer stack and a brief attempt at ‘Haskellifying’ the xen control library.  I find it much more useful for simple tasks like pausing, unpausing, creating and destroying domains.  The API is still subject to change without notice, as plenty of functions are still very ‘C’ like (ex: scheduler / sedf functions).
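
For readers who have not met a stack of that shape, here is a minimal sketch of a writer layered over an error monad. It is not the System.Xen API: every name (Xen, DomId, XenError, runXen, pauseDomain, unpauseDomain, bounce) is made up for illustration, the domain check is a stand-in for the real C call, and ExceptT from the transformers package is swapped in for ErrorT, which newer library versions deprecate.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT, tell)

-- Hypothetical types; the real library's types will differ.
newtype DomId = DomId Int deriving (Eq, Show)

data XenError = NoSuchDomain DomId deriving (Eq, Show)

-- WriterT on the outside collects a log; ExceptT underneath aborts on error.
type Xen a = WriterT [String] (ExceptT XenError IO) a

-- Peel the layers off: first the writer, then the error layer.
runXen :: Xen a -> IO (Either XenError (a, [String]))
runXen = runExceptT . runWriterT

-- Stand-in for a call into the C library: only domains 0..9 "exist".
pauseDomain :: DomId -> Xen ()
pauseDomain d@(DomId n)
  | n >= 0 && n < 10 = tell ["paused " ++ show d]
  | otherwise        = lift (throwE (NoSuchDomain d))

unpauseDomain :: DomId -> Xen ()
unpauseDomain d@(DomId n)
  | n >= 0 && n < 10 = tell ["unpaused " ++ show d]
  | otherwise        = lift (throwE (NoSuchDomain d))

-- Pause a domain and immediately resume it.
bounce :: DomId -> Xen ()
bounce d = pauseDomain d >> unpauseDomain d
```

With this ordering, running bounce on a known domain gives a Right holding the result and a two-entry log, while an unknown domain gives a Left and the log collected so far is thrown away.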

pureMD5 received a much smaller change - some users noticed that the -fvia-c flag caused compilation headaches on OS X.  After removing the offending flag, some benchmarks revealed no measurable difference in speed, so this is an overdue change. OS X users rejoice!

Past and Future libraries

June 18, 2008 by tommd

Hello planet, as my first post that gets placed on planet.haskell.org I decided to do a quick recap of the libraries I maintain and muse about future libraries.  My past posts include why static buffers make baby Haskell Curry cry and fun academic papers.

The Past

* pureMD5: An implementation of MD5 in Haskell using lazy ByteStrings.  It performs within an order of magnitude of the typical ‘md5sum’ binary, but has known inefficiencies and could be improved.

* ipc: A trivial to use inter-process communication library.  This could use some work, seeing as structures that serialize to over 4k get truncated currently.  I’ll probably only come back to this if I end up with a need for it.

* control-event: An event system for scheduling and canceling events, optimized for use with absolute clock times.

The Present

* hsXenCtrl: This library is intended to open the doors for Haskell apps to interact with and perhaps manage Xen.  Currently it’s just straightforward ‘C’ bindings to an old version of <xenctrl.h>, but the intent is to build a higher level library with useful plumbing.

* NumLazyByteString: Not sure if I’ll bother finishing this one, but it adds ByteString to the Num, Enum, and Bits type classes.  I just thought it would be funny to have lazy addition, allowing: L.readFile "/dev/urandom" >>= \a -> print (a + a `mod` 256)
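
As a rough sketch of the lazy addition idea (this is not the NumLazyByteString code): assume the bytes are read as a little-endian number, so the carry only ever moves forward and each output byte needs only the input bytes up to that position. The name addLE and the choice of byte order are assumptions of this example.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- Add two lazy ByteStrings as little-endian unsigned numbers.
-- Output is produced byte by byte, so a prefix of the sum can be
-- consumed without forcing the rest of either input.
addLE :: L.ByteString -> L.ByteString -> L.ByteString
addLE a b = L.pack (go 0 (L.unpack a) (L.unpack b))
  where
    -- The first argument is the carry from the previous byte (0 or 1).
    go :: Int -> [Word8] -> [Word8] -> [Word8]
    go 0 [] [] = []
    go c [] [] = [fromIntegral c]
    go c xs ys = fromIntegral (s `mod` 256) : go (s `div` 256) xs' ys'
      where
        (x, xs') = headOr0 xs
        (y, ys') = headOr0 ys
        s        = fromIntegral x + fromIntegral y + c

    -- Treat an exhausted input as an endless run of zero bytes.
    headOr0 :: [Word8] -> (Word8, [Word8])
    headOr0 []     = (0, [])
    headOr0 (z:zs) = (z, zs)
```

A big-endian reading would not work this way, since the first output byte would depend on a carry from the very end of the input.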

The Future

I tend to be a bit of a bouncing ball in terms of what interests me.  Near and mid-term tasks will probably be a couple of these:

* A .cabal parser that can create a dependency graph for Haskell packages (not modules).  But should I use an outside package like graphviz or go pure / self contained Haskell?

* An implementation of something distributed like a P2P or ad-hoc networking protocol.  Would this be Pastry, then Awerbuch’s work, or OLSR2?  These would be large tasks with their own ups and downs.

* Finally learn happs and make some sort of web Xen management system using hsXenCtrl.

* Learn Erlang - just because it looks cool too.

* Forget programming (and blogging) - read more TaPL!

Objections to Lenovo + SuSE

June 10, 2008 by tommd

I just received my Thinkpad T61p and was eager to try SUSE, seeing as I have never used SUSE before but have used about every other major Linux distribution. Sadly it didn’t pan out:

1) Updating the software via zmd

Sat for over half an hour “resolving dependencies” without telling me anything else (buttons were grayed out). Sorry, but rule number one: never leave the user out of the loop.

2) Installing a basic -devel library (libpng, depends on zlib):

resolve-dependency: 50 sec

transact: 54 sec

update-status: 49 sec

And when I add in zmd updating the list of software, I am talking about a three or four minute process just to install trivial developer libraries such as libpng-devel. I’m sure this is just a configuration issue but it really gave me a sour taste.

3) Using Yast2

A) Common libraries and programs not in SUSE’s repo

Many packages weren’t even found, for example ‘wine’ and ‘libjpeg-devel’ turned up zero search results. Perhaps there is some reason the GUI isn’t giving me much, but using yast through the cli doesn’t seem to compare to apt or yum. Am I mistaken in thinking yast2 should help here?

B) Many options not entirely visible

“Installation into Dire” ok, so I know that’s “Directory”, but what does it do? Can it help with any of my issues? Why can’t I mouse over it and read it all or see the entire text in a status bar?

4) Upgrading to 10.1 and 10.2

A) There was no blatant ‘upgrade to 10.1’ button, but I found it and lived, no big deal

B) Upgrading to 10.2, short of downloading openSUSE DVDs, seems to require registration. This is amazingly stupid if so. Even if there is a way to do so without registering why have the road block method be the easiest one to stumble across (via yast2 -> online update)?

C) Registration requires a “Service Tag” which probably came with my laptop (or could it be that I am not entitled to this service?). Being a savvy computer user I know right where to look for service tag numbers… after trying numbers on the laptop, the CD, the CD sleeve, and some on the box I gave up (I couldn’t find any documentation that identified such a number, obviously).

What does all this mean to the average Troll Enthusiast Fanatic Open-source-developer-wanna-be Parasite (TEFOP)?

1) I’m too tired

2) I don’t know how to use SuSE (and don’t care enough to spend much time)

3) My Internet connection must be mush (nope, I’ve been getting 200KBps, all of my HTTP downloads were fine)

And what do I want you to take away?

1) Hardware vendors shouldn’t ship 3 year old OSes (10.0 came out 6Oct2005)

2) More importantly: Software vendors trying to improve on their brand name should not allow older versions of their product to continue shipping on new hardware!

3) Requiring registration is a bad idea, but if it must be so then tell customers where to get the information you are requesting of them.

4) Keep the user informed. I don’t like 4 minute install times for 400KB libraries. I literally found the homepage, downloaded, compiled and installed a program before my single package install finished in two cases.

Planet Haskell

June 9, 2008 by tommd

Yes, the owner of this blog is requesting to get added to Planet Haskell - is this authentication enough? ;-)

Recommended Reading

May 25, 2008 by tommd

I’ve been wanting to advance my education through a PhD program for a while now. As such, I’ve been reading a reasonable number of papers, mostly in the field of programming languages (strong bias toward SPJ’s work), but also in ad-hoc networks (strong bias toward Baruch Awerbuch papers). I can’t say I’m too selective on what I like, but here are some of my likes anyway. Enjoy and feel free to post your papers or any discussion of the ideas presented in these papers.

Within The World of Languages

Simon Peyton-Jones, “Call-pattern Specialisation for Haskell Programs” *

Simon Marlow et al “Faster Laziness Using Dynamic Pointer Tagging”

Simon Peyton-Jones et al, “Playing by the Rules: Rewriting as a practical optimisation technique in GHC” *

Tom Schrijvers et al “Type Checking with Open Type Functions” *

Duncan Coutts et al “Stream Fusion: From Lists to Streams to Nothing at All” *

Neil Mitchell and Colin Runciman “A Supercompiler for Core Haskell” * (Looks great, but I want to try it on my own programs to see if it will benefit me as much as I hope)

Peng Li et al “Lightweight Concurrency Primitives for GHC” * (A simpler to understand RTS would be great, but I fear for the performance)

Robert Ennals et al “Task Partitioning for Multi-Core Network Processors” *

Peng Li and Steve Zdancewic “Encoding Information Flow in Haskell” (perhaps not sound, but certainly useful)

Tim Harris and Simon Peyton Jones “Transactional Memory with Data Invariants” * (some functions aren’t available in the standard GHC/STM load, but the paper is fun anyway)

Dana Xu et al “Static Contract Checking for Haskell” * (I don’t know about you, but I almost can’t wait to see the work embodied in a GHC release!)

Every name you know “Roadmap for Enhanced Languages and Methods to Aid Verification”

Ad hoc / Distributed Systems / Protocols

Baruch Awerbuch et al “Towards Scalable and Robust Overlay Networks” (See the entire line of papers, including “A Denial-of-Service Resistant DHT” and “Towards a Scalable and Robust DHT”)

Rudolf Ahlswede et al “Network Information Flow”

Sachin Katti et al “Network Coding Made Practical” * (Now why isn’t this an option when I click network manager -> ad hoc network in Fedora 9?)

Joshua Guttman “Authentication Tests and the Structure of Bundles” *

Baruch Awerbuch et al “Provably Competitive Adaptive Routing” *

Baruch Awerbuch et al “Medium Time Metric” (This one is just begging for someone to write a paper “The opportunity cost metric”, don’t you think?)

* Easy read (even if it isn’t your field) / very enjoyable

Static Buffers Considered Harmful

May 25, 2008 by tommd

I posted this as a page by accident - so here it is as a blog entry and I’ll delete the page some day.

My previous post discussed how inet_ntoa uses a static buffer which can cause a race condition. Unlike in ‘C’, this is particularly likely to cause a race in Haskell programs due to the quick, easy, and cheap threads using forkIO that (potentially) share a single OS thread. Two bright spots were that inet_ntoa was marked as IO and that the result is usually unimportant.

Another FFI binding, nano-md5, has a similar race condition but is much more serious (not marked as IO and the result is a digest).

An even-handed note: iirc, nano-md5 remains on Hackage mostly as an FFI example - not that this is advertised in the nano-md5 description. “Real” users are told to look at hsOpenSSL and hopenssl - a cursory glance at the code suggests they don’t have this bug. Also, the other bindings don’t require O(n) space - so they are certainly worth switching to.

The nano-md5 line:

digest <- c_md5 ptr (fromIntegral n) nullPtr

is the culprit. It uses ‘nullPtr’ and according to the OpenSSL manual “If md is NULL, the digest is placed in a static array”.
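
The obvious repair is to hand MD5 a buffer of our own rather than nullPtr. A minimal sketch, assuming a direct binding to OpenSSL’s MD5 function (so the program has to be linked against libcrypto) and its 16-byte digest; the names c_md5' and md5Digest are made up here, and this is not necessarily how nano-md5 fixed it:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.Word (Word8)
import Foreign.C.Types (CSize(..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)

-- unsigned char *MD5(const unsigned char *d, size_t n, unsigned char *md);
foreign import ccall unsafe "MD5"
  c_md5' :: Ptr Word8 -> CSize -> Ptr Word8 -> IO (Ptr Word8)

-- Hash a strict ByteString into a freshly allocated 16-byte digest.
-- Each call gets its own output buffer, so concurrent callers cannot
-- overwrite one another's result.
md5Digest :: B.ByteString -> IO B.ByteString
md5Digest bs =
  unsafeUseAsCStringLen bs $ \(src, len) ->
    allocaBytes 16 $ \out -> do
      _ <- c_md5' (castPtr src) (fromIntegral len) out
      -- Copy the digest out before the temporary buffer is freed.
      B.packCStringLen (castPtr out, 16)
```

The temporary buffer lives only for the duration of the call, which is why the digest is copied into a ByteString before returning.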

Test code that confirms the bug can be found here - this will run three hash operations in parallel and eventually one result will be the correct first bits with ending bits from one of the other digests. The developer has already fixed the issue for versions 0.1.2+. I’ll wrap this post up with a request for library developers to please work to avoid use of static buffers - they have no place in this forkIO happy playland I call Haskell.

Racing inet_ntoa

April 24, 2008 by tommd

Just because I am feeling lazy wrt any real task, I decided to post about the silliness that is inet_ntoa. Yes, this is ancient/known stuff to rehash, but you can hit the browser back button at any time.

As most of you probably know, the function inet_ntoa converts an IPv4 address to ascii, storing the result in a static buffer. It is this last part that periodically causes people fun when they forget. This mutable memory issue is revealed easily enough in goofed up ‘C’ statements such as:

  struct in_addr a,b;
  inet_aton("1.2.3.4", &a);
  inet_aton("99.43.214.7", &b);
  printf("addr a: %s\taddr b: %s\n", inet_ntoa(a), inet_ntoa(b));

which returns “addr a: 1.2.3.4 addr b: 1.2.3.4”. Sometimes more complex systems have a race condition (ex: exception handlers calling inet_ntoa), but it isn’t a larger issue in multi-threaded C programs thanks to thread local storage…

unless you cram many logical threads into a single OS thread like in Haskell. Zao in #haskell asked why inet_ntoa was of type IO (meaning, it isn’t a pure / deterministic function) and I correctly guessed it was a wrapper for the ‘C’ call.

Not to rip at any of the libraries folk, who made a faithful foreign function interface for the sockets/networking functions, but - this was a bad idea. Foremost, the use of IO means this can’t be called from any deterministic function even though the desired operation of converting an address to a string IS deterministic. Secondly, some Haskell programmers (myself included) use Haskell’s threads liberally (perhaps another, positive, blog post on that). So if someone is being brain-dead then they are going to have a bug - likely non-fatal and obvious due to how string representations of addresses are used.
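
To make the point concrete, the conversion needs nothing beyond bit twiddling. A minimal sketch, assuming the address is a Word32 whose low-order byte is the first octet (that is the layout a network-byte-order address has on a little-endian host); the name showAddr is made up here and is not part of the network library:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.List (intercalate)
import Data.Word (Word32)

-- Pure dotted-quad rendering: no static buffer, no IO, no race.
-- Octets are taken from the low-order byte upward.
showAddr :: Word32 -> String
showAddr w =
  intercalate "." [ show ((w `shiftR` s) .&. 0xff) | s <- [0, 8, 16, 24] ]
```

Since every call builds its own String, any number of threads can use it at once, and it can be called from pure code.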

And if you desire to see the race, I have some code… hope it runs… yep:

import Network.Socket (inet_ntoa)
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (when, forever)

main = do
    let zero = (0,"0.0.0.0")
        one  = (1,"1.0.0.0")
        two  = (2,"2.0.0.0")
        assert = confirm "assert: "
        race = \x -> forever (confirm "FAILURE: " x)
    -- Sequential sanity check: each conversion is right on its own.
    assert zero
    assert one
    assert two
    -- Three threads now fight over the one static buffer.
    forkIO $ race zero
    forkIO $ race one
    forkIO $ race two
    threadDelay maxBound

-- Convert the address and die loudly if the string is not the expected one.
confirm prefix (n,str) = do
    s <- inet_ntoa n
    when (s /= str) (error (prefix ++ s ++ " does not equal " ++ str))

Edit:
Yes, I know this non-deterministic behavior is being screamed by that ‘IO’ type.
Yes, I know I should write a Haskellized network library.

Business Models and Open Source

April 6, 2008 by tommd

I’ve always thought the ideal work would include a healthy dose of open source. To this end, I try to make my time wasting activities (slashdot, proggit) worthwhile by noting the various business models and their success. This is an informal brain dump with extra facts pulled in from Wikipedia - as such you should confirm any information before long term storage in your brain.

RedHat

For idealistic reasons, the preferred model would be like RedHat’s. You compose, improve and build open source projects and products, providing them for free and selling the service (consulting, tailored development, support).

The main problem here is that, as a theoretical small business owner, I want to sell products and not services. In addition to the discrete nature of products (as opposed to the continuing duties of services), the constructive feel of a product oriented business seems nice in contrast to the droning of a service oriented business. Here is a question: is the concept of open source at odds with product driven revenue?

XenSource

The Xen method seemed to be making an open source component and selling closed source support tools. This appeared aimed at preventing RedHat et al. from ‘stealing’ the GPL code and owing nothing to XenSource. I say “seemed” because the true aim could have been to get bought out - as XenSource was on 22Oct2007 by Citrix for $500M.

TrollTech

A well known (and often successful) method of dual licensing was employed by TrollTech. TrollTech owned the source to Qt, which they monetized by selling proprietary licenses in addition to offering the source code under the GPL. As with XenSource, a buyout followed: on 28Jan2008 Nokia offered (and TrollTech accepted) $163M.

MySQL

Like TrollTech, MySQL offered both support and dual licensing. I notice this is referred to as a ‘second generation’ open source company on Wikipedia - not sure how common this term is, but I take issue with the idea that the distinguishing factor between second and first generation open source companies is their ability/willingness to sell closed source licenses. MySQL was purchased by Sun on 16Feb08 for $1B… are we seeing a trend yet?

The Jabberwolk

March 21, 2008 by tommd

Welcome to the Jabberwolk. This is my blog recently moved from sequence.complete.org. Here I will try to stay on the topics of Haskell, Xen, and Linux but anything technical is game and I’ll warn you that politics might appear (but I’ll try to keep that down).

Edit: My old blog is here.