A few attempts at implementing AES candidates, working from the papers' descriptions alone. Not always an easy task.
DFC The Decorrelated Fast Cipher. Translating the description into code is a little fiddly: it sometimes takes a bit of thought to work out what the authors are trying to say (e.g., in the key expansion).

The table for EES as given in the paper does not match its definition: the third-to-last block (eb64749a) should be last, and the next two blocks shifted up to fill the gap.

The description is big endian, in contrast with all the other candidates, which are little endian.

Lack of test vectors. Thanks to Brian Gladman for getting some, and for many discussions on endianness.

Uses 64 bit words and multiplies: awkward on current machines, but will be better on future machines.
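To illustrate why 64 bit multiplies are awkward without native support, here is a minimal sketch of a full 64 x 64 -> 128 bit product built from 32 bit halves. The names u128 and mul64x64 are hypothetical, and this shows only the multiply, not any of the modular reduction that the cipher itself performs.

```c
#include <stdint.h>

/* Hypothetical type: a 128-bit result held as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128;

/* Schoolbook multiply: split each operand into 32-bit halves,
   form the four partial products, then add the columns. */
static u128 mul64x64(uint64_t a, uint64_t b)
{
    uint64_t a0 = a & 0xFFFFFFFFu, a1 = a >> 32;
    uint64_t b0 = b & 0xFFFFFFFFu, b1 = b >> 32;

    uint64_t p00 = a0 * b0;   /* contributes to bits   0..63  */
    uint64_t p01 = a0 * b1;   /* contributes to bits  32..95  */
    uint64_t p10 = a1 * b0;   /* contributes to bits  32..95  */
    uint64_t p11 = a1 * b1;   /* contributes to bits  64..127 */

    /* Middle column: three values below 2^32, so no 64-bit overflow. */
    uint64_t mid = (p00 >> 32) + (p01 & 0xFFFFFFFFu) + (p10 & 0xFFFFFFFFu);

    u128 r;
    r.lo = (mid << 32) | (p00 & 0xFFFFFFFFu);
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}
```

Four multiplies and a handful of adds and shifts stand in for what a 64 bit machine does in a single instruction.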

Use of multiple word sizes (32, 64, 128, etc) confusing for endian reasons.

MARS Sloppy in places: using << for rotate in pseudo code is confusing (particularly as they use <<< in diagrams). And the spec for generating Sboxes is wrong: it should be
S[5i+j] = SHA1(5i | c1 | c2 | c3)_j
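On the << versus <<< confusion: the two operations give different results as soon as a set bit reaches the top of the word, which is why the notation matters. A minimal sketch for 32 bit words follows; shl32 and rotl32 are hypothetical helper names, not taken from the MARS paper.

```c
#include <stdint.h>

/* Plain left shift: bits pushed off the top are lost. */
static uint32_t shl32(uint32_t x, unsigned n)
{
    return x << (n & 31);
}

/* Left rotate: bits pushed off the top re-enter at the bottom.
   Masking both counts keeps every shift in 0..31, so n == 0 is safe. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}
```

For example, shifting 0x80000001 left by one gives 0x00000002, while rotating it left by one gives 0x00000003.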

Difficult parts: generating the Sboxes. The E function was merely fiddly (computing the bitmask M).

I notice that test vectors and corrections have now appeared.

FROG Not very good description of algorithm: sometimes difficult to determine the order of parts of the algorithm. Several typos in document.

Lack of test vectors for complete cipher.

HPC The Hasty Pudding Cipher. Implementation still in progress. I don't like this one, it is far too complicated!

Twofish The hardest part was getting the generation of the Sboxes correct, in particular the modular multiplications: it was a while before I realised that they were producing the wrong answers. I settled on a stupid, but correct formulation.
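A slow but easy to check formulation of multiplication in GF(2^8) might look like the sketch below. This is not necessarily the formulation I ended up with: gf256_mul is a hypothetical name, and the reduction polynomial is passed as a parameter on the assumption that the caller takes the right one from the paper for each of the matrices involved.

```c
#include <stdint.h>

/* Shift-and-add multiply in GF(2^8).
   poly is the 9-bit reduction polynomial, including the x^8 term
   (e.g. 0x11B for x^8 + x^4 + x^3 + x + 1). */
static uint8_t gf256_mul(uint8_t a, uint8_t b, unsigned poly)
{
    unsigned acc = 0;   /* running sum (XOR) of selected multiples of a */
    unsigned x = a;     /* a * x^k, kept reduced modulo poly            */

    while (b) {
        if (b & 1)
            acc ^= x;       /* add this multiple if the bit of b is set */
        x <<= 1;            /* multiply by x                            */
        if (x & 0x100)
            x ^= poly;      /* reduce when degree reaches 8             */
        b >>= 1;
    }
    return (uint8_t)acc;
}
```

A routine like this can be checked against published values for a different field before being trusted with the cipher's own polynomials: with the AES polynomial 0x11B, gf256_mul(0x57, 0x83, 0x11B) gives 0xC1, the worked example in FIPS-197.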

Q boxes were easy: right first time. A couple of auxiliary programs compute the qs and the lookup tables for the MDS multiples.

Would like more test data: using all zeros for a key meant that various matrix multiplies produced correct answers (i.e., zero) even though entries were incorrect. Some example Sboxes would have been useful.

After getting a correct version, I could simplify it somewhat by inlining various parts by hand. Producing a full-keyed version was also straightforward.

Various ways of converting from words to bytes were considered (e.g., unions), until I settled on simple casts (hidden by macros) as the simplest solution. Big vs little endian: developed both at same time.
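For comparison, here is a sketch of a different technique from the casts described above: macros that assemble a word from bytes with shifts, so the same source gives the same answer on big and little endian machines. LOAD32_LE and STORE32_LE are hypothetical names, and the little endian byte order is an assumption chosen for the example.

```c
#include <stdint.h>

/* Read 4 bytes at p as a little-endian 32-bit word.
   p must point to unsigned bytes (uint8_t or unsigned char). */
#define LOAD32_LE(p)                      \
    ( (uint32_t)(p)[0]                    \
    | ((uint32_t)(p)[1] << 8)             \
    | ((uint32_t)(p)[2] << 16)            \
    | ((uint32_t)(p)[3] << 24))

/* Write the 32-bit word v to 4 bytes at p, least significant first. */
#define STORE32_LE(p, v)                  \
    do {                                  \
        uint32_t v_ = (v);                \
        (p)[0] = (uint8_t)(v_);           \
        (p)[1] = (uint8_t)(v_ >> 8);      \
        (p)[2] = (uint8_t)(v_ >> 16);     \
        (p)[3] = (uint8_t)(v_ >> 24);     \
    } while (0)
```

The trade-off is a few extra shifts per word in exchange for not needing separate big and little endian builds, and for avoiding any alignment assumptions about the byte buffer.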

Algorithm description: good, but involved lots of flipping back and forth during implementation.

RC6 Almost trivial to implement. Got it right first time, even on both endian types.

Then simple to extend to other word-length variants (no test data, so cannot confirm correctness).

No attempt at optimisation yet: there's a lot less to fiddle with, so there's not much chance of any significant optimisations.

Algorithm description: good. The key schedule is hidden away in an appendix, though.