Tuesday, July 14, 2009

The mess of ATF's code

Yes. ATF's code is a "bit" messy, to put it bluntly. I'm quite happy with some of the newest bits but there are some huge parts in it that stink. The main reason for this is that the "ugly" parts were the ones that were written first, and they were basically a prototype; we didn't know all the requirements for the code at that point... and we still don't know them, but we know we can do much better. Even though I'm writing in plural... I'm afraid we = I at the moment :-P

So, is it time for the big-rewrite-from-scratch? NO! Joel Spolsky wrote about why this is a bad idea and I have to agree with him. Yeah, I'm basically the only developer of the code so everything is in my head, and I'd do a rewrite with a fresh mind, but... I'd lose tons of work and, especially, I'd lose tons of code that deals with tricky corner cases that are hard to remember.

Sure, I want to clean things up, but the cleanups will happen incrementally. And preferably concurrently with feature additions. These two things could definitely happen at the same time if only I had infinite spare time...

Anyway, the major point of this post is to describe what I don't like about the current code base and how I'd like to see it change:
  • A completely revamped C++ API for test cases. The current one sucks. It is not consistent with the C API. It lacks important functionality. It uses exceptions for test-case status reporting (yuck!). And it's ugly.
  • Clear separation of "internal/helper" APIs from the test APIs. You'll agree that the "fs" module, which provides path abstraction and other file system management routines, is something that should not be part of ATF's API. ATF is about testing. Period. Either that fs module should be in a separate library or it should be completely hidden from the public. Otherwise, it'll suffer from abuse and, what scares me, will have to become part of ATF's API. And likewise, most (really, most) of the modules in the current code are internal.
  • Fewer dependencies from the C++ API to the C API. Most of the current C++ modules are wrappers of their corresponding C counterparts. This is nice for code reuse but makes the code extremely fragile. In C++, things like RAII can provide really robust code with minimum effort, but intermixing such C++ code with C makes things ugly really quickly. I'd like to find a way to keep the two libraries separate from each other (and thus keep the C++ binding "pure"), but at the same time I don't want to duplicate code... an interesting problem.
  • Split the tarball into smaller pieces. People writing test cases for C applications don't want to pull in a huge package that depends on C++ and whatnot. And ATF is huge. It takes forever to compile. And this is a serious issue for broad adoption. Note: whether the tools are written in C++ or not is a separate issue, because these are not a dependency for anything!
  • The shell binding is slow. Really slow compared to the other ones. Optimizations would be nice, but those do not address the root of the problem: it's costly to query information from shell-based tests at run time. For example, it takes a long time to get the full list of test cases available in a test suite because you have to run every single test program with the -l flag. Keeping a separate file with test-case metadata alongside the binary could resolve this and allow more flexibility at run time.
  • And some other things.
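To make the metadata-file idea from the shell-binding item a bit more concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: the file format (one test-case name per line, with '#' comments), the function name load_test_case_names and the MAX_NAME limit are all hypothetical and do not exist in ATF. The point is only that listing test cases becomes a cheap file read instead of one process per test program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64 /* Hypothetical limit; longer names get truncated. */

/* Hypothetical format: one test-case name per line; blank lines and lines
 * starting with '#' are ignored.  Reads up to 'max' names from 'path' into
 * 'names' and returns how many were stored, or -1 if the file cannot be
 * opened. */
static int
load_test_case_names(const char *path, char names[][MAX_NAME], int max)
{
    char line[256];
    int count = 0;
    FILE *f;

    assert(path != NULL && names != NULL && max >= 0);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (count < max && fgets(line, sizeof(line), f) != NULL) {
        line[strcspn(line, "\r\n")] = '\0'; /* Strip the line terminator. */
        if (line[0] == '\0' || line[0] == '#')
            continue;
        snprintf(names[count], MAX_NAME, "%s", line);
        count++;
    }

    fclose(f);
    return count;
}
```

A runner could call this once per test program and never spawn the program itself until a test case actually has to run.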
Those are the major things I'd like to see addressed soon, but they involve tons of work. Of course, I'd like to be able to work on some features expected by other developers: easier debugging, DOCUMENTATION!...

So, helpers welcome :-)

Tuesday, June 23, 2009

Technicians and schedules

Here I am, on the afternoon of a work day, sitting at home waiting for an eircom technician to come set up my phone line. How nice. The story goes like this:

Two weeks ago, I placed an online order to request a phone line, explicitly specifying that the physical installation is already done (even though I don't know if it works or not, but that should be fairly easy for them to check). A few days later, the technician called me saying that he'd come today (two weeks later), anytime from 12.00 to 15.00, but that I should call the company the same day to get a more accurate schedule.

Fine, I'll wait until the 23rd to make that call. But you know what happened, right? I called them this morning and they said that, indeed, the technician was coming today, from 12.30 onwards, but they were unable to give me any more specific information because the technicians have multiple appointments. What? Again, WHAT? In this age of technology, can't we implement a system to track technicians and their schedules? Can't we make some approximations of how long each visit will take? I bet it's trivial if you put in just some common sense.

People have jobs, and they can't leave anytime for unknown periods of time; granted, I have some more freedom at Google, but that is absolutely not the case for most other companies. If you have to be at home at 12.30 sharp, and the appointment will last 30 minutes approximately, that is one thing, but having to be at home from 12.30 for an unknown period of time is a very different thing.

Just wondering... couldn't they just make the technician call you about 20-30 minutes before arrival so that you could arrange to get there at the same time as him? It doesn't seem such an insane request.

Sunday, June 21, 2009

Child-process management in C for ATF

Let's face it: spawning child processes in Unix is a "mess". Yes, the interfaces involved (fork, wait, pipe) are really elegant and easy to understand, but every single time you need to spawn a new child process to, later on, execute a random command, you have to write quite a bunch of error-prone code to cope with it. If you have ever used any other programming language with higher-level abstraction layers — just check Python's subprocess.Popen — you surely understand what I mean.
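To give an idea of how much boilerplate is involved, here is a rough sketch, in plain POSIX and with nothing ATF-specific, of what it takes just to run one command and capture its stdout. The helper name capture_stdout is made up for illustration, and the sketch still skips details a robust version would need (retrying on EINTR, capturing stderr, timeouts).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of ATF): runs argv[0] with the given
 * arguments, stores up to bufsize - 1 bytes of its stdout in buf as a
 * NUL-terminated string and returns the child's exit code, or -1 on error
 * or if the child did not exit normally. */
static int
capture_stdout(char *const argv[], char *buf, size_t bufsize)
{
    char scratch[256];
    int fds[2];
    int status;
    pid_t pid;
    size_t used = 0;
    ssize_t n;

    assert(argv != NULL && buf != NULL && bufsize > 0);

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: connect stdout to the write end of the pipe. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) == -1)
                _exit(127);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127); /* Only reached if exec failed. */
    }

    /* Parent: close the write end, or read() would never see EOF. */
    close(fds[1]);
    while (used < bufsize - 1 &&
           (n = read(fds[0], buf + used, bufsize - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';

    /* Drain what did not fit so the child cannot block on a full pipe. */
    while (read(fds[0], scratch, sizeof(scratch)) > 0)
        ;
    close(fds[0]);

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Forgetting any one of the close() calls above leads to leaked descriptors or a parent that hangs forever, which is exactly the kind of mistake that is easy to repeat when this code is copied around.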

The current code in ATF has many places where child processes have to be spawned. I recently had to add yet another case of this, and... enough was enough. Since then, I've been working on a C API to spawn child processes from within ATF's internals and just pushed it to the repository. It's still fairly incomplete, but with minor tweaks, it'll keep all the dirty details of process management contained in a single, one-day-to-be-portable module.

The interface tries to mimic the one that was designed in my Boost.Process Summer of Code project, but in C, which is quite painful. The main idea is to have a fork function to which you pass the subroutine you want to run in the child, the behavior you want for the stdout stream and the behavior you want for the stderr stream. These behaviors can be any of capture (aka create pipes for IPC), silence (aka redirect to /dev/null), redirect to file descriptor and redirect to file. For simplicity, I've omitted stdin. With all this information, the fork function returns an opaque structure representing the child, from which you can obtain the IPC channels if you requested them and on which you can wait for finalization.

Here is a little example, with tons of details such as error handling or resource finalization removed for simplicity. The code below would spawn "/bin/ls" and store its output in two files named ls.out and ls.err:
static
atf_error_t
run_ls(const void *v)
{
    /* Runs in the child process. */
    system("/bin/ls");
    return atf_no_error();
}

static
void
some_function(void)
{
    atf_process_stream_t outsb, errsb;
    atf_process_child_t child;
    atf_process_status_t status;

    /* Send the child's stdout and stderr to files. */
    atf_process_stream_init_redirect_path(&outsb, "ls.out");
    atf_process_stream_init_redirect_path(&errsb, "ls.err");

    atf_process_fork(&child, run_ls, &outsb, &errsb, NULL);
    /* ... yeah, here comes the concurrency! ... */
    atf_process_child_wait(&child, &status);

    if (atf_process_status_exited(&status))
        printf("Exit: %d\n", atf_process_status_exitstatus(&status));
    else
        printf("Error!\n");
}
Yeah, quite verbose, huh? Well, it's the price to pay to simulate namespaces and other similar things in C. I'm not too happy with the interface yet, though, because I've already encountered a few gotchas when trying to convert some of the existing old fork calls to the new module. But, should you want to check the whole mess, check out the corresponding revision.
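For the curious, the "redirect to file" behavior can be approximated with plain POSIX calls. The following is only a sketch of the general technique (open the file in the child and dup2 it over the stream before running the subroutine); it is not ATF's actual implementation, and every name in it (child_fn, redirect_to_path, fork_redirected, wait_exit_code, say_hello) is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not ATF's implementation. */
typedef int (*child_fn)(const void *);

/* Child side: make 'target' (STDOUT_FILENO or STDERR_FILENO) write to
 * 'path', creating or truncating the file.  Returns 0 or -1. */
static int
redirect_to_path(int target, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    if (fd != target) {
        if (dup2(fd, target) == -1) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}

/* Forks, applies both redirections in the child, runs 'fn' there and uses
 * its return value as the exit code.  Returns the child's PID to the
 * parent, or -1 if fork fails. */
static pid_t
fork_redirected(child_fn fn, const void *arg,
                const char *outpath, const char *errpath)
{
    pid_t pid;
    int rc;

    assert(fn != NULL && outpath != NULL && errpath != NULL);

    fflush(NULL); /* Avoid duplicating buffered stdio data in the child. */
    pid = fork();
    if (pid != 0)
        return pid; /* Parent, or fork error. */

    if (redirect_to_path(STDOUT_FILENO, outpath) == -1 ||
        redirect_to_path(STDERR_FILENO, errpath) == -1)
        _exit(EXIT_FAILURE);
    rc = fn(arg);
    fflush(NULL); /* _exit() does not flush stdio buffers. */
    _exit(rc);
}

/* Waits for 'pid'; returns its exit code, or -1 if it did not exit
 * normally. */
static int
wait_exit_code(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Sample child body: writes one line to each stream. */
static int
say_hello(const void *unused)
{
    (void)unused;
    printf("to stdout\n");
    fprintf(stderr, "to stderr\n");
    return 0;
}
```

Calling fork_redirected(say_hello, NULL, "hello.out", "hello.err") followed by wait_exit_code() on the returned PID leaves one line in each file. Doing the redirection in the child, after the fork, keeps the parent's own stdout and stderr untouched.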

Saturday, June 13, 2009

How to find an apartment in Dublin

It has been three weeks since I moved to Dublin, Ireland, and I finally have settled into my new apartment. It has taken me two weeks (I was pretty busy during the first one) to go through ads, visits and offers to finally get a place that is cozy, nicely decorated and decently located, all at a quite reasonable price. I could have gotten nicer places for a bit more money, but I'm happy with this one so far.

If you are looking for a place to stay in Dublin, this post contains some suggestions based on my experience:

First of all, keep in mind that Dublin is outrageously expensive. The prices for housing here are insane at the moment (OK, not as expensive as NYC or SF, but really expensive anyway). Be prepared to spend around 1K EUR for a nice 1 bedroom apartment, and 1.5K EUR for a nice 2 bedroom apartment. Things may improve in the coming months, as they just did for the first quarter of the year.

With that said, your first point of reference should be daft. This is the place where all landlords and agencies put their ads, and the place where everyone is looking for apartments. To get started, you need to know where you want to live. Get a rough idea and then locate that place in one of the Dublin postal districts and the ones surrounding it. Given that public transportation is... well... suboptimal, you don't want to live too far from your workplace. Then, hunt for places within your budget... and a budget a bit higher: you can always try to negotiate the rent down and get a nicer apartment than you would otherwise, still staying within your initial budget.

Once you have selected some of the apartments you want to check, call the landlords or agents and ask for an appointment as soon as possible. And, during the visit, check a few basic things:
  • Whether the house is old or new: if it's new, it'll probably be in nicer condition overall.
  • Water pressure: old houses have poor water pressure.
  • Electric shower: this is really scary to me, but it is what most old houses use to cope with poor water pressure.
  • Carpet: nice, but a horrible mess to clean up.
  • Garbage collection service: if the building does not do this for you, you'll have to pay for garbage collection separately. I just bought 3 bin tags and those were almost 9 EUR. Yes: 9 EUR to pay for the collection of THREE garbage bags.
  • Location of supermarkets: Dublin is basically a big town, so most roads don't have shops. Make sure that you have a supermarket within walking distance where you can get basic stuff.
  • Availability of cable/phone: you'll need this for Internet.
  • Furniture: most apartments in Dublin are provided fully-furnished, so make sure to pick one with furniture that you like. Ask if you are allowed to replace some. Pay special attention to the mattress and couches!!
  • Cutlery: OK, this is part of the furniture, but check what you have. Your landlord may provide you additional stuff for free upon request.
  • Washer and dryer: you want to have a dryer, as most lease contracts state you cannot hang clothes in public places.
  • Heating and double-windows: you'll need this during the winter.
And, lastly, don't hurry! The housing market has improved over the last few months, so if you see a place that you like, you'll most likely have a few days to decide whether you want it or not (in the past, you had to decide during viewing time, or otherwise it'd be gone afterwards). Think well about your decision and negotiate; don't show yourself as impatient or you'll get worse deals!

I think that's all for now. If there is anything else, the post will be updated :)

Thursday, June 11, 2009

Trying AdSense

I've just decided to enable AdSense on this blog and see what the results are. If they are not worth it (which is what I'm expecting), I'll disable ads after a while. But who knows, maybe I'll get a nice surprise!

Thursday, May 14, 2009

Paella in NYC

These days, I'm starting to cook by myself (aka learning) and yesterday I made paella for 6 people while staying in NYC (leaving on Sunday...). This is only the third time in two weeks that I have cooked this Spanish dish, but I think the results were pretty good despite the lack of ingredients. After all, cooking is not as hard as I originally thought! And it's pretty fun too!

Just blogging this because the results look nice:

 
P.S. I'm now eating the leftovers from yesterday. Yummm! :-)

Wednesday, May 13, 2009

Mailing lists for commit notifications

The project I'm currently working on at university uses Subversion as its version control system. Unfortunately, the project itself has no mailing list to receive notifications on every commit, and the managers refuse to set this up. They do not see the value of such a list and they are scared of it because they probably assume that everyone ought to be subscribed to it.

Having worked on projects that have a commit notification mailing list available, I strongly advise having such a list anytime you have more than one developer working on a project[1]. Bonus points if every commit message comes with a bundled copy of the change's diff (in unified form!). This list must be independent from the regular development mailing list and it must be opt-in: i.e. never subscribe anyone by default, let people subscribe themselves if they want to! Not everyone will need to receive this information, but it comes in very handy... and it's extremely valuable for the project managers themselves!

Why is this useful? Being subscribed to the commit notification mailing list, it is extremely easy to know what is going on in the project[2]. It is also really easy to review the code submissions as soon as they are made which, with proper reviews by other developers, trains the authors and improves their skills. And if the revision diff is inlined, it is trivial to pinpoint mistakes in it (be they style errors, subtle bugs, or serious design problems) by replying to the email.

So, to my current project managers: if you are reading this, here is a wish-list item. And, for everyone else, if you need to set up a new project, consider creating this mailing list as soon as possible. Maybe few developers will subscribe to it, but those that do will pay attention and will provide very valuable feedback in the form of replies.

1: Shame on me for not having such a mailing list for ATF. Haven't investigated how to do so with Monotone.

2: Of course, the developers must be conscientious about committing early and often, and about providing well-formed changesets: i.e. self-contained and with descriptive logs.