This episode is sponsored by the __getitem__ and __metaclass__

8 06 2006

Yesterday I committed quite large piece of code. Cheesecake is much more modularized now, as you will see below. The general idea was to move all scoring out of Cheesecake class, making each index as much self-contained as possible.

I started with a base Index class. Each cheesecake index that scores separate element of a package inherits from this base class. This way we have IndexUrlDownload (add points if a package has been successfully downloaded from provided URL), IndexUnpack (add points for successful unpacking of a package), IndexRequiredFiles (rise score for existence of useful files, like README or INSTALL) and much more, all inheriting from Index. Each index can be a container for other indices, so that the value of this index will be a sum of these child indices values.

Life of an index have three stages. First, we define the index, subclassing from Index and defining some general parameters. Most important attributes are max_value number and compute method. First defines maximum score this index can give a package, while the second is used to compute index value for given package. For index to be actually used during Cheesecake score computation it has to be put into a list of indices that cheesecake script use. This list is called CheesecakeIndex and currently consist of three indices representing three different views on a package “goodness”: installability, documentation and code “kwalitee”. If you add your Index instance to CheesecakeIndex list of subidices, it will be automatically called and its value will affect overall score.

Each index have to base its score on some package characteristics. I’ve used special convention here that in my opinion simplifies whole indices implementation. From the Index point of view you name the compute method parameters in a way that corresponds to Cheesecake class attributes’ names. For example, when looking at get_pkg_from_pypi method you may notice it defines three instance variables: download_url, distance_from_pypi and found_on_cheeseshop. So, if you want to use the value of found_on_cheeseshop inside your index, define compute method like this:

def compute(self, found_on_cheeseshop):
    pass # your code goes here...

Cheesecake code that will call your index will take care of the rest. For explanation check out get_method_arguments function.

But there’s more than this to indices. Check out this line of test code:

index = self.cheesecake.index["INSTALLABILITY"]["url_download"]

It clearly shows that you can get to index children by this dictionary-like syntax. This magic comes from proper use of __getitem__. But that’s not all. Go ahead and grep for “url_download”. You won’t find anything! So, where the name comes from? I’ve followed a DRY rule here. I have already defined a name in the class definition, so why I should define it again? Getting this done was a bit tricky. I’ve used NameSetter meta-class that during class definition takes its name and using index_class_to_name function injects name attribute. Thanks to this mechanism, it’s enough to say IndexUnpackDir and I get “unpack_dir” name for free.

I’ve refactored most of the code I wanted to change, so I’m pretty happy with it now. Side effect of rewrite is much better coverage score (especially when you compare it to the last results):

Name Stmts Exec Cover
cheesecake/config 35 29 82%
cheesecake/logger 80 70 87%
cheesecake/cheesecake_index 592 531 89%
cheesecake/_util 85 80 94%
cheesecake/codeparser 92 90 97%
TOTAL 884 800 90%

I’ll get to automatic coverage generation soon, so you’ll be able to browse these statistics for each build.




3 responses

12 06 2006
Ian Bicking

That ungreppability is a bad feature, IMHO. Generally modifying names in any way is a bad idea (and when I’ve done it, I’ve later regretted it). It would be better to have [‘IndexURLDownload’] or something like that, or else simply to repeat yourself so that your code is explicit about names.

12 06 2006

It seemed like a nice feature at first, but you may be right about it – ungreppability is a bad thing. I will probably leave ‘name’ attribute metaclass magic, but without name modifications.

13 06 2006
Mousebender » Blog Archive » Camembert summary

[…] Let’s start with the mistakes. Coverage statistics I’ve published few days ago are bogus. […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: