Cheesecake for all

9 02 2007

If you maintain a Python package that is registered on PyPI, go check out the Cheesecake service now! We automatically test new releases, so if you have released a new version of your code recently, you can check its Cheesecake score right away.

Tasty cheesecake photo by Sharyn Morrow

Cheesecake is a tool that gives you feedback about the state of your Python package. Unit testing gives you feedback about the behaviour of your code, while Cheesecake tells you about things like whether your package can be easily installed, how well it is documented and how strictly your code adheres to common coding standards (like PEP 8).

Cheesecake defines three types of indexes: installability, documentation and code kwalitee. In short, the installability index tells you whether your package can be easily found, downloaded and installed using distutils/setuptools facilities. The documentation index tells you how many of your code objects (modules/classes/functions) have docstrings and whether you remembered to create files like README or INSTALL (which users tend to look for first after unpacking the source). The code kwalitee index checks your unit tests and runs pylint on the whole package. Combine all of those aspects of a package, check their conformance to common practice, and you get the Cheesecake score. Want more details? Check out the description of the algorithm for computing the Cheesecake index.

The score isn’t meant to define “better” and “worse” packages; it is only a helpful estimate of progress as you work to make your package easier to install, understand and modify. The more work you put into your distribution, the higher the Cheesecake score you should get. We tried hard to make the correlation between good packaging practice and Cheesecake score high, but chances are we made some mistakes. If you think we scored some part of your package wrongly or missed some of your effort, we urge you to send us a bug report. The whole Python community will benefit, as the definition of a good Python package is still not well crystallized. We want Cheesecake to be a useful tool for all Python programmers who seek guidance on how to improve their distributions. The profit is mutual: developers can improve their knowledge of good coding practices and potential distribution problems, while their improved packages will get used more often, to the benefit of the whole Python community.

So, check out the Cheesecake service or try Cheesecake on your own computer. Bon Appétit!





Cheesecake 0.6

15 08 2006

The first official version of Cheesecake is ready. It has already been uploaded to PyPI, so all you need to do is:

easy_install Cheesecake

Let the bug reports flow! :-)





100!

22 07 2006

I know you were all waiting for this. The 100th commit of Cheesecake has been made. Go on, check out the code and taste this delicious piece of Python. Some have already done so and don’t regret it.

But seriously, my first days of working on Cheesecake feel sooo long ago. Now I know every line of its code… But it wasn’t only the code itself that mattered: I also extended the Cheesecake testing infrastructure, setting up a buildbot and adding the first functional tests. Now, after a few more bugfixes and finishing touches, Cheesecake will be ready for its first official release! Then we’ll head for PyPI integration, spreading XP ideas further into the Python community. And that’s not all – check out Grig’s latest initiative. Looking at the results, I feel the effort was really worth it. Enough words – time to get some sleep. 8-)





Devon

26 06 2006

This iteration took two weeks to complete because of the exams I had. Fortunately the exams will soon end; I have the last one on Thursday.

The most important thing about the past iteration was testing Cheesecake on all PyPI packages, which revealed a few issues. First, lots of packages don’t have correct download URLs listed on their PyPI pages, which makes scoring impossible. setuptools tries hard to find a download link, but it fails for many packages. PJE doesn’t want to include any more screen-scraping code in setuptools, so it’s now the package maintainers’ duty to update their download links. Without this manual work by Python developers, the whole idea of PyPI and setuptools is useless.

There is also a group of packages that have only an egg uploaded, without a link to the project sources. Eggs are intended to be a binary distribution and, as such, will be harder to score. Cheesecake has support for eggs now, but the techniques for scoring them will have to be improved at some point. We still have many ideas waiting to be implemented, so please be patient.

Of course there is a reason I’ve done all these tests. My goal is to start PyPI integration soon, so I want to be sure I’m heading in the right direction and that all changes in Cheesecake are for the good of the Python community. I’ve already fixed some scoring methods and there are more improvements on the way. If you’re a Python developer, please try running Cheesecake on your project. I would love to hear your comments.

You can read in more detail about stories completed in Devon on Grig’s blog.





Camembert summary

13 06 2006

Since the Camembert milestone is complete, a few words of summary are due.

Let’s start with the mistakes. The coverage statistics I published a few days ago are bogus, all because I didn’t remove the .coverage file from my cheesecake development directory, so the statistics got messed up during the next run. I should have suspected an error because the line numbers in the coverage output were obviously bad. I have to be more careful next time. Fortunately buildbot pointed out my laziness and I was able to fix the bug. So, remember to always use the --cover-erase option! The actual number is 79% (with cheesecake_index scoring 75%).

Now for the good things. In this iteration I’ve managed to prepare a buildbot setup for automatic generation of Cheesecake documentation (using the wonderful epydoc package) and its coverage statistics (using coverage.py and the cover2html script I’ve written). So, if you want to check out what’s currently going on, you have two more sources of up-to-date information (the existing one was Trac, of course).

The last thing I wrote was a simple tool for converting our ReST README file into Trac Wiki format. This way we are able to maintain a single file and automatically convert it to any necessary format on demand. HTML output is supported by the standard docutils distribution, but there wasn’t any solution for Trac Wiki. With a bit of reading through the docutils code I’ve come up with a working script that successfully converts the Cheesecake README from ReST format into Trac Wiki. The current revision available for download is 14. If you’re interested in writing a custom ReST->anything converter, you may want to read its sources. docutils has a quite verbose but self-explanatory API, so it’s not hard to start developing your own script. I also have a few simple pieces of advice that may help you on your way:

  • All you have to do is write a custom Writer. For examples you may want to look at the HTML Writer code.
  • To make your writer usable, use the publisher interface to easily create a command line tool, like the standard rst2html or rst2latex.
  • rst2pseudoxml is very helpful during debugging. This tool produces the document tree as seen by the ReST parser.
  • Having good unit tests is very important. Because we’re writing a conversion engine, most of the tests consist of feeding our application with sample data and validating the output it produces. For this simple task the power of unittest or doctest would only get in your way. I’ve prepared a special testing script, run_text_tests.py, which can be found in the rest2trac package. It finds all files with the .in extension in a directory and feeds their contents to the Trac Writer. After each test it checks whether the Writer output is the same as the contents of the corresponding .out file. If not, it exits with an error. It should be very easy to use it in your own project.
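The .in/.out testing idea from the last point can be sketched in a few lines. This is not the actual run_text_tests.py from rest2trac, only a minimal illustration of the same approach: convert() is a hypothetical placeholder for the real Writer, and unlike the script described above, this version collects all failures instead of exiting on the first one.

```python
from pathlib import Path


def convert(text):
    # Placeholder for the converter under test (hypothetical); a real
    # runner would call the Trac Wiki writer here instead.
    return text.upper()


def run_text_tests(directory, converter=convert):
    """Return names of .in files whose output differs from the .out file."""
    failures = []
    for in_path in sorted(Path(directory).glob("*.in")):
        out_path = in_path.with_suffix(".out")
        # A missing .out file counts as a failure rather than a crash.
        expected = out_path.read_text() if out_path.exists() else None
        if converter(in_path.read_text()) != expected:
            failures.append(in_path.name)
    return failures


def main(directory):
    """Print each failing case and return a shell-style exit status."""
    failures = run_text_tests(directory)
    for name in failures:
        print("FAILED:", name)
    return 1 if failures else 0
```

To use it from the command line you would finish the script with something like sys.exit(main(sys.argv[1])), so that buildbot sees a non-zero status when any pair of files disagrees.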

That’s it. Feel free to comment on everything I’ve written. I am happy to hear any feedback.





This episode is sponsored by the __getitem__ and __metaclass__

8 06 2006

Yesterday I committed quite a large piece of code. Cheesecake is much more modularized now, as you will see below. The general idea was to move all scoring out of the Cheesecake class, making each index as self-contained as possible.

I started with a base Index class. Each Cheesecake index that scores a separate element of a package inherits from this base class. This way we have IndexUrlDownload (adds points if a package has been successfully downloaded from the provided URL), IndexUnpack (adds points for successful unpacking of a package), IndexRequiredFiles (raises the score for the existence of useful files, like README or INSTALL) and many more, all inheriting from Index. Each index can be a container for other indices, in which case its value is the sum of its child indices’ values.
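To make the container idea concrete, here is a minimal sketch of how such a hierarchy could look. It is not the real Cheesecake code: the constructor, the package dictionary and the point values are all assumptions made up for illustration; only the class names come from the description above.

```python
class Index:
    """Simplified stand-in for the base class (not the real implementation)."""

    max_value = 0

    def __init__(self, *children):
        self.children = list(children)

    def compute(self, package):
        # Leaf indices override this; a container sums its children.
        return sum(child.compute(package) for child in self.children)


class IndexUnpack(Index):
    max_value = 15  # assumed number, for illustration only

    def compute(self, package):
        return self.max_value if package.get("unpacked") else 0


class IndexRequiredFiles(Index):
    max_value = 10  # assumed number, for illustration only

    def compute(self, package):
        return self.max_value if "README" in package.get("files", ()) else 0


# A container index whose value is the sum of its two children.
container = Index(IndexUnpack(), IndexRequiredFiles())
score = container.compute({"unpacked": True, "files": ["README", "setup.py"]})
```

With the sample package above both children award their full points, so the container reports their sum.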

The life of an index has three stages. First, we define the index, subclassing Index and defining some general parameters. The most important attributes are the max_value number and the compute method. The first defines the maximum score this index can give a package, while the second is used to compute the index value for a given package. For an index to actually be used during Cheesecake score computation, it has to be put into the list of indices that the cheesecake script uses. This list is called CheesecakeIndex and currently consists of three indices representing three different views of a package’s “goodness”: installability, documentation and code “kwalitee”. If you add your Index instance to the CheesecakeIndex list of subindices, it will be called automatically and its value will affect the overall score.

Each index has to base its score on some package characteristics. I’ve used a special convention here that, in my opinion, simplifies the whole implementation of indices. From the Index point of view, you name the compute method’s parameters so that they correspond to the names of Cheesecake class attributes. For example, when looking at the get_pkg_from_pypi method you may notice it defines three instance variables: download_url, distance_from_pypi and found_on_cheeseshop. So, if you want to use the value of found_on_cheeseshop inside your index, define the compute method like this:

def compute(self, found_on_cheeseshop):
    pass # your code goes here...

The Cheesecake code that calls your index will take care of the rest. For an explanation, check out the get_method_arguments function.
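The convention can be illustrated with a short sketch. The body of get_method_arguments below is my guess at what such a helper does, written with the modern inspect module rather than copied from Cheesecake; FakeCheesecake, IndexFoundOnCheeseshop, call_compute and the point value are hypothetical names invented for this example.

```python
import inspect


def get_method_arguments(method):
    """Return the parameter names of a bound method (self is excluded)."""
    return list(inspect.signature(method).parameters)


class FakeCheesecake:
    """Hypothetical stand-in holding attributes an index may ask for."""

    def __init__(self):
        self.download_url = "http://example.org/pkg.tar.gz"
        self.distance_from_pypi = 0
        self.found_on_cheeseshop = True


class IndexFoundOnCheeseshop:
    max_value = 50  # assumed number, for illustration only

    def compute(self, found_on_cheeseshop):
        return self.max_value if found_on_cheeseshop else 0


def call_compute(index, cheesecake):
    # Look up each parameter name as an attribute of the Cheesecake object
    # and pass the values to compute as keyword arguments.
    names = get_method_arguments(index.compute)
    kwargs = {name: getattr(cheesecake, name) for name in names}
    return index.compute(**kwargs)


value = call_compute(IndexFoundOnCheeseshop(), FakeCheesecake())
```

The index never touches the Cheesecake object directly; it only declares what it needs through its parameter names, which keeps each index easy to test in isolation.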

But there’s more than this to indices. Check out this line of test code:

index = self.cheesecake.index["INSTALLABILITY"]["url_download"]

It clearly shows that you can get to an index’s children using this dictionary-like syntax. This magic comes from proper use of __getitem__. But that’s not all. Go ahead and grep cheesecake_index.py for “url_download”. You won’t find anything! So where does the name come from? I’ve followed the DRY rule here. I have already defined a name in the class definition, so why should I define it again? Getting this done was a bit tricky. I’ve used a NameSetter metaclass that, during class definition, takes the class name and, using the index_class_to_name function, injects a name attribute. Thanks to this mechanism, it’s enough to say IndexUnpackDir and I get the “unpack_dir” name for free.

I’ve refactored most of the code I wanted to change, so I’m pretty happy with it now. A side effect of the rewrite is a much better coverage score (especially when you compare it to the last results):
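Both tricks fit in a small sketch. This is a reconstruction under assumptions, not the code from cheesecake_index.py: the conversion rule inside index_class_to_name is a guess, the constructor is invented, and the metaclass is attached with the current metaclass= keyword instead of the __metaclass__ attribute the original code would have used.

```python
import re


def index_class_to_name(class_name):
    """Turn 'IndexUnpackDir' into 'unpack_dir' (assumed conversion rule)."""
    if class_name.startswith("Index"):
        class_name = class_name[len("Index"):]
    # Insert an underscore before every capital letter except the first.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


class NameSetter(type):
    """Metaclass that derives a name attribute from the class name."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.name = index_class_to_name(name)


class Index(metaclass=NameSetter):
    def __init__(self, *children):
        self.children = list(children)

    def __getitem__(self, name):
        # Dictionary-like access to child indices by their derived names.
        for child in self.children:
            if child.name == name:
                return child
        raise KeyError(name)


class IndexUrlDownload(Index):
    pass


class IndexUnpackDir(Index):
    pass


class IndexInstallability(Index):
    pass


root = Index(IndexInstallability(IndexUrlDownload(), IndexUnpackDir()))
child = root["installability"]["url_download"]
```

Note that the string "url_download" appears nowhere in the class definitions; it exists only because the metaclass computed it from IndexUrlDownload.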

Name                         Stmts   Exec  Cover
cheesecake/config               35     29    82%
cheesecake/logger               80     70    87%
cheesecake/cheesecake_index    592    531    89%
cheesecake/_util                85     80    94%
cheesecake/codeparser           92     90    97%
TOTAL                          884    800    90%

I’ll get to automatic coverage generation soon, so you’ll be able to browse these statistics for each build.





The worst project ever

6 06 2006

Some people have a very strong emotional relationship with Cheesecake:

for the record, the “cheesecake” mandatory-inane-code-metrics-invented-by-bozos project is the worst project ever in Python’s history.

“we don’t have any creative talent whatsoever, so let’s come up with a way to fuck with all the creative people that has made Python into what it is.”

seriously, if the only thing you find interesting in computing is the chance to control what other people do, and how they do it, please go work in some other field.

Well, I guess Cheesecake is right up there with pylint, setuptools, doctest, or even the Python interpreter itself, as these tools also don’t hesitate to point out a programmer’s mistakes. In the company of these projects we feel perfectly fine.





Stage two: refactoring

5 06 2006

To make Cheesecake easier to maintain and more extensible I’ve decided to do some major refactoring this week. You may expect things to be broken inside the mk branch for some time. But at the end of the week you’ll hopefully be able to hack together your own index and easily plug it into Cheesecake. I will write about it in more detail later. And for all agile fans, I will also incorporate automatic creation of documentation and coverage statistics into buildbot.

Oh, and check out what Titus Brown has to say about Cheesecake. If you’re interested in an explanation of what the Cheesecake development process looks like, read Grig’s latest post.





What Cheesecake is all about…

5 06 2006

This is a copy and paste from the project comments page that summarizes my view on Cheesecake.

What do you mean there should be a README file? If I add an empty README file I get a good cheesecake score, while somebody who has excellent information in a file DOC, gets a bad score? I’m afraid cheesecake’s kwalitee is going to become a factor that won’t be respected by developers. If cheesecake gives a bad score to a project it may mean that either the project is bad, *or* cheesecake is bad and the project is good. And if there are many good projects that get a bad score cheesecake will render itself wrong!

And that’s exactly the reason we’re asking the community for feedback. We don’t want to implement “book standards” but real, working and useful practices that good Python developers use. If there turn out to be a few good projects with a DOC file instead of README, we’ll add this check to Cheesecake. What I would call the most common misunderstanding about Cheesecake is the idea that we are trying to come up with our own standards and enforce them on Python developers. What we’ll actually do is listen to what developers have to say and come up with a greatest common divisor of all this good programming advice. The Cheesecake score will represent a project’s compliance with this common and established way of doing things. An important part of this is that Cheesecake won’t ever be complete. It will have to incorporate changes as the community and its methodology change over time. Having a tool that points you to current trends and offers good advice on trivial things like file naming conventions or distribution methods is invaluable. This way you can focus on coding, leaving the boring stuff to Cheesecake. One of our goals is to create an easy reference for all factors that affect the Cheesecake score, so that every developer can easily look up why they are losing points. And that doesn’t mean these rules are set in stone – we encourage all developers to question them.

The point I’m trying to make is that Cheesecake is written to help programmers, not to put blame on them. It’s one of the pieces of agile development advice: “Criticize Ideas, Not People”. If most developers use README, it’s probably a good thing to have a README in your project. The check says nothing about the quality of your documentation; it only suggests you don’t have the file that most people will look for first, right after the package has been unpacked. A README also makes life easier for packagers in different open source distributions, like Debian or Gentoo, as they don’t have to check manually which of your DOC/UserManual/anything files contains actual documentation that they can, for example, incorporate into a man page. Having one good way of doing things is very efficient for collaboration and project maintenance. Python was built upon this principle, so Cheesecake is merely a continuation of this thinking, but on a different level.





“The Queen of Cheeses”

29 05 2006

Ready to taste some of the famous brie? Go on and check out Cheesecake’s first milestone. Along with a few bugfixes I plan to enhance the documentation index. I hope to deliver the first modifications in the next few hours, so feel free to check out my code and test it before our buildbot does! ;-)