Apache Cocoon: a case study in sustainability

by Andrew Savory, Managing Director, Luminas Ltd, on 17 September 2007

Archived: this page has been archived and its content will not be updated.

Introduction

In the latter half of 2006, the Joint Information Systems Committee (JISC) commissioned a study via its Learning and Teaching committee to examine the issues surrounding sustainability of open source software. The resulting report drew together seven case studies of successful but very different open source projects and examined each project’s sustainability model. Each case study is told from the point of view of the lead developer or one of the key personnel, and gives an insight into the factors that have determined the success of each project. These case studies are now presented by OSS Watch as standalone documents in a series.

This case study, examining the Apache Cocoon project, has been written by Andrew Savory, Managing Director, Luminas Ltd.

Brief description

Apache Cocoon is a Web development framework—a package of interlinked Java software components that Web developers can use to help them process XML-based content for websites. The software is published under the Apache License (version 2.0), which means that the source code can be used and distributed in both open and closed source variants.

Cocoon is currently at version 2.1.11. As it evolves, it is being used for increasingly complex solutions, including a number of high-profile, mission-critical installations in corporate information environments. It has also been used in a number of JISC-funded projects, including an open source software repository system for images for BioMed [1] and the Virtual Norfolk [2] project.

Introduction

Cocoon is a Java-based web framework built around the concepts of separation of concerns and component-based web development. Cocoon facilitates a Lego™-like approach to building web solutions by allowing the developer to assemble the various components with minimal programming. It also allows parallel development of all aspects of a web application, improving development pace and reducing the chance of conflicts.

Cocoon is a project of the Apache Software Foundation, a non-profit organisation incorporated in the USA, set up to support open collaborative software development by supplying the hardware, communication and business infrastructure. The ASF is an independent legal entity to which companies and individuals can donate resources and be assured that those resources will be used for the benefit of the public. It is the ASF that provides the structure and governance of the Cocoon project and the foundation’s role provides an interesting case study for considering perspectives on the governance of open source projects.

The Cocoon project closely follows what has become known as ‘the Apache way’, the set of guiding principles shared by the Apache Software Foundation and its community of developers and users:

  • collaborative software development
  • commercially-friendly standard licence
  • consistently high quality software
  • respectful, honest, technical-based interaction
  • faithful implementation of standards
  • security as a mandatory feature

Project history

Cocoon began life in 1998 as a solution to managing the java.apache.org website (now jakarta.apache.org). Originally written by the Italian student Stefano Mazzocchi, it was designed to take advantage of the very latest standards that were available at the time. It used the newly released Extensible Stylesheet Language family (XSL) [3] and XML as a way of separating presentation from content, thus allowing anyone to edit the content without worrying about how it should be displayed. The initial solution was a simple (100 lines of code) Java servlet running under the Apache web server that transformed a given XML file into HTML using an XSL stylesheet.
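To make the idea concrete, the following is a minimal sketch of that kind of transformation, using the standard javax.xml.transform API that ships with Java. It is not the original Cocoon code: the class name XslTransformSketch, the sample document and the stylesheet are all hypothetical, and a real servlet would read files and write to the HTTP response rather than work with in-memory strings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration only; not the original Cocoon servlet.
class XslTransformSketch {

    // Content only: no presentation markup.
    static final String CONTENT_XML =
        "<page><title>Hello</title><para>Content lives in XML.</para></page>";

    // Presentation only: describes how a <page> becomes HTML.
    static final String STYLESHEET_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/page\">"
      + "<html><body>"
      + "<h1><xsl:value-of select=\"title\"/></h1>"
      + "<p><xsl:value-of select=\"para\"/></p>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML content and returns the result as text.
    static String toHtml(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(CONTENT_XML, STYLESHEET_XSL));
    }
}
```

Because the content and the stylesheet are separate inputs, an editor can change the text of the page without touching the stylesheet, and a designer can restyle the page without touching the content.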

Following significant interest from another Apache project’s mailing list, a formal vote was taken to start the Cocoon project and Stefano contributed the initial code. This somewhat informal process was very different to today’s Apache Incubator [4], which seeks to define a formal process and metrics for what makes a suitable Apache project.

At the time of writing there have been three major releases of Cocoon (1.0, 2.0 and 2.1). Illustration 1 shows the evolution of the project over the last eight years.

Illustration 1: Cocoon versions

Around November 1999, work began on Cocoon 2.0—a complete redesign that learned from the design mistakes and architectural issues of Cocoon 1.0. It was designed for performance and scalability, and raised the use of XML and XSL technologies to a new level. With centralised configuration and sophisticated caching, it became possible to create, deploy and maintain robust XML server applications.

In 2003, work on Cocoon 2.1 began, again learning from previous mistakes. Cocoon 2.1 represented an incremental change rather than the revolutionary change between Cocoon 1 and Cocoon 2.

Growth and development

This method of separating the content from the presentation became increasingly popular as a way of working, and the project started to attract many more developers. Cocoon has grown into a full XML-based publishing system, which is now used in many production sites around the world. As Cocoon continues to evolve it is being used to solve increasingly complex problems. The system has a fairly high profile in the world of web development and has been implemented in a number of corporate enterprise ICT scenarios, for example:

  • handling information transfer within Swiss banks and German mobile phone operators
  • providing information portals for large organisations
  • publishing websites for the media and public sectors
  • as the application framework for several popular content management systems (e.g. Hippo CMS, Daisy, Lenya).

Project structure: process and governance

The governance model for the Cocoon project is a series of guidelines based on an understanding of the successes and failures of previous Apache projects in managing their communities. As Cocoon is one of the larger, longer-standing ASF projects, the best practices and social patterns observed within its community have also informed the current ASF governance structure. So, in order to understand the way the Cocoon project works, it is necessary to look at the wider organisation within which it operates.

Illustration 2 shows the layout of the Apache Software Foundation and the projects within it. At the top level is the Board of Directors, responsible for maintaining the business affairs and legal oversight of all parts of the Foundation subject to a set of bylaws [5]. Reporting to the Board is a Project Management Committee (PMC) for each of the projects within the Foundation. The PMCs are established by the Board to be responsible for the management of one or more communities. They provide oversight, ensure legal issues are addressed and procedures are followed, and are also responsible for the long-term health and development of the community as a whole. The PMC’s responsibilities do not include code or coding. The PMC has an appointed chair, elected from within the community on a yearly rotating basis, but this is for legal and bureaucratic reasons only; the emphasis is always on a community of peers.

Illustration 2: Apache Software Foundation structure

Within the ASF the emphasis is very much on individual contribution rather than the contribution of companies (or people working on behalf of companies). No developer is directly paid by the ASF for the work they do on projects although developers may, of course, be paid by their own employers. The roles that an individual may play within a project and within the ASF as a whole are very clearly defined, and this helps individuals to understand how, when and where they can make a contribution.

Each community revolves around a codebase, and operates as a meritocracy (literally, government by merit). When an individual working within the community is felt to have earned their place in it, they are granted direct access to the code repository, thus enlarging the group and its capacity to develop the code. The roles defined within the meritocracy are shown in Illustration 3 and are as follows:

  • User: someone who makes use of ASF software. They may be a passive user (simply downloading and working with the software) or an active user (providing feedback via suggestions or bug reports), and they can participate in the community by helping other users via the mailing lists.
  • Developer: a user who contributes to the project in the form of code or documentation. They actively contribute on the developer mailing list and provide fixes, suggestions and criticism. Developers are also known as contributors.
  • Committer: a developer who has been recognised by the community and given write access (the ability to make updates to a project’s code) to the code repository, and who is also able to make short-term decisions for the project.
  • PMC member: a developer or committer elected due to their contribution to the evolution of the project and their demonstrated commitment. They have the right to vote on community-related issues and to propose other developers for committership.
  • ASF member: similar to a PMC member, the ASF member is nominated for commitment and contributions to the ASF as a whole and membership is by invitation only. They participate across several projects, and have the right to elect the ASF board.

Illustration 3: Roles within the meritocracy

There are two ways for decisions to be made within the community. Because the committers have already been recognised as responsible developers by their peers, mutual trust and peer pressure are in place to ensure committers act responsibly and know when it is appropriate to make decisions themselves and when to call a vote.

Minor decisions are usually made on the assumption that ‘code talks’, i.e. developing and committing the code to prove a concept or provide a solution is often more effective than endless discussions.

For more important decisions, such as architectural changes or major new features, votes are required. Voting is carried out by lazy consensus, that is, a discussion is started on the project mailing list titled [VOTE], and is left for at least 48 hours before the initiator sums up—a lack of response is assumed to be tacit agreement. The 48-hour period is designed to take into account the distributed nature of development, and to allow developers in different time zones to play equal roles.

Votes are cast by providing a +1 (in favour of the proposal), a 0 (no opinion) or a -1 (against the proposal). Any negative votes must be accompanied by full explanations, to reduce friction within the community and to stimulate discussion of better solutions.

Anyone can vote, though only committer votes are binding. Non-committer votes are considered ‘non-binding’, i.e. recommendations or expressions of preference that help the committers decide what is best for the community.
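The voting rules can be sketched in a few lines of Java. This is an illustration rather than any tool the ASF actually uses: the names LazyConsensusSketch, Vote, Outcome and tally are hypothetical. The decision rule is also an assumption, since the text does not spell out how objections are resolved: here a proposal is treated as accepted once the 48-hour window has passed with no -1 from a committer, and any such -1 simply flags that further discussion is needed.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the voting rules described above; not ASF tooling.
class LazyConsensusSketch {

    // Minimum time a [VOTE] thread stays open before it is summed up.
    static final Duration MIN_WINDOW = Duration.ofHours(48);

    enum Outcome { STILL_OPEN, ACCEPTED, OBJECTION_RAISED }

    static final class Vote {
        final String voter;
        final boolean committer;   // only committer votes are binding
        final int value;           // +1, 0 or -1
        final String explanation;  // required when value is -1

        Vote(String voter, boolean committer, int value, String explanation) {
            if (value < -1 || value > 1) {
                throw new IllegalArgumentException("vote must be +1, 0 or -1");
            }
            if (value == -1 && (explanation == null || explanation.trim().isEmpty())) {
                throw new IllegalArgumentException("a -1 vote needs an explanation");
            }
            this.voter = voter;
            this.committer = committer;
            this.value = value;
            this.explanation = explanation;
        }
    }

    // Assumed rule: silence counts as agreement; a binding -1 blocks acceptance.
    static Outcome tally(List<Vote> votes, Duration elapsed) {
        if (elapsed.compareTo(MIN_WINDOW) < 0) {
            return Outcome.STILL_OPEN;
        }
        for (Vote v : votes) {
            if (v.committer && v.value == -1) {
                return Outcome.OBJECTION_RAISED;
            }
        }
        return Outcome.ACCEPTED;
    }

    public static void main(String[] args) {
        List<Vote> votes = Arrays.asList(
            new Vote("alice", true, 1, null),
            new Vote("bob", false, -1, "Prefer the existing API"),
            new Vote("carol", true, 0, null));
        // bob is not a committer, so his -1 is advisory only.
        System.out.println(tally(votes, Duration.ofHours(72))); // prints ACCEPTED
    }
}
```

Note that an empty list of votes also yields ACCEPTED once the window has elapsed, which is the ‘lazy’ part of lazy consensus.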

There is no single individual identified as the ‘leader’ of the project. Early in the project’s life, Stefano Mazzocchi was recognised as a ‘benevolent dictator’ with a similar role to that of Linus Torvalds, in that he had responsibility for steering the project. This role rapidly evolved into one of ‘community catalyst’, providing oversight, advice and guidance. Over time, as the community has strengthened, the need for this role has reduced, and today Stefano is considered an emeritus member of the community [6].

Today, the Apache Cocoon community is made up of roughly 15 to 20 regular committers to the code base, with over 40 other committers showing continued interest and participating in discussions. Committers make up roughly 10% of those active in discussions on the mailing lists. There are over a thousand subscribers to the Cocoon mailing lists, which receive on average more than 50 messages a day.

The GetTogether

Cocoon is well known amongst the Apache community as one of the largest and most vibrant groups of developers. It is one of very few Apache projects to have its own annual conference, the Cocoon GetTogether. This informal gathering of users, developers and business people is typical of the way the Cocoon community works: the event is split into two days of ‘hacking’ and one day of more formal talks, presentations and exhibits.

The two days of hacking typically allow anyone interested in Cocoon to informally talk about Cocoon’s architecture, to fix bugs, to discuss new features, and to meet the other people involved in the project face to face. The day of formal talks allows for the dissemination of big ideas, best practices, and examples of Cocoon usage in various business contexts.

Any events or meetings that take place are entirely voluntary and funded through ‘at-cost’ entry fees.

Project structure: sustainability model

The key to the Apache Software Foundation’s sustainability model, and to the projects underneath the ASF umbrella, is to reduce reliance on ‘conservative’ resources (e.g. money, energy, time) whilst emphasising non-conservative resources (e.g. fun, respect, friendship, visibility) [7].

Sustainability is achieved by removing some of the traditional constraints on the lifetime of a project (the conservative resources), and focusing on attributes that have proved to be good for long-term project viability. The good health of the community is guaranteed by providing a neutral environment for developers to interact as peers. This has been seen to work not just with the Cocoon community but also with many other ASF projects.

As the ASF is an independent, non-profit organisation, companies and individuals can donate resources and be assured that those resources will be used for the benefit of the public. This set-up allows the ASF to support open, collaborative software development by supplying the hardware, communication and business infrastructure that such development requires, thereby removing some of the conservative resource constraints that typically hinder open source development. As a result, no particular developer or company has to pay to participate in or to support the Cocoon development community, so no one company has reason to claim ‘ownership’ of the project. This, in turn, supports the longevity of the project and the stability of the community.

Like all Apache projects, the Cocoon community consists of:

  • mailing lists for asynchronous communication, which is essential with users and developers spread out across the globe
  • a publicly-accessible code repository that anyone can read, and that committers can write to
  • continually evolving sources of documentation on the website, the wiki and now a Cocoon-based CMS.

The original documentation was developed as XML files stored beside the code within the project’s source code repository. This proved to be a barrier to contribution for many users, and so the project wiki was established for the development of documentation prior to moving it into the core documentation. The latest evolution of the documentation is being developed within the Daisy CMS, which supports publishing both to the website and also in book form.

There are several books available about Apache Cocoon, the most prominent being Cocoon: Building XML Applications and Cocoon Development Handbook.

Reflections and future

Like many open source projects, Cocoon is driven less by a specific roadmap and more by changes that occur as a result of individual contributions. Over the coming years, it is expected that Cocoon will continue to evolve to cater for specific Web application needs by learning from competitors’ frameworks, inheriting their best features where appropriate, and adapting to new approaches in application design.

Because contributions are voluntary, it is impossible to set firm dates for when particular releases will be made, but a loosely-defined plan for future development exists based on discussions within the community, and this encompasses the next two major releases:

Cocoon 2.x: in this series of releases the code base will be re-organised to allow for a more modular system of building, making it easier for users to contribute at a component level rather than working with one large monolithic structure. This release is seen as a stepping-stone to the next major release of Cocoon.

Cocoon 3: in this release, it is expected that Cocoon will complete the move to a fully modular Java component system, based around OSGi [8]. This version will allow for ‘hotplugging’ of components, that is, being able to select and change functionality whilst Cocoon is running.

Work will also continue on maintaining Cocoon 2.1 to ensure that critical bugs are fixed, but with no additional major functionality.

Project details

The primary source for Cocoon information is the website: http://cocoon.apache.org/

The latest stable release: Apache Cocoon 2.2

Downloads: http://cocoon.apache.org/mirror.cgi

The mailing lists: http://cocoon.apache.org/mail-lists.html

Current status

Since this case study was originally authored, Apache Cocoon has continued to be managed and developed using the structure described above. The predictions for Cocoon 3 appear to be holding, with work proceeding towards a fully modular system. In the meantime, 2.x development continues to restructure the architecture into a more modular system, giving new users an easier entry point.

Activity on the project has slowed considerably since its heyday, with many elements of the Cocoon web presence updated last in 2008. However, development continues despite the departure of a significant number of community leaders, and its mailing lists are still lively. It can therefore be argued that Cocoon validates the community model of software development as described in this document.

Acknowledgements

The sustainability study from which this case study is taken was commissioned by the JISC Learning and Teaching committee and funded from HEFCE’s IT Infrastructure funds. The Learning and Teaching committee is responsible for supporting the learning and teaching community by helping institutions to promote innovation in the use of ICT to benefit learning and teaching, research and the management of institutions.

The sustainability study was edited by Gaynor Backhouse of IntelligentContent and her editorial guidance has contributed in large part to the excellent result.


  1. http://www.jisc.ac.uk/whatwedo/programmes/programme_fair/fair_synthesisintro/fairsynthesis_biomed.aspx#software

  2. http://www.intute.ac.uk/artsandhumanities/cgi-bin/fullrecord.pl?handle=humbul3341

  3. XSL includes XSLT, a language for transforming XML documents into other XML documents.

  4. The Incubator is a formal process of helping the development of new software ideas and projects and ensuring their early development and eventual suitability for inclusion as an official Apache project: http://incubator.apache.org/.

  5. http://www.apache.org/foundation/bylaws.html

  6. Inevitably for an old project there are those who no longer have the time or energy to contribute, or who have simply moved on to other things. These people, although no longer active, have made important contributions to the community and are considered ‘emeritus’ in acknowledgement of this. Fifteen of the Cocoon committers are currently considered emeritus.

  7. http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101112466811874&w=2 [last accessed 01/02/07].

  8. http://www.osgi.org/