Sustainability lessons for research infrastructure

by Gabriel Hanganu on 9 August 2010

Introduction

Despite the availability of an impressive range of online systems and resources for UK researchers, the JISC-funded Community Engagement Report in 2009 identified a number of barriers to their adoption. These include limited opportunities for collaboration between competing partners during the bid process, lack of recognition for the open sharing of research outputs, and inadequate long-term provision of software development support. Let’s take a closer look at some of these barriers, and explore how experience from open development practice could be used to alleviate or remove them.

The technical infrastructure built to support UK researchers (also known as e-Infrastructure) consists of a combination of systems and services designed to improve cross-subject collaboration and foster new ways of conducting research (also referred to as e-Research). However, to date, UK researchers have shown some reserve in using these services and facilities in a meaningful way. The main barriers to a wider adoption are social and organisational rather than technological. We suggest that learning and applying some of the key lessons from open development practice could improve the current situation.

We approach this topic in three related documents: a general discussion and two more detailed insights, Community Lessons for Research Infrastructure and Sustainability Lessons for Research Infrastructure (this document), which both include quotes from UK researchers and IT service providers interviewed for the Community Engagement Report. In this document, we highlight a number of sustainability lessons from open development practice that could help to remove barriers to engagement with UK e-Infrastructure systems and services.

Understand software sustainability in broader terms

The sustainability of academic projects (and of their outputs, including software) is understood mainly in terms of securing further funding. However, in open development communities sustainability is understood in broader terms. Expanding beyond the initial team and harnessing contributions from all potentially interested parties are key features of open development practice. The resources required for ongoing development and support of research outputs can be provided in a variety of ways. By being equally open to contributions from other parties, academic projects can maximise their chances of becoming sustainable in the long term.

Acknowledge the prototype status of academic project software

The sustainability of research infrastructure tools and services is often mentioned as a potential source of concern for researchers. One aspect of this concern is the reliability of the technical solutions offered by academic projects with time-limited funding. This is critical to the researchers’ decision to adopt them:

‘So you’re working with products that come out of research rather than out of a software factory, and often these will have problems or they’ll be half finished or they don’t really fit together yet […] it’s difficult to decipher what the risk is before you start.’

To address this concern, it should be made clear by the projects that most of this software is not yet ready for general use, and requires significant time and skill from users to actually deploy:

‘Most academic software seems to have the lifespan of about a year before it disappears, or doesn’t get updated, or something, and we just can’t work like that.’

The initial development costs associated with building research infrastructure tools are usually covered by time-limited research grants. Therefore, their completeness and usability are limited. Additionally, their adoption by early users often generates further requirements, and the need to provide ongoing maintenance. The project simply cannot offer this:

‘It’s difficult because a research group […] has a number of post docs and PhD students and it’s not a software engineering company. So when you finally deliver something it’s going to be a prototype, it’s never going to be a finished product. But they then start depending on it, and then the problem arises, because you don’t have the funding or even the capacity to maintain [it …] whenever they want to change something, and they don’t have the money to do the same thing, because they can’t just say, oh here’s an extra amount of money just to add this feature that doesn’t exist, so it’s going to be very limited.’

Early and frequent releases are crucial for building a sustainable open source project: releases attract users, some of these users become contributors, and more contributors make the project stronger. The downside of releasing early and often is that one needs to manage user expectations. Projects need to be clear about the status of their releases and draw attention to any known bugs in the documentation.

Avoid relying fully on central development support

Teams tasked with building and maintaining research infrastructure tools traditionally work in closed environment projects. The technology or software developed is kept safely within the confines of the team, and no collaboration is envisaged or sought from outside. One may argue that there is nothing special about this, since many of the tools and resources researchers use are produced in this way. However, e-Research places a significant emphasis on sharing and collaboration. One would therefore expect it to be possible for these teams to have in place some mechanisms by which external contribution could be harnessed to the benefit of the entire community. Software produced in this way could be released under an open source licence and appropriate governance documents could be written to encourage third party contributors to further develop the product by adding features important to their needs.

Take, for instance, a typical situation described in the Community Engagement Report concerning the use of GridSAM. GridSAM is a job submission interface for distributed resource management systems, which was initially developed at the London e-Science Centre and further expanded by developers at the Open Middleware Infrastructure Institute (OMII-UK):

‘Unfortunately we found that GridSAM has some problems, and unfortunately GridSAM is no longer actively developed […]. Depending on what happens over the next 3 or 4 months we’ll probably either start coming up with […] solutions that work better for our purposes or try and fix GridSAM ourselves, I would guess. I mean [OMII has] been trying very hard, but obviously it’s very difficult because the developers are employed on something completely different, so it’s quite hard to get the support for a lot of the OMII software, although they are trying hard to improve that. Unfortunately most of the software that’s in the OMII stack isn’t actually finished, that’s the main sort of problem.’

The OMII software development model this researcher refers to is one in which early-stage software from selected research projects is ‘hardened’ and supported up to the point that research outputs are delivered. The code is released under an open source licence, so that anybody can provide contributions. In practice, however, without community mechanisms to encourage and reward external contribution, such contributions rarely materialise.

One more ‘radical’ model that can be found in open source environments outside the academic sector is the Apache Incubator. Here projects are not considered sustainable official projects of the Apache Software Foundation until there are at least three active committers from different employers.
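The criterion described above can be expressed as a simple check. The following is a minimal sketch only, not the Apache Software Foundation's actual process or tooling: the Committer record, the meets_diversity_threshold function and the sample names are all hypothetical, and it assumes that 'from different employers' means at least three distinct employers among the active committers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Committer:
    """Hypothetical record of one project committer."""
    name: str
    employer: str
    active: bool


def meets_diversity_threshold(committers, minimum=3):
    """Return True if the active committers span at least `minimum` distinct employers.

    Assumption: the threshold counts distinct employers among active
    committers, so several active committers from one employer count once.
    """
    employers = {c.employer for c in committers if c.active}
    return len(employers) >= minimum


# Illustrative data only; the names and employers are invented.
single_institution = [
    Committer("A", "University X", True),
    Committer("B", "University X", True),
    Committer("C", "University X", True),
]
mixed_community = [
    Committer("A", "University X", True),
    Committer("D", "University Y", True),
    Committer("E", "Company Z", True),
    Committer("F", "Company W", False),  # inactive, so not counted
]
```

Under these assumptions, a project staffed entirely by one research group fails the check however many developers it has, while a smaller but more diverse community passes. This is the sense in which the model guards against dependence on a single funding source.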

The fact that OMII provides a certain level of development and support for a limited period of time seems good enough for some researchers. However, as other respondents point out, it may be that these issues need to be addressed in a more fundamental way:

‘The way in which funding is provided for projects developing software may need to be more radically changed, so that they start producing self-sustainable products rather than just prototypes.’

Build and publish project sustainability and management plans

One issue that OSS Watch often hears about when consulting with research projects is that of continuation funding. Securing follow-up funding at the end of the project is perceived as a recognition of its success. Failing to do so is seen as an expression of the project’s inability to persuade funders that what they have done is valuable.

However, this is clearly not the only way in which open development teams operate. Support beyond initial funding may come in various other forms. These include user or developer time spent on the project, sponsorship from interested corporate members, or paid support for the maintenance of the product being developed. All of these ways in which the project sees itself being maintained and supported are put together in a sustainability plan. If this plan is made available online, the vision for the project becomes clearer. Its transparency and accountability also increase, and potential new members can better plan their engagement with the community.

Expanding beyond the initial team and harnessing contributions from interested participants are key features of the open development practice. If the tools and processes used by project members on a daily basis are made transparent and appealing, and the adoption path into the community is clear, welcoming and informative, new contributors will join in. They will be attracted by the prospect of being part of a potentially successful self-sustainable project. Some of these contributors may be paid by various employers to work on particular aspects of the product. Others may be attracted by the technical challenge, the prospect of acquiring kudos, or simply by the opportunity to improve or showcase their skills. All these and other accepted forms of engaging with external collaborators can be considered and specified in the project’s governance document.

Foster research collaboration from an early stage

Working in an open fashion is something that not only developers but also researchers themselves can be encouraged to learn. An important condition for success in open development projects is having something that can be run and tested as early as possible. This can mean sharing a product that is imperfect or incomplete. Learning how to work together from an early stage on imperfect products may also help researchers to become more effective members of the emergent research infrastructure communities.

Understand incentives for research collaboration

Collaboration readiness is a useful indicator of the level of trust between project participants, their motivation to work together and share a sense of collective efficacy, beyond merely pushing together to fulfil the mandate of the funders. All these are crucial for the success of online scientific collaboration. It is therefore important to understand the researchers’ incentives for collaboration. These include the pooling of skills, effort and resources likely to improve the quality of the research process. It is equally important to understand the barriers to collaboration, most of which are directly or indirectly related to funding.

‘To a certain degree research centres are in competition. So there is a certain reluctance I think in individual research centres within the same discipline sharing what they’re doing. They may well be bidding for a research project in competition with each other.’

Research is indeed competitive, and the pressure exercised by the ‘publish or perish’ demands of the Research Assessment Exercise (RAE) further affects researchers’ willingness to work collaboratively:

‘[RAE] encourages individuals to publish independently, to keep things secret while there can be many advantages to their career, no matter if they have been funded publicly or not, because by doing that they appear to be better by the criteria used for measurement of the research assessment exercise. That’s a major cultural problem, because it makes it too difficult to persuade scientists to be open with their data, they fear losing it, and therefore their current position.’

One way to counter artificial funding-based competition leading to secrecy and closed-doors authorship is to share from the early stage of a project. At this point, a researcher’s sense of ownership is still relatively low. Inviting competitors to collaborate in the open would pave the way to turning mutually exclusive competition into mutually enriching collaboration. Rival individuals and research centres could therefore be encouraged to set up strategic alliances between groups who agree to write grant proposals, share project work and publish research outputs together.

Significant work in this respect has been carried out by NESTA through its Crucible programme, in which selected early career academics were encouraged to innovate by collaborating across research disciplines. As the keynote speaker at the 2009 edition of the Crucible programme put it in an earlier post:

‘What we need more than ever now, is a diverse and vibrant research community working on a wide range of problems, and to find better communication tools so as to efficiently connect unexpected solutions to problems in different areas.’

Another lesson that can be learned from open development practice is that of enhancing the usability of project outputs. Successful open source communities know that attracting external contribution depends on the ease with which software code, documentation, project memory and other outputs can be accessed and improved by others. Optimising this process of using and re-using outputs is key to the growth and sustainability of open development projects. In a similar way, researchers could benefit by publishing their research in an appropriate online format and optimising it for increased discoverability and re-use. This way, the apparently missed opportunity of not publishing in an established printed journal could be balanced by the enhanced discoverability and higher potential for external contribution to the shared outputs.

Consider sharing research infrastructure

Open source environments appreciate professional diversity for its potential to generate innovative solutions to problems. Research teams can also learn from this practice. Sharing certain types of research data is not always possible, but research collaboration need not necessarily be understood only in terms of sharing data. In fact, clarifying the distinction between sharing research data and sharing research infrastructure is a good way of assessing the levels on which various research communities are prepared to collaborate. By encountering common problems in the early stages of using shared resources, disparate research teams have the opportunity to talk to each other. This can pave the way to previously inconceivable forms of research collaboration.

Many university departments are faced with the prospect of pooling together resources in order to reduce overlapping costs:

‘I think we may look to share facilities between departments here as a way of increasing power and keeping things locally managed. That would be my guess for the next five years at least.’

For instance, some departments find it useful to have all their high-power computing needs arranged locally in order to avoid overheads associated with using the National Grid Service (NGS). However, barriers to sharing computing power are likely to appear at both technical and administrative levels. In this case, policies and support for resource-sharing need to be provided, otherwise the initiatives of individual researchers are unlikely to succeed:

‘I looked […] to see if you could offer [a local compute resource] as part of an NGS resource, and it was too difficult for me to get that put on and go via that way to submitting jobs on to it, because the way I understood it is that we all put our machines on it and then the NGS would be a much bigger computer, so I looked to try to see whether the resources that we had we could stick on it, and actually there were various political barriers to doing that […] [Parts of the University] harbor resources basically and stick up barriers towards their open use, and I guess that’s probably true about any university, I don’t know.’

The experience of research units that share infrastructure resources, if properly documented and disseminated, can be extremely beneficial to other institutions considering similar arrangements. Research teams who openly share the problems encountered and solutions found in the early stages of using infrastructure tools can substantially improve their standing relative to more secretive teams. The national service providers themselves (the NGS in this case) will also find their feedback extremely useful, as this could help them to provide a better service.

Instead of dismissing all forms of collaboration for fear that their key research data may be used by rival teams, researchers should see the sharing of infrastructure as a way of making themselves known as helpful and caring members of their community. This could eventually pay off with extra opportunities for research partnership.

Conclusion

A number of open development sustainability lessons can help to remove barriers to engaging with research infrastructure tools and services. These include building project sustainability and management plans, avoiding full reliance on centrally provided development support, and encouraging community collaboration from an early stage. The adoption of such lessons by researchers and service providers is likely to foster a deeper sense of community across the UK research infrastructure and ultimately enhance its effectiveness.

Some of these issues have also been highlighted in an OSS Watch-commissioned study that investigated the level of awareness, attitudes to and understanding of open development within JISC Innovation communities.
