Is open source software insecure? An introduction to the issues

by Rowan Wilson and Scott Wilson on 25 June 2008

Introduction

This briefing note is intended to answer questions that those new to open source software may have about its security. We first identify the chief ways in which software can be insecure, then we discuss general approaches to mitigating software insecurity, and the final section compares closed and open source development methodologies in the light of the information from the preceding sections.

Source code and object code

By definition, open source software is software for which the source code is available to anyone. Source code can be thought of as a kind of blueprint for the software, a form that is ideal for gaining understanding of how a program works or modifying its design. A program’s source code is in many cases processed by another program called a ‘compiler’, which creates the actual file that runs on an end-user’s computer. This file is called the object code (or executable), and it is this that an end-user receives when buying traditional proprietary, closed-source software like Microsoft Word. In comparison to its source code, the object code of a program is very difficult for a human being to comprehend or modify. Thus, open source software can be said to invite and facilitate modification, while closed source software tends not to. These technical characteristics are also generally carried through into the accompanying licences: open source licences permit modification and redistribution by the user, while closed source end-user licence agreements tend to contractually bind the user to refrain from modifying or redistributing the software that they cover.
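
As a minimal illustration (the file name and compiler command here are only examples), the C source code below is the human-readable form that open source licences make available, while the executable that a compiler produces from it is the object code an end-user would typically run:

    /* hello.c - an illustrative fragment of source code.
       A compiler such as gcc turns it into object code, for example:
           gcc hello.c -o hello
       The resulting file 'hello' is what actually runs on the computer,
       and is far harder for a person to read or modify than this file. */
    #include <stdio.h>

    int main(void)
    {
        printf("Hello, world\n");
        return 0;
    }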

In what ways can software be insecure?

A piece of software will generally be written to fulfil a specific task. In order to perform this task, the software will be permitted to perform certain actions by the computer’s core software (known as the kernel), such as writing files to the hard drive or starting other programs. On most current computer systems, the program itself is responsible for ensuring that these permitted actions only take place in the intended manner, for the purposes envisaged by its design and function. However, it is notoriously difficult to write software that will always operate in the intended manner. Most software has to process data from external sources that are outside its control, such as user input at the keyboard or data received over a network. It is in these external interactions that the vast majority of security flaws are exposed.

Errors in program design

Let’s look at an example of such a security flaw: when accepting input from an external source it is customary to define how many characters will be processed, and to prepare a space or ‘buffer’ in computer memory to receive these prior to processing them. It is generally the responsibility of the programmer to check that the buffer is not over-filled (or ‘overrun’) or loaded with inappropriate or corrupt data. Where the buffer is over-filled, data can end up being written into a piece of memory that was never intended to receive it, possibly a piece of memory containing the program itself. Where such a buffer overrun does take place, the result can be complete failure of the program. When the program itself is partially overwritten by the over-large input, this can be exploited by a malicious person to replace the program’s code, and thus gain control of a running process on the computer. Similarly, data that is corrupt or malformed can compromise a program’s operation if it is not rejected, and some deliberately malformed input may, if undetected, result in a software failure that compromises the computer’s security.
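
To make this concrete, here is a minimal sketch in C (the function names and buffer size are purely illustrative): the first routine copies its input into a fixed-size buffer with no length check, while the second rejects input that would not fit.

    #include <stdio.h>
    #include <string.h>

    /* Unsafe: if 'input' holds 16 or more characters, strcpy writes past
       the end of 'buffer', corrupting adjacent memory - a buffer overrun. */
    void copy_unsafe(const char *input)
    {
        char buffer[16];
        strcpy(buffer, input);               /* no bounds check */
        printf("stored: %s\n", buffer);
    }

    /* Safer: the input's length is checked first, and over-long input is
       rejected rather than being allowed to overflow the buffer. */
    void copy_checked(const char *input)
    {
        char buffer[16];
        if (strlen(input) >= sizeof buffer) {
            fprintf(stderr, "input rejected: too long\n");
            return;
        }
        strcpy(buffer, input);               /* now known to fit */
        printf("stored: %s\n", buffer);
    }

    int main(void)
    {
        copy_unsafe("short");                /* happens to work, but only because the input is short */
        copy_checked("this input is far too long to fit in the buffer");
        return 0;
    }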

Security configuration

Even if a piece of software is not insecure as a result of an error by the initial programmers, it can often be rendered insecure by being set up incorrectly. The tailoring of a piece of software to a system where it will run, often referred to as ‘configuration’, allows great scope for error. The process of configuration will often involve the setting of a security policy for the software’s operation, defining for example which users can and cannot use it, and which other computers on a network may connect to it. Sometimes a piece of software will be distributed with a default configuration that is ‘wide open’ - in the sense that it allows anyone to access it - and inexperienced system administrators may fail to detect and refine this open-door policy. To compound the problem it can be tempting to leave a security configuration more open than strictly necessary, as this generally results in fewer complaints of rejected connections or failed authorisations from users.
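
As an illustrative sketch only (the settings and their names are hypothetical, not taken from any real product), the contrast between a ‘wide open’ shipped default and a tightened configuration might look like this:

    #include <stdio.h>
    #include <string.h>

    /* A hypothetical server's security-related settings. */
    struct server_config {
        const char *listen_address;   /* interfaces on which connections are accepted */
        const char *allowed_clients;  /* which client addresses may connect */
        int         require_auth;     /* 1 = users must authenticate, 0 = open to all */
    };

    /* Shipped default: listens everywhere, accepts any client, requires no
       authentication - convenient for first use, insecure if left unchanged. */
    static const struct server_config shipped_default = {
        "0.0.0.0", "any", 0
    };

    /* What a careful administrator might change it to: local network only,
       authentication required. */
    static const struct server_config hardened = {
        "192.168.1.10", "192.168.1.0/24", 1
    };

    int main(void)
    {
        /* A simple check an administrator (or an audit script) might run. */
        if (strcmp(shipped_default.allowed_clients, "any") == 0
            && shipped_default.require_auth == 0)
            puts("warning: configuration still allows unauthenticated access from anywhere");

        printf("hardened config allows: %s (authentication required: %d)\n",
               hardened.allowed_clients, hardened.require_auth);
        return 0;
    }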

‘Trojan’ software

While many security problems result from unintended flaws or misconfigurations, some problems are fully intended. Software can be designed to hand over control to unauthorised third parties, or to perform unadvertised and possibly damaging functions. These so-called ‘Trojan’ or ‘Backdoor’ software programs form another component of computer insecurity. Such programs can be inserted into a system in a number of ways. An authorised user may unintentionally trigger their installation via an infected email or altered install package. A malicious user might install them on purpose. Security flaws like the ones described above might permit an unauthorised user to replace legitimate software with a ‘trojanned’ replacement, thus perpetuating their access to the system.

Perfect security?

Complete security is, therefore, extremely difficult to achieve, and it is widely accepted that software security flaws are an uncomfortable fact of life, at least in the world of general-use computing. Modern software - both open and closed source - is constructed by teams of programmers, sometimes incorporating pieces of software from external sources, and much of its eventual functionality and security will be determined by its configuration by administrators on-site. Ensuring the complete competence and good faith of this chain of people presents a difficult challenge. Software security has spawned a large industry of specialists in identifying security flaws. The legitimate portion of this industry is the most visible, and comprises software security consultants who will identify flaws and work with software developers to repair them. Perhaps more worrying is the invisible portion, who identify flaws and sell information about them to anyone with money and an interest in gaining unauthorised access to the information and computing resources of others.

What approaches exist to help ensure software security?

For errors in program design, there are tools to help identify and rectify the problem. Failures to properly check input can - to a certain extent - be identified by analysing the program’s source code using software designed to flag oversights. Programmers themselves can develop suites of tools that will test the desired functionality of their software in an automated way, and build these tools into a quality assurance package. Other tools exist to expose the resulting software’s executable code to random input while measuring the effect on the software. This process, known as ‘fuzzing’, is very effective at identifying cases in which software fails. Of course, as a tool for identifying potentially exploitable software failures, it is also very useful to people seeking unauthorised access. Once installed, the configuration of software can also be checked by tools such as port-scanners, which test the availability of the software’s functionality to external entities.
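
The sketch below gives the flavour of fuzzing in C. The parse_record() function is a hypothetical stand-in for whatever routine in a real program processes external data; dedicated fuzzing tools generate far more sophisticated inputs than plain random bytes, but the principle is the same.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Hypothetical routine under test: accepts input containing an '='
       separator and rejects anything else. */
    static int parse_record(const unsigned char *data, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            if (data[i] == '=')
                return 0;    /* looks like a valid record */
        return -1;           /* malformed: reject */
    }

    int main(void)
    {
        unsigned char buf[256];
        srand((unsigned) time(NULL));

        /* Feed many random, possibly malformed inputs to the parser. A crash,
           hang or memory error during this loop (made visible by a debugger or
           a tool such as AddressSanitizer) points to an input-handling flaw. */
        for (int i = 0; i < 100000; i++) {
            size_t len = (size_t) (rand() % (int) sizeof buf);
            for (size_t j = 0; j < len; j++)
                buf[j] = (unsigned char) rand();
            parse_record(buf, len);
        }
        puts("fuzzing run completed without a crash");
        return 0;
    }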

In addition to automated oversight, human oversight is a crucial element in detecting security flaws. Inspection of source code by competent programmers can be the only way to show up some problems, such as deliberately introduced insecure functionality like backdoors. Good commenting (incorporation of topical human-language documentation into the source code itself) can greatly aid the process of human source-code checking, as it gives greater insight into what the code is intended to do. External documentation that specifies the software’s overall design and purpose is also extremely helpful.

Is open source less secure than closed source?

When we talk about ‘open vs closed’ in terms of source code, the essential difference we are identifying is one of licensing, as mentioned in the first section. However, the implications of the licensed availability (or not) of source code often extend into the type of community that surrounds a piece of software, and the kind of activities this community engages in.

While there are differences between open source and closed source software that can affect security, overall these factors tend to balance each other out. According to Ian Levy of CESG, the information assurance arm of GCHQ, “There is no significant statistical difference in vulnerability rates between free open source and proprietary software”, although the motivation to invest in security is different in each model, as are the vulnerabilities.

This is echoed in the guidance from UK government which states that “Open source, as a category, is no more or less secure than closed proprietary software”.

Availability of source code

The most direct consequence of general source code availability is precisely that anyone can read your code. Errors in your code are thus going to be more easily discoverable, both by those who wish to help your software become more secure and those who wish to exploit its weaknesses. This mass-examination is often referred to as the principle of ‘many eyes’, implying that general availability of source code will lead to greater oversight and quality control.

However, in practice many projects have no more examiners of their source code than a typical closed-source project, and attackers rarely bother inspecting source code for vulnerabilities, relying instead on tools that probe the object code. On the whole, therefore, these advantages and disadvantages tend to balance each other out.

Modifying source code

Another direct consequence of open source licensing - the granting of the right to modify the code and distribute modifications - means that anyone with sufficient skill can also fix problems with your code, and distribute either a patch (a small piece of code that rectifies the problem) or the entire modified and repaired program. Conversely, malicious coders could introduce security flaws into your software via patches or the distribution of new, compromised versions.

In the closed source model, vulnerabilities cannot be discovered by examining source code, as it is not available. This does not, of course, mean that the software has no vulnerabilities, merely that where they exist external parties must discover them through more indirect means such as fuzzing (see above). Once discovered, these flaws can only be repaired by the software’s originators, as no-one else has either the source code or the right to distribute the software or modifications to it. This latter model is often referred to as ‘security through obscurity’.

Compromised distribution

The online distribution methods used by many open source projects are another possible avenue for attack, with genuine software downloads replaced by fakes containing malicious code.

To address the issue of malicious modifications, many open source projects, foundations and communities provide digital signatures and/or hashes that can be used to verify the integrity of downloaded software. (However, if the download packages, checksums and signatures all reside on the same server, this offers little extra protection, as an attacker who compromises the server could replace them all at the same time.)
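
As a sketch of how a downloaded package’s hash might be checked (this assumes the OpenSSL EVP library is available; the file name and the published digest shown here are placeholders, not real values):

    /* verify.c - compare a file's SHA-256 digest with a published value.
       Compile with, for example: gcc verify.c -lcrypto -o verify */
    #include <stdio.h>
    #include <string.h>
    #include <openssl/evp.h>

    int main(void)
    {
        /* Placeholders: in practice these come from the project's download
           page and the file actually fetched. */
        const char *path = "package.tar.gz";
        const char *published =
            "0000000000000000000000000000000000000000000000000000000000000000";

        FILE *f = fopen(path, "rb");
        if (!f) { perror(path); return 1; }

        EVP_MD_CTX *ctx = EVP_MD_CTX_new();
        EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);

        unsigned char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, f)) > 0)
            EVP_DigestUpdate(ctx, buf, n);
        fclose(f);

        unsigned char md[EVP_MAX_MD_SIZE];
        unsigned int mdlen = 0;
        EVP_DigestFinal_ex(ctx, md, &mdlen);
        EVP_MD_CTX_free(ctx);

        /* Render the digest as lower-case hex and compare with the published value. */
        char hex[2 * EVP_MAX_MD_SIZE + 1];
        for (unsigned int i = 0; i < mdlen; i++)
            sprintf(hex + 2 * i, "%02x", md[i]);

        if (strcmp(hex, published) == 0) {
            puts("checksum matches the published value");
            return 0;
        }
        puts("checksum MISMATCH - do not install this download");
        return 1;
    }

A digital signature checked against a key obtained through a separate channel gives stronger assurance than a checksum hosted alongside the download itself, for exactly the reason noted above.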

Processes and governance

One of the most important factors related to security for organisations selecting software is not the security of the software itself, but the processes that the organisation or community behind it has for managing security incidents and risks: for example, whether it has a policy of not disclosing vulnerabilities until a patch is available for users to apply, and whether it has an effective process and a good track record in responding to security incidents.

Many open source software foundations and communities do take security seriously and have processes in place to meet this requirement; for example, the Apache Software Foundation has a dedicated Security Team and set of published policies and processes for handling vulnerabilities.

Market size

These distinctions aside, the problems of security are generally alike for closed and open source software development. The extent to which a given piece of software is targeted by potential exploiters is generally determined not by its licensing policy but by its market penetration. Microsoft, a closed source vendor, had for many years a reputation for insecurity. While there have been many high-profile vulnerabilities discovered and exploited in Microsoft products, it is extremely difficult to determine whether this is because its software was peculiarly flawed or simply because it was so widely deployed. A flaw in a popular piece of software will clearly affect more users and thus attract more attention. In addition, those who wish to exploit software to unlawfully gain information or access will target their efforts at the software that, when compromised, will give them the greatest number of potentially exploitable computers. In comparison to Microsoft, Apple, another largely closed source vendor, has a good reputation for software security, but also a far smaller proportion of the market. Is their software superior, or are they simply a smaller and therefore less worthwhile target for attack? Unfortunately there is no reliable way of knowing for sure.

In this particular context, open source software is no different from closed source software. Successful projects such as the Apache httpd server and the Linux kernel engage in rigorous security testing before each release because they are widely deployed. Large hardware vendors sell computers with these pieces of software pre-installed, and it is in their interests to contribute to and support the effort to make such software more secure. Smaller open source projects with fewer users are both less likely to have the resources to do rigorous testing, and less likely to be attacked.

Building from source code - trusting trust

While some projects also distribute compiled versions of open source software, it is quite common for users to need to compile and build software themselves in order to deploy it. This brings into play the potential vulnerability of compromised compilation and build tools (as described in a classic paper by Ken Thompson, “Reflections on Trusting Trust”). Techniques for countering this vulnerability include diverse double compilation, although for most practical purposes the solution is to take reasonable precautions to ensure that the computers used to compile and deploy software are not themselves compromised.

Conclusion

In the final analysis, the security of software depends upon the people who maintain it, both during development and during installation. When adopting closed source software, an organisation chooses to trust a single supplier and is less able to participate in the maintenance of its own security, for better or worse. When adopting open source software, it is able to select the most appropriate supplier and, where resources allow, to take an active part in its own security processes. There are many examples of highly secure uses of both open and closed source software, and we believe there is no single answer to the question of which is more secure.
