Setuid Demystified

Hao Chen† David Wagner† Drew Dean‡

†{hchen|daw}@cs.berkeley.edu. Computer Science Division, UC Berkeley
‡[email protected] SRI International

Abstract

Access control in Unix systems is based on the user ID model, yet the system calls that modify user IDs (uid-setting system calls), such as setuid, are poorly designed, insufficiently documented, and widely misunderstood and misused. This has caused many security vulnerabilities in application programs. We propose to make progress on the setuid mystery through two approaches. First, we study kernel sources and compare the semantics of the uid-setting system calls in three major Unix systems: Solaris, FreeBSD, and Linux. Second, we formalize the user ID model as a Finite State Automaton (FSA) and develop new techniques for automatic construction of such models. We use the resulting FSA to uncover pitfalls in the Unix API, identify differences in the semantics of uid-setting system calls among various Unix systems, and check their proper usage in programs automatically. Finally, we provide general guidelines on the proper usage of the uid-setting system calls, and we propose a high-level API that is more comprehensible, usable, and portable than the usual Unix API.

1 Introduction

In Unix systems, access control is based on the user ID model. In this model, each process has a set of user IDs and group IDs which determine which system resources, such as files and network ports, the process can access.¹ Certain privileged user IDs and group IDs allow a process to access restricted system resources. In particular, user ID zero, reserved for the superuser root, allows a process to access all system resources.

¹ In many Unix systems, a process also has a set of supplementary group IDs, which are not closely related to the topic of this paper and which will not be discussed.

In some applications, a user process needs extra privileges, such as permission to read the password file. By the principle of least privilege, the process should drop its privileges as soon as possible to minimize risk to the system should it be compromised and execute malicious code. Unix systems offer a set of system calls, called the uid-setting system calls, for a process to raise and drop privileges. Such a process is called a setuid process. Unfortunately, for historical reasons, the uid-setting system calls are poorly designed, insufficiently documented, and widely misunderstood. “Many years after the inception of setuid programs, how to write them is still not well understood by the majority of people who write them” [1]. In short, the Unix setuid model is mysterious, and the resulting confusion has caused many security vulnerabilities.

We approach the setuid mystery as follows. First, we study the semantics of the uid-setting system calls by reading kernel sources. We document the semantics and compare and contrast them among different Unix systems. This is useful in writing setuid programs. In doing so, we found that manual inspection is tedious and error-prone. This motivates our second contribution: we construct a formal model to capture the behavior of the operating system and use it to guide our analysis. We will describe a new technique for building this formal model in an automated way. We have used the resulting formal model to document the semantics of the setuid system calls more accurately, to uncover pitfalls in the Unix API, to identify differences in the semantics of uid-setting system calls among various Unix systems, and to check their proper usage in programs automatically.

Formal methods have gained a reputation as being impractical, so it may be surprising that we found formal methods so useful in our effort, but we will show how our formal model enables many tasks that would otherwise be too error-prone or laborious to undertake. This is one of the first successful applications of formal methods that we know of to a real-world, mature, legacy system of considerable interest, and we feel this case study demonstrates that formal methods can be quite practical when they are used appropriately.

This paper is organized as follows. Section 2 discusses related work. Section 3 provides background on the user ID model. Section 4 reviews the evolution of the uid-setting system calls. Section 5 compares and contrasts the semantics of the uid-setting system calls in three major Unix systems. Section 6 describes the formal user ID model and its applications. Section 7 analyzes two security vulnerabilities caused by misuse of the uid-setting system calls. Section 8 provides guidelines on the proper usage of the uid-setting system calls and proposes a high-level API to the user ID model. The appendix contains documentation of the uid-setting system calls in Solaris, FreeBSD, and Linux.

2 Related Work

Manual pages in Unix systems are the primary source of information on the user ID model for most programmers. But unfortunately, they are often incomplete or even wrong (Section 6.3.1). Many books on Unix programming also describe the user ID model, such as [2], but often they are specific to one Unix system or release, are outdated, or lack important details. Bishop discussed security vulnerabilities in setuid programs in [3]. His focus is on potential vulnerabilities that a process has once it gains privilege, while our focus is on how to gain and drop privilege confidently and securely. Unix systems have evolved and diversified a great deal since Bishop’s work in 1987, and a big problem today is how to port setuid programs securely to various Unix systems.

3 User ID Model

This section provides background on the user ID model. Each user in a Unix system has a unique user ID. The user ID determines which system resources the user can access. In particular, user ID zero is reserved for the superuser root who can access all resources. A process has three user IDs: the real user ID (real uid, or ruid), the effective user ID (effective uid, or euid), and the saved user ID (saved uid, or suid). The real uid identifies the owner of the process, the effective uid is used in most access control decisions, and the saved uid stores a previous user ID so that it can be restored later. Similarly, a process has three group IDs: the real group ID, the effective group ID, and the saved group ID. In most cases, the properties of the group IDs parallel the properties of their user ID counterparts. For simplicity, we will focus on the user IDs and will mention the group IDs only when there is confusion or pitfalls.

When a process is created by fork, it inherits the three user IDs from its parent process. When a process executes a new file by exec..., it keeps its three user IDs unless the set-user-ID bit of the new file is set, in which case the effective uid and saved uid are assigned the user ID of the owner of the new file.

Since access control is based on the effective user ID, a process gains privilege by assigning a privileged user ID to its effective uid, and drops privilege by removing the privileged user ID from its effective uid. Privilege may be dropped either temporarily or permanently.

• To drop privilege temporarily, a process removes the privileged user ID from its effective uid but stores it in its saved uid. Later, the process may restore privilege by restoring the privileged user ID in its effective uid.
• To drop privilege permanently, a process removes the privileged user ID from all three user IDs. Thereafter, the process can never restore privilege.
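The two ways of dropping privilege can be sketched as operations on the triple (ruid, euid, suid). This is a minimal illustration in Python and does not call any real system interface. The names Uids, drop_temporarily, restore, and drop_permanently are hypothetical, and the permission checks that a real kernel applies are deliberately left out.

```python
from typing import NamedTuple


class Uids(NamedTuple):
    ruid: int  # real uid: identifies the owner of the process
    euid: int  # effective uid: used in most access control decisions
    suid: int  # saved uid: remembers an ID so it can be restored later


def drop_temporarily(ids: Uids, unpriv: int) -> Uids:
    """Park the privileged euid in suid and run with an unprivileged euid."""
    return Uids(ids.ruid, unpriv, ids.euid)


def restore(ids: Uids) -> Uids:
    """Copy the privileged ID back from suid into euid."""
    return Uids(ids.ruid, ids.suid, ids.suid)


def drop_permanently(ids: Uids, unpriv: int) -> Uids:
    """Remove the privileged ID from all three user IDs."""
    return Uids(unpriv, unpriv, unpriv)


# Hypothetical starting point: a set-user-ID root program run by user 100.
start = Uids(ruid=100, euid=0, suid=0)
temp = drop_temporarily(start, 100)
perm = drop_permanently(start, 100)
```

After a temporary drop the privileged ID still sits in the saved uid, which is what makes restoring possible; after a permanent drop no field holds it any longer.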

4 History

4.1 Early Unix

In early Unix systems, a process had two user IDs: the real uid and the effective uid. Only one system call, setuid, modified them according to the following rule: if the effective uid was zero, setuid set both the real uid and effective uid; otherwise, setuid could only set the effective uid to the real uid [1]. This model had the problem that a process could not temporarily drop the root privilege in its effective uid and restore it later. As Unix diverged into System V and BSD, each system solved the problem in a different way.

4.2 BSD and System V

4.2 BSD kept the real uid and effective uid but changed the system call from setuid to setreuid. Processes could then directly control both their user IDs, under the following rules:

• If the effective uid was zero, then the real uid and effective uid could be set to any user ID.
• Otherwise, either the real uid or the effective uid could be set to the value of the other one.

Therefore, the setreuid system call enabled a process to swap the real uid and effective uid.

In contrast, System V added a new user ID called the saved uid to each process. Also added was a new system call, seteuid, whose rules were:

• If the effective uid was zero, seteuid could set the effective uid to any user ID.
• Otherwise, seteuid could set the effective uid to only the real uid or saved uid.

seteuid did not change the real uid or saved uid. Furthermore, System V modified setuid so that if the effective uid was not zero, setuid functioned as seteuid (changing only the effective uid); otherwise, setuid set all three user IDs.

Later, the POSIX standard codified a new specification for the setuid call. In an attempt to be POSIX compliant, 4.4 BSD replaced 4.2 BSD’s old setreuid model with the POSIX/System V style saved uid model. It modified setuid so that setuid set all three user IDs regardless of whether the effective uid of a process was zero, therefore allowing any process to permanently drop privileges. The seteuid system call was also added.

As System V and BSD influenced each other, both systems implemented setuid, seteuid, and setreuid, although with different semantics. None of these system calls, however, allowed the direct manipulation of the saved uid (although it could be modified indirectly through setuid and setreuid). Therefore, some modern Unix systems introduced a new call, setresuid, to allow the modification of each of the three user IDs directly.
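The historical rules above are compact enough to write down as pure functions, which makes it easy to see why early Unix could not drop root temporarily while 4.2 BSD and System V could. The following Python sketch is one reading of the prose rules and is not derived from any kernel source. All function names are hypothetical, and a return value of None stands for a call that is not permitted.

```python
from typing import Optional, Tuple

Pair = Tuple[int, int]         # (ruid, euid)
Triple = Tuple[int, int, int]  # (ruid, euid, suid)


def early_setuid(state: Pair, new: int) -> Optional[Pair]:
    """Early Unix: root sets both IDs; others may only set euid to ruid."""
    r, e = state
    if e == 0:
        return (new, new)
    if new == r:
        return (r, r)
    return None


def bsd42_setreuid(state: Pair, new_r: int, new_e: int) -> Optional[Pair]:
    """4.2 BSD: root sets anything; others may only reuse ruid or euid."""
    r, e = state
    if e == 0 or (new_r in (r, e) and new_e in (r, e)):
        return (new_r, new_e)
    return None


def sysv_seteuid(state: Triple, new: int) -> Optional[Triple]:
    """System V: euid may become ruid or suid (or anything, for root)."""
    r, e, s = state
    if e == 0 or new in (r, s):
        return (r, new, s)
    return None


def sysv_setuid(state: Triple, new: int) -> Optional[Triple]:
    """System V: root sets all three IDs; otherwise behaves like seteuid."""
    r, e, s = state
    if e == 0:
        return (new, new, new)
    return sysv_seteuid(state, new)


# A root-privileged process owned by user 100 tries to drop and regain root.
early_dropped = early_setuid((100, 0), 100)
early_regained = early_setuid(early_dropped, 0)

bsd_dropped = bsd42_setreuid((100, 0), 0, 100)   # swap ruid and euid
bsd_regained = bsd42_setreuid(bsd_dropped, 100, 0)

sysv_dropped = sysv_seteuid((100, 0, 0), 100)    # root stays in suid
sysv_regained = sysv_seteuid(sysv_dropped, 0)
```

In this sketch the early Unix process cannot get root back once it has dropped it, whereas the BSD process regains it by swapping again and the System V process regains it from the saved uid.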

5 Complexity of Uid-setting System Calls

A process modifies its user IDs by the uid-setting system calls: setuid, seteuid, setreuid, and in some systems, setresuid. Each of the system calls involves two steps. First, it checks if the process has permission for the system call. If so, it then modifies user IDs according to its rules.

In this section, we compare and contrast the semantics of uid-setting system calls among Solaris 8 [4], FreeBSD 4.4 [5], and Linux 2.4.16 [6]. For the complete documentation, see the appendix. The behavior of the uid-setting system calls was discovered by a combination of manual inspection of kernel source code and formal methods. We will defer discussion of the latter until Section 6.

The POSIX Specification

To understand the semantics of the uid-setting system calls, we begin with the POSIX standard which has influenced the design of the system calls in many systems. In particular, the behavior of setuid(newuid) is defined by the POSIX specification. See Figure 1 for the relevant text. The POSIX standard refers repeatedly to the term “appropriate privileges”, which is defined in Section 2.3 of POSIX 1003.1-1988 as:

“An implementation-defined means of associating privileges with a process with regard to the function calls and function call options defined in this standard that need special privileges. There may be zero or more such means.”

Essentially, the term “appropriate privileges” serves as a wildcard that allows compliant operating systems to use any policy whatsoever for deeming when a call to setuid should be allowed. The conditional flag {_POSIX_SAVED_IDS} parametrizes the specification, allowing POSIX-compatible operating systems to use either of two schemes (as described in Figure 1). We will see how different interpretations of the term “appropriate privileges” have led to considerable differences in the behavior of the uid-setting system calls between operating systems.

5.1 Operating System-Specific Differences

Much of the confusion is caused by different interpretations of “appropriate privileges” among Unix systems.

“If {_POSIX_SAVED_IDS} is defined:

  1. If the process has appropriate privileges, the setuid() function sets the real user ID, effective user ID, and the saved set-user-ID to newuid.
  2. If the process does not have appropriate privileges, but newuid is equal to the real user ID or the [saved user ID], the setuid() function sets the effective user ID to newuid; the real user ID and [saved user ID] remain unchanged by this function call.

Otherwise:

  1. If the process has appropriate privileges, the setuid() function sets the real user ID and effective user ID to newuid.
  2. If the process does not have appropriate privileges, but newuid is equal to the real user ID, the setuid() function sets the effective user ID to newuid; the real user ID remains unchanged by this function call.”

(POSIX 1003.1-1988, Section 4.2.2.2)

Figure 1: An excerpt from the POSIX specification [7] covering the behavior of the setuid call.

Solaris In Solaris 8, a System V based system, a process is considered to have “appropriate privileges” if its effective uid is zero (root). Also, Solaris defines {_POSIX_SAVED_IDS}. Consequently, calling setuid(newuid) sets all three user IDs to newuid if the effective uid is zero, but otherwise sets only the effective uid to newuid (if the setuid call is permitted).

FreeBSD FreeBSD 4.4 interprets “appropriate privileges” differently, as noted in Appendix B4.2.2 of POSIX:

  The behavior of 4.2BSD and 4.3BSD that allows setting the real ID to the effective ID is viewed as a value-dependent special case of appropriate privilege.

This means that a process is deemed to have “appropriate privileges” when it calls setuid(newuid) with newuid=geteuid(), in addition to when its effective uid is zero. Also in contrast to Solaris, FreeBSD does not define {_POSIX_SAVED_IDS}, although every FreeBSD process does have a saved uid. Therefore, by calling setuid(newuid), a process sets both its real uid and effective uid to newuid if the system call is permitted, in agreement with POSIX. FreeBSD also sets the saved uid in all permitted setuid calls.
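The two POSIX schemes in Figure 1 can be read as a single function with two switches: whether the process has “appropriate privileges”, and whether {_POSIX_SAVED_IDS} is defined. The Python sketch below transcribes only the quoted text. The name posix_setuid and its parameters are hypothetical, the caller supplies the implementation-defined privilege decision as a boolean, and None stands for a call that is not permitted. It does not model FreeBSD's extra step of also setting the saved uid.

```python
from typing import Optional, Tuple

Triple = Tuple[int, int, int]  # (ruid, euid, suid)


def posix_setuid(state: Triple, newuid: int,
                 appropriate: bool, saved_ids: bool) -> Optional[Triple]:
    """setuid(newuid) as described in the Figure 1 excerpt."""
    r, e, s = state
    if appropriate:
        # Case 1: the saved ID is only mentioned when _POSIX_SAVED_IDS is defined.
        return (newuid, newuid, newuid if saved_ids else s)
    # Case 2: the saved ID is an acceptable target only under _POSIX_SAVED_IDS.
    allowed = (r, s) if saved_ids else (r,)
    if newuid in allowed:
        return (r, newuid, s)
    return None


state = (100, 200, 300)
with_saved_ids = posix_setuid(state, 300, appropriate=False, saved_ids=True)
without_saved_ids = posix_setuid(state, 300, appropriate=False, saved_ids=False)
privileged = posix_setuid((100, 0, 0), 100, appropriate=True, saved_ids=True)
```

The same unprivileged call, setuid(300), is allowed under one scheme and refused under the other, which is why the value of the flag matters to portable programs.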

Linux Linux introduces a capability² model for finer-grained control of privileges. Instead of a single level of privilege determined by the effective uid (i.e., root or non-root), there are a number of capability bits, each of which is used to determine access control to certain resources.³ One of them, the SETUID capability, carries the POSIX “appropriate privileges”. To make the new capability model compatible with the traditional user ID model where “appropriate privileges” are carried by a zero effective uid, Linux keeps the SETUID capability consistent with the effective uid (zero or non-zero) during all uid-setting system calls. Whenever the effective uid becomes zero, the SETUID capability is set; whenever the effective uid becomes non-zero, the SETUID capability is cleared. However, the SETUID capability can be modified outside the uid-setting system calls. A process can clear its SETUID capability, and a process with the SETPCAP capability can remove the SETUID capability of other processes (but note that in Linux 2.4.16, no process has or can acquire the SETPCAP capability, a change that was made to close a security hole; see Section 7.1 for details). Therefore, explicitly setting or clearing the SETUID capability changes the properties of uid-setting system calls.

5.2 Comparison among Uid-setting System Calls

Next we compare and contrast the uid-setting system calls and point out several unexpected properties.

setresuid() setresuid has the clearest semantics among the four uid-setting system calls. As each of the real uid, effective uid, and saved uid is set directly, the programmer knows clearly what to expect after the call. Moreover, the setresuid() system call is guaranteed to have an all-or-nothing effect: if it succeeds, all uids are changed, and if it fails, none are; it will not fail after having changed some but not all of the uids. Note that while FreeBSD and Linux offer setresuid, Solaris does not. Therefore, a Solaris process cannot set the three user IDs to arbitrary values since no system call can set the saved uid directly. For example, we used the formal model developed in Section 6 to verify that no Solaris process can ever set both the real uid and saved uid to non-zero and the effective uid to zero.

² Beware: the word “capability” is a bit of a misnomer. In this context, it refers to special privileges that a process can possess, and not to the usual meaning from the security literature of an unforgeable reference. Regrettably, the former usage is standard in the Linux kernel, and so we follow their convention in this paper.

³ More accurately, a Linux process has three sets of capabilities, but only the set of effective capabilities determines access control. All references to capabilities in this paper refer to the effective capabilities.

seteuid() seteuid also has clear semantics. It sets the effective uid while leaving the real uid and saved uid unchanged. However, there is a slight difference in the permission required by seteuid among Unix systems. While Solaris and Linux allow the parameter neweuid to be equal to any of the three user IDs, FreeBSD only allows neweuid to be equal to either the real uid or saved uid; in FreeBSD, the effective uid is not used in the decision. As a surprising result, seteuid(geteuid()), which a programmer might intuitively expect to be always permitted, can fail in FreeBSD, e.g., when ruid=100, euid=200, and suid=100.
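The seteuid(geteuid()) surprise is easy to reproduce with a small model of the two permission checks. This Python sketch follows the prose above for unprivileged processes. It additionally assumes that a process with a zero effective uid may set any effective uid, as in the historical System V rule, and both function names are hypothetical. None stands for a call that is not permitted.

```python
from typing import Optional, Tuple

Triple = Tuple[int, int, int]  # (ruid, euid, suid)


def seteuid_solaris_linux(state: Triple, new: int) -> Optional[Triple]:
    """Unprivileged callers may pick any of their three current user IDs."""
    r, e, s = state
    if e == 0 or new in (r, e, s):   # e == 0 branch is an assumption
        return (r, new, s)
    return None


def seteuid_freebsd(state: Triple, new: int) -> Optional[Triple]:
    """Unprivileged callers may pick only the real uid or the saved uid."""
    r, e, s = state
    if e == 0 or new in (r, s):      # e == 0 branch is an assumption
        return (r, new, s)
    return None


state = (100, 200, 100)              # ruid=100, euid=200, suid=100
current_euid = state[1]              # plays the role of geteuid()
on_solaris_linux = seteuid_solaris_linux(state, current_euid)
on_freebsd = seteuid_freebsd(state, current_euid)
```

In the modelled FreeBSD check the current effective uid never enters the decision, so asking for the ID the process already has is refused.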

setreuid() The semantics of setreuid are confusing. It modifies the real uid and effective uid, and in some cases, the saved uid. The rule by which the saved uid is modified is complicated. (See the appendix for details.) Furthermore, the permission required for setreuid differs among the three operating systems. In Solaris and Linux, a process can always swap the real uid and effective uid by calling setreuid(geteuid(), getuid()). In FreeBSD, however, setreuid(geteuid(), getuid()) sometimes fails, e.g., when ruid=100, euid=200, and suid=100.

setuid() Although setuid is the only uid-setting system call standardized in POSIX 1003.1-1988, it is also the most confusing one. First, the required permission differs among Unix systems. Both Linux and Solaris require the parameter newuid to be equal to either the real uid or saved uid if the effective uid is not zero. As a surprising result, setuid(geteuid()), which a programmer might reasonably expect to be always permitted, can fail in some cases, e.g., when ruid=100, euid=200, and suid=100. In contrast, setuid(geteuid()) always succeeds in FreeBSD. Second, the action of setuid differs not only among different operating systems but also between privileged and unprivileged processes. In Solaris and Linux, if the effective uid is zero, setuid(newuid) sets all three user IDs to newuid; otherwise, it sets only the effective user ID to newuid. On the other hand, in FreeBSD setuid(newuid) sets all three user IDs to newuid regardless of the effective uid, if the setuid call is permitted.
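Both differences, in the permission check and in the action taken, show up when the rules are written side by side. The following Python sketch is a simplified reading of this section and is not the kernel logic. The function names are hypothetical, Linux capabilities are ignored, and the FreeBSD check is assumed to accept newuid equal to the real uid in addition to the cases named in the text. None stands for a call that is not permitted.

```python
from typing import Optional, Tuple

Triple = Tuple[int, int, int]  # (ruid, euid, suid)


def setuid_solaris_linux(state: Triple, new: int) -> Optional[Triple]:
    """Root sets all three IDs; others set only euid, to ruid or suid."""
    r, e, s = state
    if e == 0:
        return (new, new, new)
    if new in (r, s):
        return (r, new, s)
    return None


def setuid_freebsd(state: Triple, new: int) -> Optional[Triple]:
    """Any permitted call sets all three IDs."""
    r, e, s = state
    if e == 0 or new in (r, e):      # new == r is an assumption (see lead-in)
        return (new, new, new)
    return None


state = (100, 200, 100)              # ruid=100, euid=200, suid=100
on_solaris_linux = setuid_solaris_linux(state, state[1])   # setuid(geteuid())
on_freebsd = setuid_freebsd(state, state[1])

# Unprivileged setuid(ruid) from a state that still holds root in suid.
partial = setuid_solaris_linux((100, 200, 0), 100)
```

The last line illustrates why the action matters as much as the permission check: in this model an unprivileged setuid on Solaris or Linux changes only the effective uid, so a privileged saved uid survives the call.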

setgid() and relatives There is also a set of calls for manipulating group IDs, namely, setgid, setegid, setregid, and setresgid. They behave much like their setuid counterparts, with only one minor exception (the permission check in setregid differs slightly from setreuid in Solaris). However, the “appropriate privileges” are always carried by the euid in both setuid-like and setgid-like calls. Thus, an effective group ID of zero does not accord any special privileges to change groups. This seems to be a common misconception: it is tempting to assume incorrectly that since “appropriate privileges” are carried by the euid in the setuid-like calls, they will be carried by the egid in the setgid-like calls, but this is not how it actually works. This misconception caused a mistake in the manual page of setgid in Linux 2.4.16 (Section 6.3.1).

6 Formal Models

We initially began developing the summary in the previous section by manually reading operating system source code. Although reading kernel sources is a natural method to study the semantics of the uid-setting system calls, it has many serious limitations. First, it is a laborious task, especially when various Unix systems implement the system calls differently. Second, since our findings are based on current kernel sources, they may become invalid should the implementation change in the future. Third, we cannot prove that our findings are correct and that we have not misunderstood kernel sources. Finally, informal specifications are not well-suited to programmatic use, such as automated verification of properties of the operating system or use in static analysis of application programs to check proper usage of uid-setting system calls. These problems with manual source code analysis motivate the need for more principled methods for building a formal model of the uid-setting system calls.

6.1 Building a Formal Model

Our model of the uid-setting system calls is based on finite state automata. The operating system maintains per-process state (e.g., the real, effective, and saved uids) to track privilege levels, and thus it is natural to view the operating system as implementing a finite state automaton (FSA). A state of the FSA contains all relevant information about the process, e.g., the three uids. Each uid-setting system call leads to a number of possible transitions; we label each transition with the system call that it comes from.

We construct the FSA in two steps: (1) determine its states by reading kernel sources; (2) determine its transitions by simulation. In the first step, we determine the states in the FSA by identifying kernel variables that affect the behavior of the uid-setting system calls. For example, if only the real uid, effective uid, and saved uid can affect the uid-setting system calls, then each state of the FSA is of the form (r, e, s), representing the values of the real, effective, and saved user IDs, respectively. This is a natural approach. However, the problem one immediately faces is that the resulting FSA is much too large: in Linux, uids are 32-bit values, and so there are $(2^{32})^3 = 2^{96}$ possible states. Obviously, manipulating an FSA of such size is infeasible. Therefore, we need to somehow abstract away inessential details and reduce the size of the FSA dramatically.

Fortunately, we can note that there is a lot of structure present. If we have a non-root user ID, the behavior of the operating system is essentially independent of the actual value of this user ID, and depends only on the fact that it is non-zero. For example, the states (ruid, euid, suid) = (100, 100, 100) and (200, 200, 200) are isomorphic up to a substitution of the value 100 by the value 200, since the OS will behave similarly in both cases (e.g., setuid(0) will fail in both cases). In general, we consider two states equivalent when each can be mutated into the other by a consistent substitution on non-root user IDs. By identifying equivalent states, we can shrink the size of the FSA dramatically.

Now that we know that there must exist some reasonable FSA model, the next problem is how to compute it. Here we suggest using simulation: if we simulate the presence of a pseudo-application that tries every possible system call and we observe the state transitions performed by the operating system in response to these system calls, we can infer how the operating system will behave when invoked by real applications. Once we identify equivalent states, the statespace will be small enough that we can exhaustively explore the entire statespace of the operating system. This idea is made concrete in Figure 2, where we give an algorithm to construct an FSA model using these techniques.

GETSTATE():
  1. Call getresuid(&r, &e, &s).
  2. Return (r, e, s).

SETSTATE(r, e, s):
  1. Call setresuid(r, e, s).

GETALLSTATES():
  1. Pick six arbitrary non-zero uids u1, ..., u6.
  2. Let U := {0, u1, ..., u6}.
  3. Let S := {(r, e, s) : r, e, s ∈ U}.
  4. Let C := {setuid(x), setreuid(x, y), setresuid(x, y, z), ... : x, y, z ∈ U ∪ {−1}}.
  5. Return (S, C).

BUILDMODEL():
  1. Let (S, C) := GETALLSTATES().
  2. Create an empty FSA with statespace S.
  3. For each s ∈ S, do:
  4.   For each c ∈ C, do:
  5.     Fork a child process, and within the child, do:
  6.       Call SETSTATE(s), and then invoke c.
  7.       Finally, let s' := GETSTATE(), pass s' to the parent process, and exit.
  8.     Add the transition s --c--> s' to the FSA.
  9. Return the newly-constructed FSA as the model.

Figure 2: The model-extraction algorithm.
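The shape of the algorithm in Figure 2 can be tried out without root access by swapping the real kernel for a stand-in function. The Python sketch below does exactly that, so it illustrates the enumeration and is not the authors' simulator. The function kernel_setuid is a hypothetical stand-in that follows the Solaris/Linux setuid rule from Section 5.2, only setuid calls are enumerated (so the −1 argument is not needed), and a failed call is modelled as leaving the state unchanged. The helper canonical is also hypothetical: it renames non-root uids in order of first appearance, one way to pick a representative of each equivalence class.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int, int]   # (ruid, euid, suid)
Call = Tuple[str, int]         # e.g. ("setuid", 0)


def kernel_setuid(state: State, new: int) -> State:
    """Stand-in for the OS (assumption): Solaris/Linux-style setuid."""
    r, e, s = state
    if e == 0:
        return (new, new, new)
    if new in (r, s):
        return (r, new, s)
    return state               # call fails, uids unchanged


def get_all_states() -> Tuple[List[State], List[Call]]:
    """GETALLSTATES(): zero plus six arbitrary non-zero uids."""
    uids = (0, 1, 2, 3, 4, 5, 6)
    states = list(product(uids, repeat=3))
    calls = [("setuid", x) for x in uids]
    return states, calls


def invoke(state: State, call: Call) -> State:
    """Replaces fork + SETSTATE + system call + GETSTATE of Figure 2."""
    name, arg = call
    if name != "setuid":
        raise ValueError(name)
    return kernel_setuid(state, arg)


def build_model(run: Callable[[State, Call], State]) -> Dict[Tuple[State, Call], State]:
    """BUILDMODEL(): record one transition per (state, call) pair."""
    states, calls = get_all_states()
    return {(s, c): run(s, c) for s in states for c in calls}


def canonical(state: State) -> State:
    """Rename non-root uids to 1, 2, 3 in order of first appearance."""
    names = {0: 0}
    return tuple(names.setdefault(u, len(names)) for u in state)


fsa = build_model(invoke)
classes = {canonical(s) for s in get_all_states()[0]}
```

With seven uids the sketch explores 343 concrete states and 2401 transitions, and under the consistent-substitution equivalence those states collapse into 15 classes.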

Implementation Our implementation follows Figure 2 closely. (Note that the simulator must run as root.) In practice, we extend this basic algorithm with several optimizations and extensions.

One simple optimization is to use a depth-first search to explore only the reachable states. In our case, the statespace is small enough that the improvement is probably unimportant. A more dangerous optimization might be to emulate the behavior of the operating system from user level by cutting and pasting the source code of the setuid system calls from the kernel into our simulation engine. This would speed up model construction, but the performance improvement comes at a severe price: it is hard to be sure that our emulation of the OS is completely faithful. In any case, our unoptimized implementation already takes only a few seconds to generate the model. For these reasons, we do not apply this optimization in our implementation.

Figure 3: Three finite state automata describing the setuid system call: (a) FreeBSD 4.4, (b) Solaris 8, and (c) Linux 2.4.16. Ellipses represent states of the FSA, where a notation like “R=1,E=0,S=1” indicates that euid = 0 and ruid = suid ≠ 0. Each transition is labelled with the system call it corresponds to, setuid(0) or setuid(1). To avoid cluttering the diagram, we show only the case of the setuid call, and we omit the error states, group ID fields, and (in Linux) the capability bits that otherwise would appear in our deduced model.

On Solaris, there is no getresuid system call; we emulate it by reading from the /proc filesystem. More problematic is that there is also no setresuid system call, so we must emulate it using the other system calls. This takes some work, for not all states can be reached in a single system call, but we use various ad-hoc heuristics to accomplish the desired sequence of transitions. Fortunately, it is easy to be sure that our emulation of setresuid is correct: we simply check afterwards that all three uids have been set to the desired values correctly by calling getuid, geteuid, and so on. Only one issue remains: some states (e.g., ruid = suid ≠ 0, euid = 0) cannot be reached by any sequence of system calls, and hence we cannot emulate setresuid in our simulator for these cases. Fortunately, since these states will be unreachable by normal applications as well, we can simply remove these unreachable states from the FSA and the basic algorithm will work correctly.

On Linux, we also model the SETUID capability bit by adding a fourth dimension to the state tuple. Thus, states are of the form (r, e, s, b) where the bit b is true whenever the SETUID capability is enabled. This allows us to accurately model the case where an application explicitly clears or sets its SETUID capability bit; though we are not aware of any real application that does this, if we ever do encounter such an application our model will still remain valid.

On all operating systems, we extend our model further to deal with system calls that fail. It is sometimes useful to be able to reason about whether a system call has succeeded or failed, and one way is to add a bit to the state denoting whether the previous system call returned successfully or not.
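To illustrate the two extensions just described, the sketch below models a state (r, e, s, b) together with a success flag for the most recent call. It is a simplified reading of Section 5.1 and not the Linux kernel logic. It assumes that the SETUID bit alone decides whether setuid has “appropriate privileges”, and that the bit is updated only when the effective uid crosses between zero and non-zero. The names linux_setuid and clear_setuid_cap are hypothetical.

```python
from typing import Tuple

State = Tuple[int, int, int, bool]   # (ruid, euid, suid, SETUID capability bit)


def linux_setuid(state: State, new: int) -> Tuple[State, bool]:
    """Return (next_state, ok), where ok records whether the call succeeded."""
    r, e, s, cap = state
    if cap:                          # the bit carries "appropriate privileges"
        r2, e2, s2 = new, new, new
    elif new in (r, s):              # unprivileged: only the effective uid moves
        r2, e2, s2 = r, new, s
    else:
        return state, False          # failed call leaves the state untouched
    # Keep the bit consistent with the euid when it crosses zero/non-zero.
    if e2 == 0 and e != 0:
        cap = True
    elif e2 != 0 and e == 0:
        cap = False
    return (r2, e2, s2, cap), True


def clear_setuid_cap(state: State) -> State:
    """Model a process explicitly clearing its SETUID capability."""
    r, e, s, _ = state
    return (r, e, s, False)


normal = (100, 0, 0, True)           # set-user-ID root program run by user 100
after_normal, ok_normal = linux_setuid(normal, 100)

stripped = clear_setuid_cap(normal)  # same process with the bit cleared
after_stripped, ok_stripped = linux_setuid(stripped, 100)
```

In this model both calls report success, but with the bit cleared the saved uid is still zero afterwards, so the success flag alone does not tell the caller that privilege was fully dropped. This is the kind of behavior that a model carrying the capability bit can expose.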

Also, on all operating systems we extend our model to include group IDs. This adds three additional dimensions to the state: real gid, effective gid, and saved gid. In this way, we can model the semantics of the gid-setting system calls. On Linux, we also add a bit to indicate whether the SETGID capability is enabled or not.

Another observation is that most applications manipulate at most one non-root user ID at a time. For instance, a state like (100, 200, 100) will never appear in such an application. In this case, we can simplify the model further by recording for each component of the state only whether the corresponding user ID is zero or non-zero. Thus, each state in the simplified FSA has three bits, each representing whether the real uid, effective uid, or saved uid is root or not. All together there are eight states in the FSA, and the GETALLSTATES() subroutine in Figure 2 is modified accordingly. In Figure 3 we show graphically the models one obtains in this way for the setuid call on FreeBSD, Solaris, and Linux.
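The eight-state simplification can be sketched by letting 0 stand for root and 1 for the single non-root uid, and then tabulating setuid(0) and setuid(1) from every state. The Python code below does this using the setuid rules as summarized in Section 5.2. It is an illustration built from the prose, not the extracted model behind Figure 3. The function names are hypothetical, the FreeBSD check is assumed to accept newuid equal to the real uid as well, and failed calls are shown as self-transitions because error states are omitted.

```python
from itertools import product
from typing import Callable, Dict, Tuple

State = Tuple[int, int, int]   # (R, E, S) with 0 = root, 1 = non-root


def abstract(state: Tuple[int, int, int]) -> State:
    """Map concrete uids to bits: 0 for root, 1 for any non-root uid."""
    return tuple(int(u != 0) for u in state)


def setuid_solaris_linux(state: State, new: int) -> State:
    r, e, s = state
    if e == 0:
        return (new, new, new)
    if new in (r, s):
        return (r, new, s)
    return state               # failed call: state unchanged


def setuid_freebsd(state: State, new: int) -> State:
    r, e, s = state
    if e == 0 or new in (r, e):   # new == r is an assumption (see lead-in)
        return (new, new, new)
    return state               # failed call: state unchanged


def build(setuid: Callable[[State, int], State]) -> Dict[Tuple[State, int], State]:
    """Tabulate setuid(0) and setuid(1) from each of the eight states."""
    states = list(product((0, 1), repeat=3))
    return {(s, x): setuid(s, x) for s in states for x in (0, 1)}


freebsd = build(setuid_freebsd)
sysv_style = build(setuid_solaris_linux)
differences = sorted(k for k in freebsd if freebsd[k] != sysv_style[k])
```

Each table has 16 entries, and listing the keys in differences gives a quick way to see from which states the two families of systems diverge. Note that abstract maps (100, 200, 100) to the same bits as (100, 100, 100), which is why the simplification is only sound for applications that use a single non-root uid.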

6.2 Correctness

Our model-extraction algorithm (Figure 2) is an instance of a more general schema for inferring finite-state models, specialized by including application-dependent implementations of the GETSTATE(), SETSTATE(), and GETALLSTATES() subroutines. We argue that our algorithm is correct by arguing that the general version is correct.

We frame our theoretical discussion in terms of equivalence relations. Let $S$ denote the set of concrete states (e.g., triples of 32-bit uids) and $C$ the set of concrete system calls. Write $s \xrightarrow{c} t$ if the operating system will always transition from state $s$ to $t$ upon invocation of $c$. We will need equivalence relations $\equiv_S$ on $S$ and $\equiv_{OS}$ on $S \times C$ that are respected by the operating system: in other words, if $s \xrightarrow{c} t$ and $s \equiv_S s'$, then there is some state $t'$ and some call $c'$ so that $(s, c) \equiv_{OS} (s', c')$, $t \equiv_S t'$, and $s' \xrightarrow{c'} t'$. The intuition is that calling $c$ from $s$ is somehow isomorphic to calling $c'$ from $s'$. Also, we assume that whenever $(s, c) \equiv_{OS} (s', c')$ holds, then $s \equiv_S s'$ does, too.

A critical requirement is that the operating system must behave deterministically given the equivalence class of the current state. More precisely, if $s \xrightarrow{c} t$ and $s' \xrightarrow{c'} u$ where $(s, c) \equiv_{OS} (s', c')$, then we require $t \equiv_S u$. The intuition is that the behavior of the operating system will depend only on which equivalence class we are in, and not on any other information about the state. For instance, the behavior of the operating system cannot depend on any global variables that don't appear in the state $s$; if it does, these global variables must be included in the state space $S$. As another example, a system call implementation that attempts to allocate memory and returns an error code if this allocation fails will violate our requirement, because the success or failure of the memory allocation introduces non-determinism, which is prohibited. We can see that this requirement is nontrivial, and it must be verified by manual inspection of the source code before our algorithm in Figure 2 can be safely applied; we will return to this issue later.

Next, there are three requirements on the instantiation of the GETSTATE(), SETSTATE(), and GETALLSTATES() subroutines. First, the GETSTATE() routine must return (a representative for) the equivalence class of the current state of the operating system. Note that it is natural to represent equivalence classes internally by singling out a unique representative for each equivalence class and using this value. Second, the SETSTATE() procedure with parameter $s$ must somehow cause the operating system to enter a state $s'$ in the same equivalence class as $s$ (the implementation may choose one in any convenient way). Finally, the GETALLSTATES() function must return a pair $(S, C)$ so that $S$ contains at least one representative from each equivalence class of $\equiv_S$ and so that every equivalence class of $\equiv_{OS}$ contains some element $(s, c)$ with $c \in C$.

When these general requirements are satisfied, the BUILDMODEL() algorithm from Figure 2 will correctly infer a valid finite-state model for the underlying operating system (see footnote 4). Consequently, all that remains is to check that these requirements are satisfied by our instantiation of the schema. We argue this next for the implementation shown in Figure 2. Let $U$ denote the set of concrete uids (e.g., all 32-bit values), so that $S = U \times U \times U$.
Say that a map $f : U \to U$ is a valid substitution if it is bijective and fixes 0, i.e., $f(0) = 0$. Each such substitution can be extended to one on $S$ by working component-wise, i.e., $f(r, e, s) = (f(r), f(e), f(s))$, and we can extend it to work on system calls by applying the substitution to the arguments of the system call, e.g., $f(\mathrm{setreuid}(r, e)) = \mathrm{setreuid}(f(r), f(e))$. We define our equivalence relation $\equiv_S$ on $S$ as follows: two states $s, s' \in S$ are equivalent if there is a valid substitution $f$ such that $f(s) = s'$. Similarly, $(s, c) \equiv_{OS} (s', c')$ holds if there is some valid substitution $f$ so that $f(s) = s'$ and $f(c) = c'$.

The correctness of GETSTATE() and SETSTATE() is immediate. Also, GETALLSTATES() is correct since the choice of uids $u_1, \ldots, u_6$ is immaterial: every pair $(s, c) \in S \times C$ is equivalent to some pair $(s', c') \in S \times C$, since we can simply map the first six non-zero uids in $(s, c)$ to $u_1, \ldots, u_6$ respectively, and there can be at most six non-zero uids in $(s, c)$. Actually, we can see that the algorithm in Figure 2 comes from a finer partition than that given by $\equiv_{OS}$: for example, $(u_1, u_1, u_1)$ and $(u_2, u_2, u_2)$ are distinguished. This causes no harm to the correctness of the result, and only unnecessarily increases the size of the resulting FSA. We gave the variant shown in Figure 2 because it is simpler to present, but in practice our implementation does use the coarser relation $\equiv_S$.

All that remains to check is that the operating system respects and behaves deterministically with respect to this equivalence relation. We verify this by manual inspection of the kernel sources, which shows that in Linux, FreeBSD, and Solaris the only operations that the uid-setting system calls perform on user IDs are equality testing of two user IDs, comparison to zero, copying one user ID to another, and setting a user ID to zero. Moreover, the operating system behavior does not depend on anything else, with one exception: Linux depends on whether the SETUID capability is enabled for the process, so on Linux we add an extra bit to each state indicating whether this capability is enabled. Thus, our verification task amounts to checking that user IDs are treated as an abstract data type with only four operations (equality testing, comparison to zero, and so on) and that the side effects and results of the system call do not depend on anything outside the state $S$. In our experience, verifying that the operating system satisfies these conditions is much easier than fully understanding its behavior, as the former is an almost purely mechanical process. This completes our justification for the correctness of our method for extracting a formal model to capture the behavior of the operating system.

Footnote 4: The proof is easy. We will write $[x]$ for the equivalence class containing $x$, e.g., $[s] = \{t \in S : s \equiv_S t\}$. If $[s] \xrightarrow{c} [t]$ appears in the final FSA output by BUILDMODEL(), then there must have been a step at which, for some $s' \in [s]$, $t' \in [t]$, and $c'$ with $(s, c) \equiv_{OS} (s', c')$, we executed $c'$ in state $s'$ at line 6 and transitioned to state $t'$. (This follows from the correctness of SETSTATE() and GETSTATE().) The latter means that $s' \xrightarrow{c'} t'$, since the OS respects $\equiv_{OS}$. Conversely, if $s' \xrightarrow{c'} t'$ for some $s', c', t'$, then by the correctness of GETALLSTATES(), there will be some $s$ and $c$ satisfying $(s, c) \equiv_{OS} (s', c')$ so that we enter line 6 with $s, c$, and thanks to the deterministic nature of the operating system we will discover the transition $s \xrightarrow{c} t$ for some $t \equiv_S t'$. Thus, the FSA output by BUILDMODEL() is exactly what it should be.
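One way to make the state equivalence induced by valid substitutions concrete is to compute a canonical representative for each class. The C sketch below is an illustration only, under the assumption that states are plain (r, e, s) triples; the names struct ids, canonical, and equivalent are hypothetical and do not appear in the paper.

```c
#include <sys/types.h>

struct ids { uid_t r, e, s; };   /* hypothetical state triple (real, effective, saved) */

/* Canonical representative of a state's equivalence class: 0 stays 0
 * (valid substitutions fix 0), and the distinct non-zero uids are
 * renamed 1, 2, 3 in order of first appearance in (r, e, s). */
struct ids canonical(struct ids in)
{
    uid_t vals[3] = { in.r, in.e, in.s };
    uid_t seen[3];
    uid_t out[3];
    int nseen = 0;

    for (int i = 0; i < 3; i++) {
        if (vals[i] == 0) {
            out[i] = 0;
            continue;
        }
        int j;
        for (j = 0; j < nseen; j++)
            if (seen[j] == vals[i])
                break;
        if (j == nseen)              /* first time this uid appears */
            seen[nseen++] = vals[i];
        out[i] = (uid_t)(j + 1);
    }
    struct ids res = { out[0], out[1], out[2] };
    return res;
}

/* Two states are equivalent exactly when a bijection fixing 0 maps one
 * to the other, i.e. when their canonical forms coincide. */
int equivalent(struct ids a, struct ids b)
{
    struct ids ca = canonical(a), cb = canonical(b);
    return ca.r == cb.r && ca.e == cb.e && ca.s == cb.s;
}
```

The canonical form records only which components are zero and which components are equal to one another, which matches the four operations (equality testing, comparison to zero, copying, setting to zero) that the kernel is observed to perform on user IDs.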

6.3 Applications

The resulting formal model has many applications. We have already discussed in Section 5 the semantics of the setuid system calls and pointed out pitfalls; this relied heavily on the FSA formal model. Next, we will discuss several additional applications: verifying documentation and checking conformance with informal specifications; identifying cross-platform semantic differences that might indicate potential portability issues; and automatically checking programs for proper usage of the uid-setting system calls.

6.3.1 Verifying Accuracy of Manual Pages

Manual pages are the primary source of information for programmers, but unfortunately they are often incomplete or wrong. FSAs are useful in verifying the accuracy of manual pages of uid-setting system calls. For each call, if its FSA is small and its description in manual pages is simple, we check by hand whether each transition in the FSA agrees with the description. Otherwise, we build another FSA based on the description and compare this FSA to the original FSA built by simulation. Differences between the two FSAs indicate discrepancies between the behavior of the system call and its description in manual pages. The following are a few examples of problematic documentation that we have found using our formal model:

• The man page of setuid in Linux 2.4.16 fails to mention the SETUID capability, which affects the behavior of setuid.

• The man page of setreuid in FreeBSD 4.4 says: "Unprivileged users may change the real user ID to the effective user ID and vice-versa; only the super-user may make other changes." However, this is incorrect. Swapping the real uid and effective uid does not always succeed, such as when ruid=100, euid=200, suid=100, contrary to what the man page suggests. The correct description is "Unprivileged users may change the real user ID to the real or saved user ID, and change the effective user ID to the real, effective, or the saved user ID."

• The man page of setgid in Linux 2.4.16 says: "The setgid function checks the effective gid of the caller and if it is the superuser, all process related group ID's are set to gid." In reality, the effective uid is checked instead of the effective gid.

6.3.2 Identifying Operating System-Specific Differences

Since various Unix systems implement the uid-setting system calls differently, it is difficult to identify their semantic differences via reading kernel sources. We can solve this problem by creating an FSA of the user ID model in each Unix system and contrasting the FSAs. For example, the semantic differences of setuid among Solaris, FreeBSD, and Linux are visually clear from the FSAs in Figure 3. The approach can be further formalized by taking the symmetric difference of FSAs. In particular, if $M, M'$ are two FSAs for two Unix platforms, we can compute their parallel composition $M \times M'$, whose states are pairs $(s, s')$ with $s$ a state from $M$ and $s'$ a state from $M'$, and then we can mark as an accepting state of $M \times M'$ any pair $(s, s')$ where $s \neq s'$. Now any execution trace that starts at a non-accepting state and eventually reaches an accepting state indicates a sequence of system calls whose semantics is not the same on both operating systems. This indicates a potential portability issue, and all such differences can be computed via a simple reachability computation (e.g., depth-first search).

6.4 Checking Proper Usage of Uid-setting System Calls

The formal model is also useful in checking proper usage of uid-setting system calls in programs. We model a program as an FSA, called the program FSA, which represents each program point as a state and each statement as a transition. We call the FSA describing the user ID model a model FSA. By composing the program FSA with the model FSA, we get a composite FSA. Since each state in the composite FSA contains one state from the model FSA (representing a unique combination of values in the real uid, effective uid, and saved uid) and one state from the program FSA (representing a program point), a reachable state in the composite FSA shows that the state in the model FSA is reachable at the program point. Figure 4(b) shows the program FSA of the program in Figure 4(a). Figure 4(c) shows the composite FSA obtained by composing the model FSA in Figure 3(c) with the program FSA in Figure 4(b).

    // ruid=1, euid=0, suid=0
    1: printf("drop priv");
    2: setuid(1);
    3: execl("/bin/sh", "sh", (char *)NULL);

(a) A program segment

    Line 1 --printf()--> Line 2 --setuid(1)--> Line 3

(b) Program FSA of the program in Figure 4(a)

    (Line 1; R=1,E=0,S=0) --printf()--> (Line 2; R=1,E=0,S=0) --setuid(1)--> (Line 3; R=1,E=1,S=1)

(c) Composite FSA of the model FSA in Figure 3(c) and the program FSA in Figure 4(b)

Figure 4: Composing a model FSA with a program FSA

This method is useful for checking proper usage of uid-setting system calls in programs, such as:

• Can a uid-setting system call fail? If any error state in the model FSA is reachable at some program point, it shows that a uid-setting system call may fail there.

• Can a program fail to drop privilege? If any state that contains a privileged user ID in the model FSA is reachable at a program point where the program should be unprivileged, it shows that the program may have failed to drop privilege at an earlier program point.

• Which part of the program may run with privilege? First, we identify all states that contain a privileged user ID in the model FSA. Then, we identify all program points where any of those states are reachable. The program may run with privilege at these program points.

A full discussion is out of the scope of this paper, and we refer the interested reader to a companion paper for details [8].
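The composition for the straight-line program of Figure 4(a) can be sketched in C as follows. This is an illustration rather than the authors' tool: the setuid model follows the Linux rules listed in Appendix A.1 and assumes the SETUID capability tracks whether the effective uid is zero; the names struct ids, model_setuid_linux, compose_figure4, and may_run_privileged are hypothetical.

```c
#include <sys/types.h>

struct ids { uid_t r, e, s; };   /* hypothetical (real, effective, saved) triple */

/* Model of setuid() on Linux per Appendix A.1, assuming the SETUID
 * capability is enabled exactly when euid == 0.
 * Returns 0 on success, -1 if the call would be refused. */
int model_setuid_linux(struct ids *st, uid_t new_uid)
{
    if (!(new_uid == st->r || new_uid == st->s || st->e == 0))
        return -1;
    if (st->e == 0)
        st->r = st->e = st->s = new_uid;
    else
        st->e = new_uid;
    return 0;
}

/* Pair each program point of Figure 4(a) with the user ID state that
 * reaches it. at_line[i] is the state on entry to line i+1. */
void compose_figure4(struct ids start, struct ids at_line[3])
{
    struct ids st = start;
    at_line[0] = st;                 /* Line 1: before printf() */
    /* printf() does not change any user ID */
    at_line[1] = st;                 /* Line 2: before setuid(1) */
    model_setuid_linux(&st, 1);
    at_line[2] = st;                 /* Line 3: before execl() */
}

/* A state "contains a privileged user ID" if any component is root. */
int may_run_privileged(struct ids st)
{
    return st.r == 0 || st.e == 0 || st.s == 0;
}
```

Starting from ruid=1, euid=0, suid=0, this yields the states shown in Figure 4(c), and may_run_privileged answers the third question in the list above for each program point. A real checker would handle branching programs with a reachability search instead of a single walk.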

6.5 Advantages

Formal modeling holds several advantages. First, it makes it easier to describe the properties of the uid-setting system calls. While we still need to read kernel code to determine the kernel variables that affect the uid-setting system calls, the majority of the workload, determining their actions, is done automatically by simulation. Second, the formal model is reliable because it is created from the same environment where application programs run. The formal model has corrected several mistakes in the user ID model that we created manually. Third, the formal model is useful in identifying semantic differences of uid-setting system calls among Unix systems. Finally, the formal model is useful in checking proper usage of uid-setting system calls in programs automatically.

7 Case Studies of Security Vulnerability

Misuses of uid-setting system calls have caused many security vulnerabilities, which are good lessons in learning the proper usage of the system calls. We will analyze two such incidents in older versions of sendmail. Sendmail [9] is a commonly used Mail Transmission Agent (MTA). It runs in two modes: (1) as a daemon that listens on port 25 (SMTP), and (2) via a Mail User Agent to submit mail to the mail queue. In the first case, all three user IDs of the sendmail process are typically zero, as it is run by the superuser root in the boot process. In the second case, however, sendmail is run by an ordinary user. As the mail queue is not world writable, sendmail requires privilege to access the mail queue.

7.1 Misuse of Setuid

Next we describe a vulnerability that was caused by a misuse of setuid [10]. Sendmail 8.10.1 installed the sendmail as a setuid-root executable. When it was executed by a non-root user, the real uid of the process was the non-root user while both the effective uid and saved uid were zero. This gave the sendmail permission to write to the mail queue since its effective uid was zero. To minimize risks in the event that an attacker takes over the sendmail and executes malicious code with root privilege, the sendmail permanently dropped root privilege as soon as it was no longer needed. This was done by calling setuid(getuid()), which set all three user IDs to the non-root user.

POSIX specifies that if a process has appropriate privileges, setuid(newuid) sets all three user IDs to newuid; otherwise, setuid(newuid) only sets the effective uid to newuid (if newuid is equal to the real uid or saved uid). In Linux, appropriate privileges are carried by the SETUID capability. Furthermore, in all uid-setting system calls, the SETUID capability is kept consistent with the effective uid, i.e. the SETUID capability is set if and only if the effective uid is zero.

However, prior to version 2.2.16 of Linux, there was a bug in the kernel which made it possible to clear the SETUID capability of a process even when its effective uid was zero. In this case, calling setuid(getuid()) only modified the effective uid. So the sendmail only dropped root privilege from its effective uid but kept it in its saved uid. Consequently, once taking over the sendmail, a malicious user could restore root privilege in the effective uid by calling setreuid(-1, 0). Figure 5 illustrates the vulnerability.

    A normal non-root user executes sendmail
    ruid!=0, euid=suid=0, SETUID-capability=1
    sendmail calls setuid(getuid())
    ruid=euid=suid!=0, SETUID-capability=0
    sendmail executes the rest of code

(a) A normal execution of sendmail by a non-root user

    A malicious non-root user executes sendmail
    ruid!=0, euid=suid=0, SETUID-capability=0
    sendmail calls setuid(getuid())
    ruid=euid!=0, suid=0, SETUID-capability=0
    The malicious user takes over sendmail and executes setreuid(-1,0)
    ruid!=0, euid=suid=0
    The malicious user executes code with root privilege

(b) An execution of sendmail by an attacker

Figure 5: A vulnerability in sendmail due to a misuse of setuid. Note the failure: the programmer assumed that setuid(getuid()) would always succeed in dropping all privilege, but by disabling the SETUID capability, the attacker is able to violate that expectation.

The vulnerability was caused by the overloaded semantics of setuid. Depending on whether a process has the SETUID capability, setuid sets one user ID or all three user IDs, but it returns a success code in both cases. The vulnerability can be avoided by replacing setuid(newuid) with setresuid(newuid, newuid, newuid) if available, or with setreuid(newuid, newuid) otherwise.
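The two traces of Figure 5, and the effect of the suggested fix, can be replayed on a small simulated process. The sketch below is a simplified model and not kernel code: it assumes the capability bit (rather than euid == 0) grants "appropriate privileges", it clears the bit whenever the effective uid becomes non-zero, and it models setresuid only for the case where all three IDs receive the same value. All names (struct proc, model_setuid, model_setreuid, model_setresuid_all, NO_CHANGE) are hypothetical.

```c
#include <sys/types.h>

#define NO_CHANGE ((uid_t)-1)        /* stands for the -1 "leave unchanged" argument */

struct proc { uid_t r, e, s; int cap_setuid; };

/* Simplification (assumption): the capability is lost once euid != 0. */
static void sync_cap(struct proc *p)
{
    if (p->e != 0)
        p->cap_setuid = 0;
}

/* setuid: with the capability all three IDs change; without it only euid. */
int model_setuid(struct proc *p, uid_t new_uid)
{
    if (p->cap_setuid) {
        p->r = p->e = p->s = new_uid;
    } else if (new_uid == p->r || new_uid == p->s) {
        p->e = new_uid;
    } else {
        return -1;
    }
    sync_cap(p);
    return 0;
}

/* setreuid following the Linux rules of Appendix A.3. */
int model_setreuid(struct proc *p, uid_t new_r, uid_t new_e)
{
    int r_ok = new_r == NO_CHANGE || new_r == p->r || new_r == p->e;
    int e_ok = new_e == NO_CHANGE || new_e == p->r ||
               new_e == p->e || new_e == p->s;
    if (!p->cap_setuid && !(r_ok && e_ok))
        return -1;
    if (new_r != NO_CHANGE) p->r = new_r;
    if (new_e != NO_CHANGE) p->e = new_e;
    if (new_r != NO_CHANGE || (new_e != NO_CHANGE && p->e != p->r))
        p->s = p->e;
    sync_cap(p);
    return 0;
}

/* setresuid(u, u, u): allowed if privileged or if u is one of the current IDs. */
int model_setresuid_all(struct proc *p, uid_t u)
{
    if (!p->cap_setuid && u != p->r && u != p->e && u != p->s)
        return -1;
    p->r = p->e = p->s = u;
    sync_cap(p);
    return 0;
}
```

With the capability cleared, model_setuid leaves root in the saved uid and a later model_setreuid(NO_CHANGE, 0) succeeds, as in Figure 5(b); replacing the drop with model_setresuid_all overwrites the saved uid, so the same restore attempt is refused.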

7.2 Interaction of User IDs and Group IDs

Another vulnerability in Sendmail was caused by an interaction between the user IDs and the group IDs [11]. To further reduce the risk from a malicious user taking over the sendmail, as of version 8.12.0 Sendmail no longer installed the sendmail as a setuid-root program. To give the sendmail permission to write to the mail queue, the mail queue was configured to be writable by group smmsp and the sendmail was installed as setgid-smmsp. Therefore, when the sendmail was executed by a non-root user, the real gid of the process was the primary group of the user, but the effective gid and saved gid were smmsp. For the same reason that it permanently dropped root privilege as soon as possible in previous versions, now the sendmail permanently dropped smmsp group privilege as soon as it was no longer needed. Similar to the use of setuid(getuid()) to permanently drop root privilege, the sendmail called setgid(getgid()) to permanently drop smmsp group privilege. However, since the sendmail no longer had appropriate privileges because its effective uid was not zero anymore, setgid(getgid()) only dropped the privileged group ID smmsp from the effective gid but left it in the saved gid. Consequently, any malicious user who found some way to take over sendmail (e.g., by a buffer overrun) could restore the smmsp group privilege in the effective gid by calling setregid(-1, smmsp). This is illustrated in Figure 6.

    A user executes sendmail
    ruid=euid=suid!=0; rgid!=smmsp, egid=sgid=smmsp
    sendmail calls setgid(getgid())
    ruid=euid=suid!=0; rgid=egid=sgid!=smmsp (wrong assumption)
    sendmail executes the rest of code

(a) The programmer's mental model of an expected execution trace

    A malicious user executes sendmail
    ruid=euid=suid!=0; rgid!=smmsp, egid=sgid=smmsp
    sendmail calls setgid(getgid())
    ruid=euid=suid!=0; rgid=egid!=smmsp, sgid=smmsp
    The malicious user takes over sendmail and executes setregid(-1, smmsp)
    ruid=euid=suid!=0; rgid!=smmsp, egid=sgid=smmsp
    The malicious user executes code with smmsp group privilege

(b) Real execution of sendmail by a malicious user

Figure 6: A vulnerability in sendmail due to interaction between user IDs and group IDs. The failure occurs because the programmer has overlooked that she has already dropped root privilege and hence no longer has the "appropriate privileges" to drop all group privileges in the setgid call.

The vulnerability was caused by an interaction between the user IDs and group IDs, since changing user IDs may affect the behavior of setgid. To avoid the vulnerability, we can replace setgid(newgid) with setresgid(newgid, newgid, newgid) if available, or setregid(newgid, newgid) otherwise. The vulnerability also shows that if both user IDs and group IDs are to be modified, the modification should follow a specific order (Section 8.1.2).

8 Guidelines

In this section, we provide guidelines on the proper usage of the uid-setting system calls. First, we discuss general guidelines which apply to all setuid programs. Then, we focus on applications that use the uid-setting system calls in a specific way. We propose a high-level API for these applications to manage their privileges. The API is easy to understand and to use.

8.1 General Guidelines

8.1.1 Selecting an Appropriate System Call

Since setresuid has clear semantics and is able to set each user ID individually, it should always be used if available. Otherwise, to set only the effective uid, seteuid(new_euid) should be used; to set all three user IDs, setreuid(new_uid, new_uid) should be used.

8.1.2 Obeying the Proper Order of System Calls

The POSIX-defined “appropriate privileges” affect the actions of both system calls that set user IDs and that set group IDs. Since often “appropriate privileges” are carried by the effective uid, a program should drop group privileges before dropping user privileges permanently. Otherwise, after permanently dropping user privileges, the program may be unable to permanently drop group privileges. For example, the program in Figure 7(a) is able to permanently drop both user and group privileges because it calls setgid before setuid. In contrast, since the program in Figure 7(b) calls setuid before setgid, it fails to drop group privileges permanently.
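The effect of the call order can be replayed on a simulated credential set. The following C sketch is a simplified model under the assumption that "appropriate privileges" means euid == 0 for both setuid and setgid, and that an unprivileged setgid changes only the effective gid; struct cred, model_setuid, and model_setgid are hypothetical names.

```c
#include <sys/types.h>

struct cred {
    uid_t ruid, euid, suid;
    gid_t rgid, egid, sgid;
};

/* setuid model: a privileged caller sets all three uids,
 * an unprivileged caller only the effective uid. */
int model_setuid(struct cred *c, uid_t u)
{
    if (c->euid == 0) {
        c->ruid = c->euid = c->suid = u;
        return 0;
    }
    if (u == c->ruid || u == c->suid) {
        c->euid = u;
        return 0;
    }
    return -1;
}

/* setgid model: whether all three gids change depends on the
 * effective UID, not on any group ID. */
int model_setgid(struct cred *c, gid_t g)
{
    if (c->euid == 0) {
        c->rgid = c->egid = c->sgid = g;
        return 0;
    }
    if (g == c->rgid || g == c->sgid) {
        c->egid = g;
        return 0;
    }
    return -1;
}
```

Starting from ruid=100, euid=suid=0, rgid=200, egid=sgid=0, calling model_setgid(&c, c.rgid) and then model_setuid(&c, c.ruid) leaves no privileged ID anywhere, whereas the reverse order leaves sgid=0 behind even though both calls report success.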

    ruid=100, euid=suid=0; rgid=200, egid=sgid=0
    setgid(getgid())
    ruid=100, euid=suid=0; rgid=egid=sgid=200
    setuid(getuid())
    ruid=euid=suid=100; rgid=egid=sgid=200

(a) A program drops both user and group privileges permanently by calling setgid(getgid()) before setuid(getuid())

    ruid=100, euid=suid=0; rgid=200, egid=sgid=0
    setuid(getuid())
    ruid=euid=suid=100; rgid=200, egid=sgid=0
    setgid(getgid())
    ruid=euid=suid=100; rgid=egid=200, sgid=0

(b) A program fails to drop group privileges permanently because it calls setuid(getuid()) before setgid(getgid())

Figure 7: Proper order of dropping user and group privileges. Part (a) shows proper usage; part (b) shows what can go wrong if one gets the order backwards.

8.1.3 Checking the Proper Execution of System Calls

Since the semantics of the uid-setting system calls may change, e.g. when the kernel changes or when an application is ported to a different Unix system, it is imperative to check proper execution of these system calls.

Checking Return Codes. A process should check return codes of uid-setting system calls to see if they succeed. This is especially important when a process permanently drops privilege, since such an action usually precedes operations that, if executed with privilege, may compromise the system.

Verifying User IDs. However, checking return codes may be insufficient for uid-setting system calls. For example, in Linux and Solaris, depending on the effective uid, setuid(newuid) may (1) set all three user IDs (if the effective uid is zero), or (2) set only the effective uid (if it is non-zero), but the system call returns the same success code in both cases. The return code does not indicate to the process which case has happened. Therefore, a process should verify its user IDs after each uid-setting system call. A process may call getresuid to check all three user IDs if it is available, as in Linux and FreeBSD. Otherwise, the process may call getuid and geteuid to check the real uid and effective uid, as in Solaris.

Verifying Failures. Once an attacker takes control of a process, the attacker may insert arbitrary code into the process. Therefore, for further assurance on security, the process should ensure that all unpermitted uid-setting system calls will fail. For example, after dropping privilege permanently, the process should verify that attempts to restore privilege will fail. This is shown in Figure 8.

    // drop privilege
    setuid(getuid());

    // verify the process cannot restore privilege
    if (setreuid(-1, 0) == 0)
        return ERROR;

Figure 8: An example of a program that verifies that we have properly dropped root privileges by checking that unpermitted uid-setting system calls will fail.
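The three checks of this section (return code, resulting user IDs, and failure of unpermitted calls) can be combined in one helper. The following is a minimal sketch using only setuid, getuid, geteuid, and seteuid, i.e. the variant for systems without getresuid; the function name drop_priv_checked is hypothetical, and because the saved uid cannot be read through these calls the sketch probes it by attempting to restore the old effective uid.

```c
#include <sys/types.h>
#include <unistd.h>

/* Permanently drop to new_uid and verify the result.
 * Returns 0 only if every check passes. On -1 the caller should
 * abort, since the process may still hold (or have regained) privilege. */
int drop_priv_checked(uid_t new_uid)
{
    uid_t old_euid = geteuid();

    /* 1. Check the return code. */
    if (setuid(new_uid) < 0)
        return -1;

    /* 2. Verify the user IDs that can be read back. */
    if (getuid() != new_uid || geteuid() != new_uid)
        return -1;

    /* 3. Verify that restoring the old effective uid now fails.
     *    A success here means the old ID survived in the saved uid. */
    if (old_euid != new_uid && seteuid(old_euid) == 0)
        return -1;

    return 0;
}
```

On systems that provide getresuid, step 2 could instead compare all three user IDs directly, which avoids relying on the probe in step 3 alone to detect a leftover saved uid.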

8.2 High-Level API

Although the general guidelines in Section 8.1 help programmers to use the uid-setting system calls more securely, programmers still have to grapple with the complex semantics of the uid-setting system calls and their differences among Unix systems. The complexity is partly due to a mismatch between the low-level semantics of the system calls, which describes how to modify the user IDs, and the high-level semantics of privilege management, which describes how to raise and drop privileges.

8.2.1 API

In many applications, privilege management involves the following tasks:

• Drop privilege temporarily, in a way that it can be restored later.
• Drop privilege permanently, so that it can never be restored.
• Restore privilege.

We propose a new API which offers a function to perform each of these tasks. The API contains:

• drop_priv_temp(new_uid): Drop privilege temporarily. Move the privileged user ID from the effective uid to the saved uid. Assign new_uid to the effective uid.

• drop_priv_perm(new_uid): Drop privilege permanently. Assign new_uid to all of the real uid, effective uid, and saved uid.

• restore_priv: Restore privilege. Copy the privileged user ID from the saved uid to the effective uid.

Figure 9 describes the state changes of a process that calls these functions.

    [Figure 9 diagram: states priv, unpriv_temp, unpriv_perm; transitions labeled drop_priv_temp(), restore_priv(), drop_priv_perm()]

Figure 9: An FSA showing the statespace of a process when calling the functions of the new API.

8.2.2 Implementation

We implement the new API as wrapper functions to the uid-setting system calls. The implementation uses setresuid if available since it has the clearest semantics and it is able to set each of the user IDs independently, as shown in Figure 10. If setresuid is not available, such as in Solaris, the implementation uses seteuid and setreuid, as shown in Figure 11.

    #define _GNU_SOURCE            /* exposes setresuid/getresuid on glibc */
    #include <sys/types.h>
    #include <unistd.h>

    #define ERROR_SYSCALL (-1)     /* assumed value; not defined in the paper */

    int drop_priv_temp(uid_t new_uid)
    {
        /* keep ruid, set euid to new_uid, park the old euid in suid */
        if (setresuid(-1, new_uid, geteuid()) < 0)
            return ERROR_SYSCALL;
        return 0;
    }

    int drop_priv_perm(uid_t new_uid)
    {
        if (setresuid(new_uid, new_uid, new_uid) < 0)
            return ERROR_SYSCALL;
        return 0;
    }

    int restore_priv(void)
    {
        uid_t ruid, euid, suid;

        if (getresuid(&ruid, &euid, &suid) < 0)
            return ERROR_SYSCALL;
        /* copy the saved uid back into the effective uid */
        if (setresuid(-1, suid, -1) < 0)
            return ERROR_SYSCALL;
        return 0;
    }

Figure 10: A possible implementation of the high-level API for systems with setresuid.

    #include <sys/types.h>
    #include <unistd.h>

    #define ERROR_SYSCALL (-1)     /* assumed value; not defined in the paper */

    uid_t priv_uid;

    int drop_priv_temp(uid_t new_uid)
    {
        uid_t old_euid = geteuid();

        // copy euid to suid
        if (setreuid(getuid(), old_euid) < 0)
            return ERROR_SYSCALL;
        // set euid as new_uid
        if (seteuid(new_uid) < 0)
            return ERROR_SYSCALL;
        priv_uid = old_euid;
        return 0;
    }

    int drop_priv_perm(uid_t new_uid)
    {
        if (setreuid(new_uid, new_uid) < 0)
            return ERROR_SYSCALL;
        return 0;
    }

    int restore_priv(void)
    {
        if (seteuid(priv_uid) < 0)
            return ERROR_SYSCALL;
        return 0;
    }

Figure 11: A possible implementation of the high-level API for systems without setresuid.

To use this implementation, an application must meet the following requirements:

• When the process starts, its effective uid contains the privileged user ID. This is true in most circumstances. When a process is run by a privileged user, all three user IDs contain the privileged user ID. If the process is run as a privileged user, i.e. its executable is setuid'ed to the privileged user and is run by an unprivileged user, both the effective uid and saved uid of the process contain the privileged user ID.

• If the privileged user ID is not zero, then the unprivileged user ID must be stored in the saved uid when the process starts. This requirement enables the process to replace the privileged user ID in the effective uid with the unprivileged user ID in drop_priv_temp and drop_priv_perm. This is the case when an executable is setuid'ed to a non-root user. On the other hand, if the privileged user ID is zero, then there is no such requirement, since the process can set its user IDs to arbitrary values.

• The process does not make any uid-setting system calls that change the effective uid or saved uid. Such a call may cause the process to enter a state not covered by the FSA in Figure 9, on which the high-level API and the implementation are based.

The implementation has the following properties:

• It does not affect or rely on the real uid. The process is free to modify the real uid in any way.

• It guarantees that all transitions in Figure 9 succeed.
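The state space of Figure 9 can be written down as a small transition function, which an application could use to track which API calls are currently legal. This is a sketch only: the names priv_state, priv_call, and priv_step are hypothetical, and the edge from unpriv_temp to unpriv_perm is an assumption, since the extracted figure does not make clear from which states drop_priv_perm() may be called.

```c
enum priv_state { PRIV, UNPRIV_TEMP, UNPRIV_PERM };
enum priv_call  { DROP_PRIV_TEMP, RESTORE_PRIV, DROP_PRIV_PERM };

/* Apply one API call to the tracked state.
 * Returns 0 and updates *st if the transition is in the FSA,
 * or -1 (leaving *st unchanged) if it is not. */
int priv_step(enum priv_state *st, enum priv_call call)
{
    switch (call) {
    case DROP_PRIV_TEMP:
        if (*st != PRIV)
            return -1;
        *st = UNPRIV_TEMP;
        return 0;
    case RESTORE_PRIV:
        if (*st != UNPRIV_TEMP)
            return -1;
        *st = PRIV;
        return 0;
    case DROP_PRIV_PERM:
        /* Assumption: permitted from both priv and unpriv_temp. */
        if (*st == UNPRIV_PERM)
            return -1;
        *st = UNPRIV_PERM;
        return 0;
    }
    return -1;
}
```

Once the tracked state is UNPRIV_PERM every call is rejected, which mirrors the intent that a permanent drop can never be undone.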

8.2.3 Evaluation

To evaluate the high-level API, we replace the uid-setting system calls with functions from the new API in OpenSSH 2.5.2. There are fifteen uid-setting system calls in eight tasks. Of the eight tasks, four are to drop privilege permanently, two are to drop privilege temporarily, and two are to restore privilege. We are able to implement all these tasks with the new API.

9 Conclusion

We have studied the proper usage of uid-setting system calls by two approaches. First, we documented the semantics of the uid-setting system calls in three major Unix systems (Solaris, FreeBSD, and Linux) and identified their differences. We then showed how to formalize this problem using formal methods, and we proposed a new algorithm for constructing a formal model of the semantics of uid-setting system calls. The resulting formal model is useful in identifying semantic differences of uid-setting system calls among Unix systems and in checking their proper usage in programs automatically. Finally, we provide guidelines for proper usage of the uid-setting system calls and propose a high-level API for managing user IDs that is more comprehensible, usable, and portable than the usual Unix API.

Acknowledgment

We thank Monica Chew, Robert Johnson, Ben Liblit, and Zhendong Su for their valuable comments.

References

[1] Chris Torek and Casper H.S. Dik. Setuid mess. http://yarchive.net/comp/setuid_mess.html.
[2] Richard Stevens. Advanced Programming in the UNIX Environment. Addison-Wesley Publishing Company, 1992.
[3] Matt Bishop. How to write a setuid program. ;login:, 12(1):5–11, 1987.
[4] http://www.sun.com/software/solaris/.
[5] http://www.freebsd.org.
[6] http://www.kernel.org.
[7] IEEE Standard 1003.1-1998: IEEE standard portable operating system interface for computer environments. Institute of Electrical and Electronics Engineers, 1988.
[8] Hao Chen, David Wagner, and Drew Dean. An infrastructure for examining security properties of software. Manuscript in preparation.
[9] http://www.sendmail.org/.
[10] Sendmail Inc. Sendmail workaround for linux capabilities bug. http://www.sendmail.org/sendmail.8.10.1.LINUX-SECURITY.txt.
[11] Michal Zalewski. Multiple local sendmail vulnerabilities. http://razor.bindview.com/publish/advisories/adv_sm812.html.

A Documentation of Uid-Setting System Calls

This section documents the semantics of setuid, seteuid, setreuid, and setresuid in Linux 2.4.16, FreeBSD 4.4, and Solaris 8. Each of them follows two steps:

1. Check if the calling process has the required permission to make the system call.
2. If so, modify user IDs and return a success code; otherwise, return an error code.

We will describe the system calls in pseudocode. The pseudocode uses variables ruid, euid, and suid to represent the real uid, effective uid, and saved uid of the process respectively. Note that we assume that the SETUID capability is not modified directly by a process in Linux.

A.1 setuid(new_uid)

Permission Required:
• Linux: new_uid == ruid || new_uid == suid || euid == 0
• FreeBSD: new_uid == ruid || new_uid == euid || euid == 0
• Solaris: Same as Linux

Action:
• Linux: if (euid == 0) ruid = euid = suid = new_uid; else euid = new_uid;
• FreeBSD: ruid = euid = suid = new_uid;
• Solaris: Same as Linux

A.2 seteuid(new_euid)

Permission Required:
• Linux: new_euid == ruid || new_euid == suid || new_euid == euid || euid == 0 || new_euid == -1
• FreeBSD: new_euid == ruid || new_euid == suid || euid == 0
• Solaris: new_euid == ruid || new_euid == suid || new_euid == euid || euid == 0

Action:
• Linux: euid = new_euid;
• FreeBSD: Same as Linux
• Solaris: Same as Linux

A.3 setreuid(new_ruid, new_euid)

Permission Required:
• Linux: euid == 0 || ((new_ruid == -1 || new_ruid == ruid || new_ruid == euid) && (new_euid == -1 || new_euid == ruid || new_euid == euid || new_euid == suid))
• FreeBSD: euid == 0 || ((new_ruid == -1 || new_ruid == ruid || new_ruid == suid) && (new_euid == -1 || new_euid == ruid || new_euid == euid || new_euid == suid))
• Solaris: Same as Linux

Action:
• Linux: if (new_ruid != -1) ruid = new_ruid; if (new_euid != -1) euid = new_euid; if (new_ruid != -1 || (new_euid != -1 && euid != ruid)) suid = euid;
• FreeBSD: Same as Linux
• Solaris: Same as Linux

A.4 setresuid(new_ruid, new_euid, new_suid)

Permission Required:
• Linux: euid == 0 || ((new_ruid == -1 || new_ruid == ruid || new_ruid == euid || new_ruid == suid) && (new_euid == -1 || new_euid == ruid || new_euid == euid || new_euid == suid) && (new_suid == -1 || new_suid == ruid || new_suid == euid || new_suid == suid))
• FreeBSD: Same as Linux
• Solaris: Not available

Action:
• Linux: if (new_ruid != -1) ruid = new_ruid; if (new_euid != -1) euid = new_euid; if (new_suid != -1) suid = new_suid;
• FreeBSD: Same as Linux
• Solaris: Not available
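To show how the pseudocode in this appendix can be executed and compared across systems, the following C sketch encodes the setreuid rules of A.3 for Linux and FreeBSD (Solaris being the same as Linux). It is an illustration rather than the paper's simulator; struct ids, enum os, NO_CHANGE, and model_setreuid are hypothetical names.

```c
#include <sys/types.h>

#define NO_CHANGE ((uid_t)-1)        /* the -1 "leave unchanged" argument */

struct ids { uid_t r, e, s; };
enum os { OS_LINUX, OS_FREEBSD };

/* setreuid per Appendix A.3. Returns 0 and updates *st on success,
 * -1 (state untouched) if the permission check fails. */
int model_setreuid(enum os os, struct ids *st, uid_t new_r, uid_t new_e)
{
    /* The only difference in the permission check: an unprivileged
     * caller may set ruid to euid on Linux, but to suid on FreeBSD. */
    uid_t alt = (os == OS_LINUX) ? st->e : st->s;

    int r_ok = new_r == NO_CHANGE || new_r == st->r || new_r == alt;
    int e_ok = new_e == NO_CHANGE || new_e == st->r ||
               new_e == st->e || new_e == st->s;

    if (st->e != 0 && !(r_ok && e_ok))
        return -1;

    if (new_r != NO_CHANGE) st->r = new_r;
    if (new_e != NO_CHANGE) st->e = new_e;
    if (new_r != NO_CHANGE || (new_e != NO_CHANGE && st->e != st->r))
        st->s = st->e;
    return 0;
}
```

Running the swap setreuid(200, 100) from ruid=100, euid=200, suid=100 under both settings reproduces the discrepancy discussed in Section 6.3.1: the Linux rules accept the call, while the FreeBSD rules reject it because 200 is neither the real nor the saved uid.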