Class 20: Society and Security and Programming Languages

Lecture Notes

Oh no! No slideshow! Try to cope.

Security and Society: Mobile code

A fact of life in our connected world is that code gets transported from machine to machine just as data does. JavaScript, Java applets, and Flash are all examples of executable code embedded in webpages or e-mails. In fact, most of us also download and install all sorts of different pieces of software. Sometimes your instructor demands it: DrScheme, JFLAP, etc. This "mobile code" is part of the way we all interact with one another over the web, but it represents a serious security problem. Buried in some of this code may be instructions that, intentionally or not, compromise our data and our systems.

What are some of the problems?

Programming Languages, Mobile code security

Programming languages actually have a big role in dealing with the security threat represented by mobile code. We're going to look a bit at what Java in particular does to try and provide a secure environment for mobile code.

The fundamental problem is to limit what an untrusted piece of code can do --- in a more sophisticated world we might try to distinguish many levels of trust, but for the moment I think the trusted/untrusted code distinction is enough. When you run a regular binary program, it gets run as "you", meaning that whatever files you can access/create/delete, so can it; whatever network connections you can make or peripherals you can access, so can it. So the goal is simply to run a program without allowing it access to all of those things. Unfortunately, that's not so easy to do. As we know from Theory of Computing, we can't look at a program and know what it's going to do --- after all, we can't even determine whether or not it'll go into an infinite loop! Perhaps we could monitor the program as it runs? That's a daunting task on many fronts, but one aspect that's a killer is this: if the monitor runs in the same address space, then a malicious program could corrupt the monitor itself; if the monitor runs in a different address space, then every time it tries to do its monitoring the system needs to make a context switch, which is prohibitively expensive computationally. So ... what are we going to do?

As we've discussed, Java code is compiled to bytecode for a virtual machine -- the JVM. So with untrusted code, like applets, the virtual machine can be in charge of making sure nothing untoward happens. In particular, untrusted code in Java is run in a sandbox. The name is supposed to conjure up a picture of a kid who plays in a sandbox and can't touch anything outside it --- or of several kids, each trapped in their own sandbox. At any rate, the JVM is responsible for making sure each kid - I mean program - stays in its own sandbox. A program may cause as much trouble as it likes in its own sandbox, but the JVM should keep the other sandboxes and, more importantly, the system the JVM is running on, safe.

By default, applets are run in a sandbox, while local code is not. In particular, if you launch from the command-line like "java Foo", your code executes as you: no sandbox. However, if you run like this

java -Djava.security.manager Foo

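your code will be run in a sandbox. (That's true of the JDKs these notes were written for. The Security Manager was deprecated in Java 17 and permanently disabled in JDK 24, so the switch no longer works on recent releases.)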

Simplest example

One of the most fundamental system resources to protect from untrusted code is the filesystem. And it gives us a simple example of the sandbox in action.

Our first example is called Foo:

class Foo
{
  public static void main(String args[])
  {
    System.out.println("Die, Bar, die!");
  }
}

Since Foo uses no system resources, running it with or without sandbox gives the same result:

> java Foo
Die, Bar, die!
> java -Djava.security.manager Foo
Die, Bar, die!
>

Now consider a second example, Foo2:

import java.io.*;

class Foo2
{
  public static void main(String args[])
  {
     FileOutputStream outs;
     PrintStream ps;

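     try
     {
       // Opening the file for writing is the operation the sandbox checks.
       outs = new FileOutputStream("tmp.txt");
       ps = new PrintStream(outs);
       ps.println("Die Bart, die!");
       ps.close();
     }
     catch (Exception e)
     {
       // In the sandbox, the denied write arrives here as a SecurityException.
       System.err.println("Darn! I can't write my file!");
     }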
  }
}

Now look what happens when we run this with and without the sandbox:

> java Foo2
> cat tmp.txt
Die Bart, die!
> rm tmp.txt
rm: remove tmp.txt (yes/no)? yes
> java -Djava.security.manager Foo2
Darn! I can't write my file!
> cat tmp.txt
cat: cannot open tmp.txt
>
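Foo2 catches a plain Exception, so it can't tell a sandbox denial from an ordinary I/O failure. Here is a minimal sketch that separates the two. The class name SandboxWriteDemo and the method tryWrite are made up for illustration; the only library fact it leans on is that FileOutputStream's constructor is documented to throw SecurityException when a security manager denies the write, and FileNotFoundException (an IOException) when the file simply can't be opened.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical variant of Foo2 (names are illustrative, not from the lecture).
class SandboxWriteDemo {
    // Tries to write one line to the given path and reports what happened.
    static String tryWrite(String path) {
        try (PrintStream ps = new PrintStream(new FileOutputStream(path))) {
            ps.println("Die Bart, die!");
            return "written";
        } catch (SecurityException e) {
            // A security manager refused the write.
            return "denied by sandbox";
        } catch (IOException e) {
            // Ordinary failure, e.g. the directory does not exist.
            return "I/O error";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryWrite("tmp.txt"));
    }
}
```

Run without a sandbox, this should report "written" for any path you are allowed to write to; the "denied by sandbox" branch only fires on a JVM where a security manager is actually installed.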

Basic resources to protect

We need to protect: files, sockets, the JVM itself, threads, printers, clipboard, event queue, etc.

Class loader

fetch code
create & enforce namespaces ... including no replacing system-level stuff!
disallow mucking with the class loader itself.

Understand namespaces!
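To make the "no replacing system-level stuff" point concrete, here is a small sketch that doesn't need a sandbox at all. LoaderDemo and UntrustedLoader are names invented for this example. It relies on two behaviors of java.lang.ClassLoader: loadClass asks the parent loader first, and defineClass refuses any class name in a "java." package with a SecurityException. As far as I know the name check happens before the bytes are even parsed, which is why the junk byte array is good enough here.

```java
// Illustrative only: LoaderDemo and UntrustedLoader are made-up names.
class LoaderDemo {
    // A loader that exposes the protected defineClass so we can poke at it.
    static class UntrustedLoader extends ClassLoader {
        Class<?> tryDefine(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // A fresh loader still hands back the system's String class,
    // because loadClass delegates to the parent loader first.
    static boolean delegatesToSystem() {
        try {
            return new UntrustedLoader().loadClass("java.lang.String") == String.class;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Attempting to define our own class inside java.lang is refused.
    static boolean rejectsJavaPackage() {
        try {
            new UntrustedLoader().tryDefine("java.lang.Evil", new byte[] {1, 2, 3, 4});
            return false;
        } catch (SecurityException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("delegates to system loader: " + delegatesToSystem());
        System.out.println("rejects java.* definitions: " + rejectsJavaPackage());
    }
}
```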

Byte-code verifier

Built-in theorem prover verifies that the code doesn't forge pointers, circumvent access restrictions, or do illegal casts.

Compiled code is formatted correctly.
No stack overflows or underflows (do you know what that means?). They're not actually talking about the call stack, but rather the "operand stack". The operand stack is like the value stack from the SVM that we manipulated in the last two labs.
No illegal data conversions - at least those that can be detected statically.
Bytecode instructions follow type rules and access privilege rules.
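The operand-stack check from the list above can be sketched on a toy scale. This is not the JVM's verifier: the four-instruction set (PUSH, POP, ADD, DUP), the class name StackDepthCheck, and the maxStack parameter are all assumptions made for illustration, and it handles only straight-line code, whereas a real verifier also has to follow branches. The idea is the same, though: track the stack depth statically, without ever running the program.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: a made-up instruction set, not real JVM bytecode.
class StackDepthCheck {
    enum Op {
        PUSH(0, 1), POP(1, 0), ADD(2, 1), DUP(1, 2);

        final int pops;    // operands the instruction removes from the stack
        final int pushes;  // results the instruction leaves on the stack

        Op(int pops, int pushes) {
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    // Returns true if the straight-line code never underflows the operand
    // stack and never grows it beyond maxStack entries.
    static boolean verify(List<Op> code, int maxStack) {
        int depth = 0;
        for (Op op : code) {
            if (depth < op.pops) {
                return false;               // underflow: not enough operands
            }
            depth = depth - op.pops + op.pushes;
            if (depth > maxStack) {
                return false;               // overflow: exceeds declared limit
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(verify(Arrays.asList(Op.PUSH, Op.PUSH, Op.ADD), 2));  // true
        System.out.println(verify(Arrays.asList(Op.PUSH, Op.ADD), 2));           // false
        System.out.println(verify(Arrays.asList(Op.PUSH, Op.PUSH, Op.PUSH), 2)); // false
    }
}
```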

Run-time checks and the Security Manager / Access Controller

Run-time verification. There are two Java language checks that have to be done dynamically: array bounds checking and casts down the inheritance hierarchy (just like the C++ "dynamic_cast" we talked about last class). These are both crucial to security. Certainly many exploits in various languages have centered around out-of-bounds overwrites.
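Both dynamic checks are easy to see in action. In this small sketch (RuntimeChecks, storeAt, and asInteger are illustrative names), the bad array store and the bad downcast each compile fine, and the JVM stops them at run time by throwing an exception rather than letting the program scribble on memory it doesn't own.

```java
// Illustrative names; the two exceptions are the standard java.lang ones.
class RuntimeChecks {
    // Stores into a 4-element array at the given index and reports the result.
    static String storeAt(int index) {
        int[] a = new int[4];
        try {
            a[index] = 42;
            return "stored";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "ArrayIndexOutOfBoundsException";
        }
    }

    // Attempts a downcast from Object to Integer and reports the result.
    static String asInteger(Object o) {
        try {
            Integer n = (Integer) o;
            return "cast ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(storeAt(2));           // stored
        System.out.println(storeAt(4));           // ArrayIndexOutOfBoundsException
        System.out.println(asInteger(7));         // cast ok
        System.out.println(asInteger("seven"));   // ClassCastException
    }
}
```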

In addition to the above language checks, Java programs that are run in a sandbox (e.g. with the -Djava.security.manager switch) are subject to the scrutiny of the Security Manager. When an operation on a protected resource is requested, the Security Manager (via the Access Controller) determines whether the operation will be allowed. (The Java API implementation contains the code that actually accesses these resources, so that's where the code sits that actually asks permission of the Security Manager.) The interesting thing is that to make this decision, the Security Manager has to check the whole call stack, because the code that actually tries to access the resource is the code that queries the Security Manager, and that's code within the Java API implementation. So it, of course, is trusted code. The Security Manager needs to look down the call stack to see who ultimately called that code, because the API code is being called on behalf of some other class (say to open a file) and the question is whether that other class has the authority to ask for the operation. Only if all the classes on the stack have the requisite permissions will the operation be allowed.
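The "every class on the stack must have the permission" rule can be modeled in a few lines. This is a toy model, not the real Access Controller: the Frame class, the string-valued permissions, and the checkPermission method are all invented for illustration, and the call stack is passed in as a plain list instead of being read from the running JVM.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of stack inspection; none of these types are the real Java API.
class StackInspection {
    // One entry on the (simulated) call stack: a class and what it may do.
    static final class Frame {
        final String className;
        final Set<String> permissions;

        Frame(String className, String... permissions) {
            this.className = className;
            this.permissions = new HashSet<>(Arrays.asList(permissions));
        }
    }

    // Allowed only if every frame on the stack holds the permission.
    static boolean checkPermission(List<Frame> callStack, String permission) {
        for (Frame frame : callStack) {
            if (!frame.permissions.contains(permission)) {
                return false;   // one untrusted caller is enough to deny
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Frame api = new Frame("java.io.FileOutputStream", "write"); // trusted API code
        Frame applet = new Frame("UntrustedApplet");                // no permissions
        Frame local = new Frame("Foo2", "write");                   // trusted local code

        System.out.println(checkPermission(Arrays.asList(api, applet), "write")); // false
        System.out.println(checkPermission(Arrays.asList(api, local), "write"));  // true
    }
}
```

Notice that the trusted API frame is on the stack in both cases. It's the caller underneath it that decides the outcome, which is exactly why checking only the top of the stack wouldn't be enough.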

Homework

Reading (for next class): PLP, Sections 3.7 and 6.6.2

Exercises

No new homework for this lecture. But there will DEFINITELY be at least one exam question on it, so be prepared! The homework from Class 19 will be due on Monday.