There's no shortage of pundits bemoaning the poor security in current operating systems. At the start of the 1970s, Multics had security features that modern systems still don't match (Paul A. Karger, Roger R. Schell, Thirty Years Later: Lessons from the Multics Security Evaluation (IBM, 2002)). The computer systems the commercial world is built upon rarely exceed class "C" in the criteria used to evaluate military computer systems (the "Orange Book"), due to inherent limitations in the security architecture of commodity operating systems.
This saddens me. So for ARGON I've designed what I hope is a good security model, one that should allow installations to achieve class "B" or even "A".
ARGON doesn't have a single named component for security, as security is not the responsibility of any one isolated software component; different aspects of it are handled throughout the kernel components, and this page documents how it all fits together. However, there will clearly be a need for a core library of encryption algorithms, along with tools for operating on classifications and clearances (eg, finding the sum of two, seeing if one is a superset of another, finding out if a given clearance is sufficient for a given classification, and so on) that are needed by the various components; these will go into the ARGON kernel glue code.
Security within the node
User code is run by LITHIUM, by asking an appropriate handler for the type of code. The only handler I'm specifying for now is CHROME, a high-level programming language that compiles to the HYDROGEN code generation interface. HYDROGEN code is able to "do anything"; HYDROGEN provides a single address space, and no way of preventing access to low-level device interfaces. As such, CHROME has to ensure sandbox safety of the compiled code it generates, like a Java virtual machine.
Within CHROME, access to privileged functions is provided in controlled ways by wrapping up privileged capabilities into objects that are injected from outside the sandbox, and can be used by user code within it. There is no way for user code to generate objects with capabilities the code does not already have access to, nor to introspect into objects it is given, to interfere with their operation or obtain their internal state.
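The capability-object pattern described above can be sketched in Python, using a closure as the capability: the privileged operation is minted outside the sandbox and injected in, and user code can invoke it but cannot derive broader authority from it. This is purely illustrative (CHROME is not Python, and Python cannot actually enforce the non-introspection guarantee; CHROME's compiler would). The `make_file_reader` name and file-reading example are invented for this sketch.

```python
# Illustrative sketch of the capability-object pattern, NOT CHROME code.
# A privileged operation is wrapped up outside the sandbox; the wrapped
# object is injected in, and user code can call it but cannot mint new
# capabilities or reach the wrapped internals.

def make_file_reader(allowed_prefix):
    """Mint a read capability restricted to one directory prefix."""
    def read(path):
        # The check lives inside the closure, out of the sandbox's reach.
        if not path.startswith(allowed_prefix):
            raise PermissionError("capability does not cover " + path)
        with open(path) as f:
            return f.read()
    return read  # the closure itself is the capability object

# Outside the sandbox: mint and inject.
read_logs = make_file_reader("/var/log/")

# Inside the sandbox: user code holds no ambient authority; it cannot
# derive a capability for /etc/ from this one.
# read_logs("/var/log/syslog")  -> file contents
# read_logs("/etc/passwd")      -> PermissionError
```

The key property is that authority flows only by explicit injection: there is no global namespace of privileged operations for sandboxed code to reach into.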
Other language handlers might take different approaches, including using hardware memory management to run untrusted code with the ability to manipulate pointers directly.
HELIUM provides resource usage limits for user code, too; handlers are given CPU time and memory limits, and a priority for scheduling access to the CPU and I/O resources, in order to mitigate denial of service attacks through resource starvation.
Security within the cluster
As recommended by the Orange Book, we provide both mandatory and discretionary control for access to information.
Information is assigned a classification, a set of one or more security labels defined cluster-wide by a security administrator (managed by the cluster security entity and stored in the cluster entity). One site might just use labels "Private" and "Public", another might use a more complex hierarchy.
Security labels are connected to each other with a relationship indicating that a label "covers" another; "Private" might cover "Public" because anything trusted to handle "Private" information can also handle "Public". This means that the labels are joined into a directed acyclic graph, but the graph doesn't need to be fully connected. This graph of classification labels is part of the cluster's shared configuration.
Places where information may be stored, processed, or carried from place to place are awarded sets of labels called "clearances", which reflect the maximum secrecy level of information they are trusted to carry and a list of what codewords they are trusted with.
The set of labels in a classification of some data means that anything processing that data needs to be cleared for all the labels in the classification. In other words, adding more labels might reduce the set of things cleared to access it.
The set of labels in a clearance means that the thing with that clearance is cleared to handle any of the labels in the clearance. In other words, adding more labels might increase the set of things it's cleared to access.
To be more precise, something with a given clearance is allowed access to data with a given classification if every label in the classification is either present in the clearance, or is "covered" by a label in the clearance, directly or through a path of multiple "covers" links between labels.
For instance, if the cluster's security label graph is:
- Public
- Company Sensitive (covers Public)
- Customer Private (covers Public)
- Customer Payment Details (covers Customer Private, and therefore Public indirectly)
...and a node is cleared for Customer Payment Details, it will be able to process volumes classified as Public, Customer Private, and Customer Payment Details (or any combination of the above), but not volumes classified as Company Sensitive, even when in combination with other labels such as Public.
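The clearance check and the worked example above can be expressed as a reachability query over the "covers" graph. This is a hypothetical sketch (the data structures and function names are invented, not ARGON APIs):

```python
# The cluster's label graph from the example above, as a mapping from
# each label to the set of labels it directly "covers".
COVERS = {
    "Company Sensitive": {"Public"},
    "Customer Private": {"Public"},
    "Customer Payment Details": {"Customer Private"},
}

def covered_labels(label):
    """All labels reachable from `label` via 'covers' links, inclusive."""
    seen = {label}
    stack = [label]
    while stack:
        for child in COVERS.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def cleared_for(clearance, classification):
    """True if every label in the classification is present in, or
    covered (directly or transitively) by, a label in the clearance."""
    reachable = set()
    for label in clearance:
        reachable |= covered_labels(label)
    return classification <= reachable

# The node from the example, cleared for Customer Payment Details:
node = {"Customer Payment Details"}
cleared_for(node, {"Public", "Customer Private"})   # True
cleared_for(node, {"Company Sensitive", "Public"})  # False
```

Note how adding "Company Sensitive" to a classification denies this node access even though it is cleared for every other label present, matching the rule that extra labels only ever narrow the set of cleared things.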
As well as the storage of volumes, mandatory access control is used to drive the encryption of data in transit between nodes, so that communications links are not given access to data they are not cleared to view.
The cluster has a set of available encryption algorithms, each with a clearance assigned; it is assumed that classified information can be carried over untrusted links if it is encrypted with an algorithm whose clearance rates it for that classification. The cluster configuration also stores a list of "communication groups" of nodes that are connected by particularly featureful network links. The security configuration of the cluster can assign clearances to these groups, allowing encryption to be foregone for classified information that is communicated purely within those groups. This is intended to allow dropping the encryption overhead between machines located entirely within a secure facility, where network traffic between them cannot leave that facility.
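The transit decision above might look something like the following sketch. The group and algorithm structures are invented for illustration, and clearance comparison is simplified to a plain subset test (the real check would go through the "covers" graph as described earlier):

```python
# Hypothetical sketch of the in-cluster transit rule: skip encryption
# only when sender and receiver share a communication group whose
# clearance covers the message's classification; otherwise pick an
# encryption algorithm cleared for it.

GROUPS = {
    "machine-room": {"nodes": {"node-a", "node-b"},
                     "clearance": {"Private", "Public"}},
}

def transit_protection(src, dst, classification, algorithms):
    """`algorithms` maps algorithm name -> clearance label set.
    Returns None (no encryption needed) or an algorithm name."""
    for group in GROUPS.values():
        if {src, dst} <= group["nodes"] and classification <= group["clearance"]:
            return None  # physically protected link inside the facility
    for name, clearance in algorithms.items():
        if classification <= clearance:
            return name
    raise RuntimeError("no algorithm cleared for this classification")

algos = {"aes256": {"Private", "Public"}}
transit_protection("node-a", "node-b", {"Private"}, algos)  # None
transit_protection("node-a", "node-c", {"Private"}, algos)  # "aes256"
```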
The classification level of a message is specified by its sender. WOLFRAM messages replicating TUNGSTEN data are classified according to the classification of the volume the entity is from.
Where these checks happen
Every volume has a set of security labels, which is the classification of the information in the volume, and also the clearance the volume has to store information.
The cluster security volume's classification is the set of security labels available in the cluster - no other volume can be more highly classified.
Every node has a clearance, which is also the classification of the node's volume.
Every entity has a classification, which is also its clearance. The volume storing the entity must be cleared for the classification of the entity.
OPEN QUESTION: How is the classification of a newly created entity assigned?
WOLFRAM will not permit sending a message to a node that is not cleared to receive it.
WOLFRAM will permit sending messages over communications links not cleared to read them, but will use a level of encryption that is trusted to that clearance level to protect the messages in transit.
Nodes may have TUNGSTEN persistent storage devices attached. Those devices are configured with a set of security labels (defaulting to the clearance of the node) that functions as the clearance the device has to store information, and which must be a subset of the clearance of the node. For any storage volumes with a lesser clearance than the node itself, TUNGSTEN will use a level of encryption trusted to the clearance level of the node to store information. For storage volumes with the same clearance as the node, information will be stored unencrypted.
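The storage rule reduces to a simple decision. A minimal sketch, with invented names and clearance comparison simplified to label-set comparison rather than the full "covers"-graph check:

```python
# Hedged sketch of the TUNGSTEN storage-encryption rule: a device whose
# clearance is strictly below the node's is only trusted with ciphertext.

def storage_policy(node_clearance, device_clearance):
    if not device_clearance <= node_clearance:
        raise ValueError("device clearance must not exceed the node's")
    if device_clearance < node_clearance:
        # Strictly lower clearance: encrypt at the node's clearance level.
        return "encrypt-at-node-clearance"
    return "store-plaintext"

storage_policy({"Private", "Public"}, {"Public"})             # encrypt
storage_policy({"Private", "Public"}, {"Private", "Public"})  # plaintext
```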
Intra-cluster MERCURY will not permit sending a message to an entity that is not cleared to receive it. Intra-cluster MERCURY will send messages classified as per the classification of the volume containing the recipient entity, which is known in the cluster configuration.
The sender of a MERCURY request may request a higher classification if it desires, and the recipient may demand a minimum classification for a particular endpoint (rejecting incoming requests without sufficient classification); either may cause the message security level to be raised above the default, but it can never be lowered.
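Since adding labels to a classification only ever restricts who may handle it, the "raise but never lower" rule amounts to taking the union of the defaults and demands. A small sketch with invented names:

```python
# Sketch of the effective classification of an intra-cluster MERCURY
# message: the union of the recipient volume's default, any higher
# classification the sender requested, and any minimum the endpoint
# demands. Unions can only raise the level, never lower it.

def effective_classification(volume_default, sender_request=frozenset(),
                             endpoint_demand=frozenset()):
    return set(volume_default) | set(sender_request) | set(endpoint_demand)

effective_classification({"Customer Private"},
                         endpoint_demand={"Customer Payment Details"})
# -> the union of both labels: the endpoint's demand raised the default
```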
Security between clusters
Security between clusters is more interesting. Any cluster may contact any other over the public Internet via MERCURY, packet filters permitting.
Every volume has a public keypair, the private key of which is known to every node trusted to store data or perform computation for that volume, and the public key of which is part of the public entity ID of every entity within the volume.
Discretionary access control (and other trust decisions) between clusters are based upon the entities receiving the messages. However, we only actually authenticate the originating volume, and when ensuring that our messages cannot be snooped, we merely ensure that they reach the correct volume. This is because we cannot trust an entity any more than we can trust the nodes in the volume that hosts it. If we trust an entity with some information, we cannot really tell if the node hosting that entity is really sending the information to that entity or some other, so there is no point in authenticating at a finer granularity than the volume.
Every cluster's configuration maintains its own mapping of classifications to inter-cluster communications protection algorithms. The function of such an algorithm is, given the source cluster's public key pair and just the target cluster's public key, to convey sequences of bytes (raw MERCURY/CARBON messages) over a lower-level IRIDIUM transport, while ensuring that only the target volume's nodes can recover the message, that no men-in-the-middle can recover or alter the message without detection, and that the target node can check that the source node sent the message.
Of course, how well an algorithm does this job varies wildly. There is a set of algorithms that only sign the data in transit, sending the bytes as-is with no encryption whatsoever. This is fast, so it might be used for 'public'-classified communications.
On the other hand, a better algorithm might open an IRIDIUM virtual circuit and negotiate a session key, signing requests so that each end can check the identity of the other, then proceed to exchange messages using a modern block cipher with the session key, with frequent re-negotiation of said session key.
MERCURY, assuming that algorithms attempt to be fast in the general case by doing session-key negotiation at VC setup and shutdown, will attempt to cache already-negotiated algorithm channels between nodes and reuse them for more than one communication operation, where no security concerns with doing so exist.
Now, if a node is attempting to send a message to a node in another cluster, it will consider the classification of the message, and will attempt to use the algorithm its cluster is configured to use for that classification. Classification hierarchies are unique to each cluster, so when the message (or initial request to set up shared session keys, etc) is received by the destination, it applies its own mapping from classifications to algorithms to decide which classification this algorithm represents (from its perspective), and tags the incoming message with it.
If the destination cluster does not recognise the algorithm or finds it insufficient for the endpoint, it replies with a rejection, specifying the list of algorithms it would consider sufficiently trusted for the transit classification of the target MERCURY/CARBON endpoint. The sending node then finds an algorithm it will trust with the message (eg, an algorithm associated with the message's classification or higher) that is in the list, and uses that. If there is none, then it must sadly fail!
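The rejection-and-fallback flow above can be sketched as follows; algorithm names and the shape of the rejection are invented for illustration:

```python
# Hedged sketch of inter-cluster algorithm negotiation: try the sender's
# preferred algorithm; if the recipient rejects it with a list of
# algorithms it would accept, pick one the sender also trusts for the
# message's classification, or fail.

def choose_algorithm(preferred, trusted_for_classification, rejection):
    """`rejection` is the recipient's list of acceptable algorithms,
    or None if the preferred algorithm was accepted outright."""
    if rejection is None:
        return preferred
    for algorithm in rejection:
        if algorithm in trusted_for_classification:
            return algorithm
    raise RuntimeError("no mutually trusted algorithm; delivery fails")

trusted = {"aes256-sig", "chacha20-sig"}
choose_algorithm("aes256-sig", trusted, None)
# -> "aes256-sig" (accepted outright)
choose_algorithm("aes256-sig", trusted, ["rc4-sig", "chacha20-sig"])
# -> "chacha20-sig" (first acceptable algorithm the sender also trusts)
```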
The minimum required clearance for any given endpoint is listed in the MERCURY metadata in the $MERCURY slice inside an entity, and is also subject to a volume-wide configured minimum set of security labels added to every endpoint's clearance. When that metadata is exposed via CARBON, or embedded in an entity ID, it is converted to the sets of encryption algorithms the receiving cluster would consider sufficient for those classifications, so that other clusters can find a suitable algorithm for their request up front and avoid having to retry with a better one.
TODO: Clarify where all these classifications and clearances are stored
Classifications and clearances are cluster-specific as each cluster may have its own set of security labels.
Generally, they should only be exposed at all to members of some kind of "security administrator" group. They need to be stored in cluster/volume configuration somewhere, and interfaces to access them provided via an endpoint that can be ACLed for security administrators to get at.
What interface should be provided to entity code to understand whether some action is permissible in advance of trying it? How should failure to meet mandatory access control restrictions be communicated back in error messages? Do we reveal the security labels involved and explain the mismatch, or just give the contact details of the security administrator(s) responsible?
OPEN QUESTION: Write up / Read down
Under the current proposal, there is no minimum classification level for the MERCURY messages an entity sends, so an entity full of classified information is free to leak it with impunity.
We should define that all messages sent by an entity must be classified to at least that entity's clearance, so that entities can "write up" (send messages to higher-cleared entities) and "read down" (receive messages from lower-cleared entities).
But we also need to define a framework for security administrators to allow entities limited means to send lower-classified messages (be they outgoing messages/requests, or replies to incoming requests) where the entity is trusted to do so.
Perhaps give entities an optionally overridable "send classification"?
OPEN QUESTION: Mandatory access control on top of MERCURY/CARBON ACLs
The mandatory access control mechanisms described above only protect the state within an entity, and the content of network messages involved in the MERCURY/CARBON protocol between entities.
However, there is also a potential need for mandatory access control alongside the discretionary access control provided by ACLs (as discussed in the Administrator view page).
This could be handled for intra-cluster communications by providing some configuration (controlled at a cluster-wide or volume-wide level, not as part of the entity's state) to put "access classification" levels on MERCURY interfaces and CARBON slices.
Access to the MERCURY interfaces or CARBON slices would then only be granted if the calling entity's clearance was high enough for the classification of the endpoint.
How to do that for inter-cluster communications is harder to define; perhaps the cluster configuration may contain a set of ACLs that provide access to given clearances for external callers? Or can a trust bridge between clusters or volumes be defined, containing a bilateral agreement to use an agreed mapping between the clearance labels on each side, which are then transmitted along with requests between them?
TODO: Clarify access / storage / transit classifications
I've vaguely wandered between access classifications ("What clearance do I need to access this MERCURY interface/endpoint?"), storage classifications ("What nodes may store, or process, state for this entity?") and transit classifications ("What level of encryption is needed to carry this information on this link?") in the discussions above. Break it up better: explain the meanings of security labels, classifications, and clearances. Discuss the security configuration. Discuss how that security configuration is applied for access / storage / transit.
Certificates
It is possible for an entity to act on behalf of another entity for a while. For example, when somebody logs onto a desktop computer, they (in effect) tell that computer what their user agent entity is, then enter a password so that the computer can demonstrate to their user agent that it's really them. The agent entity then sends the computer their favourite user-interface software and settings; but as they browse CARBON from the computer, the entities they interact with must see the actions as coming from their agent, not from the host entity of the computer they're on, or else those entities will be unable to make useful access-control decisions.
This could be done by proxying all the user's activities through their user agent, but that would not be very efficient. Instead, the user interface software keeps a connection to the agent open for the duration of the session, and every minute the agent sends it a certificate (signed by the volume as coming from that entity), stating that the user interface's entity is allowed to act as it for the next two minutes.
This certificate is then sent along with every MERCURY or CARBON message issued on the user's behalf by the user interface (or early on in every virtual circuit, and then left out thereafter). The messages are still signed by the algorithm chosen for communications between the user interface node and the target node, but with the certificate wrapped within. The recipient, upon seeing the certificate, checks that the entity the certificate authorises to act on another's behalf is the same entity that's originating the request, checks that the certificate is not out of date, and then considers the request to have come from the entity that issued the certificate for access control purposes (with the identity of the intermediate entity still kept for auditing purposes). It's possible for a request to contain an ordered list of certificates: if the first certificate in the chain authorises the sender of the request A to act as principal B, and the second certificate authorises principal B to act as principal C, then the request is deemed to come from principal C for all purposes except audit logging (which records the full chain A/B/C of principals and the unique IDs of the certificates).
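The chain-walking check can be sketched as follows. The `Certificate` fields and names are placeholders for whatever ARGON's volume-key-signed certificates actually contain, and signature verification is elided:

```python
import time
from dataclasses import dataclass

# Illustrative model of the delegation-certificate check described
# above. Signature verification against the issuing volume's key is
# assumed to have happened already and is omitted here.

@dataclass
class Certificate:
    grantee: str      # entity allowed to act...
    principal: str    # ...as this principal
    expires: float    # validity deadline (epoch seconds)

def effective_principal(sender, chain, now=None):
    """Walk the chain from the request's sender; return the principal
    the request is deemed to come from, plus the full audit trail."""
    now = time.time() if now is None else now
    current, audit = sender, [sender]
    for cert in chain:
        if cert.expires < now:
            raise ValueError("certificate expired")
        if cert.grantee != current:
            raise ValueError("chain broken at " + cert.grantee)
        current = cert.principal
        audit.append(current)
    return current, audit

# A -> B -> C chain: the UI device acts as the user agent, which in
# turn acts as a persona. (Expiry times are far-future placeholders.)
chain = [Certificate("ui-entity", "user-agent", 9e18),
         Certificate("user-agent", "persona", 9e18)]
effective_principal("ui-entity", chain)
# -> ("persona", ["ui-entity", "user-agent", "persona"])
```

Access control then uses the returned principal, while audit logging records the whole trail.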
The four different discretionary access control mechanisms
In general, there are four different core access control mechanisms:
- Entity based: Access based on the originating entity, where a request is signed by the originating entity's volume key and doesn't use a certificate or capability, and access is granted based on an ACL entry associated with the entity ID.
- Pseudonym based: Access based on a pseudonym, where a request is signed by a pseudonym's key and doesn't use a certificate or capability, and access is granted based on an ACL entry associated with the pseudonym's public key (included in the request).
- Delegated: Access based on a certificate chain, where a request is signed by either the originating entity's volume key or a pseudonym's key and carries a chain of one or more certificates delegating authority from an originating principal, through a chain of other principals, down to the principal signing the request; and access is granted based on an ACL entry associated with the originating principal.
- Capability based: Access based on a capability, where a request is signed by either the originating entity's volume key or a pseudonym's key and may carry delegated authority via a chain of certificates (which now matters only for audit purposes), and access is granted based on the signed and encrypted capability embedded in the request (obtained from the EID, which was minted with an embedded capability).
Each is useful for different things.
Entity based access is the "default": what happens when one obtains a non-capability EID and accesses it without supplying any certificates or pseudonyms. So it's the easiest to use and think about. It also inherently identifies the originator of the request with an entity, which might then be contacted in return: it has a "reply address".
Pseudonyms are useful as they can create an identity that can be shared between multiple entities, and don't have a "reply address" entity. These can be useful for representing a role. A human user might create a suite of pseudonyms and ask their user agent to identify using different pseudonyms as they switch between different personal roles, for instance. If they have multiple user agents, they might share some pseudonyms between some of them by just copying the keypairs over.
Delegated access is useful in the case of a user interface device. The user's user agent, once the user has authenticated through the device, can grant the device a time-limited certificate to act as the user's user agent entity, or as a persona of the user, without needing to reveal the persona's keypair to the device, nor to proxy all requests made on behalf of the user through the user agent. Other uses may appear, too.
Capabilities are useful for situations where access rights need to be communicable as data (like with a pseudonym, which can be shared by giving out the keypair) but without requiring the accessed entity to maintain corresponding individual ACL entries. A capability EID is just an EID that can be passed around and used to gain those capabilities, with the user being oblivious to it being a capability EID. Given that an entity accessing another entity can attach a capability to its own EID used as the originating principal, it's possible to access a remote entity in such a way as to grant it special access to "call back" to you, without needing to keep a list of all the entities you have contacted. This might be a useful option for a user agent, for instance granting entities you use the ability to send you messages back, without opening message-sending ability to the world in general. You might also voluntarily send your personal user agent EID to third parties (as a QR code on a business card, in a message, etc) with a capability granting them chosen rights to contact you. Capabilities let you do that "blindly", as opposed to needing to obtain the EID of the recipient to add them to your ACL.
Either way, the MERCURY security model is able to meet a wide array of use cases by granting access in different ways. To prevent this being confusing - and confusion in security systems generally leads to the worst kinds of trouble - we need to make sure that appropriate levels of complexity are exposed in different contexts. The explanation of the system here has covered the entire system as a whole, but different views of it will exist:
- Entity code developers will largely just access MERCURY using the EIDs given to them from whatever source (not caring if it's a capability EID or not), using the entity's own identity rather than a certificate or pseudonym.
- Situations where a certificate or pseudonym are used will arise during the design of the system, so entity code will "know" when it needs to adopt such an alternate identity and can take appropriate steps.
- Services that need a certificate to be provided in order to operate on behalf of the caller (eg, when user agents delegate to a user interface) will have a requirement for such a certificate to be provided, so the caller will know to generate one and provide it.
- Entity code that wants to mint capability EIDs will do so when they want to, and use those EIDs in the appropriate contexts. They'll declare the capability labels they use in the $MERCURY slice (or the appropriate slice for the persona schema, if in use).
- Security admins editing an ACL can add EIDs and pseudonyms they obtain from any source to ACLs, or can pick from capability labels declared alongside the ACL, to use as the principals in the ACL.