How we store and access information is as important as the kind of information we store. Take the case of Deep Blue vs. Kasparov. Much of the "intelligence" that allowed the IBM computer to beat the world's premier chess grandmaster came from Deep Blue's ability to store and quickly access vast quantities of chess moves, and compare their relative merit at extremely high speeds. A more everyday example of this principle is your annual pile of dead tree matter called the phone book. It contains a ton of information, but it's structured in a relatively simple manner, searchable by business, product, or name. A clear and logical organization makes information more accessible, whether you're dialing for pizza or evaluating chess strategies with a supercomputer.
Unfortunately for those of us without a Deep Blue at our disposal, most operating system vendors traditionally have been lukewarm on the idea of a directory service. It took the hard, groundbreaking work of vendors and organizations like ICL Ltd., Banyan Systems, and the University of Michigan to prove that directory services can play an important role in collating system information across a network.
Better late than never
Microsoft has finally joined the fray, introducing a rival to Novell's NDS. In reviewing a future direction for its existing NetBIOS-based domain system, Microsoft decided to create its own network directory service called Active Directory (AD). This new system will play a central role in the next-generation Windows 2000 platform, but the APIs for it have already been released (in part) for existing Windows NT systems. This availability has allowed software developers to start working on software that uses AD and runs on existing platforms, though such software will still require an AD server running on Windows 2000.
Active Directory borrows very heavily from the model created in the Lightweight Directory Access Protocol (LDAP) system. It then goes beyond that protocol by creating a superlayer that also works with other non-AD directory services, such as NetWare and Windows NT 4.0 domains. However, the core structure of the system mirrors LDAP, and all accesses to AD use the LDAP protocol. So, to understand more about AD, you should first take a look at how LDAP works.
AD and LDAP
The Active Directory API, known as the Active Directory Services Interface (ADSI), allows applications to access the AD system using COM+ objects, direct language-dependent function calls, or scripting interfaces. System administrators do not need an exhaustive knowledge of how ADSI works. For a quick illustration of how AD is integrated with multiple languages, object models, and directory services, take a look at Figure 1.
Figure 1. The Active Directory Services Interface
Within the operating system is an ADSI router that handles the function call. The call is then forwarded to an appropriate directory service provider. The AD service falls under the LDAP servers section in this figure, while those for other directory services are routed through their appropriate service provider.
Once in contact with the LDAP services, the ADSI router sends the LDAP messages to the appropriate server for processing. LDAP has nine basic messages or function calls, most of which are self-explanatory: add, delete, modify, rename, search, compare, bind, unbind, and abandon. These messages are all that is needed to query or change the state of the LDAP directory.
An LDAP communication begins with a bind message from a client to a server. This both authenticates the client and tells the server that a new LDAP session has begun. The client can then send an add, delete, modify, rename, search, or compare message to retrieve existing information or change its state. Once the client is done with the session, it sends an unbind message to close the connection. If the client needs to interrupt the server after a message has been sent, it can send the abandon message, whereupon the server stops working on that outstanding request. The session itself stays open until the client unbinds.
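The bind, operate, unbind sequence can be sketched as a toy in-memory model in Python. This is not a real LDAP client and it does not use ADSI; the class ToyDirectorySession, its method names, the credentials, and the printer entry are all invented for illustration.

```python
class ToyDirectorySession:
    """Toy model of an LDAP-style session; not a real LDAP client."""

    def __init__(self, store):
        self.store = store      # dict: DN string -> dict of properties
        self.bound = False

    def bind(self, user, password):
        # A real server would authenticate here; the toy accepts any non-empty pair.
        if not user or not password:
            raise PermissionError("bind rejected")
        self.bound = True

    def _require_bind(self):
        if not self.bound:
            raise RuntimeError("no session: send bind first")

    def add(self, dn, properties):
        self._require_bind()
        if dn in self.store:
            raise KeyError(f"entry already exists: {dn}")
        self.store[dn] = dict(properties)

    def modify(self, dn, changes):
        self._require_bind()
        self.store[dn].update(changes)

    def compare(self, dn, prop, value):
        self._require_bind()
        return self.store[dn].get(prop) == value

    def search(self, base):
        # Simplification: a plain prefix match stands in for a real search filter.
        self._require_bind()
        return sorted(dn for dn in self.store if dn.startswith(base))

    def delete(self, dn):
        self._require_bind()
        del self.store[dn]

    def unbind(self):
        self.bound = False


session = ToyDirectorySession({})
session.bind("admin", "secret")                       # hypothetical credentials
printer = "/O=US/DC=COM/DC=WPI/CN=Printer1"           # hypothetical entry
session.add(printer, {"location": "2nd floor"})
session.modify(printer, {"location": "3rd floor"})
matches = session.search("/O=US/DC=COM/DC=WPI")
same = session.compare(printer, "location", "3rd floor")
session.unbind()
```

Every operation other than bind checks that a session is open first, which mirrors the rule that a conversation starts with bind and ends with unbind.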
The latest version of LDAP, version 3, also allows the introduction of extended messages or new message types. Microsoft takes advantage of this addition by including better authentication and security that is tied to the Windows 2000 security system. Traditionally, LDAP messages and data were sent in the clear over networks, increasing the possibility that they might be intercepted and modified by a hostile intruder. With AD, these messages can be encrypted using the built-in cryptography services in Windows 2000. This does mean, however, that older Windows 9x and NT systems are still vulnerable.
Seeing the forest through the trees
Active Directory entries are called objects and can be of any number of different types: a user account, a group of users, files, printers, a group of computers, or an entire division network. Each object has a set of information that describes its state, known as its properties. Objects of the same type have similar properties. A network printer object, for example, has information on the jobs in the print queue, the type of printer languages it supports, the network and printing protocols it supports, etc. A container object represents a collection of other objects of the same type, such as a group of computers or a printer pool.
In order to define their role on the network, objects are hierarchically organized in a single tree format known as the directory information tree (DIT) (see Figure 2). All objects are unique in this structure; even a copy of an object is considered a separate, unique object with all the same values for its properties as the original. To maintain object uniqueness, every object has a name. This name differentiates it from all other objects and is defined by its position in the tree. The distinguished name (DN) is the full identity of a unique object, as defined in the overall hierarchy.
Figure 2. An example directory information tree
Some DNs in Figure 2 include:
The forward slash (/) separators indicate nodes within the tree hierarchy that might branch out to other groups of objects. To avoid the repetition of these fairly long names, you can also use a relative distinguished name (RDN), which refers to other objects relative to a node in the tree. For example, from the node /O=US/DC=COM/DC=WPI, there could be RDNs to individual editors such as /OU=WindowsTechEdge/OU=Editors/CN=Tom Young, or /OU=SunWorld/OU=Editors/CN=Carolyn Wong. RDNs become useful when you are only looking at a subsection of the overall hierarchy.
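The relationship between a full name and a relative one can be sketched in a few lines of Python. The helper names below (split_dn, join_dn, relative_name, absolute_name) are hypothetical, and the sketch assumes the slash-separated notation used in this article, with every component written as KEY=value.

```python
def split_dn(dn):
    """Split a slash-separated name such as '/O=US/DC=COM' into (key, value) pairs."""
    parts = [p for p in dn.split("/") if p]
    return [tuple(p.split("=", 1)) for p in parts]


def join_dn(components):
    """Rebuild a slash-separated name from (key, value) pairs."""
    return "".join(f"/{key}={value}" for key, value in components)


def relative_name(dn, base):
    """Return dn relative to base, or raise ValueError if dn is not under base."""
    dn_parts, base_parts = split_dn(dn), split_dn(base)
    if dn_parts[:len(base_parts)] != base_parts:
        raise ValueError(f"{dn} is not under {base}")
    return join_dn(dn_parts[len(base_parts):])


def absolute_name(base, relative):
    """Append a relative name to the node it is relative to."""
    return join_dn(split_dn(base) + split_dn(relative))


base = "/O=US/DC=COM/DC=WPI"
full = absolute_name(base, "/OU=SunWorld/OU=Editors/CN=Carolyn Wong")
short = relative_name(full, base)
```

Here full is the complete name starting at /O=US, and short is the same entry as seen from the /O=US/DC=COM/DC=WPI node.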
Although there is a defined tree structure for global systems like the Internet and its domains, there isn't a single global definition of what a tree should look like for every organization. The AD system is flexible enough that you can define your own vision of the objects on your network. This definition, called a schema, shows how objects are grouped together within the tree. The schema also defines the structure of individual objects in the directory.
The namespace is the set of names within a portion of the hierarchy, or the entire hierarchy itself. The scope of a namespace defines the limits of that namespace and all the names that can be seen under it. For example, within the scope of /OU=Contributors there are various individuals including /CN=Rawn Shah and /CN=Brooks Talley. Within the scope of /O=Internet/DC=Com are all the Internet domains that end in .com, along with all the objects that belong to these domains.
The DN of any object can change if the object is moved to another location. To maintain an identity for its entire lifetime, irrespective of its location in the AD tree, each object has a globally unique identifier (GUID). This 128-bit value is assigned when the object is created, and can be used to look up the object the same way a DN can.
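A minimal Python sketch of the idea: an entry is indexed both by its current name and by an identifier that never changes. The ToyDirectory class and the "title" property are invented for illustration, and Python's uuid4 is only a stand-in, since how AD actually generates its identifiers is not covered here.

```python
import uuid


class ToyDirectory:
    """Toy store that indexes entries by DN and by a never-changing GUID."""

    def __init__(self):
        self.by_dn = {}     # DN string -> GUID
        self.by_guid = {}   # GUID -> {"dn": ..., "properties": ...}

    def create(self, dn, properties):
        guid = uuid.uuid4()          # assigned once, at creation time
        self.by_dn[dn] = guid
        self.by_guid[guid] = {"dn": dn, "properties": dict(properties)}
        return guid

    def move(self, old_dn, new_dn):
        guid = self.by_dn.pop(old_dn)
        self.by_dn[new_dn] = guid
        self.by_guid[guid]["dn"] = new_dn   # the DN changes, the GUID does not

    def lookup_by_dn(self, dn):
        return self.by_guid[self.by_dn[dn]]

    def lookup_by_guid(self, guid):
        return self.by_guid[guid]


directory = ToyDirectory()
old_dn = "/O=US/DC=COM/DC=WPI/OU=SunWorld/OU=Editors/CN=Carolyn Wong"
new_dn = "/O=US/DC=COM/DC=WPI/OU=WindowsTechEdge/OU=Editors/CN=Carolyn Wong"
guid = directory.create(old_dn, {"title": "Editor"})   # hypothetical property
directory.move(old_dn, new_dn)
entry = directory.lookup_by_guid(guid)
```

After the move, the old name no longer resolves, but the identifier handed out at creation still finds the same entry under its new name.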
Because the hierarchy can get pretty large on a single server, AD allows you to break it up into smaller subsets known as naming contexts. Each of these contexts can be located on a separate server and can refer to an entire domain or a subset of it. Each naming context has a root directory service entry (rootDSE) that contains a description of the structure of its tree. The rootDSE is used by the directory service system to define what is contained in its subset, how the information is replicated, and which directory service protocols are supported by that server.
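One way to picture naming contexts is as a table that maps the root of each subtree to the server holding it. The Python sketch below picks the most specific context for a given name. The function find_naming_context and the server names are hypothetical, and AD's real referral mechanism is not described in this article.

```python
def find_naming_context(dn, contexts):
    """Return (context_root, server) for the most specific context containing dn."""
    best = None
    for root, server in contexts.items():
        # A context contains dn if dn equals the root or continues past it with '/'.
        if dn == root or dn.startswith(root + "/"):
            if best is None or len(root) > len(best[0]):
                best = (root, server)
    if best is None:
        raise LookupError(f"no naming context holds {dn}")
    return best


# Hypothetical layout: the SunWorld subtree lives on its own server.
contexts = {
    "/O=US/DC=COM/DC=WPI": "server-a",
    "/O=US/DC=COM/DC=WPI/OU=SunWorld": "server-b",
}
carolyn = find_naming_context(
    "/O=US/DC=COM/DC=WPI/OU=SunWorld/OU=Editors/CN=Carolyn Wong", contexts)
tom = find_naming_context(
    "/O=US/DC=COM/DC=WPI/OU=WindowsTechEdge/OU=Editors/CN=Tom Young", contexts)
```

The longest matching root wins, so a subtree that has been split off to its own server takes precedence over the larger context that encloses it.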
An AD domain is a single area of security containing any number of AD objects. Windows NT 4.0 domains can be represented within an overall AD tree as organizational units (OUs) and maintain user accounts, policies, and security rights within these units. Each domain has a security policy defining how objects both within and without the domain can interact. An AD domain can be the equivalent of multiple NT domains and most closely resembles the master domain model. The hierarchical structure of a single AD domain is known as a domain tree.
Just as there are currently multiple master NT 4.0 domains, it is possible to build separate trees for different divisions or locations of your company. Multiple domain trees can be connected together into a forest, allowing administrators of large organizations to have separate systems for each location and still be able to create a single view of all objects belonging to the company.
AD servers can hold entire domain trees or subsets of them. A single AD site contains a number of AD servers as part of the same domain. Defining sites for AD makes it easier for administrators to set up replication between servers and to configure systems by physical location. AD replaces the Windows NT 4.0 domain primary and backup domain controller model with replication and caching servers. All noncaching controllers are now of equal status. Client machines look for the local site first or the first server that responds to their query, rather than locating and waiting for the primary domain controller.
Faster searches, better security
The AD system has a component known as the directory service agent (DSA), which is responsible for managing the storage and access of local AD services, as well as communicating with AD servers on other machines. This component is directly built into the local security authority subsystem of Windows 2000, the system responsible for directing all user application requests for security checks and clearance. When the ADSI router needs to contact AD, it contacts the DSA through its representative service provider. AD servers pass messages through their DSAs either to propagate changes, as one would during a replication or update, or to pass a query to other subsets of the directory tree.
Searching across a large DIT can be quite time consuming, especially if the information is spread across multiple servers. To enhance performance, each AD server contains a global catalog, an index of all names in the DIT. The names within this catalog do not contain the entire object itself, but may contain some of the key properties associated with the object. The AD server system automatically builds this catalog on each server. In essence, the global catalog is a sort of cache of the tree so that simple accesses, such as locating the name of a set of LDAP entries, can be quickly verified. The catalog also defines which servers contain the real information for the object. A query for the full details of the object, on the other hand, will still go through the process of searching the complete tree and retrieving the data from the appropriate server.
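The catalog idea can be sketched in Python as an index that keeps, for every name, the owning server and a few key properties. The function names, server names, and property values below are all made up for illustration; which properties AD really copies into its catalog is not specified here.

```python
def build_catalog(servers, key_properties):
    """Build a name index: DN -> {"server": ..., "summary": {...}}."""
    catalog = {}
    for server_name, entries in servers.items():
        for dn, properties in entries.items():
            # Keep only the chosen key properties, not the whole object.
            summary = {k: v for k, v in properties.items() if k in key_properties}
            catalog[dn] = {"server": server_name, "summary": summary}
    return catalog


def fetch_full_entry(dn, catalog, servers):
    """Use the catalog to find the owning server, then read the full entry there."""
    owner = catalog[dn]["server"]
    return servers[owner][dn]


# Hypothetical data spread across two servers.
servers = {
    "server-a": {
        "/OU=WindowsTechEdge/OU=Editors/CN=Tom Young":
            {"title": "Editor", "phone": "555-0100", "office": "12B"},
    },
    "server-b": {
        "/OU=SunWorld/OU=Editors/CN=Carolyn Wong":
            {"title": "Editor", "phone": "555-0101", "office": "14A"},
    },
}
catalog = build_catalog(servers, key_properties={"title"})
quick = catalog["/OU=SunWorld/OU=Editors/CN=Carolyn Wong"]
full = fetch_full_entry("/OU=SunWorld/OU=Editors/CN=Carolyn Wong", catalog, servers)
```

A lookup that only needs the name or a key property is answered from the catalog alone, while a request for the full entry is still sent to the server that holds the real data.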
AD is strongly tied to NT's security model. Although LDAP itself does not specify how security should work on an individual object basis, AD implements very strong security on every object, down to having access control lists for individual properties of an object. Any query is identified by both its originating user object and its access token. This token contains details on each user's access rights and group memberships. These user rights are compared against those allowed by the AD server in the security manager component of the operating system kernel.
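Property-level access control can be illustrated with a small Python check. This is a simplified sketch that models only "allow" entries; the token layout, the can_access function, and the user, group, and property names are assumptions for illustration and are not the Windows security API.

```python
def can_access(token, acl, prop, right):
    """Return True if the user or any of the user's groups holds `right` on `prop`."""
    principals = {token["user"], *token["groups"]}
    allowed = acl.get(prop, {})      # principal -> set of rights for this property
    return any(right in allowed.get(p, set()) for p in principals)


# Hypothetical access token and per-property access control list for one object.
token = {"user": "cwong", "groups": ["Editors"]}
acl = {
    "phone": {"Editors": {"read"}, "cwong": {"read", "write"}},
    "salary": {"HR": {"read", "write"}},
}

may_edit_phone = can_access(token, acl, "phone", "write")
may_read_salary = can_access(token, acl, "salary", "read")
```

The check combines the user's own identity with the group memberships carried in the token, and it is made separately for each property rather than once for the whole object.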
AD supports replication across multiple servers and sites. A domain tree might have multiple naming contexts, some of which may be stored on different servers, or even different sites, for the sake of performance. It therefore makes sense to build a replication service directly into the directory service. AD offers multimaster replication. That means each copy of a naming context can be modified by client applications on its own server, regardless of the contents of other servers and naming contexts. The replication system keeps track of these changes and propagates them across the various copies of the naming contexts. This replication process is automatic and is coordinated with the help of update sequence numbers (USNs).
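The role of update sequence numbers can be sketched with a toy Python model in which every replica numbers its own changes and remembers the highest number it has already pulled from each partner. The Replica class is invented for illustration, and the conflict rule (higher per-key version wins, ties broken by origin name) is an assumption of this sketch, not a description of how AD resolves conflicts.

```python
class Replica:
    """Toy multimaster replica; each recorded change gets the next local USN."""

    def __init__(self, name):
        self.name = name
        self.usn = 0             # local update sequence number
        self.data = {}           # key -> (value, version, origin)
        self.log = []            # (usn, key) for local and applied changes
        self.high_water = {}     # partner name -> highest partner USN already pulled

    def _record(self, key, value, version, origin):
        self.usn += 1
        self.data[key] = (value, version, origin)
        self.log.append((self.usn, key))

    def write(self, key, value):
        # Local write: bump the per-key version so later writes win over earlier ones.
        _, version, _ = self.data.get(key, (None, 0, ""))
        self._record(key, value, version + 1, self.name)

    def pull_from(self, partner):
        # Ask only for changes the partner logged after the last USN we saw from it.
        seen = self.high_water.get(partner.name, 0)
        for usn, key in partner.log:
            if usn <= seen:
                continue
            value, version, origin = partner.data[key]
            _, my_version, my_origin = self.data.get(key, (None, 0, ""))
            # Assumed conflict rule: higher version wins, ties broken by origin name.
            if (version, origin) > (my_version, my_origin):
                self._record(key, value, version, origin)
        self.high_water[partner.name] = partner.usn


a, b = Replica("A"), Replica("B")
a.write("phone", "555-0100")     # hypothetical property and values
b.pull_from(a)
b.write("phone", "555-0199")     # a later change made on the other master
a.pull_from(b)
b.pull_from(a)
```

Both replicas accept writes, and after each has pulled from the other they hold the same value. Because each replica only asks for changes above its recorded high-water mark, a change that has already been applied is not copied back and forth.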