CREATE VIEW MyTable
AS
SELECT ID,
DecryptByKey(SensitiveData),
DecryptByKey(OtherSensitiveData),
NonSensitiveData,
FROM BaseTable
GO
Given this approach, let’s look at what happens when a user selects against the view. All the keys which map to labels that the user can access will have been opened. Therefore, all the cells with labels the user can access will be decrypted when the SELECT statement is executed. Conversely, all the keys that map to labels the user cannot access will not be opened. When the SELECT statement is executed, cells with these labels come back NULL, providing the user with no information on whether there is even data present in the cell.
This approach accomplishes the granular, dynamic control over data in a relational table that we are seeking.
Key access control
Of course the suitability of the design hinges on control of the keys. SQL Server 2005 defines permissions on keys in terms of a single SQL Server principal. The permission to open a key, for example, can be granted to a user named Bob, or to a group named AppUsers. In our scenario, however, we want to control rights to the key based on arbitrary combinations of principals – to be specific, based on arbitrary combinations of role memberships which define the user’s label. We want to avoid defining a principal for every possible label combination. The semantics for deciding which keys a user can access are identical to those for controlling row access.
This type of ACL’ing requires a more subtle approach. Instead of granting key permissions to users, or to roles, we grant permissions to a system-defined broker user account. We’ll call it the KeyBroker account. KeyBroker can open keys. Users and user roles are denied all permissions on keys. We get a list of the labels to which the user should have access, and ask KeyBroker to open the corresponding keys.
Using new impersonation features of SQL Server 2005, we can define a stored procedure which can be called by users, but which will ‘execute as’ KeyBroker. This procedure is shown below. A cursor is defined on the vwVisibleLabel view. Note two columns in that view which we have not mentioned yet: KeyName and CertName. The cursor selects these two columns from the view. Using another new impersonation feature, we temporarily revert to the identity of the calling user and open the cursor. This ensures that the rows returned from vwVisibleLabels are based on the calling user’s label (that is, their combination of role memberships). We then immediately revert back to the KeyBroker identity. The rest of the procedure loops through the rows in the cursor, opening each key by using the dedicated certificate used to encrypt it.
Note: Some parts of the code snippet presented in the following table have been displayed in multiple lines only for better readability. These should be entered in a single line.
CREATE PROCEDURE
usp_EnableCellVisibility
WITH EXECUTE AS 'KeyBroker'
AS
DECLARE @KeyName nvarchar(256)
DECLARE @CertName nvarchar(256)
DECLARE Key_Cursor CURSOR LOCAL
FORWARD_ONLY STATIC FOR
SELECT KeyName, CertName
FROM vwVisibleLabels
EXECUTE AS CALLER
--Since the cursor is STATIC, it is fully
--populated here based on the caller’s
identity OPEN Key_Cursor
REVERT
FETCH NEXT FROM Key_Cursor INTO @KeyName,
@CertName
WHILE @@FETCH_STATUS = 0
BEGIN
open symmetric key @KeyName using
certificate @CertName
FETCH NEXT FROM Key_Cursor INTO @KeyName,
@CertName
END
CLOSE Key_Cursor
DEALLOCATE Key_Cursor
GO
Exit from the stored procedure automatically reverts the user context back to the calling user. This approach opens exactly those keys associated with labels that are dominated by the user’s label. The user never has any access to keys, and so cannot open any others. And, there is no input to this procedure which can be spoofed. Evaluation of rights is based only on membership in SQL Server database roles.
With the correct symmetric keys open, selecting from the view causes labeled cells to be visible if the user’s label dominates. All other labeled cells appear as NULL.
The expectation is that this stored procedure (usp_EnableCellVisibility) is called once by an application or end user immediately after establishing the database connection. The keys opened would remain open for the life of the connection. A corresponding procedure (usp_DisableCellVisibility) is provided to close the keys if needed. This is not strictly necessary, as closing the connection does the necessary cleanup.
Changes to the base table
Changes to the base table to support cell-level security are fairly minor. Most important, the data type of the column to be protected must be compatible with the encrypted data values. The intrinsic function EncryptByKey returns varbinary. This can be stored in a character or binary field (for example, varchar, nvarchar, or varbinary). If the original data type is not compatible with varbinary content—a numeric type such as int, for example—the column data type must be changed. In the next section, we’ll see how to make the data type appear unchanged to the user.
Ensuring the correct data type is the only required change. Another change that may be desirable in some scenarios is the addition of a column to hold the label that applies to the cell. The main purpose of this would be to comply with policies requiring that the label metadata live with the cell data. This could be stored as the ID from the tblUniqueLabel table, an instance of the SecurityLabel user-defined type (UDT), or the raw label string from source data.
Defining the view
Finally, we need to redefine the user-accessible view to include logic for decrypting data in protected cells. The next example code shows what the view definition should look like. It is almost identical to the view definition we showed previously when discussing row-level security. The only difference is that some columns are wrapped in a SQL Server 2005 intrinsic function called DecryptByKey. To keep the example simple, we assume just a few columns in the base table.
Note: Some parts of the code snippet presented in the following table have been displayed in multiple lines only for better readability. These should be entered in a single line.
CREATE VIEW UserTable
AS
SELECT ID,
DecryptByKey(SensitiveData),
CONVERT(money, CONVERT(varchar(50),
DecryptByKey(SensitiveMoneyData))),
NonSensitiveData,
FROM tblBaseTable (READCOMMITTED),
vwVisibleLabel
WHERE tblBaseTable.RLSMappingID =
vwVisibleLabel.ID
GO
GRANT SELECT ON UserTable TO <app_users>
DENY ALL ON tblBaseTable TO <app_users>
GO
As mentioned in the previous section, the encryption functions use character or binary data as input and output. If the original data type of a protected column is numeric, for example, the view definition should include a conversion of the varbinary decryption output to the original data type. An example of this is shown in the third column of the view above.
Encrypting cell data on insert/update
The encryption of cell data can be handled in much the same way we handled write-up/write-down checks for row-level security. INSTEAD OF triggers defined on the view handle the write-up/write-down checks, and also encrypt the cell(s) based on their label. The code in Appendix A includes an example of what this logic might look like.
Physical Partitioning
When designing a system that might combine data from multiple levels of classification, the issue of commingling data in physical storage must be addressed. Part of our design must allow the distribution of data with different classifications across distinct physical storage units. As with other aspects presented here, we would like to encapsulate the solution to the requirement in a unified, prescriptive reference design which lives as architecturally close to the data as possible. Applications built atop the data management solution should not have to deal with these issues.
This requirement can be addressed by using partitioned tables, another feature of SQL Server 2005. It could also be addressed by using partitioned views in the previous version of SQL Server, but with more difficulty.
Table partitioning allows the definition of a partition function, which uses a column in the table to divide the data into disjoint logical partitions. The partition functions are mapped to physical storage units (that is, files). These files can be placed on distinct physical devices. For an additional measure of control, each physical volume can use Windows Encrypting File System (EFS3) support to apply high-strength encryption at the physical media level. (This use of EFS is only prudent if significant I/O hardware resources are available to offset the performance impact of encryption at the file system level.)
Pulling it All Together (Part 2)
Combining the cell-level security and physical partitioning with the previously presented row-level security model, we have the design as shown in Figure 5.
As discussed earlier, the techniques laid out to this point could be implemented in code in a variety of ways. However, the layered design pattern has several advantages:
| • |
It breaks the major design points into simple, easily maintained pieces.
|
| • |
The process of applying the design to a given table is formulaic and mechanical. It might be desirable to auto-script the implementation based on metadata.
|
| • |
It encapsulates security-label logic for the write cases in the core constructs, rather than application code or stored procedures.
|
Performance
A key question in assessing this design is its impact on performance. An acceptable solution must implement the required security, yet have an acceptable performance impact with tables containing millions of rows.
A reference implementation of this design has been tested on a dataset containing one million rows. The test server specs were:
|
CPU
|
2 x 550 MHz
|
|
RAM
|
512 MB
|
|
Disk
|
4 x 18 GB @ 10,000 RPM
1 x 20 GB on Fiber channel SAN
(No EFS)
|
|
Operating System
|
Windows 2003 Standard
|
|
Database
|
SQL Server 2005 Beta 2
|
|
.NET Framework
|
1.1 for application code
2.0 for SQL Server code
|
A detailed description of the test is beyond the scope of this document. In summary, the following results were obtained.
Impact on insert performance: ~40 percent degradation
Impact on selective queries: < 10 percent degradation
Impact on aggregate queries: ~10-50 percent degradation
The performance profile of every application is unique, and these results should not be taken as a fixed guideline. However, the testing does indicate that this design impacts performance to an acceptable degree. Note that the measured impact is on a low end (550 MHz) server with limited RAM. Assuming a similar performance impact for production applications, additional hardware resources could make up for the degradation.
Row-Level Disclosure
The view-based row-level security design described previously has some specific, narrow vulnerabilities. The design prevents the return of rows to which the user should not have access, whether the data is queried from an application, a reporting tool, or a direct connection with a SQL query tool. However, given the right conditions, data may be exposed by inference if an error message is thrown. This section describes the conditions, the potential disclosure, and presents two approaches to mitigation.
Predicate evaluation order
If a specifically formed query is issued against the user-accessible view that implements row-level security, an error may be returned which reveals the existence of information in the table to which the user is not supposed to have access.
The query might look like the following:
Note: Some parts of the code snippet presented in the following table have been displayed in multiple lines only for better readability. These should be entered in a single line.
SELECT * FROM UserTable --actually a view
WHERE LastName = ‘Smith’ AND
LEN(LastName)/0 > 10
Suppose that there are no rows where LastName equals Smith to which the current user has access, but that there is at least one to which the user does not have access. Internally, the database server’s query optimizer will decompose any view in the statement into its parts. It will then build a query plan that is based on all of the WHERE clause conditions in the view(s) and in the actual statement’s WHERE clause. If the statement executes successfully, no information will leak. However, if the WHERE clause includes invalid predicates (such as division by zero), there is a risk of leakage. Depending on the query plan generated by the query optimizer, the invalid predicate may be evaluated before the predicate which restricts row visibility. If this happens, an error will be thrown that implies the existence of a row that should not be ‘seen’.
Here is how it might work with the example statement above. Suppose UserTable is actually a view defined per the design presented earlier. The actual query as seen by the optimizer will be something like:
Note: Some parts of the code snippet presented in the following table have been displayed in multiple lines only for better readability. These should be entered in a single line.
SELECT *
FROM BaseTable (READCOMMITTED)
WHERE (LastName = ‘Smith’ AND
LEN(LastName)/0 > 10)
AND BaseTable.RLSMappingID IN
(SELECT ID FROM
tblUniqueLabel
WHERE ID IN .. /* category
predicates */)
Suppose the optimizer chooses to first use an index on LastName to narrow the result set before the other conditions are evaluated. It will find the row(s) where LastName =’Smith’, then evaluateLEN(LastName)/0 > 10. This will immediately throw an exception, prior to the evaluation of the row security predicates. The result may be that the presence of at least one row with LastName = ‘Smith’ is revealed, even if the user should not see any of those rows.
This issue is not unique to view-based row-level security with SQL Server. Other vendors’ row-level security solutions which inject predicates behind the scenes are subject to the same risk, since they inject the predicate before the query plan is generated.
It is important to observe the conditions which must be met for this vulnerability to be an issue:
|
1.
|
A user must be able to directly submit a maliciously formed SQL query, either by using a SQL query tool, or SQL injection against a poorly written front-end application.
|
|
2.
|
The user must know enough about the table definition to write the query.
|
|
3.
|
The query optimizer must choose a plan which allows the invalid predicate to evaluate before the row access predicate(s). This is a function of several variables, including what indexes exist, the distribution of data in the table, and the nature of the other predicates in the statement.
|
|
4.
|
The user must be able to directly see the error that is generated. This is a given with a SQL query tool, but a well-written application will not return raw error information.
|