In-Depth
SQL Server 2000: All Charged Up!
Microsoft's made its relational database management system fully Web-enabled, scalable, reliable, and faster for development. What justifies putting it into a qualifying heat in your organization?
If you’re interested in databases at all, you
already know that Microsoft is about to ship a
new version of SQL Server. In line with this year’s
naming model, it’s called SQL Server 2000, although
internally it’s also called SQL Server 8.0. Microsoft’s
design goals for this edition were to make the
application fully Web-enabled, scalable, reliable,
and fast to develop with. But how did those goals
translate to product features?
In this article, I’ll introduce you to some of
the more significant new features in SQL Server
2000, so you can start the process of evaluating
it for your own organization.
Super Speedway Support
Let’s start with a look at some of the numbers.
Microsoft is getting more serious all the time
about making SQL Server an enterprise-level database.
For starters, if you could afford it, you could
run SQL Server on a 32-processor server with 64GB
of RAM. That server could host a database containing
more than a million terabytes of data. But of
course, numbers never tell the whole story. Microsoft
has implemented a number of improvements in support
for very large, reliable databases.
For example, failover support has been substantially
improved, making it easier to run SQL Server in
a 24x7 environment without outages due to hardware
failures. You can create clusters of up to four
nodes, and set up failover or failback to or from
any node in the cluster in case of problems. Individual
nodes can be reinstalled or rebuilt without affecting
other nodes in the cluster.
If you’re maintaining multiple copies of a distributed
database, you’ll want to look into the new log
shipping feature. With log shipping, copies of
a transaction log are fed from a source database
to a destination database on a constant basis.
This lets you keep the destination database as
a warm standby, ready to replace the source database
at any time. It also provides an easy way to separate
OLAP analysis applications from OLTP data entry
applications; let the data entry proceed at your
primary server, then use log shipping to update
a secondary server where you can run time-consuming
analyses. The Database Maintenance Plan Wizard
has been enhanced to include log shipping setup.
A separate server within your organization can
be designated as the Monitor Server to keep track
of all log shipping operations.
For those occasions when scaling up your server
to the largest possible monster server doesn’t
handle the traffic, you can now look at “scaling
out.” Microsoft’s tool for doing this is the federated
server. With federated servers, you split a table
horizontally across as many database servers as
necessary. That is, one server contains the first
hundred thousand rows of the table, the next contains
the next hundred thousand, and so on. SQL Server
2000 sports updateable distributed partitioned
views, which is a fancy way to say that a single
view can treat this collection of federated tables
as a single table. With federated servers, there’s
no practical limit to the amount of data you can
store and update in a single table. Scaling out
has the added bonus of being less expensive than
scaling up for equivalent power.
For those maintaining data in heterogeneous sources,
the new OPENROWSET function in T-SQL will come
as a distinct improvement over the old OPENQUERY.
OPENROWSET allows you to specify OLE DB connection
information directly in an ad-hoc query. On a
practical level, this means that you can use a
rowset from any OLE DB provider in place of a
native SQL Server base table.
Under the Hood
You may have noticed recently that Microsoft
is integrating XML into all of its products. SQL
Server 2000 is no exception. For starters, SQL
Server 2000 is tightly integrated with Microsoft
Internet Information Server. You can define a
virtual root in an IIS Web site that refers to
a database on SQL Server. Properly constructed
URLs directed at that virtual root can contain
T-SQL queries, including the new FOR XML clause,
to generate XML documents dynamically for display
in a Web browser. Figure 1 shows this capability
in action.
|
Figure 1. Retrieving
database results as XML via IIS. Note the
use of special characters in the URL. |
SQL Server 2000 can also work directly with XML
documents. The new system stored procedure sp_xml_preparedocument
can read any well-formed XML document into memory
and parse its contents. From there, the OPENXML
keyword can be used in a SELECT statement to make
any set of nodes from the document available as
an updateable recordset.
It’s become clear that the SQL Server team couldn’t
quite get all the XML debugged in time to make
it into the box. Fortunately for developers, it
appears that new XML technology will continue
to roll out on a regular basis. The first offering
is the updategram, which (as of this writing)
is available as a download for SQL Server 2000
beta 2. With an updategram, you can construct
an XML message that shows the state of data before
and after an update, send it to the server, and
have SQL Server translate the updategram into
the appropriate UPDATE, INSERT, and DELETE statements.
This facility brings us one step closer the realization
of XML as a universal language that can be used
by different applications to exchange data.
One more new feature of possible interest to
those running databases for the Internet is support
for multiple server instances on the same computer.
With SQL Server 2000, you can install more than
one server on a single computer as completely
isolated applications (although they share much
of the same disk footprint). This is ideal for
service providers that want to let multiple customers
share a single computer without sharing a single
copy of SQL Server.
Souped-up Tranny
Don’t worry—the database administrator hasn’t
been overlooked in the upgrades here. One of the
new features of SQL Server 2000 is the use of
Kerberos to support authentication across multiple
computers. By using Kerberos in conjunction with
Windows 2000 Active Directory, servers can pass
user credentials among themselves. This is especially
useful for distributed querying scenarios. If
a user authenticates to one server that then uses
OPENROWSET to retrieve data from a second server,
Kerberos provides a secure way for the second
server to check the user’s credentials.
The backup and restore architecture in SQL Server
2000 has been reworked as well. You can now save
a log mark at the start of any transaction, which
allows restoring the database to just before or
just after that transaction. In addition, SQL
Server will use multiple threads to perform backup
and restore operations in parallel whenever possible,
making it easier to manage very large databases.
For example, if a backup set is being striped
across multiple devices, SQL Server will devote
a thread to each device.
The venerable Database Consistence Checker (DBCC)
utility, too, has been improved by the addition
of parallelism. As you know, DBCC executes through
Transact SQL to perform maintenance tasks, check
the status of the server, and validate the data
stored on the server. If you have multiple processors
on your server, the new version of DBCC will run
on all of them, allowing much faster consistency
checking in a database. DBCC can also check tables
without taking locks, allowing it to run at the
same time that tables are being updated.
Moving Around the Track
Anyone who’s maintained a large database installation
knows that just getting the data into the database
isn’t enough. You frequently need to manipulate
and transport the data to multiple locations.
SQL Server 2000 sports improvements in this area
as well.
Replication, which allows maintaining copies
of a database in diverse locations, was an immense
step forward when Microsoft first introduced it
in version 6.0. In SQL Server 2000, replication
has been enhanced in numerous ways:
- Replication of schema changes on published
databases.
- Replication of scripts to subscribers.
- Pre- and post-snapshot scripts.
- Remote agent activation, to better distribute
the processing load of replication.
- Greater parallelism in the replication agents.
- Vertical filters for merge publications.
- Alternate synchronization partners for merge
publications.
- New merge replication conflict resolvers.
- Better support for heterogeneous data sources
in replication.
- Queued updating for transactional replication.
- Data transformation during the publishing
process.
- Integration of replication publishing with
Active Directory.
The result is a robust and full-featured replication
service that goes far beyond the original goal
of maintaining two identical copies of a database
from a single point of data entry.
SQL Server’s other data movement tool, Data Transformation
Services (DTS), has been similarly enhanced. The
variety of data sources and tasks built into DTS
has been expanded, and the documentation and examples
for creating your own custom task is improved.
Figure 2 shows an example of a DTS package that
includes multiple data sources and conditional
workflow in the new version of the DTS designer,
with the icons for the new tasks and data sources
visible.
|
Figure 2. DTS Package
in the Package Designer. This package includes
SQL and Send Mail tasks, conditional workflow,
and both SQL Server and text data sources. |
DTS can also now export entire packages to Visual
Basic projects, making it very easy to use DTS
from standalone applications.
Analyzing the Results
Microsoft OLAP Server has been renamed Microsoft
Analysis Services in this version of SQL Server,
and it’s been substantially improved from the
version included with SQL Server 7.0. Perhaps
the biggest news is the addition of data mining
to Analysis Services. With data mining, Analysis
Services will help you explore your data for correlations
and even predict values for new data. Figure 3
shows a data mining model in action.
|
Figure 3. The new Data
Mining Model Browser. Darker color corresponds
to more records; the most important factors
are closest to the root of the tree. The Content
Navigator window provides a schematic overview
of the entire model. |
In this case, the data in a set of tables is
being analyzed to determine which factors are
the most important in predicting the Yearly Income
column of a table. The core OLAP portion of Analysis
Services has been enhanced as well. A few of the
key new features here include:
- Parent-Child dimensions to handle tables
with self joins.
- Write-enabled dimensions that let you update
data through Analysis Manager.
- Ragged dimensions for better handling of
data with an incomplete or inconsistent hierarchy.
- Dimension filtering to limit the rows included
in a dimension.
- Kerberos authentication and finer-grained
security.
- Actions linked to cube data. For example,
you can jump straight from a customer to their
Web page.
- Linked cubes for use by distributed Analysis
Servers.
- A drag-and-drop builder for MDX Expressions.
- Client-side PivotTable Service connections
via HTTP and IIS.
If you’ve been struggling with some of the limitations
in SQL Server 7.0’s OLAP Server, you should be
pleasantly surprised by all of the work that went
into making Analysis Services a much more serious
tool for working with large amounts of data.
Improvements in the Pit
Developers haven’t been left behind, either.
The core of SQL Server, after all, is the Transact-SQL
language. Although there aren’t a lot of new extensions
to the language this time around, the ones that
were incorporated are significant. SQL Server
now supports user-defined functions, or UDFs.
By using the CREATE FUNCTION keyword, you can
wrap up your own custom processing in a package
that’s indistinguishable from a built-in SQL Server
function:
CREATE FUNCTION twoTimes
( @input int=0 )
RETURNS int
AS
BEGIN
RETURN 2 * @input
END
After you create a function, it’s easy to use
it in a SQL statement:
SELECT OrderID, dbo.TwoTimes(Quantity)
AS Extra
FROM [Order Details]
Another nice extension to the language is the
addition of indexed views. You can use the CREATE
INDEX statement on the results of a view to improve
the response time of future statements that retrieve
data from that view. New data types in T-SQL include
bigint for eight-byte integers, sql_variant for
storage of variant data (and tighter integration
with, for example, Visual Basic applications),
and table for temporary storage of result sets
for later use. Many developers will be pleased
to know that cascading declarative referential
integrity is now fully supported. For those using
triggers for custom processing, INSTEAD OF and
AFTER triggers are welcome improvements as well.
All of this T-SQL is wrapped up in the latest
iteration of SQL Server Query Analyzer. Figure
4 shows this tool in its SQL Server 2000 version.
|
Figure 4. SQL Server
Query Analyzer. Note the separate server and
client impact analyses. |
The Object Browser window offers drag-and-drop
access to both the objects in your databases and
templates for common bits of SQL (and, yes, you
can add your own templates to the list). You can
also trace both the client and server impact of
queries directly from query analyzer, as well
as estimate the execution plan without actually
executing the query.
Should You Sign Up for a Test
Drive?
Feature lists are a thing of joy to marketers
and developers, but for the business side, there’s
really only one question: Is it worth spending
the money to upgrade to a new version of a complex
and core program such as SQL Server? After working
with beta versions for most of a year, I can see
several key situations where the upgrade is immediately
justified:
- If your organization has made a commitment
to new technologies such as XML, Active Directory,
or Kerberos, you’ll appreciate the integration
with these features.
- If you’re running a large Web site or internal
application that’s pushing the limits of SQL
Server 7.0, the improved scalability in SQL
Server 2000 will remove those limits. If you
have a large amount of data to analyze from
a business perspective, Analysis Services can
pay for the cost of the upgrade all by itself.
- If you have many servers to administer and
develop with, the incremental improvements to
both the administrative and development tools
can make a substantial difference in your productivity.
Of course, as with any other business-critical
software, you should approach a database server
upgrade with caution. If at all possible, install
SQL Server 2000 on a test server and run your
own application in that environment before rolling
it out on your production servers. Microsoft makes
this process easy with evaluation copies and a
wealth of information on the SQL Server Web site
at www.microsoft.com/sql/default.htm.
Speaking personally, I’ve found SQL Server 2000
to be a polished and stable release, and I look
forward to helping my clients implement it.