DFS Best Practices: How To Ditch Windows File Replication Service
It's time to ditch File Replication Service and move completely to Distributed File System.
Distributed File System (DFS) has been around since Windows NT and comes in a variety of configurations and options. While DFS is available in standalone and domain configurations, this article will specifically discuss the domain option. DFS is a popular and effective technology that provides redundant replication of files and folders between remote servers. It can be organized under a common namespace to allow users to connect without needing the name of the server that the DFS share is hosted on.
Unfortunately, at least to my knowledge, Microsoft has never developed a comprehensive DFS best practices document. This is my attempt to summarize all the best practices I've used, learned and recommended over the years. Note that new information is always posted to the Microsoft Web site, so you should periodically check microsoft.com/DFS for new articles.
The term DFS is used to refer to the legacy namespace product available in Windows 2000, Windows 2003 and Windows 2003 R2, and still available as a legacy product in Windows 2008. Legacy DFS used the problematic File Replication Service (FRS) as its replication engine. In Windows 2003 R2, however, Microsoft introduced a new DFS namespace product along with a much-improved replication engine. For clarity, I'll use "legacy DFS" to refer to the older product found in Windows 2000, Windows Server 2003 and Windows Server 2008. I'll refer to the new DFS namespace product as DFSN and the new replication engine as DFSR.
Figure 1. A typical configuration for the legacy DFS.
Legacy Windows Server 2003 DFS/FRS
The legacy DFS in Windows 2000 and Windows 2003 used a cumbersome and confusing administrator console and terminology, as well as the problematic FRS. With Windows 2003 now out of mainstream support by Microsoft, it's time to migrate to the new DFS/DFSR available in Windows 2003 R2 and Windows 2008. This section will identify problems and best practices associated with the legacy DFS and FRS.
Known Problems with FRS
FRS has historically been fraught with problems. Windows 2003 attempted to mitigate some of the issues but was unable to actually fix them. Microsoft delivered a completely new replication engine (DFSR) for Windows 2003 R2 and Windows 2008. DFSR is described later in this article. First, let's touch upon the problems of FRS.
FRS detects changes via the NTFS change journal (the update sequence number, or USN, journal), which is updated whenever a change is made to a file or folder in the file system. Unfortunately, FRS can't detect whether the change actually requires replication.
Applications that scan files -- including antivirus, disk defragmentation and other apps -- typically modify the security descriptor of the files, which triggers a change in the NTFS journal, which in turn triggers FRS to replicate the files even though their contents haven't changed. Changes were made to FRS in Windows 2003 that minimized the problems but did not fix them. They included:
- Suppressing excessive replication. When FRS determines that certain files are frequently being replicated, an event is logged and replication is suppressed for those files. This prevents the staging areas from filling up and stopping FRS, but you could unwittingly delete valid files.
- Not stopping when the staging area is full. When the staging area gets to 90 percent full, old files are deleted until the directory is only 60 percent full, thus preventing FRS shutdown. But this might delete updates you need.
- Making it impossible to proactively seed data on multiple servers to avoid replicating large amounts of data over the WAN. The workaround is to copy small amounts of data at a time until all of it has been copied.
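To make the staging-area behavior concrete, here's a minimal Python sketch of the 90 percent/60 percent cleanup rule described above. It is an illustration only, not FRS code: the `StagedFile` type, the `clean_staging` function and the `age_rank` ordering used to decide which files count as "old" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StagedFile:
    name: str
    size_mb: int
    age_rank: int  # hypothetical ordering: lower value = older file


def clean_staging(files, capacity_mb, high_pct=90, low_pct=60):
    """Apply the cleanup rule: once usage reaches high_pct of capacity,
    drop the oldest files until usage is down to low_pct.

    Returns (kept, purged). Integer math avoids float rounding surprises.
    """
    used = sum(f.size_mb for f in files)
    if used * 100 < high_pct * capacity_mb:
        return list(files), []  # below the trigger, nothing is purged

    kept = sorted(files, key=lambda f: f.age_rank)  # oldest first
    purged = []
    while kept and used * 100 > low_pct * capacity_mb:
        victim = kept.pop(0)
        used -= victim.size_mb
        purged.append(victim)
    return kept, purged


if __name__ == "__main__":
    staged = [StagedFile(f"f{i}", 10, i) for i in range(10)]
    kept, purged = clean_staging(staged, capacity_mb=100)
    print("purged:", [f.name for f in purged])
```

Anything in the purged list that had not yet reached every partner is exactly the kind of update you might still have needed.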
Figure 2. The DFSR console.
Best Practices
Best practices for the legacy DFS/FRS revolve around the central concept that keeping dynamically changing data on DFS shares is inherently a bad idea. The previously noted problems with FRS cause it to get overwhelmed easily with large numbers of files and have a hard time replicating data that changes frequently. It's not recommended to use FRS to house My Documents for users' profiles, for example.
Other best practices include:
- When initially populating DFS shares for a series of target servers, seed the data on a single share and let it replicate. Do this in smaller quantities. Adding large numbers of files in multiple shares at the same time will make it difficult for FRS to catch up. If the data exists on multiple DFS servers, add and replicate data from one server at a time. That way, after the initial seeding, FRS only has to replicate changes.
- Ensure that your antivirus, defragmentation and similar programs that scan files and folders are "FRS aware." Most well-known programs that have been around for a while will have this feature, which prevents the needless replication of files due to scanning.
- Create multiple root targets on multiple machines for redundancy of data. Root targets contain configuration data.
- Provide redundancy for data on shares by creating multiple targets for DFS links. This ensures the same data is continuously replicated to multiple targets, and if one target server is down, users will be directed to another server with that data. DFS uses the site awareness of Active Directory to locate the DFS servers closest to the user.
- Replication of DFS data is not required but is recommended for data redundancy. Without replication, DFS provides only a common namespace for the shares.
- Do not host DFS shares on domain controllers (DCs). Because SYSVOL already replicates on every DC, it's easier to isolate replication issues if SYSVOL and the DFS shares are not on the same server. Note that SYSVOL depends on the DFS service, which can't be disabled on DCs. The point here is simply not to host DFS links or root targets on DCs.
- Configure one-way FRS replication between link targets in a hub-and-spoke configuration for best practices in controlling and managing data. Data created on spoke targets won't replicate to the hub.
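The one-way hub-and-spoke recommendation can be pictured as a directed graph. The Python sketch below is purely illustrative: the server names, the `ONE_WAY` connection map and the `reachable` helper are hypothetical and are not part of any DFS tool. It simply shows which servers would receive a file, given where the file was created.

```python
from collections import deque

# Hypothetical topology: the hub replicates out to three spokes and
# no connection leads back from a spoke to the hub.
ONE_WAY = {"HUB": ["SPOKE1", "SPOKE2", "SPOKE3"]}


def reachable(connections, origin):
    """Return the set of servers that receive a file created on `origin`,
    following one-way replication connections (breadth-first walk)."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for target in connections.get(server, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen - {origin}


if __name__ == "__main__":
    print("created on HUB    ->", sorted(reachable(ONE_WAY, "HUB")))
    print("created on SPOKE1 ->", sorted(reachable(ONE_WAY, "SPOKE1")))
```

A file created on the hub reaches every spoke, while a file created on a spoke goes nowhere, which is the behavior to keep in mind when deciding where users are allowed to write.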
Limitations of FRS and Legacy DFS
FRS replicates the entire file even if only a few bytes have changed. There's an approximate limit of 65GB in a share that can effectively be replicated by DFS/FRS. Exceeding this limit results in inconsistency and poor performance. Other limitations include:
- Only one DFS root per Windows 2003 Server Standard edition (though there's no limit with the Enterprise version). DFS service start-up time increases with the number of DFS roots.
- Limit of 5,000 links per domain-based DFS namespace. More links will cause performance degradation when changes are made to DFS configuration.
- Limit of 260 characters in the DFS path. Exceeding this will cause applications to fail to access the DFS data. Data can be accessed by mapping explicitly to a drive letter.
- Domain-based DFS namespaces can't be configured on clustered nodes; use standalone DFS namespaces only.
- For multiple-domain DFS configurations:
  - Root targets for a domain-based DFS root must be in the same domain. However, link targets can exist in domains other than the root.
  - Clients can access DFS servers in trusted domains.
  - When accessing link targets in other domains from the client, use Fully Qualified Domain Names (FQDNs) for link targets. See Microsoft Knowledge Base 244380 for more information.
  - FRS can be used to replicate on a DFS link whose targets are in different (trusted) domains (this requires enterprise admin rights).
For further reference, see the Distributed File System: Frequently Asked Questions page.
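If you keep an inventory of your legacy namespace, the numeric limits above are easy to check mechanically. Here's a rough Python sketch, assuming you can export the link paths and approximate share sizes yourself; the function name and the input shapes are hypothetical, and the thresholds are simply the figures quoted in this section.

```python
# Thresholds taken from the limits listed above.
MAX_LINKS_PER_NAMESPACE = 5000
MAX_DFS_PATH_CHARS = 260
MAX_SHARE_GB = 65  # approximate practical ceiling for FRS replication


def check_legacy_dfs(link_paths, share_sizes_gb):
    """Return a list of human-readable warnings.

    link_paths: list of DFS paths (strings) in one domain-based namespace.
    share_sizes_gb: dict mapping share name -> size in GB.
    """
    warnings = []
    if len(link_paths) > MAX_LINKS_PER_NAMESPACE:
        warnings.append(
            f"{len(link_paths)} links exceeds {MAX_LINKS_PER_NAMESPACE}"
        )
    for path in link_paths:
        if len(path) > MAX_DFS_PATH_CHARS:
            warnings.append(f"path longer than {MAX_DFS_PATH_CHARS} chars: {path[:40]}...")
    for share, size_gb in share_sizes_gb.items():
        if size_gb > MAX_SHARE_GB:
            warnings.append(f"share {share} is {size_gb}GB, over ~{MAX_SHARE_GB}GB")
    return warnings


if __name__ == "__main__":
    paths = [r"\\example.local\public\sales", "\\\\example.local\\public\\" + "x" * 300]
    for warning in check_legacy_dfs(paths, {"sales": 40, "engineering": 120}):
        print(warning)
```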
Significant Improvements
The new DFSN and DFSR available in Windows Server 2003 R2, Windows Server 2008 and Windows Server 2008 R2 are significant improvements over the legacy DFS and FRS products. DFSR replicates on a block-level basis, sending only the changes in a file rather than the whole file. For instance, changing a title on a slide in a 3MB PowerPoint file would cause FRS to replicate the entire 3MB file under the old legacy DFS, but DFSR would replicate only a few bytes. This can make a huge difference not only in network load but in disk performance, as well as in how quickly users see the change replicated. DFSR thus handles large amounts of data and dynamically changing data efficiently.
DFSR is available only in Windows Server 2003 R2 and later. In Windows Server 2003 R2 it can replicate DFS data only, while in Windows Server 2008 and Windows Server 2008 R2 it can replicate both DFS and SYSVOL data. To use DFSR for replication, only the DFS servers themselves must run Windows Server 2003 R2, Windows Server 2008 or Windows Server 2008 R2. It's not necessary to upgrade DCs.
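To see why block-level replication matters, consider the following Python sketch. It is a deliberately simplified illustration, not how DFSR is implemented: DFSR relies on Remote Differential Compression, whereas this example just compares hashes of fixed-size blocks. The 4KB block size and the function names are assumptions chosen for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size for the illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the file contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Indexes of blocks in `new` that differ from, or don't exist in, `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        idx for idx, digest in enumerate(new_h)
        if idx >= len(old_h) or old_h[idx] != digest
    ]


def bytes_to_send(old, new, block_size=BLOCK_SIZE):
    """Total bytes a block-level replicator would need to transfer."""
    return sum(
        len(new[idx * block_size:(idx + 1) * block_size])
        for idx in changed_blocks(old, new, block_size)
    )


if __name__ == "__main__":
    original = bytes(3 * 1024 * 1024)   # stand-in for a 3MB file
    edited = bytearray(original)
    edited[100:105] = b"Title"          # a small in-place edit
    print("whole-file copy :", len(edited), "bytes")
    print("block-level copy:", bytes_to_send(original, bytes(edited)), "bytes")
```

A fixed-block comparison like this falls apart when bytes are inserted and everything after them shifts, which is one reason a real differential engine is more sophisticated. The payoff is the same, though: only the changed regions cross the wire.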
Best Practices
For background, review the DFS Replication Frequently Asked Questions (FAQ) page on the Microsoft Web site.
- Installing the new DFS/DFSR in a Windows 2003 domain will require a schema change. This will likely require approval from your change-control process, so plan in advance.
- Replication groups are an effective way to replicate data from branch sites to file servers in the hub site, where data can easily be stored on large SAN disks. In this scenario, make sure new data is only added at the remote site. If an existing file is modified at the core (hub) site, it will replicate back to the remote sites and overwrite the file there.
- Take advantage of DFSR for SYSVOL replication in Windows Server 2008 and Windows Server 2008 R2, especially in large domains with large numbers of Group Policies deployed. This requires a migration, as FRS is the default replication engine for Windows Server 2008 domains.
  - Refer to the TechNet blog by the Microsoft Directory Services team, "DFSR SYSVOL Migration FAQ: Useful trivia that may save your follicles".
  - Apply these hotfixes prior to SYSVOL migration to DFSR:
    - 972105
    - 969688
    - 978326
    - 959114
    - 978994
- Migrate legacy DFS shares to DFSN and DFSR technologies as Windows Server 2008 R2 begins to phase out legacy DFS and FRS. Both will eventually go away.
- Design the replication topology for replication groups prior to deployment. There are a lot of options for topology in DFSR that weren't available in DFS/FRS. Be sure that the replication method suits your file-deployment design.
- Monitor the state of DFSR replication. System Center Operations Manager contains a management pack for DFSR monitoring. There may be third-party tools as well. Note that the old Ultrasound and Sonar tools don't work with DFSR.
Figure 3. The legacy DFS component, Distributed File System.
Limitations
While DFSR provides more robust and efficient replication and handles dynamic data quite well, it's important to understand the scalability limitations for DFSR when planning a DFS infrastructure. Replication groups can be defined independently of DFS namespace configuration. One is not dependent on the other. Note the following limitations:
- Each server can be a member of up to 256 replication groups.
- Each replication group can have up to 256 replicated folders.
- Each server can have up to 256 connections (for example, 128 incoming connections and 128 outgoing connections).
- On each server, the number of replication groups multiplied by the number of replicated folders multiplied by the number of simultaneously active connections must be kept to 1,024 or fewer.
- A replication group can contain up to 256 members.
- A volume can contain up to 8 million replicated files, and a server can contain up to 1TB of replicated files.
- The maximum tested file size is 64GB.
- DFSR can't communicate with FRS.
For more details, see the Microsoft TechNet article on this issue. There's also an excellent DFS Replication FAQ.
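The multiplication rule in the list above is the one people tend to trip over, so here's a small Python sketch that checks a planned server against it. The `ServerPlan` type and `dfsr_violations` function are hypothetical names for this example, and treating "number of replicated folders" as a single per-server count is my assumption about how to read the rule.

```python
from dataclasses import dataclass

# Limits quoted in the list above.
MAX_GROUPS_PER_SERVER = 256
MAX_CONNECTIONS_PER_SERVER = 256
MAX_PRODUCT = 1024          # groups x folders x active connections
MAX_REPLICATED_TB = 1.0


@dataclass
class ServerPlan:
    name: str
    replication_groups: int
    replicated_folders: int   # assumed to be one count per server
    active_connections: int   # simultaneously active connections
    replicated_tb: float


def dfsr_violations(plan):
    """Return a list of limits that the planned server would exceed."""
    problems = []
    if plan.replication_groups > MAX_GROUPS_PER_SERVER:
        problems.append("too many replication groups")
    if plan.active_connections > MAX_CONNECTIONS_PER_SERVER:
        problems.append("too many connections")
    product = (
        plan.replication_groups
        * plan.replicated_folders
        * plan.active_connections
    )
    if product > MAX_PRODUCT:
        problems.append(f"groups x folders x connections = {product} (limit {MAX_PRODUCT})")
    if plan.replicated_tb > MAX_REPLICATED_TB:
        problems.append("more than 1TB of replicated files")
    return problems


if __name__ == "__main__":
    for plan in (
        ServerPlan("FS1", 4, 16, 16, 0.5),
        ServerPlan("FS2", 8, 16, 16, 1.5),
    ):
        print(plan.name, dfsr_violations(plan) or "within limits")
```

Note how quickly the product grows: a server that looks modest on each individual count can still exceed 1,024 once the three numbers are multiplied together.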
Figure 4. The interface for migrating existing DFS namespaces.
Recommendations
Overall, my recommendations are simple: Get off of FRS. Seriously. It's old, junky technology that Microsoft threw in the dumpster years ago. It's perhaps some of the worst code to come out of Redmond. Bite the bullet and migrate all DFS shares (Windows Server 2003 R2 and newer) and SYSVOL replicas (Windows Server 2008 and newer) to DFSR. Take advantage of the robustness and vast performance improvements, and spend your time doing more productive things.
With the deprecation of legacy DFS and FRS in Windows Server 2008 R2, Microsoft is sending a message that it's time to move to better technology. There are no downsides, in my mind. I've recommended migration to DFSN and DFSR to many customers who've asked for help fixing DFS configurations. We should all quit trying to make the old stuff work.