Overview
- Think scalability and performance first!
- Invest time performing an in-depth current content inventory of Records (are we looking at terabytes?)
- What are your growth estimates for the size of your electronic record inventories?
- Large collections of Records require careful planning on numbers and locations of content databases, site collections, sites and document libraries in relation to the file plan
- Bottom Line: Invest time in planning the SharePoint Logical Architecture for your Records Center
- Some more thoughts:
- Try and limit the size of your content databases to 50 GB to 100 GB
- For very large archives of records, organize your Records Center repositories as independent site collections rather than as sub-sites and document libraries (consider making a site collection per category in your file plan)
- Consider RBS
- Clearly think of your backup, restore and disaster recovery strategy
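To make the sizing questions above concrete, here is a minimal sketch of the arithmetic. The starting inventory, growth rate and planning horizon are made-up example numbers, not figures from any real deployment, and `content_databases_needed` is a hypothetical helper name. The 100 GB default is the upper end of the content database size suggested above.

```python
import math

def content_databases_needed(current_gb, annual_growth_rate, years, target_db_gb=100):
    """Project the archive size with compound growth, then divide by the target DB size."""
    projected_gb = current_gb * (1 + annual_growth_rate) ** years
    databases = math.ceil(projected_gb / target_db_gb)
    return projected_gb, databases

# Assumed example: 2 TB of records today, growing 25% per year, planned 3 years out.
projected, dbs = content_databases_needed(current_gb=2000, annual_growth_rate=0.25, years=3)
print(f"Projected size: {projected:.2f} GB, content databases needed: {dbs}")
```

Running the same calculation per file plan category, rather than for the whole inventory, gives a rough idea of which categories may deserve their own site collection and content database.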
How SharePoint 2010 can help with scaling [1]
SharePoint 2010 has many features that make it easier to scale to massive archives, such as:
- Database query performance optimizations [2]
- SQL Server 2008’s Remote BLOB Storage (RBS), which decreases the size of the content database [3]
- Basically, RBS takes binary data out of your content databases: the binary data lives on the file system and only the metadata stays in the database, which reduces database size and improves scalability and performance
- Internal timer job processing improvements
- Highly scalable search along with new database indexing strategies [4]
- Compound indexing, index management, and content-by-query optimizations
- SharePoint now supports multiple index servers
- Content index can now be divided into multiple index partitions
- Each index server can be configured to run multiple crawlers
- Multiple crawlers can crawl content in parallel
- Index servers are now stateless. The crawlers build the content index and propagate it directly to the query servers.
- Multiple query servers can be made available for the benefits of redundancy and parallel performance
- Crawl management and property store data tables have been split into separate databases, and multiple tables of this kind can be configured.
- List optimizations
- Tens of millions of docs in a single list
- Service Applications Architecture
- New Send to connections allow moving of records instead of just copying
- Multiple Records Center Site Collections
- Internal database improvements (e.g. lock ordering, throttling, IOPS efficiency)
- Background per-item processing throughput maximization
- Content Organizer is able to organize your Records Repositories
- Content Type Syndication provides a central location to publish content types from and inherit them from
This allows:
- Millions of records in a single Records Center
- Multiple Records Centers! (new in SharePoint 2010, in 2007 you were only allowed 1)
- A distributed archive allowing many Records Centers to bind together to act as one logical repository
- Fast searching through your archives of records
- An easy mechanism to move records to the archive of your choice and leave a reference to where it now exists
But this does not excuse you from planning your architecture for scalability and performance!!
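To illustrate the RBS idea from the feature list above (binary data on the file system, metadata in the database), here is a toy sketch. It is not the RBS API and not how SharePoint or SQL Server implement it. It uses SQLite and a temporary folder purely to show the split, and the `docs` table, `store_document` and `load_document` are hypothetical names of my own.

```python
import os
import sqlite3
import tempfile
import uuid

def store_document(db, blob_dir, name, data):
    """Write the bytes to the file system and only the metadata to the database."""
    blob_id = uuid.uuid4().hex
    with open(os.path.join(blob_dir, blob_id), "wb") as f:
        f.write(data)
    db.execute(
        "INSERT INTO docs (name, size_bytes, blob_id) VALUES (?, ?, ?)",
        (name, len(data), blob_id),
    )
    return blob_id

def load_document(db, blob_dir, name):
    """Look up the blob reference in the database, then read the bytes from disk."""
    row = db.execute("SELECT blob_id FROM docs WHERE name = ?", (name,)).fetchone()
    with open(os.path.join(blob_dir, row[0]), "rb") as f:
        return f.read()

# Assumed example data: a single fake 1 MB "record".
blob_dir = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, size_bytes INTEGER, blob_id TEXT)")
payload = b"%PDF-" + b"x" * 1_000_000
store_document(db, blob_dir, "contract.pdf", payload)
print(len(load_document(db, blob_dir, "contract.pdf")), "bytes read back from the file system")
```

The database row stays a few dozen bytes no matter how large the file is, which is the effect the RBS bullet describes. It also means the file store has to be part of your backup and restore plan alongside the database.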
See Microsoft’s TechNet article “SharePoint Server 2010 capacity management: Software boundaries and limits” [5] for more of SharePoint 2010’s new boundaries, recommendations and thresholds that can help with scaling, capacity and performance for your Records Management Solution. I have listed some here:
Limit | Threshold or Maximum |
Zone | 5 per Web application |
Managed path | 20 per Web application |
Solution cache size | 300 MB per Web application |
Site collection | 250,000 per Web application |
Application pools | 10 per Web server |
Content database size (general usage scenarios) | 200 GB per content database |
Content database size (all usage scenarios) | 4 TB per content database |
Content database size (document archive scenario) | No explicit content database limit |
Content database items | 60 million items including documents and list items |
Site collections per content database | 2,000 recommended, 5,000 maximum |
Remote BLOB Storage (RBS) storage subsystem on Network Attached Storage (NAS) | Time to first byte of any response from the NAS cannot exceed 20 milliseconds |
Web site | 250,000 per site collection |
Site collection size | Maximum size of the content database |
List row size | 8,000 bytes per row |
File size | 2 GB |
Documents | 30,000,000 per library |
Major versions | 400,000 maximum |
Items | 30,000,000 per list |
Row size limit | 6 table rows internal to the database used for a list or library item |
Bulk operations | 100 items per bulk operation |
List view lookup threshold | 8 join operations per query |
List view threshold | 5,000 maximum |
List view threshold for auditors and administrators | 20,000 maximum |
Subsite | 2,000 per site view |
Coauthoring in Microsoft Word and Microsoft PowerPoint for .docx, .pptx and .ppsx files | 10 concurrent editors per document |
Security scope | 1,000 per list |
Web parts | 25 per wiki or Web part page |
Number of SharePoint groups a user can belong to | 5,000 |
Users in a site collection | 2 million per site collection |
Active Directory Principals/Users in a SharePoint group | 5,000 per SharePoint group |
SharePoint groups | 10,000 per site collection |
Security principal: size of the Security Scope | 5,000 per Access Control List (ACL) |
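One way to put these numbers to work is to check a planned Records Center layout against them before you build it. The sketch below is only an illustration: the limit values are copied from the table above, while the key names, the `check_plan` helper and the sample plan are my own assumptions.

```python
# A few limits from the table above (general usage scenario for database size).
LIMITS = {
    "content_db_size_gb": 200,
    "site_collections_per_content_db": 2000,   # recommended value, not the 5,000 maximum
    "items_per_content_db": 60_000_000,
    "documents_per_library": 30_000_000,
    "file_size_gb": 2,
    "security_scopes_per_list": 1000,
}

def check_plan(plan):
    """Return (name, planned, limit) for every planned value that exceeds its limit."""
    return [
        (name, planned, LIMITS[name])
        for name, planned in plan.items()
        if name in LIMITS and planned > LIMITS[name]
    ]

# Hypothetical plan for one Records Center content database.
plan = {
    "content_db_size_gb": 350,
    "site_collections_per_content_db": 1200,
    "documents_per_library": 45_000_000,
}

violations = check_plan(plan)
for name, planned, limit in violations:
    print(f"{name}: planned {planned:,} exceeds limit {limit:,}")
```

For a dedicated document archive you may reasonably swap in a different database size figure, since the table lists no explicit content database limit for that scenario.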
The complete list of posts in this series:
1. Introduction
2. Document IDs
3. Managed Metadata Service (Term Store)
4. In-Place Records Declarations
5. Site Collection Auditing
6. Content Organizer
7. Compliance Details
8. Hold and eDiscovery
9. Content Type Publishing Hubs
10. Multi-Level Retention
11. Virtual folders and metadata based navigation
12. Scaling
13. Send To…
14. Document Sets
[1]
[2]
[3] http://technet.microsoft.com/en-us/library/ee748607.aspx
[4] http://www.houberg.com/2009/10/sp2010_scalability_2_of_4_sharepoint_search/
[5]