Microsoft® Office SharePoint® Server 2007
Fill in this worksheet using the following topics: Plan to crawl content |
After filling in this worksheet, use it with the following topics: Deployment for Office SharePoint Server 2007 |
Prepared by: |
Date: |
Specify the name of the Shared Services Provider (SSP) to which this worksheet pertains.
Note: Most organizations use only one SSP. If you are planning to use multiple SSPs, use a separate worksheet for each SSP.
SSP name |
Specify the content access account the crawler will use, by default, when crawling content.
Default content access account |
Use the following section of the worksheet to record your decisions about content sources. The section contains five tables, one for each content source type. If you need more than one content source of a particular type, copy the appropriate table as needed.
SharePoint sites
Use the following table to specify a content source for crawling SharePoint sites.
Content source name | |
Content source type |
SharePoint sites |
Start addresses | |
Crawl settings |
Select one: Crawl everything under the host name for each start address. Crawl only the SharePoint site of each start address. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Web sites
Use the following table to specify a content source for crawling Web sites.
Content source name | |
Content source type |
Web sites |
Start addresses | |
Crawl settings |
Select one: Crawl only within the server of each start address. Crawl only the first page of each start address. Custom - specify page depth and server hops: Limit page depth to _______ pages (default is unlimited). Limit server hops to _______ hops (default is unlimited). |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
File shares
Use the following table to specify a content source for crawling file shares.
Content source name | |
Content source type |
File shares |
Start addresses | |
Crawl settings |
Select one: The folder and all subfolders of each start address. Only the folder of each start address. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Exchange public folders
Use the following table to specify a content source for crawling Exchange public folders.
Content source name | |
Content source type |
Exchange public folders |
Start addresses | |
Crawl settings |
Select one: The folder and all subfolders of each start address. Only the folder of each start address. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Business data
Use the following table to specify a content source for crawling business data, sometimes called line-of-business data.
Content source name | |
Content source type |
Business data |
Start addresses | |
Crawl settings |
Select one: Crawl entire Business Data Catalog. Crawl selected applications. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
Crawl schedule |
Type of schedule: Daily Weekly Monthly Daily Run every ________ days. Starting time: ________ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Weekly Run every _______ weeks. On (select all that apply): Monday Tuesday Wednesday Thursday Friday Saturday Sunday Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. Monthly On the ______ day of the month. Select each month for which you want this schedule to apply: January May September February June October March July November April August December Starting time: _______ Repeat within the day (optional) Every _____ minutes. For ______ minutes. |
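When filling in the "Repeat within the day" fields, it can help to see how many crawls a schedule would actually start. The following Python sketch is only a planning aid, not part of Office SharePoint Server: the `DailySchedule` class and `runs_on` function are hypothetical names, and the sketch assumes that repeats are counted from the starting time and that the "For ___ minutes" window includes its end point.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta
from typing import List, Optional


@dataclass
class DailySchedule:
    """One 'Daily' crawl schedule entry from the worksheet (hypothetical model)."""
    every_days: int                          # "Run every ___ days"
    start: time                              # "Starting time"
    repeat_every_min: Optional[int] = None   # "Every ___ minutes" (optional)
    repeat_for_min: Optional[int] = None     # "For ___ minutes" (optional)


def runs_on(schedule: DailySchedule, first_day: date, day: date) -> List[datetime]:
    """Return the crawl start times triggered by the run on `day`, or []."""
    if schedule.every_days < 1:
        raise ValueError("every_days must be at least 1")
    offset = (day - first_day).days
    if offset < 0 or offset % schedule.every_days != 0:
        return []  # the schedule does not run on this day
    first = datetime.combine(day, schedule.start)
    if not schedule.repeat_every_min or not schedule.repeat_for_min:
        return [first]  # no repeat configured: a single crawl
    # Assumption: repeats are counted from the starting time and the
    # repeat window is inclusive of its end.
    count = schedule.repeat_for_min // schedule.repeat_every_min
    step = timedelta(minutes=schedule.repeat_every_min)
    return [first + i * step for i in range(count + 1)]


if __name__ == "__main__":
    plan = DailySchedule(every_days=2, start=time(1, 0),
                         repeat_every_min=60, repeat_for_min=180)
    for started in runs_on(plan, date(2007, 1, 1), date(2007, 1, 3)):
        print(started.isoformat())
```

Under these assumptions, a schedule that runs every 2 days at 01:00 and repeats every 60 minutes for 180 minutes starts four crawls on each day it runs.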
Use this section of the worksheet to record your decisions regarding crawler impact rules. Make a copy of this table for each crawler impact rule you want to define.
Site (URL) | |
Request frequency |
Choose one of the following: Request up to the specified number of documents at a time. (Select one.) - or - Request one document at a time and wait _______ seconds |
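To choose a wait value for the "one document at a time" option, a rough upper bound on the request rate can be estimated. The Python sketch below is illustrative arithmetic only; the function name and the `fetch_seconds` parameter (an assumed average time to retrieve one document) are hypothetical and do not come from the product.

```python
def max_documents_per_hour(wait_seconds: float, fetch_seconds: float = 0.0) -> float:
    """Upper bound on documents requested per hour from one site.

    wait_seconds  -- the "wait ___ seconds" value recorded in the worksheet
    fetch_seconds -- assumed average time to fetch one document (hypothetical)
    """
    per_document = wait_seconds + fetch_seconds
    if per_document <= 0:
        raise ValueError("wait_seconds + fetch_seconds must be positive")
    # One document is requested per (wait + fetch) interval.
    return 3600.0 / per_document


if __name__ == "__main__":
    print(max_documents_per_hour(10))        # wait only
    print(max_documents_per_hour(10, 2.0))   # wait plus an assumed 2 s fetch
```

With a 10-second wait, the crawler would request at most 360 documents per hour from the site; allowing for an assumed 2-second fetch time lowers the estimate to 300.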
Use the following table to record any third-party or custom protocol handlers you will need during deployment.
Tip: Review the start addresses listed in the Content sources section of this worksheet to determine which protocol handlers are required to access the data that you want to crawl, and then list in the following table the protocol handlers that are not provided by default.
Protocol handlers |
Use this section of the worksheet to record your decisions regarding crawl rules. Make a copy of this table for each crawl rule you want to define.
Path | |
Crawl configuration |
Choose one of the following options: Exclude all items in this path. - or - Include all items in this path and optionally select any of the following: Follow links on the URL without crawling the URL itself. Crawl complex URLs. Crawl content in SharePoint sites as HTTP pages. |
Specify content access account |
Choose one of the following options: Use the default content access account when crawling this path. - or - Use this content access account _____ _______ ______ ___________ Allow basic authentication? Yes No - or - Use the client certificate _____ _______ ______ __________ |
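Before recording several crawl rules, it can be useful to check how overlapping paths would interact. The Python sketch below assumes that rules are evaluated in the order listed, that the first matching rule decides the outcome, and that `*` in a path matches any run of characters; confirm these assumptions against the product documentation. `CrawlRule`, `decide`, and the example URLs are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CrawlRule:
    path: str      # "Path" from the worksheet; may contain "*" wildcards
    include: bool  # True = include all items, False = exclude all items


def _to_regex(path: str):
    # Escape every character, then turn the escaped "*" back into "match anything".
    return re.compile(re.escape(path).replace(r"\*", ".*"), re.IGNORECASE)


def decide(url: str, rules: Iterable[CrawlRule]) -> Optional[bool]:
    """Return the include flag of the first matching rule, or None if none match."""
    for rule in rules:
        if _to_regex(rule.path).fullmatch(url):
            return rule.include
    return None


if __name__ == "__main__":
    rules = [
        CrawlRule("http://intranet/archive/*", include=False),  # listed first
        CrawlRule("http://intranet/*", include=True),
    ]
    for url in ("http://intranet/archive/2006.doc",
                "http://intranet/hr/policy.doc",
                "http://other/x"):
        print(url, decide(url, rules))
```

Because the more specific exclusion is listed first, items under the archive path are excluded while the rest of the site is included. Reversing the order would let the broader rule match first.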
Use the following table to record your decisions about the file types that you want to include in the file-type inclusions list.
File types to add |
Require additional IFilter? (Yes/No) |
Use the following table to list the file types that you do not want to crawl and that you want to remove from the file-type inclusion list.
File types to remove |
Use the following table to record the languages for which you need to install word breakers and stemmers.
Languages of word breakers and stemmers |
Use the tables in this section to record the decisions you make about farm-level search settings.
Contact e-mail address
Record the e-mail address of the person in your organization whom external site administrators can contact if problems arise when their site is being crawled.
Contact e-mail address |
Proxy server settings
Will you configure proxy server settings to use when crawling other servers?
Yes No
If yes, use the following table to record the proxy server settings to use.
Address (required) | |
Port (optional) | |
Bypass proxy server for local (intranet) addresses? |
Yes No |
Do not use proxy server for addresses beginning with: |
The address can be either the NetBIOS name or the IP address of the proxy server.
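The two bypass settings in the table above can be sanity-checked against sample addresses before deployment. The Python sketch below is a simplified model, not the product's logic: `use_proxy` is a hypothetical function, the example URLs are placeholders, and the sketch assumes that a host name without a dot counts as a local (intranet) address.

```python
from typing import Sequence
from urllib.parse import urlsplit


def use_proxy(url: str, bypass_local: bool, no_proxy_prefixes: Sequence[str]) -> bool:
    """Return True if a request for `url` would go through the proxy server."""
    lowered = url.lower()
    # "Do not use proxy server for addresses beginning with"
    if any(lowered.startswith(p.lower()) for p in no_proxy_prefixes if p):
        return False
    # Assumption: a host name without a dot is a local (intranet) address.
    host = urlsplit(url).hostname or ""
    if bypass_local and host and "." not in host:
        return False
    return True


if __name__ == "__main__":
    print(use_proxy("http://intranet/sites/hr", True, []))
    print(use_proxy("http://www.example.com/", True, []))
    print(use_proxy("http://partner.example.com/docs", False, ["http://partner."]))
```

Running sample start addresses from the Content sources section through a check like this can show which content sources would depend on the proxy server being available.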
Time-out settings
Use the following table to record the amount of time that the search server will wait while connecting to other services.
Connection time (in seconds) | |
Request acknowledgement time (in seconds) |
SSL certificate warning configuration
Do you want to ignore Secure Sockets Layer (SSL) certificate name warnings and trust that sites are legitimate even if their certificate names are not exact matches?
Yes No