|
Getting Started with Indexing
Section One: Basic Properties About Your Site
Section Two: Contact Information About Your Site
Section Three: Information About Your Organization
Section Four: Do you want to generate HTML code for a search box that will search only your site?
Section Five: Subject Analysis
Section Six: Calculate My Metadata
Illinois GILS Spidering
How to Keep Specific Documents Out of Find-It! Illinois
Additional Services and Reports Available to Illinois GILS Agencies
Find-It! Illinois Resources for Illinois State Agency Webmasters
Find-It! Illinois Contact Information
|
AGENCY GUIDE TO INDEXING WEBSITES
Getting Started with Indexing
This chart
presents the Find-It! Illinois metadata elements to be used in indexing your
information for IGI. For your convenience, we have provided you with our
Metadata Generator that converts all of your information into metadata. The
following chart describes the content of each element and gives examples of
what it will look like after the metadata is calculated/formatted.
Section 1: Basic Properties About Your Site
|
Element |
Description
and Example |
| A.
Site title |
This is the
officially assigned title of your document/page or the name by which
the resource is formally known.
Example:
<meta name="siteTitle" content="Illinois State
Library Homepage"> |
| B.
Keywords |
This section will include any terms,
acronyms and/or phrases that users may use to find your information;
even incorrect terms that are commonly used should be added here.
Examples:
Junior League of Springfield is a correct phrase;
however, people also refer to it as The Springfield Junior League and
The Junior Womens League.
<meta name="keywords" content="Junior
League of Springfield">
<meta name="keywords" content="The
Springfield Junior League">
<meta name="keywords" content="The Junior Womens League">
|
| C. Description |
This is a narrative that
summarizes the content and purpose of your document/page information.
It should be fewer than 500 words and can be copied and pasted
from any existing Web source.
Example:
<META
NAME="description" CONTENT="The Department of Business
Services is a centralized repository for business filings in six
areas: Corporations (For profit and Not for profit), Limited Liability
Companies, Registered Limited Liability Partnerships, Limited
Partnerships, Trademarks, and Uniform Commercial Code. Through fees
and taxes, this Department generates approximately 69 percent of the
money deposited into the General Revenue Fund by the Secretary of
State's office. All documents are filed in the Springfield office, but
pamphlets and forms, as well as Certificates of Good Standing and No
Record are available through the Chicago office.">
|
| D. Date created |
The date on which your document was created. The example below uses the
MMDDYYYY layout (July 18, 1996).
Example:
<meta name="createDate"
content="07181996">
|
| E. Date last
modified |
Update this
date each time you change any content of the document. This is
especially important for archival purposes.
Example:
<meta name="dateofLastModification"
content="03011998">
|
| F. Time period |
The timeframe
covered by the content of your document.
Examples:
<meta name="timePeriodTextual"
content="Statistics collected for FY 1993-1995">
<meta name="timePeriodTextual"
content="Covers period of 1922-1945">
<meta name="timePeriodTextual"
content="historical records from 1889-1939">
|
| G. Medium |
Choose one of the following: Website, CD-ROM, Database, Microfilm,
Paper, Photograph, Sound Recording, or Image.
Example:
<meta
name="medium" content="website">
|
| H. Site to be
retained for permanent public access? |
"Yes"
is the automatic default because most documents you will be applying
metadata to will be public documents.
Example:
<meta name="PermanentPublicAcess"
content="Yes">
|
| I. Unique
control ID or number? |
If your
organization has an official control ID or number assigned to your
electronic document, complete this section.
Examples:
<meta name="originalControlIdentifier"
content="1040A">
<meta name="originalControlIdentifier"
content="Executive Order 97-02">
|
| J. Agency
program |
This section identifies the
official name for the agency program for your document. Usually it
corresponds to the program budget name.
Example:
<meta name="agencyProgram"
content="Crime Victims Compensation">
|
| K. Related
sources |
This provides a
direct link to other information on the Web
that could be helpful to your audience.
Example:
If someone were looking for information on public
improvements, the Capital Development Board may want to reference a
link to the specific Illinois Department of Revenue Web page that
concerns the tax exemption for building materials for public
improvements.
<meta name="linkage"
content="http://www.revenue.state.il.us"> |
| L. Language |
English will be the default
unless you make a change.
Examples:
<meta name="language"
content="EN">
<meta name="language" content="SP"> |
| M. Government
type |
Choose whether your document
is a state or local publication.
Example:
<meta
name="language" content="state">
|
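The elements in the chart above all share one `<meta name="..." content="...">` shape, so they can be produced mechanically. The following is a minimal Python sketch of that idea, not the Metadata Generator itself. The helper names (`format_date`, `meta_tag`, `build_metadata`) are hypothetical, and the MMDDYYYY date layout is an assumption read off the `createDate` example.

```python
from datetime import date
from html import escape


def format_date(d: date) -> str:
    """Format a date as MMDDYYYY (assumed from the createDate example)."""
    return d.strftime("%m%d%Y")


def meta_tag(name: str, content: str) -> str:
    """Build one <meta> tag, escaping characters that would break an attribute."""
    return (
        f'<meta name="{escape(name, quote=True)}" '
        f'content="{escape(content, quote=True)}">'
    )


def build_metadata(fields: list[tuple[str, str]]) -> str:
    """Emit one tag per (name, content) pair; repeated names are allowed."""
    return "\n".join(meta_tag(name, content) for name, content in fields)


# Values copied from the examples in the chart.
fields = [
    ("siteTitle", "Illinois State Library Homepage"),
    ("keywords", "Junior League of Springfield"),
    ("keywords", "The Springfield Junior League"),
    ("createDate", format_date(date(1996, 7, 18))),
    ("medium", "website"),
]
print(build_metadata(fields))
```

A list of pairs is used rather than a dictionary because elements such as keywords and time period may appear more than once.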
Section 2: Contact Information About Your Site
This section helps reference
librarians find information for patrons. It is especially important
that the contact person be someone who can answer questions or direct the
librarian to the right person. It is essential that the phone number be one
that is answered by a person and not a machine.
|
Element |
Example |
| A. Contact Name |
<meta name="contactName"
content="Anne Wendler"> |
| B. Organization |
<meta name="contactOrganization"
content="Illinois State Library"> |
| C. Contact
Address Line 1 & Line 2 |
<meta name="contactStreetAddress1"
content="300 South Second Street">
<meta name="contactStreetAddress2"
content="Third Floor LAT">
|
| D. City |
<meta name="contactCity"
content="Springfield"> |
| E. Zip |
<meta name="contactZipcode"
content="62701-1796"> |
| F. Email |
<meta name="contactNetworkAddress"
content="bmatheis@ilsos.net"> |
| G. Phone |
<meta name="contactPhoneNumber"
content="(217) 558-2065"> |
| H. Fax |
<meta name="contactFaxNumber"
content="(217) 557-6737"> |
Section 3: Information About Your Organization
This section will help searchers to consistently retrieve
potentially relevant items by searching under the name of the organization.
The elements start at the largest designation and go to the smallest. Even
though your organization may not require all five divisions, they are there
for those who may need them. Please do not use abbreviations.
|
Element |
Example |
| Jurisdiction |
State of Illinois |
| Section/Unit |
Secretary of State |
| Department |
Illinois State Library |
| Office |
Library Automation and Technology |
| Division |
Find-It! Illinois |
Section 4: Do you want to generate HTML code for a search box that will search only your site?
If you choose Yes, the
Metadata Generator will generate the HTML code necessary for you to place a
search box on your web site that allows users to search only your site.
Section 5: Subject Analysis
Select all the
subject categories from the Illinois
Subject Tree that describe your site. If you need
subjects tailored for your organization, contact the Find-It!
Illinois Outreach Coordinator, Connie Frankenfeld.
Section 6: Calculate My Metadata
At the very end of the
Metadata Generator, there is a clickable box for calculating your metadata.
After you click this box, your calculated/formatted metadata will appear on
your screen, where you can review it. If everything meets your
approval, copy and paste the calculated/formatted metadata
into the HEAD portion of your electronic document. It is important that you
paste it between the <HEAD> and
</HEAD> tags. Then post your newly indexed document to your
server, replacing the old one.
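For agencies that maintain many pages, the paste step can be scripted. The sketch below is one possible approach, assuming each page contains a closing `</HEAD>` tag in any letter case. The function name `insert_into_head` and the sample page are hypothetical and are not part of the Metadata Generator.

```python
import re


def insert_into_head(html: str, metadata: str) -> str:
    """Place the metadata block just before the closing </HEAD> tag."""
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("document has no closing </HEAD> tag")
    # Everything before </HEAD>, then the metadata, then the rest unchanged.
    return html[:match.start()] + metadata + "\n" + html[match.start():]


page = "<HTML><HEAD>\n<TITLE>Example</TITLE>\n</HEAD><BODY>Hello</BODY></HTML>"
metadata = '<meta name="medium" content="website">'
print(insert_into_head(page, metadata))
```

Inserting immediately before `</HEAD>` keeps any existing title and tags in place and guarantees the new tags land inside the HEAD portion.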
Illinois GILS Spidering
- After your agency has posted your new
indexed document, the Find-It! Illinois spider will visit your site and
add your new document information to the IGI database.
- Your site will be visited regularly.
Presently, we are on a weekly spidering schedule. Therefore, your
documents should be captured and assimilated in a matter of days.
- If you are using a proxy server or anything
else that might cause your public documents to elude the spider, you will
want to manually add your site by completing the Suggest A Site form.
- The spider will inventory all of your
primary links and pages. This information will be compared to locator
records in the IGI database. Replaced pages will be automatically added to
our permanent public access database.
- If, after two weeks, your pages do not show
up in search results, please contact Anne Wendler, Find-It! Illinois
Outreach Coordinator, at 217-417-0495.
How to Keep Specific Documents Out of Find-It! Illinois
First of all, to see which pages are spidered
and which aren't, it's useful to understand how the Find-It! Illinois spider
collects pages. It begins by gathering information from a starting point page,
typically the server's root page. It collects all the links on that page, then
visits each of those pages. If the page meets our criteria, then that page is
scanned for links, and those links are visited, and so on. This means that,
for a page to be collected into our system, it must ultimately be available
from the root, starting point page. If nothing links to a page, it won't be
found by our spider -- or by any other spider.
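The collection process described above can be sketched as a breadth-first walk over links. This is an illustration only, not the Find-It! Illinois spider: the site is modeled as an in-memory dictionary of hypothetical paths, the function name `crawl` is invented for the example, and the spider's page criteria are left out because the guide does not state them.

```python
from collections import deque


def crawl(links: dict[str, list[str]], root: str) -> list[str]:
    """Return the pages reachable from root, in breadth-first visiting order."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        page = queue.popleft()
        order.append(page)
        # Collect the links on this page and schedule any not yet seen.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Hypothetical site: each key is a page, each value lists the pages it links to.
site = {
    "/": ["/about", "/forms"],
    "/about": ["/"],
    "/forms": ["/forms/1040A"],
    "/internal-memo": ["/"],  # links out, but no page links to it
}
print(crawl(site, "/"))  # ['/', '/about', '/forms', '/forms/1040A']
```

The `/internal-memo` page never appears in the result because no page reachable from the root links to it.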
The way to prevent an internal-use page from
being spidered is to add meta tags that will block robot access. Borrowing
from the Netscape Compass Server documentation, here's an explanation:
The META tag that controls robot behavior uses
the name ROBOTS. Its content tells a visiting robot whether it should include
the document itself in its index and whether to follow hyperlinks found in the
document to index the linked documents. The general format for the ROBOTS tag
is as follows:
<META NAME="ROBOTS" CONTENT="terms">
The terms in the CONTENT portion can be
any of the following, separated by commas:
|
Content String |
Meaning |
| ALL |
The robot is
welcome to include this document in its index and to follow any links
found in it. This is the default value. You can get the same result by
leaving the CONTENT portion empty, by omitting the ROBOTS tag
entirely, or by using the contents "INDEX, FOLLOW". |
| NONE |
The robot should
ignore the page. This is the equivalent of "NOINDEX, NOFOLLOW". |
| INDEX |
The robot is
welcome to include the document in its index for searching. |
| NOINDEX |
The robot should
not include the document in its index. The robot can still follow
links, unless you also include the NOFOLLOW string. |
| FOLLOW |
The robot is
welcome to follow any hyperlinks in the document to locate other
documents for its index. |
| NOFOLLOW |
The robot should
not follow any hyperlinks in the document to locate other documents.
This enables you to index just the entry point of a complex document,
for example, or to index the open access point to an otherwise
restricted site. |
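To make the table concrete, the sketch below shows how a robot might turn a CONTENT string into two decisions: whether to index the page and whether to follow its links. It is a minimal illustration, not the code of any particular spider. The function name `parse_robots` is hypothetical, and two behaviors are assumptions: unrecognized terms are ignored, and when terms conflict the later one wins.

```python
def parse_robots(content: str) -> tuple[bool, bool]:
    """Return (index, follow) for the CONTENT value of a ROBOTS meta tag."""
    index, follow = True, True  # ALL is the default, also used for empty content
    for raw in content.split(","):
        term = raw.strip().upper()
        if term == "ALL":
            index, follow = True, True
        elif term == "NONE":
            index, follow = False, False
        elif term == "INDEX":
            index = True
        elif term == "NOINDEX":
            index = False
        elif term == "FOLLOW":
            follow = True
        elif term == "NOFOLLOW":
            follow = False
        # Assumption: unrecognized or empty terms are skipped.
    return index, follow


print(parse_robots("NOINDEX"))          # (False, True)
print(parse_robots("INDEX, NOFOLLOW"))  # (True, False)
print(parse_robots("none"))             # (False, False)
```

An internal-use page would typically carry `NOINDEX` or `NONE`, depending on whether the pages it links to should still be discoverable through it.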
Additional Services and Reports Available to Illinois GILS Agencies
All government agencies that
participate in Illinois GILS are entitled to additional helpful services and
reports. The following benefits are available:

Find-It! Illinois is a
service of the Illinois State Library, Jesse White, Secretary of State and
State Librarian, and is supported by a grant from the federal Institute of
Museum and Library Services.
Illinois State Library
300 South Second Street
Springfield, Illinois 62701
http://finditillinois.org
|