Mission
The mission is to create a Digital Library that
will foster creativity and free access to all
human knowledge. As a first step in realizing
this mission, we propose to create the Digital
Library with a free-to-read, searchable collection
of one million books, available to everyone
over the Internet. We expect the collection to
grow to 10 million books within 5 years. The
result will be a unique resource accessible to
anyone, anywhere in the nation or the world, 24x7,
without regard to nationality or socioeconomic
background.
One of the goals of the Digital
Library is to provide support for full text
indexing and searching based on OCR (optical
character recognition) technologies where available.
The availability of online search allows users
to locate relevant information quickly and reliably,
thus enhancing students' success in their research
endeavors. This 24x7 resource would also provide
an excellent test bed for language processing
research in areas such as machine translation,
summarization, intelligent indexing, and information
retrieval.
It is our expectation that
the Digital Library will be mirrored at several
locations worldwide so as to protect the integrity
and availability of the data. Several models
for sustainability are being explored. Usability
studies will also be conducted to ensure that
the materials are easy to locate, navigate,
and use. Appropriate metadata for navigation
and management will also be created.
Goals
The primary long-term objective is to capture
all books in digital format. It is believed
that such a task is impossible, since it could take
hundreds of years and might never be completed. Thus,
the first step is to demonstrate feasibility
by undertaking to digitize 1 lakh (100,000) books. This
will be achieved within a two-year timeframe.
We will continue to digitize books at 5 scanning
centers across the nation to achieve the long-term
objective. We believe such a project has
the potential to change how education is conducted
in much of the world.
Typical large high-school libraries
house fewer than 30,000 volumes. Most libraries
in South Asia have fewer than a million volumes.
The total number of different titles to be indexed
in the DLP of Open Forum over the next 5 years is about
1 lakh. One lakh books, therefore, is
more than the holdings of most high schools,
is equivalent to the libraries at many universities,
and represents a useful fraction of all available
books.
A secondary objective of this
project will be to provide a test bed that will
support other researchers who are working on
improved scanning techniques, improved optical
character recognition, and improved indexing.
The corpus this project creates will be one
to three orders of magnitude larger than any
existing free resource.