IMAGING FAQs
Q. What is a “document?”
Q. Can I edit or alter images?
Q. Do imaging systems support audit trails?
Q. What is the standard format used to store images?
Q. Which types of desktop operating systems are usually supported?
Q. How much disk space does an imaging system typically require?
Q. What if my database is too big to fit in one data volume?
Q. How much total RAM does imaging software require?
Q. Are special display cards or monitors required?
Q. Which manufacturers make document imaging scanners?
Q. What are the most common hardware and software scanner interfaces?
Q. How can I scan checks?
Q. How can I scan large format documents?
Q. What image resolution should I use?
Q. What about color files or photographs?
Q. How can I scan double-sided documents?
Q. Can I scan landscape and portrait pages together in one batch?
Q. How are “skewed” images handled?
Q. What file formats can a versatile system import?
Q. What is the difference between CD or DVD jukeboxes/changers and towers?
Q. Can I view combinations of images, text and index fields side by side?
Q. Can I open and display more than one document at a time?
Q. How can I re-sequence pages?
Q. Will I need a specialized “imaging” display?
Q. What is the advantage of a large monitor for “power users?”
Q. What is important besides monitor size?
Q. Will I need a specialized printer for images or OCR’ed text?
Q. In which formats can I export documents?
Q. What is OCR?
Q. What is the difference between OCR and indexing?
Q. How accurate is OCR?
Q. Do I have to go through and correct OCR mistakes?
Q. How fast is the OCR process?
Q. What is ICR (Intelligent Character Recognition)?
Q. What is OMR (Optical Mark Recognition)?
Q. Can OCR’ed text be exported and re-used in a word processor?
Q. Can I manually correct OCR errors and typos?
Q. What is the difference between COLD and imaging?
Q. How many index fields can the COLD server extract from each report?
Q. What is Document Imaging and Management?
Q. What is a Document?
Q. How does it store the documents?
Q. What if my application demands many on-line disks?
Q. How are documents captured?
Q. Can colored documents be captured?
Q. How do I retrieve the document I want?
Q. Can I access information over the World Wide Web?
Q. What are imaging systems best used for?
Q. What Space Saving does Imaging Offer?
Q. Can I view images on my existing hardware?
Q. What about the legality of images held?
Q. What is OCR and where does it fit in with imaging?
Q. What is an electronic document?
Q. How would I access and use this electronic document?
Q. Do you store the documents on tape, CD, or DVD?
Q. Are there any hardware and software requirements?
Q. What electronic formats do you offer?
Q. How many documents can fit on one CD?
Q. Can I add notes to the scanned documents?
Q. Do I need special training to use a document CD?
Q. What if I want to add more documents to my MSI CD that I had made last year?
Q. What if I want another copy of my document CD later?
Q. What is a “document?”
A. A document can be from one to several thousand pages, and can include images and/or text, plus annotations, and one template (index card).
Q. Can I edit or alter images?
A. An imaging system should not provide any facility for editing or altering images. This is important as many users consider that images should be sacrosanct and that any changes would undermine the integrity of the system. In addition, the system should provide an audit trail function to keep track of which users have accessed which documents at what times.
Q. Do imaging systems support
audit trails?
A. An imaging system’s audit trail product should record a user name, date, time, document name and action whenever a user accesses a database or document. Various levels of audit-trail logging detail and activity tracking should be available. The system should also support a viewer for sorting and filtering these logs.
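As a minimal sketch of what such a log might look like, the example below records the five items listed above and filters them the way a log viewer could. The class name, field names and action values are illustrative assumptions, not the format of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEntry:
    """One audit-trail record (hypothetical layout)."""
    user: str
    timestamp: datetime   # date and time of the access
    document: str
    action: str           # e.g. "open", "print", "export" (illustrative values)


def filter_log(entries, user=None, document=None, action=None):
    """Return the entries matching every given criterion, oldest first."""
    selected = [
        e for e in entries
        if (user is None or e.user == user)
        and (document is None or e.document == document)
        and (action is None or e.action == action)
    ]
    return sorted(selected, key=lambda e: e.timestamp)
```

For example, filter_log(log, document="invoice-17") would list everyone who touched that document, in time order.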
Q. What is the standard format
used to store images?
A. Black and white images are most commonly stored as standard TIFF files using CCITT Group 4 (two-dimensional) compression. Grayscale and color images are frequently stored as TIFF files with JPEG compression.
Q. Which types of desktop operating systems are usually supported?
A. Most imaging systems have client applications that can run as Windows applications on Windows 95, 98 and Windows NT. Internet/intranet systems may be able to run on additional platforms, such as Macintosh and Unix, among others.
Q. How much disk space does
an imaging system typically require?
A. With the rapid drop in prices for hard drives and optical media, it costs much less to store documents on an imaging system than with paper. A single page typically occupies around 50KB of disk space if the image is stored in TIFF Group IV. Each gigabyte (GB) of storage space (which costs only a few dollars) will hold approximately 20,000 pages.
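The arithmetic behind that estimate can be sketched as follows. The 50 KB average comes from the answer above; the function names and the use of decimal units (1 GB = 1,000,000 KB) are assumptions made for this example, and real page sizes vary with content and resolution.

```python
def pages_per_volume(volume_gb: float, avg_page_kb: float = 50.0) -> int:
    """Estimate how many scanned pages fit on a volume.

    Assumes decimal units (1 GB = 1,000,000 KB) and a uniform page size.
    """
    if avg_page_kb <= 0:
        raise ValueError("avg_page_kb must be positive")
    return int(volume_gb * 1_000_000 // avg_page_kb)


def volume_gb_needed(page_count: int, avg_page_kb: float = 50.0) -> float:
    """Estimate the storage (in GB) needed for a given number of pages."""
    return page_count * avg_page_kb / 1_000_000
```

With these assumptions, pages_per_volume(1) gives 20,000 pages, and one million pages need about 50 GB.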
Q. What if my database is too
big to fit in one data volume?
A. A high-end imaging system will allow data and images to be stored across multiple volumes, with each volume residing in a different directory or on a different drive, disk array, CD or MO disk.
Q. How much total RAM does imaging software require?
A. Client software generally requires 16 to 20 MB of RAM to run, with higher requirements for scanning and OCR. Most systems recommend having 64MB or more.
Q. Are special display cards or monitors required?
A. Most systems work with any Windows-compatible video card and VGA (or better) monitor, and recommend at least a 15" monitor with a resolution of at least 800 x 600 pixels.
Q. Which manufacturers make document imaging scanners?
A. Some of the top scanner manufacturers include Ricoh, Fujitsu, Panasonic, Bell & Howell, Canon, Hewlett Packard, Avision, Mitsubishi, Visionshape, Kodak and BancTec. Document imaging scanners typically have document feeders and fast scan rates to quickly bring in large amounts of documents.
Q. What are the most common hardware and software scanner interfaces?
A. Kofax Image Controls (http://www.kofax.com) provide the most popular document imaging scanner interfaces.
Many scanners attach to an Adaptec SCSI card or to a Kofax Image processing board. Most scanners use either TWAIN or ISIS scanner drivers to communicate with the computer.
Q. How can I scan checks?
A. Several manufacturers make scanners specifically designed for checks that read the magnetically encoded MICR numbers at the bottom of the check. If you do not have one of these scanners, most checks can be scanned with regular document imaging scanners and OCR’ed as usual, though the MICR numbers will not be read.
Q. How can I scan large format documents?
A. Several manufacturers, including Contex, Vidar, Océ and Calcomp, make scanners specifically designed for large format documents up to E-size (34" x 44") and A-0 size (33" x 46.8"). If you do not have one of these, the document can be reduced in size using a photocopier and then scanned with a normal scanner, or sent to a service bureau that has large format scanners.
Q. What image resolution should
I use?
A. Most imaging systems can support documents scanned at various resolutions, from 50 dpi to 600 dpi (or more) depending on your scanner. Depending on the purpose and the contents of the page, most documents are scanned in black and white at 300 dpi.
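To see what a resolution choice means in practice, the sketch below computes the pixel dimensions and uncompressed size of a scanned page. The function name is hypothetical, and the figures are for the raw bitmap before any compression is applied.

```python
def bitmap_size(width_in: float, height_in: float, dpi: int, bits_per_pixel: int = 1):
    """Return (width_px, height_px, uncompressed_bytes) for a scanned page.

    bits_per_pixel: 1 for black and white, 8 for grayscale, 24 for color.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return width_px, height_px, (total_bits + 7) // 8  # round up to whole bytes
```

A letter-size page (8.5" x 11") at 300 dpi in black and white comes to 2550 x 3300 pixels, or about 1 MB uncompressed. Compared with the roughly 50 KB per stored page quoted earlier in this FAQ, that suggests a compression ratio of around 20:1, though the real ratio depends on the page contents.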
Q. What about color files or photographs?
A. Imaging systems should support black and white, grayscale and color images. Color files can be scanned with a color scanner or imported into an imaging system. There is a wide range of color scanners on the market. Many document imaging scanners support color and grayscale.
Q. How can I scan double-sided documents?
A. An imaging system should provide two different ways to do this. It should support duplex scanners, which simultaneously scan both sides of a page. Also, with a simplex scanner, the user should be able to scan all the front sides, place the documents in upside down and scan all the back sides, and then the system should automatically collate the pages into the correct order.
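The automatic collation step for a simplex scanner can be sketched as below. This assumes that turning the stack over reverses the order in which the back sides are fed, so the backs arrive last page first; the function name is hypothetical and a real system may work differently.

```python
def collate_simplex(fronts, backs_reversed):
    """Interleave front and back page images into reading order.

    fronts:          [p1, p3, p5, ...] in the order scanned
    backs_reversed:  [..., p6, p4, p2], assumed reversed because the
                     stack was turned over before the second pass
    """
    if len(fronts) != len(backs_reversed):
        raise ValueError("front and back counts differ; rescan the batch")
    backs = list(reversed(backs_reversed))
    pages = []
    for front, back in zip(fronts, backs):
        pages.extend([front, back])
    return pages
```

For a three-sheet batch, fronts p1, p3, p5 and backs p6, p4, p2 are collated into p1 through p6.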
Q. Can I scan landscape and portrait pages together in one batch?
A. An imaging system should allow you to change the orientation of pages as you scan or after scanning. A well-designed system will also include an option to automatically check and correct the orientation of pages.
Q. How are “skewed” images handled?
A. Skewed (crooked or tilted) images can adversely affect the accuracy of the OCR process, so an imaging system should include software that recognizes skewed images and compensates for them. This is particularly important when scanning press cuttings on a flatbed scanner or when scanning documents through a worn-out or poorly designed ADF (automatic document feeder).
Q. What file formats can a
versatile system import?
A. A versatile system should be able to import the files you would encounter in your office. This includes word processing files, spreadsheets and presentations as well as common image formats such as TIFF 4, TIFF 3, TIFF Raw, TIFF LZW, PCX, BMP, CALS, JPEG, GIF, PICT, PNG and EPS Preview images. An imaging system providing long term archival of documents should allow the images of each page to be stored in a non-proprietary format. For example, electronic document pages would be “printed” to the imaging system, black and white graphical files would be converted to TIFF Group 4 format and color/grayscale images would be converted to TIFF JPEG.
Q. What is the difference between CD or DVD jukeboxes/changers and towers?
A. In a jukebox/changer, there are more slots and disks than there are drives. Robotic mechanisms automatically place the correct disk into one of the drives when the disk is needed. In a tower, many CD or DVD drives are stacked together in a single unit, and every disk is always sitting in a drive. Towers provide faster data access but typically cost more per disk and do not hold as many disks. Jukeboxes/changers cost less per disk and can hold up to 500 disks, but are slower because swapping disks in and out of the drives is time-consuming.
Q. Can I view combinations of images, text and index fields side by side?
A. To allow convenient access to document information, a well-designed imaging system will allow the view screen to be configured to show the text, images, template index fields or thumbnail images.
Q. Can I open and display more than one document at a time?
A. Some imaging systems will allow you to display multiple documents, with the number of documents you can have open simultaneously limited only by the amount of memory available.
Q. How can I re-sequence pages?
A. If pages are out of order and need to be re-sequenced, a well-designed imaging system will allow “thumbnail” views of pages to be simply dragged to the required position. In the same way, individual pages can be selected and deleted, subject to appropriate security access control and privileges.
Q. Will I need a specialized “imaging” display?
A. No, most systems run perfectly well on standard VGA and better monitors. A 15" display using a Super VGA controller should be considered the absolute minimum practical display for an ad hoc user of the system. Frequent users should have a 17" monitor, and users who scan or review imaged documents full-time may want to consider a 19" or 21" monitor.
Q. What is the advantage of
a large monitor for “power users?”
A. For people who use an imaging system intensively, screen size can be a critical factor. If users are to flip between pages with the ease of real paper, they must be able to view the whole page at once in a way that allows the text to be readable. If 8 1/2" x 11" pages are the dominant paper size, then a 21" monitor capable of displaying 1600 x 1200 pixels is optimal. Using a standard 14" VGA monitor will require scrolling and panning if the image is viewed at normal size.
Q. What is important besides
monitor size?
A. Screen resolution and the refresh rate of the monitor are also important. Generally, the larger a monitor is and the higher resolution it has, the harder it is to get the high refresh rate that is required for sustained viewing without screen flicker. The optimum threshold for minimum flicker is generally considered to be a vertical refresh rate of 72 Hz on a 21" monitor. The maximum refresh rate is a function of the monitor and the graphics controller.
Q. Will I need a specialized
printer for images or OCR’ed text?
A. Generally no. Most imaging systems support most Windows-compatible
printers, but recommend that you use a laser printer
with at least 4 MB of RAM. If you are using a networked system
and printing high volumes of pages to a network printer, you
might consider installing a separate laser printer either
locally or on its own network segment to minimize network
traffic.
Q. In which formats can I export
documents?
A. It depends on the imaging system. Common graphical formats
you may need include TIFF III, TIFF IV, TIFF Raw, BMP, GIF,
CALS and JPEG.
Q. What is OCR?
A. OCR stands for Optical Character Recognition, which is
how a computer converts words in an unsearchable scanned image
to searchable text. OCR is usually necessary in order to use
full-text indexing and searches, and it should be included
in an imaging system. OCR engines can generally only recognize
typed or laser-printed text, not handwriting.
Q. What is the difference between
OCR and indexing?
A. OCR is the process of converting scanned images to text
files. Full-text indexing is the process of taking a text
file and adding each word to an index file that specifies
the location of every word in every document. Well-designed
imaging software can make this a fast and easy procedure,
providing rapid access to any word in any document.
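A minimal sketch of full-text indexing, assuming the OCR'ed text is already available as one string per page. The function names and the tokenizing rule are illustrative choices, not those of any particular imaging product.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the (document, page, position) places it occurs.

    documents: {doc_name: [page_text, ...]} holding OCR'ed text per page.
    """
    index = defaultdict(list)
    for doc_name, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            # Lowercase the text and split it into simple word tokens.
            words = re.findall(r"[a-z0-9']+", text.lower())
            for pos, word in enumerate(words):
                index[word].append((doc_name, page_no, pos))
    return index


def search(index, word):
    """Return every recorded location of a word (empty list if absent)."""
    return index.get(word.lower(), [])
```

Looking up a word is then a single dictionary access, which is why a search stays fast even when the collection is large.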
Q. How accurate is OCR?
A. Accuracy on a freshly laser-printed page is typically better
than 99.6%. Accuracy on faxed, dirty or degraded documents
will of course be lower, but a few imaging systems have image
clean-up technology that can improve OCR accuracy.
Q. Do I have to go through
and correct OCR mistakes?
A. Not if the imaging system supports “fuzzy” logic, which
will find words even if the OCR engine made a few mistakes.
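One way to approximate this kind of tolerant search is similarity matching from Python's standard difflib module. This is only a sketch of the idea: commercial systems use their own matching algorithms, and the 0.8 cutoff and the function name are assumptions chosen for illustration.

```python
import difflib


def fuzzy_find(query, ocr_words, cutoff=0.8):
    """Return OCR'ed words that approximately match the query."""
    # Deduplicate and normalize case before comparing.
    vocabulary = sorted({w.lower() for w in ocr_words})
    return difflib.get_close_matches(query.lower(), vocabulary, n=10, cutoff=cutoff)
```

For instance, a search for "invoice" would still find a page on which the OCR engine misread the word as "lnvoice".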
Q. How fast is the OCR process?
A. The performance of the OCR and indexing processes is entirely
dependent on factors such as the speed and configuration of
the host system as well as the contents of the image. A 133
MHz Pentium generally needs about 6 seconds per page, while
a 450 MHz Pentium II will take about 2-3 seconds per page.
Q. What is ICR (Intelligent
Character Recognition)?
A. ICR is pattern-based character recognition and is also
known as Hand-Print Recognition. Handwritten text is more
difficult for computers to recognize and results in higher
error rates than printed text. ICR engines usually do best
at recognizing constrained printing, which means block printed
letters with one letter in each box. Accurate recognition
of unconstrained handwriting, especially cursive handwriting,
typically requires that the ICR engine be trained to recognize
each user’s style of writing.
Q. What is OMR (Optical Mark
Recognition)?
A. OMR, also called Mark-Sense Recognition, is the recognition
of marks commonly used on forms, such as check marks, circled
choices, and filled-in bubbles. OMR can be an important part
of an imaging system for organizations that process many standard
forms. Scantron exam forms and customer survey cards are perhaps
the best-known examples of OMR in action.
Q. Can OCR’ed text be exported
and re-used in a word processor?
A. Yes, you can usually cut and paste text between the imaging
system and another Windows application, or you can export
complete text files (all text pages in a document) to a directory
and open them with your favorite word processor.
Q. Can I manually correct OCR
errors and typos?
A. Well-designed systems allow users to correct OCR errors
from within the system. However, when hundreds or thousands
of pages are scanned every day, it is usually not practical
to have someone clean up the text. If fuzzy logic search capabilities
are available, it is not necessary to correct the text as
searches will typically still find misread words.
Q. What is the difference between COLD and imaging?
A. Imaging is for scanning, compressing, storing, indexing,
OCRing, searching and retrieving millions of pages of paper
documents or electronic documents archived as permanent images.
COLD is for archiving, indexing, searching and printing reports
from huge text files generated by mainframes, mini-computers
and other computer applications. COLD stores huge report files
and extracted index fields on hard disk, optical cartridge
or CD-ROM instead of printing all the information out on paper
or storing it to microfilm.
Q. How many index fields can the COLD server extract
from each report?
A. The number of index fields is usually unlimited. However,
the more fields extracted from each report, the slower the
extraction process will run and the larger the index files
will be.
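As a sketch of how index fields might be pulled out of a report, the example below applies one regular expression per field to each page. The report layout, the field names and the use of a form feed as the page separator are all hypothetical; a real COLD server is configured for the actual report format.

```python
import re

# Hypothetical report layout: each page carries a header line such as
# "ACCOUNT: 10023   DATE: 1999-03-31   BRANCH: 07"
FIELD_PATTERNS = {
    "account": re.compile(r"ACCOUNT:\s*(\d+)"),
    "date": re.compile(r"DATE:\s*(\d{4}-\d{2}-\d{2})"),
    "branch": re.compile(r"BRANCH:\s*(\d+)"),
}


def extract_index_fields(report_text, patterns=FIELD_PATTERNS):
    """Return one dict of index fields per form-feed-separated page."""
    records = []
    for page_no, page in enumerate(report_text.split("\f"), start=1):
        record = {"page": page_no}
        for name, pattern in patterns.items():
            match = pattern.search(page)
            record[name] = match.group(1) if match else None
        records.append(record)
    return records
```

Each extra field adds another pattern search per page and another value per record, which is the trade-off between field count, extraction speed and index size described above.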
Q. What is Document Imaging and Management?
Paper and electronic documents are at the root of our modern
business environment. Document Management provides information
retrieval when and where it is needed.
Q. What is a Document?
A document is a container for information. Documents exist
as paper files, scanned images, electronic files, and printed
statements. The objects that modern filing systems can track
include video clips, sound files, photos, x-rays, HTML files,
presentations, images, paper and products of our favorite
office suite.
Document Management means being able to store, sort, index,
and combine these information containers for easy retrieval.
Q. How does it store the documents?
Documents may or may not be stored in their native format
or file type. Some systems use proprietary file structures,
while others treat everything stored as a single document
type, such as a TIFF image. Storing documents in the format in
which they were created is useful if you are going to use the
documents again or want them to retain their special properties.
The document is sent to a local drive or network device and
stored there, and the imaging system retains a pointer (address)
to its location. Even if the document needs to be in several
places at once, some systems can support a hierarchical search
that locates the nearest match in a tree search. The storage
medium determines the speed and reliability of access plus the
overall cost.
Q. What if my application demands many on-line disks?
When many optical disks are required for storage, a device
called a jukebox can be used. It automatically mounts any one
of the disks held within it whenever a user requests it.
Jukeboxes can hold anything from 5 to 200 optical disks,
providing a capacity of up to a terabyte (1,000 GB) and
near-line access to many millions of document images.
Q. How are documents captured?
The most common form of input to a document imaging system
is scanned paper. A document may be a single page or many
pages. Scanning converts a paper document into a series of ones
and zeros that faithfully represent the original: light reflected
from the page falls on a sensor, which converts the image to
electronic "ones" and "zeros". Automatic document feeders built
into the scanner move pages in sequence past the sensor at
anywhere from 12 pages per minute to hundreds of pages per
minute.
Scanners are rated by speed (pages per minute), resolution
(dots per inch, e.g. 200, 300 or 400), format (color, grayscale,
black and white) and page handling (double-sided or single-sided,
standard or legal size).
Several processes can then be added to scanners or their
interfaces to enhance the scanned image. These enhancements
can include color dropout lamps, deskew, despeckle, continuous
contrast adjustments, thresholds and bar code recognition.
As each page is brought into the document management system,
it is indexed so it can be found again. There are three common
methods. The first is like a telephone book entry that links
your name to your home address: the document is identified by
a few fields such as name and street. The second is like taking
an inventory of the whole house and listing the contents in a
search table of key word values. The third uses a folder to
group similar objects, so that, for example, all white houses
are located in the same folder.
Q. Can colored documents be captured?
Yes. Colored text and drawings can usually be handled without
difficulty by the scanners, and red writing on red card is no problem
at all. As long as there is a clear definition between the
foreground and background, scanning will be successful. File
sizes will be much larger than black and white, which will
affect storage cost and transmission times. Compression technology
can be employed to reduce file sizes.
Q. How do I retrieve the document I want?
When documents are captured, they must be indexed on the
system. Indexing involves entering data onto an index page
within the database, with references unique to the specific
document, e.g. an insurance policyholder's name, postcode,
policy number, policy type and issue date. Future retrievals
can then use any one of these fields or a combination of them.
The types of searches which may be performed include search
by form and full-text retrieval. The latter allows every word
associated with the document (without quantity or length
restrictions) to be used as a search key.
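Retrieval by any combination of index fields can be sketched as follows, using the insurance example above. The record layout, the field names and the find function are hypothetical; a production system would keep the index in a database rather than in an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """One index page for a stored document (illustrative fields)."""
    holder_name: str
    postcode: str
    policy_number: str
    policy_type: str
    issued: str        # ISO date string, e.g. "1998-06-01"
    image_path: str    # pointer to the stored image file


def find(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]
```

A call such as find(records, policy_type="motor", postcode="AB1 2CD") narrows the result by two fields at once, which corresponds to a "search by form" with two boxes filled in.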
Q. Can I access information over the World Wide Web?
With the advent of the World Wide Web, most imaging applications
have brought out web-based tools. Through its continual adoption
of new technologies, Xerox has embraced the Internet as a
widely available and highly suitable distribution medium to
give companies fast access to digitized paper records.
Documents scanned are held in our secure digital repository,
giving the security of off-site storage. All the communications
are carried out using the same secure server technology that
many banks use for on-line transactions, ensuring that all
confidential information is transmitted in an encrypted form.
Each document type can be protected with further levels of
passwords.
Q. What are imaging systems best used for?
Imaging systems are best suited to documents that have to be
stored for more than, say, six months, that are referred to
regularly by more than one person, and whose storage and
handling require space and staff time. Typical organizations
currently using imaging systems include insurance companies,
banks, building societies, airlines, local government and
large commercial companies.
Q. What Space Saving does Imaging Offer?
Storage can be on many types of media, from space on one
of the SANs in our digital repository to good old magnetic
tape. As an example of the space saved, the contents of four
filing cabinets can be stored on a single 5.25" optical disk.
Q. Surely imaging is the same as microfilm?
Not really. Imaging is easier to use and does not require
the chemical processes of microfilm. In addition, imaging
can produce far better reproductions of the original documents.
Most importantly, digital document images can be transported
around an organization very quickly over a local or wide area
network. Even the fax can be integrated into the system. Added
to this is the enhanced security of limited access rights
and date recording, available with optical storage systems.
Q. Can I view images on my existing hardware?
Most desktop systems can view images with no upgrades required
and a common web browser is all that's required to utilize
our hosted repository service.
Q. What about the legality of images held?
There is no definitive answer yet to this question. Under
civil law the best available evidence is usually applied.
No firm legal precedent has yet been established for the acceptance
of digitized images. Ultimately, it is a customer's decision
as to whether or not paper can be destroyed afterwards, but
Xerox is happy to advise based on our experiences.
Q. What is OCR and where does it fit in with imaging?
OCR is a method which involves the computer 'reading' the
words on an image and converting the text into a form which
can then be processed by the computer. A variety of packages
are available which run under Windows; some specialist packages
can even be trained to recognize handwriting and signatures.
OCR is often used to capture text from a document to form
part of the index. The text can then be exported into other
applications if desired. Relatively speaking, OCR is still
in its infancy and no package can as yet claim to be 100%
accurate. Thus OCR used for indexing purposes would require
some manual verification, particularly if it is used to key
primary fields.
Q. What is an electronic document?
An electronic document is a digital picture of the original
paper document. When a piece of paper is run through a scanner,
the scanner makes an electronic image of the document, much as
a digital camera does.
Q. How would I access and use this electronic document?
There are several possibilities for your use.
1. You can read the document on your computer's monitor.
2. You can fax the documents using your computer's fax-modem.
3. With Internet access, you can email the document.
4. Using Optical Character Recognition (OCR) software, you
can convert the document images into editable text for use
with word processing and spreadsheet software. MSI can perform
the OCR service for you and provide editable documents along
with the document images.
5. You can use your printer to print the entire document,
or only the pages you need at that time.
Q. Do you store the documents on tape, CD, or DVD?
MSI stores electronic documents on high quality CD-R media.
The "R" stands for "recordable". CD-R media has a longer shelf
life than magnetic media such as tape, offers wide support and
compatibility and fast random access to files, cannot be erased
or altered, and is non-proprietary.
CD-RW (ReWritable) can be altered or erased, and for this
reason it is not a good archive medium, for both security and
legal reasons. Tape has moving parts that can break and comes in
dozens of different, incompatible formats. It is quite slow to
access, is erasable and alterable, and can be damaged by common
electromagnetic fields. We do not recommend tape for long term
storage of electronic documents, nor for frequently accessed
information. Tape is often not accepted as a storage method for
legal purposes because of the ability to alter the files on the
tape.
While we do offer DVD+R and DVD-R formats for special
applications, MSI does not recommend the DVD format for most
document archiving at this time. While it does offer promise
for large volume document image storage, DVD is still a fairly
new technology and has not been standardized to provide the
nearly universal access to your documents that is possible
with CDs. Not all customized DVD disks are usable in all DVD
drives.
Q. Are there any hardware and software requirements?
You need a computer with a CD drive capable of reading CD-R disks,
and imaging or document management application software. CDs can
be formatted to be read by Windows, UNIX, Linux, OS/2, and
Macintosh systems.
While many of our customers purchase
document management software from us along with scanning services,
we can provide the documents in a format that can be imported
into your existing document management system. Our document
CDs include an Emergency Data Recovery program which is suitable
for basic document searching for one CD at a time. See below
for formats we offer.
Q. What electronic formats
do you offer?
MSI offers a variety of formats including those listed below.
If you don't see the format you need, please inquire by phone
or email. It is important to note that each of these formats
has its own set of advantages and limitations.
· Adobe Acrobat
· Alchemy Data Grabber
· ASCII Text
· HTML
· JPG Images Only
· Kodak IBS
· Microsoft Office formats: MS Word, MS Excel
· Multi-Page TIFF Images Only
· OmniPage
· Optika FPmulti
· PaperFlow Data Group
· Summation Blaze
· Other formats are available. Please ask!
Q. How many documents can fit
on one CD?
Approximately 20,000 letter-sized, black and white pages will
fit on a single CD. Color documents and photos can also be
scanned to CD, but because they require more storage space, fewer
will fit. Having your documents on CD lets you find them without
leaving your desk. Your files will be at your fingertips. The fact
that the documents are on a CD means that you may be able
to eliminate all those dusty files and file cabinets and vastly
reduce the cost of storage. You can transport millions of
electronic documents in your briefcase, something just not
possible with paper.
Q. Can I add notes to the scanned documents?
You cannot add notes directly onto the CD. By its nature it
is a read-only medium. This feature keeps others from deleting
or changing your originals and is usually required if the
images are to be used for legal purposes. You can annotate
or otherwise edit the documents once you copy them to your
hard drive or other read/write media. Additionally, many document
management systems will allow you to associate notes with
your documents.
Q. Do I need special training
to use a document CD?
No special training is required to view or print the documents
when using intuitive software such as PaperVision. Most computer
users are able to get results in just a few minutes.
Q. What if I want to add more documents to my MSI
CD that I had made last year?
If there is sufficient room, we can add to your existing MSI
CD.
Q. What if I want another copy of my document CD later?
We can make multiple copies anytime as long as your MSI CD
is still in readable condition.
Q. What if I buy a new computer? Will it be able to
read the MSI CD?
A. The CDs can be read by PCs and Macs equipped with CD drives,
and most new systems have the drives included. Computer manufacturers
have made their CD drives compatible by following industry standards,
which were established by Sony and Philips over 10 years
ago. Some older CD readers pre-dating 1995 may need firmware
updates to properly read CD-R media, and some older drives
just won't read recordable CDs. (The good news is that a new
CD drive costs only about $20.)
Q. What if I decide to use a DVD (digital versatile disc) drive
instead of a CD drive in my next computer? Will I still
be able to use my MSI CD?
A. Most DVD drives on the market are designed to read all standard
CDs including those produced by MSI. Check manufacturer’s
specifications to see if the DVD drive will read CD-Recordable
(CD-R).
Q. What if new technology should happen five years
from now? Will I be able to use my MSI CD?
A. If you keep your CD or DVD drive, there is no problem. In
addition, MSI will keep up with any widely-used superior technology
and will offer conversion services if and when required for
our customers. The CD format is generic enough that the electronics
industry will be able to convert or use these disks for the
foreseeable future. If and when the technology changes, you will
have adequate time to convert them.
Q. What file format are the scanned pages?
A. MSI offers many file formats, but as a default format we
recommend black and white Group 4 TIFF for most document storage
applications due to its wide support by software developers,
small file size, and ability to store multipage documents
in one file. Other image formats are available for full color
or special applications, subject to a higher cost per image,
additional processing time, and in some cases, royalty fees.
If you require a special format, just ask.
Q. Do you OCR documents as well? Can you convert the
scanned files to Microsoft Word® format? What is the accuracy?
How much does OCR cost?
A. MSI does offer conversion of paper documents to editable
file formats such as word processing files, a process commonly
called OCR or Optical Character Recognition. Accuracy is entirely
dependent on the quality of the original. OCR software vendors
claim 99.7% accuracy for uncorrected documents scanned from
a high-quality original, but based on our industry experience
you should expect an average of 95-99%. A document produced by a laser
printer or printing press with standard fonts on white paper
is considered a high-quality original. Photocopies, low-resolution
printouts, documents with non-standard fonts such as script,
and documents with poor contrast due to similar colors of
paper and ink are all considered low-quality for the purposes
of OCR and will probably require extensive manual processing
to improve accuracy.
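To put the accuracy percentages in perspective, the short Python sketch
below converts a character accuracy rate into an expected number of
errors per page. The figure of 2,000 characters per page is only an
assumption for illustration; actual pages vary widely.

```python
# Assumption: a typical typed page holds about 2,000 characters.
CHARS_PER_PAGE = 2000


def expected_errors(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized characters on one page."""
    return chars_per_page * (1.0 - accuracy)


# Accuracy rates quoted in the answer above.
for accuracy in (0.997, 0.99, 0.95):
    errors = expected_errors(CHARS_PER_PAGE, accuracy)
    print(f"{accuracy:.1%} accuracy: about {errors:.0f} errors per page")
```

Under that assumption, the gap between 99.7% and 95% is the gap between
roughly 6 and roughly 100 corrections on every page.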
MSI will perform one-pass processing on most OCR jobs using
state-of-the-art software with multiple OCR engines. One-pass
means that there is no correction of the document by our company.
You would then take your file and use tools such as spell
check to find any errors it may have, and adjust formatting
for your printer or other final output.
Q. What type of processing do the images undergo?
A. Images go through extensive quality control, including straightening,
despeckling, and removal of visible page edges, before being recorded
to CD.
Q. Do you offer a searchable database of my documents?
A. A searchable database can be supplied. The creation of
a searchable database of document content is also called "indexing".
There are many types, and therefore costs and effectiveness
vary greatly. MSI can output your images and data in many
different formats, some of which are listed above. Another
possibility is getting MSI to do the conversion from paper
to electronic images then having your staff do the indexing
and importation to your existing system.
Q. How does my company allow for more than one person
at a time to access our scanned documents?
A. If your office has a LAN (local area network) or WAN (wide
area network), document CDs may be placed in CD drives that
are accessible on the network. For large archives spanning
more than one CD, some companies use CD "jukeboxes"
which hold multiple CDs ready for use. You may also get multiple
copies of your archive so each key staff member or branch
office has a complete set. You can also have your IT manager
copy the files to your server's hard drives for access across
your network.
Q. How much does converting paper documents to CD
cost?
A. There are many factors determining the costs involved with
document conversion. The format of the originals, their quality,
the target format, number of pages, and the condition of the
paper files are all important factors in the cost of your
project. MSI can provide written estimates for serious inquiries.
Q. How do I make my documents "scan ready"?
A. "Scan ready" means all the staples and other
paper fasteners have been removed from the files and the pages
are not excessively wrinkled, torn or otherwise damaged. The
pages are all oriented the same way (i.e., all tops up) and
facing the same way (if single-sided). If the pages are of
odd or mixed size, charges may apply for extra processing.
The files should also be separated and labeled in a way that
matches how you would normally store and access them.
We will assign a file name based on the label on the physical
folder, unless some other naming scheme has been planned.
Each electronic file should contain no more than 100 images,
because the limited RAM of many computers prevents
viewing of larger files. If a particular physical file is
more than 100 pages, it will span more than one electronic
file. Extra charges may also apply if we are required to
manually assign more than 2 unique file names for each 100
pages; the same applies to extra directory levels.
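The 100-image limit can be sketched in a few lines of Python. This is
only an illustration: the function name and the naming scheme (folder
label plus a three-digit part number) are hypothetical, not MSI's
actual convention.

```python
from typing import List, Tuple

MAX_IMAGES_PER_FILE = 100  # limit stated in the answer above


def split_into_files(folder_label: str, page_count: int) -> List[Tuple[str, int]]:
    """Return (file name, image count) pairs for one physical folder."""
    files = []
    part = 1
    remaining = page_count
    while remaining > 0:
        images = min(remaining, MAX_IMAGES_PER_FILE)
        # Hypothetical naming scheme: label plus a three-digit part number.
        files.append((f"{folder_label}_{part:03d}.tif", images))
        remaining -= images
        part += 1
    return files


# A 250-page folder spans three electronic files: 100, 100 and 50 images.
for name, images in split_into_files("Invoices-2003", 250):
    print(name, images)
```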
If you wish, MSI will prepare your documents for scanning,
although this service will add significantly to your project
cost.
Q. Is there a minimum number of pages that I need
to send to you to have my documents scanned?
A. There is no minimum number of pages; however, MSI does have
a minimum charge. Contact our sales staff for a list of charges.
Q. How fast can we get our documents converted to
CD?
A. Turnaround time depends on the volume of documents, the
level of service requested, advance notice, the condition
of the paper files, and our current job queue. If getting
the work back ASAP is more important than getting the best
cost per page, rush jobs may be available, so contact an MSI
representative if you have a time constraint.
Q. Do I need to make a contract?
Our proposal for your job becomes your contract should you
choose to accept it. There is no "subscription"
period, nor amount of time you must commit to using MSI's
services.
Q. Do you return the original documents?
A. By default MSI returns your paper documents. All transportation
costs are paid for by the customer. If you need to dispose
of the originals, inquire about document destruction with
an MSI representative.
Q. Can you upload the image files to our server?
A. Yes, we can, but uploading might not be as time-efficient,
cost-effective, or secure as waiting for the CD and having it
copied to the server on-site by your in-house or contracted
technician, who already knows your network. The CD is also your
backup copy in case of hard drive failure or other disaster.
We also offer private FTP downloads of your data. If you need
special handling of your image files, raise your needs
with your MSI representative when you inquire.
Q. How many CD-ROMs do we need for our project?
A. As a rule of thumb, approximately 25,000 8 1/2" x 11"
black & white images scanned at 200 dpi in TIFF G4 format will
fit on a single CD-ROM.
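As a rough planning aid, the rule of thumb can be turned into a small
Python calculation. The function name is hypothetical, and the estimate
holds only for the page size, resolution and format named above; color
or grayscale images take far more space.

```python
import math

IMAGES_PER_CD = 25_000  # rule of thumb from the answer above


def cds_needed(page_count: int) -> int:
    """Estimate how many CD-ROMs a project of page_count images fills."""
    return math.ceil(page_count / IMAGES_PER_CD)


# Example: a 120,000-page project needs 5 CDs (4.8 rounded up).
print(cds_needed(120_000))
```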
Q. How can I access scanned information?
A. There are many ways to set up your retrieval system, ranging
from very simple directory structures to complex enterprise-wide
management systems. We can suggest a solution to meet your
technical requirements and your budget.
Q. How long will the conversion take?
With around-the-clock operations you can expect throughput
in the range of 10,000 to 50,000 pages per week, depending
on the processing involved.
Q. Can you scan old or fragile documents?
Yes. Using flatbed scanners with grayscale and color scanning, even
the most delicate or poor-quality documents can be turned
into good electronic images.
Q. Can you scan documents full of staples, clips and bindings?
Yes, by removing the staples, clips, etc. as part of our preparation
process. Documents are usually "deprepped" after
scanning (clips, staples, etc. are replaced) so they can be returned
to our client exactly as they arrived.
Q. We need access to our files - they cannot go off-site!
This does not necessarily mean that scanning needs to be done
onsite. Our quality control process ensures that we can retrieve
and return a document to you within minutes of the request.
Alternatively, we can scan overnight or on weekends to ensure
data is not out of your office during normal business hours.
Q. OCR (Optical Character Recognition)
Conversion of paper-based text into editable electronic format
such as ASCII, Word, Excel etc. This is done by scanning the
paper into a format such as TIFF, then translating the image
into character codes (ASCII); the range of output possibilities
is defined by the particular version of OCR software in use.
Q. RTF (Rich Text Format)
A file format that lets you exchange text files between different
word processors in different operating systems. The RTF specification
uses the ANSI, PC-8, Macintosh, and IBM PC character sets;
it defines control words and symbols that serve as "common
denominator" formatting commands. When saving a file
in the Rich Text Format, the file is processed by an RTF writer
which converts the word processor's markup to the RTF language.
When being read, the control words and symbols are processed
by an RTF reader that converts the RTF language into formatting
for the word processor that will display the document.
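A toy RTF writer in Python shows what those control words look like.
This is a minimal sketch, not a full implementation of the RTF
specification: the function names are hypothetical, it handles plain
ASCII text only, and it uses just a few basic control words (\rtf1,
\ansi, \fonttbl, \b and \par).

```python
def escape_rtf(text: str) -> str:
    """Escape the three characters that have special meaning in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def to_rtf(paragraphs) -> str:
    """Build a small RTF document from (text, is_bold) pairs."""
    lines = [r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}"]
    for text, is_bold in paragraphs:
        run = escape_rtf(text)
        if is_bold:
            # A group in braces limits the bold control word to this run.
            run = r"{\b " + run + "}"
        lines.append(r"\f0 " + run + r"\par")
    lines.append("}")
    return "\n".join(lines)


print(to_rtf([("Scan Report", True), ("Pages scanned: 250", False)]))
```

Saved with an .rtf extension, the printed text should open in most word
processors, whose RTF readers turn the control words back into their own
formatting.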
Q. ASCII (American Standard Code for Information Interchange)
A code for representing English characters as numbers, with
each character assigned a number from 0 to 127. Most computers
use ASCII to represent text, which makes it possible to transfer
data from one computer to another.
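The mapping between characters and numbers is easy to see in Python.
The helper names below are hypothetical; ord(), chr() and the "ascii"
codec are standard Python.

```python
def to_ascii_codes(text: str) -> list:
    """Return the ASCII code of each character; fails on non-ASCII text."""
    return list(text.encode("ascii"))


def from_ascii_codes(codes) -> str:
    """Rebuild the text from a sequence of ASCII codes."""
    return bytes(codes).decode("ascii")


codes = to_ascii_codes("TIFF G4")
print(codes)
print(from_ascii_codes(codes))
print(ord("A"), chr(65))
```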
Q. TIFF (Tagged Image File Format)
The TIFF format was developed in 1986 to provide a standard
format for image files. TIFF is operating system and display
device independent. TIFF files can be in any of several classes,
including black & white (bitonal), grayscale or colour,
and can include CCITT Group 4, JPEG, or LZW compression.
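One reason TIFF files move easily between systems is that every file
begins with a short header naming its own byte order. The Python sketch
below reads that 8-byte header of a classic TIFF file; the function
name is hypothetical, and the sample header is built in memory rather
than taken from a real scan.

```python
import struct


def parse_tiff_header(data: bytes):
    """Return (byte order, offset of first image directory) for a TIFF."""
    if len(data) < 8:
        raise ValueError("too short to be a TIFF file")
    marker = data[:2]
    if marker == b"II":      # "Intel" order: least significant byte first
        endian, name = "<", "little-endian"
    elif marker == b"MM":    # "Motorola" order: most significant byte first
        endian, name = ">", "big-endian"
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:          # classic TIFF files always store 42 here
        raise ValueError("not a TIFF file")
    return name, ifd_offset


# Build a sample header in memory for demonstration purposes.
sample = b"II" + struct.pack("<HI", 42, 8)
print(parse_tiff_header(sample))
```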
Q. JPEG (Joint Photographic Experts Group)
JPEG is a lossy compression algorithm used to compress colour
and grayscale images down to about 5% of their uncompressed size.
Lossy compression means that some of the data in the original
file is discarded by the compression algorithm in order
to reduce the file size.
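To see what the 5% figure means in practice, the Python sketch below
works out the uncompressed size of a full-colour page and applies that
ratio. The page size, resolution and colour depth are assumptions
chosen for illustration, and real JPEG results depend on the image and
the quality setting.

```python
def uncompressed_bytes(width_in: float, height_in: float,
                       dpi: int, bytes_per_pixel: int) -> int:
    """Size of a raw scanned image with no compression applied."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumptions: letter-size page, 200 dpi, 24-bit colour (3 bytes/pixel).
raw = uncompressed_bytes(8.5, 11, 200, 3)
jpeg_estimate = round(raw * 0.05)  # the 5% figure quoted above

print(f"Uncompressed: {raw:,} bytes")
print(f"JPEG at 5%:   {jpeg_estimate:,} bytes")
```

Under these assumptions a page of roughly 11 MB shrinks to a little
over half a megabyte.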
Q. GIF (Graphics Interchange Format)
A service mark used for a raster-based colour graphics file
format, often used on the World Wide Web to store graphics.
GIF uses the 2D raster data type and is encoded in binary.
There are two versions of the format, GIF87a and GIF89a. Version
89a allows for the possibility of an animated GIF, which
is a short sequence of images within a single GIF file. A
patent-free replacement for GIF, the Portable Network
Graphics (PNG) format, has been developed by an Internet committee
and major browsers support it or soon will.
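The version is recorded in the first six bytes of every GIF file, so it
can be checked with a few lines of Python. The function name is
hypothetical, and the sample bytes are made up for the demonstration
rather than read from a real image.

```python
from typing import Optional


def gif_version(data: bytes) -> Optional[str]:
    """Return "87a" or "89a" for GIF data, or None for anything else."""
    signature = data[:6]
    if signature in (b"GIF87a", b"GIF89a"):
        return signature[3:].decode("ascii")
    return None


print(gif_version(b"GIF89a" + bytes(10)))  # GIF data
print(gif_version(b"\x89PNG\r\n\x1a\n"))   # PNG data, not a GIF
```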
Q. PDF (Portable Document Format)
A file format developed by Adobe Systems which captures all
the elements of the original file: text, graphics, fonts, layout,
etc. PDFs are in wide use on the web and offer great versatility
for distributing and sharing information. See Adobe.com for
information and/or to download a free copy of Acrobat Reader.
Q. HTML (Hypertext Markup Language)
This is the set of markup symbols or codes inserted in a file
intended for display on a World Wide Web browser page. The
markup tells the Web browser how to display a Web page's words
and images for the user. Each individual markup code is referred
to as an element (but many people also refer to it as a tag).
Some elements come in pairs that indicate when some display
effect is to begin and when it is to end. HTML is a formal
Recommendation by the World Wide Web Consortium (W3C) and
is generally adhered to by the major browsers, Microsoft's
Internet Explorer and Netscape's Navigator, which also provide
some additional non-standard codes. The current version of
HTML is HTML 4.0.
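The idea of paired elements can be seen by listing the markup a browser
would encounter in a small fragment. This sketch uses the HTMLParser
class from Python's standard library; the TagLogger class, the
tag_events function and the sample fragment are hypothetical.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each start tag, end tag and piece of text, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


def tag_events(html: str) -> list:
    parser = TagLogger()
    parser.feed(html)
    parser.close()
    return parser.events


# The <b> ... </b> pair marks where bold display begins and ends.
for event in tag_events("<p>Scanned at <b>200 dpi</b></p>"):
    print(event)
```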
Q. FTP (File Transfer Protocol)
A standard Internet protocol, FTP is the simplest way to exchange
files between computers on the Internet. FTP is an application
protocol that uses the Internet's TCP/IP protocols. FTP is
commonly used to transfer Web page files from their creator
to the computer that acts as their server for everyone on
the Internet. FTP is also commonly used to download programs
and other files to your computer from other servers.
Q. CCITT
The Comité Consultatif International Téléphonique et Télégraphique,
now known as the ITU-T (part of the ITU, its parent organization),
is responsible for defining many of the standards for data
communications, such as Group 3 compression (the fax standard) and
Group 4 compression (used in TIFF imaging).
Q. Deskewing
Software-driven action to straighten or adjust an image that
was scanned in crooked or where the data on the page is crooked
in relation to the page's edge.
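The geometry behind deskewing is simple once two points along a line
that should be horizontal (a page edge or a line of text) are known.
The Python sketch below only computes the angle; finding the points and
rotating the image are left to the imaging software. The function name
and the sample coordinates are hypothetical.

```python
import math


def skew_angle_degrees(left, right) -> float:
    """Angle of the line through two (x, y) points, in degrees.

    Zero means the line is level; rotating the image by the opposite
    angle straightens it.
    """
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    return math.degrees(math.atan2(dy, dx))


# A text line that drifts 35 pixels over a 1,000-pixel run.
angle = skew_angle_degrees((0, 0), (1000, 35))
print(f"Skew: {angle:.2f} degrees; rotate by {-angle:.2f} to correct")
```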
Q. Indexing
Creating a database of information to refer to the appropriate
images resident on the system. Index databases can consist
of single index fields or multiple index fields or full-text
repositories. (See OCR).
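A very small index can be pictured as a lookup table from a field value
to the image files that carry it. The Python sketch below is only an
illustration; the field names, file names and build_index function are
hypothetical, and a production system would normally use a database.

```python
from collections import defaultdict


def build_index(records, field: str) -> dict:
    """Map each value of one index field to the matching image files."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record["image"])
    return dict(index)


# Hypothetical index records with two fields per image.
records = [
    {"customer": "Acme", "year": 2003, "image": "acme_001.tif"},
    {"customer": "Acme", "year": 2004, "image": "acme_002.tif"},
    {"customer": "Baker", "year": 2003, "image": "baker_001.tif"},
]

by_customer = build_index(records, "customer")
by_year = build_index(records, "year")
print(by_customer["Acme"])
print(by_year[2003])
```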
Q. Microfiche
A transparent sheet of photographic film containing images
arranged in a series of rows (grid) and having a heading that
contains identifying information in text that is large enough
to be read without magnification. The most common microfiche
are made by filming textual or graphic material at a reduction
ratio of approximately 24:1 and usually accommodate 5 rows
of images.
Q. Microfilm
Although the term is used for the photographic film that is
employed in producing microforms in general, it is also specifically
used to designate the long strips of photographic film that
are mounted on reels, in cartridges, and in cassettes. The
strips of film together with the containers that house them
are called, respectively, microfilm reels, microfilm cartridges,
and microfilm cassettes. They are all roll formats in which
microimages appear in linear rather than grid array.
Q. PostScript
This is a programming language that describes the appearance
of a printed page. It was developed by Adobe in 1985 and has
become an industry standard for printing and imaging. All
major printer manufacturers make printers that contain or
can be loaded with PostScript software, which also runs on
all major operating system platforms. PostScript describes
the text and graphic elements on a page to a black-and-white
or color printer or other output device, such as a slide recorder,
imagesetter, or screen display.
Q. CDIA (Certified Document Imaging Architect)
This is a technology professional who possesses the requisite
level of knowledge and expertise to successfully plan, specify
and design an imaging solution, and who demonstrates that
knowledge by passing the CDIA examination. CompTIA's CDIA
certification is an internationally recognized credential
acknowledging competency and professionalism in the document
imaging industry. CDIA candidates possess critical knowledge
of all major areas and technologies used to plan, design and
specify an imaging system.