What is... OCR
Picture files consist of a matrix of individual pixels that together make up the whole image,
and can be viewed and printed as a whole or in part. As such, their content cannot be edited as
text (other than by cutting and pasting en bloc). Typical examples are TIFF, mainly for monochrome
documentation, and JPEG and BMP for colour and grey scale images.
Text files are electronic documents created in a computer system, typically Word documents or
ASCII files, which may well need to be archived alongside scanned hard copy in a mixed archive.
These differ from image-only files in that they are composed of individual alphanumeric characters,
which allows a further level of retrieval: searching by word or character string (full text retrieval).
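
As a rough illustration of full text retrieval, the following Python sketch searches a small collection of text documents for a word or character string. The archive contents, the file names and the function name full_text_search are all invented for this example; a real archive system would normally use a prebuilt index rather than scanning every document.

```python
def full_text_search(documents, query):
    """Return the names of documents whose text contains the query string.

    The comparison is case-insensitive and matches any character string,
    so it finds whole words as well as fragments of words.
    """
    needle = query.lower()
    return [name for name, text in documents.items() if needle in text.lower()]


# Hypothetical mini-archive: document name -> text content (illustrative only).
archive = {
    "invoice_001.txt": "Invoice for scanning services, March",
    "memo_017.txt": "Memo: archive retention policy",
    "letter_042.txt": "Letter regarding scanning equipment lease",
}

word_matches = full_text_search(archive, "scanning")   # search by whole word
string_matches = full_text_search(archive, "reten")    # search by character string
```

The same search is not possible on an image-only file, because the pixels carry no record of which characters they depict.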
A text file can be created from a picture file of printed text via optical character recognition
(OCR), in order to add another dimension to the search process, or to allow the original printed
document to be edited in a suitable word processing program.
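
The idea of turning a picture of text into a text file can be sketched with a toy example. The Python code below is not how commercial OCR software works; it is a minimal template-matching sketch under strong assumptions: a made-up font of three 3x5 pixel characters, fixed character spacing, and a perfectly aligned monochrome image. The names TEMPLATES, render and recognise are hypothetical.

```python
# Hypothetical 3x5 pixel templates ('#' = black pixel, '.' = white pixel).
TEMPLATES = {
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "C": ["###", "#..", "#..", "#..", "###"],
    "R": ["##.", "#.#", "##.", "#.#", "#.#"],
}


def render(text):
    """Build a 'picture file': five rows of pixels, one blank column between glyphs."""
    return [".".join(TEMPLATES[ch][row] for ch in text) for row in range(5)]


def pixel_distance(template, cell):
    """Count the pixels that differ between a template and an image cell."""
    return sum(
        t_pixel != c_pixel
        for t_row, c_row in zip(template, cell)
        for t_pixel, c_pixel in zip(t_row, c_row)
    )


def recognise(image):
    """Cut the image into 3-pixel-wide cells and pick the closest template for each."""
    chars = []
    for left in range(0, len(image[0]), 4):  # 3 glyph columns + 1 spacing column
        cell = [row[left:left + 3] for row in image]
        best = min(TEMPLATES, key=lambda name: pixel_distance(TEMPLATES[name], cell))
        chars.append(best)
    return "".join(chars)


picture = render("OCR")      # a matrix of pixels: viewable, but not searchable
text = recognise(picture)    # a string of characters: searchable and editable
```

Because each cell is assigned to the nearest template rather than requiring an exact match, the sketch tolerates a small amount of noise such as a single stray pixel. Real scanned pages need far more than this, for example skew correction, segmentation of lines and characters, and support for many fonts.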