ABSTRACT
Text recognition in images is a research area that aims to develop computer systems able to read text from images automatically. There is currently a large demand for storing the information held in paper documents on computer storage and later reusing it through search. One simple way to store this information is to scan the documents and save them as images. Reusing the information is difficult, however, because the contents of such images cannot readily be read or searched line by line and word by word. We therefore use fuzzy logic to extract editable text from an input image.
Keywords: Text Recognition, Text Extraction, Fuzzy Logic, Pre-processing, Segmentation, Feature Extraction
Introduction
Today most information is available either on paper or in the form of photographs or videos, and a large amount of it is stored in images. Current technology is restricted to extracting text against clean backgrounds, so there is a need for a system that can extract text from general backgrounds. Text extraction is useful in many applications, including digital libraries, multimedia systems, information retrieval systems, and geographical information systems. The role of text detection is to find the image regions that contain only text, which can then be highlighted directly to the user or fed into an optical character reader module for recognition. The information in image documents is more efficient to use and easier to access once it has been converted to text form. Text extraction is the process of converting the text in an image into plain text that a computer can recognize as character codes such as ASCII. Converting image documents into text makes tasks such as archiving and reporting, which are common in image-based applications such as office work, more efficient. Paper documents that need to be digitized for archiving, indexing, or information retrieval are increasingly common today, for example scanned office documents, magazines, advertisements, and web pages. Robust and efficient extraction of text from these documents is a challenging problem because of the varied properties of text in images. The textual data present in images contains useful information for indexing and automatic annotation. Extracting this information involves text detection, text localization, classification, and finally text recognition. Fuzzy logic assigns degrees of truth rather than strict true or false values, which helps to identify characters and match them accurately against trained data.
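The idea of matching a character against trained data by degree of truth can be illustrated with a minimal Python sketch. It is not the method used in this paper: the two features (aspect ratio and vertical stroke density), the template values, the triangular membership function, and the names `membership`, `match_degree`, `classify`, and `TEMPLATES` are all assumptions chosen only for illustration.

```python
# Illustrative sketch only: fuzzy matching of a character's feature
# vector against trained templates. Features, template values, and
# function names are hypothetical, not taken from the paper.

def membership(value, center, width=0.5):
    """Triangular membership: 1.0 at center, falling linearly to 0.0
    at a distance of `width` from center."""
    return max(0.0, 1.0 - abs(value - center) / width)


def match_degree(features, template, width=0.5):
    """Degree to which `features` matches `template`, using the
    minimum of the per-feature memberships as the fuzzy AND."""
    return min(membership(f, t, width) for f, t in zip(features, template))


def classify(features, templates, width=0.5):
    """Return the best-matching label and the degree for every label."""
    degrees = {
        label: match_degree(features, template, width)
        for label, template in templates.items()
    }
    best = max(degrees, key=degrees.get)
    return best, degrees


# Hypothetical trained data: (aspect ratio, vertical stroke density).
TEMPLATES = {
    "I": (0.2, 0.9),
    "O": (0.8, 0.3),
}

# Features measured on an unknown segmented character (made-up values).
label, degrees = classify((0.3, 0.8), TEMPLATES)
print(label, degrees)  # "I" has the highest degree of match
```

Each candidate character receives a degree of match between 0 and 1 instead of a hard accept or reject, so a slightly distorted character can still be assigned to its closest trained class. The minimum is one common choice for combining memberships; other fuzzy operators could be substituted.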