Generating annotation data while creating documents
Dr. Karim Hajjar
Ahlia University, Bahrain
Abstract
For many years, the research community has relied on reverse engineering to extract the physical and logical layout of documents. This extraction is based on image processing techniques. Year after year the document recognition rate improves, but 100% remains unachievable. My research aims to demonstrate that every document should include, at the creation stage, data generated from the user's actions regarding the physical and logical layout. For this purpose, a new proposed format that stores the physical and logical layout will be described, and a new proposed extension to Microsoft Word that captures user annotations will be presented. This research does not aim to solve the problem of layout extraction for the huge number of existing documents, but to propose a definitive solution to document recognition.