So, sorry Microsoft, you've lost out to Google here, as their product works much better. It's a shame, as I quite liked WDS when I tried it. I installed GDS last night along with the free IFilterShop plugin. I registered the "TIF TIFF MDI" extensions with the IFilterShop plugin so that Office XP MODI will filter these extensions and OCR them. Although GDS only got around 2% of the way through indexing, the initial results are promising: I did get results back on a search from a TIFF file, which it must have OCR'd. It is a shame that Google seems to be doing better at interfacing with Microsoft technologies than Microsoft themselves.

Regarding TIFF versus MDI: I have not tried MDI, but I think you are probably referring to uncompressed TIFF files. I'm sure you are already aware of this, but TIFF is just a container format that can house various compressed image types. If you are not referring to uncompressed TIFF, then I will immediately switch to MDI files, as that would give me amazing compression on the documents! I typically use "CCITT Group 4 (2D) Fax" compression at 200 dpi. This gives me A4 black-and-white (1-bit) images with each page around 17k. The quality is pretty good and fine for OCR. I haven't been able to find a better compression system than that, but I will try MDI.

Question: How to do some character recognition in Delphi.

The simplest way

Before reading, please excuse my poor English. I spent nights looking for a way to do character recognition within Delphi. After looking at many good but not free VCL components, I tried the ActiveX control that ships with Microsoft Office, named Microsoft Office Document Imaging. You can find some more information on this control in :

Go to the Component menu of Delphi and select Import ActiveX Control. Select the "Microsoft Office Document Imaging 11.0 Type Library" and import it. After doing so, the component will be on the ActiveX page. In the uses clause, add the ComObj and MODI_TLB units. Drop a MiDocView object from the ActiveX page onto your form.
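The Delphi steps above stop once the component is on the form, so here is a minimal sketch of what an OCR call might look like afterwards. It is not the original author's code. It uses late binding through CreateOleObject from ComObj rather than the imported MODI_TLB classes, which means it does not depend on how the type library import named its wrappers. It assumes MODI is installed and registered on the machine, and the file name scan.tif is hypothetical.

```delphi
program ModiOcrSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, ActiveX, ComObj, Variants;

const
  // Numeric value of miLANG_ENGLISH in the MODI type library.
  miLANG_ENGLISH = 9;

// Runs MODI's OCR over every page of an image file and returns the text.
function OcrFile(const FileName: string): string;
var
  Doc, Page: OleVariant;
  I: Integer;
begin
  Result := '';
  // Late-bound creation; raises an exception if MODI is not registered.
  Doc := CreateOleObject('MODI.Document');
  Doc.Create(FileName);
  try
    // Arguments: language, auto-orient the image, straighten the image.
    Doc.OCR(miLANG_ENGLISH, True, True);
    for I := 0 to Doc.Images.Count - 1 do
    begin
      Page := Doc.Images.Item[I];
      Result := Result + VarToStr(Page.Layout.Text) + sLineBreak;
    end;
  finally
    Doc.Close(False); // False: do not save changes back to the file
    Page := Unassigned;
    Doc := Unassigned;
  end;
end;

begin
  CoInitialize(nil);
  try
    try
      // 'scan.tif' is a hypothetical input file.
      Writeln(OcrFile('scan.tif'));
    except
      on E: Exception do
        Writeln('OCR failed: ', E.Message);
    end;
  finally
    CoUninitialize;
  end;
end.
```

MODI may raise an error from the OCR call when a page contains no recognizable text, which is why the call is wrapped in an exception handler here. With the early-bound MODI_TLB unit the same calls should be available on the imported document interface, but the exact wrapper names depend on the import.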
GDS + IFilterShop did indeed OCR the TIFF documents using MODI, as I hoped. To cut a long story short: it leaks memory, and eventually Windows will run out. I'm pretty sure this is MODI itself, and if you have a newer version than I have (I have 2002/XP) then maybe Microsoft have fixed this, but don't hold your breath. I suspect MODI because Windows Explorer's virtual memory also grows large when using the Search Companion to look through TIFFs, which I believe also uses MODI. So, basically, yes, it does work, and I suspect (though I did not try) that it would index MDI documents if you register that extension with IFilterShop, but I also suspect that it would leak memory too.

Personally, I have installed GDS with the ScanSoft plugin (as suggested above) and it seems to work great so far. However, I don't think it will index MDI documents, though I have not tried as I don't have any. I don't remember MDI being one of the extensions that it mentioned when it installed. So if you really need your MDI documents indexed, you can try the IFilterShop and GDS solution, but it may not be ideal: you may have to restart GDS or your computer quite a few times before it has finished indexing everything, to free up the memory it will probably leak. I'd be interested to know if this works for you.