Could not find file ‘C:\Windows\TEMP\AquaForestOCR\nnnn_nnn\n_n.hocr’

When using the Aquaforest OCR SDK, you may intermittently receive the following message in your application:
				
					System.IO.FileNotFoundException was caught
FileName=C:\WINDOWS\TEMP\AquaforestOcr\xxxx_xx\x_x.hocr
Message=Could not find file 'C:\WINDOWS\TEMP\AquaforestOcr\xxxx_xx\x_x.hocr'
				
			

This message is generated as a direct result of the source file not being OCR’d; however, the message itself is misleading in this case. To resolve the issue, subscribe to the StatusUpdate event, which gives you access to StatusUpdateEventArgs. An instance of this class is provided for each page processed and carries information about the processing outcome for that page.

Properties

Below are the properties of this class.

  • int PageNumber This property returns the number of the page to which the object relates.
  • int Rotation A value from 0 to 3 which indicates the rotation used for the output in terms of the number of 90° steps away from the orientation in which the input page was provided. If AutoRotation is set to false this will always be 0.
  • double ConfidenceScore Generally, a value of 1 or greater indicates reasonable OCR of a page, but this should be confirmed using “typical” source files.
  • bool TextAvailable This property indicates whether text was extracted for the page.
  • bool ImageAvailable This property indicates whether an image (after all appropriate pre-processing) was successfully extracted.
  • bool BlankPage This property indicates whether the page was detected as blank.

Example

Below is an example in C# where the above class is used (via the StatusUpdate subscription and the OcrStatusUpdate handler) to overcome this issue:

				
using System;
using Aquaforest.OCR.Api; // SDK namespace; adjust if your SDK version differs

class Program
{
    // Set by the StatusUpdate handler. This single flag holds the value
    // reported for the most recently processed page.
    static bool textAvailable = false;

    static void Main(string[] args)
    {
        try
        {
            Ocr _ocr = new Ocr();
            _ocr.License = "";
            PreProcessor _preProcessor = new PreProcessor();
            _ocr.EnableConsoleOutput = true;
            string OCRFiles = System.IO.Path.GetFullPath(@"..\..\..\..\..\..\bin");
            System.Environment.SetEnvironmentVariable("PATH",
                System.Environment.GetEnvironmentVariable("PATH") + ";" + OCRFiles);
            _ocr.ResourceFolder = OCRFiles;
            _preProcessor.Deskew = true;
            _preProcessor.Autorotate = false;
            _ocr.Language = SupportedLanguages.English;
            _ocr.EnablePdfOutput = true;
            // Subscribe before calling Recognize so that page outcomes are reported.
            _ocr.StatusUpdate += OcrStatusUpdate;
            _ocr.ReadTIFFSource(System.IO.Path.GetFullPath(@"..\..\..\..\..\..\docs\tiffs\sample.tif"));
            if (_ocr.Recognize(_preProcessor))
            {
                string words = null;
                for (int j = 1; j <= _ocr.NumberPages; j++)
                {
                    try
                    {
                        // Only read the page text when OCR reported text as available.
                        if (textAvailable)
                            words += _ocr.ReadPageString(j);
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine("ERROR: " + ex.Message);
                    }
                }
                _ocr.SavePDFOutput(System.IO.Path.GetFullPath(@"..\..\..\..\..\..\docs\tiffs\sample.pdf"),
                    true);
            }
            _ocr.DeleteTemporaryFiles();
        }
        catch (Exception e)
        {
            Console.WriteLine("Error in OCR Processing :" + e.Message);
        }
    }

    // Called by the SDK with the processing outcome for a page.
    private static void OcrStatusUpdate(object sender,
        StatusUpdateEventArgs statusUpdateEventArgs)
    {
        textAvailable = statusUpdateEventArgs.TextAvailable;
    }
}
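
The example above keeps a single flag, which the handler overwrites every time it is called. For multi-page documents, one option is to record the outcome per page, keyed by PageNumber. The following is a minimal sketch rather than SDK code: PageStatus is a hypothetical stand-in for StatusUpdateEventArgs (mirroring only the PageNumber and TextAvailable properties described above), and PageOutcomeTracker is an illustrative helper name. It assumes the StatusUpdate event is raised once per page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK's StatusUpdateEventArgs; only the two
// properties used in this sketch are mirrored.
class PageStatus : EventArgs
{
    public int PageNumber { get; set; }
    public bool TextAvailable { get; set; }
}

// Illustrative helper (not part of the SDK): remembers, per page,
// whether text was reported as available.
class PageOutcomeTracker
{
    private readonly Dictionary<int, bool> _textByPage = new Dictionary<int, bool>();

    // Intended to be called from the StatusUpdate handler.
    public void Record(PageStatus status)
    {
        _textByPage[status.PageNumber] = status.TextAvailable;
    }

    // Pages with no recorded status are treated as having no text.
    public bool HasText(int pageNumber)
    {
        bool available;
        return _textByPage.TryGetValue(pageNumber, out available) && available;
    }
}

class Demo
{
    static void Main()
    {
        var tracker = new PageOutcomeTracker();

        // Simulated status updates for a three-page document.
        tracker.Record(new PageStatus { PageNumber = 1, TextAvailable = true });
        tracker.Record(new PageStatus { PageNumber = 2, TextAvailable = false });
        tracker.Record(new PageStatus { PageNumber = 3, TextAvailable = true });

        for (int page = 1; page <= 3; page++)
        {
            Console.WriteLine("Page " + page + ": " +
                (tracker.HasText(page) ? "read text" : "skip"));
        }
    }
}
```

In the real handler, the same idea would amount to storing statusUpdateEventArgs.TextAvailable under statusUpdateEventArgs.PageNumber, and then checking the stored value for page j before calling ReadPageString(j).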
				
			

Author

Neil Pitman

Head of IT Business Solutions

Neil established Aquaforest in 2001 to provide high-performance PDF, OCR, and SharePoint products to a worldwide market.
