Recognize Text From PDF Step
- 17 Mar 2023
Overview
Step Details | |
---|---|
Introduced in Version | 4.0.9 |
Last Modified in Version | 4.0.9 |
Location | Data > PDF |
The Recognize Text From PDF step scans a PDF file for text and outputs all recognized text as a single string. Unlike the Get Text From PDF step, this step has no options for page selection or whitespace preservation; it only reads the text. The output may deviate slightly from the original content. Other Flow logic steps, such as the Replace Text step, can correct these errors, provided the text is all in a similar style. The step may take fifteen to thirty seconds to execute.
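The kind of correction a Replace Text step performs on recognized text can be sketched outside of Decisions. The Python snippet below is only an illustration, not the platform's implementation: the function name `clean_recognized_text`, the `REPLACEMENTS` table, and the sample misreads are all hypothetical and assumed for the example.

```python
def clean_recognized_text(text, replacements):
    """Apply each (wrong -> right) replacement in order and return the result."""
    for wrong, right in replacements.items():
        text = text.replace(wrong, right)
    return text


# Hypothetical misreads, assumed for illustration only.
REPLACEMENTS = {
    "lnvoice": "Invoice",  # lowercase L read in place of capital I
    "T0tal": "Total",      # zero read in place of the letter O
}

recognized = "lnvoice 1042\nT0tal due: 150.00"
cleaned = clean_recognized_text(recognized, REPLACEMENTS)
print(cleaned)
```

A fixed replacement table like this is only reliable when the documents share a similar style, since the same misreads then tend to recur in the same places.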
Properties
Inputs
Property | Description | Data Type |
---|---|---|
PDF Document | File to scan for recognizable text. | FileData |
Outputs
Property | Description | Data Type |
---|---|---|
Output | Text that was recognized in the PDF file. If nothing was recognized, the step will output an empty string. | String |
Common Errors
Incorrect file header at
A file type other than .pdf was supplied as the PDF Document input. To correct this, supply a PDF file to the PDF Document input.
Exception Message:
Exception Stack Trace: DecisionsFramework.Design.Flow.ErrorRunningFlowStep: Error running step Recognize Text From PDF 1[RecognizeTextFromPdf] in flow [Display Steps]: Exception invoking method RecognizeTextFromPdf on class PdfManagementSteps
---> DecisionsFramework.LoggedException: Exception invoking method RecognizeTextFromPdf on class PdfManagementSteps
---> Aspose.Pdf.InvalidPdfFileFormatException: Incorrect file header at #=zKZIslgV_VTgI_laDaHeWplnCCTSzL8YTTg==.#=zHht8sihEEDo7(
at #=zKZIslgV_VTgI_laDaHeWplnCCTSzL8YTTg==..ctor(Stream #=ziA0t9_I=, String #=zuiJIvEo=, Boolean #=zBvJEbRdfIP3N
at #=zKZIslgV_VTgI_laDaHeWplnCCTSzL8YTTg==..ctor(Stream #=ziA0t9_I=
at #=zjroLc8wES_h1ilPWuP2WJb4VKmlo2M1kB_9Ae9Q=.#=zCVbFrz0=(Stream #=ziA0t9_I=
at #=zzVjo1F9wNoksFXH0KKuyBSP8d$xY$dIsPQ==..ctor(Stream #=ziA0t9_I=
at #=zjroLc8wES_h1ilPWuP2WJb4VKmlo2M1kB_9Ae9Q=.#=zG1sr3o8SaFaT(Stream #=ziA0t9_I=
at #=zGaHf0$lTElKXJzbip2dw4yLBu4qT.#=zXO2JtCM=(Stream #=ziA0t9_I=
at #=zGaHf0$lTElKXJzbip2dw4yLBu4qT..ctor(Stream #=ziA0t9_I=
at Aspose.Pdf.Document.#=zQxeSgpE=(Stream #=z3Wdp9mg=, String #=zuiJIvEo=
at Aspose.Pdf.Document..ctor(Stream input
at DecisionsFramework.Design.Flow.CoreSteps.StandardSteps.DocumentManagementMethods.GetPdfDocFromFileData(FileData fileData
at DecisionsFramework.Design.Flow.CoreSteps.StandardSteps.PdfManagementSteps.GetPdfDocFromFileData(FileData fileData
at DecisionsFramework.Design.Flow.CoreSteps.StandardSteps.PdfManagementSteps.RecognizeTextFromPdf(FileData PdfDocument
at InvokeStub_PdfManagementSteps.RecognizeTextFromPdf(Object, Object, IntPtr*
at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
--- End of inner exception stack trace ---
at DecisionsFramework.Design.Flow.StepImplementations.InvokeMethodStep.Run(StepStartData data
at DecisionsFramework.Design.Flow.FlowStep.RunStepInternal(String flowTrackingID, String stepTrackingID, KeyValuePairDataStructure[] stepRunDataValues, AbstractFlowTrackingData trackingData
at DecisionsFramework.Design.Flow.FlowStep.Start(String flowTrackingID, String stepTrackingID, FlowStateData data, AbstractFlowTrackingData trackingData, RunningStepData currentStepData)
--- End of inner exception stack trace ---
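One way to catch a non-PDF file before it reaches the step is to inspect the first bytes of the file. A PDF file begins with the marker `%PDF-` followed by a version number, so a file that lacks it is a likely cause of the "Incorrect file header" error. The Python sketch below illustrates such a pre-check; the function names are hypothetical, and whether the step's underlying library applies exactly this test is an assumption.

```python
PDF_MAGIC = b"%PDF-"  # marker that opens a PDF file, e.g. b"%PDF-1.7"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes begin with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


def looks_like_pdf_file(path: str) -> bool:
    """Read only the first few bytes of a file and check them for the marker."""
    with open(path, "rb") as f:
        return looks_like_pdf(f.read(len(PDF_MAGIC)))


print(looks_like_pdf(b"%PDF-1.7\n"))  # True
print(looks_like_pdf(b"PK\x03\x04"))  # False (ZIP signature, as used by .docx)
```

Checking the content instead of the file extension also catches files that were renamed to .pdf without being converted.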