rotateRadians and osd output are incorrect #859

Open
wvanrensburg opened this issue Dec 13, 2023 · 4 comments

Comments

@wvanrensburg

wvanrensburg commented Dec 13, 2023

Version 5.03

When requesting rotateRadians from the recognize API, the value returned is incorrect. See the attached image for an example. The correct text is extracted, and the image is clearly rotated 90 degrees (or 270, or -90, however you put it), but the returned value is -0.010256410576403141 radians, which converts to roughly -0.58 degrees.
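For reference, a minimal sketch of the radians-to-degrees conversion behind that figure. The helper name `radiansToDegrees` is hypothetical and is not part of Tesseract.js.

```javascript
// Convert an angle in radians to degrees.
// Hypothetical helper for illustration; not a Tesseract.js function.
function radiansToDegrees(radians) {
  return radians * (180 / Math.PI);
}

// Value reported by rotateRadians in this issue.
const reported = -0.010256410576403141;

// About -0.588, i.e. the "roughly -0.58 degrees" quoted above.
console.log(radiansToDegrees(reported));
```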

Tested the below with both PSM modes of PSM.SPARSE_TEXT_OSD and PSM.AUTO_OSD

Example to reproduce

const { data: { tsv, imageBinary, rotateRadians } } = await scheduler.addJob(
  'recognize',
  `data:image/png;base64,${imagebase64}`,
  { rotateAuto: true },
  { imageBinary: true, rotateRadians: true, tsv: true }
);

When calling the tesseract CLI directly, it reports the correct orientation:

tesseract /path/to/image.png /path/to/reportout --psm 0

Page number: 0
Orientation in degrees: 90
Rotate: 270
Orientation confidence: 10.40
Script: Latin
Script confidence: 5.40

rotatedimage

@wvanrensburg
Author

wvanrensburg commented Dec 13, 2023

Adding to this.... when adding the osd: true to the options payload, the OSD result comes out incorrect as well running with modes PSM.SPARSE_TEXT_OSD and PSM.AUTO_OSD

Example

const { data: { tsv, imageBinary, rotateRadians } } = await scheduler.addJob(
  'recognize',
  `data:image/png;base64,${imagebase64}`,
  { rotateAuto: true },
  { imageBinary: true, rotateRadians: true, tsv: true, osd: true }
);

Results in..

Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.00
Script: Latin
Script confidence: 2.00

When running in PSM.OSD_ONLY mode, the results work...

Page number: 0
Orientation in degrees: 90
Rotate: 270
Orientation confidence: 10.66
Script: Latin
Script confidence: 4.33

UPDATE
When running in PSM.OSD_ONLY mode through the scheduler and jobs, I'm getting a 0 degree orientation. It does not work through the scheduler and jobs, but it works through a direct call.

@wvanrensburg wvanrensburg changed the title rotateRadians is incorrect rotateRadians and osd output are incorrect Dec 13, 2023
@Balearica
Collaborator

Thanks for reporting. It sounds like we need to improve the documentation and/or recognition output for auto-rotate and orientation detection, and perhaps add an output field for the total page angle.

The reason why rotateRadians is reported as -0.58 degrees is that rotateRadians is only reporting the angle used by the auto-rotate feature (enabled by rotateAuto: true). The auto-rotate code is distinct from the orientation detection code, with both adjustments being calculated independently at different points in the recognition process.

Orientation detection determines how the page is oriented, with 4 discrete options: 0/90/180/270. The angle used by auto-rotate is calculated from the slope of the lines of text Tesseract identifies just prior to recognition (after orientation has been corrected). Auto-rotate is intended to adjust the image by +/- 10 degrees and improves results for pages that have been photographed/scanned at an angle. As it stands, to calculate the total angle of the page, the angle reported by orientation detection and the angle reported by auto-rotate would need to be added together.
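The combination described above can be sketched as a small pure function. This is a minimal sketch, not part of Tesseract.js: the names `parseOsdOrientation` and `totalPageAngle` are hypothetical, the OSD text is assumed to follow the "Orientation in degrees: N" report format quoted earlier in this thread, the sample `rotateRadians` value is made up for illustration, and the sign flip on `rotateRadians` is an assumption that follows the convention used in the test snippet later in this thread.

```javascript
// Extract the discrete orientation (0/90/180/270) from an OSD report string.
// Returns 0 when the report is missing or has no orientation line.
function parseOsdOrientation(osdText) {
  const match = /Orientation in degrees:\s*(\d+)/.exec(osdText ?? "");
  return match ? Number(match[1]) : 0;
}

// Combine the OSD orientation with the auto-rotate angle, in degrees.
// Assumption: rotateRadians is negated so both angles use the same direction.
function totalPageAngle(osdText, rotateRadians) {
  const osdAngle = parseOsdOrientation(osdText);
  const autoRotateAngle = -rotateRadians * (180 / Math.PI);
  return { osdAngle, autoRotateAngle, totalAngle: osdAngle + autoRotateAngle };
}

// Hypothetical inputs: an OSD report in the format shown above and a made-up skew.
const sampleOsd = "Page number: 0\nOrientation in degrees: 90\nRotate: 270\n";
console.log(totalPageAngle(sampleOsd, -0.0875));
```

The result is left unnormalized, so a page with orientation 0 and a small skew in the opposite direction yields a small negative total rather than a value near 360.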

@Balearica
Collaborator

Adding to this.... when adding the osd: true to the options payload, the OSD result comes out incorrect as well running with modes PSM.SPARSE_TEXT_OSD and PSM.AUTO_OSD

I was able to replicate the incorrect osd results you are describing when using the Tesseract.js default oem value of 1 (LSTM_ONLY). However, changing this value to 2 (TESSERACT_LSTM_COMBINED) produces the correct result. oem is the 2nd argument to createWorker, so that looks like the following:

const worker = await Tesseract.createWorker("eng", 2);

I believe that the tesseract CLI application uses a default oem value of 2. Therefore, this would explain why you got different results using Tesseract.js compared to the tesseract CLI program.

@Balearica
Collaborator

The following snippet contains a minimal test site that reports the total angle of the page, including both orientation and page rotation.

<!DOCTYPE HTML>
<html>
  <head>
    <script src="https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js"></script>
  </head>
  <body>
    <input type="file" id="uploader" multiple>
    <script type="module">

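      // oem 2 (TESSERACT_LSTM_COMBINED) is needed for correct OSD output, as noted above.
      const worker = await Tesseract.createWorker("eng", 2);
      // setParameters is asynchronous, so wait for it before recognizing.
      await worker.setParameters({tessedit_pageseg_mode: '3'});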

      const recognize = async function(evt){
        const files = evt.target.files;
        
        for (let i=0; i<files.length; i++) {
          const ret = await worker.recognize(files[i], {rotateAuto: true}, {osd: true});

          const osdAngle = parseFloat(ret.data.osd.match(/Orientation in degrees: (\d+)/)?.[1]) || 0;
          const autoRotateAngle = ret.data.rotateRadians * (180 / Math.PI) * -1;
          const totalAngle = osdAngle + autoRotateAngle;
          console.log("osdAngle: " + osdAngle + " (degrees)");
          console.log("autoRotateAngle: " + autoRotateAngle + " (degrees)");
          console.log("totalAngle: " + totalAngle + " (degrees)");

          console.log(ret.data.text);
        }
      }
      const elm = document.getElementById('uploader');
      elm.addEventListener('change', recognize);
    </script>
  </body>
</html>

This test image is rotated exactly 95 degrees clockwise.

rotate_95_clock

Results:

osdAngle: 90 (degrees)
autoRotateAngle: 5.017756638841202 (degrees)
totalAngle: 95.0177566388412 (degrees)


2 participants