Releases: naptha/tesseract.js
v4.0.5
What's Changed
- No changes to code
- Removed unnecessary files to reduce the size of the npm package
Full Changelog: v4.0.4...v4.0.5
v4.0.4
What's Changed
- Added SIMD-detection when `corePath` is manually specified (#735)
  - Important note for users who set `corePath`: for significantly faster performance, set `corePath` to a directory that includes both `tesseract-core.wasm.js` and `tesseract-core-simd.wasm.js`
  - See this comment for explanation
- Improved auto-rotate feature (`rotateAuto: true`) (#747)
- Switched default CDN from unpkg to jsdelivr (#743)
- Updated various dependencies (#729, #736, #737, #739, #741)
- Reduced size of npm package (#731, #734, #740)
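The `corePath` note above can be checked before a worker is created. The following is a minimal Node.js sketch, not part of the library: `missingCoreFiles` and `coreOptions` are hypothetical helper names, and only the two file names and the `corePath` option come from the release note.

```javascript
// Sketch only: verify that a local corePath directory contains both core builds.
const fs = require('node:fs');
const path = require('node:path');

// File names taken from the v4.0.4 note above.
const REQUIRED_CORE_FILES = ['tesseract-core.wasm.js', 'tesseract-core-simd.wasm.js'];

// Returns the required core files that are missing from `coreDir`.
function missingCoreFiles(coreDir) {
  return REQUIRED_CORE_FILES.filter(
    (name) => !fs.existsSync(path.join(coreDir, name))
  );
}

// Hypothetical helper: returns a createWorker options object only when the
// directory holds both the SIMD and the non-SIMD build.
function coreOptions(coreDir) {
  const missing = missingCoreFiles(coreDir);
  if (missing.length > 0) {
    throw new Error(`corePath directory is missing: ${missing.join(', ')}`);
  }
  return { corePath: coreDir };
}
```

The returned object could then be passed as `await Tesseract.createWorker(coreOptions(dir))`, where `dir` is whatever directory your build copies the core files into.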
New Contributors
Full Changelog: v4.0.3...v4.0.4
v4.0.3
What's Changed
- Updated Tesseract to v5.3.0
- This resolves bug with inverted (white on black) text recognition (#717)
- Minor documentation fixes (#612, #614, #682, #673)
- Better types for `addJob` by @nathanbabcock in #719
New Contributors
- @Sacramentix made their first contribution in #612
- @Porush made their first contribution in #682
- @eltociear made their first contribution in #673
- @Woutervdvelde made their first contribution in #614
- @nathanbabcock made their first contribution in #719
Full Changelog: v4.0.2...v4.0.3
v4.0.2
v4.0.1
What's Changed
- Running `recognize` or `detect` with invalid `image` argument now throws error message (#699)
- Fixed bug with custom `langdata` paths (#697)
New Contributors
- @fmonpelat made their first contribution in #697
Full Changelog: v4.0.0...v4.0.1
v4.0.0
Breaking Changes
- `createWorker` is now async
  - In most code this means `worker = Tesseract.createWorker()` should be replaced with `worker = await Tesseract.createWorker()`
  - Calling with invalid `workerPath` or `corePath` now produces error/rejected promise (#654)
- `worker.load` is no longer needed (`createWorker` now returns worker pre-loaded)
- `getPDF` function replaced by `pdf` recognize option (#488)
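A migrated worker lifecycle might look roughly like the sketch below. It assumes the v4 worker methods `loadLanguage`, `initialize`, `recognize`, and `terminate`; the function name `recognizeText` is hypothetical, and `Tesseract` is passed in as a parameter purely so the sketch does not depend on how the library is loaded.

```javascript
// Sketch of a v4-style worker lifecycle (not the library's own code).
async function recognizeText(Tesseract, image, lang = 'eng') {
  // v4: createWorker returns a promise, and the resolved worker is already
  // loaded, so there is no separate worker.load() call. An invalid
  // workerPath or corePath would surface here as a rejected promise.
  const worker = await Tesseract.createWorker();
  try {
    await worker.loadLanguage(lang);
    await worker.initialize(lang);
    const { data } = await worker.recognize(image);
    return data.text;
  } finally {
    // Release the worker even if recognition throws.
    await worker.terminate();
  }
}
```

Code written for v3 that calls `createWorker()` without `await` would otherwise end up calling worker methods on a promise.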
Major New Features
- Processed images created by Tesseract can be retrieved using `imageColor`, `imageGrey`, and `imageBinary` options (#588)
  - See image-processing.html example for usage
- Image rotation options `rotateAuto` and `rotateRadians` have been added, which significantly improve accuracy on certain documents
  - See Issue #648 example of how auto-rotation improves accuracy
  - See image-processing.html example for usage of `rotateAuto` option
- Tesseract parameters (usually set using `worker.setParameters`) can now be set for single jobs using `worker.recognize` options (#665)
  - For example, a single job can be set to recognize only numbers using `worker.recognize(image, {tessedit_char_whitelist: "0123456789"})`
  - As these settings are reverted after the job, this allows for using different parameters for specific jobs when working with schedulers
- Initialization parameters (e.g. `load_system_dawg`, `load_number_dawg`, and `load_punc_dawg`) can now be set (#613)
  - The third argument to `worker.initialize` now accepts either (1) an object with key/value pairs or (2) a string containing contents to write to a config file
  - For example, both of these lines set `load_number_dawg` to 0:
    - `worker.initialize('eng', "0", {load_number_dawg: "0"});`
    - `worker.initialize('eng', "0", "load_number_dawg 0");`
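The per-job parameters described above are aimed at scheduler use, where jobs with different settings share the same workers. A minimal sketch, assuming the scheduler methods `createScheduler`, `addWorker`, `addJob`, and `terminate`; the function name `recognizeMixed` and both image arguments are hypothetical, and `Tesseract` is injected as a parameter for illustration only.

```javascript
// Sketch: two jobs on one scheduler, only one of them restricted to digits.
async function recognizeMixed(Tesseract, textImage, numberImage) {
  const scheduler = Tesseract.createScheduler();
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  scheduler.addWorker(worker);
  try {
    const [text, digits] = await Promise.all([
      // No per-job options: runs with the worker's current parameters.
      scheduler.addJob('recognize', textImage),
      // Per-job option from the release note; per the note it is reverted
      // after the job, so it should not leak into other jobs.
      scheduler.addJob('recognize', numberImage, {
        tessedit_char_whitelist: '0123456789',
      }),
    ]);
    return { text: text.data.text, digits: digits.data.text };
  } finally {
    // Terminating the scheduler also terminates its workers.
    await scheduler.terminate();
  }
}
```

Before this change, the same effect would have required calling `worker.setParameters` before and after the digits-only job, which is awkward when a scheduler decides which worker runs which job.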
Other Changes
- `loadLanguage` now resolves without error when language is loaded but writing to cache fails
  - This allows for running in Firefox incognito mode using default settings (#609)
- `detect` returns `null` values when OS detection fails rather than throwing error (#526)
- Memory leak causing crashes fixed (#678)
- Cache corruption should now be much less common (#666)
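Because `detect` now reports failure through `null` values instead of an exception, callers that previously relied on `try`/`catch` may need an explicit check. The sketch below assumes the result fields are named `orientation_degrees` and `script`; those names are an assumption here and should be checked against the API documentation for your version. `detectOrientation` is a hypothetical helper name.

```javascript
// Sketch: treat null detection fields as "orientation/script unknown".
async function detectOrientation(worker, image) {
  const { data } = await worker.detect(image);
  // Field names are assumed; `== null` also covers a missing field.
  if (data.orientation_degrees == null || data.script == null) {
    return null;
  }
  return { degrees: data.orientation_degrees, script: data.script };
}
```

Returning `null` from the helper keeps the "detection failed" case distinct from a genuine result of 0 degrees.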
New Contributors
- @reda-alaoui made their first contribution in #570
Full Changelog: v3.0.3...v4.0.0
v3.0.3
v3.0.2
What's Changed
- Updated to Tesseract.js-core v3.0.1 (uses Tesseract v5.1.0)
- Added SIMD-enabled build, automatic detection of supported devices
- Fix caching of bad langData responses by @andreialecu in #585
- Added benchmark code and assets per #628 by @Balearica in #629
- Replaced child_process with worker_threads per #630 by @Balearica in #631
- Updated to webpack 5 for compatibility with Node.js 18 by @Balearica in #640
New Contributors
- @andreialecu made their first contribution in #585
- @SusanDoggie made their first contribution in #621
Full Changelog: v2.1.5...v3.0.2
Tesseract.js v2.1.5
- Add language constants (thanks to @stonefruit)
- Add user job id to logger (thanks to @miguelm3)
- Fix env selection bug in electron (thanks to @LoginovIlya)
Tesseract.js v2.1.4
- Fix Electron WebView (thanks to @CedricCouton)
- Fix security vulnerabilities by upgrading packages
- Migrate from Travis CI to GitHub Actions
- Add CodeQL scanning