01
Nov
11

How To: Compile and Use Tesseract (3.01) on iOS (SDK 5)

Update

  • I don’t have access to a Mac computer now (actually it has been 3 months) and I couldn’t update the guide to Xcode 4.5 (iOS 6), but the fine gentleman bengl3rt has done so and the updated script is available at: http://goo.gl/wQea5 (I haven’t been able to test it, so some feedback on whether it works or not would be appreciated!)

I never thought that my last post would reach such a large audience. Among other things, it earned me 3 direct job interview offers (1 of ‘em from Google itself, the maintainer of Tesseract), an invitation to write articles for a digital IT e-magazine and a few digital friends, but that’s something to discuss in other posts. Thank you!

Getting back to what really matters: the last post focused on cross-compiling (potentially) any library for iOS (armv6/armv7/i386), and as an example I chose Tesseract, the library I was using on a work project. But the response was so great, and both Tesseract and iOS have since received newer versions, that I’ve decided to write this post specifically about getting Tesseract compiled and using it in your iOS project.

As stated earlier, Tesseract has been officially released at version 3.01 (which now uses an autogen.sh setup script and an improved configure script) and iOS has received a major upgrade, version 5.0. As you may guess, these changes broke my script!


So let’s restart this party! (or: Compiling Tesseract 3.01 for iOS SDK 5.0)

The basics of the script were explained in the last post, so I’ll just cover the changes and how to use it.

As noted by Rafael, the default C/C++/Objective-C compilers for iOS 5 (bundled with Xcode 4.2) have changed: they are now LLVM-based. So the compiler definitions inside setenv_all() (CPP, CXX, CXXCPP and CC) had to change; CXX and CC, for example, now read:

export CXX="$DEVROOT/usr/bin/llvm-g++"
export CC="$DEVROOT/usr/bin/llvm-gcc"
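
If you want to double-check that these compilers really exist before exporting them, something like the sketch below could help. It is only an illustration: export_compilers is a hypothetical helper that is not part of build_dependencies.sh, and it assumes the LLVM-GCC binaries live under $DEVROOT/usr/bin, as in the two exports above.

```shell
# Hypothetical helper (not part of the original build_dependencies.sh):
# export CC/CXX only if both LLVM-GCC front ends exist under the given root.
export_compilers() {
    local devroot="$1"
    local tool
    for tool in llvm-gcc llvm-g++; do
        if [ ! -x "$devroot/usr/bin/$tool" ]; then
            echo "missing compiler: $devroot/usr/bin/$tool" >&2
            return 1
        fi
    done
    export CC="$devroot/usr/bin/llvm-gcc"
    export CXX="$devroot/usr/bin/llvm-g++"
}
```

It could be called as export_compilers "$DEVROOT" || exit 1, so the build stops early with a readable message instead of failing somewhere inside configure.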

Additionally, as Tesseract now has an autogen.sh script to run before configuring, we run it before each configure call:

bash autogen.sh

And because Tesseract’s configure script now accepts the path to the Leptonica headers as a parameter, no hacks are needed; calling it with one more parameter is enough:

./configure --enable-shared=no LIBLEPT_HEADERSDIR=$GLOBAL_OUTDIR/include/
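
To see how the autogen.sh and configure calls fit together, here is a rough sketch. The function names (run, configure_tesseract) and the DRY_RUN switch are my own illustration and do not exist in the real script; only the two commands themselves come from the steps above.

```shell
# Hypothetical wrapper (names and DRY_RUN switch are illustrative only).
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Runs autogen.sh and then configure inside the Tesseract source directory,
# pointing configure at the Leptonica headers installed under the output dir.
configure_tesseract() {
    local srcdir="$1"      # e.g. ./tesseract-3.01
    local outdir="$2"      # e.g. $GLOBAL_OUTDIR
    (
        cd "$srcdir" || exit 1
        run bash autogen.sh &&
        run ./configure --enable-shared=no "LIBLEPT_HEADERSDIR=$outdir/include/"
    )
}
```

Calling DRY_RUN=1 configure_tesseract ./tesseract-3.01 "$GLOBAL_OUTDIR" only prints the two commands, which is handy for checking the paths before a real (and slow) build.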

To build the libraries, create a directory; I’ll refer to it as “./build/”. Inside it, create the following structure:

  • ./build/
    • dependencies/ – which will receive the .h and compiled lib*.a files
    • leptonica-1.68/ – directory with the Leptonica 1.68 source files
    • tesseract-3.01/ – directory with the Tesseract 3.01 source files
    • build_dependencies.sh – our build script (link at the end of the post)
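
The layout above can be checked with a few lines of shell before starting the build. This is just a sketch: prepare_build_dir is a hypothetical helper, and it only knows about the directory and file names listed above.

```shell
# Hypothetical helper: create dependencies/ and report anything that is
# missing from the layout described above. Returns non-zero if incomplete.
prepare_build_dir() {
    local root="$1"
    local missing=0
    local item
    mkdir -p "$root/dependencies" || return 1
    for item in leptonica-1.68 tesseract-3.01; do
        if [ ! -d "$root/$item" ]; then
            echo "missing source directory: $root/$item" >&2
            missing=1
        fi
    done
    if [ ! -f "$root/build_dependencies.sh" ]; then
        echo "missing script: $root/build_dependencies.sh" >&2
        missing=1
    fi
    return "$missing"
}
```

For example, prepare_build_dir ./build && echo "layout looks fine" tells you whether it is worth crossing your fingers yet.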

Open Terminal, enter our “./build/” directory, cross your fingers (one very important step pointed out by Venusbai) and run:

bash build_dependencies.sh

Well, if you’re lucky enough and deserve the holy right to use Tesseract on mobile Apps, check the dependencies folder content and there you’ll have all the needed header and library files to play with OCR on your iPhone (I don’t have one, personally prefer Android, but you got it….).
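
If you would rather not trust luck alone, a quick look at the output can be scripted. The sketch below is an assumption-heavy illustration: check_dependencies is a hypothetical helper, and it assumes the static libraries end up in a lib/ subfolder of the dependencies folder; adjust the path if your copy of the script places them elsewhere. On a Mac it also asks lipo (part of the Xcode tools) which architectures each archive contains.

```shell
# Hypothetical post-build check. Assumes the static libraries land in
# <dependencies>/lib; this path is an assumption, not taken from the script.
check_dependencies() {
    local libdir="$1/lib"
    local found=0
    local lib
    for lib in "$libdir"/lib*.a; do
        [ -f "$lib" ] || continue      # the glob did not match anything
        found=1
        echo "found: $lib"
        # lipo exists only on a Mac; it lists the architectures in the archive.
        if command -v lipo >/dev/null 2>&1; then
            lipo -info "$lib" || true
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no lib*.a files in $libdir, the build probably failed" >&2
        return 1
    fi
}
```

Running check_dependencies ./build/dependencies right after the build gives a fast yes/no answer before you open Xcode.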

Great!!! Now what?! (or: Using Tesseract on your iOS project)

  1. Create a new iOS project in Xcode (or just open your existing one)
  2. Add the generated ./build/dependencies/ folder to your project. It contains the needed .h Header and lib*.a Library files
  3. Add the tessdata folder, containing, well, erhm, hum, the tessdata files you need, to your project. If you don’t know what the “tessdata” folder is: it contains preprocessed data for a certain language so Tesseract can recognize that language. Download language data from: http://code.google.com/p/tesseract-ocr/downloads/list. Check the sub-instructions below to add it the right way (not the default Xcode way…)
    1. Right-click your project/group in Xcode
    2. Choose “Add files to your project”
    3. Select the “tessdata” folder
    4. In the same window, check “Create folder references for any added folders”. This is the most important step, as it instructs Xcode to add your “tessdata” folder as a regular folder (a resource, as well), not as an Xcode project group.
  4. Create your TessBaseAPI object with the code below to start playing with it!
  5. Make sure that every source file that includes/imports Tesseract header files, directly or through another file it includes/imports, has the .mm extension instead of the regular .m. This makes the compiler treat the file as Objective-C++, so the Tesseract C++ headers can be parsed.

// Set up the tessdata path. This is included in the application bundle
// but is copied to the Documents directory on the first run.
NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentPath = ([documentPaths count] > 0) ? [documentPaths objectAtIndex:0] : nil;

NSString *dataPath = [documentPath stringByAppendingPathComponent:@"tessdata"];
NSFileManager *fileManager = [NSFileManager defaultManager];
// If the expected store doesn't exist, copy the default store.
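if (![fileManager fileExistsAtPath:dataPath]) {
    // get the path to the app bundle (with the tessdata dir)
    NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
    NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
    if (tessdataPath) {
        // Log a failed copy (e.g. tessdata missing from the bundle) instead of ignoring it.
        NSError *copyError = nil;
        if (![fileManager copyItemAtPath:tessdataPath toPath:dataPath error:&copyError]) {
            NSLog(@"Could not copy tessdata: %@", copyError);
        }
    }
}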

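// TESSDATA_PREFIX must name the parent directory of "tessdata" and end with "/".
setenv("TESSDATA_PREFIX", [[documentPath stringByAppendingString:@"/"] UTF8String], 1);

// init the tesseract engine ("tesseract" is an instance variable declared
// elsewhere as: tesseract::TessBaseAPI *tesseract;).
tesseract = new tesseract::TessBaseAPI();
tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");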
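
Step 5 above (the .mm extension) is easy to forget, so here is a small sketch that lists candidate files. It is only an illustration: find_m_files_using_tesseract is a hypothetical helper, and it assumes the Tesseract header you import is baseapi.h. It only catches direct imports; a .m file that pulls the header in through another header will not be reported.

```shell
# Hypothetical check: print every .m file below the given directory that
# mentions baseapi.h. Those files would need to be renamed to .mm.
# Indirect imports (through another header) are not detected.
find_m_files_using_tesseract() {
    local srcdir="$1"
    find "$srcdir" -type f -name '*.m' -exec grep -l 'baseapi\.h' {} + || true
}
```

Running find_m_files_using_tesseract ./MyProject (the project path is just an example) prints nothing when every file that imports the header already uses the .mm extension.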

Well, that’s it! I hope you can reproduce this. I also provide for download an Xcode 4.2 / iOS SDK 5 project with Tesseract configured and already recognizing a sample image; check it out if you have any trouble following this howto.

Files for Download

Final Considerations

I really hope you guys have enjoyed it. If you have any opinion, compliment, suggestion or just wanna state something, feel free to comment; I’ll try to approve it ASAP.

Acknowledgements

233 Responses to “How To: Compile and Use Tesseract (3.01) on iOS (SDK 5)”


  1. November 1, 2011 at 7:06 pm

    Great stuff Suzuki .. I’ll try it later today :-)

    Thank you very much for the reply, and a big hug!

  2. November 4, 2011 at 2:21 pm

    Hey thanks a bunch for this, the same thing just happened to me on tesseract3.0 + iOS 5 upgrade :(

    I was also wondering if you had any luck training Tesseract. For some reason I am having a really tough time training it for a select few fonts.

    Thanks for this, you seem to be interested in the same type of projects as I am possibly? :)

    • November 4, 2011 at 3:14 pm

      I haven’t tried to train tesseract since I found it to be really painful and got fair to very good results with the supplied language files…

      If you’re interested in mobile applications that try to empower the user (in the “be able to do more” sense), then yes, we may have the same tastes. :D

    • April 1, 2012 at 12:55 pm

      Please, I’m integrating Tesseract with OpenCV. I wonder how to create a Tesseract object in order to use its methods in another class. So the question is: how do I create an object from Tesseract?

      • April 8, 2012 at 9:06 am

        It’s as simple as encapsulating it inside a custom class and creating functions that call the tesseract C-functions to do the job. Did I understand your question?

      • 6 EslamFarag
        April 12, 2012 at 11:08 am

        When I read the release notes for version 3.01 (http://code.google.com/p/tesseract-ocr/wiki/ReleaseNotes) I found that this version has many features I would like to benefit from. The question: is the sample Xcode project attached to this post already configured to allow all these features, or does it only recognize an image? If yes, how do I use them in my project? If no, how do I configure these features on iOS (SDK 5)?
        thanks

      • 7 EslamFarag
        April 12, 2012 at 11:10 am

        yes you did it, thanks

  3. November 4, 2011 at 4:11 pm

    Yeah, the base eng.traineddata is good. However, for, say, new pictures of text the image quality changes dramatically, and even with image processing and cleaning up, the results are far from usable as reliable information, especially for targeted document types. I only want to train 2 fonts after my image processing to test accuracy on imagepicker results. My older 3.0 version, without a lot of work, was around 80% accurate on non-trained fonts, but on Times New Roman it was 100% from a picture. It’s a shame the training process is so painful, especially when in theory it should not be that difficult lol. If you’re interested in training at all, or need some pre-processing tips, keep me in the loop; I’ll let you know how it all works out shortly. Hoping to finish this and release in 2 weeks. But thanks again for this, I was dreading doing a new compile for iOS!

    • November 4, 2011 at 4:34 pm

      Nice!

      Let me know if you get to train tesseract and if that actually helps with accuracy. I haven’t seen anyone report success with that. I guess it could be used as material for a new post.

      Good luck!

    • July 4, 2013 at 7:16 pm

      Can you share your experience: how did you train your fonts, and what was the success rate?

  4. November 4, 2011 at 5:01 pm

    Sorry to blow up your thread on possible off topics however.. I have tried converting this to run off the camera roll or image picker and noticed that leptonica pix.h methods are completely new and restricting my ability to run any size image I want. Are there any restrictions to sizing or ways to overcome this instead of using the older image sizing methods aside from leptonica?

  5. November 9, 2011 at 7:23 am

    Just a small thanks, you saved me a lot of time.

  6. 14 Matt H
    November 9, 2011 at 11:20 pm

    Excellent tutorial, worked perfectly!

  7. 20 Lucio
    November 11, 2011 at 2:07 pm

    Great tutorial…. thanks

  8. 21 Abdulla
    November 19, 2011 at 5:09 am

    Great tutorial.. Was fully lost with several kinds of errors trying to build tesseract :)
    It would be good if you let us know how you used OpenCV with tesseract

    Thanks a lot!!!!!

  9. 22 abdulla
    November 19, 2011 at 10:01 am

    Hi,
    I tried your steps to build for 4.3, but I don’t think it built properly. Many errors pop up when trying to link it into an Xcode project. Moreover, there are fewer header files in dependencies/include/tesseract than in yours. I’ve posted the logs of the build script. Can you help, please? I have been trying to build and use this library for the last 8 hours straight and badly need your help. No idea where I am going wrong.

    The following i output to a txt file. I ve also pasted the error messages coming from terminal.
    *****************************************
    Edit: WAY TOO BIG!!!

    configure: error: C preprocessor “/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/cpp-4.2″ fails sanity check
    See `config.log’ for more details.
    make: *** No targets specified and no makefile found. Stop.
    ar: creating archive libtesseract_all.a
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(bmpiostub.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(gifio.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(jpegio.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(leptwin.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(pdfiostub.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(pngio.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(pnmiostub.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(psio1stub.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(psio2stub.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(tiffio.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(webpio.o) has no symbols
    /Developer/usr/bin/ranlib: file: libtesseract_all.a(zlibmemstub.o) has no symbols
    lipo: specifed architecture type (armv6) for file (./outdir/arm6/liblept.a) does not match its cputype (16777223) and cpusubtype (3) (should be cputype (12) and cpusubtype (6))
    lipo: specifed architecture type (armv6) for file (./outdir/arm6/libtesseract_all.a) does not match its cputype (16777223) and cpusubtype (3) (should be cputype (12) and cpusubtype (6))
    cp: ./outdir/lib*.a: No such file or directory

    • 23 Vedika
      February 17, 2012 at 9:32 pm

      Hi abdullah
      Can you tell me how you solved this problem? I am getting the same error..
      /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/ranlib: file: .libs/liblept.a(leptwin.o) has no symbols

      Thanks,

      • April 27, 2013 at 9:21 am

        I don’t know if you solved it, but in case somebody else has this problem: build_dependencies.sh has to be changed to point to the latest base SDK (right now IOS_BASE_SDK="6.1"). BTW, I changed the deploy target too, so it can also be used on iOS 5:
        IOS_DEPLOY_TGT="5.0"

  10. 25 abdulla
    November 19, 2011 at 11:33 am

    Hi,
    Please disregard my last message. I’ve built it fine. Too many hours of work made me a little impatient with tesseract :(

    Now trying to integrate with Xcode project – View controller based. Getting this error :

    Command /Developer/usr/bin/lex failed with exit code 1

    Trying to fix it . Thanks

    • November 21, 2011 at 12:48 pm

      Do you have any other output about this error on Xcode? Even the command that is generating this would help.

      Regards

      • 27 Hitesh Soni
        February 18, 2013 at 9:12 am

        Hi ,
        I am getting the following error message :

        /Users/administrator/Desktop/Intern/myOCRApp/myOCRApp/tesseract-ocr/tessdata/eng.cube.lm:8: premature EOF
        Command /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/lex failed with exit code 1

        Trying to fix it,
        please help.
        Thanks

      • February 18, 2013 at 11:36 am

        There’s no reason for Xcode to do anything with tessdata files, it should just be packaging the folder along with your App. It seems that Xcode is trying to use it as some kind of source file, try to make it stop doing this.

    • 29 Karwan
      September 30, 2012 at 11:02 am

      How did you solve the problem with the error

      cp: ./outdir/lib*.a: No such file or directory

      thanks,

  11. November 20, 2011 at 6:29 am

    If you build the library this way using the iOS 5 SDK, can an app using the library run on iOS 4.3 devices?

  12. 32 Andrea
    November 23, 2011 at 6:57 pm

    Hi, thank you for the tutorial!
    I have a problem: I copied the “dependencies” and “tessdata” folders from your Xcode example (which works for me) into my Xcode project, but I get this error in this part

    namespace tesseract {
    class TessBaseAPI;
    };

    -> expected ; after top level declarator

    Can you help me?

    • November 23, 2011 at 9:21 pm

      You can try changing the .m files that include Tesseract stuff to the .mm extension. Having .m files include C++ headers causes problems because .m files are compiled as plain C/Objective-C source files.

      If that won’t work, you can also try to change the project I’ve provided and turn it into yours…

      Regards,
      Suzuki

      • 34 Andrea
        November 24, 2011 at 5:10 am

        I had already changed the .m file to .mm and it did not work (it compiles, but when I #import the .h file in another class I get the previous error).
        I solved it this way:

        Change “namespace tesseract { …. };” to:

        #ifdef __cplusplus
        #include "baseapi.h"
        using namespace tesseract;
        #else
        @class TessBaseAPI;
        #endif

        and change tesseract::TessBaseAPI *tesseract;
        to: TessBaseAPI *tesseract;

        This way my project works. Is this correct?

      • November 24, 2011 at 5:33 am

        I guess it is correct and a very nice way to solve the problem. I’ll use it myself.

        Thanks and I’m glad you worked it out!

      • 36 Andrea
        November 24, 2011 at 5:37 am

        Thank you for this tutorial! =)

  13. 37 Fred
    November 25, 2011 at 6:19 pm

    I’ve downloaded the sample Xcode project, but it didn’t work for me. When I try to compile the project, more than 50 errors are shown, specifically on this code: @class MBProgressHUD;

    namespace tesseract {
    class TessBaseAPI;
    };

    Can u help me to just compile this project ??

    tks.

  14. December 6, 2011 at 9:11 am

    Do you have a PayPal account? I want to send you some money!
    You’re a life savior!!!

  15. 41 Pich
    December 12, 2011 at 5:03 am

    hi Suzuki
    firstly i’d like to apologize for my poor english if you can’t read my question smoothly.

    i’ve read your instruction to cross-complie tesseract but it didn’t work well
    (of course that i wasn’t good enough at this)
    but finally i decided to download your sample project and a bit fixing
    so now i can go through with tesseract on iOS
    but the problem is i want to enable tesseract to recognize Handwritting
    so i made my own hand.traineddata
    but i don’t know how to make Xcode use my hand.traineddata – -”

    i changed this once
    "tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "hand");"
    and already added hand.traineddata in to “tessdata” folder

    and Xcode still show me “error”
    i know that i have to set PATH for Xcode to read my hand.trainedata but i didn’t know how to set it – -”

    very thanks

    • December 19, 2011 at 3:01 pm

      OK, so the provided sample works fine for you and you’re trying to add your own traineddata to it, right?

      As far as I know, the described procedure should have worked fine and your traineddata file should have been loaded. You can try to see if some of the traineddata supplied by tesseract other than the english one works (http://code.google.com/p/tesseract-ocr/downloads/list), but I can’t help you with that since I haven’t tried to load different traineddata files.

      Besides that, I’d take Ray Smith’s (tesseract developer) statement as strong advice, especially the “limited” part:
      “Tesseract was never designed for handwriting, but people have been successful to a limited extent in retraining it for handwriting.”

      Regards

      • 43 Pich
        December 20, 2011 at 5:04 am

        finally i could solve this problem
        i printed the error and saw the path that your provided sample uses
        and just copied the .traineddata to that folder
        so the problem is successfully fixed
        and it also works when deployed to my ipad

        i know this maybe a very simple solution
        but just post it for the others
        who hit the same problem with me :D

        Regards
        pich

  16. 44 swathi
    December 13, 2011 at 3:08 am

    Hi if possible can u provide me a video regarding this tutorial

  17. 46 sudheer
    December 13, 2011 at 8:18 am

    hi

    I’ve downloaded the sample Xcode project, but it didn’t work for me. When I try to compile the project, more than 50 errors are shown, specifically on this code: @class MBProgressHUD;

    namespace tesseract {
    class TessBaseAPI;
    };

    Can u help me to just compile this project ??

    tks.

  18. December 13, 2011 at 6:52 pm

    Great tutorial, very helpful, thanks a lot!

  19. December 19, 2011 at 5:01 am

    Huge help, you’re the man Suzuki!

  20. January 4, 2012 at 11:55 am

    Thanks a lot & happy new year !!
    I had to do a really quick proof-of-concept and was given a link to RCarlsen’s Pocket OCR sample and after reading through several blogs (including yours) I was really happy to see that your sample project includes pre-compiled libraries – WHAT A TIME SAVER !!

    I’ll probably be back one day and will have to compile some library on my own, but importing your dependencies folder into Pocket OCR will enable me to use the picture roll directly for making some custom tests without having to bother to compile libraries on my own, so your sample project makes my day !!

    Keep up the good work !!

  21. January 12, 2012 at 6:42 am

    Amazing tutorial complete with explanations and a sample project. Thank you for everything and keep up the great work!

  22. 52 Crapulax
    January 20, 2012 at 7:10 am

    Very interesting post !

    I am trying to build vlc for iOS (vlc is an open source multimedia player) http://wiki.videolan.org/MobileVLC

    The provided build script are outdated so I am trying to upgrade them to match new ios version

    During the configure step , I got the following error :
    “configure: error: C preprocessor “/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/llvm-cpp-4.2 ” fails sanity check”

    export CPP=”${DEVROOT}/usr/bin/llvm-cpp-4.2

    would you have any guess on this pb ?

    • January 20, 2012 at 8:48 am

      Try changing from llvm-cpp-4.2 to llvm-gcc-4.2, if that works the explanation would be that llvm-cpp isn’t a C compiler…

      • May 9, 2012 at 10:41 am

        I have the same issue here (I am using Xcode 4.2/iOS 5). I am also trying to compile Tesseract, and the problem happens while building Leptonica:
        configure: error: C preprocessor “/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer//usr/bin/llvm-gcc” fails sanity check
        See `config.log’ for more details.

        Checking config.log this is waht I find:

        configure:5841: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer//usr/bin/llvm-gcc -arch armv7 -pipe -no-cpp-precomp -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer//SDKs/iPhoneOS5.1.sdk -miphoneos-version-min=3.2 -I/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer//SDKs/iPhoneOS5.1.sdk/usr/include/ -I/Users/miguel/Uni-Local/Praktikum/build/dependencies/include -L/Users/miguel/Uni-Local/Praktikum/build/dependencies/lib conftest.c
        conftest.c:14: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘error’
        configure:5841: $? = 1
        configure: failed program was:
        | /* confdefs.h */
        | #define PACKAGE_NAME “leptonica”
        | #define PACKAGE_TARNAME “leptonica”
        | #define PACKAGE_VERSION “1.68″
        | #define PACKAGE_STRING “leptonica 1.68″
        | #define PACKAGE_BUGREPORT “dan.bloomberg@gmail.com”
        | #define PACKAGE_URL “”
        | /* end confdefs.h. */
        | #ifdef __STDC__
        | # include
        | #else
        | # include
        | #endif
        | Syntax error
        configure:5871: error: in `/Users/miguel/Uni-Local/Praktikum/build/leptonlib-1.67′:
        configure:5874: error: C preprocessor “/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer//usr/bin/llvm-gcc” fails sanity check
        See `config.log’ for more details.

        Enviroment:

        setenv_all()
        {

        # Add internal libs
        export CFLAGS="$CFLAGS -I$GLOBAL_OUTDIR/include -L$GLOBAL_OUTDIR/lib"

        export CPP="$DEVROOT/usr/bin/llvm-gcc"
        #export CXX="$DEVROOT/usr/bin/g++-4.2"
        export CXX="$DEVROOT/usr/bin/llvm-g++"
        export CC="$DEVROOT/usr/bin/llvm-gcc"

        export CXXCPP="$DEVROOT/usr/bin/llvm-g++"
        #export CC="$DEVROOT/usr/bin/gcc-4.2"
        export LD=$DEVROOT/usr/bin/ld
        export AR=$DEVROOT/usr/bin/ar
        export AS=$DEVROOT/usr/bin/as
        export NM=$DEVROOT/usr/bin/nm
        export RANLIB=$DEVROOT/usr/bin/ranlib
        export LDFLAGS="-L$SDKROOT/usr/lib/"

        export CPPFLAGS=$CFLAGS
        export CXXFLAGS=$CFLAGS
        }

        setenv_arm7()
        {
        unset DEVROOT SDKROOT CFLAGS CC LD CPP CXX AR AS NM CXXCPP RANLIB LDFLAGS CPPFLAGS CXXFLAGS
        export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer
        # export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
        export SDKROOT=$DEVROOT/SDKs/iPhoneOS$IOS_BASE_SDK.sdk

        export CFLAGS="-arch armv7 -pipe -no-cpp-precomp -isysroot $SDKROOT -miphoneos-version-min=$IOS_DEPLOY_TGT -I$SDKROOT/usr/include/"

        setenv_all
        }

        Any help would be appreciated.

      • May 9, 2012 at 10:58 am

        I am having the same error with Xcode 4.2/iOS 5.1, also trying to compile Tesseract (it fails while building Leptonica).

        I have changed CXX/CPP many times, but none of them works… same problem :S!

        Any help appreciated.

  23. 56 Tom
    January 26, 2012 at 6:00 am

    Yeaahh.. amazing post! Thanks!
    Is it possible to compile Tesseract with your script without Leptonica? I am using OpenCv in my project, so there is no need of Leptonica…

    • January 26, 2012 at 9:25 am

      I haven’t searched for how to compile Tesseract w/o Leptonica, but if it is possible, there must be some parameter to be passed to Tesseract’s configure call. Take a look at it and modify the way it gets called by the build script.

      Good Luck!

    • 58 sts2k
      January 26, 2012 at 11:37 am

      The readme states : “Leptonica is required and provides image I/O and processing”… so doubt it.

      btw thanks Suzuki! Spent 5 hrs trying to fix the build script myself and then I came across this page :)
      Are you with Google now, and if so you think they’ll incl. support for osx/ios in the future releases?

      • January 26, 2012 at 12:34 pm

        Well, I’m not with Google, though I’d be more than happy to contribute with OSX/iOS support or so. And, for future releases, note that tesseract project only provides pre-compiled binaries for Windows, so I bet it will stay like that: pre-compiled/easy installer for Windows, compatible code for compiling for other platforms.

  24. 60 Mark
    February 9, 2012 at 5:37 pm

    Using Thai (tha) and I notice simplified Chinese (chi_sim) the tesseract code seg faults while processing the image. Someone created an issue: http://code.google.com/p/tesseract-ocr/issues/detail?id=502 though in my testing it seg faults in a different function. Anyone have any pointers for debugging? I’d love to be able to step through the tesseract library code in Xcode.

    • 61 Mark
      February 10, 2012 at 2:06 pm

      Actually english doesn’t work 100% for me either. With certain images I get the same crash which occurs in tesseract::Classify::ComputeIntCharNormArray. Is anyone actually using this with iOS 5 successfully beyond a few test images?

  25. 62 Ruben
    February 11, 2012 at 11:03 pm

    Thanks for this info. One question: in your sample project you included the dependencies group with the header files and static libs of leptonica and tesseract. Are the static libs in there fat libs? In other words can I use those for both the iPhone simulator and a real iPhone? Or do I still have to build my own following your tutorial?

    P.S. sorry I first posted this in the wrong topic. :)

  26. 65 Kin
    March 4, 2012 at 10:40 pm

    Awesome tutorial! I downloaded your sample and it works for me, but when I try to include it in my own project a problem occurs. There are no errors while compiling, but it crashes my app. The image below is the screenshot of the error.

    Any idea with this?

    • March 5, 2012 at 10:42 am

      Are you sure that the required image (Lorem_Ipsum_Univers.png) is also in your project’s resources? Or, if you’re using your own images, place a breakpoint at the setTesseractImage: function and check if your (UIImage *)image is valid. As this code is only for demonstration purposes, I haven’t placed error checks.

      • 67 Kin
        March 5, 2012 at 2:08 pm

        Yes, it is a UIImage. The image is displayed in the center image view, but when the OCR takes place the error just pops up. I’ll try to debug it later; if that still fails, maybe I’ll just use your sample project and move all my project into it. I would like to ask what processing I need to add to OCR a normally captured image. I added camera and camera roll support to your sample and tried an image from the camera, and the OCR output is quite messy. What I know is that I have to do some preprocessing on the image. Is there any way to decrease its sensitivity?

      • March 5, 2012 at 2:13 pm

        That (image preprocessing) is, actually, the real challenge behind getting Tesseract to play well with mobile camera pictures. I have no great advice to give, but I know this is totally possible because Apps like ScanBizCards (http://www.scanbizcards.com/) can do it pretty well. If you find any hint on how to do this, please share it with us.

        Good Luck!

    • 69 EslamFarag
      April 12, 2012 at 1:01 pm

      Make sure that you checked the “Create folder references for any added folders” checkbox when you added the tessdata folder to your project.
      Good luck.
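The folder-reference tip matters because files added as a plain group are copied flat into the app bundle, so there is no `tessdata/` directory for Tesseract to open. A minimal sketch for checking a built bundle, assuming the data is expected at `<bundle>/tessdata/<lang>.traineddata` (the helper name `check_tessdata` and that layout are assumptions for illustration, not part of the sample project):

```shell
# check_tessdata APP_DIR LANG
# Succeeds only if APP_DIR/tessdata/LANG.traineddata exists as a regular file.
# Hypothetical helper: the bundle layout is an assumption, adjust it to your project.
check_tessdata() {
    app_dir=$1
    lang=$2
    data_file="$app_dir/tessdata/$lang.traineddata"
    if [ -f "$data_file" ]; then
        printf 'found: %s\n' "$data_file"
        return 0
    fi
    printf 'missing: %s\n' "$data_file" >&2
    return 1
}
```

Running it against the built `.app` directory with `eng` (or `jpn`, etc.) shows quickly whether the language file made it into the bundle under the expected folder.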

  27. March 7, 2012 at 10:13 pm

    awesome Suzuki. Adigato! You saved my lazy ass a lot of work :-P

  28. 71 hungtrv
    March 22, 2012 at 1:55 pm

    Hi, I have a weird problem and I’m looking for help. I used AV Foundation to capture an image of text, just a few words, to a UIImage (JPG). If I save that photo to the iPhoto Library and load it back, then use Tesseract, it works with no problem. However, if I pass the UIImage directly from the AV Foundation output, it doesn’t work: the resulting text is just random characters.

    Thank you very much!

    • March 22, 2012 at 4:09 pm

      I’d bet that the problem is in your “passing UIImage to Tesseract” code, because Tesseract understands binary (or RAW) image representation in a variety of formats, but you gotta specify it correctly. This AV Foundation image of yours may not be in the format that your specific UIImage->Tesseract code is expecting, resulting in erroneous image interpretation.

      On the other hand, when you save your AV Foundation image to JPG and load it back into a UIImage, it may have been converted, on the save or the load step, to the format your UIImage->Tesseract code is expecting.

      Pay attention to your image’s pixel data format.

      Best regards!

      • 73 hungtrv
        March 30, 2012 at 2:44 am

        Thanks for your response. I finally figured it out, the image rotation is the cause.

        Have a nice day!

  29. March 25, 2012 at 8:22 am

    THANK YOU!!
    Finally a script that works. I have it up and running now. For anyone having problems with the /Developer build path, Apple moved the entire folder into /Applications/Xcode. Here is an updated script

    http://pastebin.com/xyUt3c84

  30. 75 mike
    March 29, 2012 at 4:36 pm

    HI I am trying to build your sample project by setting the target compiler to LLVM GCC 4.2 and run into a lot of compile time errors. Is this possible? Thanks

  31. 76 maX
    April 8, 2012 at 8:40 am

    Can you please help me out with an exact description of how to build a simple OCR Android application? It’s for my project submission.

    Thanks in advance !!

    • April 8, 2012 at 9:04 am

      Sorry pal, can’t do that. I can just give you the overall idea of what I’d try to do: compile Tesseract for the ARM architecture of your target device and try to use it from Java using JNI.

  32. 78 hungtrv
    April 8, 2012 at 2:00 pm

    Hi Suzuki,

    I would like to get the debug information of the OCR process, for example, the bounding box around the recognized words. Could you please show me how to get that info?

    Thank you

    • April 8, 2012 at 3:04 pm

      hungtrv, here’s what I do:

      - (UIImage*)image:(UIImage*)image withBoxa:(Boxa*)boxa
      {
          UIGraphicsBeginImageContext(image.size);

          [image drawAtPoint:CGPointZero];

          CGContextRef ctx = UIGraphicsGetCurrentContext();

          [[UIColor blueColor] setStroke];

          // boxa->n is the number of boxes stored in the Boxa
          for (int i = 0; i < boxa->n; i++)
          {
              Box *b = boxa->box[i];
              CGRect asRect = CGRectMake(b->x, b->y, b->w, b->h);
              CGContextStrokeRectWithWidth(ctx, asRect, 2.0);
          }

          UIImage *newImg = UIGraphicsGetImageFromCurrentImageContext();

          UIGraphicsEndImageContext();

          return newImg;
      }

      then, with the original image passed to tesseract, do something like:

      Pixa *textImages = 0;
      Boxa *textLines = tesseractPtr->GetTextlines(&textImages, NULL);
      originalWithBoxes = [self image:originalUIImage withBoxa:textLines];
      pixaDestroy(&textImages);
      boxaDestroy(&textLines);

      Hope that helps..

      • 80 hungtrv
        April 9, 2012 at 12:15 am

        It works perfectly, thank you very much matt h.

      • 81 EslamFarag
        May 11, 2012 at 7:28 pm

        what’s the variable n defined in the for loop ? is this method returns to me only the text lines in the image, (i.e. neglecting non any graphics content in the image)?
        thanks,

      • 82 EslamFarag
        May 11, 2012 at 7:30 pm

        What is the variable n used in the for loop? Does this method return only the text lines in the image (i.e. neglecting any graphics content in the image)?
        (sorry there’s a mistake in the previous post)
        thanks,

  33. 83 Jaideep Joshi
    April 9, 2012 at 8:35 pm

    It is a great tutorial, but while running build_dependencies.sh I got a few errors:
    configure: WARNING: If you wanted to set the --build type, don't use --host.
    If a cross compiler is detected then cross compile mode will be used.
    checking build system type... i386-apple-darwin11.3.0
    checking host system type... arm-apple-darwin6
    checking for arm-apple-darwin6-gcc... /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/llvm-gcc
    checking whether the C compiler works... no
    configure: error: in `/Users/jjaideep2000/project/tesseract/build/leptonica-1.68':
    configure: error: C compiler cannot create executables
    See `config.log' for more details.
    test.sh: line 103: make: command not found
    cp: src/.libs/lib*.a: No such file or directory
    ...

    • April 9, 2012 at 8:57 pm

      That’s a generic error, raised when the script uses the wrong C compiler:
      checking whether the C compiler works… no

      I’m gonna need more info like the Xcode and iOS SDK versions you’re using to reproduce that.

      • 85 Autumn
        July 11, 2012 at 5:58 am

        I have the exact same error.
        I am using Xcode 4.3 and iOS 5.1.
        Can you advise?

      • July 11, 2012 at 11:17 am

        Hey Autumn,

        The correction on build_dependencies.sh is pretty straightforward, actually…

        Well, as the current Xcode version (4.3.2) installs under /Applications/Xcode.app rather than in directories split across your whole HD, I had to change line #35 to:

        export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer

        #47 to:

        export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer

        and #59 to:

        export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer

        I’ll try to post a more bullet-proof script ASAP,

        Good luck,
        Suzuki.
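Rather than hard-coding one path per Xcode release, the three DEVROOT values above can be derived from the active developer directory. A minimal sketch, assuming `xcode-select -print-path` points at the Xcode you want to build with; the function name `devroot_for` is made up for this example and is not part of build_dependencies.sh:

```shell
# devroot_for PLATFORM [DEVELOPER_DIR]
# Prints the per-platform Developer directory, e.g. for iPhoneOS or iPhoneSimulator.
# DEVELOPER_DIR defaults to whatever xcode-select reports; if xcode-select is not
# available, fall back to the Xcode 4.3+ location quoted in the reply above.
devroot_for() {
    platform=$1
    developer_dir=${2:-}
    if [ -z "$developer_dir" ]; then
        if command -v xcode-select >/dev/null 2>&1; then
            developer_dir=$(xcode-select -print-path)
        else
            developer_dir=/Applications/Xcode.app/Contents/Developer
        fi
    fi
    printf '%s\n' "$developer_dir/Platforms/$platform.platform/Developer"
}
```

Inside the script this would be used as `export DEVROOT=$(devroot_for iPhoneOS)` in the ARM setup functions and `export DEVROOT=$(devroot_for iPhoneSimulator)` in the i386 one.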

  34. 87 EslamFarag
    April 15, 2012 at 8:27 am

    How to use DetectOS (Automatic page orientation) and page segmentation in tesseract 3.01 under iOS 5

  35. 88 ahmad alattal
    April 15, 2012 at 9:04 pm

    this is amazing ……. thanks a lot man, you were a big help for me, thanks for the effort

  36. April 16, 2012 at 8:55 pm

    @Maciej Swic
    comment:
    (http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/#comment-227)

    Thank You! I knew someone would have already gone to the effort of adjusting the script for the XCode 4.3 changeover.

    I think the lines for setting SDKROOT in setenv_arm6(), setenv_arm7(), setenv_i386() should remain:

    export SDKROOT=$DEVROOT/SDKs/iPhoneOS$IOS_BASE_SDK.sdk

    and then just update IOS_BASE_SDK to:

    line 9: IOS_BASE_SDK="5.1"

    @Suzuki
    Epic praise for your work! It’s going to make bringing just about any open-source code I find into my iOS project 1000x easier with your cross-compile and FAT linking tips.

  37. 91 Kevin
    April 19, 2012 at 12:54 am

    Thanks so much for the awesome instructions, AND the binaries! One snag that I ran into when trying to make use of the library was that I got a “Member access into incomplete type…” error when I tried to use BaseAPI::ResultIterator. This is a really useful tool to be able to access standard OCR metadata like character-level confidences. I figured out that just adding the header files ResultIterator.h and PageIterator.h from the Tesseract source made it work, so those might be handy to include too…

    • April 19, 2012 at 10:58 am

      Note taken! I just didn’t want to include every Tesseract Header file in order to avoid confusing people, and, as I ain’t a Tesseract expert, I just added the ones needed to make plain OCR work.

  38. 93 wingnet
    April 19, 2012 at 9:19 am

    thanks a lot. you just saved my life.

  39. 94 EslamFarag
    April 20, 2012 at 6:54 am

    How can I remove the non-text areas from a scanned image using Tesseract 3.01? I see the two methods (segmentPage) in tesseractClass.h and (remove_non_text_area) in osdetect.h, but I don’t know how to use them. Please help me with that.

    • April 20, 2012 at 12:06 pm

      Hi Eslam,

      Tesseract provides ways to get the OCR result as a list of bounding boxes with the recognized strings for each box. You could create a new image with just the region determined by those boxes.

      Take a look at TessBaseAPI::GetRegions(Pixa** pixa).

      And as a general tip, read the library API a little bit more before asking for directions.

      Good luck,
      Suzuki

      • 96 EslamFarag
        April 20, 2012 at 12:25 pm

        You mean that Tesseract already neglects non-text areas? And I have to take the bounding boxes (containing the resulting text), put them in another image, and then apply Tesseract again on the new image?

      • 97 EslamFarag
        April 20, 2012 at 2:59 pm

        The method GetRegions(Pixa** pixa) returns a struct of type Boxa. The question is: how do I put the value returned from this method in a UIImage?

  40. 98 Catherine
    April 20, 2012 at 9:38 am

    If you encounter problems using this script and you’re using the latest Xcode from the app store:
    The developer folder is now in : /Applications/Xcode.app/Contents/Developer/Platforms, adjust the script accordingly.
    Do not forget to update the SDK version to 5.1 or whatever new version you have when you read this.

  41. 99 Catherine
    April 20, 2012 at 10:03 am

    Another problem I encountered was that automake was missing (aclocal to be precise). If you’re in that situation:

    curl -O http://mirrors.kernel.org/gnu/automake/automake-1.11.tar.gz
    tar xzvf automake-1.11.tar.gz
    cd automake-1.11
    ./configure --prefix=/usr/local
    make
    sudo make install

  42. 100 Catherine
    April 20, 2012 at 10:49 am

    And libtool also :)

  43. 101 Frank
    April 20, 2012 at 11:50 am

    Hi Suzuki,

    I’m trying to get the Apache Portable Runtime library on iOS. Do you have any experience with that, or do you think it will work with your script?

    • April 20, 2012 at 11:52 am

      Hi Frank,

      Well, if the library contains a makefile compatible with iOS (or at least not platform-tied), the script (with some adjustments) should work fine.

  44. 103 bhushan
    April 27, 2012 at 5:28 am

    Please let us know if it works with the camera; if yes, how, and what are the limits?

  45. 104 jzsues
    May 2, 2012 at 3:00 am

    thanks for your great work.

  46. 105 bhushan
    May 4, 2012 at 4:27 am

    Sir, please let us know why it works like a charm on the iPhone simulator but not on an iPhone. Do we need to do anything else to make it run on an iPhone?

  47. 107 thewoz
    May 11, 2012 at 8:37 am

    Hi, I made these changes:

    Changed “namespace tesseract { …. };” to:

    #ifdef __cplusplus
    #include "baseapi.h"
    using namespace tesseract;
    #else
    @class TessBaseAPI;
    #endif

    and changed tesseract::TessBaseAPI *tesseract;
    to: TessBaseAPI *tesseract;

    But now when I try to compile lines like:

    tesseract->SetImage(…);

    or

    tesseract->Recognize();

    the compiler gives me back these errors:

    ‘TessBaseAPI’ does not have a member named ‘SetImage’
    ‘TessBaseAPI’ does not have a member named ‘Recognize’

    What did I do wrong?

    thx a lot lenny

    • May 11, 2012 at 11:00 am

      Have you tried changing “@class TessBaseAPI;” to “class TessBaseAPI;”? Because the one with a @ is Objective-C specific and TessBaseAPI is a C++ class…

      Good luck,
      Suzuki

      • 109 thewoz
        May 11, 2012 at 2:23 pm

        Ok, the problem was that I had to rename the .m file that includes the Tesseract header file to .mm

        my fault!

        tnx :)
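Since the fix was renaming the `.m` file to `.mm` so it is compiled as Objective-C++, a quick scan of the project can list other candidates. A rough sketch, assuming the header is pulled in directly as `baseapi.h`; the function name `needs_objcpp` is hypothetical, and files that only reach the header through another `.h` will not be caught:

```shell
# needs_objcpp DIR
# Lists .m files under DIR that mention baseapi.h directly. These are compiled as
# plain Objective-C and are candidates for renaming to .mm.
# Prints nothing when no such file is found.
needs_objcpp() {
    find "$1" -type f -name '*.m' -exec grep -l 'baseapi\.h' {} +
}
```

Calling `needs_objcpp .` from the project root prints one path per line; files already named `.mm` are ignored.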

  48. 110 bhushan
    May 16, 2012 at 6:50 am

    @Suzuki, it shows 90% accuracy for my data when I pick an image from my phone’s library, but if I take the same picture with the camera and use it, it shows jumbled letters. Kindly guide me if something can be done to make it work from the camera as it works for gallery images. Thanks.

  49. 111 ducky
    May 18, 2012 at 12:07 am

    Hello

    I followed all the steps and the script ran successfully; however, the lib folder under dependencies is empty.

  50. 113 scorpiozj
    May 24, 2012 at 2:43 am

    Hi, Suzuki

    the url for the script and the sample is invalid.
    could you reshare it?
    thanks.

    • May 24, 2012 at 11:27 am

      Well man, I have just tested it and I was able to download the files through the post links…
      Give it another try and let me know if it doesn’t work for you.

      Regards,
      Suzuki

  52. 117 scorpiozj
    May 25, 2012 at 5:35 am

    Hi, Suzuki

    I downloaded the project and it runs well.
    I also tried to create a new project using the sources built by you, and it runs well, too.

    However, when I add the sources I built myself, it fails with the error:
    ‘pageiterator.h’ file not found
    in dependencies/include/tesseract/baseapi.h.

    I compared my baseapi.h with yours and found that mine has more code than yours.
    For example, the included files are:
    #include "platform.h"
    #include "apitypes.h"
    #include "thresholder.h"
    #include "unichar.h"
    #include "tesscallback.h"
    #include "publictypes.h"
    #include "pageiterator.h"
    #include "resultiterator.h"

    Moreover, there are also more files in libs than in yours: it has 18 files.

    I don’t know where to focus on this problem.
    Could you help me and give some suggestions?
    BTW, my Xcode is 4.3.2.

    thanks.

  53. 119 kato
    May 31, 2012 at 12:48 am

    Hello man, thank you for your tutorial.

    I want to know how to add another language, like Japanese.
    I downloaded jpn.traineddata and changed the call to “tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "jpn")”,
    but I get an error, so I need some help!

  54. 120 Biagio
    June 16, 2012 at 5:50 pm

    Hi all! I currently have a lotto app on the app store called A+ Lotto. I am looking to integrate a ticket scanner, which would scan lotto tix and save the numbers on each playline. I think Tesseract would be a perfect solution, as some commercial OCR engines are very expensive. Suzuki, if you or anyone else on this post would be interested in working on this (for a fee of course) please email me at biellimo83@yahoo.com and we can go from there. Hope to hear from some of you!!

    Biagio

  55. 121 gofa
    June 19, 2012 at 12:36 pm

    Hi

    Can you give me an example of how to use the data returned from

    Boxa* box = tess->GetWords(&pixa);

    thanks,

  56. June 24, 2012 at 11:58 pm

    Hi!

    When I was trying to initialize the TessBaseApi, I encountered Apple LLVM compiler 3.1 Error:
    /usr/bin/clang failed with exit code 254

    Any help will be greatly appreciated!

    Thanks

  57. July 3, 2012 at 7:32 am

    Hi,
    I tried to use pixRead(const char *filename) to convert the Jpeg image.. But it always shows the following error…

    Error in pixReadStreamJpeg: function not present
    Error in pixReadStream: jpeg: no pix returned
    Error in pixRead: pix not read
    Error in pixGetDimensions: pix not defined
    Error in pixGetColormap: pix not defined
    Error in pixClone: pixs not defined
    Error in pixGetDepth: pix not defined
    Error in pixGetWpl: pix not defined
    Error in pixGetYRes: pix not defined

    Thanks

    • July 3, 2012 at 10:27 am

      Just check if you’re compiling the libs with JPEG support (I don’t add it to the script by default). Or… just use PNG, or even load the image through iOS APIs.

      • July 4, 2012 at 2:32 am

        Hi suzuki,

        Thanks for the reply. I tried it with a PNG image too, but it shows a similar kind of error (Error in pixReadStreamPng: function not present). Tesseract processes the image slowly if I directly convert the image to a PIX structure and pass it to the setImage(…..) function. Android apps which use the same Tesseract process the image faster than on the iPhone. They (Android) call the pixRead(….) function through the Java Native Interface to convert the image to a PIX structure and pass it to the setImage(…) function. I have two questions, Suzuki:

        1) Is there any other way to make Tesseract process the image faster?

        2) How do I compile the libs with JPEG support by adding it to the script? Can you give the lines for adding it?

        Thanks..

      • July 4, 2012 at 11:21 am

        Just take a look at leptonica’s README (available at sources or: http://code.google.com/p/leptonica/downloads/detail?name=README-1.68.html).

        There’s a section called “I/O libraries leptonica is dependent on”, it shows how to add support from libjpeg, libtiff, libpng, libz, libgif and libwebp.

        Well, as you can see this reply was a polite version of RTFM… I just wanted to show that the README files can actually be very useful, not just yell at someone to read ‘em w/o even checking it myself…

        Best Regards,
        Suzuki

      • July 4, 2012 at 11:26 am

        Oh, I forgot about (1). Tesseract runs natively on both Android and iOS, so there should be no difference other than the devices’ computational power. If you’re OCR’ing pictures from the devices, try to check the performance while processing the same image file, not pictures directly from the cameras…

        If you wanna improve OCR timings, try to scale down your picture to the smallest size (less data, quicker processing) that would still give nice OCR results, if possible.

  58. 128 Tina
    July 12, 2012 at 3:39 am

    Hi,

    Thanks for your tutorial.
    I’m interested in adding many languages in tesseract. Can you pls help me out in doing this. Where should I exactly edit the code for it. Please help.
    Thanks in advance.

  59. 129 Tina
    July 12, 2012 at 4:35 am

    Hi,

    I’m new to tesseract and I’m currently working in a project which uses tesseract.
    In tessdata, unix executable file named ‘mic.traineddata’ is there. What language does this correspond to?

    // init the tesseract engine.
    objForTesseract = new tesseract::TessBaseAPI();
    objForTesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "mic");

    What does “mic” mean? Is it a language?

  60. 131 Tina
    July 12, 2012 at 4:53 am

    Hi,

    Sorry for the series of questions.
    I downloaded the sample project of yours and it’s working perfectly.
    You have used ‘English language data for Tesseract’, i.e. ‘eng.traineddata’:
    tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
    But it’s extracting Latin text (Lorem_Ipsum_Univers.png). You didn’t add the Latin files, so how come it is extracting Latin text?
    Please explain.
    Thank you.

    • July 13, 2012 at 1:39 pm

      Hi Tina,

      One thing trained data provides is information for Tesseract to know WHAT to recognize, the actual characters. As languages written in the Latin alphabet share most of their characters, using plain English MAY be enough for recognizing their words, but you would have no luck trying to OCR Arabic, for example.

      You gotta note that the picture the sample app is OCR’ing is a very clear one, with high-contrast between the text and background colors, this also increases the accuracy.

      Using trained data for your specific language would also help Tesseract’s accuracy by providing contextual information (the whole word context of a character), so it can better tell apart characters that look alike by evaluating whether valid words exist for any of them.

      In the end, trained data identifies the characters to scan and provides useful contextual information (words, sentences) for better accuracy.

      Regards,
      Suzuki.

      • 133 Tina
        July 16, 2012 at 3:29 am

        Thank you so much for all your efforts. Thanks for the reply too.

        I have got a clear explanation about the trained data.

  61. 134 shaun
    July 12, 2012 at 9:06 pm

    Thank you very much for this – your work and explanations are very much appreciated.

  62. July 18, 2012 at 4:07 am

    Hi suzuki,

    I followed the instructions given in the “I/O libraries leptonica is dependent on” section of leptonica’s README, but I am still getting the same error as mentioned earlier. Is there anything I need to change in the build script, like adding or removing options in the configure call?

    Thanks in advance,
    Hari

  63. July 23, 2012 at 5:48 am

    thanks so much for this!!! Works great, saved me many many hours, well done :)))

  64. 137 Idrees
    July 30, 2012 at 1:14 am

    I followed your sample project. But I am facing this error: “Error opening data file /var/mobile/Applications/026CB5A9-DC32-4B26-AA26-AD8AADF72EF1/Documents/tessdata/eng.traineddata”

    Can you please tell me the solution?
    Thanks

  65. 140 jaime
    July 31, 2012 at 8:03 am

    great post!
    I followed the instructions and I’ve been able to compile the leptonica and tesseract libraries, but it doesn’t make the *.a libraries and it shows the message: “cp: ./outdir/lib*.a: No such file or directory
    Finished!”

    could you help me, please?
    Thank you again for this post!

  66. August 7, 2012 at 6:16 am

    Hi suzuki,

    I have compiled jpeg. Now I need to specify the path to it in leptonica’s configure call, like
    (./configure --enable-shared=no LIBLEPT_HEADERSDIR=$GLOBAL_OUTDIR/include/) for tesseract. I have also replaced the --without-jpeg option in the configure call of leptonica with the --with-jpeg=yes option in order to include JPEG support in leptonica. If I compile leptonica with that option alone, it shows the following errors:

    jpegio.c:112:21: error: jpeglib.h: No such file or directory
    jpegio.c:114: error: expected ‘)’ before ‘cinfo’
    jpegio.c:115: error: expected ‘)’ before ‘cinfo’
    jpegio.c:122: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘jpeg_comment_callback’
    jpegio.c: In function ‘pixReadStreamJpeg’:
    jpegio.c:228: error: ‘JSAMPROW’ undeclared (first use in this function)
    jpegio.c:228: error: (Each undeclared identifier is reported only once
    jpegio.c:228: error: for each function it appears in.)
    jpegio.c:228: error: expected ‘;’ before ‘rowbuffer’
    jpegio.c:231: error: storage size of ‘cinfo’ isn’t known
    jpegio.c:232: error: storage size of ‘jerr’ isn’t known
    jpegio.c:246: error: ‘BITS_IN_JSAMPLE’ undeclared (first use in this function)
    jpegio.c:254: error: ‘rowbuffer’ undeclared (first use in this function)
    jpegio.c:260: error: ‘jpeg_error_do_not_exit’ undeclared (first use in this function)
    jpegio.c:265: error: ‘JPEG_COM’ undeclared (first use in this function)
    jpegio.c:265: error: ‘jpeg_comment_callback’ undeclared (first use in this function)
    jpegio.c:271: error: ‘JCS_GRAYSCALE’ undeclared (first use in this function)
    jpegio.c:278: error: ‘JCS_YCCK’ undeclared (first use in this function)
    jpegio.c:279: error: ‘JCS_CMYK’ undeclared (first use in this function)
    jpegio.c:286: error: expected ‘;’ before ‘calloc’
    jpegio.c:290: error: expected ‘;’ before ‘calloc’
    jpegio.c:336: error: ‘JDIMENSION’ undeclared (first use in this function)
    jpegio.c:336: error: expected ‘)’ before numeric constant
    mv -f .deps/grayquant.Tpo .deps/grayquant.Plo
    jpegio.c:374: error: expected ‘)’ before numeric constant
    /bin/sh ../libtool --tag=CC --mode=compile /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/llvm-gcc -DHAVE_CONFIG_H -I. -I.. -arch armv6 -pipe -no-cpp-precomp -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk -miphoneos-version-min=3.2 -I/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk/usr/include/ -I/Users/administrator/Desktop/newtess/dependencies/include -L/Users/administrator/Desktop/newtess/dependencies/lib -arch armv6 -pipe -no-cpp-precomp -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk -miphoneos-version-min=3.2 -I/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk/usr/include/ -I/Users/administrator/Desktop/newtess/dependencies/include -L/Users/administrator/Desktop/newtess/dependencies/lib -MT kernel.lo -MD -MP -MF .deps/kernel.Tpo -c -o kernel.lo kernel.c
    jpegio.c: In function ‘freadHeaderJpeg’:
    jpegio.c:472: error: storage size of ‘cinfo’ isn’t known
    jpegio.c:473: error: storage size of ‘jerr’ isn’t known
    jpegio.c:492: error: ‘jpeg_error_do_not_exit’ undeclared (first use in this function)
    jpegio.c:503: error: ‘JCS_YCCK’ undeclared (first use in this function)
    jpegio.c:505: error: ‘JCS_CMYK’ undeclared (first use in this function)
    jpegio.c: In function ‘fgetJpegResolution’:
    jpegio.c:530: error: storage size of ‘cinfo’ isn’t known
    jpegio.c:531: error: storage size of ‘jerr’ isn’t known
    jpegio.c:546: error: ‘jpeg_error_do_not_exit’ undeclared (first use in this function)
    jpegio.c: In function ‘pixWriteStreamJpeg’:
    jpegio.c:655: error: ‘JSAMPROW’ undeclared (first use in this function)
    jpegio.c:655: error: expected ‘;’ before ‘rowbuffer’
    jpegio.c:657: error: storage size of ‘cinfo’ isn’t known
    jpegio.c:658: error: storage size of ‘jerr’ isn’t known
    jpegio.c:670: error: ‘rowbuffer’ undeclared (first use in this function)
    jpegio.c:700: error: ‘jpeg_error_do_not_exit’ undeclared (first use in this function)
    jpegio.c:710: error: ‘JCS_GRAYSCALE’ undeclared (first use in this function)
    jpegio.c:714: error: ‘JCS_RGB’ undeclared (first use in this function)
    jpegio.c:757: error: ‘JPEG_COM’ undeclared (first use in this function)
    jpegio.c:757: error: expected ‘)’ before ‘JOCTET’
    jpegio.c:763: error: expected ‘)’ before ‘calloc’
    jpegio.c:784: error: expected expression before ‘)’ token
    jpegio.c: At top level:
    jpegio.c:1115: error: expected ‘)’ before ‘cinfo’
    jpegio.c:1125: error: expected ‘)’ before ‘cinfo’
    jpegio.c:1144: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘jpeg_comment_callback’
    make[2]: *** [jpegio.lo] Error 1

    Can you help me solve it?
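The first error in the log above (`jpeglib.h: No such file or directory`) means the compiler never found the libjpeg headers, and everything after it follows from that. The compile line in the log already passes `-I.../dependencies/include`, so one option is to copy the cross-compiled jpeg headers there and only turn JPEG on when they are present. A minimal sketch; the function name `jpeg_configure_flag` is made up, and the two flags are simply the ones mentioned in the comment above, not checked against leptonica's configure script:

```shell
# jpeg_configure_flag INCLUDE_DIR
# Prints the configure flag to use for leptonica, depending on whether
# jpeglib.h is present in INCLUDE_DIR (hypothetical helper for illustration).
jpeg_configure_flag() {
    include_dir=$1
    if [ -f "$include_dir/jpeglib.h" ]; then
        printf '%s\n' "--with-jpeg"
    else
        printf '%s\n' "--without-jpeg"
    fi
}
```

It could then be spliced into the leptonica configure call as `$(jpeg_configure_flag "$GLOBAL_OUTDIR/include")`. The jpeg static library itself still has to be built for the same architectures and be visible on the linker path.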

  67. August 24, 2012 at 3:38 pm

    It does not build with Xcode 4.5 Preview 4 w/ iOS 6.0. I haven’t looked into why, but it is clear that no object files are built. I suspect it’s a ranlib/libtool problem. It does build with Xcode 4.4 w/ iOS 5.1, following the discussion in this thread.

  68. August 25, 2012 at 6:09 am

    Nice article. Many thanks.

    For those who ran into “configure: error: C compiler cannot create executables” issue, despite setting correct paths, check your IOS_BASE_SDK at the top of build_dependencies.sh. It defaults to 5.0, and current version is 5.1. It cost me an hour of headbashing to figure this out. Hope my comment will help somebody :)
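To avoid the stale IOS_BASE_SDK problem, the version can be read from the SDKs directory instead of being typed in. A minimal sketch, assuming the SDKs live under `$DEVROOT/SDKs` and are named `iPhoneOS<version>.sdk` as in the SDKROOT line quoted earlier in this thread; `latest_ios_sdk` is a hypothetical helper, not part of build_dependencies.sh:

```shell
# latest_ios_sdk SDK_DIR
# Prints the highest major.minor version found among SDK_DIR/iPhoneOS*.sdk,
# or nothing if no versioned SDK directory exists.
latest_ios_sdk() {
    sdk_dir=$1
    for sdk in "$sdk_dir"/iPhoneOS*.sdk; do
        [ -d "$sdk" ] || continue
        name=${sdk##*/}              # e.g. iPhoneOS5.1.sdk
        version=${name#iPhoneOS}     # e.g. 5.1.sdk
        version=${version%.sdk}      # e.g. 5.1
        [ -n "$version" ] || continue
        printf '%s\n' "$version"
    done | sort -t. -k1,1n -k2,2n | tail -n 1
}
```

In the script this would replace the hard-coded value with `IOS_BASE_SDK=$(latest_ios_sdk "$DEVROOT/SDKs")`, once DEVROOT has been set.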

  69. 145 MP
    August 29, 2012 at 12:47 pm

    Hi Suzuki,

    Thanks for excellent article. I spent couple of days with no luck but you saved me. Thanks again. After running your sample project; it seems that tesseract is not able to recognize text correctly. In my case output is around 10% or less. I have added latest traindata to resource folder available on http://code.google.com/p/tesseract-ocr/downloads/list. I was trying to process fuel receipts.

    I guess I must be doing something wrong, as other people are using it and it is working, so I wanted to ask. Please help.

    Thanks.

  70. August 31, 2012 at 8:19 am

    The installation process uses build tools which could be missing on OS X, so if your build has failed, do the following:

    1) check IOS_BASE_SDK and IOS_DEPLOY_TGT in build_dependencies.sh, in case your versions are different
    2) check all DEVROOT variables (in the latest Xcode they should look like /Applications/Xcode.app/Contents/Developer/Platforms/.platform/Developer)
    3) install Homebrew, then run in a terminal:
    brew install autoconf
    brew install automake
    brew install libtool

    It helped me; I hope it helps someone.

    Thank you, Suzuki.
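A missing build tool tends to surface late and confusingly (see the `make: command not found` and aclocal reports earlier in this thread), so a preflight check at the top of the script can save a failed run. A minimal sketch; the function name `missing_tools` is made up for this example:

```shell
# missing_tools TOOL...
# Prints, one per line, every TOOL that cannot be found on the PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing=$(missing_tools autoconf automake aclocal make)` followed by a test on `$missing` lets the script stop early with a readable message. The tool list here is only what commenters reported missing; add whichever libtool command your autogen.sh actually calls.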

  71. 147 Sara
    September 2, 2012 at 1:06 am

    Hi,

    Your tutorial is good. But, I am getting this error:

    checking for leptonica… configure: error: leptonica not found

    Can you please help me?

  72. 148 Deepak
    September 10, 2012 at 9:37 am

    Hi Suzuki and all the other Tesseract people who have done a remarkable job in this field.

    I really need your help. I’ve been going through this document for the last two days, but sadly I couldn’t make the most of it. I also fail to understand where to download Tesseract from and how to configure all the libraries and frameworks so I can use them in Xcode later on.

    All your replies will be appreciated, and thanks a ton in advance, because I am sure that I will get help over here.

    Once again Thanks!!!! :)

  73. 149 mahipal
    September 13, 2012 at 8:45 am

    Great stuff!!. I have downloaded the sample xcode project and it is working fine. Now I would like to add tesseract file from this sample to my project. I was able to add all folders. When I run the project, the error is thrown at

    tesseract->SetImage();
    EXC_BAD_ACCESS

    PLEASE HELP ME.

    • 150 Deepak
      September 14, 2012 at 3:06 am

      Hello Sir,

      I really need your help please provide me the steps to configure Tesseract and how to use it along with Xcode and please let me know where I can download Tesseract.

      Thanks in advance :)

  74. 151 jimmy
    September 18, 2012 at 6:50 am

    Hi Suzuki,

    this is a rather trivial question, but somehow after compiling, I’m left with the 15 libtesseract_*.a and one libtesseract.a libs in the lib folder. They didn’t get merged into libtesseract_all.a, so… could you give me a helpful hint for solving this small issue?

    Thanks very much !!

  75. 152 tgriebe
    September 20, 2012 at 7:06 pm

    Hi, with the iPhone 5 coming up, you are likely to experience problems in Xcode. The reason is that Xcode 4.5 expects libs to support the armv7s architecture. I solved the problem by adding armv7s to the build script.
    However, make sure you use the correct lipo version. Older versions that come with older Xcode versions or reside in /usr/bin may not support armv7s.

    This may eventually leave you with the final lib not being created. In that case, change DEVROOT to /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer

    There, under $DEVROOT/usr/bin, you’ll find a working version of lipo.
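One way to make sure the merge step uses the lipo that ships with the selected toolchain is to resolve it explicitly and only fall back to the PATH when it is absent. A minimal sketch, assuming DEVROOT is set as described in the comment above; `pick_lipo` is a hypothetical helper name:

```shell
# pick_lipo DEVROOT
# Prints DEVROOT/usr/bin/lipo if it exists and is executable,
# otherwise prints plain "lipo" so the PATH lookup is used.
pick_lipo() {
    devroot=$1
    if [ -x "$devroot/usr/bin/lipo" ]; then
        printf '%s\n' "$devroot/usr/bin/lipo"
    else
        printf '%s\n' "lipo"
    fi
}
```

The script would then call `LIPO=$(pick_lipo "$DEVROOT")` once and use `"$LIPO"` wherever it currently calls `lipo`.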

  76. 153 blubbaren
    September 22, 2012 at 12:22 pm

    i tried to compile this for SDK6 (Xcode 4.5) with bad results.
    Anyone have or know of a guide for sdk6?

  77. September 24, 2012 at 6:03 am

    Hi,
    I have the latest iOS 6 SDK (Xcode 4.5). When I run build_dependencies in the terminal it works fine, but when I check dependencies > lib there is nothing in it.

  78. September 24, 2012 at 6:27 am

    Thanks to your blog post and its comments, I’ve been able to make two repos for Tesseract and SDK 6.0:

    – Libraries compiled for armv7-i386: https://github.com/ldiqual/tesseract-ios-lib
    – Objective-c wrapper: https://github.com/ldiqual/tesseract-ios

    Compilation instructions for iOS SDK 6.0 are available on my blog post: http://lois.di-qual.net/blog/compile-tesseract-for-ios-sdk-6-0/

  79. September 25, 2012 at 7:17 am

    I am building the project in Xcode 4.5 and I have changed the compiler to LLVM GCC 4.2 but i am getting some errors like autorelease in the main method. it says it needs some expression.

  80. October 3, 2012 at 4:40 am

    Hi guys,

    Thanks Suzuki for this very useful tutorial.

    After much screwing around I got it to compile on iOS 6.0 with Xcode 4.5.
    Using Tesseract 3.01 and Leptonica 1.69.

    I had to:

    1) set IOS_BASE_SDK to 6.0
    IOS_BASE_SDK="6.0"

    2) change leptonica dir
    LEPTON_LIB="`pwd`/leptonica-1.69"

    3) check all DEVROOT variables
    export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer
    export DEVROOT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer

    4) install Homebrew, then do in terminal :
    brew install autoconf
    brew install automake
    brew install libtool

    5) change compilers to 4.2 version
    export CXX="$DEVROOT/usr/bin/llvm-g++-4.2"
    export CC="$DEVROOT/usr/bin/llvm-gcc-4.2"

    6) ios 6 doesn’t support armv6 so i had to remove the armv6 sections and the references to armv6 where it merges the libs. If you don’t remove the armv6 references you will get no libs.
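Several of the failures reported in this thread come down to DEVROOT pointing at the wrong place. Rather than hard-coding the Xcode location as in step 3, the script could ask Xcode for it. A minimal sketch, assuming the Xcode 4.5 directory layout quoted above; `devroot_for` and `XCODE_DEV` are hypothetical names, not part of the original script.

```shell
#!/bin/sh
# devroot_for XCODE_DEV PLATFORM: build the per-platform developer root.
# (Hypothetical helper; assumes the Platforms/<name>.platform/Developer layout.)
devroot_for() {
    echo "$1/Platforms/$2.platform/Developer"
}

# Ask Xcode where it is installed; fall back to the default Xcode 4.5 location.
if command -v xcode-select >/dev/null 2>&1; then
    XCODE_DEV=$(xcode-select -print-path)
else
    XCODE_DEV=/Applications/Xcode.app/Contents/Developer
fi

# One root for device builds, one for simulator builds.
echo "device:    $(devroot_for "$XCODE_DEV" iPhoneOS)"
echo "simulator: $(devroot_for "$XCODE_DEV" iPhoneSimulator)"
```

Each `setenv` function in the build script would then export the matching value as DEVROOT.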

  81. 159 Idrees
    October 4, 2012 at 9:14 am

    Hi,
    I want to show a progress bar while OCR’ing. I know the expected number of characters in the output, but is there any way I can get the recognized text while Tesseract is still processing?
    Thanks

  82. 161 prasad
    October 9, 2012 at 8:16 am

    How can I get the font sizes of the recognized text, so that I can identify the name on a business card by its font size? (Using OCR on iPhone.)

    http://stackoverflow.com/questions/12796500/how-to-get-font-sizes-of-recognized-text-to-recognize-name-using-font-size-on-th

  83. 162 Lutful_Kabir
    October 17, 2012 at 5:22 am

    Why do I get the error message “Command /Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/clang failed with exit code 1”?

    I am trying to integrate Tesseract into my application. When I run my application it shows the above error. If I remove the Tesseract files, the application builds fine without errors. What is the problem?

    The error displayed above it is:

    {

    ld: duplicate symbol _main in /Lutful Kabir/Project/Business Card Reader/DBZiCardReader/DBZiCardReader/build_dependencies/dependencies/lib/libtesseract_all.a(svpaint.o) and /Users/foyzulkarim/Library/Developer/Xcode/DerivedData/DBZiCardReader-awawphfeoprlwqgcmoohhievukcx/Build/Intermediates/DBZiCardReader.build/Debug-iphonesimulator/DBZiCardReader.build/Objects-normal/i386/main.o for architecture i386 Command /Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/clang++ failed with exit code 1

    }

    How can I get rid of this error? Please help me out. Thanks in advance.

  84. 163 dan
    October 18, 2012 at 5:49 am

    This doesn’t work at all for me. At first I got “command not found” for every empty line; after removing the empty lines I get this:

    make: *** No rule to make target `distclean'. Stop.
    ./build_fat.sh: line 15: unset: `LD ': not a valid identifier
    configure: WARNING: If you wanted to set the --build type, don't use --host.
    If a cross compiler is detected then cross compile mode will be used.
    checking build system type... i386-apple-darwin11.4.0
    checking host system type... arm-apple-darwin
    checking --enable-graphics argument... yes
    checking --enable-multiple-libraries argument... no
    checking whether the C++ compiler works... no
    configure: error: in `/Users/driegler/Documents/tesseract-3.01':
    configure: error: C++ compiler cannot create executables
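A “command not found” on every empty line is a common symptom of Windows-style line endings (a stray carriage return at the end of each line). That is only a guess about this particular case, but it is cheap to rule out. A minimal sketch; `strip_cr` is a hypothetical helper name and the sample input is made up.

```shell
#!/bin/sh
# strip_cr: read a script on stdin, drop every carriage return, write to stdout.
# (Hypothetical helper; assumes the stray characters are plain \r bytes.)
strip_cr() {
    tr -d '\r'
}

# Demonstration on an in-memory sample rather than the real script.
printf 'make distclean\r\n\r\nunset LD\r\n' | strip_cr

# On a real file, write to a new name and compare before replacing the original:
#   strip_cr < build_fat.sh > build_fat.clean.sh
```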

  85. 164 Autumn
    October 24, 2012 at 3:03 am

    Thanks for your post. I have successfully compiled and run on simulator from xcode 4.5.1
    However, when I try to run it on my iphone5, I got the following error

    ld: file is universal (3 slices) but does not contain a(n) armv7s slice: /Users/tom/cocr/build/dependencies/lib/liblept.a for architecture armv7s
    clang: error: linker command failed with exit code 1 (use -v to see invocation)

    Can you give me some pointers?

  86. October 28, 2012 at 8:28 pm

    In case anyone else runs into this problem with an image captured on iOS 6 and an iPhone 5, I thought I’d post.
    I couldn’t get Tesseract to decode a UIImage straight from the UIImagePicker (nor would it work on a JPG that I emailed from my phone).
    I used the function from here:

    http://artgillespie.tumblr.com/post/232498238/getting-a-uiimages-raw-pixel-data

    which converts the UIImage from the picker to raw pixels and then back again. This seems to work.

    • 166 Farooq
      November 12, 2012 at 7:34 am

      Are you using Mountain Lion and XCode 4.5.1? If yes, can you share the changes made in the script file to compile it against ios6?

      One change that I identified in the script is following:-

      export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer

      needs to be replaced with the following due to changes in XCode and Mountain Lion.

      export DEVROOT=/Applications/XCode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer

      But still, its not working. If anyone can post their experience on Mountain Lion / XCode 4.5.1 OR 4.5.2, it would really help others a lot.

  87. October 30, 2012 at 10:02 pm

    I followed all the steps and I’m getting a “file reference not found” error when I include environ.h. I’ve scanned the sample for include paths to the lib directories, added include paths, removed include paths, pulled my hair out, and checked every single setting between the sample and my project, and for the life of me I cannot figure out why the .h file, which is in the path I’ve provided, is reported as not found. My files are all .mm. The environ.h file is part of the Leptonica portion of the includes. Building the .a files with build_dependencies.sh completes with no errors and the files are created. I followed the steps to create the new project based on the sample, compiled it, and the compiler threw this environ.h error. Then the fun began: I painstakingly checked every compiler setting and still have not been able to find out why environ.h just will not be found by the ^$#%$#% compiler. Any help would be appreciated.

    The tesseract sample has these includes:
    #include "baseapi.h"

    #include "environ.h" // <-- this include throws a "lexical or preprocessor issue": reference not found
    #import "pix.h"

    This code is in the ViewController.mm file.

  88. November 8, 2012 at 2:11 am

    Here is an updated version of the script for Leptonica 1.69, Tesseract 3.02, Xcode 4.5 and armv7s CPU type:

    #!/bin/sh
    # build.sh

    GLOBAL_OUTDIR="`pwd`/dependencies"
    LOCAL_OUTDIR="./outdir"
    LEPTON_LIB="`pwd`/leptonica-1.69"
    TESSERACT_LIB="`pwd`/tesseract-ocr"

    IOS_BASE_SDK="6.0"
    IOS_DEPLOY_TGT="6.0"

    [I cropped the script, the full version is available at: http://goo.gl/wQea5]

  89. 169 Pete
    November 28, 2012 at 11:15 am

    Thanks for posting the compile script!… I made some additions when working with it to improve performance/memory footprint. You can add -O3 to the script to bring the library size from 25MB to 17MB, helps on low mem devices like the first iPad.

  90. 170 paul
    December 2, 2012 at 7:06 am

    Great work suzuki

  91. 171 Mike
    December 7, 2012 at 11:56 am

    Hi, thanks for all the great information. I have just one question. I notice the tessdata folder has files named “eng.cube…..” When I run through the instructions I get my eng.traineddata file, however I don’t know how to generate the “eng.cube…” files? Can someone point me in the right direction. Thanks!!!

  92. December 13, 2012 at 3:21 am

    Hey! I am working with Tesseract 3.02 and checking whether it can recognize a multi-page document (using the ProcessPages method in TessBaseAPI). Can you help me get this working? Is Tesseract able to read multiple pages?

  93. 177 Donald Chun
    December 16, 2012 at 12:42 am

    Running “build_dependencies.sh” failed because the file contains HTML entities such as &gt; and &amp;.
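A script copied from the page can be cleaned up mechanically before running it. The sketch below is an assumption-laden filter, not part of the original post: `decode_entities` is a hypothetical name, and the set of replacements covers only the entities mentioned here plus the typographic double quotes that show up in scripts pasted into this thread.

```shell
#!/bin/sh
# decode_entities: undo HTML escaping and "smart" double quotes in a pasted script.
# (Hypothetical helper; extend the list if other entities appear in your copy.)
decode_entities() {
    # &amp; is decoded last so that "&amp;gt;" becomes "&gt;" rather than ">".
    sed -e 's/&gt;/>/g' \
        -e 's/&lt;/</g' \
        -e 's/&quot;/"/g' \
        -e 's/“/"/g' \
        -e 's/”/"/g' \
        -e 's/″/"/g' \
        -e 's/&amp;/\&/g'
}

# Demonstration on made-up sample lines.
printf '%s\n' 'make &gt; build.log 2&gt;&amp;1 &amp;&amp; echo done' | decode_entities
printf '%s\n' 'IOS_BASE_SDK=”6.0″' | decode_entities
```

Review the result by eye before running it; a filter like this cannot restore characters the blog engine removed outright, such as backslashes.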

  94. 179 Patrick
    December 17, 2012 at 5:52 pm

    I checked and rechecked everything and I still get this error:

    checking for leptonica… yes
    checking for pixCreate in -llept… no
    configure: error: leptonica library missing
    make: *** No targets specified and no makefile found. Stop.

    What am I missing?

    Thanks!

    • 180 Patrick
      December 17, 2012 at 8:30 pm

      Looks like there was a spurious ‘/’ in a sed command:

      create_outdir_lipo()
      {
          for lib_i386 in `find $LOCAL_OUTDIR/i386 -name "lib*.a"`; do
              lib_arm7=`echo $lib_i386 | sed "s/i386/arm7/g"`
              lib_arm7s=`echo $lib_i386 | sed "s/i386/arm7s/g"`
              lib=`echo $lib_i386 | sed "s/i386//g"`  # was: sed "s/i386///g"
              xcrun -sdk iphoneos lipo -arch armv7s $lib_arm7s -arch armv7 $lib_arm7 -arch i386 $lib_i386 -create -output $lib
          done
      }
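To see what those sed substitutions produce, here is a small standalone sketch run on a made-up path (`./outdir/i386/liblept.a` is an assumed example, not output from a real build). It also shows an alternative delimiter, which is my own suggestion rather than something from the script, that avoids escaping the slash altogether.

```shell
#!/bin/sh
# Assumed example path; the real script gets these from `find`.
lib_i386="./outdir/i386/liblept.a"

# Per-architecture inputs: swap the directory name.
lib_arm7=$(echo "$lib_i386" | sed "s/i386/arm7/g")
lib_arm7s=$(echo "$lib_i386" | sed "s/i386/arm7s/g")

# Output path as in the corrected function: leaves a harmless double slash.
lib=$(echo "$lib_i386" | sed "s/i386//g")

# Alternative (assumption): use | as the delimiter and remove "i386/" as a unit.
lib_clean=$(echo "$lib_i386" | sed "s|i386/||g")

echo "$lib_arm7"
echo "$lib_arm7s"
echo "$lib"
echo "$lib_clean"
```

With three slashes, sed sees an unexpected extra field after the replacement and rejects the expression, so `$lib` ends up empty and lipo has nowhere to write.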

  95. 182 rajesh Pola
    January 11, 2013 at 6:33 pm

    Nice work. I was stuck, and your tip on recursively adding the folder helped me. You deserve a drink. Let me buy you a drink or send you something.
    Thanks a bunch…

    • January 11, 2013 at 8:24 pm

      I’m a beer guy!

      Don’t know where you’re from, but you can always send me beer cash through Paypal, I promise I’ll solely use it for beer!

      Glad to have helped.

  96. 184 Joch
    January 19, 2013 at 3:32 am

    HI, your website is very useful. Thank you so much for that.

    I have a problem when I try to copy your sample project. I have copied the “dependencies” and all other files that needs in this project and also your code. But I failed to compile it in Xcode 4.5. Could you explain why?
    Thanks!

  97. January 20, 2013 at 12:55 am

    I successfully compiled Tesseract following your instructions. However, Tesseract requires the C++ language dialect and C++ standard library settings to be Compiler Default, while OpenCV requires them to be GNU++11 and libc++ (with C++11 support) respectively.
    So the question is: how should I set the C++ Language Dialect and C++ Standard Library to integrate Tesseract and OpenCV in the same project?

    • January 21, 2013 at 9:32 am

      I’ve already used Tesseract and OpenCV in a Xcode/iOS project without any problem with c++ dialects or standard library versions, although I didn’t compile OpenCV, but used a pre-compiled lib (http://goo.gl/Z4Grl, from (I believe) http://niw.at/articles/2009/03/14/using-opencv-on-iphone/en).

      • February 20, 2013 at 3:40 am

        Suzuki, the current 2.4.3 and 2.4.4 branches of OpenCV are compiled with libc++, which makes the combination of OpenCV and Tesseract impossible. OpenCV 2.4.2, however, is built with libstdc++ and so should work. But the newer versions of OpenCV have some great improvements to how they handle work on iOS, and they include cap_ios.h, which has some great video functions for iOS. Do you have a build_dependencies.sh that switches Tesseract to libc++ instead of libstdc++? That would be super helpful for combining the latest versions of OpenCV and Tesseract.

  98. 190 Nelson
    February 18, 2013 at 8:53 am

    Hi Suzuki, thanks for the great post. It made it all possible. I have a couple of suggestions to make, because the whole building/compiling process bruised me for a while during the weekend, before I finally made it work:

    1) I’m from the Unix “old-days”, before autoconf, automake, libtools, brew etc. So it might be a good idea to remind people of these dependencies and have them install them before they attempt to build.
    2) Emphasize that the $DEVROOT must point to your current SDK. I got the build_dependencies file for 6.0, and got all these cryptic error messages about how the “C and C++ compilers failed”. Lost a few hours figuring out that, since I’d recently updated Xcode and got the 6.1 SDK, that’s what the $DEVROOT should point to.

    After I installed/fixed my configuration, everything worked out smoothly.

    Again, thanks for the very helpful post.
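The first suggestion can be automated with a pre-flight check at the top of the build script. A minimal sketch: `missing_tools` is a hypothetical helper, and the tool list is taken from the errors reported in this thread (Homebrew's libtool installs `glibtoolize`), so adjust it to your setup.

```shell
#!/bin/sh
# missing_tools NAME...: print each command that is not on PATH, one per line.
# (Hypothetical helper, not part of build_dependencies.sh.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools that autogen.sh and configure have been reported to need in this thread.
missing=$(missing_tools autoconf automake aclocal autom4te glibtoolize)
if [ -n "$missing" ]; then
    echo "Install these before running build_dependencies.sh:" $missing
else
    echo "All build tools found."
fi
```

Failing early with a clear list is easier to act on than the “No targets specified and no makefile found” errors that show up when autogen.sh silently fails.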

  99. 191 Hitesh Soni
    February 19, 2013 at 2:23 am

    Hi there,
    I was able to fix the earlier problem that i was facing..
    Although , NOW, when i apply the library on some article/image
    It is only producing me an output like –

    13 11 21 1 31 1 5 563 111 1 111 41
    3 734111 1 3 1 01 1 1123 51011 113 1 1 2 7
    2921131 1 1 3 310013 13 3 11 111 0112 1111
    1310 13 1 113 01 51 59 011 1 374223 111 21
    1 0 2 7 13 21 31 1311 15121 1 1 31 13 111511
    21 9 11 1 1 11911 3793 3112137 21111 5
    6 1 7937 6 0 5 60119 19 31 1 1 1101 15
    341 011911 2111 1 131111 3 1 1 4013611231
    67 7911 11 0 1 3011633 311 0655 111131
    511 6111119 1 312 10 10 1 5111 511 01101
    21 1 191 15 11 5110 11 1 116 1 511319 1 31
    2 111 2 1 9154 143 43 0 13

    for the article/image –

    Do reply,
    waiting :)

    • February 19, 2013 at 1:18 pm

      Hi Hitesh,

      Glad to hear you got it running. As for your current problem, even though it is outside the scope of this post, have you tried rotating the image to see if it gets better results? Try it manually and, if it works, programmatically later.

      Best regards,
      Suzuki.

  100. 193 Karpagavinayagam
    February 27, 2013 at 3:27 am

    I am getting this error. Can anyone help me find the solution?

    /dependencies/include/tesseract/baseapi.h:33:10: fatal error: 'pageiterator.h' file not found
    #include "pageiterator.h"
    ^
    1 error generated.

  101. March 11, 2013 at 10:17 am

    Hi again :)
    Suzuki, I have launched my OCR app using Tesseract 3.02 on a device (iPhone 4). After the 7th or 8th time I pick an image from the image gallery for OCR, the application closes and I need to reopen it to run it again. I didn’t get any memory warning or crash log in the Debug area in Xcode. Can you please help? The crash looks inconsistent.

  102. 196 Andriy
    March 19, 2013 at 8:42 am

    Hello,

    I get a lot of errors when compiling the sources.
    Could you please help me?

    Regards,
    Andriy.

  103. 198 Sebastian
    April 9, 2013 at 7:56 am

    Hello, thx for the great tutorial!
    But since today I can’t compile the libs anymore, the “dependencies/lib” is empty.

    I’m using Xcode 4.6.1 and iOS SDK 6.1.

    The last error I see:

    configure: error: leptonica library missing
    make: *** No targets specified and no makefile found. Stop.
    ar: creating archive libtesseract_all.a
    ar: *.o: No such file or directory

  104. 200 Jack
    April 29, 2013 at 10:51 pm

    Xcode 4.6 iOS 6.1 encounter this error, libtesseract_all.a has not been generated

    config.status: error: cannot find input file: `Makefile.in'
    make: *** No targets specified and no makefile found. Stop.
    ar: creating archive libtesseract_all.a
    ar: *.o: No such file or directory

  105. 201 Alon
    May 1, 2013 at 11:36 am

    Hi Suzuki,

    Like others mentioned before, in order for tesseract to work with current OpenCV versions, it needs to be built using libc++.
    Using Xcode 4.6, iOS sdk 6.1 and tesseract 3.02, I can’t get it to work.
    I edited the CXX and CXXFLAGS lines in the script to:
    export CXX="/usr/bin/clang++"
    export CXXFLAGS="$CFLAGS -stdlib=libc++"
    This way, the script seems to work just fine, and builds both leptonica and tesseract (with some warnings, but still succeeds). Still, when I add these libraries to my xcode project, and change the “C++ Standard Library” to “libc++”, i’m getting a lot of linker errors, while the regular “Compiler Default” option works just fine.

    Do you have an idea what’s wrong?

    Thanks!

    • 202 Alon
      May 5, 2013 at 9:21 am

      OK, finally managed to get it to work.
      My problem was that after adding and removing references to libraries a few times in my project, I had quite a mess in my “Library Search Paths”. Plus, I didn’t add the new “include” folder (created when building tesseract) to the “User Header Search Paths”.

      So, just a quick recap, in order to build tesseract-ocr using libc++, so it can work along with newer OpenCV versions:

      - Download leptonica-1.69
      - Download tesseract 3.02
      - Arrange them in the folder structure explained in the original tutorial in this page.
      - Download the updated script from the top of this page to the same folder.
      - Edit the script for your relevant IOS_BASE_SDK and IOS_DEPLOY_TGT.
      - Edit CXX to use clang++: CXX="/usr/bin/clang++"
      - Edit CXXFLAGS to use libc++ as the standard library: CXXFLAGS="$CFLAGS -stdlib=libc++"
      - Use the script and build tesseract and leptonica.
      - Add these libraries to your xcode project, change the “C++ Standard Library” setting to libc++.
      - Make sure your “Library Search Paths” setting is not pointing to any old tesseract libs.
      - Make sure your “User Header Search Paths” setting is pointing to the new “include” folder created when you built the new libs.
      - Now, when you try building your project, you’ll have a few missing header files. Just copy them from the old “include” folder from tesseract and leptonica.

      That’s it. At this point, you’ll have a project capable of using both new OpenCV versions AND tesseract 3.02 together. Yey!
      If it’s a new project, don’t forget to edit your prefix file accordingly to include OpenCV and Tesseract in case of __cplusplus, and rename any .m file using them to .mm

      Enjoy
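When the linker still complains after a rebuild like this, it helps to check which C++ standard library a given .a file was actually built against, since a stale library on the search path looks identical from Xcode. A rough sketch: libc++ places its symbols in the inline namespace `std::__1`, which appears in mangled names as `St3__1`. The helper name `uses_libcxx` is hypothetical and the two sample symbols are illustrative.

```shell
#!/bin/sh
# uses_libcxx: read `nm` output on stdin; succeed if any symbol is in std::__1.
# (Hypothetical helper; relies on libc++'s std::__1 inline namespace mangling.)
uses_libcxx() {
    grep -q 'St3__1'
}

# Sample undefined symbols in mangled form (illustrative):
# a std::__1::basic_string constructor (libc++) and the libstdc++ equivalent.
libcxx_sym='__ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEC1ERKS5_'
libstdcxx_sym='__ZNSsC1ERKSs'

echo "$libcxx_sym" | uses_libcxx && echo "references libc++"
echo "$libstdcxx_sym" | uses_libcxx || echo "no libc++ symbols"

# On a Mac, something like:
#   nm -u dependencies/lib/libtesseract_all.a | uses_libcxx
```

If the library Xcode links reports no libc++ symbols, the “Library Search Paths” setting is probably still picking up the old build.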

  106. May 2, 2013 at 1:42 am

    Hi Suzuki,

    Is there any way to build leptonica to support png or tiff format?

    for example, in your script :

    ./configure --host=arm-apple-darwin7 --enable-shared=no --disable-programs --without-zlib --without-libpng --without-jpeg --without-giflib --without-libtiff

    we always build without png or tiff format.

    I changed environ.h and switched to --with-libpng, but it still does not work.

    Thanks,
    Hieu

  107. June 25, 2013 at 12:59 pm

    FYI: I think the only tesseract libs you need to link are tesseract.a and tesseract_api.a. Everything else is unnecessary, at least for me.

  108. 205 Sudhiip
    July 8, 2013 at 4:33 am

    Hi Suzuki, thanks for the straight answer which can be used as is for development.

    But I need to build it again with JPEG and PNG support. However, if I change anything in the ./configure line, only armv7 and armv7s build; i386 does not build with any change I make.

    Can you please provide a script that produces a liblept.a with PNG and JPEG support?

  109. 206 Eric
    July 30, 2013 at 8:17 pm

    anyone getting “eng.cube.lm:8: premature EOF” should remove eng.cube.lm from Compile Sources build phase of your target.

  110. 207 Sabotage3D
    September 4, 2013 at 7:02 am

    Great post, I am going to try your approach tonight. Have you tried using cmake to generate xcode project with ios-toolchain ?

  111. September 5, 2013 at 8:45 am

    I can’t compile, no matter whether I use your script or the new script that claims to support iOS 6. The error is the same.

    ./configure: line 598: test: /Users/zhouhao/Library/Application: binary operator expected
    ./configure: line 598: test: /Users/zhouhao/Library/Application: binary operator expected
    configure: WARNING: If you wanted to set the --build type, don't use --host.
    If a cross compiler is detected then cross compile mode will be used.
    checking build system type... i386-apple-darwin12.4.0
    checking host system type... arm-apple-darwin7
    checking for arm-apple-darwin7-gcc... /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/llvm-gcc
    checking whether the C compiler works... no
    configure: error: in `/Users/zhouhao/Projects/ocrtest/leptonica-1.69':
    configure: error: C compiler cannot create executables
    See `config.log' for more details.
    make: *** No targets specified and no makefile found. Stop.

    ……

    checking build system type... x86_64-apple-darwin12.4.0
    checking host system type... arm-apple-darwin7
    checking --enable-graphics argument... yes
    checking --enable-embedded argument... no
    checking --enable-visibility argument... no
    checking --enable-multiple-libraries argument... no
    checking whether to use tessdata-prefix... yes
    checking whether to enable debugging... no
    checking whether the C++ compiler works... no
    configure: error: in `/Users/zhouhao/Projects/ocrtest/tesseract-ocr':
    configure: error: C++ compiler cannot create executables
    See `config.log' for more details
    make: *** No targets specified and no makefile found. Stop.
    ar: creating archive libtesseract_all.a
    ar: *.o: No such file or directory
    /Users/zhouhao/Projects/ocrtest/tesseract-ocr
    Running aclocal
    Running libtoolize
    autogen.sh: line 55: libtoolize: command not found
    autogen.sh: line 55: glibtoolize: command not found

    My Xcode is 4.6.3. Can anybody advise? Thanks

  112. September 21, 2013 at 2:48 am

    Hi Suzuki,

    Thanks for the great tutorial. I have integrated this in my application. It’s working fine in the simulator, but when I install it on my real device it behaves very oddly: when I capture any text image, it is translated into symbol-like text. If I process the same image in the simulator, it works fine.

    Do I need any extra configuration for an iOS device?

    Thanks,
    Rahul.

  113. September 21, 2013 at 6:21 am

    Hi Suzuki,

    Thanks for the great tutorial. I have integrated this code in my application. It’s working fine in the simulator, but when I install it on my real device it behaves very oddly: when I capture any text image, it is translated into symbol-like text. If I process the same image in the simulator, it works fine.

    Is there a different scenario or configuration needed to make this work on a device? Please help me, as I am stuck on this.

    Thanks,
    Rahul.

    • September 21, 2013 at 12:41 pm

      Hi Rahul,
      No, you shouldn’t need any extra configuration on-device, but it is a real challenge to take a nice picture that would give good results for OCR. There are some ways to apply filters to the image before OCR’ing so you’d get more accurate results, but I haven’t been successful in doing so.
      Good luck!

  114. 212 SlavaVVV
    November 23, 2013 at 6:47 am

    Hello Dear Suzuki
    Did u try to make it with ios7 on one of iphone4/…/5s ? Is it going to work too? Thx

  115. February 6, 2014 at 12:59 pm

    Had trouble getting this working with Tesseract 3.03 / Leptonica 1.70 for iOS 7. In case this helps others, a script with my modifications is at https://gist.github.com/williamsodell/8846486

  116. 215 Patrick
    February 28, 2014 at 4:09 pm

    I am giving this a try for building on Mac OS for iOS 7 + XCode 5, will post my findings when I am successful. Meanwhile I am getting an error about “autom4te” invoked in the build_dependencies.sh script. Any ideas? This tool is not present on my Mac OS.

  117. 216 Deepak
    June 19, 2014 at 6:05 am

    how to change default font type to Arial_Black.ttf

  118. June 26, 2014 at 3:04 am

    Can I use the lib/dll for other Platform using ARM such as Android/Windows Phone?

  119. September 23, 2014 at 4:22 am

    Does it support multiple languages? I would like to use something like “eng+ita” in my project. Is that possible?

  120. 221 Alex
    November 4, 2014 at 9:35 pm

    As part of a project I’m working on, I updated this script to get it to work with Tesseract 3.03, using 8.0 base SDK:

    https://github.com/twelve17/openalpr-ios/blob/master/bin/build_dependencies.sh

  121. 222 Raghav
    November 12, 2014 at 2:02 am

    Could you please give a similar tutorial for compiling tesseract for OS X?

