Hi, I want to write an application to read a PDF file that is not in English (imagine an Asian or Middle Eastern language). What packages can I use, and how can I do this? Can you point me to a website with sample code? Thanks in advance.
Unfortunately, Java currently does not have a PDF parser as part of its standard library. I've used Apache's PDFBox in the past and it worked marginally well. I'm not sure whether the library has any constraints on non-English text, but if it can extract Unicode from a PDF file it should be OK.
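Whichever library does the extraction, the result in Java is an ordinary String of Unicode code points (with PDFBox 2.x it would typically come from PDFTextStripper.getText). One rough way to check whether the non-English text survived is to count the letters per Unicode script. The sketch below uses only the standard library and does not parse a PDF itself; the class name ScriptCheck, the method countScripts, and the sample string are made up for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

public class ScriptCheck {

    // Counts the letters in the text, grouped by Unicode script (LATIN, ARABIC, HAN, ...).
    // Digits, spaces and punctuation are skipped.
    static Map<Character.UnicodeScript, Integer> countScripts(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);
        text.codePoints()
            .filter(Character::isLetter)
            .forEach(cp -> counts.merge(Character.UnicodeScript.of(cp), 1, Integer::sum));
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for text returned by a PDF extraction library:
        // "PDF", an Arabic word (5 letters) and a Chinese word (2 characters),
        // written as escapes so the source file encoding does not matter.
        String extracted = "PDF \u0645\u0631\u062D\u0628\u0627 \u4F60\u597D";
        System.out.println(countScripts(extracted));
    }
}
```

If the script you expect (say ARABIC or HAN) is missing from the result, or most letters land in an unexpected script, the extraction probably did not map the PDF's fonts to Unicode correctly.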