File file = new File("C:/PdfBox_Examples/new.pdf");
PDDocument document = PDDocument.load(file);
//Instantiate PDFTextStripper class
PDFTextStripper pdfStripper = new PDFTextStripper();
//Retrieving text from PDF document
String text = pdfStripper.getText(document);
There is a common view that extracting text from a PDF document should not be too difficult. After all, the text is right there in front of our eyes and humans consume PDF content all the time with great success. Why would it be difficult to automatically extract the text data? Turn