Reading PDF files in Alfresco

I have added a UI action that receives the nodeRef as a parameter, and I am trying to read the node's content with:

contentService.getReader(nodeRef, ContentModel.TYPE_CONTENT).getContentString() 

But this only works with .txt files, not with .pdf, .xlsx, .docx and so on. For example, when I try to read a PDF file it gives me:

 S#?_#3C?? R/Metadata 64 0 R/OCProperties<</D<</Order[]/R ..... 

With Word and Excel documents I get numbers instead.

Is there a solution?

Yes. Try using contentService.getReader(nodeRef, ContentModel.TYPE_CONTENT).getContentInputStream() instead. If you need to find some text, feed the data from the stream to a PDF library and use its API to access the content.
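To illustrate why getContentString() returns garbage here: PDF files are binary and start with the bytes "%PDF-", while .docx and .xlsx files are ZIP containers that start with "PK". Decoding such bytes as text cannot give readable output. Below is a minimal sketch, using only the JDK, that inspects the first bytes of a stream such as the one returned by getContentInputStream(). The ContentSniffer class and its method names are made up for this example and are not part of the Alfresco API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Alfresco.
final class ContentSniffer {

    enum Kind { PDF, ZIP_CONTAINER, OTHER }

    // A PDF file starts with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    // A ZIP local file header (used by .docx/.xlsx) starts with 'P' 'K' 0x03 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Consumes up to the first 5 bytes of the stream and classifies them. */
    static Kind sniff(InputStream in) {
        byte[] head = new byte[PDF_MAGIC.length];
        int filled = 0;
        try {
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(head, filled, PDF_MAGIC)) {
            return Kind.PDF;
        }
        if (startsWith(head, filled, ZIP_MAGIC)) {
            return Kind.ZIP_CONTAINER;
        }
        return Kind.OTHER;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        InputStream pdf = new ByteArrayInputStream(
                "%PDF-1.7 rest of file".getBytes(StandardCharsets.US_ASCII));
        InputStream txt = new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(sniff(pdf)); // PDF
        System.out.println(sniff(txt)); // OTHER
    }
}
```

Note that sniff() consumes the bytes it reads, so in real code you would request a fresh stream from the reader before handing the content to a PDF or Office library.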

Actually, Alfresco has its own readers, transformers and all the rest of the machinery needed to read any content. I use this code and it works:

    ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
    String content = "";
    if (reader != null && reader.exists()) {
        // get a transformer from the source mimetype to plain text
        ContentTransformer transformer = contentService.getTransformer(
                reader.getMimetype(), MimetypeMap.MIMETYPE_TEXT_PLAIN);
        // only proceed if a suitable transformer is available
        if (transformer != null) {
            ContentWriter writer = contentService.getTempWriter();
            writer.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
            try {
                transformer.transform(reader, writer);
                // point the reader to the newly written content
                reader = writer.getReader();
                // check that the reader is a view onto something concrete
                if (!reader.exists()) {
                    String message = "The transformation did not write any content, yet: \n"
                            + "   transformer: " + transformer + "\n"
                            + "   temp writer: " + writer;
                    logging(message);
                    throw new ContentIOException(message);
                } else {
                    content = reader.getContentString();
                    logging("------------------------------------------------------------");
                    logging(content);
                }
            } catch (ContentIOException e) {
                // report the failure instead of swallowing it silently
                logging("Transformation failed: " + e);
            }
        }
    }