Reading PDF files in Alfresco
I have added a UI action that receives the nodeRef as a parameter. I am trying to read the content with:
contentService.getReader(nodeRef, ContentModel.TYPE_CONTENT).getContentString()
But this only works for .txt files, not for .pdf, .xlsx, .docx and so on. For example, when I try to read a PDF file, it gives me:
S#?_#3C?? R/Metadata 64 0 R/OCProperties<</D<</Order[]/R .....
With Word documents and Excel files, it returns numbers.
Is there a solution?
Yes. Try using contentService.getReader(nodeRef, ContentModel.TYPE_CONTENT).getContentInputStream()
instead. If you need to find some text, feed the data from the stream to a PDF library and use its API to access the content.
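The garbled output in the question is what binary PDF bytes look like when they are decoded as a string, which is why the content should be handled as a stream. As a minimal sketch of working with the raw bytes, the snippet below checks whether a stream starts with the "%PDF-" signature that a well-formed PDF normally begins with. ContentSniffer and looksLikePdf are hypothetical names, not part of the Alfresco API, and the byte arrays in main stand in for what getContentInputStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Alfresco API.
final class ContentSniffer {
    // A well-formed PDF normally begins with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private ContentSniffer() {}

    // Returns true if the stream starts with the PDF signature.
    // Consumes up to five bytes, so open a fresh stream before parsing.
    static boolean looksLikePdf(InputStream in) {
        try {
            byte[] head = in.readNBytes(PDF_MAGIC.length); // requires Java 11+
            return Arrays.equals(head, PDF_MAGIC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the bytes a content reader's input stream would supply.
        byte[] pdf = "%PDF-1.4\n1 0 obj".getBytes(StandardCharsets.ISO_8859_1);
        byte[] txt = "plain text".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikePdf(new ByteArrayInputStream(pdf)));
        System.out.println(looksLikePdf(new ByteArrayInputStream(txt)));
    }
}
```

Inside the action, reader.getMimetype() is probably the simpler way to decide how to treat the content; a byte check like this is only a fallback when the stored mimetype cannot be trusted. Once the content is known to be a PDF, a fresh input stream can be handed to the PDF library of your choice.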
Actually, Alfresco has its own readers, transformers and everything else needed to read any kind of content. I use this code and it works:
ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
String content = "";
if (reader != null && reader.exists()) {
    // get the transformer
    ContentTransformer transformer = contentService.getTransformer(reader.getMimetype(), MimetypeMap.MIMETYPE_TEXT_PLAIN);
    // is this transformer good enough?
    if (transformer != null) {
        // We have a transformer that is fast enough
        ContentWriter writer = contentService.getTempWriter();
        writer.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
        try {
            transformer.transform(reader, writer);
            // point the reader to the new-written content
            reader = writer.getReader();
            // Check that the reader is a view onto something concrete
            if (!reader.exists()) {
                logging(new ContentIOException("The transformation did not write any content, yet: \n"
                        + " transformer: " + transformer + "\n"
                        + " temp writer: " + writer + "") + "");
                throw new ContentIOException("The transformation did not write any content, yet: \n"
                        + " transformer: " + transformer + "\n"
                        + " temp writer: " + writer);
            } else {
                content = reader.getContentString();
                logging("------------------------------------------------------------");
                logging(content);
            }
        } catch (ContentIOException e) {
        }
    }
}