
A Lucene Word-parsing utility class in Java

 2013/9/9 19:06:56  rain_2372  程序员俱乐部 (CXYCLUB)
  • Tags: tools, Java, parsing, Lucene
A utility class for parsing Word files in Java for use with Lucene (it reads a Word document so that its data can be searched):
package extract;

import java.io.*;
import org.textmining.text.extraction.WordExtractor;

public class ExtractorWord {

    /**
     * Extracts the plain text of a Word (.doc) file.
     *
     * @param file path to the .doc file
     * @return the extracted text, or an empty string if extraction fails
     */
    public static String getText(String file) {
        String s = "";
        // try-with-resources closes the stream even when extraction fails
        try (FileInputStream in = new FileInputStream(new File(file))) {
            WordExtractor extractor = new WordExtractor();
            s = extractor.extractText(in);
        } catch (Exception e) {
            // extractText declares Exception, which also covers IOException
            e.printStackTrace();
        }
        return s;
    }

    /**
     * Extracts the text of a Word file and writes it to a text file.
     *
     * @param doc      path to the source .doc file
     * @param filename path of the text file to write
     */
    public static void toTextFile(String doc, String filename) throws Exception {
        String s = getText(doc);
        // The writer is closed automatically, so no explicit close() is needed
        try (PrintWriter pw = new PrintWriter(new FileWriter(new File(filename)))) {
            pw.write(s);
            pw.flush();
            System.out.print("File written successfully!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Prints the text of a sample document, then saves it as a .txt file.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        try {
            String sc = getText("D:/workspace/testsearch2/htmls/ddd.doc");
            System.out.print(sc);
            toTextFile("D:/workspace/testsearch2/htmls/ddd.doc", "D:/workspace/testsearch2/htmls/ddd.txt");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
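
One detail of toTextFile worth noting: it writes through FileWriter, which on Java versions before 18 uses the platform's default character encoding, so text extracted from a Chinese document may not round-trip the same way on every machine. The following is a minimal sketch, using only the standard library, of writing the extracted text with an explicit UTF-8 encoding instead. The class name Utf8TextFiles and the methods writeUtf8 and readUtf8 are hypothetical names chosen for illustration; they are not part of the original class or of any library.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of the original article.
final class Utf8TextFiles {

    // Writes text to target as UTF-8, creating the file or replacing its contents.
    static void writeUtf8(String text, Path target) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    // Reads the whole file back, decoding it as UTF-8.
    static String readUtf8(Path source) throws IOException {
        return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the ddd.txt output path used above.
        Path tmp = Files.createTempFile("extracted", ".txt");
        writeUtf8("示例文本 sample text", tmp);
        System.out.println(readUtf8(tmp));
        Files.delete(tmp);
    }
}
```

In toTextFile, the string returned by getText could be passed to such a helper in place of the PrintWriter/FileWriter pair. Whether UTF-8 is the right target encoding depends on what will read the text file afterwards.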

Reposted from http://www.ablanxue.com/prone_3331_1.html