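Below is a complete, copy-and-run example of Lucene 8.5.0 with FastVectorHighlighter (JDK 8+), covering indexing, querying, and highlighting end to end.
> Key point: the field to be highlighted must
1. Store the original content (`setStored(true)`)
2. Enable term vectors with positions and offsets (`setStoreTermVectors(true)` + `setStoreTermVectorPositions(true)` + `setStoreTermVectorOffsets(true)`)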
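
To see why both requirements matter, it helps to picture highlighting as "stored text plus character offsets". The following is a minimal, standard-library-only sketch of that idea; it is not Lucene's implementation, and the class and method names (`OffsetHighlightSketch`, `highlight`) are hypothetical. The stored text plays the role of the stored field, and the offset pairs play the role of the offsets recorded in the term vector.

```java
import java.util.Collections;
import java.util.List;

public class OffsetHighlightSketch {

    /**
     * Wraps each [start, end) character range of the text in the given tags.
     * Assumes the ranges are sorted by start and do not overlap.
     */
    static String highlight(String text, List<int[]> offsets, String pre, String post) {
        StringBuilder out = new StringBuilder();
        int cursor = 0;
        for (int[] range : offsets) {
            out.append(text, cursor, range[0]);        // untouched text before the match
            out.append(pre)
               .append(text, range[0], range[1])       // the matched span itself
               .append(post);
            cursor = range[1];
        }
        out.append(text, cursor, text.length());       // remainder after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String stored = "fast vector highlighting";
        // Offsets 5..11 cover the word "vector".
        List<int[]> offsets = Collections.singletonList(new int[] {5, 11});
        System.out.println(highlight(stored, offsets, "<b>", "</b>"));
        // prints: fast <b>vector</b> highlighting
    }
}
```

Without the stored text there is nothing to cut fragments from, and without offsets there is no way to know where to place the tags, which is why the field type needs both.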
---
1. Maven dependencies (Lucene 8.5.0)
```xml
<dependencies>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>8.5.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-analyzers-common</artifactId>
        <version>8.5.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-highlighter</artifactId>
        <version>8.5.0</version>
    </dependency>
</dependencies>
```
---
2. Java example code
```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.search.*;
import org.apache.lucene.search.vectorhighlight.*;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class FastVectorHighlighterDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory();
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());

        // 1. Define the field type: stored + tokenized + term vectors
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);                    // keep the original text
        fieldType.setTokenized(true);                 // run the analyzer
        fieldType.setStoreTermVectors(true);          // required
        fieldType.setStoreTermVectorPositions(true);  // required
        fieldType.setStoreTermVectorOffsets(true);    // required
        // The field must be indexed for term vectors to be stored.
        fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        fieldType.freeze();

        // 2. Add a document; closing the writer commits the changes
        try (IndexWriter writer = new IndexWriter(dir, cfg)) {
            Document doc = new Document();
            doc.add(new Field("title", "Lucene 8.5.0 FastVectorHighlighter示例", fieldType));
            doc.add(new Field("body",
                    "Lucene是一个高效的全文检索库。FastVectorHighlighter利用TermVector实现高速高亮。", fieldType));
            writer.addDocument(doc);
        }

        // 3. Search and highlight
        try (IndexReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // StandardAnalyzer indexes every Han character as its own token, so a
            // TermQuery on a whole word such as "全文检索" would match nothing.
            // Query each word as a phrase of single-character terms instead.
            Query query = new BooleanQuery.Builder()
                    .add(new PhraseQuery("body", "全", "文", "检", "索"), BooleanClause.Occur.SHOULD)
                    .add(new PhraseQuery("body", "高", "亮"), BooleanClause.Occur.SHOULD)
                    .build();
            TopDocs topDocs = searcher.search(query, 10);
            if (topDocs.scoreDocs.length == 0) {
                System.out.println("No hits");
                return;
            }
            int docId = topDocs.scoreDocs[0].doc;

            // 4. Use FastVectorHighlighter (phraseHighlight = true, fieldMatch = true)
            FastVectorHighlighter highlighter = new FastVectorHighlighter(true, true,
                    new SimpleFragListBuilder(5),
                    new ScoreOrderFragmentsBuilder(
                            BaseFragmentsBuilder.COLORED_PRE_TAGS,
                            BaseFragmentsBuilder.COLORED_POST_TAGS));
            FieldQuery fieldQuery = highlighter.getFieldQuery(query);
            // Up to 3 fragments of roughly 100 characters each
            String[] frags = highlighter.getBestFragments(fieldQuery, reader, docId,
                    "body", 100, 3);

            // 5. Print the result
            System.out.println("Title: " + reader.document(docId).get("title"));
            if (frags != null) {
                for (String f : frags) {
                    System.out.println("Fragment: " + f);
                }
            }
        }
    }
}
```
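
Query terms only match if they are identical to the tokens the analyzer wrote into the index, so it is worth checking how the analyzer splits the sample text before building a query. The sketch below prints the tokens for a short string; the class and method names (`TokenDump`, `tokens`) are hypothetical, and the field name passed to `tokenStream` is only a label here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class TokenDump {

    /** Returns the terms the analyzer emits for the given text, in order. */
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();                       // mandatory before the first incrementToken()
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            System.out.println(tokens(analyzer, "全文检索 Lucene"));
            // prints: [全, 文, 检, 索, lucene]
        }
    }
}
```

Because StandardAnalyzer emits one token per Han character, a multi-character Chinese word has to be queried as a phrase of single characters, or the index has to be built with an analyzer that produces longer tokens.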
---
3. Output (example)
```
Title: Lucene 8.5.0 FastVectorHighlighter示例
Fragment: Lucene是一个高效的<b style="background:yellow">全文检索</b>库。FastVectorHighlighter利用TermVector实现高速<b style="background:lawngreen">高亮</b>。
```
---
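4. Common pitfalls

| Problem | Cause |
| --- | --- |
| Highlighting returns `null` or an empty array | The field has no term vectors (with positions and offsets), or it was not stored with `setStored(true)` |
| MultiPhraseQuery / SpanQuery matches are not highlighted | FastVectorHighlighter does not support them; switch to UnifiedHighlighter, for example in its re-analysis mode |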
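
When highlighting comes back empty, a quick way to narrow down the first pitfall is to ask the index directly whether the document has a stored value and a term vector with positions and offsets for that field. The following is a rough diagnostic sketch; `TermVectorCheck`, `isHighlightable`, and `demo` are hypothetical names, and the two sample fields exist only to show a failing and a passing case.

```java
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Terms;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class TermVectorCheck {

    /** True if the field is stored and has a term vector with positions and offsets. */
    static boolean isHighlightable(IndexReader reader, int docId, String field) throws IOException {
        boolean stored = reader.document(docId).get(field) != null;
        Terms vector = reader.getTermVector(docId, field); // null when no term vector exists
        return stored && vector != null && vector.hasPositions() && vector.hasOffsets();
    }

    /** Indexes one document with a plain stored field and a term-vector field, then checks both. */
    static boolean[] demo() throws IOException {
        FieldType withVectors = new FieldType(TextField.TYPE_STORED);
        withVectors.setStoreTermVectors(true);
        withVectors.setStoreTermVectorPositions(true);
        withVectors.setStoreTermVectorOffsets(true);
        withVectors.freeze();

        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new Field("plain", "no term vectors here", TextField.TYPE_STORED));
            doc.add(new Field("body", "term vectors enabled", withVectors));
            writer.addDocument(doc);
        }
        try (IndexReader reader = DirectoryReader.open(dir)) {
            return new boolean[] {
                isHighlightable(reader, 0, "plain"),
                isHighlightable(reader, 0, "body")
            };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = demo();
        System.out.println("plain: " + result[0] + ", body: " + result[1]);
        // prints: plain: false, body: true
    }
}
```

If the check fails for a field that was supposed to be highlightable, the documents were most likely indexed with an older field type and need to be reindexed.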
---
Copy it into your IDE and it should run as is. Happy coding!