[Yanel-dev] Enhancing Yarep Indexer/Searcher interface
Guillaume Déflache
guillaume.deflache at wyona.com
Thu Aug 6 12:53:13 CEST 2009
Hi!
Michael Wechner wrote:
> Hi
>
> At the moment one has the following searcher interface
>
> Node[] Repository.getSearcher().search(QUERY);
Does QUERY include pagination ATM?
> and with the index the PATH/URL is saved and the FULLTEXT is indexed.
>
> This is all nice and simple, BUT .... ;-)
>
> In most common search engines one receives the following search result
> structure:
>
> Title of Document
> Excerpt of Document
> Path/URL of Document
> Mime-Type of Document
> Last Modified of Document
What would be the types there? Probably String for most of them, and maybe
a java.lang.Long timestamp for LMD?
> which means if we also want to provide this, then we need to reparse
> each Node which has been found, which is not
> so nice (performance wise and also code wise).
>
> Hence I would suggest that we enhance the Indexer/Searcher interface by
> adding the fields above and introducing methods like
>
> Result[] Repository.getSearcher().search(QUERY)
If QUERY does not include paging there, we'd better return a
java.lang.Iterable<Result> to make the API easier to use (e.g. with a
Java 5 for loop) but mostly to be able to load the results lazily if
needed. We would also need a startIndex and maxCount, even if we do not
implement them at once.
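To make that concrete, here is a rough sketch of a lazily evaluated, paged
search. Everything in it is an assumption for illustration, not existing
Yarep code: the search(query, startIndex, maxCount) signature, the
InMemorySearcher class and the substring match standing in for the real
index lookup.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical result type; only the title is modelled here. */
interface Result {
    String getTitle();
}

/** Hypothetical searcher signature with explicit paging parameters. */
interface Searcher {
    Iterable<Result> search(String query, int startIndex, int maxCount);
}

/** Toy implementation: "searches" a list of titles by substring match. */
class InMemorySearcher implements Searcher {
    private final List<String> titles;

    InMemorySearcher(List<String> titles) {
        this.titles = new ArrayList<String>(titles);
    }

    public Iterable<Result> search(final String query, final int startIndex, final int maxCount) {
        return new Iterable<Result>() {
            public Iterator<Result> iterator() {
                return new Iterator<Result>() {
                    private int cursor = 0;    // next position to inspect in titles
                    private int matched = 0;   // matches seen so far, including skipped ones
                    private int returned = 0;  // results already handed to the caller
                    private Result next = null;

                    // Looks for the next match only when the caller asks for it.
                    private void advance() {
                        while (next == null && returned < maxCount && cursor < titles.size()) {
                            final String title = titles.get(cursor++);
                            // Skip the first startIndex matches, then start returning.
                            if (title.contains(query) && matched++ >= startIndex) {
                                next = new Result() {
                                    public String getTitle() { return title; }
                                };
                            }
                        }
                    }

                    public boolean hasNext() {
                        advance();
                        return next != null;
                    }

                    public Result next() {
                        advance();
                        if (next == null) {
                            throw new NoSuchElementException();
                        }
                        Result current = next;
                        next = null;
                        returned++;
                        return current;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```

A caller would then simply write
for (Result r : searcher.search("yanel", 10, 10)) { ... }
and no match past the requested page is ever materialised.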
> whereas Result has methods like
>
> Result.getTitle()
> Result.getExcerpt()
> etc.
We could also use:
String Result.getMetadata(String aDublinCoreOrWhateverRDFpropertyURI)
...or maybe both: the hard-coded ones because API users will most
probably need them, and the generic one for extensibility?
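For the "both" variant, a minimal sketch could look like the following. The
getter names beyond getTitle() and getExcerpt(), the MapResult class and the
choice to back the title and MIME type by Dublin Core properties are all my
assumptions, only meant to show how the fixed getters and the generic one
could coexist.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Result contract: fixed getters plus one generic lookup. */
interface Result {
    String getTitle();
    String getExcerpt();
    String getPath();
    String getMimeType();
    Long getLastModified();                  // epoch milliseconds, null if unknown
    String getMetadata(String propertyUri);  // null if the property was not indexed
}

/** Hypothetical implementation backed by a map of property URI to value. */
class MapResult implements Result {
    static final String DC_TITLE = "http://purl.org/dc/elements/1.1/title";
    static final String DC_FORMAT = "http://purl.org/dc/elements/1.1/format";

    private final String path;
    private final String excerpt;
    private final Long lastModified;
    private final Map<String, String> properties;

    MapResult(String path, String excerpt, Long lastModified, Map<String, String> properties) {
        this.path = path;
        this.excerpt = excerpt;
        this.lastModified = lastModified;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.properties = new HashMap<String, String>(properties);
    }

    // The convenience getters for title and MIME type delegate to the generic lookup.
    public String getTitle() { return getMetadata(DC_TITLE); }
    public String getMimeType() { return getMetadata(DC_FORMAT); }

    public String getExcerpt() { return excerpt; }
    public String getPath() { return path; }
    public Long getLastModified() { return lastModified; }

    public String getMetadata(String propertyUri) {
        return properties.get(propertyUri);
    }
}
```

That way the indexer can store additional properties later without the
Result interface having to change.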
> WDYT?
>
> Thanks
>
> Michi
HTH,
Guillaume