Ticket #309 (closed Feature Request: fixed)
string contains search instead of substring
Reported by: | Malte | Owned by: | jri |
---|---|---|---|
Priority: | Critical | Milestone: | Release 4.1 |
Component: | 3rd Party Plugins | Version: | 4.0.11 |
Keywords: | | Cc: | dgf |
Complexity: | 3 | Area: | Application Framework / API |
Module: | | | |
Description
We need to find values by a "contains" search and not by a substring, so that, e.g., a topic named "Algebraische Kongruente Funktionen" is returned as a search result for the query "Funktionen".
Is this configurable at the moment, or do we need to implement our own search?
All the respective fields we are searching through were modelled with the following declarative settings:
"index_mode_uris": ["dm4.core.fulltext", "dm4.core.fulltext_key"],
Any hint appreciated.
Change History
comment:2 Changed 12 years ago by Malte
Thanks for your quick reply. I am writing a new client where the search is currently based on the search_topics() method of dm4-webclient's rest_client, which queries /core/topic?{parameter}
So, I want to do a "contains"-search (which searches for matches over topics of three different topic types), using the REST-API.
comment:3 Changed 12 years ago by jri
OK.
The REST API already supports what you call a "contains" search.
In fulltext-indexed values (dm4.core.fulltext or dm4.core.fulltext_key) you can search for every single word.
So, you *will* find "Algebraische Kongruente Funktionen" when you search for "Funktionen".
What does your request look like?
Example: create a Person with Last Name "Algebraische Kongruente Funktionen".
(I just use Person because its Last Name is indexed exactly like your fields.)
This request will find that Last Name topic:
GET /core/topic?search=Funktionen&field=dm4.contacts.last_name
BTW: the "wholeword" parameter is false by default. So this search returns a match as well:
GET /core/topic?search=Funk&field=dm4.contacts.last_name
To match whole words only, add wholeword=true to the request (no result here):
GET /core/topic?search=Funk&field=dm4.contacts.last_name&wholeword=true
Note: the "field" parameter is optional. When you omit it you will search in *all* fields indexed as dm4.core.fulltext.
With the field parameter specified you search only in this field. This requires the index mode dm4.core.fulltext_key.
You can use both index modes at the same time (as you do).
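The example requests above differ only in their query parameters. As a minimal sketch, the query string could be assembled like this in Python. The helper name build_search_path is hypothetical and not part of DeepaMehta; only the path /core/topic and the parameters search, field and wholeword are taken from this ticket.

```python
from urllib.parse import urlencode


def build_search_path(term, field_uri=None, whole_word=False):
    """Return path and query string for a /core/topic search request.

    Hypothetical helper for illustration only.
    """
    params = {"search": term}
    if field_uri is not None:
        # With "field" set, only this field is searched
        # (requires index mode dm4.core.fulltext_key).
        params["field"] = field_uri
    if whole_word:
        # "wholeword" is false by default, so it is only sent when enabled.
        params["wholeword"] = "true"
    return "/core/topic?" + urlencode(params)


print(build_search_path("Funktionen", "dm4.contacts.last_name"))
print(build_search_path("Funk", "dm4.contacts.last_name", whole_word=True))
print(build_search_path("Funktionen"))  # searches all dm4.core.fulltext fields
```

Using urlencode also takes care of escaping search terms that contain spaces or umlauts.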
However, what you call "matches over topics of three different topic types" is not possible.
You have to perform 3 searches consecutively and then manually combine the result.
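Combining the three result sets on the client side could look roughly like the following Python sketch. It assumes that each search returns a list of topics and that every topic carries a unique "id"; the response format is not shown in this ticket, so that shape is an assumption. The names search_across_fields and fake_search and the field URIs example.field_b and example.field_c are hypothetical.

```python
def search_across_fields(search_fn, term, field_uris):
    """Run one search per field and merge the hits, dropping duplicate IDs.

    search_fn(term, field_uri) is expected to return a list of dicts,
    each with an "id" key (assumed shape, see the note above).
    """
    merged = {}
    for field_uri in field_uris:
        for topic in search_fn(term, field_uri):
            # Keep the first occurrence of each topic ID.
            merged.setdefault(topic["id"], topic)
    return list(merged.values())


def fake_search(term, field_uri):
    """Stand-in for the real REST call, returning canned results."""
    canned = {
        "dm4.contacts.last_name": [{"id": 1, "value": "Algebraische Kongruente Funktionen"}],
        "example.field_b": [{"id": 2, "value": "Funktionen und Graphen"}],
        "example.field_c": [{"id": 1, "value": "Algebraische Kongruente Funktionen"}],
    }
    return canned.get(field_uri, [])


fields = ["dm4.contacts.last_name", "example.field_b", "example.field_c"]
for topic in search_across_fields(fake_search, "Funktionen", fields):
    print(topic["id"], topic["value"])
```

In a real client, fake_search would be replaced by a function that issues one GET /core/topic request per field.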
Hope this helps.
comment:4 Changed 12 years ago by jri
One more thing: what is supported is not exactly a "contains" search, but rather a "beginning of each word" search.
A true "contains" search would be positive for "ktionen" as well. This is not supported by Lucene. You *can* use wildcards ("*", "?") but they are not allowed as first character of a word. (I don't think this is what you asked for either).
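The difference between the two behaviours can be illustrated with a simplified Python model. This is not how Lucene works internally; it only mimics the observable results described in this thread, and it assumes that values are split on whitespace and compared case-insensitively. Both function names are hypothetical.

```python
def word_prefix_match(value, term, whole_word=False):
    """Simplified model of the supported search: the term must match
    the beginning of a word, or the whole word if whole_word is set."""
    term = term.lower()
    words = value.lower().split()
    if whole_word:
        return term in words
    return any(word.startswith(term) for word in words)


def contains_match(value, term):
    """A true "contains" search: the term may occur anywhere in the value."""
    return term.lower() in value.lower()


name = "Algebraische Kongruente Funktionen"
print(word_prefix_match(name, "Funktionen"))             # a whole word
print(word_prefix_match(name, "Funk"))                   # beginning of a word
print(word_prefix_match(name, "Funk", whole_word=True))  # not a whole word
print(word_prefix_match(name, "ktionen"))                # middle of a word
print(contains_match(name, "ktionen"))                   # plain substring test
```

Only the last call treats "ktionen" as a hit, which is the case the supported search does not cover.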
comment:5 Changed 12 years ago by Malte
Thanks for your insightful clarifications. I now also see the real issue behind my assumption that this search is not supported yet: our usage of the REST API is already correct, but sometimes the search fails with an internal server error. Here is what I found out. Maybe you can give me an idea of what happens here. Thanks very much in advance!
The failing request causes my dm-service to log:
18.09.2012 11:40:40 de.deepamehta.core.impl.service.EmbeddedService searchTopics
WARNUNG: ROLLBACK!
18.09.2012 11:40:40 com.sun.jersey.spi.container.ContainerResponse logException
SCHWERWIEGEND: Mapped exception to response: 500 (Internal Server Error)
javax.ws.rs.WebApplicationException: java.lang.RuntimeException: Searching topics failed (searchTerm="Funktionen", fieldUri="tub.eduzen.excercise_name", wholeWord=false, clientState={dm4_topicmap_id=10631, dm4_workspace_id=9676, mjx.fontWarn=warned%3Atrue, dm4_username=admin})
    at de.deepamehta.plugins.webservice.WebservicePlugin.searchTopics(WebservicePlugin.java:109)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.ResourceObjectRule.accept(ResourceObjectRule.java:100)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
    at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
    at org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
    at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
    at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
    at org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
    at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
    at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.RuntimeException: Searching topics failed (searchTerm="Funktionen", fieldUri="tub.eduzen.excercise_name", wholeWord=false, clientState={dm4_topicmap_id=10631, dm4_workspace_id=9676, mjx.fontWarn=warned%3Atrue, dm4_username=admin})
    at de.deepamehta.core.impl.service.EmbeddedService.searchTopics(EmbeddedService.java:152)
    at de.deepamehta.plugins.webservice.WebservicePlugin.searchTopics(WebservicePlugin.java:107)
    ... 41 more
Caused by: java.lang.IllegalArgumentException: ID 26986 refers not to a MehtaNode but to a MehtaEdge
    at de.deepamehta.mehtagraph.impl.Neo4jBase.buildMehtaNode(Neo4jBase.java:71)
    at de.deepamehta.mehtagraph.impl.Neo4jMehtaGraph.queryMehtaNodes(Neo4jMehtaGraph.java:113)
    at de.deepamehta.core.impl.storage.MGStorageBridge.searchTopics(MGStorageBridge.java:146)
    at de.deepamehta.core.impl.service.EmbeddedService.searchTopics(EmbeddedService.java:147)
    ... 42 more
while searching for another topic type, with the same index mode, succeeds as you described here.
comment:6 Changed 12 years ago by jri
- Cc dgf added
Oh, yes, that's a different story ;-)
I encounter this search issue too once in a while.
It looks like DM doesn't maintain the Lucene indexes properly in every situation. I guess ID 26986 is a stale index entry. The respective topic has been deleted in the meantime and Neo4j reassigned that ID to another object, now a DM association (MehtaEdge).
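The suspected failure mode can be shown with a toy model in Python. This is neither DeepaMehta code nor the workaround referenced in #302; it only illustrates how an index hit can point to an ID that now belongs to an edge, and one conceivable way to skip such hits instead of failing. The dict-based graph, the function name split_index_hits and all IDs except 26986 are made up for the example.

```python
def split_index_hits(hit_ids, graph):
    """Separate index hits into live node IDs and stale entries.

    graph maps an object ID to its kind, "node" or "edge" (toy model).
    An ID that is missing or refers to an edge counts as stale.
    """
    nodes, stale = [], []
    for object_id in hit_ids:
        if graph.get(object_id) == "node":
            nodes.append(object_id)
        else:
            stale.append(object_id)
    return nodes, stale


# ID 26986 once belonged to a topic (node); after the topic was deleted,
# the ID was reused for an association (edge), but the index still lists it.
graph = {100: "node", 200: "node", 26986: "edge"}
index_hits = [100, 26986, 200]

nodes, stale = split_index_hits(index_hits, graph)
print("usable hits:", nodes)
print("stale index entries:", stale)
```

Skipping stale hits only hides the symptom; the index entry itself would still have to be removed when the topic is deleted.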
Sorry, no solution for the moment.
I have to investigate it further.
Can you confirm that searching in tub.eduzen.excercise_name with other terms works correctly?
comment:7 Changed 12 years ago by jri
- Status changed from new to closed
- Resolution set to fixed
There is now a workaround in Neo4j MehtaGraph. See #302.
This should work for the moment.
Just to clarify: do you mean the DM Webclient, the Java API, or the REST API?