Ticket #776 (closed Enhancement: worksforme)
dm4-wdtk: for bulk import, allow periodic writing of new lucene indexes
Reported by: | Malte | Owned by: | |
---|---|---|---|
Priority: | Major | Milestone: | Release 4.5 |
Component: | DeepaMehta Standard Distribution | Version: | 4.5 |
Keywords: | Cc: | jri | |
Complexity: | 3 | Area: | Application Framework / API |
Module: | deepamehta-core |
Description
Concerning:
http://wikidata-topics.beta.wmflabs.org/de.deepamehta.webclient
Source Code:
https://github.com/mukil/dm4-wikidata-toolkit
I have written a naive import-mechanism for wikidata dumps and deepamehta4. After running over the wikidata json-dump for 10seconds, this process starts.
INFO: Processed 328500 items from the wikidata json-dump. Mar 02, 2015 5:43:34 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Identified 91695 human beings, 2714 institutions, 3128 cities and 47 countries Mar 02, 2015 5:43:34 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: ### Start creating Topics ..
Then, later, the following error is thrown and no single topic was imported/written to db.
### Starting to process dump for 10secs .. on a fresh database. Mar 14, 2015 5:09:17 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Processed 23900 items from the wikidata json-dump. Mar 14, 2015 5:09:17 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Identified 8611 human beings, 198 institutions, 589 cities and 38 countries. Mar 14, 2015 5:09:17 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: ### Start creating Topics ... ... WARNING: object 369252 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 5:58:12 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: Skipping the import of relations, this works just with (and among) already imported items. ... ### Starting to process dump for 10secs .. for the second time on a database. Mar 14, 2015 6:09:51 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Processed 30400 items from the wikidata json-dump. Mar 14, 2015 6:09:51 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Identified 13160 human beings, 219 institutions, 634 cities and 38 countries. Mar 14, 2015 6:09:51 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: ### Start creating Topics .. ... INFO: Created new "org.deepamehta.wikidata.mentor_of" relationship for Q65466 to 366694 (Heinrich Magnus) with GUID: "Q65466$028350e2-4545-9dc2-ac05-bce12837b281" Mar 14, 2015 6:25:06 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: DeepaMehta now recognizes 13160 human beings, 219 institutions, 634 cities, 38 countries by name. Additionally DeepaMehta recorded 8399 associations among these items. Mar 14, 2015 6:25:06 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor stop INFO: Finished importing. ... ### Starting to process dump for 30secs .. for third time on the database. ... INFO: Processed 92300 items from the wikidata json-dump. Mar 14, 2015 6:53:04 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor printProcessingStatus INFO: Identified 30436 human beings, 720 institutions, 1228 cities and 44 countries. Mar 14, 2015 6:53:04 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor countEntity INFO: Timeout. Aborting processing. (approx. 80 minutes later between each "getAssocations()"-call there are 7 seconds pause, in which a person topic gets created) Mar 14, 2015 8:17:44 PM de.deepamehta.core.impl.EmbeddedService getAssociations INFO: topic1Id=1031830, topic2Id=3182, assocTypeUri="dm4.core.aggregation" Mar 14, 2015 8:17:44 PM de.deepamehta.core.impl.AccessControlImpl hasPermission WARNING: object 1031869 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 8:17:50 PM de.deepamehta.core.impl.EmbeddedService getAssociations INFO: topic1Id=1031871, topic2Id=3182, assocTypeUri="dm4.core.aggregation" Mar 14, 2015 8:17:50 PM de.deepamehta.core.impl.AccessControlImpl hasPermission WARNING: object 1031910 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 8:17:57 PM de.deepamehta.core.impl.EmbeddedService getAssociations INFO: topic1Id=1031912, topic2Id=3182, assocTypeUri="dm4.core.aggregation" Mar 14, 2015 8:17:57 PM de.deepamehta.core.impl.AccessControlImpl hasPermission WARNING: object 1031951 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 8:18:04 PM de.deepamehta.core.impl.EmbeddedService getAssociations INFO: topic1Id=1031953, topic2Id=3182, assocTypeUri="dm4.core.aggregation" Mar 14, 2015 8:18:04 PM de.deepamehta.core.impl.AccessControlImpl hasPermission WARNING: object 1031992 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 8:18:09 PM de.deepamehta.core.impl.EmbeddedService getAssociations INFO: topic1Id=1031994, topic2Id=3182, assocTypeUri="dm4.core.aggregation" Mar 14, 2015 8:18:09 PM de.deepamehta.core.impl.AccessControlImpl hasPermission (until after 140 minutes the following two error start to occur repeatingly... ) ... WARNING: object 1032033 (typeUri="dm4.core.aggregation") is not assigned to any workspace -- READ permission is granted Mar 14, 2015 9:00:45 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor createPersonTopic WARNING: Creating topic failed (topic (id=-1, uri="org.wikidata.entity.Q277049", typeUri="dm4.contacts.person", value="", childTopics={dm4.contacts.notes=topic (id=-1, uri="null", typeUri="dm4.contacts.notes", value="<p>Polish swimmer</p><p class="wd-item-footer">For more infos visit this items <a href="http://www.wikidata.org./entity/Q277049" target="_blank" title="Wikidata page for this item">page</a> on wikidata.org.</p>", childTopics={}), dm4.contacts.person_name=topic (id=-1, uri="null", typeUri="dm4.contacts.person_name", value="null", childTopics={dm4.contacts.first_name=topic (id=-1, uri="null", typeUri="dm4.contacts.first_name", value="Artur", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Wojdat", childTopics={})})})) java.lang.RuntimeException: Creating topic failed (topic (id=-1, uri="org.wikidata.entity.Q277049", typeUri="dm4.contacts.person", value="", childTopics={dm4.contacts.notes=topic (id=-1, uri="null", typeUri="dm4.contacts.notes", value="<p>Polish swimmer</p><p class="wd-item-footer">For more infos visit this items <a href="http://www.wikidata.org./entity/Q277049" target="_blank" title="Wikidata page for this item">page</a> on wikidata.org.</p>", childTopics={}), dm4.contacts.person_name=topic (id=-1, uri="null", typeUri="dm4.contacts.person_name", value="null", childTopics={dm4.contacts.first_name=topic (id=-1, uri="null", typeUri="dm4.contacts.first_name", value="Artur", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Wojdat", childTopics={})})})) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:718) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:155) at org.deepamehta.plugins.wdtk.WikidataEntityProcessor.createPersonTopic(WikidataEntityProcessor.java:494) at org.deepamehta.plugins.wdtk.WikidataEntityProcessor.stop(WikidataEntityProcessor.java:710) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.startProcesssingWikidataDumpfile(WikidataToolkitPlugin.java:193) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.importWikidataEntities(WikidataToolkitPlugin.java:149) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.importEntitiesFromWikidataDump(WikidataToolkitPlugin.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceObjectRule.accept(ResourceObjectRule.java:100) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503) at org.ops4j.pax.web.service.jetty.internal.HttpServiceServletHandler.doHandle(HttpServiceServletHandler.java:69) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086) at org.ops4j.pax.web.service.jetty.internal.HttpServiceContext.doHandle(HttpServiceContext.java:240) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.ops4j.pax.web.service.jetty.internal.JettyServerHandlerCollection.handle(JettyServerHandlerCollection.java:77) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot flush at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3599) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3588) at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:452) at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:401) at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:428) at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:448) at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:391) at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:497) at org.apache.lucene.index.IndexReader.reopen(IndexReader.java:681) at org.neo4j.index.impl.lucene.FullTxData.searcher(FullTxData.java:348) at org.neo4j.index.impl.lucene.FullTxData.internalQuery(FullTxData.java:199) at org.neo4j.index.impl.lucene.FullTxData.get(FullTxData.java:410) at org.neo4j.index.impl.lucene.TxDataHolder.get(TxDataHolder.java:56) at org.neo4j.index.impl.lucene.LuceneTransaction.getAddedIds(LuceneTransaction.java:214) at org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:253) at org.neo4j.index.impl.lucene.LuceneIndex.get(LuceneIndex.java:207) at de.deepamehta.storage.neo4j.Neo4jStorage.checkUriUniqueness(Neo4jStorage.java:1113) at de.deepamehta.storage.neo4j.Neo4jStorage.storeTopic(Neo4jStorage.java:156) at de.deepamehta.core.impl.StorageDecorator.storeTopic(StorageDecorator.java:100) at de.deepamehta.core.impl.EmbeddedService.topicFactory(EmbeddedService.java:739) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:714) ... 52 more Mar 14, 2015 9:01:27 PM org.deepamehta.plugins.wdtk.WikidataEntityProcessor createPersonTopic WARNING: Creating topic failed (topic (id=1044697, uri="org.wikidata.entity.Q303259", typeUri="dm4.contacts.person", value="", childTopics={dm4.contacts.notes=topic (id=-1, uri="null", typeUri="dm4.contacts.notes", value="<p>Figure skater</p><p class="wd-item-footer">For more infos visit this items <a href="http://www.wikidata.org./entity/Q303259" target="_blank" title="Wikidata page for this item">page</a> on wikidata.org.</p>", childTopics={}), dm4.contacts.person_name=topic (id=1044698, uri="", typeUri="dm4.contacts.person_name", value="", childTopics={dm4.contacts.first_name=topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Sakhnovski", childTopics={})})})) java.lang.RuntimeException: Creating topic failed (topic (id=1044697, uri="org.wikidata.entity.Q303259", typeUri="dm4.contacts.person", value="", childTopics={dm4.contacts.notes=topic (id=-1, uri="null", typeUri="dm4.contacts.notes", value="<p>Figure skater</p><p class="wd-item-footer">For more infos visit this items <a href="http://www.wikidata.org./entity/Q303259" target="_blank" title="Wikidata page for this item">page</a> on wikidata.org.</p>", childTopics={}), dm4.contacts.person_name=topic (id=1044698, uri="", typeUri="dm4.contacts.person_name", value="", childTopics={dm4.contacts.first_name=topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Sakhnovski", childTopics={})})})) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:718) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:155) at org.deepamehta.plugins.wdtk.WikidataEntityProcessor.createPersonTopic(WikidataEntityProcessor.java:494) at org.deepamehta.plugins.wdtk.WikidataEntityProcessor.stop(WikidataEntityProcessor.java:710) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.startProcesssingWikidataDumpfile(WikidataToolkitPlugin.java:193) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.importWikidataEntities(WikidataToolkitPlugin.java:149) at org.deepamehta.plugins.wdtk.WikidataToolkitPlugin.importEntitiesFromWikidataDump(WikidataToolkitPlugin.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceObjectRule.accept(ResourceObjectRule.java:100) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503) at org.ops4j.pax.web.service.jetty.internal.HttpServiceServletHandler.doHandle(HttpServiceServletHandler.java:69) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086) at org.ops4j.pax.web.service.jetty.internal.HttpServiceContext.doHandle(HttpServiceContext.java:240) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.ops4j.pax.web.service.jetty.internal.JettyServerHandlerCollection.handle(JettyServerHandlerCollection.java:77) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Storing the child topics of object 1044697 failed ({dm4.contacts.notes=topic (id=-1, uri="null", typeUri="dm4.contacts.notes", value="<p>Figure skater</p><p class="wd-item-footer">For more infos visit this items <a href="http://www.wikidata.org./entity/Q303259" target="_blank" title="Wikidata page for this item">page</a> on wikidata.org.</p>", childTopics={}), dm4.contacts.person_name=topic (id=1044698, uri="", typeUri="dm4.contacts.person_name", value="", childTopics={dm4.contacts.first_name=topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Sakhnovski", childTopics={})})}) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:277) at de.deepamehta.core.impl.ValueStorage.storeValue(ValueStorage.java:107) at de.deepamehta.core.impl.EmbeddedService.topicFactory(EmbeddedService.java:740) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:714) ... 52 more Caused by: java.lang.RuntimeException: Creating topic failed (topic (id=1044698, uri="", typeUri="dm4.contacts.person_name", value="", childTopics={dm4.contacts.first_name=topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Sakhnovski", childTopics={})})) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:718) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:155) at de.deepamehta.core.impl.ValueStorage.storeCompositionOne(ValueStorage.java:310) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:290) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:274) ... 55 more Caused by: java.lang.RuntimeException: Storing the child topics of object 1044698 failed ({dm4.contacts.first_name=topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={}), dm4.contacts.last_name=topic (id=-1, uri="null", typeUri="dm4.contacts.last_name", value="Sakhnovski", childTopics={})}) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:277) at de.deepamehta.core.impl.ValueStorage.storeValue(ValueStorage.java:107) at de.deepamehta.core.impl.EmbeddedService.topicFactory(EmbeddedService.java:740) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:714) ... 59 more Caused by: java.lang.RuntimeException: Creating topic failed (topic (id=1044699, uri="", typeUri="dm4.contacts.first_name", value="Sergei", childTopics={})) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:718) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:155) at de.deepamehta.core.impl.ValueStorage.storeCompositionOne(ValueStorage.java:310) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:290) at de.deepamehta.core.impl.ValueStorage.storeChildTopics(ValueStorage.java:274) ... 62 more Caused by: java.lang.RuntimeException: Associating topic 1044699 with topic type "dm4.contacts.first_name" failed at de.deepamehta.core.impl.EmbeddedService.createTopicInstantiation(EmbeddedService.java:594) at de.deepamehta.core.impl.EmbeddedService.topicFactory(EmbeddedService.java:741) at de.deepamehta.core.impl.EmbeddedService.createTopic(EmbeddedService.java:714) ... 66 more Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot flush at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3599) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3588) at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:452) at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:401) at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:428) at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:448) at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:391) at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:497) at org.apache.lucene.index.IndexReader.reopen(IndexReader.java:681) at org.neo4j.index.impl.lucene.FullTxData.searcher(FullTxData.java:348) at org.neo4j.index.impl.lucene.FullTxData.internalQuery(FullTxData.java:199) at org.neo4j.index.impl.lucene.FullTxData.get(FullTxData.java:410) at org.neo4j.index.impl.lucene.TxDataHolder.get(TxDataHolder.java:56) at org.neo4j.index.impl.lucene.LuceneTransaction.getAddedIds(LuceneTransaction.java:214) at org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:253) at org.neo4j.index.impl.lucene.LuceneIndex.get(LuceneIndex.java:207) at de.deepamehta.storage.neo4j.Neo4jStorage.fetchTopicNodeByUri(Neo4jStorage.java:1024) at de.deepamehta.storage.neo4j.Neo4jStorage.fetchTopicPlayerNode(Neo4jStorage.java:937) at de.deepamehta.storage.neo4j.Neo4jStorage.fetchPlayerNode(Neo4jStorage.java:927) at de.deepamehta.storage.neo4j.Neo4jStorage.storePlayerRelationship(Neo4jStorage.java:917) at de.deepamehta.storage.neo4j.Neo4jStorage.storeAssociation(Neo4jStorage.java:275) at de.deepamehta.core.impl.StorageDecorator.storeAssociation(StorageDecorator.java:269) at de.deepamehta.core.impl.EmbeddedService.createTopicInstantiation(EmbeddedService.java:590) ... 68 more
As jri once already mentioned: A mechanism for bulk-importing topics and associations might be needed to prevent errors like this, or? Maybe we could/should need periodic write operations to prevent errors like this (unable to write more indexes).
Question: May the writing of new lucene indexes be related to unfinished/open transactions? Please note that all 100 000 topics get created within a GET request (which itself is annotated as @Transactional).
What do you think?
Thanks for your time & Cheers!
Change History
comment:2 Changed 10 years ago by Malte
Please also note that running this plugin (or at least compiling it) requires to run DM 4 with Java 7 since the WDTK in use (https://github.com/Wikidata/Wikidata-Toolkit) depends on that (see also https://www.mediawiki.org/wiki/Wikidata_Toolkit).
BTW: Compiling and running DM 4 with Java 7 doesn't seem to be a problem in itself, operation looks smooth.
comment:3 Changed 10 years ago by jri
Yes, when creating 100.000 topics in a single transaction I can imagine that you run out-of-memory as the transaction log becomes huge.
Try this: avoid DM from creating the per-request transaction by removing the @Transactional annotation from your resource method. Instead create the transactions manually (via dms.beginTx()), for example one for every single topic.
The bulk updates I've once mentioned (#325) are related to /Migrations/. There the developer has no chance to avoid DM from creating the outer transaction. In contrast, in your case -- a resource method -- you have.
comment:4 Changed 10 years ago by Malte
- Status changed from new to closed
- Resolution set to worksforme
I changed the implementation according to your hint. The endurance test is yet to be done but i am very optimistic that this will solve the issue i described. The first release of this new wikidata-plugin is therefore about to come.
Thanks!
Having filelogging only configured, i found another exception thrown to the gogo shell.
May it help us to better track down the issue.