Ticket #913 (closed Defect: worksforme)
seting up the dm4 database on a NFS looks to be problematic
Reported by: | Malte | Owned by: | |
---|---|---|---|
Priority: | Major | Milestone: | Release 4.8 |
Component: | DeepaMehta Standard Distribution | Version: | 4.7 |
Keywords: | Cc: | jri, dgf, JuergeN | |
Complexity: | 3 | Area: | |
Module: | deepamehta-storage-neo4j |
Description
In an environment where resources like e.g the hard disk is mounted via NFS, neo4j might throw some I/O Errors and thus the platform needs an immediate restart.
Here is the responsible line for the file system where the deepamehta-db folder is located, taken from t "/etc/fstab/" output.
store.svc.fue.xyznet:/project/some-project/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,lookupcache=none,nofsc 0 0
I read a bit about each of the the mounting options present here but none of those seem related to the deployment requirements mentioned at http://neo4j.com/docs/1.9.9/deployment-requirements.html
For proper ACID behavior, the filesystem must support flush (fsync, fdatasync). Minimum ext4 (or similar) Recommended ext4, ZFS
Nov 24, 2015 9:38:34 AM de.deepamehta.core.impl.AttachedTopic _update INFO: Updating topic 32633 (new topic (id=-1, uri="null", typeUri="dm4.topicmaps.translation", value="null", childTopics={dm4.topicmaps.translation_y=topic (id=-1, uri="null", typeUri="dm4.topicmaps.translation_y", value="320", childTopics={}), relating association (id=-1, uri="null", typeUri="null", value="null", childTopics={}, null, null), dm4.topicmaps.translation_x=topic (id=-1, uri="null", typeUri="dm4.topicmaps.translation_x", value="187", childTopics={}), relating association (id=-1, uri="null", typeUri="null", value="null", childTopics={}, null, null)}), relating association (id=-1, uri="null", typeUri="null", value="null", childTopics={}, null, null)) Nov 24, 2015 9:38:34 AM de.deepamehta.core.impl.AttachedTopic _update INFO: Updating topic 32634 (new topic (id=-1, uri="null", typeUri="dm4.topicmaps.translation_x", value="187", childTopics={}), relating association (id=-1, uri="null", typeUri="null", value="null", childTopics={}, null, null)) Nov 24, 2015 9:38:34 AM de.deepamehta.core.impl.AttachedDeepaMehtaObject updateSimpleValue INFO: ### Changing simple value of topic 32634 from "-154" -> "187" Nov 24, 2015 9:38:34 AM de.deepamehta.core.impl.AttachedTopic _update INFO: Updating topic 32643 (new topic (id=-1, uri="null", typeUri="dm4.topicmaps.translation_y", value="320", childTopics={}), relating association (id=-1, uri="null", typeUri="null", value="null", childTopics={}, null, null)) Nov 24, 2015 9:38:34 AM de.deepamehta.core.impl.AttachedDeepaMehtaObject updateSimpleValue INFO: ### Changing simple value of topic 32643 from "252" -> "320" Nov 24, 2015 9:38:34 AM de.deepamehta.core.util.UniversalExceptionMapper logException SEVERE: Request "PUT /topicmap/32613/translation/187/320" failed. Responding with 500 (Internal Server Error). The original exception/error is: org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:134) at de.deepamehta.storage.neo4j.Neo4jTransactionAdapter.finish(Neo4jTransactionAdapter.java:39) at de.deepamehta.core.impl.TransactionFactory$TransactionResourceFilter$2.filter(TransactionFactory.java:91) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1427) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:722) at org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:339) at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:300) at org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:93) at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:50) at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:31) at org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:76) at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:49) at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67) at javax.servlet.http.HttpServlet.service(HttpServlet.java:722) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: org.neo4j.graphdb.TransactionFailureException: commit threw exception but status is committed? at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:476) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:400) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:119) at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:118) ... 40 more Caused by: javax.transaction.xa.XAException at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:558) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:438) ... 43 more Caused by: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: Unable to write record[3] @[27] at org.neo4j.kernel.impl.nioneo.store.PersistenceRow.writeContents(PersistenceRow.java:182) at org.neo4j.kernel.impl.nioneo.store.PersistenceRow.force(PersistenceRow.java:198) at org.neo4j.kernel.impl.nioneo.store.LockableWindow.writeOutAndClose(LockableWindow.java:64) at org.neo4j.kernel.impl.nioneo.store.LockableWindow.writeOutAndCloseIfFree(LockableWindow.java:146) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.release(PersistenceWindowPool.java:237) at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.releaseWindow(CommonAbstractStore.java:469) at org.neo4j.kernel.impl.nioneo.store.NeoStore.setRecord(NeoStore.java:484) at org.neo4j.kernel.impl.nioneo.store.NeoStore.setLastCommittedTx(NeoStore.java:422) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.applyCommit(WriteTransaction.java:498) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doCommit(WriteTransaction.java:440) at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:319) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:482) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64) at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:552) ... 44 more Caused by: java.io.IOException: Input/output error at sun.nio.ch.FileDispatcherImpl.pwrite0(Native Method) at sun.nio.ch.FileDispatcherImpl.pwrite(FileDispatcherImpl.java:66) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:89) at sun.nio.ch.IOUtil.write(IOUtil.java:65) at sun.nio.ch.FileChannelImpl.writeInternal(FileChannelImpl.java:739) at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:724) at org.neo4j.kernel.impl.nioneo.store.PersistenceRow.writeContents(PersistenceRow.java:176) ... 57 more Nov 24, 2015 9:38:35 AM de.deepamehta.core.util.UniversalExceptionMapper logException SEVERE: Request "PUT /topicmap/32613/translation/187/320" failed. Responding with 500 (Internal Server Error). The original exception/error is: org.neo4j.graphdb.TransactionFailureException: Unable to get transaction. at org.neo4j.kernel.InternalAbstractGraphDatabase.transactionRunning(InternalAbstractGraphDatabase.java:769) at org.neo4j.kernel.InternalAbstractGraphDatabase.beginTx(InternalAbstractGraphDatabase.java:742) at org.neo4j.kernel.TransactionBuilderImpl.begin(TransactionBuilderImpl.java:39) at org.neo4j.kernel.InternalAbstractGraphDatabase.beginTx(InternalAbstractGraphDatabase.java:737) at de.deepamehta.storage.neo4j.Neo4jTransactionAdapter.<init>(Neo4jTransactionAdapter.java:22) at de.deepamehta.storage.neo4j.Neo4jStorage.beginTx(Neo4jStorage.java:529) at de.deepamehta.core.impl.StorageDecorator.beginTx(StorageDecorator.java:633) at de.deepamehta.core.impl.EmbeddedService.beginTx(EmbeddedService.java:581) at de.deepamehta.core.impl.TransactionFactory$TransactionResourceFilter$1.filter(TransactionFactory.java:70) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:281) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceObjectRule.accept(ResourceObjectRule.java:100) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:722) at org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:339) at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:300) at org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:93) at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:50) at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:31) at org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:76) at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:49) at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67) at javax.servlet.http.HttpServlet.service(HttpServlet.java:722) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: javax.transaction.SystemException: TM has encountered some problem, please perform neccesary action (tx recovery/restart) at org.neo4j.kernel.impl.transaction.TxManager.assertTmOk(TxManager.java:349) at org.neo4j.kernel.impl.transaction.TxManager.getTransaction(TxManager.java:700) at org.neo4j.kernel.InternalAbstractGraphDatabase.transactionRunning(InternalAbstractGraphDatabase.java:765) ... 52 more
Do we have someone with expert knowledge about this here to confirm these speculations?
Thanks for your support!
Change History
comment:1 Changed 9 years ago by jri
- What makes you think NFS would not support fsync()?
- Can you confirm that you didn't delete any files from deepamehta-db? (#912)
comment:2 Changed 9 years ago by jri
• How do you relate the errors to NFS at all?
javax.transaction.xa.XAException javax.transaction.SystemException: TM has encountered some problem, please perform neccesary action (tx recovery/restart)
To me it looks rather like the transaction manager has a problem. Possibly due to missing transaction logs?
My hint would be to never delete transaction logs while Neo4j is running (#912), if deleting them is OK at all (I don't know about this at the moment).
Have you ever used DM/Neo4j on a NFS volume before?
comment:3 Changed 9 years ago by Malte
What makes you think NFS would not support fsync()?
I did not state so.
Can you confirm that you didn't delete any files from deepamehta-db? (#912)
Yes, absolutely. This log is from 24th of November and #912 i just discovered.
How do you relate the errors to NFS at all?
As i wrote, all just speculations and intuition.
It was the only change i made, moving the database onto the NFS (due to disk space running out in system space). Furthermore, the error occurred to me after a day and first everything seemed to be fine. Time might be a factor here too. I also thought of a not closed transaction but everything what happens is wrapped in a @Transactional statement and otherwise would have thrown an Exception (and thus ended that transaction).
Subsequently i was further speculating that, because no other activity happened during that timeframe of 24hous, the file-system might have been unmounted and mounted and this might produce such an error.
Its just a suspicion and to make it very clear again: i do not know what i am talking about here i am not an ops expert.
What works for me: I moved the DB back to its old place and i never had such an error again (as before).
When i am there the next time i will try to reproduce this.
comment:4 follow-up: ↓ 5 Changed 9 years ago by Malte
Regarding your question:
What makeswro you think NFS would not support fsync()?
In contrary, i wrote that i could not make up any relation between the requirements (after reading those docs at neo4j and informing myself about) and any of the mounting options (present in fstab on the NFS mount commands).
Originally:
I read a bit about each of the the mounting options present here but none of those seem related to the deployment requirements mentioned at http://neo4j.com/docs/1.9.9/deployment-requirements.html
comment:5 in reply to: ↑ 4 Changed 9 years ago by jri
Thank you for providing this background info!
To track this further I would search for the actual error message first:
org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: Unable to write record[3] @[27]
FYI: DM uses Neo4j 1.8.1