Ticket #912 (new Enhancement)
what are these "lucene.log.v" files for and can we delete them once in a while?
Reported by: | Malte | Owned by: | |
---|---|---|---|
Priority: | Major | Milestone: | Release 4.8 |
Component: | DeepaMehta Standard Distribution | Version: | 4.7 |
Keywords: | | Cc: | jri, JuergeN |
Complexity: | 3 | Area: | |
Module: | | | |
Description
In the "index" folder of each "deepamehta-db" there are, in my case, currently 219 files named "lucene.log.n" on disk. Are these Lucene "segments"? If so, we could probably reduce their number and size quite easily, but I guess these are some other kind of file. So what are they?
In my example these binaries make up 3.6GB of the 5.7GB overall, and I am just curious about that. If we can do something to reduce the overall size of the DB, I think we should try it (as it is cumbersome to work with).
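To put a number on this for any installation, a small script can sum up the sizes of the log files and compare them with the whole DB folder. This is only a minimal sketch: the function name `log_share` is made up for illustration, and the file name pattern `lucene.log.*` is an assumption based on the file names seen here.

```python
import os
from fnmatch import fnmatch


def log_share(db_dir, pattern="lucene.log.*"):
    """Return (bytes in files matching pattern, total bytes) below db_dir.

    The default pattern is an assumption based on the file names
    reported in this ticket; adjust it to what is actually on disk.
    """
    log_bytes = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total_bytes += size
            if fnmatch(name, pattern):
                log_bytes += size
    return log_bytes, total_bytes
```

Calling `log_share("deepamehta-db")` returns both byte counts, so the ratio of the two is the share of the DB folder taken up by these files.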
I found the following post on Stack Overflow, which seems related. It points to the neo4j-lucene implementation as being responsible for these files:
http://stackoverflow.com/questions/29656457/neo4j-database-exploding-due-to-lucene-logs-when-properties-are-added-to-nodes
Can we do something here? Thanks for your support!
Change History
comment:2 Changed 9 years ago by Malte
Yes!
So, as it looks, it might be possible to simply remove all of those lucene.log.vxxx files (below the index folder) and also all the files starting with nioneo_logical.log.vXX in the root dir of deepamehta-db.
However, to find out which of those are safe to delete, one has to look at the "last modified" times of the respective files and guess which have not been touched lately (which is vague and depends on neo4j internals).
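The "last modified" check described above could be sketched roughly like this. The script only lists candidates and deletes nothing. The function name `stale_logs`, the 30-day threshold in the usage note and the two file name patterns are assumptions for illustration. In particular, I believe `nioneo_logical.log.v*` is the spelling neo4j 1.x uses for its logical logs, so please compare it with the names actually found on disk.

```python
import os
import time
from fnmatch import fnmatch

# Assumed file name patterns for rotated logs; verify against your DB folder.
PATTERNS = ("lucene.log.v*", "nioneo_logical.log.v*")


def stale_logs(db_dir, older_than_days, patterns=PATTERNS, now=None):
    """List (path, mtime, size) of matching files not modified recently.

    Results are sorted oldest first. Nothing is deleted here; whether a
    file is really safe to remove still depends on neo4j internals.
    """
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400  # 86400 seconds per day
    found = []
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            if any(fnmatch(name, p) for p in patterns):
                path = os.path.join(root, name)
                st = os.stat(path)
                if st.st_mtime < cutoff:
                    found.append((path, st.st_mtime, st.st_size))
    return sorted(found, key=lambda item: item[1])
```

For example, `stale_logs("deepamehta-db", 30)` would list the rotated logs that have not been modified within the last 30 days. It is probably wise to stop the DB and make a backup before removing any of them.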
The size of my deepamehta-db folder has now shrunk from 5.9GB to 1.5GB and I have not yet run into trouble with this. I am so glad that I asked.
Maybe the wisest option would be to pass this neo4j configuration option (keep_logical_logs) through to dm4 users. That would be the most flexible solution for all kinds of dm4 operators.
Cheers!
comment:3 Changed 9 years ago by Malte
Here is the information from the source for neo4j version 1.8.3:
http://neo4j.com/docs/1.8.3/configuration-logical-logs.html
Logical logs in Neo4j are the journal of which operations happens and are the source of truth in scenarios where the database needs to be recovered after a crash or similar. Logs are rotated every now and then (defaults to when they surpass 25 Mb in size) and the amount of legacy logs to keep can be configured. Purpose of keeping a history of logical logs include being able to serve incremental backups as well as keeping an HA cluster running.
For any given configuration at least the latest non-empty logical log will be kept, but configuration can be supplied to control how much more to keep. There are several different means of controlling it and the format in which configuration is supplied is:

    keep_logical_logs=<true/false>
    keep_logical_logs=<amount> <type>

For example:

    # Will keep logical logs indefinitely
    keep_logical_logs=true
    # Will keep only the most recent non-empty log
    keep_logical_logs=false
    # Will keep logical logs which contains any transaction committed within 30 days
    keep_logical_logs=30 days
    # Will keep logical logs which contains any of the most recent 500 000 transactions
    keep_logical_logs=500k txs
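To make the value format quoted above a bit more concrete, here is a small sketch that reads such a value into its parts. This is not neo4j's own parser. The function name and the returned tuple are made up for illustration, and only the "k" suffix from the quoted example (500k = 500 000) is handled.

```python
def parse_keep_logical_logs(value):
    """Parse a keep_logical_logs value into (mode, amount, type).

    "true"            -> ("all", None, None)     keep logs indefinitely
    "false"           -> ("latest", None, None)  keep only the latest non-empty log
    "<amount> <type>" -> ("limit", amount, type)

    Illustrative only: the type word is not validated, and a value that
    is neither true/false nor two words raises ValueError.
    """
    text = value.strip()
    if text.lower() == "true":
        return ("all", None, None)
    if text.lower() == "false":
        return ("latest", None, None)
    amount_text, kind = text.split()
    multiplier = 1
    if amount_text[-1] in "kK":  # "500k" means 500 * 1000
        multiplier = 1000
        amount_text = amount_text[:-1]
    return ("limit", int(amount_text) * multiplier, kind)
```

With the examples from the documentation, `parse_keep_logical_logs("30 days")` gives `("limit", 30, "days")` and `parse_keep_logical_logs("500k txs")` gives `("limit", 500000, "txs")`.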
Another post [1] on the web suggests that these files are related to a neo4j setting called "keep_logical_logs".
This appears to be "the source of truth" about these files:
http://neo4j.com/docs/1.9.9/configuration-logical-logs.html
As briefly mentioned in [1], this setting should also apply to these Lucene logs.
If we are making backups of the DB, it should be safe to get rid of those files (from time to time).
[1] http://grokbase.com/t/gg/neo4j/131fmvgg0s/does-this-seem-normal-for-the-index-folder-size