Ticket #389 (closed Enhancement: fixed)

Opened 12 years ago

Last modified 12 years ago

Storage layer performance: associations should be indexed

Reported by: jri
Owned by: jri
Priority: Major
Milestone: Release 4.1
Component: DeepaMehta Standard Distribution
Version: 4.0.13
Keywords:
Cc: dgf, Malte, joern, tsc, JuergeN
Complexity: 8
Area: Performance
Module:

Description

With "real" application content, e.g. 1000 Person topics, retrieval has proven too slow.
The slowdown occurs mainly when traversing supernodes, e.g. when loading the definition of a type that has 1000 instances: all 1005 associations must then be traversed to find the 5 association definitions.

The supernode problem is described in this article (thanks to dgf for the link):
http://java.dzone.com/articles/solution-supernode-problem

The solution is to index the associations.
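
The difference between traversing and indexing can be sketched as follows. This is only a minimal illustration, not the actual storage code: the class `AssocIndexSketch`, its methods, and the type URIs "instantiation" and "assoc_def" are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names come from the DeepaMehta code base.
final class AssocIndexSketch {

    // An association attached to a node, reduced to the fields needed here.
    static final class Assoc {
        final long nodeId;       // the (super)node the association is attached to
        final long otherNodeId;  // the node at the other end
        final String typeUri;    // placeholder type URI, e.g. "assoc_def"

        Assoc(long nodeId, long otherNodeId, String typeUri) {
            this.nodeId = nodeId;
            this.otherNodeId = otherNodeId;
            this.typeUri = typeUri;
        }
    }

    private final List<Assoc> allAssocs = new ArrayList<>();
    // Index key: node ID combined with association type URI.
    private final Map<String, List<Assoc>> index = new HashMap<>();

    void add(Assoc assoc) {
        allAssocs.add(assoc);
        index.computeIfAbsent(key(assoc.nodeId, assoc.typeUri), k -> new ArrayList<>()).add(assoc);
    }

    // Without an index: every stored association is visited and filtered.
    List<Assoc> scan(long nodeId, String typeUri) {
        List<Assoc> result = new ArrayList<>();
        for (Assoc a : allAssocs) {
            if (a.nodeId == nodeId && a.typeUri.equals(typeUri)) {
                result.add(a);
            }
        }
        return result;
    }

    // With an index: one map lookup, independent of how many other associations exist.
    List<Assoc> lookup(long nodeId, String typeUri) {
        return index.getOrDefault(key(nodeId, typeUri), Collections.emptyList());
    }

    private static String key(long nodeId, String typeUri) {
        return nodeId + "|" + typeUri;
    }
}
```

For a type node with 1000 "instantiation" and 5 "assoc_def" associations, `scan` visits all 1005 entries, whereas `lookup` returns the 5 matching entries directly.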

Change History

comment:1 Changed 12 years ago by jri

  • Status changed from new to accepted

comment:3 Changed 12 years ago by Jörg Richter

Core: refactor storage layer, pt.6 (#389, #391).

Introduce association index.
Does not yet compile.

See ticket 389.
See ticket 391.

comment:4 Changed 12 years ago by Jörg Richter

Core: association metadata index (#389, #338).

The association metadata index basically works, both indexing and querying. Updating is pending.

Core is not yet functional.

Furthermore, to determine an object's type, the new storage layer no longer relies on "Instantiation" associations. Instead, a "type_uri" property is stored for every object (= Neo4j Node). (The "Instantiation" associations continue to exist.)
This provides the basis for solving #338 (An Association's "Instantiation" Association has no type assignment).

Furthermore, in the Parent POM we explicitly use an older Surefire version (2.12.4) to avoid an exception logging problem introduced in Surefire 2.13.

See ticket 389.
See ticket 338.

comment:5 Changed 12 years ago by Jörg Richter

Core: bidirectional assoc indexing (#389, #391).

The new association metadata index can be queried "from both sides".
See ticket 389.
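
Querying "from both sides" can be pictured as registering every association under both of its players. The sketch below is an assumption about the general technique, not the actual index format; `BidirectionalAssocIndexSketch` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; the real association metadata index may be structured differently.
final class BidirectionalAssocIndexSketch {

    // Maps a player (node) ID to the IDs of all associations it takes part in.
    private final Map<Long, Set<Long>> assocIdsByPlayer = new HashMap<>();

    // Registers the association under both players, so either end can be the query start.
    void index(long assocId, long player1Id, long player2Id) {
        assocIdsByPlayer.computeIfAbsent(player1Id, k -> new TreeSet<>()).add(assocId);
        assocIdsByPlayer.computeIfAbsent(player2Id, k -> new TreeSet<>()).add(assocId);
    }

    // Returns the IDs of all associations the given player takes part in.
    Set<Long> assocsOf(long playerId) {
        return assocIdsByPlayer.getOrDefault(playerId, Collections.emptySet());
    }
}
```

After `index(10, 1, 2)`, both `assocsOf(1)` and `assocsOf(2)` contain association 10, regardless of which player was passed first.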

The first tests of the Neo4j storage implementation are adapted to the new storage layer API.
See ticket 391.

comment:7 Changed 12 years ago by Jörg Richter

Core: fix bidirectional assoc index (#389, #393).

For the first time the server can start with the new storage layer.
Use the "assoc-index" branch.
(Try multiple times if "bundle not found" is thrown; see 2) below.)

Pending:

1) Adapt the Webclient to the changed assoc def format (#393).
2) Provide storage implementation as OSGi service.
3) Update assoc metadata index on delete and retype.

BREAKING CHANGES

The semantics of the Association Definition URI have changed. It is now simply the URI derived from the underlying Association. So it does *not* reflect the assoc def's part type URI (= "child type") anymore, but is usually empty (see #393).

When operating on an assoc def you must replace

    assocDef.getUri()

by

    assocDef.getPartTypeUri()

This is particularly important for generic operations on composite values, e.g.:

    composite.put(assocDef.getPartTypeUri(), childTopic.getModel());

See ticket 389.
See ticket 393.

comment:9 Changed 12 years ago by Jörg Richter

Storage fix: update assoc index on delete (#389).

BREAKING CHANGES

Core API: rename methods

    RelatedTopic.getAssociation()                         ->  getRelatingAssociation()
    RelatedTopicModel.getAssociationModel()               ->  getRelatingAssociation()
    RelatedAssociationModel.getRelatingAssociationModel() ->  getRelatingAssociation()

See ticket 389.

comment:10 Changed 12 years ago by Jörg Richter

Core: refactor some storage code (#389).

In preparation of "update association metadata index on retype".

New in the DeepaMehtaStorage interface:

  • void storeTopicTypeUri(long topicId, String topicTypeUri)
  • void storeAssociationTypeUri(long assocId, String assocTypeUri)
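
As a rough sketch of what implementing these two methods could look like, the following in-memory class keeps one type URI per object ID. The cut-down interface `TypeUriStorage`, the class `InMemoryTypeUriStorage`, and the two read methods are hypothetical; only the two `store...` signatures are taken from the list above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cut-down interface holding only the two methods listed above.
interface TypeUriStorage {
    void storeTopicTypeUri(long topicId, String topicTypeUri);
    void storeAssociationTypeUri(long assocId, String assocTypeUri);
}

// Hypothetical in-memory implementation; a real implementation would write a node property.
final class InMemoryTypeUriStorage implements TypeUriStorage {

    private final Map<Long, String> topicTypeUris = new HashMap<>();
    private final Map<Long, String> assocTypeUris = new HashMap<>();

    @Override
    public void storeTopicTypeUri(long topicId, String topicTypeUri) {
        // Storing again for the same ID overwrites the old type URI (a retype).
        topicTypeUris.put(topicId, topicTypeUri);
    }

    @Override
    public void storeAssociationTypeUri(long assocId, String assocTypeUri) {
        assocTypeUris.put(assocId, assocTypeUri);
    }

    // Hypothetical read accessors, added only so the stored values can be inspected.
    String topicTypeUri(long topicId) {
        return topicTypeUris.get(topicId);
    }

    String assocTypeUri(long assocId) {
        return assocTypeUris.get(assocId);
    }
}
```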

Not yet functional.

See ticket 389.

comment:11 Changed 12 years ago by Jörg Richter

Core: update assoc index on assoc retype (#389).

Done:

  • update the association metadata index when an *association* is retyped.

Pending:

  • ... when an *association role* is retyped.
  • ... when a *topic* is retyped.
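
Keeping an index consistent on retype generally means removing the entry stored under the old type and adding one under the new type. The sketch below shows that pattern for the association case only; `RetypeIndexSketch` and its methods are hypothetical and do not mirror the actual index layout.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of "remove old entry, add new entry" on retype.
final class RetypeIndexSketch {

    private final Map<String, Set<Long>> assocIdsByType = new HashMap<>();
    private final Map<Long, String> typeByAssocId = new HashMap<>();

    // Adds the association to the index under the given type URI.
    void index(long assocId, String typeUri) {
        typeByAssocId.put(assocId, typeUri);
        assocIdsByType.computeIfAbsent(typeUri, k -> new HashSet<>()).add(assocId);
    }

    // Moves the association's index entry from its old type URI to the new one.
    void retype(long assocId, String newTypeUri) {
        String oldTypeUri = typeByAssocId.get(assocId);
        if (oldTypeUri == null) {
            throw new IllegalArgumentException("Unknown association: " + assocId);
        }
        Set<Long> oldEntries = assocIdsByType.get(oldTypeUri);
        oldEntries.remove(assocId);
        if (oldEntries.isEmpty()) {
            assocIdsByType.remove(oldTypeUri);
        }
        index(assocId, newTypeUri);
    }

    // Returns the IDs of all associations currently indexed under the type URI.
    Set<Long> byType(String typeUri) {
        return assocIdsByType.getOrDefault(typeUri, Collections.emptySet());
    }
}
```

Without the removal step, a query for the old type would still return the retyped association.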

See ticket 389.

comment:12 Changed 12 years ago by jri

  • Cc JuergeN added

comment:13 Changed 12 years ago by Jörg Richter

Neo4j storage: update index on role retype (#389).

The association metadata index is updated when an *association role* is retyped.

Pending:

  • ... when a *topic* is retyped.

=> Building composite types works for the first time with the new storage layer.

BREAKING CHANGES

The association metadata index format has slightly changed.
Existing DBs *can* be used without modification. However, roles of existing associations can *not* be retyped.
Only roles of new associations can.

Resetting the DB is recommended.

See ticket 389.

comment:14 Changed 12 years ago by Jörg Richter

Storage: store type URI of retyped topic (#389).

The association metadata index is not yet updated when a topic is retyped.

=> retyping a Topic does not work for the moment.

Further refactoring is required: the retype code must be moved from the storage decorator to the application layer.

See ticket 389.

comment:15 Changed 12 years ago by Jörg Richter

Neo4j storage: update index on topic retype (#389)

The association metadata index is updated when a topic is retyped.

Still pending:

  • when an association is retyped, all associations in which it is a player must be re-indexed.

See ticket 389.

comment:16 Changed 12 years ago by Jörg Richter

Neo4j storage: update assoc metadata index (#389).

When an association is retyped, all associations in which it is a player are re-indexed.

=> The new storage layer should now be complete for the first time.
Minor issues may remain and will be fixed very soon.

See ticket 389.

comment:17 Changed 12 years ago by Jörg Richter

Storage fix: content index update on delete (#389)

When a topic or association is deleted, the corresponding entries are removed from the *content indexes*.

=> ready to be merged into "master".
=> if you have an installation built from "master", its DB must be deleted.
=> to build the latest stable release (4.0.13) use branch "4.0.13-release".

See ticket 389.
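
Removing content index entries on delete can be sketched with a small inverted index that also remembers which values were indexed for each object. This is a generic illustration under that assumption; `ContentIndexSketch` and its methods are hypothetical and not taken from the actual storage code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of cleaning up a content index when an object is deleted.
final class ContentIndexSketch {

    private final Map<String, Set<Long>> idsByValue = new HashMap<>();
    // Reverse mapping, so all entries of a deleted object can be found without a full scan.
    private final Map<Long, Set<String>> valuesById = new HashMap<>();

    // Indexes one value for the given topic or association ID.
    void index(long objectId, String value) {
        idsByValue.computeIfAbsent(value, k -> new HashSet<>()).add(objectId);
        valuesById.computeIfAbsent(objectId, k -> new HashSet<>()).add(value);
    }

    // Removes every index entry that refers to the deleted object.
    void delete(long objectId) {
        Set<String> values = valuesById.remove(objectId);
        if (values == null) {
            return; // nothing was indexed for this object
        }
        for (String value : values) {
            Set<Long> ids = idsByValue.get(value);
            ids.remove(objectId);
            if (ids.isEmpty()) {
                idsByValue.remove(value);
            }
        }
    }

    // Returns the IDs of all objects indexed under the value.
    Set<Long> find(String value) {
        return idsByValue.getOrDefault(value, Collections.emptySet());
    }
}
```

If the entries were left in place, a search could return IDs of objects that no longer exist.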

comment:31 Changed 12 years ago by jri

  • Status changed from accepted to closed
  • Resolution set to fixed