Search for "asset" content

At the moment we store assets (PDFs, Word files) directly on the document type. The main reason for this is findability: when someone searches on terms that occur in an asset, the document the asset is attached to is found.

However, as a (very undesirable) side effect, this has led to a huge increase in database size. The CMS keeps a version history of all changes by storing a copy of the document node. Since our assets are also on this node, they are copied too. Over the course of a year our database grew by a factor of 10, and most of this growth can be attributed to the assets stored directly on the node.

Is there a way to:

  • prevent the asset nodes from being copied when the document node is stored for version history?
  • detach the assets from the document node but still be able to search on content in the assets and link that to the document?
  • solve this in some other way that I have not thought of?

Kind regards,

Lucas