Project

General

Profile

feature request #9134

Rule based creation of additional media representations "on the fly"

Added by Andreas Kohlbecker 28 days ago. Updated 16 days ago.

Status:
Resolved
Priority:
Priority14
Category:
cdmlib-remote
Target version:
Start date:
07/08/2020
Due date:
% Done:

50%

Severity:
normal

Description

Often only the full sized image URIs are recorded in a db but thumbnail sized images are often required.

Based on a set of rules the missing media representation can be created at request time.

As the MediaRepresentations are created as purely volatile objects they must not be persised. To prevent from erroneously storing them as part of an cdm entity graph the classes that produce these new MediaRepresentations should only reside in cdmlib-remote.

All method in cdmlib-remote which return Media or MediaRepresentation should be equipped with this functionality.

Other methods which also make use of the according MediaUtil methods can not be adapted:

  • MediaUtils.findBestMatchingRepresentation()
    • eu.etaxonomy.cdm.api.service.ClassificationServiceImpl.getAllMediaForChildNodes(TaxonNode taxonNode, List propertyPaths, int size, int height, int widthOrDuration, String[] mimeTypes)

Related issues

Related to Edit - feature request #9135: all media related web service end points with MediaUriTransformation and/or bestMatchingRepresentation filter New 07/09/2020
Related to Edit - task #9132: Update cyprus images to Scaler default API and add thumbnails Resolved 07/07/2020

Associated revisions

Revision 2a4ebda1 (diff)
Added by Andreas Kohlbecker 27 days ago

fix #9134 MediaUriTransformation for rule based creation of additional media representations 'on the fly'

Revision 2f0cabf1 (diff)
Added by Andreas Kohlbecker 27 days ago

fix #9134 solving merge conflict and avoiding field solving serialization in SearchReplace

Revision 551c4f43 (diff)
Added by Andreas Kohlbecker 23 days ago

ref #9134 adding ducumentation for MediaUriTransformationProcessor and Mediatoolbox

Revision 193bfc9a (diff)
Added by Andreas Müller 22 days ago

ref #9134 fix transformation for non-square maxextend transformations

Revision 8165e2b9 (diff)
Added by Andreas Kohlbecker 20 days ago

ref #9134 no longer storing DefaultMediaTransformations in the db, better exception handling and more documentation

Revision 5ccd238c (diff)
Added by Andreas Kohlbecker 20 days ago

ref #9134 fixing bug with adding entiies to a collection

Revision 896e7ef8 (diff)
Added by Andreas Kohlbecker 20 days ago

ref #9134 fixing calculation of croped thumbnail images and adding tests

Revision bf5abe7d (diff)
Added by Andreas Kohlbecker 20 days ago

ref #9134 fixing bug in default transformation creation

Revision 8e996490 (diff)
Added by Andreas Kohlbecker 16 days ago

ref #9134 preventing persistance of volatile modified media

History

#1 Updated by Andreas Kohlbecker 28 days ago

  • Description updated (diff)
  • Category changed from cdmlib to cdmlib-remote

#2 Updated by Andreas Kohlbecker 28 days ago

  • Description updated (diff)

#3 Updated by Andreas Kohlbecker 28 days ago

  • Description updated (diff)

#4 Updated by Andreas Kohlbecker 28 days ago

  • Description updated (diff)

#5 Updated by Andreas Kohlbecker 27 days ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 50

#6 Updated by Andreas Kohlbecker 27 days ago

A set of generic transformation rules for the BGBM digilib server rules will be stored in each db. This has been done for two reasons:

  1. this allows for zero configuration usage of the digilib server
  2. serves as a example or template on which other rules can be based on.
[
  {
    "height": 200,
    "width": 200,
    "mimeType": "image/jpeg",
    "pathQueryFragment": {
      "search": "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)",
      "replace": "digilib/Scaler/IIIF/$1!$2/full/!200,200/0/default.jpg"
    },
    "host": {
      "search": "pictures.bgbm.org",
      "replace": "pictures.bgbm.org"
    },
    "scheme": null
  },
  {
    "height": 400,
    "width": 400,
    "mimeType": "image/jpeg",
    "pathQueryFragment": {
      "search": "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)",
      "replace": "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg"
    },
    "host": {
      "search": "pictures.bgbm.org",
      "replace": "pictures.bgbm.org"
    },
    "scheme": null
  },
  {
    "height": 400,
    "width": 400,
    "mimeType": "image/jpeg",
    "pathQueryFragment": {
      "search": "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)",
      "replace": "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg"
    },
    "host": {
      "search": "pictures.bgbm.org",
      "replace": "pictures.bgbm.org"
    },
    "scheme": null
  },
  {
    "height": 200,
    "width": 200,
    "mimeType": "image/jpeg",
    "pathQueryFragment": {
      "search": "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)",
      "replace": "digilib/Scaler/IIIF/$1!$2/full/!200,200/0/default.jpg"
    },
    "host": {
      "search": "pictures.bgbm.org",
      "replace": "pictures.bgbm.org"
    },
    "scheme": null
  }
]

#7 Updated by Andreas Kohlbecker 27 days ago

  • Assignee changed from Andreas Kohlbecker to Andreas Müller

please review

#8 Updated by Andreas Kohlbecker 27 days ago

  • Related to feature request #9135: all media related web service end points with MediaUriTransformation and/or bestMatchingRepresentation filter added

#9 Updated by Andreas Kohlbecker 27 days ago

TODO: Skip the creation of a MediaRepresentation if a part with exactly the same URL already exists.

#10 Updated by Andreas Müller 22 days ago

  • Status changed from Resolved to Feedback
  • Assignee changed from Andreas Müller to Andreas Kohlbecker

As far as I can see this works well.

Some minor issues:

  • The MediaToolbox methods change the media itself by adding or removing representations. This might be dangerous in case the ToolBox is used whithin a not read-only transaction. Currently the toolbox belongs and is used only in remote to it is probably not used in such a way. However, to be on the save side it might be a good idea to use Media clones instead.

  • MediaToolbox: shouldn't we add a @Component annotation to show that it is meant to be a bean (even if not searched by component scan)?

  • documentation for MediaToolbox is still missing

  • MediaToolbox is only logging JsonExceptions but not throwing them. Is this wanted (I am not familiar with the exception handling in remote therefore I ask)

  • MediaUriTransformationProcessor.calculateTargetSize() does not handle maxextend transformations with trans.getWidth() != trans.getHeight() really correct by only returning "return new Point(trans.getWidth(), trans.getHeight());" . The algorithm should not depend on the fact that the required box is a square. I adapted the code accordingly (it even became simpler this way) and added tests. (193bfc9a54d2)

#11 Updated by Andreas Müller 22 days ago

There is one issue that needs to be solved:

  • MediaToolbox.readTransformations is creating default preference values and stores them in the database. This is not the correct way to handle preferences. Default values should be handled in the PreferencePredicate itself. This way the database is not littered by default value records. Also this way it is easier to change the default behavior without having to update database records. Simply by changing the code. So preferences should only be stored in the database if they deviate from the default value.

#12 Updated by Andreas Müller 22 days ago

  • Subject changed from rule based creation of additional media representations "on the fly" to Rule based creation of additional media representations "on the fly"

#13 Updated by Andreas Kohlbecker 21 days ago

  • Related to task #9132: Update cyprus images to Scaler default API and add thumbnails added

#15 Updated by Andreas Kohlbecker 20 days ago

new default transformation for digilib:

[ {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/?fn=$1/$2&mo=crop&dw=200&dh=147&uvfix=1",
    "search" : "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 200,
  "height" : 200
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg",
    "search" : "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 400,
  "height" : 400
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg",
    "search" : "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 400,
  "height" : 400
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/?fn=$1/$2&mo=crop&dw=200&dh=147&uvfix=1",
    "search" : "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 200,
  "height" : 200
} ]

#16 Updated by Andreas Müller 20 days ago

Does it make sense to use

"width" : 200,
"height" : 200

if we use &mo=crop&dw=200&dh=147

Do we need width and heigth at all if we use crop?

#17 Updated by Andreas Kohlbecker 20 days ago

Andreas Müller wrote:

...
* The MediaToolbox methods change the media itself by adding or removing representations. This might be dangerous in case the ToolBox is used whithin a not read-only transaction. Currently the toolbox belongs and is used only in remote to it is probably not used in such a way. However, to be on the save side it might be a good idea to use Media clones instead.

Using media clones can also cause problems: Rest service clients like the data portals may rely on the uuid of the media object as being a stable identifier. Cloned objects would have a different UUIDs than the original. Cloned objects with the same UUID - also not a good idea. May be we could set Media.id = 0, this would most probably cause an error when hibernate attempts to persist such an entity (not tested).

Another option is to return MediaDTOs instead of Media entities. It consider this as the best solution since it is more transparent than all others.

I implemented all of your suggestions and discovered a bug in my code:

media.getRepresentations().addAll(newRepr);

can you see the problem?

getRepresentations() must return an unmodifiableSet to avoid errors as above. I think we should protect all collections of entities where the add() and remove() methods do more than collection.add(), collection.addAll(), collection.remove(), etc.

#18 Updated by Andreas Kohlbecker 20 days ago

Andreas Müller wrote:

Does it make sense to use

"width" : 200,
"height" : 200

if we use &mo=crop&dw=200&dh=147

Do we need width and heigth at all if we use crop?

I guess you mean the ./!400,400/0/default.jpg URLs? Theses are for the preview images in the taxon general page.

#19 Updated by Andreas Müller 20 days ago

Andreas Kohlbecker wrote:

Andreas Müller wrote:

Does it make sense to use

"width" : 200,
"height" : 200

if we use &mo=crop&dw=200&dh=147

Do we need width and heigth at all if we use crop?

I guess you mean the ./!400,400/0/default.jpg URLs? Theses are for the preview images in the taxon general page.

Why 400? I know they are for the profile image. But above I mentioned

"width" : 200,
"height" : 200
if we use &mo=crop&dw=200&dh=147

so we talk about the thumbnails here. To me it sounds strange we have height=200 but dh=147 . But I did not go into detail. They might be a reason for this.

And the second issue was that I was wondering if we do need the width and heigth parameter at all for the algorithm if we use crop because crop will cut the image anyway to the size that is needed. But also here I did not really check against the code. It just came into my mind that this could be something that is not needed anymore with crop.

#20 Updated by Andreas Müller 20 days ago

Andreas Kohlbecker wrote:

Andreas Müller wrote:

Using media clones can also cause problems: Rest service clients like the data portals may rely on the uuid of the media object as being a stable identifier. Cloned objects would have a different UUIDs than the original. Cloned objects with the same UUID - also not a good idea. May be we could set Media.id = 0, this would most probably cause an error when hibernate attempts to persist such an entity (not tested).

Hmm, true, using a clone does not fully solve the problem, as long as we do not make sure that we are out of an transaction. However, CdmBase.clone() sets the id of the new object to 0 anyway, so it is correct that setting the uuid to the old value will throw an exception if anyone will try to use this object within a transaction. So I think it is not dangerous to do it this way. However, a DTO might also be a good (and maybe more performant) solution but will take more time to implement.

#21 Updated by Andreas Müller 20 days ago

Andreas Kohlbecker wrote:

Andreas Müller wrote:

I implemented all of your suggestions and discovered a bug in my code:

media.getRepresentations().addAll(newRepr);

can you see the problem?

getRepresentations() must return an unmodifiableSet to avoid errors as above. I think we should protect all collections of entities where the add() and remove() methods do more than collection.add(), collection.addAll(), collection.remove(), etc.

Yes, it is a known issue (discussed at the very early days) that in this context it is a bit dangerous to return the collections themself. I am not sure anymore why we decided to not do it the immutable way. One reason could be that in some cases you want to check if a collection is a PersistentCollection. You can't do this from outside if the collection that you get back is not the collection that is really attached to the object (except for using reflections which is a bit dirty and not performant). However, as this was never really an issue over the last 15 years (I remember only 1-2 cases where developers did this wrong) so I guess this is not really critical.

By the way, it can also be dangerous the other way round. E.g. we have the code

this.getRepresentations().remove(representation);

in Media.removeRepresentation(). This will not result in the expected behavior if getRepresentations() will not return the real set. Sometimes IDEs automatically do replace assignments by using getters and could create such problems.

#22 Updated by Andreas Kohlbecker 20 days ago

Andreas Müller wrote:

  • MediaUriTransformationProcessor.calculateTargetSize() does not handle maxextend transformations with trans.getWidth() != trans.getHeight() really correct by only returning "return new Point(trans.getWidth(), trans.getHeight());" . The algorithm should not depend on the fact that the required box is a square. I adapted the code accordingly (it even became simpler this way) and added tests. (193bfc9a54d2)

I actually don't expect that we will use maxextend transformations with trans.getWidth() != trans.getHeight() in the dataportal. I can't see the use case for this. More importantly for us is cropping of images. I modified the size calculation so that it works like:

  • trans.getWidth() != trans.getHeight() ==> crop
  • trans.getWidth() == trans.getHeight() ==> max extend

I adapted the test class also.

To cover all cases we would need to add a mode enum field to the MediaUriTransformation class with SCALE, MAX_EXTEND, CROP boolean field for isMaxextend

#23 Updated by Andreas Kohlbecker 20 days ago

Andreas Müller wrote:

"width" : 200,
"height" : 200
if we use &mo=crop&dw=200&dh=147

so we talk about the thumbnails here. To me it sounds strange we have height=200 but dh=147 . But I did not go into detail. They might be a reason for this.

And the second issue was that I was wondering if we do need the width and heigth parameter at all for the algorithm if we use crop because crop will cut the image anyway to the size that is needed. But also here I did not really check against the code. It just came into my mind that this could be something that is not needed anymore with crop.

Now I got what you are referring to.

  1. There actually was a bug in the code creating the default transformations - this is fixed, see bf5abe7ddd
  2. The crop function crops the image to the specified size, that it to the width and height. The resulting image exactly has this width and size. There is no other way to define what is needed than defining the desired size.

#24 Updated by Andreas Kohlbecker 20 days ago

the fixed default transformations as json for reference:

[ {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/?fn=$1/$2&mo=crop&dw=200&dh=147&uvfix=1",
    "search" : "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 200,
  "height" : 147
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg",
    "search" : "digilib/Scaler/IIIF/([^\\!]+)\\!([^\\/]+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 400,
  "height" : 400
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/IIIF/$1!$2/full/!400,400/0/default.jpg",
    "search" : "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 400,
  "height" : 400
}, {
  "scheme" : null,
  "host" : {
    "replace" : "pictures.bgbm.org",
    "search" : "pictures.bgbm.org"
  },
  "pathQueryFragment" : {
    "replace" : "digilib/Scaler/?fn=$1/$2&mo=crop&dw=200&dh=147&uvfix=1",
    "search" : "digilib/Scaler/\\?fn=([^\\\\/]+)/(\\w+)(.*)"
  },
  "mimeType" : "image/jpeg",
  "width" : 200,
  "height" : 147
} ]

#25 Updated by Andreas Kohlbecker 16 days ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Andreas Kohlbecker to Andreas Müller

Andreas Müller wrote:

Andreas Kohlbecker wrote:

Andreas Müller wrote:

Using media clones can also cause problems: Rest service clients like the data portals may rely on the uuid of the media object as being a stable identifier. Cloned objects would have a different UUIDs than the original. Cloned objects with the same UUID - also not a good idea. May be we could set Media.id = 0, this would most probably cause an error when hibernate attempts to persist such an entity (not tested).

Hmm, true, using a clone does not fully solve the problem, as long as we do not make sure that we are out of an transaction. However, CdmBase.clone() sets the id of the new object to 0 anyway, so it is correct that setting the uuid to the old value will throw an exception if anyone will try to use this object within a transaction. So I think it is not dangerous to do it this way. However, a DTO might also be a good (and maybe more performant) solution but will take more time to implement.

I modified the code a bit so that there is only one method left which actually modifies media objects. All other methods only operate on MediaRepresentation list. In which case unwanted persisting of modified media objects is not a case to take into account. The last method which modifies Media is IMediaToolbox.filterPreferredMediaRepresentations(List<Media> mediaList, Class<? extends MediaRepresentationPart> type, String[] mimeTypes, Integer widthOrDuration, Integer height, Integer size). This method can be replaced easily and is no source of danger at the same time since is in only being used in the TaxonPortalController. I anyway set the entity id of the modified media entities to 0 to raise an error in case of an attempt to save it to the db.

There now is a new ticket for cleaning up filterPreferredMediaRepresentations(List<Media> mediaList, Class<? extends MediaRepresentationPart> type, String[] mimeTypes, Integer widthOrDuration, Integer height, Integer size): #9160

With this commit I consider this ticket as solved completely.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 40 MB)