bug #6151

closed

implement pager for search service

Added by Andreas Kohlbecker over 6 years ago. Updated 11 months ago.

Status: Closed
Priority: Highest
Category: utis
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Severity: normal
Found in Version:
Tags:

Description

UTIS is not returning all possible results from the connected providers.

A pager for the search service should be implemented. Pagers only make sense for like and fuzzy searches.

Because UTIS is a federated search engine, general paging can be implemented in two ways:

  1. The PAGESIZE limit defines the number of results in the whole response, aggregated from the individual responses of multiple checklists. Since the number of hits per checklist provider is unknown, each provider would always have to be asked to return PAGESIZE elements per request. With the current architecture of UTIS this can impose a big overhead on overall request processing, which increases with the number of clients that are queried in parallel.
  2. The PAGESIZE limit only affects the individual checklist queries. The unified response of UTIS contains all responses from the checklist providers. When UTIS queries n checklists, the UTIS response can in this case contain at most n * PAGESIZE response items.

Strategy 2 seems to be the better option, since it does not cause performance problems.
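
Strategy 2 can be illustrated with a minimal sketch. This is not the UTIS implementation: the function names, the `ProviderClient` type and the fake in-memory clients are assumptions made for illustration only. It shows the same page request being forwarded to every provider, so the aggregated response holds at most n * PAGESIZE items.

```python
from typing import Callable, Dict, List

# Hypothetical client type: (query, page_index, page_size) -> one page of results.
ProviderClient = Callable[[str, int, int], List[str]]


def federated_search(
    clients: Dict[str, ProviderClient],
    query: str,
    page_index: int,
    page_size: int,
) -> Dict[str, List[str]]:
    """Forward the same page request to every provider (strategy 2)."""
    responses: Dict[str, List[str]] = {}
    for name, client in clients.items():
        page = client(query, page_index, page_size)
        # Defensive cut in case a provider returns more than requested.
        responses[name] = page[:page_size]
    return responses


def make_fake_client(hits: List[str]) -> ProviderClient:
    """In-memory stand-in for a checklist provider (illustration only)."""
    def client(query: str, page_index: int, page_size: int) -> List[str]:
        start = page_index * page_size
        return hits[start:start + page_size]
    return client
```

With two providers and a page size of 2, the unified response never holds more than 4 items, regardless of how many hits each provider has in total.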

Checklist provider services and paging:

  • EUNIS: no limitation: DONE (pager implemented)
  • COL (bgbm-cdm-server[col]): accessed via the name catalogue service, which seems to have no limit. TODO: implement pager mechanism?
  • WoRMS: maximum of 50 records by default, allows paging by using an offset parameter (see below for more details) DONE (pager implemented)
  • PESI: maximum of 50 records by default, no paging possible so far (see below for more details) DONE (pager implemented)
  • PLAZI: DONE (pager implemented)
  • GBIF: DONE

WoRMS & PESI

Both are developed and maintained by VLIZ.

The PESI and WoRMS web services have a couple of operations where the number of returned records is limited to 50.

For WoRMS some operations have an optional 'offset' parameter by which paging through the full result set is possible:

  • getAphiaRecords
  • getAphiaRecordsByVernacular
  • getAphiaRecordsByDate
  • getAphiaChildrenByID

That's very useful. The PESI service, in contrast, lacks the ability to retrieve data beyond that 50-record limit.

I asked Bart Vanhoorne from VLIZ for implementation of the 'offset' parameter in PESI (2016-10-21):

It would strongly increase the utility of the PESI service if you could also implement the 'offset' parameter for PESI. I guess this should not cause too much work, since this has already been done for WoRMS and both implementations seem to be very similar (at least from looking at the SOAP API).

His answer:

OK, we'll put this on our todo list.
I would say this will be done end 2016, begin 2017

This is implemented now!!!!
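
The offset-based paging described above can be sketched as a simple loop. This is a hedged illustration, not the actual UTIS client code: `fetch_page` is a hypothetical callable standing in for one service call (for example a wrapper around getAphiaRecords), and the assumption that the first record has offset 1 should be checked against the service documentation.

```python
from typing import Callable, Iterator, List

MAX_RECORDS_PER_CALL = 50  # limit stated above for the WoRMS and PESI operations


def fetch_all(
    fetch_page: Callable[[int], List[dict]],
    first_offset: int = 1,  # assumption: offsets start at 1
) -> Iterator[dict]:
    """Yield all records by calling fetch_page with increasing offsets."""
    offset = first_offset
    while True:
        page = fetch_page(offset)
        yield from page
        # A page shorter than the limit means the result set is exhausted.
        if len(page) < MAX_RECORDS_PER_CALL:
            break
        offset += len(page)


def make_fake_service(total: int) -> Callable[[int], List[dict]]:
    """In-memory stand-in for a paged service (illustration only)."""
    records = [{"id": i} for i in range(1, total + 1)]

    def fetch_page(offset: int) -> List[dict]:
        start = offset - 1
        return records[start:start + MAX_RECORDS_PER_CALL]

    return fetch_page
```

When the total number of hits is an exact multiple of 50, the loop makes one extra call that returns an empty page before it stops.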


Related issues

Related to PhycoBank - feature request #8356: Algae Names Index - implementation (In Progress, Andreas Kohlbecker)
Related to EDIT - task #8363: PESI SOAP service seems to return duplicate records (Worksforme, Andreas Kohlbecker)
Related to EDIT - feature request #8360: paging results of /classification/{taxonId}/children webservice (Worksforme, Andreas Kohlbecker)

Actions #1

Updated by Andreas Kohlbecker over 6 years ago

  • Status changed from New to In Progress

To save time I decided to change the strategy. I will not implement a general pager mechanism in UTIS, since not all of the connected services support paging of results. The new strategy is to limit the number of returned matches by enforcing a minimum search string length:
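
A minimal sketch of such a length check follows. The ticket does not state the actual threshold, so the value 3 is a placeholder; the function name is hypothetical, and only the `scientificNameLike` mode name is taken from this ticket.

```python
MIN_QUERY_LENGTH = 3  # placeholder: the real threshold is not given in the ticket

# Search modes that can match many names; only this mode name appears in the ticket.
LIKE_MODES = {"scientificNameLike"}


def validate_query(query: str, search_mode: str) -> None:
    """Reject like-searches whose search string is too short."""
    term = query.strip()
    if search_mode in LIKE_MODES and len(term) < MIN_QUERY_LENGTH:
        raise ValueError(
            f"search string must have at least {MIN_QUERY_LENGTH} "
            f"characters for mode {search_mode!r}"
        )
```

Rejecting short strings up front bounds the result size without requiring every connected provider to support paging.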

Actions #2

Updated by Andreas Kohlbecker over 6 years ago

  • Description updated (diff)
Actions #3

Updated by Andreas Kohlbecker over 6 years ago

  • Due date set to 01/15/2017
  • Priority changed from New to Highest
  • % Done changed from 0 to 20

Until the pager is actually implemented, I worked on an interim solution: relaxing the max hit setting for the EDIT name catalogue service. This attempt is implemented in utis-core|4f08022. However, it does not work yet, since the name catalogue service does not yet support POST requests. Support for POST requests has been implemented with cdmlib|0842c38 and will officially become available with CDM Release 4.4.

Work on this ticket should resume in January 2017 at the latest; setting the due date accordingly.

Actions #4

Updated by Andreas Kohlbecker about 6 years ago

  • Description updated (diff)

Hi Andreas,

We have now implemented this, and (on top) added a REST service.
http://www.eu-nomen.eu/portal/webservices.php

Happy holidays!

Regards,
Bart

Actions #5

Updated by Andreas Kohlbecker over 3 years ago

Actions #6

Updated by Andreas Kohlbecker over 3 years ago

  • Related to feature request #8360: paging results of /classification/{taxonId}/children webservice added
Actions #7

Updated by Andreas Kohlbecker over 3 years ago

The paging mechanism for the WoRMS and PESI clients is implemented.

The PESI SOAP service however seems to return duplicate records:

http --timeout 600 :8081/search 'providers==pesi' 'query==Crepis foetida' searchMode==scientificNameLike pageIndex==0 pageSize==100 | jq '.query[] | .response[] | .taxon | .url, .taxonName.scientificName '

copied to #8363
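
To check a response for duplicates programmatically rather than by eye, a small helper along these lines may be useful. It is a sketch only: the JSON layout (`query[].response[].taxon.url`) is inferred from the jq filter in the command above, the function name is hypothetical, and the sample data is made up.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicate_taxon_urls(utis_response: Dict[str, Any]) -> List[str]:
    """Return taxon URLs that occur more than once in a search response."""
    urls = [
        item["taxon"]["url"]
        for query in utis_response.get("query", [])
        for item in query.get("response", [])
    ]
    counts = Counter(urls)
    return [url for url, n in counts.items() if n > 1]


# Made-up sample mirroring the structure implied by the jq filter.
sample = {
    "query": [
        {
            "response": [
                {"taxon": {"url": "http://example.org/taxon/1",
                           "taxonName": {"scientificName": "Crepis foetida"}}},
                {"taxon": {"url": "http://example.org/taxon/2",
                           "taxonName": {"scientificName": "Crepis foetida subsp. foetida"}}},
                {"taxon": {"url": "http://example.org/taxon/1",
                           "taxonName": {"scientificName": "Crepis foetida"}}},
            ]
        }
    ]
}
```

Calling `duplicate_taxon_urls(sample)` returns the one URL that appears twice in the sample.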

Actions #8

Updated by Andreas Kohlbecker over 3 years ago

  • % Done changed from 20 to 40
Actions #9

Updated by Andreas Kohlbecker over 3 years ago

  • Description updated (diff)
  • % Done changed from 40 to 30
Actions #10

Updated by Andreas Kohlbecker over 3 years ago

  • Description updated (diff)
Actions #11

Updated by Andreas Kohlbecker over 3 years ago

  • Description updated (diff)
Actions #12

Updated by Andreas Kohlbecker over 3 years ago

  • Related to task #8363: PESI SOAP service seems to return duplicate records added
Actions #13

Updated by Andreas Kohlbecker over 3 years ago

  • Description updated (diff)
Actions #14

Updated by Andreas Kohlbecker over 3 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 30 to 50
Actions #15

Updated by Andreas Kohlbecker over 3 years ago

Email sent to Scott Chamberlain

Actions #16

Updated by Andreas Kohlbecker over 3 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 50 to 100
Actions #17

Updated by Andreas Kohlbecker over 3 years ago

  • Target version set to 280
Actions #18

Updated by Andreas Kohlbecker over 3 years ago

  • Related to deleted (feature request #8360: paging results of /classification/{taxonId}/children webservice)
Actions #19

Updated by Andreas Kohlbecker over 3 years ago

  • Related to feature request #8360: paging results of /classification/{taxonId}/children webservice added
Actions #20

Updated by Andreas Kohlbecker over 3 years ago

  • Tags set to phycobank
Actions #21

Updated by Andreas Kohlbecker over 3 years ago

  • Category set to utis
Actions #22

Updated by Andreas Müller 11 months ago

  • Target version changed from 280 to UTIS 1.3
Actions #23

Updated by Andreas Müller 11 months ago

  • Project changed from UTIS to EDIT
  • Category changed from utis to utis
  • Severity set to normal
Actions #24

Updated by Andreas Müller 11 months ago

  • Due date set to 03/11/2022
  • Start date set to 03/11/2022
Actions #25

Updated by Andreas Müller 11 months ago

  • Due date deleted (03/11/2022)
  • Start date deleted (03/11/2022)