bug #6151
closed
implement pager for search service
100%
Description
UTIS is not returning all possible results from the connected providers.
A pager for search service should be implemented. Pagers only make sense for like and fuzzy searches.
Because UTIS is a federated search engine, general paging can be implemented in two ways:
- The PAGESIZE limit defines the number of results in the whole response, aggregated from the individual responses of multiple checklists. Since the number of hits per checklist provider is unknown, every provider would always have to be asked to return PAGESIZE elements per request. With the current architecture of UTIS this can impose a big overhead on the overall request processing, which increases with the number of clients that are queried in parallel.
- The PAGESIZE limit only affects the individual checklist queries. The unified response of UTIS contains all responses from the checklist providers. When UTIS queries n checklists, the UTIS response can in this case have a maximum size of n * PAGESIZE response items.
Strategy 2 seems to be the better option, since it does not cause performance problems.
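A minimal sketch of strategy 2, assuming each checklist client can be called with a query, a page index and a page size. The names `federated_search`, `make_provider` and the `Provider` signature are hypothetical and only illustrate the idea; they are not the UTIS implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical client signature: (query, page_index, page_size) -> list of hits.
Provider = Callable[[str, int, int], List[str]]


def federated_search(providers: Dict[str, Provider], query: str,
                     page_index: int, page_size: int) -> Dict[str, List[str]]:
    """Apply the same page window to every provider and merge the responses."""
    def ask(item):
        name, provider = item
        # Truncate defensively in case a provider ignores the limit.
        return name, provider(query, page_index, page_size)[:page_size]

    # Providers are queried in parallel, one worker per provider.
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        return dict(pool.map(ask, providers.items()))


def make_provider(records: List[str]) -> Provider:
    """Build an in-memory stand-in for a checklist client (illustration only)."""
    def search(query: str, page_index: int, page_size: int) -> List[str]:
        hits = [r for r in records if query.lower() in r.lower()]
        start = page_index * page_size
        return hits[start:start + page_size]
    return search
```

With n providers the merged response holds at most n * PAGESIZE items, and no provider is ever asked for more than one page per request.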
Checklist provider services and paging:
- EUNIS: no limitation: DONE (pager implemented)
- COL (bgbm-cdm-server[col]): accessed via the name catalogue service, which seems to have no limit: TODO implement pager mechanism?
- WoRMS: maximum of 50 records by default, allows paging by using an offset parameter (see below for more details): DONE (pager implemented)
- PESI: maximum of 50 records by default, no paging possible so far (see below for more details): DONE (pager implemented)
- PLAZI: DONE (pager implemented)
- GBIF: DONE
WoRMS & PESI
Both are developed and maintained by VLIZ.
The PESI and WoRMS web services have a couple of operations where the amount of returned records is limited to 50.
For WoRMS some operations have an optional 'offset' parameter by which paging through the full result set is possible:
- getAphiaRecords
- getAphiaRecordsByVernacular
- getAphiaRecordsByDate
- getAphiaChildrenByID
That's very useful. The PESI service in contrast is missing the ability to retrieve data beyond that 50 records limit.
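Paging through a full result set with such an offset parameter could look roughly like the sketch below. The `fetch` callable is a hypothetical wrapper around one of the operations listed above; the 1-based starting offset is an assumption and should be checked against the service documentation.

```python
from typing import Callable, Iterator, List

CHUNK_SIZE = 50  # per-call record limit mentioned in this ticket


def iter_all_records(fetch: Callable[[int], List[dict]],
                     chunk: int = CHUNK_SIZE,
                     first_offset: int = 1) -> Iterator[dict]:
    """Yield every record by advancing the offset until a short chunk arrives."""
    offset = first_offset  # assumption: the service counts records from 1
    while True:
        records = fetch(offset) or []  # treat an empty/None reply as no records
        yield from records
        if len(records) < chunk:
            return  # a short (or empty) chunk means the result set is exhausted
        offset += chunk
```

When the total number of records is an exact multiple of the chunk size, the loop makes one extra call that returns nothing and then stops.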
I asked Bart Vanhoorne from VLIZ for implementation of the 'offset' parameter in PESI (2016-10-21):
It would strongly increase the utility of the PESI service if you could implement the 'offset' parameter for PESI as well. I guess this should not cause too much work, since this has already been done for WoRMS and both implementations seem to be very similar (at least from looking at the SOAP API).
His Answer:
OK, we'll put this on our todo list.
I would say this will be done end 2016, begin 2017
This is implemented now!!!!
Notify:
- Scott Chamberlain scott@ropensci.org once this is done!
Updated by Andreas Kohlbecker over 7 years ago
- Status changed from New to In Progress
To save time I decided to change the strategy. I will not implement a general pager mechanism in UTIS, since not all of the connected services support paging of results. The new strategy is to limit the number of returned matches by enforcing a minimum search string length:
- utis-core|34e21a0 - testing pager function of worms
- utis-webapp|bdf44ee - limiting amount of returned results by query length restriction
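The query length restriction could be sketched as follows. The threshold of 3 characters, the function name and the set of restricted modes are assumptions for illustration; the ticket does not state the values actually used in utis-webapp.

```python
MIN_QUERY_LENGTH = 3  # assumed threshold, not taken from the ticket
RESTRICTED_MODES = {"scientificNameLike"}  # assumed: modes that can match many names


def check_query_length(query: str, search_mode: str,
                       min_length: int = MIN_QUERY_LENGTH) -> str:
    """Reject short search strings for modes that may return very many hits."""
    cleaned = query.strip()
    if search_mode in RESTRICTED_MODES and len(cleaned) < min_length:
        raise ValueError(
            f"query must have at least {min_length} characters "
            f"for search mode {search_mode}"
        )
    return cleaned
```

A longer minimum string narrows the set of possible matches, so the result size is bounded indirectly without any provider having to support paging.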
Updated by Andreas Kohlbecker over 7 years ago
- Due date set to 01/15/2017
- Priority changed from New to Highest
- % Done changed from 0 to 20
Until the pager is actually implemented, I worked on an interim solution: relaxing the max hit setting for the EDIT name catalogue service. This attempt is implemented in utis-core|4f08022. It is not working yet, however, since the name catalogue service does not yet support POST requests. Support for POST requests has been implemented with cdmlib|0842c38 and will officially become available with CDM Release 4.4.
This ticket should be continued in Jan. 2017 at the latest; setting the due date accordingly.
Updated by Andreas Kohlbecker over 7 years ago
- Description updated (diff)
Hi Andreas,
We have now implemented this, and (on top) added a REST service.
http://www.eu-nomen.eu/portal/webservices.php
Happy holidays!
Regards,
Bart
Updated by Andreas Kohlbecker almost 5 years ago
- Related to feature request #8356: Algae Names Index - implementation added
Updated by Andreas Kohlbecker almost 5 years ago
- Related to feature request #8360: paging results of /classification/{taxonId}/children webservice added
Updated by Andreas Kohlbecker almost 5 years ago
The paging mechanism for the WoRMS and PESI clients is implemented.
The PESI SOAP service however seems to return duplicate records:
http --timeout 600 :8081/search 'providers==pesi' 'query==Crepis foetida' searchMode==scientificNameLike pageIndex==0 pageSize==100 | jq '.query[] | .response[] | .taxon | .url, .taxonName.scientificName '
copied to #8363
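To check a response for such duplicates programmatically, the same path as in the jq filter above (`.query[] | .response[] | .taxon | .url`) can be walked in a few lines. This is only a sketch; the function name is hypothetical and the response layout is inferred from that filter alone.

```python
from collections import Counter
from typing import Dict


def duplicate_taxon_urls(utis_response: dict) -> Dict[str, int]:
    """Return every taxon URL that occurs more than once, with its count."""
    urls = [
        item["taxon"]["url"]
        for query in utis_response.get("query", [])
        for item in query.get("response", [])
    ]
    return {url: count for url, count in Counter(urls).items() if count > 1}
```

An empty result means no taxon URL was repeated within the inspected page.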
Updated by Andreas Kohlbecker almost 5 years ago
- Description updated (diff)
- % Done changed from 40 to 30
Updated by Andreas Kohlbecker almost 5 years ago
- Related to task #8363: PESI SOAP service seems to return duplicate records added
Updated by Andreas Kohlbecker almost 5 years ago
- Status changed from In Progress to Resolved
- % Done changed from 30 to 50
Updated by Andreas Kohlbecker almost 5 years ago
Email sent to Scott Chamberlain
Updated by Andreas Kohlbecker almost 5 years ago
- Status changed from Resolved to Closed
- % Done changed from 50 to 100
Updated by Andreas Kohlbecker almost 5 years ago
- Related to deleted (feature request #8360: paging results of /classification/{taxonId}/children webservice)
Updated by Andreas Kohlbecker almost 5 years ago
- Related to feature request #8360: paging results of /classification/{taxonId}/children webservice added
Updated by Andreas Müller about 2 years ago
- Target version changed from 280 to UTIS 1.3
Updated by Andreas Müller about 2 years ago
- Project changed from UTIS to EDIT
- Category changed from utis to utis
- Severity set to normal
Updated by Andreas Müller about 2 years ago
- Due date set to 03/11/2022
- Start date set to 03/11/2022
Updated by Andreas Müller about 2 years ago
- Due date deleted (03/11/2022)
- Start date deleted (03/11/2022)