bug #6151

Updated by Andreas Kohlbecker almost 5 years ago

UTIS is not returning all possible results from the connected providers. 

 A pager for the search service should be implemented. Pagers only make sense for *like* and *fuzzy* searches. 

 Since UTIS is a federated search engine, general paging can be implemented in two ways: 

 1. The `PAGESIZE` limit defines the number of results in the whole response, aggregated from the individual responses of multiple checklists. Since the number of hits per checklist provider is unknown, every provider would always have to be asked to return `PAGESIZE` elements per request. With the current architecture of UTIS this can impose a big overhead on the overall request processing, which increases with the number of clients that are queried in parallel. 
 2. The `PAGESIZE` limit only affects the individual checklist queries. The unified response of UTIS contains all responses from the checklist providers. When UTIS queries `n` checklists, the UTIS response can in this case contain at most `n * PAGESIZE` response items.  

 Strategy **2.** seems to be the better option, since it does not cause performance problems. 
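
 A minimal sketch of strategy **2.**, for illustration only. It is not the UTIS implementation (UTIS itself is not written in Python), and the provider signature, `federated_search` and `make_fake_provider` are hypothetical names. The point it shows is that `page_size` is passed to each provider query separately, so the merged response can hold at most `n * page_size` items:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical provider signature: (query, page_index, page_size) -> list of hits.
Provider = Callable[[str, int, int], List[str]]


def federated_search(providers: Dict[str, Provider], query: str,
                     page_index: int, page_size: int) -> Dict[str, List[str]]:
    """Apply the page size to every provider query, not to the merged response."""
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        futures = {
            name: pool.submit(provider, query, page_index, page_size)
            for name, provider in providers.items()
        }
        # Truncate defensively in case a provider ignores the requested size.
        return {name: fut.result()[:page_size] for name, fut in futures.items()}


def make_fake_provider(hits: List[str]) -> Provider:
    """Stand-in for a checklist client, backed by an in-memory list."""
    def provider(query: str, page_index: int, page_size: int) -> List[str]:
        matching = [h for h in hits if query.lower() in h.lower()]
        start = page_index * page_size
        return matching[start:start + page_size]
    return provider


if __name__ == "__main__":
    providers = {
        "A": make_fake_provider(["Abies alba", "Abies grandis", "Abies nordmanniana"]),
        "B": make_fake_provider(["Abies alba", "Picea abies"]),
    }
    page = federated_search(providers, "abies", page_index=0, page_size=2)
    total = sum(len(items) for items in page.values())
    print(page, total)  # total never exceeds len(providers) * page_size
```

 The response is kept grouped by provider here, so a caller can tell which provider still has further pages to request.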
 

 ## Checklist provider services and paging: 

 * EUNIS: no limitation. **TODO** implement pager mechanism? 
 * CoL (via the name catalogue service): seems to have a limit of 100, but this is hard to hit with the fuzzy search service. **TODO** implement pager mechanism? 
 * WoRMS: maximum of 50 records by default, allows paging by using an `offset` parameter (see below for more details) **DONE** (pager implemented) 
 * PESI: maximum of 50 records by default; paging was initially not possible but is now supported (see below for more details) **DONE** (pager implemented) 

 ### WoRMS & PESI 

 Both are developed and maintained by VLIZ. 


 The PESI and WoRMS web services have a couple of operations where the number of returned records is limited to 50. 

 For WoRMS some operations have an optional 'offset' parameter by which paging through the full result set is possible: 

 * getAphiaRecords 
 * getAphiaRecordsByVernacular 
 * getAphiaRecordsByDate 
 * getAphiaChildrenByID 

 That's very useful. The PESI service, in contrast, was missing the ability to retrieve data beyond that 50-record limit. 

 I asked Bart Vanhoorne from VLIZ for implementation of the 'offset' parameter in PESI (2016-10-21): 

 > It would strongly increase the utility of the PESI service if you could implement the 'offset' parameter also for PESI. I guess this should not cause too much work, since this already has been done for WoRMS and both implementations seem to be very similar (at least from looking at the SOAP API). 

 His answer: 

 > OK, we'll put this on our todo list. 
 > I would say this will be done end 2016, begin 2017 

 This is implemented now. 
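
 With an `offset` parameter available, a client can page through the full result set in chunks of 50. The following is a rough sketch under stated assumptions, not the actual UTIS pager: `fetch_all` and `make_fake_service` are hypothetical helpers, the real SOAP call is replaced by an in-memory stand-in, and the offset is assumed to be the 1-based number of the first record to return, which should be checked against the service documentation.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int], List[dict]],
              page_size: int = 50,
              first_offset: int = 1,
              max_pages: int = 1000) -> List[dict]:
    """Collect all records by calling fetch_page with an increasing offset.

    fetch_page is a hypothetical callable that wraps one service operation
    (for example getAphiaRecords) and returns at most page_size records.
    Assumption: offset is the 1-based number of the first record wanted.
    """
    records: List[dict] = []
    offset = first_offset
    for _ in range(max_pages):  # hard stop so a misbehaving service cannot loop forever
        page = fetch_page(offset)
        records.extend(page)
        if len(page) < page_size:  # a short or empty page means the end was reached
            break
        offset += page_size
    return records


def make_fake_service(total: int, page_size: int = 50) -> Callable[[int], List[dict]]:
    """In-memory stand-in for the remote service, used only for this example."""
    data = [{"id": i} for i in range(1, total + 1)]

    def fetch_page(offset: int) -> List[dict]:
        start = offset - 1  # convert the assumed 1-based offset to a list index
        return data[start:start + page_size]

    return fetch_page


if __name__ == "__main__":
    print(len(fetch_all(make_fake_service(120))))  # three requests: 50 + 50 + 20 records
```

 When the total is an exact multiple of the page size, this approach costs one extra request that returns an empty page, which seems acceptable given that the total number of hits is not known in advance.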

 

 **Notify:** 

 * Scott Chamberlain <scott@ropensci.org> once this is done! 
