Project

General

Profile

bug #8812

Connection losses in taxeditor HTTP invoker (RemoteAccessException: Could not access HTTP invoker remote service at ...)

Added by Andreas Kohlbecker 6 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Highest
Category:
taxeditor
Target version:
Start date:
01/16/2020
Due date:
% Done:

50%

Severity:
critical
Found in Version:

Description

A couple of situations exist in which users report problems due to RemoteAccessExceptions ("Could not access HTTP invoker remote service at ...).

  1. First of all there is a correlation of these incidents with the network latency. Network latencies > 50ms already seems to be causing problems and connection loss. This has been reported by Katja. The connections to the US are having a much bigger latency (AW_Netzwerklatenzen nach Übersee.mbox)
  2. Connections going through the ZEDAT-VPN seem to be compromised by packet loss (AW_AW_Interessant_Could not access HTTP invoker remote service at .... bei lokalem server.mbox)
    • measuring the packet loss with wireshark shows a packet loss ratio of ~1% ==> Packet loss is not unusual in this case as the VPN access point may choose to drop packages to reduce the throughput for individual clients and connections. This is how rate limiting commonly is realized.
  3. It seems as if computation intensive and thus long running operations are increasing the risk for the occurrence of these incidents (see frequently affected operations below)

Useful links:

frequently affected operations

Deduplication

Hallo,
diesmal der Fehler schon bei der Deduplizierung von nur 2 Einträgen (noch von zuhause).
Herzlichen Gruß
Walter

login : w.berendsohn
editor version : 5.11.0
server : api.cybertaxonomy.org (cybertaxonomy.org) / caryophyllales_spp
schema version : 5.11.2.0.20191109
os : Windows 10 10.0 amd64
java : 1.8.0_221 

Taxon Navigator > Link to Taxon option

Dear Maryam,

please excuse the late reply – our entire university is closed until 5th of January.

I know that the “link with taxon” option is very important especially for you, because while sorting out the imported data you move around in the taxon tree quite a bit. I guess this is indeed a question related to your connection – this function is a rather complicated one, requiring some network traffic.

However, I had similar difficulties in general working with the editor from the US using a weak connection in November – is the “link” function the only problem you face? I think some measures have already been taken to remedy the problems I had. In addition, I understand that a new release is forthcoming in January that may incorporate measures to tackle your problem.

Please don’t hesitate to contact me if you need assistance with any other problems (also with respect to nomenclature etc.). Any progress on the paper?

Best wishes

Walter

From: Maryam malekmohammady maryam.malekmohammady@gmail.com
Sent: 13 January 2020 23:01
To: VL-BGBM-EditSupport EditSupport@BGBM.ORG
Cc: Berendsohn, Walter W.Berendsohn@bgbm.org
Subject: error

Dear all,

It is a while that "link with taxon" option does not work and I receive this copied error.

My internet is slow and may be it causes this problem.

I also have another problem" when link with taxon option is on and I move a taxon as a synonym to another taxon I have to reopen the classification using the triangle beside Caryophyllales.

I received this error while trying to find Limonium latifolium var. album.

regards
Maryam

login : m.malekmohammadi
editor version : 5.11.0
server : api.cybertaxonomy.org (cybertaxonomy.org) / caryophyllales_spp
schema version : 5.11.2.0.20191109
os : Windows 7 6.1 amd64
java : 1.8.0_231
org.eclipse.e4.core.di.InjectionException: org.springframework.remoting.RemoteAccessException: Could not access HTTP invoker remote service at [http://api.cybertaxonomy.org:80/caryophyllales_spp/remoting/taxonnode.service]; nested exception is java.lang.NullPointerException

Simulating network latency and package loss

I created a script which simplifies the application of various netem setting to simulate bad network conditions like packet delay, loss and low data transfer rate: https://gist.github.com/akohlbecker/network-condition-simulator.sh

For pure local testing you can start the cdm-server in a docker container and apply the settings to the according network bridge. To find the according interface device check the list of docker networks:

docker network ls 
NETWORK ID          NAME                         DRIVER              SCOPE
b7a585babf7f        bridge                       bridge              local
fb231843f4b4        caryophyllales_spp_default   bridge              local
a4b391fdf54d        host                         host                local
adf250e35cb8        none                         null                local
6b2b9489586d        phycobank_default            bridge              local

In the above example caryophyllales_spp is the name of the container to be used for testing, so the network id is fb231843f4b4. With this information we know that th network bridge should be br-fb231843f4b4 which can be proven with ifconfig

Problems found

1.

When simulating low bandwidth, 64kbit/s, org.apache.http.NoHttpResponseException .... failed to respond can be observed. This is casued by the server which may refude waiting any longer for packts from the client, the server may close the connection in this case. Usually the http client should be able to detect stale connections, a bug in HttpClient 4.4 is known which causes this Exception (to fail in https://stackoverflow.com/questions/10558791/apache-httpclient-interim-error-nohttpresponseexception#10600762) --> TODO Upgrade httpclient to the latest version.

AW_Netzwerklatenzen nach Übersee.mbox (36.8 KB) Andreas Kohlbecker, 01/16/2020 04:06 PM

AW_AW_Interessant_Could not access HTTP invoker remote service at .... bei lokalem server.mbox (113 KB) Andreas Kohlbecker, 01/16/2020 04:06 PM


Related issues

Related to Edit - bug #8842: CdmServiceRequestExecutor is not stateleass and neither thread save nor access synchronized Resolved 02/04/2020
Related to Edit - task #8844: distinguish SocketTimeoutException from ConnectException Closed 02/05/2020
Related to Edit - bug #8848: Search fails with low data rate network (64kbit/s) network connection due to lost authentication Feedback 02/14/2020

Associated revisions

Revision 7a05f578 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 proper exception handling in RequestExecutor for more meaningful error messages

Revision 2c520578 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 replacing deprecated usage of apache HttpClient by new classes

Revision 82227023 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 upgrading httpcomponents to 4.5.11

Revision 76f6b44c (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 upgrading httpcomponents to 4.5.11 - cleanup

Revision d9511145 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 connect request timeouts and more doc on timeouts

Revision b83b73a7 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 logging of request duration in case of socket timeouts

Revision c1ad99d9 (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 managing dependencies via maven, removing duplicate dependencies, updating httpclient

Revision 9422faca (diff)
Added by Andreas Kohlbecker 5 months ago

ref #8812 option for measuring remoting method execution duration

Revision e2766ed7 (diff)
Added by Andreas Kohlbecker 4 months ago

ref #8812 setting NoConnectionReuseStrategy for PoolingHttpClientConnectionManager to avoid NoHttpResponseExceptions

History

#1 Updated by Andreas Kohlbecker 6 months ago

  • Description updated (diff)

#3 Updated by Andreas Kohlbecker 6 months ago

  • Description updated (diff)

#4 Updated by Andreas Kohlbecker 6 months ago

  • Description updated (diff)

#5 Updated by Andreas Kohlbecker 5 months ago

  • Tags changed from performance to performance, RemoteAccessException

#6 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)
  • Status changed from New to In Progress

#7 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)

#8 Updated by Andreas Kohlbecker 5 months ago

an analysis of the reported RemoteAccessException as found in my Emails shows that the exceptions are having specific services as hot spots:

      4 /remoting/classification.service
     87 /remoting/common.service
     11 /remoting/featurenode.service
      2 /remoting/io.service
      7 /remoting/name.service
      2 /remoting/occurrence.service
      2 /remoting/progressmonitor.service
     32 /remoting/taxonnode.service
      2 /remoting/taxon.service
     17 /remoting/termtree.service
     11 /remoting/vocabulary.service

In eu.etaxonomy.taxeditor.cdmlib/src/main/resources/eu/etaxonomy/cdm/httpInvokerServiceClients.xml the readtimeout is configured for the client side remoting services. Howserver some, services like taxonnode.service are missing but this does not account for all services causing problems.

#9 Updated by Andreas Kohlbecker 5 months ago

  • Related to bug #8842: CdmServiceRequestExecutor is not stateleass and neither thread save nor access synchronized added

#10 Updated by Andreas Kohlbecker 5 months ago

  • Related to task #8844: distinguish SocketTimeoutException from ConnectException added

#11 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)

#12 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)
  • % Done changed from 0 to 10

#13 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)

#14 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)

#15 Updated by Andreas Müller 5 months ago

I don't think the dependency handling for httpcomponents is correct. Dependencies inherrited from cdmlib are handled in the cdmlib module in MANIFEST.MF - Runtime, which refers to the according versions in cdmlib. There we already use v 4.5.1 (see cdmlib-parent/pom.xml). Is the problem fixed in 4.5.1 already or do we need to upgrade further in cdmlib?

#16 Updated by Andreas Kohlbecker 5 months ago

Andreas Müller wrote:

I don't think the dependency handling for httpcomponents is correct. Dependencies inherited from cdmlib are handled in the cdmlib module in MANIFEST.MF - Runtime, which refers to the according versions in cdmlib. There we already use v 4.5.1 (see cdmlib-parent/pom.xml). Is the problem fixed in 4.5.1 already or do we need to upgrade further in cdmlib?

You are right the version set in cdmlib is 4.5.1, but this version is also causing problems, terefore i upgraded to the lastest version (4.5.11) which helped in the end.

Even if the dependencies are OK for the taxeditor it makes sense to upgrade httpcomponents alos in cdmlib. But this would be subject of another ticket.

#17 Updated by Andreas Müller 5 months ago

Andreas Kohlbecker wrote:

Andreas Müller wrote:

I don't think the dependency handling for httpcomponents is correct. Dependencies inherited from cdmlib are handled in the cdmlib module in MANIFEST.MF - Runtime, which refers to the according versions in cdmlib. There we already use v 4.5.1 (see cdmlib-parent/pom.xml). Is the problem fixed in 4.5.1 already or do we need to upgrade further in cdmlib?

You are right the version set in cdmlib is 4.5.1, but this version is also causing problems, terefore i upgraded to the lastest version (4.5.11) which helped in the end.

Even if the dependencies are OK for the taxeditor it makes sense to upgrade httpcomponents alos in cdmlib. But this would be subject of another ticket.

But the way you updated the dependencies in TaxEditor is not correct. If we do dependency handling at multiple places at the same time we run into a big chaos. We really should do it via MANIFEST.MF in cdmlib module.

#18 Updated by Andreas Kohlbecker 5 months ago

Andreas Müller wrote:

Andreas Kohlbecker wrote:

Andreas Müller wrote:

I don't think the dependency handling for httpcomponents is correct. Dependencies inherited from cdmlib are handled in the cdmlib module in MANIFEST.MF - Runtime, which refers to the according versions in cdmlib. There we already use v 4.5.1 (see cdmlib-parent/pom.xml). Is the problem fixed in 4.5.1 already or do we need to upgrade further in cdmlib?

You are right the version set in cdmlib is 4.5.1, but this version is also causing problems, terefore i upgraded to the lastest version (4.5.11) which helped in the end.

Even if the dependencies are OK for the taxeditor it makes sense to upgrade httpcomponents alos in cdmlib. But this would be subject of another ticket.

But the way you updated the dependencies in TaxEditor is not correct. If we do dependency handling at multiple places at the same time we run into a big chaos. We really should do it via MANIFEST.MF in cdmlib module.

Good point, I really forgot to adapt MANIFEST.MF and the build.properties of eu.etaxonomy.taxeditor.cdmlib but the way dependencies are managed in the taxeditor is not correct anyway. For a lot of dependencies we are still copying the jars into the source code tree - this is bad practice. Jars should be managed and copied through maven only, whereas MANIFEST.MF and build.properties refer to the final target location of the jars.

Currently two different versions of the httplient libs are currently defined for the cdmlib project: as osgi bundle org.apache.commons.httpclient;bundle-version="3.1.0" and as good old jar the version 4.5.1, this needs to be fixed. I will remove the bundle, since this is the older version and seems to be unused.

#19 Updated by Andreas Kohlbecker 5 months ago

Andreas Kohlbecker wrote:

Andreas Müller wrote:

Andreas Kohlbecker wrote:

Andreas Müller wrote:

I don't think the dependency handling for httpcomponents is correct. Dependencies inherited from cdmlib are handled in the cdmlib module in MANIFEST.MF - Runtime, which refers to the according versions in cdmlib. There we already use v 4.5.1 (see cdmlib-parent/pom.xml). Is the problem fixed in 4.5.1 already or do we need to upgrade further in cdmlib?

You are right the version set in cdmlib is 4.5.1, but this version is also causing problems, terefore i upgraded to the lastest version (4.5.11) which helped in the end.

Even if the dependencies are OK for the taxeditor it makes sense to upgrade httpcomponents alos in cdmlib. But this would be subject of another ticket.

But the way you updated the dependencies in TaxEditor is not correct. If we do dependency handling at multiple places at the same time we run into a big chaos. We really should do it via MANIFEST.MF in cdmlib module.

Good point, I really forgot to adapt MANIFEST.MF and the build.properties of eu.etaxonomy.taxeditor.cdmlib but the way dependencies are managed in the taxeditor is not correct anyway. For a lot of dependencies we are still copying the jars into the source code tree - this is bad practice. Jars should be managed and copied through maven only, whereas MANIFEST.MF and build.properties refer to the final target location of the jars.

Currently two different versions of the httplient libs are currently defined for the cdmlib project: as osgi bundle org.apache.commons.httpclient;bundle-version="3.1.0" and as good old jar the version 4.5.1, this needs to be fixed. I will remove the bundle, since this is the older version and seems to be unused.

DONE: taxeditor|c1ad99d9

#20 Updated by Andreas Kohlbecker 5 months ago

With httpcomponentes updated to version 4.5.11 the taxeditor works much better.

However after making an accepted taxon to a synonym a RemoteAccessException again occured:

last remote method : http://localhost:8082/cdm-db/remoting/taxonnode.service
last remote request client time : 2020-02-13T11:47:47.083
last remote request response header time : Thu, 13 Feb 2020 10:47:46 GMT
client error time : 2020-02-13T11:47:47.42
login : admin
editor version : 5.13.0.qualifier
server : localhost (docker:8082) / cdm-db
schema version : 5.12.0.0.20191202
os : Linux 5.3.0-28-generic amd64
java : 1.8.0_131
...
Caused by: org.springframework.remoting.RemoteAccessException: Could not access HTTP invoker remote service at [http://localhost:8082/cdm-db/remoting/taxonnode.service]; nested exception is org.apache.http.NoHttpResponseException: localhost:8082 failed to respond
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.convertHttpInvokerAccessException(HttpInvokerClientInterceptor.java:216)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:147)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:208)
    at com.sun.proxy.$Proxy47.listChildNodesAsTaxonNodeDto(Unknown Source)
    at eu.etaxonomy.taxeditor.util.TaxonTreeNodeContentProvider.getChildren(TaxonTreeNodeContentProvider.java:68)
    at eu.etaxonomy.taxeditor.navigation.navigator.e4.TaxonNavigatorContentProviderE4.getChildren(TaxonNavigatorContentProviderE4.java:30)
    ...
    at eu.etaxonomy.taxeditor.store.CdmUIDataChangeService.fireChangeEvent(CdmUIDataChangeService.java:41)
    at eu.etaxonomy.taxeditor.operation.RemotingCdmOperation.fireDataChangeEvent(RemotingCdmOperation.java:70)
    at eu.etaxonomy.taxeditor.operation.RemotingCdmUpdateOperation.postExecute(RemotingCdmUpdateOperation.java:86)
    at eu.etaxonomy.taxeditor.operation.RemotingCdmOperation.execute(RemotingCdmOperation.java:51)
    at eu.etaxonomy.taxeditor.model.AbstractUtility.lambda$4(AbstractUtility.java:199)
    at org.eclipse.ui.internal.UILockListener.doPendingWork(UILockListener.java:162)
    at org.eclipse.ui.internal.UISynchronizer$3.run(UISynchronizer.java:154)
    at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
    at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:182)
    ... 24 more
Caused by: org.apache.http.NoHttpResponseException: localhost:8082 failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at org.springframework.remoting.httpinvoker.HttpComponentsHttpInvokerRequestExecutor.executeHttpPost(HttpComponentsHttpInvokerRequestExecutor.java:338)
    at eu.etaxonomy.taxeditor.service.TimestampingHttpInvokerRequestExecutor.executeHttpPost(TimestampingHttpInvokerRequestExecutor.java:38)
    at org.springframework.remoting.httpinvoker.HttpComponentsHttpInvokerRequestExecutor.doExecuteRequest(HttpComponentsHttpInvokerRequestExecutor.java:229)
    at org.springframework.remoting.httpinvoker.AbstractHttpInvokerRequestExecutor.executeRequest(AbstractHttpInvokerRequestExecutor.java:138)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:194)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:176)
    at org.springframework.remoting.httpinvoker.CachingHttpInvokerProxyFactoryBean.doExecuteRequest(CachingHttpInvokerProxyFactoryBean.java:158)
    at org.springframework.remoting.httpinvoker.CachingHttpInvokerProxyFactoryBean.handleGeneralRequest(CachingHttpInvokerProxyFactoryBean.java:93)
    at org.springframework.remoting.httpinvoker.CachingHttpInvokerProxyFactoryBean.executeRequest(CachingHttpInvokerProxyFactoryBean.java:81)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:144)
    ... 65 more

Taxeditor log of turning the taxon into a synonym:

11:47:46,171  INFO CachingHttpInvokerProxyFactoryBean:161 - http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode [80180.0 ms]
11:47:47,075  INFO CachingHttpInvokerProxyFactoryBean:161 - http://localhost:8082/cdm-db/remoting/common.service#findWithUpdate [880.0 ms]
11:47:47,104 ERROR CachingHttpInvokerProxyFactoryBean:167 - (org.apache.http.NoHttpResponseException: "localhost:8082 failed to respond") interrupts http://localhost:8082/cdm-db/remoting/taxonnode.service#listChildNodesAsTaxonNodeDto after [22.0 ms]
11:47:47,420  INFO CachingHttpInvokerProxyFactoryBean:161 - http://localhost:8082/cdm-db/remoting-public/metadata.service#getDbSchemaVersion [306.0 ms]

The makeTaxonNodeASynonymOfAnotherTaxonNode has completed and the problem occured later during the refresh of the taxon navigator

#21 Updated by Andreas Kohlbecker 5 months ago

The above error is perfectly reproducible with limited rate networks (64kbit/s):

15:16:06,962  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode [61045.0 ms]
15:16:07,849  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/common.service#findWithUpdate [865.0 ms]
15:16:07,901 ERROR CachingHttpInvokerProxyFactoryBean:168 - (org.apache.http.NoHttpResponseException: "localhost:8082 failed to respond") interrupts http://localhost:8082/cdm-db/remoting/taxonnode.service#listChildNodesAsTaxonNodeDto after [46.0 ms]
15:16:08,181  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting-public/metadata.service#getDbSchemaVersion [276.0 ms]

The NoHttpResponseException always occurs during the request to taxonnode.service#listChildNodesAsTaxonNodeDto !

#22 Updated by Andreas Kohlbecker 5 months ago

  • Related to bug #8848: Search fails with low data rate network (64kbit/s) network connection due to lost authentication added

#23 Updated by Andreas Kohlbecker 5 months ago

  • Description updated (diff)

#24 Updated by Andreas Kohlbecker 5 months ago

Andreas Kohlbecker wrote:

The above error is perfectly reproducible with limited rate networks (64kbit/s):

15:16:06,962  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode [61045.0 ms]
15:16:07,849  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/common.service#findWithUpdate [865.0 ms]
15:16:07,901 ERROR CachingHttpInvokerProxyFactoryBean:168 - (org.apache.http.NoHttpResponseException: "localhost:8082 failed to respond") interrupts http://localhost:8082/cdm-db/remoting/taxonnode.service#listChildNodesAsTaxonNodeDto after [46.0 ms]
15:16:08,181  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting-public/metadata.service#getDbSchemaVersion [276.0 ms]

The NoHttpResponseException always occurs during the request to taxonnode.service#listChildNodesAsTaxonNodeDto !

This changed after restarting the taxeditor and cdm server

#25 Updated by Andreas Kohlbecker 5 months ago

Result from a detailed analysis of the client and server behavior during communication over a local connection with simulated low data rate (64kbit/s) and high latency (200ms +/-20ms) which ends in a NoHttpResponseException:

To generate the according log files the log4j.properties need to contain the following configuration:

taxeditor:

# logging of remote service execution duratations
# (requires setting -Dremoting.httpinvoker.measureDuration )
log4j.logger.org.springframework.remoting.httpinvoker = DEBUG
# http client connection handling and management
log4j.logger.org.apache.http.impl = DEBUG
# details on I/O
#log4j.logger.org.apache.commons.httpclient = DEBUG, stdout

cdmserver:

## Server connection management
log4j.logger.org.eclipse.jetty.server = DEBUG
log4j.logger.org.eclipse.jetty.io=DEBUG
log4j.logger.eu.etaxonomy.cdm.server = INFO

This is the combined excerpt of both logs. The entries are ordered by time (time is synchronized since client and server have been running on the same machine). The log entries from the cdmserver are indented for better readability:

15:19:01,587 DEBUG CachingHttpInvokerProxyFactoryBean:68 - Remote invoking : http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode
15:19:01,588 DEBUG PoolingHttpClientConnectionManager:267 - Connection request: [route: {}->http://localhost:8082][total available: 1; route allocated: 1 of 5; total allocated: 1 of 100]
15:19:01,588 DEBUG PoolingHttpClientConnectionManager:312 - Connection leased: [id: 56][route: {}->http://localhost:8082][total available: 0; route allocated: 1 of 5; total allocated: 1 of 100]
15:19:01,588 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-56: set socket timeout to 0
15:19:01,588 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-56: set socket timeout to 70000
15:19:01,588 DEBUG MainClientExec:255 - Executing request POST /cdm-db/remoting/taxonnode.service HTTP/1.1
15:19:01,588 DEBUG MainClientExec:266 - Proxy auth state: UNCHALLENGED

    cdm-server_1  | 14:19:01,590 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - REQUEST for //localhost:8082/cdm-db/remoting/taxonnode.service on HttpChannelOverHttp@4f9f538d{s=HttpChannelState@62679ccd{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=2,c=false
    /false,a=IDLE,uri=//localhost:8082/cdm-db/remoting/taxonnode.service,age=0}
    cdm-server_1  | POST //localhost:8082/cdm-db/remoting/taxonnode.service HTTP/1.1                  
    cdm-server_1  | Accept-Encoding: gzip            
    cdm-server_1  | Authorization: Basic YWRtaW46a3VwczM2NitSVQ==                                     
    cdm-server_1  | Content-Length: 3061             
    cdm-server_1  | Content-Type: application/x-java-serialized-object                                
    cdm-server_1  | Host: localhost:8082             
    cdm-server_1  | Connection: keep-alive           
    cdm-server_1  | User-Agent: Apache-HttpClient/4.5.11 (Java/1.8.0_131)                             
    cdm-server_1  | Cookie: JSESSIONID=node06vduxyotame8qhi71qyzy14711.node0                          
    cdm-server_1  |                                  
    cdm-server_1  |
    ....
    cdm-server_1  | [cdm-db] 14:19:01,593 DEBUG [qtp951050903-34][eu.et.cd.ap.re.DebuggingRemoteInvocationExecutor] - invoking: eu.etaxonomy.cdm.api.service.ITaxonNodeService#makeTaxonNodeASynonymOfAnotherTaxonNode([java.util.UUID, java.util.UUID, eu.etaxonomy.cdm.model.taxon.SynonymType, eu.etaxonomy.cdm.model.reference.Reference, java.lang.String, boolean])
    ...
    cdm-server_1  | [cdm-db] 14:19:03,875 DEBUG [qtp951050903-34][eu.et.cd.ap.re.DebuggingRemoteInvocationExecutor] - invocation: eu.etaxonomy.cdm.api.service.ITaxonNodeService#makeTaxonNodeASynonymOfAnotherTaxonNode([java.util.UUID, java.util.UUID, eu.etaxonomy.cdm.model.taxon.SynonymType, eu.etaxonomy.cdm.model.reference.Reference, java.lang.String, boolean]) completed [2282 ms]
    ...
    cdm-server_1  | 14:19:03,880 DEBUG [qtp951050903-34][or.ec.je.se.Request] - Response Request(POST //localhost:8082/cdm-db/remoting/taxonnode.service)@36198c9f committing for session     
    cdm-server_1  | 14:19:03,893 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - sendResponse info=null content=DirectByteBuffer@185d77cc[p=0,l=32768,c=32768,r=32768]={<<<~\x00?\x00\xFf\xFf\xFf\xFf\x00\x00q\x00~\x03Bq\x00~\x03Vq\x00~\x02\x0c.../cdm/model/location/Nam>>>} complete=false committing=false callback=Blocker@58153eec{null}
    ... sending data 

15:19:04,259 DEBUG MainClientExec:285 - Connection can be kept alive indefinitely

    cdm-server_1  | 15:19:15,647 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannelState] - unhandle HttpChannelState@62679ccd{s=HANDLING rs=BLOCKING os=COMPLETED is=IDLE awp=false se=false i=true al=0} 
    cdm-server_1  | 15:19:15,647 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannelState] - nextAction(false) COMPLETE HttpChannelState@62679ccd{s=HANDLING rs=COMPLETING os=COMPLETED is=IDLE awp=false se=false i=false al=0}
    cdm-server_1  | 15:19:15,647 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - action COMPLETE HttpChannelOverHttp@4f9f538d{s=HttpChannelState@62679ccd{s=HANDLING rs=COMPLETING os=COMPLETED is=IDLE awp=false se=false i=false al=0},r=2,c=false/true,a=HANDLING,uri=//localhost:8082/cdm-db/remoting/taxonnode.service,age=14057}
    cdm-server_1  | 15:19:15,647 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannelState] - completed HttpChannelState@62679ccd{s=HANDLING rs=COMPLETING os=COMPLETED is=IDLE awp=false se=false i=false al=0}
    cdm-server_1  | 15:19:15,647 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannelState] - unhandle HttpChannelState@62679ccd{s=HANDLING rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0}
    cdm-server_1  | 15:19:15,648 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannelState] - nextAction(false) TERMINATED HttpChannelState@62679ccd{s=IDLE rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0}
    cdm-server_1  | 15:19:15,648 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - action TERMINATED HttpChannelOverHttp@4f9f538d{s=HttpChannelState@62679ccd{s=IDLE rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0},r=2,c=true/true,a=IDLE,uri=//localhost:8082/cdm-db/remoting/taxonnode.service,age=14058}
    cdm-server_1  | 15:19:15,648 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - onCompleted for /cdm-db/remoting/taxonnode.service written=462692                                                   
    cdm-server_1  | 15:19:15,648 DEBUG [qtp951050903-34][or.ec.je.se.Request] - Request Request[POST //localhost:8082/cdm-db/remoting/taxonnode.service]@36198c9f leaving session Session@12cd7af5{id=node06vduxyotame8qhi71qyzy14711,x=node06vduxyotame8qhi71qyzy14711.node0,req=1,res=true}
    cdm-server_1  | 15:19:15,648 DEBUG [qtp951050903-34][or.ec.je.se.session] - Complete called with session Session@12cd7af5{id=node06vduxyotame8qhi71qyzy14711,x=node06vduxyotame8qhi71qyzy14711.node0,req=1,res=true}

    ... sending complete after 15,648 - 03,880 = 11,768 seconds

    cdm-server_1  | 15:19:45,650 DEBUG [Connector-Scheduler-c333c60-1][or.ec.je.io.IdleTimeout] - SocketChannelEndPoint@7d7be04d{/172.18.0.1:55018<->/172.18.0.315,648:8080,OPEN,fill=FI,flush=-,to=30000/30000}{io=1/1,kio=1,kro=4}->HttpConnection@7123c908[p=HttpParser{s=START,0 of -1},g=HttpGenerator@537836e9{s=START}]=>HttpChannelOverHttp@4f9f538d{s=HttpChannelState@62679ccd{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=2,c=false/false,a=IDLE,uri=null,age=0} idle timeout check, elapsed: 30000 ms, remaining: 0 ms
    cdm-server_1  | 15:19:45,651 DEBUG [Connector-Scheduler-c333c60-1][or.ec.je.io.IdleTimeout] - SocketChannelEndPoint@7d7be04d{/172.18.0.1:55018<->/172.18.0.3:8080,OPEN,fill=FI,flush=-,to=30001/30000}{io=1/1,kio=1,kro=4}->HttpConnection@7123c908[p=HttpParser{s=START,0 of -1},g=HttpGenerator@537836e9{s=START}]=>HttpChannelOverHttp@4f9f538d{s=HttpChannelState@62679ccd{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=2,c=false/false,a=IDLE,uri=null,age=0} idle timeout expired

    ... 30 seconds later timeout of the connection on server side


15:20:04,457 DEBUG PoolingHttpClientConnectionManager:344 - Connection [id: 56][route: {}->http://localhost:8082] can be kept alive indefinitely
15:20:04,457 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-56: set socket timeout to 0
15:20:04,458 DEBUG PoolingHttpClientConnectionManager:351 - Connection released: [id: 56][route: {}->http://localhost:8082][total available: 1; route allocated: 1 of 5; total allocated: 1 of 100]
15:20:04,458 DEBUG MainClientExec:145 - Cancelling request execution
15:20:04,459  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode [62872.0 ms]

... client received data 62.872 seconds after sending the HTTP POST (due to the simulated bad network condition 64kbit/s + 200 ms latency)

15:20:05,358  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/common.service#findWithUpdate [881.0 ms]

... preparing for next request about 1 second later ...

15:20:05,362 DEBUG CachingHttpInvokerProxyFactoryBean:68 - Remote invoking : http://localhost:8082/cdm-db/remoting/taxonnode.service#listChildNodesAsTaxonNodeDto
15:20:05,364 DEBUG PoolingHttpClientConnectionManager:267 - Connection request: [route: {}->http://localhost:8082][total available: 1; route allocated: 1 of 5; total allocated: 1 of 100]
15:20:05,364 DEBUG PoolingHttpClientConnectionManager:312 - Connection leased: [id: 56][route: {}->http://localhost:8082][total available: 0; route allocated: 1 of 5; total allocated: 1 of 100]
15:20:05,364 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-56: set socket timeout to 0
15:20:05,365 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-56: set socket timeout to 70000
15:20:05,365 DEBUG MainClientExec:255 - Executing request POST /cdm-db/remoting/taxonnode.service HTTP/1.1
15:20:05,366 DEBUG DefaultManagedHttpClientConnection:79 - http-outgoing-56: Close connection
15:20:05,367 DEBUG DefaultManagedHttpClientConnection:96 - http-outgoing-56: Shutdown connection
15:20:05,367 DEBUG MainClientExec:129 - Connection discarded
15:20:05,367 DEBUG PoolingHttpClientConnectionManager:351 - Connection released: [id: 56][route: {}->http://localhost:8082][total available: 0; route allocated: 0 of 5; total allocated: 0 of 100]
15:20:05,386 DEBUG MainClientExec:145 - Cancelling request execution
15:20:05,386 ERROR CachingHttpInvokerProxyFactoryBean:168 - (org.apache.http.NoHttpResponseException: "localhost:8082 failed to respond") interrupts http://localhost:8082/cdm-db/remoting/taxonnode.service#listChildNodesAsTaxonNodeDto after [23.0 ms]

#26 Updated by Andreas Kohlbecker 5 months ago

Another combined logfile which shows the time stamps of the last package at the client side:

    cdm-server_1  | 14:02:24,413 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - REQUEST for //localhost:8082/cdm-db/remoting/taxonnode.service on HttpChannelOverHttp@286209c0{s=HttpChannelState@f777417{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=2,c=false/false,a=IDLE,uri=//localhost:8082/cdm-db/remoting/taxonnode.service,age=0}
    ...
    cdm-server_1  | [cdm-db] 14:02:25,420 DEBUG [qtp951050903-34][eu.et.cd.ap.re.DebuggingRemoteInvocationExecutor] - invocation: eu.etaxonomy.cdm.api.service.ITaxonNodeService#makeTaxonNodeASynonymOfAnotherTaxonNode([java.util.UUID, java.util.UUID, eu.etaxonomy.cdm.model.taxon.SynonymType, eu.etaxonomy.cdm.model.reference.Reference, java.lang.String, boolean]) completed [999 ms]
    cdm-server_1  | 14:02:25,453 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - sendResponse info=null content=DirectByteBuffer@2846ed97[p=0,l=32768,c=32768,r=32768]={<<<\xAc\xEd\x00\x05sr\x00;org.springframewo...e~q\x00~\x01Yt\x00\x07Articlept\x00\x0228>>>} complete=false committing=true callback=Blocker@4ab8791f{null}
    cdm-server_1  | 14:02:25,453 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - COMMIT for /cdm-db/remoting/taxonnode.service on HttpChannelOverHttp@286209c0{s=HttpChannelState@f777417{s=HANDLING rs=BLOCKING os=COMMITTED is=IDLE awp=false se=false i=true al=0},r=2,c=false/false,a=HANDLING,uri=//localhost:8082/cdm-db/remoting/taxonnode.service,age=1040}
    ....
    cdm-server_1  | 14:02:36,849 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - sendResponse info=null content=HeapByteBuffer@18d52943[p=0,l=0,c=0,r=0]={<<<>>>} complete=false committing=false callback=Blocker@4ab8791f{null}
    ....
    cdm-server_1  | 14:02:36,850 DEBUG [qtp951050903-34][or.ec.je.se.HttpChannel] - onCompleted for /cdm-db/remoting/taxonnode.service written=426774
    cdm-server_1  | 14:02:36,850 DEBUG [qtp951050903-34][or.ec.je.se.Request] - Request Request[POST //localhost:8082/cdm-db/remoting/taxonnode.service]@d642b32 leaving session Session@6dc61c3e{id=node0uzok2n9s07hx1ovmsfb3wtkjw8,x=node0uzok2n9s07hx1ovmsfb3wtkjw8.node0,req=1,res=true}
    ...
    cdm-server_1  | 14:03:06,853 DEBUG [Connector-Scheduler-c333c60-1][or.ec.je.io.IdleTimeout] - SocketChannelEndPoint@7736ae89{/172.18.0.1:36020<->/172.18.0.3:8080,OPEN,fill=FI,flush=-,to=30001/30000}{io=1/1,kio=1,kro=4}->HttpConnection@7a796292[p=HttpParser{s=START,0 of -1},g=HttpGenerator@38f9c9b2{s=START}]=>HttpChannelOverHttp@286209c0{s=HttpChannelState@f777417{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=2,c=false/false,a=IDLE,uri=null,age=0} idle timeout expired
    ...

14:03:21,368 DEBUG wire:73 - http-outgoing-33 << "[0x0]n[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0xb6][0x0][0x0][0x0][0x0]w[0x4][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]n[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0xb6][0x0][0x0][0x0][0x0]w[0x4][0x0][0x0][0x0][0x0]xpsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xt[0x0] ??? sec. Schlumpberger, B., 2012[0x0][0x1][0x0]ppq[0x0]~[0x3][0x89]p[0x0]sq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0x88]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0][0xc][0x0][0x0],[0x81]sq[0x0]~[0x0][0x1d][0x0][0x0][0x1]b[0x1f][0xf7]/`q[0x0]~[0x0]"q[0x0]~[0x0]'sq[0x0]~[0x0]:[0x0][0x0][0x0][0x2]pq[0x0]~G[0xa7]pxsq[0x0]~[0x0]=[0x8b]ZTVYA[0x8e]8y[0x92][0x94] [0xb9]VH[0xf8]sq[0x0]~[0x0][0x1d][0x0][0x0][0x1]p][0x86][0xee][0xe8]q[0x0]~[0x0]"q[0x0]~[0x0]Ysq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0xa7]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0xa7]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]x[0x0][0x0][0x0][0x0][0x0][0x0][0x0]sq[0x0]~[0x0]?[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0xa7]pppsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0]n[0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0xa7]pppsq[0x0]~[0x0][0xb6][0x0][0x0][0x0][0x0]w[0x4][0x0][0x0][0x0][0x0]xpsq[0x0]~[0x0][0xcf][0x0][0xff][0xff][0xff][0xff][0x0][0x1]pq[0x0]~G[0xa7]pppsq[0x0]~[0x0][0x97]?@[0x0][0x0][0x0][0x0][0x0][0x0]w[0x8][0x0][0x0][0x0][0x10][0x0][0x0][0x0][0x0]xpppppppxsq[0x0]~[0x0][0xb6][0x0][0x0][0x0][0x0]w[0x4][0x0][0x0][0x0][0x0]xsq[0x0]~[0x0][0x9f]w[0xc][0x0][0x0][0x0][0x10]?@[0x0][0x0][0x0][0x0][0x0][0x0]x[\r][\n]"
14:03:21,368 DEBUG wire:73 - http-outgoing-33 << "0[\r][\n]"
14:03:21,368 DEBUG wire:73 - http-outgoing-33 << "[\r][\n]"

... final packet received at client side  44,518 seconds after the HttpChannel has completed on the server. The idle timeout for the connection at the server has yet expired 29,997 before.

14:03:21,374 DEBUG PoolingHttpClientConnectionManager:344 - Connection [id: 33][route: {}->http://localhost:8082] can be kept alive indefinitely
14:03:21,374 DEBUG DefaultManagedHttpClientConnection:88 - http-outgoing-33: set socket timeout to 0
14:03:21,375 DEBUG PoolingHttpClientConnectionManager:351 - Connection released: [id: 33][route: {}->http://localhost:8082][total available: 1; route allocated: 1 of 5; total allocated: 1 of 100]
14:03:21,375 DEBUG MainClientExec:145 - Cancelling request execution
14:03:21,375  INFO CachingHttpInvokerProxyFactoryBean:162 - http://localhost:8082/cdm-db/remoting/taxonnode.service#makeTaxonNodeASynonymOfAnotherTaxonNode [56972.0 ms]

#27 Updated by Andreas Kohlbecker 4 months ago

From the current point of view there are following options to solve this problem:

  1. Configure httclient to never reuse connections builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
  2. Using OkHttp, which is supposed to gracefully recover from situations when http connection becomes stale.

Option 1. would mean to lose the benefits of persistent connections!

#28 Updated by Andreas Kohlbecker 4 months ago

  • Status changed from In Progress to Resolved
  • Assignee changed from Andreas Kohlbecker to Andreas Müller
  • % Done changed from 10 to 50

After setting the NoConnectionReuseStrategy for PoolingHttpClientConnectionManager (taxeditor|e2766ed7) I could no longer reproduce the NoHttpResponseExceptions. So I am passing this ticket on for heavy testing by someone else.

The second option "2. Using OkHttp, which is supposed to gracefully recover from situations when http connection becomes stale." could be beneficial when it comes to situations when more connections are needed in short periods of time (many users or request intensive operations). But for now is suggest to live with the current solution since we would have to implement an test a OKHttpInvokerRequestExecutor similar to the HttpComponentsHttpInvokerRequestExecutor

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 40 MB)