Plugin isolation support in Elasticsearch

As I blogged yesterday, I recently discovered a limitation in the Elasticsearch architecture regarding the isolation of plugins. Every plugin and its libraries are added to the same Java ClassLoader during startup, so all the plugins share the same resources and class definitions.

Observation

I ran into this while developing and testing two plugins: one for indexing documents stored on Google Drive, the other for indexing documents stored on Amazon S3. Unfortunately, each one pulls in Apache httpclient through its Maven dependencies: version 4.0.1 is used by the Google SDK and version 4.1 by the Amazon SDK.

So when you start Elasticsearch with both installed, you end up with a beautiful exception like the following:

laurent@ponyo:~/dev/elasticsearch-1.0.0.Beta1-SNAPSHOT$ bin/elasticsearch -f
[2013-06-13 22:25:29,044][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: initializing ...
[2013-06-13 22:25:29,144][INFO ][plugins                  ] [Brother Tode] loaded [river-twitter, river-google-drive, mapper-attachments, river-amazon-s3], sites [head]
[2013-06-13 22:25:31,989][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: initialized
[2013-06-13 22:25:31,989][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: starting ...
[2013-06-13 22:25:32,131][INFO ][transport                ] [Brother Tode] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.80:9300]}
[2013-06-13 22:25:35,187][INFO ][cluster.service          ] [Brother Tode] new_master [Brother Tode][LSvX2bRIRCWsQGcqvvvC7Q][inet[/192.168.1.80:9300]], reason: zen-disco-join (elected_as_master)
[2013-06-13 22:25:35,233][INFO ][discovery                ] [Brother Tode] elasticsearch/LSvX2bRIRCWsQGcqvvvC7Q
[2013-06-13 22:25:35,304][INFO ][http                     ] [Brother Tode] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.80:9200]}
[2013-06-13 22:25:35,305][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: started
[2013-06-13 22:25:36,339][INFO ][gateway                  ] [Brother Tode] recovered [3] indices into cluster_state
[2013-06-13 22:25:38,429][WARN ][river                    ] [Brother Tode] failed to create river [amazon-s3][s3docs]
org.elasticsearch.common.inject.CreationException: Guice creation errors:

1) Error injecting constructor, java.lang.NoSuchMethodError: org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager: method <init>()V not found
  at com.github.lbroudoux.elasticsearch.river.s3.river.S3River.<init>(Unknown Source)
  while locating com.github.lbroudoux.elasticsearch.river.s3.river.S3River
  while locating org.elasticsearch.river.River

1 error
	at org.elasticsearch.common.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:344)
	at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:178)
	at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
	at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:132)
	at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:66)
	at org.elasticsearch.river.RiversService.createRiver(RiversService.java:138)
	at org.elasticsearch.river.RiversService$ApplyRivers$2.onResponse(RiversService.java:270)
	at org.elasticsearch.river.RiversService$ApplyRivers$2.onResponse(RiversService.java:1)
	at org.elasticsearch.action.support.TransportAction$ThreadedActionListener$1.run(TransportAction.java:87)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NoSuchMethodError: org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager: method <init>()V not found
	at com.amazonaws.http.ConnectionManagerFactory.createThreadSafeClientConnManager(ConnectionManagerFactory.java:26)
	at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:95)
	at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:118)
	at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:65)
	at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:298)
	at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:280)
	at com.github.lbroudoux.elasticsearch.river.s3.connector.S3Connector.connectUserBucket(S3Connector.java:66)
	at com.github.lbroudoux.elasticsearch.river.s3.river.S3River.<init>(S3River.java:131)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:54)
	at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
	at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:98)
	at org.elasticsearch.common.inject.FactoryProxy.get(FactoryProxy.java:52)
	at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:45)
	at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:819)
	at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:42)
	at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:57)
	at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
	at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:200)
	at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:1)
	at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:812)
	at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
	at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
	... 10 more
[2013-06-13 22:25:38,489][INFO ][com.github.lbroudoux.elasticsearch.river.drive.connector.DriveConnector] Establishing connection to Google Drive
^C[2013-06-13 22:25:39,214][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: stopping ...
[2013-06-13 22:25:39,502][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: stopped
[2013-06-13 22:25:39,502][INFO ][node                     ] [Brother Tode] {1.0.0.Beta1-SNAPSHOT}[6098]: closing ...

What happens here? Both plugins are loaded, and the Google Drive river seems to be loaded first (it is listed before the Amazon S3 river in the log above), so its libraries are added to the ClassLoader first. The 4.0.1 definition of org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager therefore comes first and is the one resolved for every class referencing it. During its init phase, the Amazon plugin tries to use this class but needs the 4.1 definition, which provides the new no-argument constructor (the <init>()V method named in the NoSuchMethodError)!
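When a conflict like this is suspected, it can help to check which jar a class was actually resolved from. The following is only a diagnostic sketch using the standard java.security.CodeSource API; the ClassOrigin class and its whereFrom method are hypothetical names introduced here, not part of Elasticsearch or of either plugin.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper (not part of Elasticsearch). */
final class ClassOrigin {

    private ClassOrigin() {
    }

    /**
     * Returns the location (jar file or directory) the given class was
     * loaded from, or a placeholder when the JVM does not expose one,
     * which is the case for classes loaded by the bootstrap loader.
     */
    static String whereFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<bootstrap or unknown>";
        }
        return source.getLocation().toString();
    }
}
```

Called from inside the S3 river with ThreadSafeClientConnManager.class as argument, a helper like this would report which httpclient jar won the race on the shared ClassLoader.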

Enhancement

As an enhancement proposal, I’ve forked the Elasticsearch repository here and reworked the classloading scheme for plugins. You can now force plugins to be loaded into dedicated, isolated classloaders that try to resolve requested classes using the plugin’s own libraries first and then fall back to the main classloader.
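To illustrate the "plugin libraries first, then the main classloader" lookup order, here is a minimal sketch of a child-first class loader built on the standard URLClassLoader. It is not the code from the fork: the ChildFirstClassLoader name and its constructor parameters are assumptions made for this example, and a real implementation would likely also need to handle resources and classes that must stay shared with the host.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first class loader: it looks into the plugin's own
 * jars before delegating to the parent (main) class loader, which is the
 * reverse of the default parent-first delegation.
 */
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] pluginJars, ClassLoader parent) {
        super(pluginJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse the class if this loader has already defined it.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // 1) Try the plugin's own libraries first.
                    loaded = findClass(name);
                } catch (ClassNotFoundException notInPlugin) {
                    // 2) Fall back to the usual parent-first lookup.
                    loaded = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

With one such loader per plugin, the Google Drive plugin would resolve ThreadSafeClientConnManager from its own httpclient 4.0.1 jar and the Amazon S3 plugin from its 4.1 jar, while JDK and Elasticsearch classes that are absent from the plugin jars would still come from the parent.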

Although I’ve tested this with some other plugins (twitter, head, attachment, fsriver) and seen no regression, I thought it would be safer to add a feature toggle to activate it. Plugin isolation is therefore only performed if the plugin.isolate setting is set to true (either in the YAML configuration file or on the command line).

The result is shown below: when started with the -Des.plugin.isolate=true property, dedicated classloaders are used, which makes running conflicting plugins a breeze:

laurent@ponyo:~/dev/elasticsearch-1.0.0.Beta1-SNAPSHOT$ bin/elasticsearch -f -Des.plugin.isolate=true
[2013-06-13 22:39:59,905][INFO ][node                     ] [Commando] {1.0.0.Beta1-SNAPSHOT}[6253]: initializing ...
[2013-06-13 22:39:59,908][INFO ][plugins                  ] [Commando] Plugin isolation set to true, loading each plugin in a dedicated ClassLoader
[2013-06-13 22:39:59,948][INFO ][plugins                  ] [Commando] loaded [river-twitter, mapper-attachments, google-drive-river, amazon-s3-river], sites [head]
[2013-06-13 22:40:02,801][INFO ][node                     ] [Commando] {1.0.0.Beta1-SNAPSHOT}[6253]: initialized
[2013-06-13 22:40:02,801][INFO ][node                     ] [Commando] {1.0.0.Beta1-SNAPSHOT}[6253]: starting ...
[2013-06-13 22:40:02,941][INFO ][transport                ] [Commando] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.80:9300]}
[2013-06-13 22:40:05,990][INFO ][cluster.service          ] [Commando] new_master [Commando][2Xp9SsHsQ_SmFqiDGZUhzg][inet[/192.168.1.80:9300]], reason: zen-disco-join (elected_as_master)
[2013-06-13 22:40:06,037][INFO ][discovery                ] [Commando] elasticsearch/2Xp9SsHsQ_SmFqiDGZUhzg
[2013-06-13 22:40:06,097][INFO ][http                     ] [Commando] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.80:9200]}
[2013-06-13 22:40:06,098][INFO ][node                     ] [Commando] {1.0.0.Beta1-SNAPSHOT}[6253]: started
[2013-06-13 22:40:07,274][INFO ][gateway                  ] [Commando] recovered [3] indices into cluster_state
[2013-06-13 22:40:11,166][INFO ][com.github.lbroudoux.elasticsearch.river.s3.river.S3River] [Commando] [amazon-s3][s3docs] Starting amazon s3 river scanning
[2013-06-13 22:40:11,190][DEBUG][com.github.lbroudoux.elasticsearch.river.s3.river.S3River] [Commando] [amazon-s3][s3docs] lastScanTimeField: 1371154754606
[2013-06-13 22:40:11,190][DEBUG][com.github.lbroudoux.elasticsearch.river.s3.river.S3River] [Commando] [amazon-s3][s3docs] Starting scanning of bucket famillebroudoux since 1371154754606
...
[2013-06-13 22:40:11,985][DEBUG][com.github.lbroudoux.elasticsearch.river.s3.river.S3River] [Commando] [amazon-s3][s3docs] Amazon S3 river is going to sleep for 36000 ms
[2013-06-13 22:40:12,182][INFO ][com.github.lbroudoux.elasticsearch.river.drive.connector.DriveConnector] Connection established.
[2013-06-13 22:40:12,182][INFO ][com.github.lbroudoux.elasticsearch.river.drive.connector.DriveConnector] Retrieving scanned subfolders under folder Travail, this may take a while...
...

I am in the process of suggesting this enhancement to Elasticsearch through a pull request. What is your opinion on it? Would it be useful? As usual, do not hesitate to send me your comments.

4 thoughts on “Plugin isolation support in Elasticsearch”

  1. Hi,

    I may have encountered another issue to do with the classpath. It seems that elasticsearch may not be very friendly towards modular based applications with isolated classpaths (like say osgi, although I am not actually using osgi, but a similar design). For example, I have got modules that are not in the tomcat-webapp-classloader. Elasticsearch however appears to be looking at the default-classloader in order to find the config/names.txt resource, when in fact it should be looking in the modules’ classloader. So I am now in the process of trying to find a way around the following error:

    “Caused by: org.elasticsearch.env.FailedToResolveConfigException: Failed to resolve config path [names.txt], tried file path [names.txt], path file [H:\BerCo\apache-tomcat-7.0.39\bin\config\names.txt], and classpath.”

    When outputting the classloader that is being used by executing the following code:

    Classes.getDefaultClassLoader() (which internally does this: Thread.currentThread().getContextClassLoader()) …

    … I get the WebappClassLoader from tomcat, which is of course not where the jar and hence the names.txt resides.

    There seems to be a way of passing a ClassLoader to the ImmutableSettings object, but I have not dug far enough yet to say whether this can help in solving the issue.

    Best regards,
    Michael

    1. Just to clarify – the problem I mentioned can be solved with the following lines of code:

      Settings settings = ImmutableSettings.settingsBuilder()
      .classLoader(Settings.class.getClassLoader()).build();

      Node node = nodeBuilder().client(true).settings(settings).node();
      Client client = node.client();

      Unfortunately this has led to the next problem, which I am in the process of tracking down:

      “at org.elasticsearch.common.mvel2.ParserConfiguration.addClassMemberStaticImports(ParserConfiguration.java:109)”

      “java.lang.ClassNotFoundException: java.util”

      Not sure what is going on because I have never worked with mvel, so I’ll have to do some further digging.

      Best regards,
      Michael

  2. Hi,

    Sorry for having invaded your blog entry. It appears that the stack trace is an expected one, because when viewing the source code I see the following:

    catch (ClassNotFoundException e)
    {
    // do nothing.
    }

    It just happens that the custom classloader I was using printed out the error message, which made it look as if something was wrong. Oh well, it looks as if all is working now, also with the external module and custom classloader.

    To answer your question: it sounds like a very sensible way to go and will probably save some headaches for others in the future. In fact, what you have done is the way modular plugins should work. I have done similar work in the past and the design was almost the same. Having said that, osgi works no differently.

    Best regards,
    Michael
