Friday, November 27, 2009

JRuby Embed (Red Bridge) Gotchas: __FILE__

About a month ago, I wrote in the thread "Load path issues inside jar / external app" that __FILE__ didn't work when Ruby code was loaded from the classpath. Stepping through JRuby with a debugger, I found one solution: a combination of setting a suitable current directory and using File.expand_path.

Here's the test code:

# file_check.rb [Birch]

puts "__FILE__: #{__FILE__}"
puts "dirname: #{File.dirname(__FILE__)}"
puts "expanded path: #{File.expand_path(File.dirname(__FILE__))}"
puts "joined path 1: #{File.join(File.dirname(__FILE__), "abc.rb")}"
puts "joined path 2: #{File.join(File.expand_path(File.dirname(__FILE__)), "abc.rb")}"

// FileCheck.java
package vanilla;

import org.jruby.embed.LocalContextScope;
import org.jruby.embed.PathType;
import org.jruby.embed.ScriptingContainer;

public class FileCheck {

    private FileCheck() {
        // Alternative for JSR223/BSF: point the user.dir system property at the script directory.
        //String userDir = System.getProperty("user.dir");
        //System.setProperty("user.dir", userDir+"/src/ruby");
        ScriptingContainer container = new ScriptingContainer(LocalContextScope.SINGLETHREAD);
        // Print the current directory before and after changing it.
        System.out.println("currentDirectory: " + container.getProvider().getRubyInstanceConfig().getCurrentDirectory());
        // Use the directory that holds file_check.rb as the current directory.
        container.getProvider().getRubyInstanceConfig().setCurrentDirectory(System.getProperty("user.dir")+"/src/ruby");
        System.out.println("currentDirectory: " + container.getProvider().getRubyInstanceConfig().getCurrentDirectory());
        // Load and run the script from the classpath.
        container.runScriptlet(PathType.CLASSPATH, "file_check.rb");
    }

    public static void main(String[] args) {
        new FileCheck();
    }
}

The absolute path to file_check.rb was /Users/yoko/NetBeansProjects/Birch/src/ruby/file_check.rb, so I added its directory, /Users/yoko/NetBeansProjects/Birch/src/ruby, to the -cp option of the java command. In the Java code, I set the same directory as the current directory. The result was:

yoko$ java -cp build/classes:/Users/yoko/DevSpace/jruby~main/lib/jruby.jar:./src/ruby vanilla.FileCheck
currentDirectory: /Users/yoko/NetBeansProjects/Birch
currentDirectory: /Users/yoko/NetBeansProjects/Birch/src/ruby
__FILE__: file_check.rb
dirname: .
expanded path: /Users/yoko/NetBeansProjects/Birch/src/ruby
joined path 1: ./abc.rb
joined path 2: /Users/yoko/NetBeansProjects/Birch/src/ruby/abc.rb

As you can see, the path wrapped in File.expand_path is correct, while File.dirname alone yields only a relative path. Combining a current-directory setting with File.expand_path should solve this kind of case. If you are using JSR223 or BSF, setting the user.dir system property works, as shown in the commented-out lines of the Java code. This is because JRuby uses the current directory when it expands a path, and the current directory is based on the user.dir system property. In a web application, we can perhaps set the current directory using ServletContext#getRealPath().
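
The Ruby side of this can be tried without any embedding at all. The sketch below is only an illustration: sibling_path is a hypothetical helper name, not something JRuby provides, and the string "file_check.rb" stands in for the relative __FILE__ value seen in the output above. It shows that File.dirname leaves the path relative, while File.expand_path anchors it at whatever the current directory is.

```ruby
# Hypothetical helper (not part of JRuby): build an absolute path to a file
# that sits next to the given script, even when script_file is relative.
def sibling_path(script_file, name)
  File.join(File.expand_path(File.dirname(script_file)), name)
end

# Stand-in for the relative __FILE__ value reported when loading from the classpath.
relative = "file_check.rb"

# File.dirname alone keeps the path relative.
puts File.dirname(relative)   # prints "."

# File.expand_path resolves "." against the current directory (Dir.pwd),
# so the result is absolute and depends on where the process is running.
puts sibling_path(relative, "abc.rb")
```

Because the result depends on Dir.pwd, the current directory has to be the one that actually holds the script, which is why the Java code sets it before running the scriptlet.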

Wednesday, November 25, 2009

JRuby Embed (Red Bridge) Update: global vars, loading java, and more

Over the past few weeks, I made a couple of changes to Red Bridge (JRuby Embed) that should improve performance a bit and reduce problems caused by global variables. These changes are available as of commit 161d0fe in master (1.5.0.dev).

First, I changed the internal implementation of sharing global variables. Red Bridge injects all variables in its variable map just before an evaluation, and tries to retrieve all local, instance, and global variables and constants used in Ruby just after the evaluation. This behavior is greedy and results in poor performance. However, it is necessary because Red Bridge terminates all state, including variable values, right after the evaluation is done in order to save resources. Without retrieving all variables and constants, Red Bridge couldn't return requested variables to a Java program. For example, users can do the following with Red Bridge:

ScriptingContainer container = new ScriptingContainer();
container.runScriptlet("$theta = Math::PI / 6.0");
container.runScriptlet("$value = Math.sin($theta)");
System.out.println(container.get("$theta") + ", " + container.get("$value"));

The above outputs: 0.5235987755982988, 0.49999999999999994


Local variables, instance variables, and constants (except global constants) need to be saved before they disappear at termination, but global variables remain in the Ruby runtime. So, I changed Red Bridge to get global variables lazily. Only when a global variable is requested does Red Bridge take it out of the runtime.

This new behavior should also reduce trouble caused by global variables. Before the change, Red Bridge retrieved as many global variables as possible from the Ruby runtime, except predefined ones. Then, for the successive evaluation, Red Bridge injected all global variables in its variable map into the runtime with the values from the previous evaluation. This behavior occasionally caused unexpected results and warnings. After the change, Red Bridge no longer grabs unnecessary global variables and no longer injects them for the next evaluation, so unexpected results related to global variables should become less frequent. This new behavior is not available when the global local variable behavior, JSR223's default behavior, is chosen, since that mode is tailored to behave exactly the same as the reference implementation.

As some of you may already know, clearing the variable map before a successive evaluation improves performance. I added two shortcut methods to ScriptingContainer:

org.jruby.embed.ScriptingContainer#remove(String key)
org.jruby.embed.ScriptingContainer#clear()

The remove method removes the specified key-value pair from the variable map and the runtime. The clear method removes all key-value pairs from the variable map and the runtime. The smaller the variable map, the less time injection takes. Don't forget to remove redundant key-value pairs.


I made one more change. Red Bridge no longer loads the java library during initialization. Loading a library in JRuby is quite a cumbersome job: looking up the loaded-library tables to see whether it is already loaded, deciding how and from where to load it, then loading it, and caching it to avoid duplication. Yet not all Ruby scripts need the java library. If people run Fibonacci written in pure Ruby on Red Bridge, they don't need the java library at all. When people do want to use the java library, adding the line "require 'java'" to the Ruby code works fine. Moreover, people already add "require 'java'" when they run scripts with the jruby command if the scripts need the java library. Pre-loading the java library therefore seems to offer little advantage, so I stopped loading it during initialization. The time for initialization has perhaps become a bit shorter.
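
As a small illustration of the two kinds of scripts, here is a sketch: the fib method is pure Ruby and needs nothing from the java library, while the last part requires it explicitly. The RUBY_ENGINE guard is my own addition so the sketch also runs on Rubies other than JRuby; a script meant only for JRuby would simply start with require 'java'.

```ruby
# Pure Ruby: this part runs without the java library being loaded.
def fib(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

puts fib(10)   # prints 55

# Only scripts that touch Java classes need the java library.
# The guard is an assumption for portability, not something Red Bridge requires.
if RUBY_ENGINE == "jruby"
  require "java"
  puts java.lang.System.getProperty("user.dir")
end
```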

Wednesday, November 04, 2009

A Japanese Teenage Boy Improved Ruby 1.9 Performance Up to 63%

The Japanese online magazine @IT Jibun Senryaku Lab. (an information site for IT engineers who want to educate and develop themselves) published an interview with a Japanese teenage boy, Masahiro Kanai, who improved the performance of several methods in Ruby 1.9. He is the age of a high school freshman (the third year of junior high school in the Japanese school system). The article (written in Japanese) is here.

According to the article, Masahiro Kanai joined “the Security and Programming Camp 2009” this summer and chose Ruby’s performance improvement as his subject. His mentor was Koichi Sasada (ko1). The performance of the methods he worked on improved by up to 63%, and by 8% on average. His patches were applied to Ruby trunk on Oct. 5 this year.

What Masahiro Kanai did is a fundamental of performance tuning: he moved unnecessary macro references out of loops. Masahiro spotted that the macros below, in array.c, string.c, and struct.c, were referenced every time Ruby checked whether data was held in a structure or not. Even though the data were constant, Ruby went through the macros to judge the data’s presence on every iteration of the loop.

-RARRAY_PTR, RARRAY_LEN
-RSTRING_PTR, RSTRING_LEN
-RSTRUCT_PTR, RSTRUCT_LEN

He optimized the loop by eliminating macro references when data were constants.
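
The actual patches are in C and are not reproduced here. As a rough illustration of the same idea transposed into Ruby, the sketch below uses a hypothetical Buffer class whose ptr and len accessors stand in for the macros and count how often they are consulted. Every name in it is made up for this example.

```ruby
# Hypothetical stand-in for a structure whose accessors do extra work on
# every call, loosely mirroring a macro that is re-evaluated each time.
class Buffer
  attr_reader :lookups

  def initialize(items)
    @items = items
    @lookups = 0
  end

  def ptr
    @lookups += 1
    @items
  end

  def len
    @lookups += 1
    @items.length
  end
end

# Consults the accessors on every iteration.
def sum_naive(buf)
  total = 0
  i = 0
  while i < buf.len
    total += buf.ptr[i]
    i += 1
  end
  total
end

# Reads the loop-invariant values once, before the loop.
def sum_hoisted(buf)
  ptr = buf.ptr
  len = buf.len
  total = 0
  i = 0
  while i < len
    total += ptr[i]
    i += 1
  end
  total
end

naive = Buffer.new([1, 2, 3, 4])
hoisted = Buffer.new([1, 2, 3, 4])
puts "naive:   #{sum_naive(naive)} in #{naive.lookups} lookups"       # 10 in 9 lookups
puts "hoisted: #{sum_hoisted(hoisted)} in #{hoisted.lookups} lookups" # 10 in 2 lookups
```

Both versions compute the same sum. The hoisted one is only valid because the data does not change inside the loop, which is the condition the article describes.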

The interviewer praised him for achieving this at his age.

Monday, November 02, 2009

JRuby Embed (Red Bridge) Update

A lot has changed since my last post about JRuby Embed (Red Bridge). The JRuby Embed codebase has been merged into JRuby! JRuby 1.5.0 will include Red Bridge in both its binary and source archives. Along with this, the JRuby Embed wiki pages have also been merged into JRuby's wiki, in the Embedding JRuby section.

The JRuby Embed project is now almost at its end of life. I'll soon close the jruby-embed users mailing list, since it is natural to talk on jruby-users/jruby-dev; besides, most of the discussions have taken place on the jruby-users mailing list anyway. Jira is also going to be merged into JRuby's, but this will be done after JRuby's Jira has finished moving from codehaus to kenai. In any case, JRuby Embed users, please use JRuby's mailing lists and Jira. The embedding section of JRuby's Jira would be a good place for us to file issues. However, I'll keep the source code repository for JRuby 1.4. JRuby 1.4 has the JRuby Embed binary but doesn't have the sources. The binary in JRuby 1.4 is built from the codebase of this project, so the repository still has a reason to be there.

One of the biggest changes is that JRuby Embed 0.1.3 has been released from the JRuby Embed Project. It will be included in the upcoming JRuby 1.4 release. In this release, the default value of the local context type has been switched from threadsafe to singleton. See the discussion about it. Please make sure your choice is the best one for your case. Read through the Context Instance Type section to learn which one you should choose.

Version 0.1.3 is identical to the one in JRuby trunk (1.5.0.dev) and also includes a fix for JRUBY_EMBED-10. Give it a try. If you find something, file it at "JRuby Jira" and ask about it on JRuby's mailing list.