Rubex v0.1 goals
Over the next few weeks Rubex will be made ready for an alpha release by adding the features needed to wrap a real-world Ruby library called rcsv. You can find the Rubex code that I've written for wrapping rcsv here. The aim is to eventually be able to compile the rcsv Rubex wrapper.
In the next iteration of Rubex, I will start adding support for heap-based memory allocation and deallocation using internal Rubex classes that will allow one to allocate structs and allow them to be collected by the Ruby GC so that the programmer does not need to deal with freeing memory. Support for Ruby-like error handling using begin-rescue-ensure and Ruby blocks will also be added later.
I will elaborate more on these other objectives later.
For this proposal, I will outline the goals that need to be accomplished in the next few weeks to bring Rubex to the level of a (somewhat) production-ready language.
Goals:
- Support for string literals and Ruby-like string interpolation.
- Get line and column numbers for each symbol and have them printed in the output file.
- Support for the `class` keyword and inheritance using the `<` directive.
- Support for defining class methods using the `self` keyword.
- Support for calling methods like `[]`, `[]=`, `==`, `!=`, etc. on arbitrary Ruby objects.
- Make Rubex aware of certain Ruby primitive data types like string, hash and array by specifying data types of certain variables, so that methods called on these types can be optimized using C API calls rather than having to go to Ruby-land and call some method.
- Support for initializing Arrays, Hashes and Strings with literals (like `[]`, `{}` and `""`).
- Support for initializing Ruby classes (like `StringIO.new`).
- Support for Ruby symbols.
- Support for binary operators like `|`, `&`, etc.
- Support for C functions with the `cdef` keyword.
- Support for passing by reference to C functions (using `&` for actual parameters and `*` in formal parameters).
- Ability to declare C function callbacks as pointers to functions inside structs and formal arguments of functions.
- Ability to send function pointers to other C functions as callbacks.
- Support for Ruby-style `raise` clauses.
- Support error handling with begin-rescue-ensure.
Particulars:
Currently, when the compiler reads symbols from a file, no information about their location (file name and line number) is stored. This first milestone will focus on storing that information.
Implementation: Oedipus lex provides the line number and column number of a particular matched element with <> option.
Each statement (an instance of `Rubex::AST::Statement`) will have a `line_no` attribute and a `file_name` attribute for storing the line number and file name respectively.
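A minimal sketch, in plain Ruby, of how a statement node might carry its location. Only `line_no` and `file_name` come from the text above; the `SketchStatement` class, its `location` helper and the `annotate` function are hypothetical names used purely for illustration, not Rubex's actual implementation.

```ruby
# Hypothetical stand-in for Rubex::AST::Statement. Only the line_no and
# file_name attributes are taken from the proposal.
class SketchStatement
  attr_reader :file_name, :line_no

  def initialize(file_name, line_no)
    @file_name = file_name
    @line_no   = line_no
  end

  # Human-readable location, e.g. for compiler error messages.
  def location
    "#{file_name}:#{line_no}"
  end
end

# Assumed helper: prefix a line of generated C with a comment naming the
# Rubex source location it was compiled from.
def annotate(statement, c_code)
  "/* #{statement.location} */ #{c_code}"
end

stmt = SketchStatement.new("strings.rubex", 3)
puts annotate(stmt, "int a = 4;")
```

Keeping the location on the node itself means any later compiler stage (type checking, code generation) could report it without consulting the lexer again.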
This feature will add support for string literals using double quotes (e.g. "this is a string.") and also allow string interpolation using the Ruby-like `#{}` syntax within strings. After this milestone the following code will be possible:
def strings
  char* s = "My name is Ted."
  obj = "My name is also Ted."
  int a = 4
  float b = 5.6
  print "The number a is : #{a}\nThe number b is: #{b}"
  print "Char star says #{s} and obj says #{obj}."
  return obj
end
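For reference, this is what interpolation means in plain Ruby: each embedded expression is converted with `to_s` and the pieces are concatenated. The helper name `interpolate_by_hand` is hypothetical. How Rubex will lower interpolation to C is not specified in this proposal; C-typed variables such as `int a` would presumably need converting to Ruby objects first.

```ruby
# Builds the same string as the interpolated literal, but by hand.
def interpolate_by_hand(a, b)
  "The number a is : " + a.to_s + "\nThe number b is: " + b.to_s
end

a = 4
b = 5.6
interpolated = "The number a is : #{a}\nThe number b is: #{b}"

puts interpolated
puts interpolated == interpolate_by_hand(a, b)
```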
Support for the `class` keyword will allow users to encapsulate methods inside Ruby classes instead of being compelled to have them under `Object`, as was the case before. As in Ruby, class names must start with a capital letter. It will also be possible to inherit from custom user-defined classes or built-in Ruby classes, which makes it possible to create custom errors. So after this milestone, the following Rubex code will be possible:
class Kustom
  def bye
    print "Bye world!"
  end
end

class Klass < Kustom
  def hello
    print "Hello world!"
  end
end
Once support for Ruby classes has been added, defining class methods using `self` can also be supported. For now, class methods can be added using the `self.` syntax; `class << self` will not be supported yet. The following Rubex code can then be written:
class Bhau
  def self.say_what(string)
    int i = 0
    while i < 10
      print string
      i += 1
    end
  end
end
Until now it was not possible to get the individual elements of a string or an array. This milestone will focus on the ability to call Ruby methods that look like operators on any Ruby object. It will thus become possible to access elements of an Array or a Hash. Equality will also be supported.
The following code will work after this:
class StringCaller
  def call_now(string)
    print string[0]
    string[2] = "h"
    print string
    return string
  end
end
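In Ruby these operators are ordinary methods named `[]` and `[]=`, which is why a generic method-call mechanism is enough to support them. The sketch below makes the dispatch explicit in plain Ruby. It illustrates the Ruby semantics only and is not the code Rubex generates.

```ruby
# Performs the same two operations as the Rubex example, but with the
# operator methods invoked by name.
def call_now(string)
  first = string.public_send(:[], 0)   # same as string[0]
  string.public_send(:[]=, 2, "h")     # same as string[2] = "h"
  [first, string]
end

first, changed = call_now(String.new("world"))
puts first
puts changed
```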
One of the problems with C extensions is that method calls into Ruby-land are extremely expensive. Thus, if Rubex is made aware that a particular Ruby object is of a specific type (like a String, Array or Hash), certain methods called on that object can be translated directly into CRuby C API calls instead of going through a Ruby method call.
For example, if a statement like `a.size` is encountered, and `a` is a string, Rubex can directly translate this code to `RSTRING_LEN(a)` instead of `rb_funcall(a, rb_intern("size"), 0, NULL)`. The former is much faster than the latter.
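One way a compiler could make this choice is a lookup keyed on the declared type and the method name, falling back to a dynamic call when nothing matches. The sketch below is a hypothetical illustration in plain Ruby that emits C source text. `FAST_PATHS` and `c_call_for` are invented names, and the `array` entry using `RARRAY_LEN` is an assumption beyond what the proposal states.

```ruby
# Hypothetical table: [declared Rubex type, method name] => C expression template.
FAST_PATHS = {
  ["string", "size"] => "RSTRING_LEN(%s)",
  ["array",  "size"] => "RARRAY_LEN(%s)"   # assumption, not from the proposal
}.freeze

# Returns C source text for calling `method` with no arguments on `receiver`.
# `type` is nil when the variable was declared without a type.
def c_call_for(type, receiver, method)
  template = FAST_PATHS[[type, method]]
  return format(template, receiver) if template

  # Generic fallback: dynamic dispatch through the Ruby VM.
  format('rb_funcall(%s, rb_intern("%s"), 0, NULL)', receiver, method)
end

puts c_call_for("string", "a", "size")
puts c_call_for(nil, "a", "size")
```

Note that `RSTRING_LEN` yields a C `long` rather than a Ruby object, so real generated code would also need a conversion wherever the result is used as a Ruby value.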
The String class in Ruby will be represented in Rubex with the `string` data type, Hash with `hash` and Array with `array`. Code using these will look like this:
class RubyTypes
  def these_types(string str, array arr, hash h)
    str[4] = "a"
    arr.append(44)
    print str
    print arr
    print h["fff"]
    h["fff"] = 565
    print h["fff"]
    return h
  end
end
This milestone will involve adding support for initialization of hashes, strings and arrays with literals that are familiar to every Ruby developer. Users will also be able to populate these data structures with data using expressions or more literals. Mainly, it will involve adding support for easy initialization of Ruby data structures by implicitly using the CRuby C API.
Example code:
class DataInit
  def init_this(a, b, c)
    arr = [1,2,3,4,5,6]
    str = "Hello world! Lets have a picnic!"
    h = {
      "hello" => arr,
      "world" => 666,
      "message" => str
    }
    print h["hello"]
    return h
  end
end
This feature will add support for initializing Ruby classes from within Rubex. Whenever an identifier starting with a capital letter is encountered, it will be treated as a Ruby constant and any methods called on that constant will be translated into the corresponding C API calls.
User-defined Ruby classes in Rubex can also be initialized in this way.
Example code:
class InitRubyClass
  def init_classes
    a = String.new
    a[0] = "5"
    f = StringIO.new("Hello! This is a test")
    s = f.read
    return s
  end
end
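Treating a capitalized identifier as a constant amounts to two steps: look the constant up by name, then call `new` on the result. The plain-Ruby sketch below spells those steps out. The helper name `instantiate` is hypothetical. In the CRuby C API the corresponding building blocks would be calls such as `rb_const_get` and `rb_funcall`, though which calls Rubex will actually emit is not stated here.

```ruby
require "stringio"

# Resolve a constant by name under Object, then call `new` on it with
# the given arguments.
def instantiate(const_name, *args)
  klass = Object.const_get(const_name)
  klass.public_send(:new, *args)
end

f = instantiate(:StringIO, "Hello! This is a test")
puts f.read
```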
Symbols are an integral part of Ruby and will be supported in this milestone.
Example code:
class RubySymbols
  def symbol_support(a, b)
    hash h = {}
    h[:first] = a
    h[:second] = b
    other_hash = {
      :third => 69
    }
    h[:third] = other_hash
    return h
  end
end
Until now, all the methods that you could define through Rubex were exposed to Ruby, i.e. they could be directly called from a Ruby script. However, there are some operations for which defining pure C functions (i.e. functions that are only visible to the generated C program) is important, for example, providing functions as callbacks to other functions. Rubex will allow this with the `cdef` keyword.
The user will have to specify the return type of the method if they use `cdef`. The scope of a `cdef` function will be local to the class that it has been defined in. If a class inherits from another class, the `cdef` functions will be inherited as well. These functions will not be callable from a Ruby script. Moreover, pure Ruby methods will be able to call these functions just like any other function.
Example code:
class CFunctions
  def pure_ruby_method
    a = 55
    b = 5.43
    int c = first_c_function(a, b)
    return a + c
  end

  cdef int first_c_function(int a, float b)
    int c = a + 5
    int d = (c * b + 3)/5
    return c - d
  end
end
One of the tenets of C programming is the ability to pass arguments by reference to other functions. Rubex will support this with a syntax similar to C. Users can declare formal arguments of C functions as pointers with the `*` operator (the way they do for any other data type) and can pass the address of a variable using `&`.
Rubex will not support the `->` operator for denoting elements of a struct pointer. Instead, users will first have to dereference the struct using `[0]` (like choosing the 0th element of an array of structs starting from that particular pointer position) and then use `.` for referring to elements inside a struct.
Example code:
struct attribs do
  int a, b
  float c
end

class CallByReference
  def reference_call
    attribs a
    int b_flat = 460
    a.a = 56
    a.b = 65
    a.c = 23
    c_function(&a, b_flat)
    return a.a
  end

  cdef void c_function(attribs *a, int b)
    a[0].a = b
  end
end
This functionality will allow users to send C functions defined by `cdef` as pointers inside other functions so that they can be used as callbacks later. This functionality is crucial for wrapping many modern C libraries and hence will be supported in the next immediate release.
Example:
TODO
Support for `raise` and begin-rescue-ensure will be critical for raising errors and handling them the way Ruby does. It will build error handling right into Rubex, so that little extra work is needed on the part of the programmer to handle exceptions in C extensions.
Support for this functionality will be added in the future and will not be part of the release that contains all of the above functionality.
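The Ruby behaviour that this feature aims to mirror can be sketched in plain Ruby as follows. This is ordinary Ruby, not Rubex syntax, and the names `KustomError` and `risky` are hypothetical. It shows a custom error inheriting from a built-in class, a `raise`, and the order in which the rescue and ensure clauses run.

```ruby
# Hypothetical custom error, inheriting from a built-in Ruby class.
class KustomError < StandardError; end

# Records which clauses ran, so the control flow is visible.
def risky(should_fail)
  log = []
  begin
    log << :begin
    raise KustomError, "something went wrong" if should_fail
    log << :no_error
  rescue KustomError => e
    log << [:rescued, e.message]
  ensure
    log << :ensure   # runs whether or not an error was raised
  end
  log
end

p risky(true)
p risky(false)
```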