Translate Javascript to C#
Let’s say you need to automatically port some code from one language to another: how are you going to do it? Is it even possible? Maybe you have already seen a conversion between similar languages, such as Java to C#. That sounds much simpler in comparison.
In this article we are going to discuss some strategies to translate Javascript to a very different language, such as C#. We will discuss the issues this raises and plan some possible solutions. We will not get as far as writing a complete converter: that would be far too complicated for an introduction to the topic. Let’s avoid putting together something terribly hacky just for the sake of typing some code.
Having said that, we are going to see the main problems you may find in converting one real Javascript project: fuzzysearch, a tiny but very successful library that checks whether the characters of one string (the needle) appear, in the same order, inside another string (the haystack). It is a simple form of approximate matching that tolerates missing characters.
fuzzysearch.js
'use strict';

function fuzzysearch (needle, haystack) {
  var hlen = haystack.length;
  var nlen = needle.length;
  if (nlen > hlen) {
    return false;
  }
  if (nlen === hlen) {
    return needle === haystack;
  }
  outer: for (var i = 0, j = 0; i < nlen; i++) {
    var nch = needle.charCodeAt(i);
    while (j < hlen) {
      if (haystack.charCodeAt(j++) === nch) {
        continue outer;
      }
    }
    return false;
  }
  return true;
}

module.exports = fuzzysearch;
When it’s worth the effort
First of all you should ask yourself if the conversion is worth the effort. Even if you manage to obtain some runnable C#, you have to consider that its style and architecture will probably be “unnatural”. As a consequence the project could be harder to maintain than if you had written it from scratch in C#.
This is a common problem even in carefully planned conversions, such as the one that originated Lucene.net, which started as a conversion from Java to C#. Furthermore, you will not be able to use the converter without manual work for every specific project, because even the standard libraries are different. Look at the example: while you could simply capitalize length in haystack.length, you cannot just capitalize charCodeAt; you will have to map functions in the source language to different ones in the destination language.
On the other hand all languages have areas of specialization which may interest you, such as Natural Language Processing in Python. And if you accept the fact that you will have to do some manual work, and you are very interested in one project, creating an automatic conversion will give you a huge head start. If, instead, you are interested in having a generic tool, you may want to concentrate on small libraries, such as the Javascript one in the example.
Parse with ANTLR
The first step is parsing, and for that you should just use ANTLR. There are already many grammars available; they may not be up to date, but they are much better than starting from scratch and they will give you an idea of the scale of the project. You should use visitors instead of listeners, because they allow you to control the flow more easily. You should parse the different elements into custom classes that can manage the small problems that arise. Once you have done this, generating C# should be easier.
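To give a rough idea of what a visitor does, here is a minimal sketch in plain Javascript. It does not use ANTLR at all: the tiny hand-written tree and the node names (If, Binary, Member and so on) are assumptions made up for this example, standing in for whatever parse tree your grammar actually produces.

```javascript
// Hypothetical mini-AST and visitor; none of these names come from ANTLR.
class CSharpEmitter {
  // Dispatch on the node type, much like a generated visitor would.
  visit(node) {
    const method = this['visit' + node.type];
    if (typeof method !== 'function') {
      throw new Error('No rule for node type ' + node.type);
    }
    return method.call(this, node);
  }

  visitIdentifier(node) {
    return node.name;
  }

  visitBinary(node) {
    // Strict (in)equality has no direct C# counterpart: fall back to ==/!=.
    const op = node.op === '===' ? '==' : node.op === '!==' ? '!=' : node.op;
    return this.visit(node.left) + ' ' + op + ' ' + this.visit(node.right);
  }

  visitMember(node) {
    // haystack.length -> haystack.Length (only this one member is mapped here).
    const name = node.property === 'length' ? 'Length' : node.property;
    return this.visit(node.object) + '.' + name;
  }

  visitIf(node) {
    return 'if (' + this.visit(node.test) + ') { ' + this.visit(node.body) + ' }';
  }

  visitReturn(node) {
    return 'return ' + this.visit(node.value) + ';';
  }
}

// Hand-built tree for: if (nlen === hlen) { return needle === haystack; }
const tree = {
  type: 'If',
  test: {
    type: 'Binary', op: '===',
    left: { type: 'Identifier', name: 'nlen' },
    right: { type: 'Identifier', name: 'hlen' }
  },
  body: {
    type: 'Return',
    value: {
      type: 'Binary', op: '===',
      left: { type: 'Identifier', name: 'needle' },
      right: { type: 'Identifier', name: 'haystack' }
    }
  }
};

const emitted = new CSharpEmitter().visit(tree);
console.log(emitted); // if (nlen == hlen) { return needle == haystack; }
```

Because each visit method decides when, and whether, to visit its children, it is easy to skip or reorder parts of the tree, which is the kind of flow control that is harder to get with listeners.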
The small differences
There are things that you could just skip, such as the first and last lines: they most probably don’t apply to your C# project. But you must pay attention to the small differences: the var keyword has a different meaning in Javascript and C#. In Javascript it declares a variable that can hold any value, while in C# it asks the compiler to infer one fixed type from the initializer. By coincidence it would work most of the time, and would be quite useful to avoid the problem of the lack of static typing in Javascript. But it’s not magic, you are just hoping that the compiler will figure it out. And sometimes it’s not a one-to-one conversion. For instance you can’t use var in C# in the way it’s used in the initialization of the for loop: C# does not allow a single var declaration to introduce several variables, as var i = 0, j = 0 does.
The continue outer statement should be transformed into a goto, while a plain continue works just as in C#. Since C# has no labeled continue, the goto has to jump to a label placed at the end of the outer loop body; jumping to a label in front of the for statement would run its initialization again. A difference that could be fixed quite brutally is the strict equality comparison “===/!==”, which could be replaced with “==/!=” in most cases, since it exists to work around problems caused by the dynamic typing of Javascript. In general you can do a pre-parse check and transform the original source code to avoid some problems, or even comment out some things that cannot be easily managed.
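A pre-parse pass of this kind could look like the following sketch. The function name preprocess is made up for the example, and the approach is deliberately naive: it works on raw text with regular expressions, so it would also rewrite operators that appear inside string literals or comments. Treat it as an illustration of the idea, not as something safe for arbitrary sources.

```javascript
// Naive pre-parse pass: plain text substitutions applied before parsing.
// It ignores string literals and comments, so it only suits simple sources.
function preprocess(source) {
  return source
    // Drop the directive and the CommonJS export: neither applies to C#.
    .replace(/^[ \t]*['"]use strict['"];?[ \t]*$/m, '')
    .replace(/^[ \t]*module\.exports[ \t]*=.*$/m, '')
    // Replace strict (in)equality with the plain operators.
    .replace(/!==/g, '!=')
    .replace(/===/g, '==');
}

const input = [
  "'use strict';",
  'if (nlen === hlen) { return needle === haystack; }',
  'if (a !== b) { return false; }',
  'module.exports = fuzzysearch;'
].join('\n');

console.log(preprocess(input));
```

The labeled continue is left alone on purpose: turning it into a goto requires knowing where the outer loop body ends, which is much easier to do on the parse tree than on raw text.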
I present you thy enemy: dynamic typing
The real problem is that Javascript uses dynamic typing while C# uses static typing. In Javascript anything could be anything, which leads to certain issues, such as the aforementioned strict equality operator, but it’s very easy to use. In C# you need to know the type of your variables, because there are checks to be made. And this information is simply not available in the Javascript source code. You might think that you could just use the var keyword, but you can’t. The compiler must be able to determine the real type at compile time, something that will not always be possible. For example you cannot use it to declare function parameters.
You can use the dynamic keyword, which makes the type be determined at execution time. Still, this doesn’t fix all the problems, such as initialization. You may check the source code for literal initializations or, in theory, even execute the original Javascript from C# and find a way to determine the correct type. But that would be quite convoluted. You might get lucky, and in a small project such as our example you will, but not always.
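Checking for literal initializations can be sketched as follows. The helper inferCSharpType is hypothetical and recognizes only a handful of literal shapes. It also assumes that every integer literal fits in a C# int, which is not guaranteed. Anything it does not recognize falls back to dynamic.

```javascript
// Guess a C# type from the text of an initializer.
// Returns 'dynamic' whenever the initializer is not a recognizable literal.
function inferCSharpType(initializer) {
  const text = initializer.trim();
  if (/^-?\d+$/.test(text)) return 'int';          // 0, 42, -7
  if (/^-?\d*\.\d+$/.test(text)) return 'double';  // 3.14, .5
  if (/^(true|false)$/.test(text)) return 'bool';
  if (/^(['"]).*\1$/.test(text)) return 'string';  // 'abc', "abc"
  return 'dynamic';
}

console.log(inferCSharpType('0'));               // int
console.log(inferCSharpType("'abc'"));           // string
console.log(inferCSharpType('haystack.length')); // dynamic
```

The last call shows the limit of this approach: the type of haystack.length depends on the type of haystack, which a literal check alone cannot discover.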
There are also problems that are easier to manage than you might imagine. For instance, assigning a function to a variable is not something that you usually do as explicitly in C# as you do in Javascript. But it’s easy using delegate types and constructs such as Func. Of course you still have to deal with determining the correct types of the arguments, if any are present, but it doesn’t add any other difficulties per se.
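Once the parameter and return types are known, producing the delegate type is mostly string building, as in this small sketch. The helper delegateType is an invented name, and the use of Action for functions that return nothing is an addition of this example rather than something discussed above.

```javascript
// Build the C# delegate type for a function value.
// parameterTypes: array of C# type names; returnType: a C# type name or 'void'.
function delegateType(parameterTypes, returnType) {
  if (returnType === 'void') {
    // Action is the delegate family for functions that return nothing.
    return parameterTypes.length === 0
      ? 'Action'
      : 'Action<' + parameterTypes.join(', ') + '>';
  }
  // In Func the return type is always the last type argument.
  return 'Func<' + parameterTypes.concat(returnType).join(', ') + '>';
}

// var match = function (needle, haystack) { ... } could be declared as:
const declaration = delegateType(['string', 'string'], 'bool') + ' match';
console.log(declaration); // Func<string, string, bool> match
```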
Not everything is an object and other issues
In Javascript “string” is a primitive string, not an object, while in C# everything is an object, with no exceptions. This is a relevant issue, but it’s less problematic than dynamic typing. For instance, to convert our example we just have to wrap the function in a custom class, which is not really hard. One obvious problem is that different languages have different libraries. Some will not be available in the destination language. On the other hand, some parts of the project might not be needed in the destination language, because there are already better alternatives. Of course you still have to actually change all the related code, or wrap the real library of the destination language in a custom class that mimics the original one.
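Both ideas, mapping library members and wrapping the result in a class, can be sketched in a few lines. The names memberMap, translateMember and wrapInClass are invented for this example, and the table covers only the two members used by fuzzysearch. Mapping charCodeAt to the C# string indexer is an assumption that holds here because the result is only compared with another character.

```javascript
// Hypothetical mapping from Javascript members to C# replacements.
// Each rule receives the already-translated receiver and arguments.
const memberMap = new Map([
  ['length', (target) => target + '.Length'],
  // charCodeAt(i) returns a number; indexing a C# string returns a char,
  // which is enough when the value is only compared with another char.
  ['charCodeAt', (target, args) => target + '[' + args[0] + ']']
]);

function translateMember(target, member, args) {
  const rule = memberMap.get(member);
  if (!rule) {
    // Leave a visible marker for the manual pass instead of guessing.
    return '/* TODO: map ' + member + ' */ ' + target + '.' + member;
  }
  return rule(target, args || []);
}

// Wrap translated method text in a static class,
// because a C# method has to belong to a type.
function wrapInClass(className, methodText) {
  return 'public static class ' + className + '\n{\n    ' + methodText + '\n}';
}

console.log(translateMember('haystack', 'length'));        // haystack.Length
console.log(translateMember('needle', 'charCodeAt', ['i'])); // needle[i]
```

Unknown members are marked rather than silently passed through, so the manual work that remains is easy to find in the generated code.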
Conclusion
There are indeed major difficulties in transforming even a small project from one language to another, especially when they are as different as Javascript and C#. But let’s imagine that you are interested in something very specific, such as a very successful library and its plugins. You want to port the main library and give the developers of the plugins a simpler way to port their work. There are probably many similarities in the code, so you can do most of the work to manage the typical problems and provide guidance for the remaining ones.
Converting code between languages so different in nature is not easy, that is for sure, but you can apply a mixed automatic/manual approach, converting a large amount of code automatically and fixing the corner cases manually. If you can also translate the tests, you can later refactor the code, once it is in C#, and improve its quality over time.
Reference: Translate Javascript to C# from our WCG partner Federico Tomassetti at the Federico Tomassetti blog.