Friday, January 30, 2015

Code Analysis

Sometimes, you have to put in place tools to make sure that the code respects the best practices and guidelines. Some very good ones already exist (in particular the code linters).
In this article, I will introduce you to esprima that allows a deep and precise code analysis.
### Linting is necessary... In a previous post, I have been talking about the benefits of using [code linting tools](http://gpf-js.blogspot.ca/2013/11/ why-lint-tool-can-reduce-development.html) as it really saves the developer time during the coding phase: * It highlights common mistakes (undeclared variables, unused parameters, use of [hasOwnProperty](https://developer.mozilla.org/en-US/docs/Web/JavaScript/ Reference/Global_Objects/Object/hasOwnProperty), == vs ===...) * It maintains code consistency (formatting, naming conventions...) * It checks code complexity, in particular the [cyclomatic complexity](http:// en.wikipedia.org/wiki/Cyclomatic_complexity) Regarding JavaScript, in my opinion, one of the best options is to apply [JSHint](http://www.jshint.com/). In the following example, several errors are raised because of: * quotes inconsistencies * undeclared functions (let say they are declared in other sources) * missing semicolon * undeclared or improper variable names * ... Move your mouse over the underlined errors to read them *(note that this is using live JSHint parsing)*. `(function () {/*gpf:apply-jshint*/ "use strict"; // Handles the click on a specific button function on_click (ctrl) { freezeInterface("Please wait, updating information...") $.ajax ('getInformation.aspx', { success: function (data) { resultCode = data.resultCode; if (resultCode == 0) updateInterfaceUsingData(data); } }) unfreezeInterface(); } document.getElementById('myButton') .addEventListener("click", on_click); })();` ### ...but not sufficient However, this kind of tool has limits as it tends to apply general validation rules without considering the meaning of the code that is executed. To illustrate the problem, let me first fix the JSHint errors: `(function () {/*gpf:apply-jshint*/ "use strict"; /*global $, freezeInterface, updateInterfaceUsingData, unfreezeInterface*/ // Handles the click on a specific button function onClick () { freezeInterface("Please wait, updating information..."); $.ajax("getInformation.aspx", { success: function (data) { var resultCode = data.resultCode; if (resultCode === 0) { updateInterfaceUsingData(data); } } }); unfreezeInterface(); } document.getElementById("myButton") .addEventListener("click", onClick); })();` Using the "global" comment, JSHint is configured to be aware of the global variables to be defined. With this declaration, the code successfully passes JSHint validation. But do you see any problem here? ... no? ... really? ... ### Incorrect use of the API The method **[$.ajax](http://api.jquery.com/jquery.ajax/)** comes from the [jQuery](http://jquery.com/) framework and it encapsulates the necessary code to handle AJAX calls with the server. There are two important things to remember about AJAX calls: * They are asynchronous: this is why callbacks are used to be notified when the answer comes back from the server. * Like any function call, they may generate errors: either because the API that was invoked can't provide the requested result or because a network failure prevents the call to complete. Consequently, there are two problems with the way the method **onClick** is implemented: * The use of **unfreezeInterface** is done right after the call to **$.ajax**. It means that it does not even wait for the call to succeed. * Furthermore, if the call fails, nothing will happen. The error is ignored. There are some situations where it is acceptable to ignore errors but, in a general manner, all errors must be handled. ### Understanding the code To detect such a misconception, one must first locate any function call to **$.ajax** and then analyze the parameters to verify that it has been used the right way. One easy solution could be to consider the whole source as a big string, look for "$.ajax" and then check that the "success" as well as the "error" keywords appear just after. But how reliable is that? `(function () {/*gpf:apply-jshint*/ "use strict"; /*global $, freezeInterface, updateInterfaceUsingData, unfreezeInterface*/ /*global showError*/ // Handles the click on a specific button function onClick () { freezeInterface("Please wait, updating information..."); $.ajax("getInformation.aspx", { success: function (data) { var resultCode = data.resultCode; if (resultCode === 0) { updateInterfaceUsingData(data); } else { // The 'error' keyword is here but inside a string showError("An error occurred"); } } }); unfreezeInterface(); } document.getElementById("myButton") .addEventListener("click", onClick); })();` In the above example, the "error" keyword appears in a comment and a string. Furthermore, how can you make sure that no instruction is executed after the call to **$.ajax**? ### Parsing the code The only good way to understand the JavaScript code is to parse the source in order to build a structured representation of the instructions it contains. Consequently, the **$.ajax** would appear as a function call and it would be possible to enumerate and check its parameters. In particular, we could verify that the second one is an object with a member named "error". JavaScript parsing is a wide topic... In my [library](https://github.com/ ArnaudBuchholz/gpf-js), I created a [tokenizer](https://github.com/ ArnaudBuchholz/gpf-js/blob/master/tokenizer.js) that is used in this website to apply syntax coloring but it does not really provides the program structure: it just identifies the keywords, strings, comments and symbols. ### esprima: a free JavaScript parser Fortunately, there are existing libraries that do the job for you. I will focus on [esprima](http://esprima.org/index.html). To make a long story short, esprima is a JavaScript library that parses JavaScript sources. It generates a [Sensible syntax tree](http://esprima.org/ doc/index.html#ast) format, compatible with [Mozilla Parser AST](https:// developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Parser_API). The previous example is converted into: `{ "type": "Program", "body": [ { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "FunctionExpression", "id": null, "params": [], "defaults": [], "body": { "type": "BlockStatement", "body": [ { "type": "ExpressionStatement", "expression": { "type": "Literal", "value": "use strict", "raw": "\"use strict\"" } }, { "type": "FunctionDeclaration", "id": { "type": "Identifier", "name": "onClick" }, "params": [], "defaults": [], "body": { "type": "BlockStatement", "body": [ { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "Identifier", "name": "freezeInterface" }, "arguments": [ { "type": "Literal", "value": "Please wait, updating information...", "raw": "\"Please wait, updating information...\"" } ] } }, { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "MemberExpression", "computed": false, "object": { "type": "Identifier", "name": "$" }, "property": { "type": "Identifier", "name": "ajax" } }, "arguments": [ { "type": "Literal", "value": "getInformation.aspx", "raw": "\"getInformation.aspx\"" }, { "type": "ObjectExpression", "properties": [ { "type": "Property", "key": { "type": "Identifier", "name": "success" }, "value": { "type": "FunctionExpression", "id": null, "params": [ { "type": "Identifier", "name": "data" } ], "defaults": [], "body": { "type": "BlockStatement", "body": [ { "type": "VariableDeclaration", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "resultCode" }, "init": { "type": "MemberExpression", "computed": false, "object": { "type": "Identifier", "name": "data" }, "property": { "type": "Identifier", "name": "resultCode" } } } ], "kind": "var" }, { "type": "IfStatement", "test": { "type": "BinaryExpression", "operator": "===", "left": { "type": "Identifier", "name": "resultCode" }, "right": { "type": "Literal", "value": 0, "raw": "0" } }, "consequent": { "type": "BlockStatement", "body": [ { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "Identifier", "name": "updateInterfaceUsingData" }, "arguments": [ { "type": "Identifier", "name": "data" } ] } } ] }, "alternate": { "type": "BlockStatement", "body": [ { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "Identifier", "name": "showError" }, "arguments": [ { "type": "Literal", "value": "An error occurred", "raw": "\"An error occurred\"" } ] } } ] } } ] }, "rest": null, "generator": false, "expression": false }, "kind": "init", "method": false, "shorthand": false } ] } ] } }, { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "Identifier", "name": "unfreezeInterface" }, "arguments": [] } } ] }, "rest": null, "generator": false, "expression": false }, { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "MemberExpression", "computed": false, "object": { "type": "CallExpression", "callee": { "type": "MemberExpression", "computed": false, "object": { "type": "Identifier", "name": "document" }, "property": { "type": "Identifier", "name": "getElementById" } }, "arguments": [ { "type": "Literal", "value": "myButton", "raw": "\"myButton\"" } ] }, "property": { "type": "Identifier", "name": "addEventListener" } }, "arguments": [ { "type": "Literal", "value": "click", "raw": "\"click\"" }, { "type": "Identifier", "name": "onClick" } ] } } ] }, "rest": null, "generator": false, "expression": false }, "arguments": [] } } ] }` In particular, the **$.ajax** call is: `// [...] { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "MemberExpression", "computed": false, "object": { "type": "Identifier", "name": "$" }, "property": { "type": "Identifier", "name": "ajax" } }, "arguments": [ { "type": "Literal", "value": "getInformation.aspx", "raw": "\"getInformation.aspx\"" },` And, at the same level, you find the call to **unfreezeInterface** call: `// [...] { "type": "ExpressionStatement", "expression": { "type": "CallExpression", "callee": { "type": "Identifier", "name": "unfreezeInterface" }, "arguments": [] } }` So it is possible to write a program that uses esprima to parse the JavaScript source and then analyze the result structure to do the necessary checks. ### verify.js So I created the program [verify.js](https://github.com/ArnaudBuchholz/ ArnaudBuchholz.github.io/blob/master/blog/post/Code%20analysis/verify.js) that relies on esprima to parse a file and locate the two problems described above. The main algorithm relies on the structure exploration done in the function [walk](https://github.com/ArnaudBuchholz/ArnaudBuchholz.github.io/blob/ 65e9bf2164a3081a90f92fe2ff4805479880ea6b/blog/post/Code%20analysis/ verify.js#L71) and, once the **$.ajax** structure is detected, the function [checkAjaxCallbacks](https://github.com/ArnaudBuchholz/ArnaudBuchholz.github.io/ blob/65e9bf2164a3081a90f92fe2ff4805479880ea6b/blog/post/Code%20analysis/ verify.js#L36) takes care of checking the parameters. To use it, first get it from the GitHub repository (use the [raw](https://raw.githubusercontent.com/ArnaudBuchholz/ ArnaudBuchholz.github.io/master/blog/post/Code%20analysis/verify.js) button). Supposing you have [nodeJs](http://nodejs.org/) installed, open a command prompt and type: * npm install gpf-js * npm install esprima ![Setup](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/setup.png) Then you are ready to go. Type "node verify" ![Help](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/help.png) Or download the two samples and test them: * [sample1.js](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/ sample1.js) ![Sample 1](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/ sample1.png) * [sample2.js](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/ sample2.js) ![Sample 2](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/ sample2.png) ### To conclude JavaScript linting with JSHint is necessary. However, you might need a more advanced tool that is capable of understanding the meaning of the algorithm to get a deeper validation of your code. With this article, I tried to demonstrate only a fragment of what can be done with the AST structure generated by esprima. But once the door is opened: * You can also check function signatures or dependencies, keep track of closures, handle variable types... * You may also think of generating statistics, documentation... * Why not modifying the AST structure in order to manipulate the code and generate a modified version of it: have a look on [escodegen](https://github.com/estools/escodegen).

No comments:

Post a Comment