Continuations VS Closures

posted on 2005-03-31

Wandering around, I found an article about Continuations for Web applications here. Please note that while trying to reach the same result, continuations approach is different from closures that I introduced in the post below. In fact, continuations captures not only the method arguments but the whole stack, environment, and variables of the program. That’s a lot more heavy and maybe not suitable for handling thousands of users.

Website Closures

posted on 2005-03-31

When I started working on Dinoparc I looked at several webscript languages and ended up implementing my own (see previous log). It works now very fine and very efficiently since both CPU and RAM are very low on the server (compared, of course, to PHP…). I had fun working on such a new technology and if I have another website to make I’ll definitly push it further.

One thing I would like to add to this language (it’s Motion-Types but with MotionScript generator and VM) are closure-links. What’s that ? When you generate dynamically a webpage, you generate a lot of links to other pages with some parameters embedded. Such parameters (http GET parameters) are really to be care because they can be used in a tricky way. One example I remember on a website we wrote was a amount of goods you wanted to sell. If you were changing the value to a negative one, you ended up loosing money and getting goods for cheap, since after some time, money was locked to 0 and you could get an infinite number of goods this way, and then sell them back normally for money. The fix was quick : just forbid negative values. But in fact, since there was 3 links available “sell 1, 5 and 10″, the value could as well have been hidden from the user, if not for (good) coding practices.

In all the cases, what you have to do to write good and secure website is to check all incoming parameters for consistency against the ones that were actually passed as parameters by the previous link. Such thing can be automated by a CRC. That means the link index.php?v=33;c=hello can be added an additional parameter crc that ensure that v=33 and c=hello. However any CRC can also be forged by the user, so this approach will not work (unless you try to hide the algorithm and hope nobody find it).

Another way is to use closures. In functional programming languages, a closure is a function (or a method) for which a given number of parameters have been already decided. For example let’s take a simple function :

  function plus(a,b) {
    return a + b;
  }

A closure plus(1) could be called “increment” since it will be a function that accepts one parameter “b” and adds 1 to it. A partial application (calling with less that the total numbers of parameters) of a function produces a closure in most of functional languages (this feature is called “currying“).

What closures does have to do with html links and parameters checking ? Well, if you consider every link as a function which takes some arguments (the parameters), then you can see that every followed link is actually executing a function, which generate a webpage (this analogy is particularity true if your website is a single application such as Dinoparc, and not a bunch of php scripts). So one possibility is, for every link that the user might follow, we can capture the name of the “function” and all the parameters into a closure, assign an unique id to it, and just give a link to a script executing the closure with the given id.

For example, if there is a link to a php script someScript.php?p1=V1;p2=v2 , you create a closure with “someScript.php” and parameters “p1=v1″ and “p2=v2″, give it an id (for example 456) and return instead a link to execute.php?id=456 . This way the parameters are hidden from the user which cannot hack them since the only thing he can change is the closure id. That gives us a good tip on how and where to store closures. Since the user can change closure id, he should not be able to access other people closures, so we have to store the closures into the current user session.

Let’s think about it : everytime I generate a page, I look at links with parameters, create closure for each of them, and replace the link with “execute this closure by id”. Closures are very small objects. They can even be a single couple (id,string) if you can parse later your URL and call the corresponding script. But there is a bunch of them… Every page might contains more than 20 closures if there is a lot of links having parameters. That means after 10 pages visited, you have something like 200 closures in your session data. Too much, you might think… One first optimization is to reuse closures ids which already have been generated. If one page have a link “run.php?cmd=echo” and one another page have the same link, then obviously following the first or the second link will result into the same result (for a given user) : that might reduce the number of generated closures, especialy if your website have fixed menus with parameters that are showing on every page.

Another propery of what I would call a “solid” website is the ability of the user to go “Previous” pages using his browser without annoying side effects. For example if I click the “buy” link and then later go back several times, I don’t want to buy a second time. Technicaly speaking, this is acheived by having a two-steps execution : at first do the side effects, but don’t generate any page : instead send a “Location” http header with url (that can itself be a closure) that will show the result page. This way if the user comes back his browser will directly redirect him to the result page and will not execute again the “buy” action.

Giving the ability of your user to go back is nice, but you might limit the number of pages in history, in order to keep the number of closures in session low. If you limit the history to 10 pages then it should already be quite enough and still keep the size of the session data small. Another way is to use closures explicitly in your program so only sensible links you want will be turned into closures. But that might be risky since you can never guess what an user can do.

Please note also that data coming from POST forms cannot obviously be protected with closures abstractions, since such data actually comes from the user. But in most websites, that’s quite normal and easy to check incoming forms for correctness.

About Structural Subtyping

posted on 2005-03-18

Some people ask me about structural subtyping in MotionTypes. If you Google for thoses words, you’ll see a bunch of research papers and not clear-and-simple definition. That’s because structural subtyping is more convenient when you need to write a proof about a specific typing algorithms.

So what’s structural subtyping ?

Structural subtyping means that an object B is a subtype of A (notation B :> A ) iff all fields of A are in B, with same types. In languages such as Java/C++/AS2… subtyping is synonym of inheritance ( B extends A ) or prototyping ( B implements A ). While it’s very easy to check B :> A , you have to encode in your application some strict rules in order to get proper types (you have to think and build an inheritance tree correctly, as part of your application architecture).

With structural subtyping, if you have :

    class Point {
        int x;
        int y;
        void add( Point p )  {
            x += p.x;
            y += p.y;
        }
    }

Then all classes having fields int x, int y and a function void add (Point) are implictly subtypes of Point, and can then be used everywhere the program need a Point : you don’t have to extends Point or create an interface AbstractPoint.

It goes more powerful when used together with type inference. For example let’s see this function :

    function f(o) {
        o.foo(33);
    }

Typing algorithm in compiler can infer about f type that it returns void and that it has one parameter o which is an object with at least a method foo which takes an integer argument and returns whatever. Type inference then can give you a type of f :

    void f ( o : { void foo(int)  } );

And if you have structural subtyping that means that you can apply f on any object have at least a method foo which takes an integer as parameter. I recommend reading the book of Benjamin Pierce Types and Programming Languages if you’re interested in that topic. It bring together good basics about theory, mathematical models and different typing algorithms.

New MTASC Website

posted on 2005-03-08

MTASC official site is now www.mtasc.org

OSCON 2005

posted on 2005-03-08

Congratulations! You have been accepted as a presenter for
the O’Reilly Open Source Convention 2005 at the Oregon Convention Center,
Portland, Oregon, August 01, 2005 - August 05, 2005.

The following has been accepted as a 45 minute session for the event:

“MTASC : Opening the Flash Platform”

People that want to meet me are welcome to visit OSCON