1. Choice of NoSQL Database(s)

    Use Redis for storing simple datatypes (lists, sets, hashes, strings). Redis is basically a tool for accessing these datatypes in a distributed, scalable manner. I think of Redis as a cache more than as a persistence layer. Also, publish/subscribe is a great feature of Redis.

    Use Mongo to persist your “business objects.” Mongo is great for accessing collections of documents in a distributed, scalable manner. The full text search and map reduce features are handy too.

    If a program is based around tree or graph structures, maybe check out Neo4j.

    That’s how I choose what to store where anyway.

    Cheers,
    William

     
  2. Generate REST APIs with Baucis v0.2.1 for Node.js

    New Features

    The major changes in recent updates follow directly from relying on Express internally. Baucis v0.2.1 adds greater flexibility and decouples the Mongo collections from their REST endpoints.

    Now, it’s also easy to create a separate file for each controller and add custom middleware and routes. This aids in building modular, maintainable APIs.

    baucis.rest now returns an instance of the controller created. These controllers are Express servers, and can be manipulated as such.

    As an aside, a neat trick about Express is that the servers it creates via express() are themselves middleware functions, so express apps can be embedded in other express apps via app.use('/some/path', otherApp).
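
    To see why that trick works, here’s a minimal sketch of the idea using only plain functions. This is a toy stand-in, not Express itself: createApp, the naive prefix matching, and the response.body field are all made up for illustration. The point is just that an "app" is a function with the middleware signature, so it can be mounted inside another one.

```javascript
// A toy stand-in for an Express-style app (NOT Express itself).
// createApp() returns a function with the middleware signature
// (request, response, next), plus a use() method for mounting handlers.
function createApp() {
  var layers = [];

  function app(request, response, next) {
    var index = 0;
    var originalUrl = request.url;

    function step() {
      // Restore the URL in case the previous layer was mounted at a prefix.
      request.url = originalUrl;
      var layer = layers[index++];

      // Out of layers: hand control back to whoever embedded this app.
      if (!layer) return next && next();

      // Naive prefix match, good enough for a sketch.
      if (originalUrl.indexOf(layer.prefix) !== 0) return step();

      // Strip the mount prefix so an embedded app sees its own root.
      request.url = originalUrl.slice(layer.prefix.length) || '/';
      layer.handler(request, response, step);
    }

    step();
  }

  app.use = function (prefix, handler) {
    layers.push({ prefix: prefix, handler: handler });
    return app;
  };

  return app;
}

// otherApp is itself a middleware function, so it can be mounted like one.
var otherApp = createApp();
otherApp.use('/hello', function (request, response) {
  response.body = 'hello from the embedded app';
});

var app = createApp();
app.use('/some/path', otherApp);
```

    With this sketch, a request for /some/path/hello reaches the inner handler, which sees only /hello. Real Express does far more (routing by HTTP method, error handling, and so on), but the embedding mechanism is the same shape.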

    If publish: false is passed into baucis.rest, routes are not automatically created at /modelNamePlural. This can be used to create a controller but attach it at an endpoint of your choosing.

    The restrict option was added for baucis.rest to aid in embedding middleware. It is executed before any Mongoose query is executed internally in the Baucis routes for that controller. It can be used to restrict one collection by a field in another, for example when a parent/child relationship exists.

    From the Baucis docs:

    var controller = baucis.rest({
      singular: 'foo'
    });
    
    var subcontroller = baucis.rest({
      singular: 'bar',
      publish: false, // don't add routes automatically
      restrict: function (query, request) { // allows direct access to the Mongoose queries
        query.where({ parent: request.params.fooId });
      }
    });
    
    // Embed the subcontroller at /foos/:fooId/bars
    controller.use('/:fooId/bars', subcontroller);
    
    // Embed arbitrary middleware at /foos/:fooId/qux
    controller.use('/:fooId/qux', function (request, response, next) {
      // Do something cool…
      next();
    });
    
    Controllers are Express apps, so do whatever you want with them.
    var controller = baucis.rest({
      singular: 'robot'
    });
    
    controller.use(function () { ... });
    controller.set('some option name', 'value');
    controller.listen(3000);
    

    Coming Soon

    Lately, I’ve been considering a few areas where progress is not so clear.

    For one, baucis.rest needs a way to add arbitrary middleware and routes before the automatic routes are added. I’m considering using an initialize parameter. Its value would be a function that receives the controller as an argument. Not sure yet.

    It might be nice to have middleware that had access to the request and response after load/save, but before the default response is generated. Mongoose’s schema.pre and schema.post can be used currently but don’t have access to the request or response. schema.pre can fire off errors that fit in well with Baucis.

    I’d also like to investigate using Mongoose’s QueryStream to make Baucis even more scalable, along with other ways to use streams in Baucis. I’m in the middle of testing a simple Baucis setup with httperf, so it will be interesting to see the metrics before and after switching to QueryStream.

    I also need to investigate using Connect as a dependency instead of Express.

    I’m looking forward to working on the project more as I find the time!

     
  3. To TDD or not to TDD?

    First let me say that automated testing is awesome. I like using test driven development (TDD) as well. But my experience leads me to believe that using TDD for every design task is too restricting. Architecture/systems/code design is a creative act, and insight is often realized spontaneously, in unpredictable ways.

    On top of that, every person and every team is different, and some variability in ideal design practice is to be expected.

    Personally, I find using only TDD too restrictive. Sometimes it’s nice to write a test first. Sometimes I want to stub and freehand an architecture or an algorithm to test some ideas in code. Maybe I want to follow a more BDD approach, or even create a flow chart or graphic with Gliffy. Sometimes writing tests can be put off for a few iterations; sometimes it’s essential to create them before anything else to ensure specific security or performance requirements are prioritized.

    The goal is to produce a working product that is solid, adaptable, and ready to put in front of potential customers as quickly as is reasonable.

    Every situation is different; people over process!

    Cheers,
    William

     
  4. Two Cool Data URIs

    Put these data URIs in your URL bar to see a couple simple editors.

    I forget where I saw this one. It’s a super simple text editor. It’s just a body tag with the contenteditable attribute set.

    data:text/html,%3Cbody%20contenteditable%3E%3C/body%3E

    This one was found in a comment to a blog article that has since been removed! The comment was by @oncletom, but it could originally come from somewhere else. (Who knows?) It loads a stylesheet and some JavaScript from GitHub with link and script tags.

    data:text/html,%3Clink%20href=%22https://raw.github.com/marijnh/CodeMirror/master/lib/codemirror.css%22%20rel=%22stylesheet%22%20type=%22text/css%22%3E%3Cscript%20src=%22https://raw.github.com/marijnh/CodeMirror/master/lib/codemirror.js%22%3E%3C/script%3E%3Cscript%20src=%22https://raw.github.com/marijnh/CodeMirror/master/mode/javascript/javascript.js%22%3E%3C/script%3E%3Clink%20href=%22https://raw.github.com/marijnh/CodeMirror/master/doc/docs.css%22%20rel=%22stylesheet%22%20type=%22text/css%22%3E%3Cstyle%20type=%22text/css%22%3E.CodeMirror%20%7Bborder-top:%201px%20solid%20black;%20border-bottom:%201px%20solid%20black;%7D%3C/style%3E%3Ch1%3EJavaScript%20code%20editor%3C/h1%3E%3Ctextarea%20id=%22code%22%20name=%22code%22%3E%3C/textarea%3E%3Cscript%3Evar%20editor%20=%20CodeMirror.fromTextArea(document.getElementById(%22code%22),%20%7BlineNumbers:%20true,matchBrackets:%20true%7D);%3C/script%3E

    Cheers,
    William

     
  5. RequireJS is Awesome

    RequireJS is the best loader because:

    1. It manages dependencies.
    2. It loads text into strings.
    3. It has an awesome compiler capable of minifying and concatenating your JS and text dependencies (templates).
    4. It works great with non-AMD modules via the shim config option.
    5. It works in Node.js too.
    6. It can be subbed out with another, optimized AMD loader for deployment e.g. Almond.
     
  6. Monolithic JavaScript Frameworks

    Don’t Hem Me In

    I come from a C#/Java background, and I am absolutely tired of monolithic frameworks like ASP.NET or RSF & SEAM. Some of their components are good, but they don’t try to be composable and are often inflexible and can contain anti-patterns.

    So, I’m holding back on learning Meteor, FlatIron, and some of the other all-inclusive frameworks that are coming out.

    Don’t get me wrong, I see they have value for certain types of applications, for beginners, and as a means to push JavaScript forward. The next generation of CMSs are itching to be developed, and these frameworks will pave the way for building them in Node.

    Composable Stack

    I’m beyond happy with the composable stack I’ve built around npm, bower, Node, Mongo, Redis, Grunt, RequireJS, jQuery/Zepto, Backbone, Mustache, Bootstrap, among others.

    I love that I can code with a 100% JavaScript stack that integrates easily with HTML5 and CSS3.

    Another thing I like about this stack is I can swap in or out components as new ones become available or better approaches become evident.

    That’s why I am running screaming from monolithic frameworks in JavaScript! I’ve been in that mental prison before, and I hope to never go back!

    A word about data binding…

    I think Ember and Angular are awesome, though they have some opinions I don’t agree with, mainly that I prefer to separate presentation and presentation logic into separate layers. I tend to shun markup and templating that describes presentation logic, rather than presentation.

    I really like how un-opinionated Backbone is. So far I’ve found data binding with Backbone to be straightforward, but I’m definitely keeping an eye on Ember and Angular.

    Cheers,
    William

     
  7. Refactor Early, Refactor Often

    Refactoring should be started early in a project’s development, and should be repeated often.

    Despite my love for really old computers like the TRS-80 and the Apple ][ series, the best that can be said about the built-in code editors of these systems is that they brought inexpensive programming to the public at a time when compilers and code editing software were expensive.

    I am glad there are many available open source and low cost tools for editing and compiling code; if you’ve ever edited a program by prefixing lines of code with a line number while typing into a REPL, you know that refactoring can sometimes be expensive.

    In today’s world of great open source or low cost compilers and editors, we are very lucky. There is no excuse to refactor as an afterthought. Code must be organized and improved until it can be improved no more. Not all at once, but gradually, through many refactors that each contribute to a deeper understanding of code and coding.

    Code can nearly always be more scalable or more readable, more resilient or more reusable.

    Refactoring saves time and money; it makes business sense. And, it is a matter of professional pride. It is a creative challenge that sharpens the mind.

    Refactoring is expensive only when you put it off. Why not schedule a refactor as part of your spring cleaning?

    Cheers,
    William

     
  8. WebKit First

    Writing Apps that Target Native & Web with One Codebase

    PhoneGap lets you write HTML5 apps that run as native mobile apps that can be downloaded from a mobile device’s app store. On iOS and Android that means WebKit.

    PhoneGap provides a JavaScript API that interfaces with the device’s native hardware. So you write against this JavaScript API, and it will use the native features of whatever platform it’s running on, iOS, Android, BB. Plus you use basically the same codebase to run your PC/Mobile Web app.

    Throw in responsive design for some truly amazing cross-platform compatibility.

    It costs less to target only WebKit, while gaining a foothold in roughly 45% of the PC/Laptop market, plus iOS / Android mobile browser and app store markets. That can be a major advantage.

    After you get done rocking all those markets, you can tweak or workaround the incompatibilities with IE and Firefox if and when there’s a clear ROI targeting those browsers.

    It’s not even like Firefox and IE have poor HTML5 support…

    HTML5 is ready now for many classes of apps.

    • Can I Use? is a resource for checking which browsers support which HTML5 APIs and other features.
    • StatCounter Global Stats provides interactive graphs showing statistics on browser usage, including mobile.
     
  9. The Advantages of Full Stack JavaScript: Node.js, I/O, Data, and Computation

    The advantages of Node.js for writing massively scalable servers are manifold. I’d like to share with you the advantages I think are most important: the ease with which you can write asynchronous server logic, the ability to write easily scalable servers, and the ability to write your app 100% in JavaScript.

    I wrote a couple of blog articles as an introduction to using JavaScript for each tier of an app’s stack: Why Full Stack javascript Rocks! (part 1) and Why FullStack JavaScript Rocks! (part 2).

    Asynchronous I/O

    Node.js makes it simple to write asynchronous I/O employing an architecture based on callbacks, promises, and/or events.
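
    As a rough illustration of those three styles, here’s the same non-blocking file read written with a callback, a promise, and an event emitter. The scratch file and the function names are made up for this sketch; only the built-in fs, os, path, and events modules are used.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');
var EventEmitter = require('events').EventEmitter;

// Hypothetical scratch file, created only so the sketch has something to read.
// (The synchronous write is one-time setup; the reads below never block.)
var scratchFile = path.join(os.tmpdir(), 'async-io-sketch.txt');
fs.writeFileSync(scratchFile, 'hello');

// 1. Callback style: Node calls back when the read completes.
function readWithCallback(callback) {
  fs.readFile(scratchFile, 'utf8', callback);
}

// 2. Promise style: wrap the same call in a promise.
function readWithPromise() {
  return new Promise(function (resolve, reject) {
    fs.readFile(scratchFile, 'utf8', function (error, text) {
      if (error) reject(error);
      else resolve(text);
    });
  });
}

// 3. Event style: emit 'data' or 'error' when the read completes.
function readWithEvents() {
  var emitter = new EventEmitter();
  fs.readFile(scratchFile, 'utf8', function (error, text) {
    if (error) emitter.emit('error', error);
    else emitter.emit('data', text);
  });
  return emitter;
}
```

    In all three cases the calling code returns immediately and the single thread is free to handle other requests while the read is in flight.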

    Blocking I/O is to be avoided. To put it simply, it breaks Node’s concurrency model to use blocking I/O. Blocking I/O can kill your ability to scale Node.js clusters.

    The advantage of asynchronous I/O is that you never have to really worry about writing concurrent code. All your app/server code is executed in a single thread, and instead of dealing with the pain of multi-threading, one simply spins up stateless Node.js worker drones on each core, or distributed in a virtual server farm. Node.js makes writing scalable, concurrently-executing server farms easy!

    Heavy Computation

    Google’s V8 engine, which Node uses to execute JavaScript, is by no means slow, but long, blocking calculations could be a limitation for certain computationally intensive apps and servers.

    Most Node.js code is incredibly lean, and day-to-day coding of a massively scalable app does not require any intense computation. Examples of intense computation would be encryption (check out the built-in `crypto` library), machine learning, or rendering video.

    Even though these are approaches of last resort, and rarely needed, I would like to mention a few for completeness:

    One might, for example, implement a clustered Java server with a REST API that interfaces with the NLP Toolkit. Your node servers would just need to communicate with them via REST. Not too difficult.

    Maybe just offload that stuff to Hadoop or SOLR.

    Or you might evaluate if `node-cuda` fits into your project so you can offload calculation to GPUs asynchronously.

    Node is extensible via C++ addons, so you can get close to the metal in the rare case you need to. You can also asynchronously interface with UNIX sockets.

    server.listen('/tmp/echo.sock', function() { ... });

    MongoDB

    For most situations MongoDB is the scalable data storage/compute farm of choice. And awesomely, you write queries for it with 100% JavaScript.

    Really, most people haven’t been writing raw SQL for years, in any maintainable way. It’s great for one-off scripts, etc., but usually a framework such as Hibernate, NHibernate, etc. is employed to abstract away the repetitive aspects of accessing data. (There are often multiple data access abstraction libraries for each programming language.)

    MongoDB moves most of this logic to the database, so only a thin ORM (or ODM) is needed. Mongoose ODM is a handy, thin wrapper to MongoDB for Node.js.

    Mongo stores data in a JSON-like binary format (BSON) that can be retrieved with JSON and JavaScript queries. It also supports MapReduce and has clustered storage and redundancy built in.

    Need to scale Mongo? Just spin up another replica set. Or take the easy route and use a hosted MongoDB service such as MongoHQ. Nodejitsu has built-in support for easily creating and interfacing with Mongo (and Redis) instances in the cloud.

    
    function finishReading (callback) {
      var query = mongoose.models['article'].findOneAndUpdate(
        {
          title: 'The Advantages of Full Stack JavaScript: Node.js, I/O, Data, and Computation'
        },
        {
          $push: { readers: 'you' }
        },
        {
          upsert: true,
        }
      );
    
      query.exec(callback);
    }
    

    Welcome to massively scalable 100% JavaScript.

    Cheers,
    William

     
  10. Node.js: Creating a Random String

    Here’s a basic example of how to produce a string of pseudo-random characters from 0x00 to 0xFF and then convert it to the range of 0x61 to 0x7A, otherwise known as the character codes for lowercase a through z (in ASCII).

    You might want to do this for generating a pseudo-random, readable verification code, for example.

    The characters in the resulting string are not evenly distributed between a and z because of rounding (Math.floor), but serve our purposes well, in this case. The resulting code is hard to guess, because the range of codes that can be generated is sufficiently large, approximately 6 undecillion (6 × 10^36).

      var crypto = require('crypto');

      // first create 26 random bytes in the range 0 to 255
      crypto.randomBytes(26, function (error, buffer) {
        if (error) throw error;
    
        var i;
        var verificationCode = '';
    
        // loop through each byte
        for (i = 0; i < buffer.length; i++) {
          var c = buffer[i]; // the byte value, in range 0 to 255
          // scale to range 0-25 and round down (dividing by 10.24 would
          // top out at 24, so 'z' could never be produced)
          var c2 = Math.floor(c * 26 / 256);
          var c3 = c2 + 97; // ASCII a to z is 97 to 122
    
          // now convert the transformed character code to its string
          // value and append to the verification code
          verificationCode += String.fromCharCode(c3);
        }
    
        // For example, 'wibrlxmmtdndnasincbmakxyki'
        console.log(verificationCode);
      });
    
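
    If the uneven distribution matters for your use case, one common alternative is rejection sampling: throw away the few byte values that would tip the balance. This is a sketch of that different technique, not the approach used above, and randomLowercase is a name I made up for it.

```javascript
var crypto = require('crypto');

// Returns `length` random lowercase letters, each letter equally likely.
function randomLowercase(length) {
  var letters = '';

  while (letters.length < length) {
    // Draw a batch of random bytes; some will be discarded below.
    var bytes = crypto.randomBytes(length);

    for (var i = 0; i < bytes.length && letters.length < length; i++) {
      // 234 = 9 * 26 is the largest multiple of 26 that fits in a byte.
      // Discarding 234..255 leaves exactly 9 byte values per letter.
      if (bytes[i] >= 234) continue;
      letters += String.fromCharCode(97 + (bytes[i] % 26));
    }
  }

  return letters;
}

console.log(randomLowercase(26));
```

    The trade-off is a loop that occasionally needs a second batch of bytes, in exchange for every letter being exactly as likely as any other.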

    Cheers,
    William

     
  11. Tooting Your Own Horn

    When bidding for projects on sites like oDesk or Elance, it’s necessary to let people know why you are the best engineer for their project. I feel proud of this recent proposal, and thought I would share it:

    Hi. I’m a full stack JavaScript developer. (Node.js, Mongo, Backbone.) I have developed software professionally for 9 years. Prior to that I have studied coding as a child/hobbyist/student for 13 years.

    I have 1.5 years experience in MongoDB, and am familiar enough with SOLR for demo-writing purposes. I have experience with NLP and Computational Intelligence. I have read a wide array of research on ML and search, and am intimately familiar with indexing and search concepts.

    I’m familiar with SOLR in the context of scalability, have read the documentation, and have previously implemented basic integration of SOLR with MongoDB and Node.js.

    I have experience with AWS, specifically EC2, CloudFront, and S3. I have years of experience maintaining physical servers and have found the last 1.5 years using cloud hosting on AWS and Nodejitsu to be a major time and money saver.

    Please let me know if I can answer any of your questions.

    Cheers,
    William Riley-Land

     
  12. Nodejitsu: Fixing an ETIMEDOUT Error During Deployment

    I stayed up a bit late the other night stuck on this error while deploying an MVP for an app I’m working on: Paperless, the paperless exhibition catalog.

    At first I thought the problem was a number of large image files that are part of this app. Nodejitsu isn’t your best choice for hosting static files, nor is it designed to be. You should be using something like Amazon S3 or another static hosting solution.

    OK, but my project doesn’t have that many large files yet, and I need to get this out tonight ;)

    info:    Creating snapshot 0.0.1-10
    info	 Uploading: [=============================] 100%
    error:   Error running command deploy
    error:   ETIMEDOUT
    error:   Error: ETIMEDOUT
    error:       at Object._onTimeout (/usr/local/lib/node_modules/jitsu/node_modules/request/main.js:564:15)
    error:       at Timer.list.ontimeout (timers.js:101:19)
    help:    For help with this error contact Nodejitsu Support:
    help:      webchat: >
    help:          irc: 
    help:        email: 
    help:    
    help:      Copy and paste this output to a gist (http://gist.github.com/)
    info:    Nodejitsu not ok
    >

    I started drastically reducing the JPEG quality of said image files. It was timing out around 46%. Then 95%. And then after 100%. In the words of Ryu Hayabusa, “What the…”

    As it was 2 in the morning I decided to get some rest.

    In the morning, I turned to the always helpful folks in the #nodejitsu chat room on chat.freenode.net. Turns out this was a local timeout, and had nothing to do with the network — not immediately obvious.

    The solution was to increase this timeout’s length. Setting it to 1,000,000 milliseconds, or about 17 minutes, solved the problem:

    wprl ~/code/paperless-exhibit $ jitsu config set timeout 1000000
    
     
  13. Performance Test: WebSockets vs. Server Sent Events (EventSource)

    Server Sent Events (SSE) and WebSockets are similar technologies on the surface. The major differences seem to be that SSE does not require a new protocol or full duplex connection. Instead, vanilla HTTP is used. And of course, SSE sends messages only from server to client and not from client to server.
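
    To show what "vanilla HTTP" means here, this is a minimal sketch of the SSE wire format served from a plain Node request handler. It is not the connect-sse code used in the test below; formatSseMessage and handleEventStream are names I made up.

```javascript
// Serialize one message in the text/event-stream wire format.
function formatSseMessage(data, eventName) {
  var message = '';

  // The optional "event:" field names the event type.
  if (eventName) message += 'event: ' + eventName + '\n';

  // Each line of the payload gets its own "data:" field.
  String(data).split('\n').forEach(function (line) {
    message += 'data: ' + line + '\n';
  });

  // A blank line terminates the message.
  return message + '\n';
}

// A plain Node HTTP request handler that keeps the response open and
// writes events to it. No protocol upgrade is involved.
function handleEventStream(request, response) {
  response.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });
  response.write(formatSseMessage('hello'));
}
```

    Passing handleEventStream to require('http').createServer and pointing a browser’s EventSource at that server would be enough to receive the message. The channel only goes one way, so anything the client wants to say has to travel in a separate, ordinary request.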

    Server Sent Events are supported by ~60% of browsers, but not any version of IE. WebSockets are supported by ~55% of browsers, including IE10.

    I wanted to informally test the speed of each technology. In this experiment, WebSockets and Socket.io were pitted against EventSource and connect-sse in Safari 6.0.2 on a MacBook Pro 2.4GHz Core i5. Each iteration of the test has the server send 10,000 messages and records the time to send all 10,000 from the server, and the time to receive all 10,000 on the client. 100 iterations were performed. The results were then averaged and the standard deviation calculated.
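
    For reference, the averaging step can be sketched like this. It assumes the population standard deviation (dividing by n); the actual reporting code may use the sample form instead, and summarize is a made-up name.

```javascript
// Given the elapsed time of each iteration in milliseconds, return the
// mean and the (population) standard deviation.
function summarize(times) {
  var n = times.length;

  var mean = times.reduce(function (sum, t) {
    return sum + t;
  }, 0) / n;

  // Average squared distance from the mean.
  var variance = times.reduce(function (sum, t) {
    return sum + (t - mean) * (t - mean);
  }, 0) / n;

  return { mean: mean, standardDeviation: Math.sqrt(variance) };
}
```

    A small standard deviation relative to the mean suggests the iterations were consistent, which matters when the two technologies end up close to each other.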

    Check out the source code for testing and reporting on GitHub.

    Initially it looked like Socket.io was 2-4 times slower than SSE. Then, I remembered Socket.io has debug mode enabled by default (doh). After correcting this oversight, the speed of each was shown to be within the same order of magnitude.

    The mean time to send 10,000 messages averaged over 100 iterations (µ) is recorded in milliseconds.

    fig. 1 Results for sending SSE events to the client
    fig. 2 Results for sending Socket.io events to the client
    fig. 3 Results for receiving SSE events from server
    fig. 4 Results for receiving Socket.io events from server
     
  14. Tumblr + Node.js = Blog (part 2)

    I added paging to the embedded Tumblr blog. Here’s a gist giving an outline of how you can implement paging. The blog is served from /blog, and also accepts page numbers via e.g. /blog/p3
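
    As a rough idea of the URL handling involved, here’s a sketch that is separate from the gist. The function names and the posts-per-page value are assumptions for illustration.

```javascript
// Map a request path to a page number: /blog is page 1, /blog/p3 is page 3.
// Returns null for paths that are not blog pages.
function parsePage(pathname) {
  var match = /^\/blog(?:\/p([1-9]\d*))?\/?$/.exec(pathname);
  if (!match) return null;
  return match[1] ? parseInt(match[1], 10) : 1;
}

// Convert a 1-based page number into the offset of its first post.
function pageToOffset(page, postsPerPage) {
  return (page - 1) * postsPerPage;
}
```

    For example, with 10 posts per page, /blog/p3 maps to page 3, which starts at offset 20.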

    I’ll be adding more comments and documentation as I continue to develop the blog software. Please feel free to ask if you have any questions in the meantime…

    Cheers,
    William

     
  15. Sample RequireJS Config

    Here is an example showing the “paths” and “shim” options for you to check out. All the client-side libs were installed in the /components directory using bower.
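
    For a sense of the general shape, here’s an illustrative sketch of a config using those two options. It is not the original example: the library list and the file paths under /components are assumptions and will differ depending on what bower installed.

```javascript
// Illustrative RequireJS configuration (paths are hypothetical).
var config = {
  // Resolve module IDs relative to the bower install directory.
  baseUrl: '/components',

  // "paths" maps short module IDs to file locations (no .js extension).
  paths: {
    jquery: 'jquery/jquery',
    underscore: 'underscore/underscore',
    backbone: 'backbone/backbone'
  },

  // "shim" describes non-AMD scripts: what they depend on and which
  // global they export once loaded.
  shim: {
    underscore: { exports: '_' },
    backbone: { deps: ['underscore', 'jquery'], exports: 'Backbone' }
  }
};

// In the browser, after require.js has loaded:
// requirejs.config(config);
```

    With something like this in place, a module can ask for 'backbone' by name and RequireJS will load its dependencies first.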