Very fast SPI-to-parallel interface for Netduino


In my previous post about the Netduino SPI, I pointed out how to perform a faster data transfer. However, to achieve better performance, the hardware required a pretty complex circuit, and probably most of you don’t like too many components.
This time my actual goal is to connect a normal character-matrix LCD module to my Netduino. Several times I’ve used the fantastic uLiquidCrystal library available on CodePlex, which has been adapted for the 74HC595 shift register by Szymon Kobalczyk.

So, what’s the problem with doing that?
Although that library is very well done, it has several limitations:

  • it performs the data exchange very slowly;
  • it is quite large (due to the many plugins offered);
  • the SPI port cannot be shared with other devices.

Let’s analyze how uLiquidCrystal works.

The “basic” approach.

I’ll consider only the 74HC595 version, because it is a very well-known chip, especially in the Netduino world. The 74HC595 is a serial-in/parallel-out shift register that allows very easy interfacing with any SPI-enabled device, such as the Netduino or the Arduino.
From the hardware perspective, the interface is pretty simple: it requires just one 74HC595 chip, and an optional transistor for the LCD backlight (not shown here).

Basically, the SPI’s master-out serial data output (MOSI) feeds the shifter input. Each bit is shifted in by the master clock (SCLK), on every rising edge. Finally, this technique takes advantage of the rising edge of an additional SPI master output (SS) to freeze a snapshot of the shift register onto the parallel output latch.
That’s all.
Intuitive as it is, this approach suffers from serious limitations, especially in terms of performance. In the Netduino world (i.e. the .NET Micro Framework), the SPI is driven automatically at a low level, in contrast to the Arduino fashion, which implies an imperative “manual” management of the outgoing stream. So, the Netduino way is much like a “buffer-oriented” approach, while the Arduino way is simply “byte-oriented”. The SS pin of the Netduino SPI goes active (false, i.e. low, in this case) at the beginning of the transfer. When the entire stream has been flushed out, the SS is restored to its inactive state (i.e. true).
Since the register latch is updated by the SS rising edge, the automatic management offered by the Netduino is clearly no advantage here: the outputs would be latched only once, at the end of the whole buffer. Instead, the stream must be split into several one-byte blocks, which makes the software more complex.
Moreover, since the low-level management of the SPI introduces a small delay before and after each transfer, the actual data rate turns out to be really poor.
Let’s take an example:

using MicroLiquidCrystal;

namespace NetduinoSpiBoost
{
    public class Program
    {
        public static void Main()
        {
            //create the transfer provider
            var lcdProvider = new Shifter74Hc595LcdTransferProvider(/* ... */);

            //create the LCD interface
            var lcd = new Lcd(lcdProvider);

            //define the LCD size (cols, rows)
            lcd.Begin(16, 2);

            while (true)
            {
                //clear the screen
                lcd.Clear();

                //set the text origin to the beginning of the first row
                lcd.SetCursorPosition(0, 0);
                lcd.Write("Upper row.");

                //set the text origin to the beginning of the second row
                lcd.SetCursorPosition(0, 1);
                lcd.Write("Lower row.");
            }
        }
    }
}


The above program does a very simple job: every second it clears the LCD screen, then writes one string on the first row and another on the second one.
Oh, yes: this program does almost nothing, but…unless the goal is just to play with characters, I guess that the display handling should be a kind of “dress” over the real application running underneath. Thus, the display driver has to be pretty lightweight and fast enough to keep the main application free to run.
Let’s take a peek at the SPI handshake during a single cycle of the above example:

NOTE: the light blue trace is the SCLK, and the yellow trace is the SS.
The very basic task used in the example takes almost 100ms to complete. The Netduino must “halt” its main application for such a long time, just to output 20 characters.
Not as good as expected.

Netduino does it better.

You may not believe it, but this “inability” to take advantage of the high-speed SPI almost drove me crazy.
Damn, I just need some way to latch the data onto the outputs after every single byte…is it possible that there’s no decent solution?
Finally…gotcha: the egg of Columbus!
Here follows the new schematic, much simpler than my previous one, and surely appreciated by anyone who does not like too many components!

On the LCD side, most of the lines have been rearranged. The backlight line has been taken off the circuit (the backlight is discussed later in this article).
So, the eighth output of the 74HC595 is now free: that’s the trick!
Just consider every single byte shifted into the 74HC595 as having its MSB high. As soon as this bit reaches the last stage (QHh/LATCH in the schematic), it produces a low-to-high transition that is used to latch the data onto the chip’s outputs.
That’s not enough, though.
The shift register must then be cleared (RESET), otherwise any further high bit shifted into the last stage would cause a spurious latch. I don’t want this.
Thus, the transistor circuit helps to solve the problem…twice!
First off, as soon as the LATCH signal rises, the transistor is driven into conduction, so its collector acts as a “short” to ground. This condition resets the 74HC595 shift register.
Secondly, when the SS is inactive (i.e. high), the transistor conducts as well, holding the 74HC595 steadily in the reset state. This allows the SPI to be shared safely with other devices, without involving the 74HC595 at all.
This simple trick is really valuable, because it makes it possible to take advantage of the automatic SPI management offered by the Netduino.
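
To make the self-latching idea easier to follow, here is a small behavioural model in plain C# that runs on a PC. It is only a sketch under simplifying assumptions: propagation delays are ignored, the reset is treated as instantaneous right after the latch, and the class and member names are made up for this illustration.

```csharp
using System;

// Simplified behavioural model of the self-latching 74HC595 circuit.
// Assumptions: no propagation delays, and the reset happens
// immediately after the latch pulse.
public class SelfLatchingShifter
{
    private int _shift;        // the eight shift-register stages
    public int Outputs;        // the parallel output latch
    public int LatchCount;     // how many latch pulses have occurred

    // One SCLK rising edge with the given MOSI level.
    public void Clock(bool mosi)
    {
        _shift = ((_shift << 1) | (mosi ? 1 : 0)) & 0xFF;

        // The last stage going high is the LATCH rising edge...
        if ((_shift & 0x80) != 0)
        {
            Outputs = _shift;  // ...which copies the stages to the outputs,
            LatchCount++;
            _shift = 0;        // then the transistor clears the shifter.
        }
    }

    // Shift a whole byte, MSB first (the usual SPI bit order).
    public void ShiftByte(byte value)
    {
        for (int bit = 7; bit >= 0; bit--)
        {
            Clock(((value >> bit) & 1) != 0);
        }
    }
}
```

In this model, shifting any byte with the MSB high produces exactly one latch on the eighth clock, and the shifter is empty again for the next byte. A byte with the MSB low is never latched, which is why the driver has to force that bit high.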

The handshake in the new scenario.

I think it is pretty interesting to have a look at some signals of the new circuit.
The first scope snapshot shows the eight clock pulses of the SPI “SCK” output (yellow trace), and the LATCH pulse (light blue) generated on every 8th rising edge of the clock. This very short pulse is enough to move the current shift-register data to the output register of the 74HC595.

The following picture shows the same signals, zoomed in around the short latching pulse. Please notice the duration of the pulse: about 100ns.

The next screenshot shows the same LATCH pulse (light blue, zoomed in even further), and the resulting RESET signal obtained by inverting that pulse.
Notice the smooth falling and rising edges of the RESET signal (yellow), due to the stray capacitance. A pull-up inverter is not the best choice for nanosecond timings, but it is surely good enough for a DIY circuit.

The last picture shows the overall performance of the driver.
Although the text strings are exactly the same as with the original driver, the CPU time is dramatically lower. About 8ms versus almost 100ms: roughly a 12x speed improvement.

The LCD driver software (early stage)

Now, let’s take a look at the software.
From the application perspective, the usage of the new driver looks similar to the previous one. To better compare the performance of the two drivers, the application displays the same content on the LCD in both cases.

using SecretLabs.NETMF.Hardware.Netduino;
using Toolbox.NETMF.Hardware;

namespace NetduinoSpiBoost
{
    public class Program
    {
        public static void Main()
        {
            //create the LCD interface
            var lcd = new LcdBoost(Pins.GPIO_PIN_D10);

            lcd.Begin(16, 2);

            lcd.SetCursorPosition(0, 0);
            lcd.Write("Upper row.");

            lcd.SetCursorPosition(0, 1);
            lcd.Write("Lower row.");

            while (true)
            {
                //transfer the video cache to the LCD
                lcd.Dump();
            }
        }
    }
}

By the way, this time the approach is more “declarative” than “imperative”. Basically, the text strings are “placed” in the video cache, then the physical transfer of the bytes is performed cyclically, in the main application loop. This approach offers a different point of view on how to treat our display, much more “WPF-like”, although the tiny Netduino cannot afford such a huge framework.
What does the LCD driver look like inside?
Well, instead of posting the entire code (which is not particularly long, though), I prefer to highlight some interesting points.

First of all, the entire driver is just one class. This is far more compact than the uLiquidCrystal library, which counts over 10 modules.
The driver hosts a video cache, one byte per character, which is the actual buffer accessed when the user’s application performs any operation, such as writing text. Since the LCD is character-based, the size of this cache won’t be a problem (a 16x2 module needs just 32 bytes).
To physically transfer the video cache to the LCD module, the main application must call the Dump method periodically. This could have been done automatically within the driver class, using a timer or a separate thread. However, since it consists only of a trivial method call, I prefer to leave this task to the main application, in an explicit fashion. This avoids noisy thread safeguards and useless overhead.
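
The video cache idea can be sketched in a few lines of plain C#. This is not the actual LcdBoost code: the class name, its members and the DumpRow helper (which only returns a row as text, standing in for the physical transfer) are hypothetical, and serve just to show that writing touches memory only.

```csharp
using System;

// Hypothetical sketch of a character video cache; not the LcdBoost code.
public class VideoCache
{
    private readonly int _cols;
    private readonly byte[] _cells;   // one byte per character cell
    private int _cursor;

    public VideoCache(int cols, int rows)
    {
        _cols = cols;
        _cells = new byte[cols * rows];
        Clear();
    }

    // Fill the cache with blanks and move the cursor home.
    public void Clear()
    {
        for (int i = 0; i < _cells.Length; i++) _cells[i] = (byte)' ';
        _cursor = 0;
    }

    public void SetCursorPosition(int col, int row)
    {
        _cursor = row * _cols + col;
    }

    // Writing only touches the cache; nothing is sent to the display here.
    public void Write(string text)
    {
        for (int i = 0; i < text.Length && _cursor < _cells.Length; i++)
        {
            _cells[_cursor++] = (byte)text[i];
        }
    }

    // Returns one row as text, standing in for the physical transfer.
    public string DumpRow(int row)
    {
        var chars = new char[_cols];
        for (int i = 0; i < _cols; i++) chars[i] = (char)_cells[row * _cols + i];
        return new string(chars);
    }
}
```

With a 16x2 cache, setting the cursor to (0, 1) and writing "Lower row." changes only the second row; the first row stays blank until something is written there, and nothing reaches the display until the dump is requested.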

        /// <summary>
        /// Perform the buffer transfer to the LCD module
        /// </summary>
        /// <remarks>
        /// This function resets the buffer index
        /// </remarks>
        private void Send()
        {
            //open the SPI using the specified configuration,
            //manage the buffer transfer, finally release the port
            using (var spi = new SPI(this._spiConfig))
            {
                //transfer the encoded buffer via WriteRead
                //(see "The physical transfer")
            }

            //reset buffer index
            this._bufferIndex = 0;
        }

The physical transfer is actually managed by the Send method, which is a private member of the driver class. This is because the video cache contains the characters to be displayed, but the external hardware needs the outgoing stream to be encoded. For instance, the eighth bit (MSB) must always be “true”. The handshake of the LCD module chip (HD44780) must also be taken into account, because it works in 4-bit bus mode.
Thus, the video cache (or rather, any required command) is encoded into a secondary buffer. This buffer is the actual outgoing stream managed by the SPI.
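
As an illustration of such an encoding, the sketch below turns one LCD byte into four outgoing bytes: two nibbles, each sent first with E high and then with E low, always with the MSB set to trigger the latch. The bit layout is purely an assumption made for this example (the real wiring is the one in the schematic), and the class is hypothetical, not part of the driver.

```csharp
using System;

// Hypothetical bit layout, for illustration only:
//   bits 0..3 -> LCD D4..D7, bit 4 -> RS, bit 5 -> E, bit 7 -> latch trigger
public static class LcdFrameEncoder
{
    private const int Rs = 0x10;
    private const int Enable = 0x20;
    private const int LatchTrigger = 0x80;

    // Encodes one LCD byte as four outgoing bytes.
    // Returns the buffer index following the last byte written.
    public static int Encode(byte value, bool isData, byte[] buffer, int index)
    {
        int control = LatchTrigger | (isData ? Rs : 0);
        index = EncodeNibble(value >> 4, control, buffer, index);   // high nibble first
        return EncodeNibble(value & 0x0F, control, buffer, index);  // then low nibble
    }

    private static int EncodeNibble(int nibble, int control, byte[] buffer, int index)
    {
        // E high: present the nibble on the bus
        buffer[index++] = (byte)(control | Enable | nibble);
        // E low: the HD44780 takes the nibble on the falling edge of E
        buffer[index++] = (byte)(control | nibble);
        return index;
    }
}
```

Under this assumed layout, the character 'A' (0x41) sent as data becomes the four bytes 0xB4, 0x94, 0xB1, 0x91. Each of them has the MSB high, so each one latches itself onto the 74HC595 outputs.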

The physical transfer

I guess that the Send method deserves some additional consideration.
First off, the SPI device is instantiated specifically for the buffer transfer, then it is released. This allows the SPI to be shared easily with other devices connected to the same wires.
Secondly, even though there is no incoming data, the WriteRead method defined by the .NET Micro Framework offers a better way to perform a stream transfer. Since the alternative Write (only) method requires the exact buffer to be sent, the driver would have to create a byte array every time, then trash it after the transfer. This leads to unnecessary (and costly) background work for the garbage collector. A better approach is to keep a fixed-length byte array in the driver, and to count how many bytes should actually be sent. The WriteRead method supports this kind of usage.
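
The fixed-buffer pattern described above can be sketched without any hardware. In this hypothetical example the TransferHandler delegate stands in for an offset/count transfer call; on the Netduino it would presumably wrap the WriteRead overload that accepts an offset and a count. All the names here are assumptions for illustration, not the driver's own.

```csharp
using System;

// Stand-in for an offset/count transfer such as SPI.WriteRead (assumption).
public delegate void TransferHandler(byte[] buffer, int offset, int count);

public class OutgoingBuffer
{
    private readonly byte[] _buffer;   // allocated once, reused for every transfer
    private int _bufferIndex;          // how many bytes are waiting to be sent

    public OutgoingBuffer(int capacity)
    {
        _buffer = new byte[capacity];
    }

    public int Count { get { return _bufferIndex; } }

    // Queues one byte; returns false when the buffer is full.
    public bool Append(byte value)
    {
        if (_bufferIndex >= _buffer.Length) return false;
        _buffer[_bufferIndex++] = value;
        return true;
    }

    // Sends only the used part of the buffer, then resets the index.
    public void Send(TransferHandler transfer)
    {
        if (_bufferIndex > 0) transfer(_buffer, 0, _bufferIndex);
        _bufferIndex = 0;
    }
}
```

Because the array is created once in the constructor, repeated Send calls allocate nothing, so the garbage collector has no extra work to do regardless of how many bytes each transfer carries.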

The backlight driver

The backlight driver is not included in the LCD driver, because I think it has no direct relation to the data to display.
The LCD module I’ve used has no backlight, for instance. Furthermore, the backlight is a set of LEDs, connected in series or in parallel. The proper driver may vary from module to module, so it is not worthwhile to include any kind of backlight management in this class.
A feasible idea for the backlight driver could be to dim it by using any of the PWM outputs available on the Netduino.


Often the Netduino is considered too slow to perform certain operations.
Well, here is a very simple solution that takes advantage of the best “hidden” features of this nice board. Simple hacks and smart software can bring your Netduino a lot of satisfaction.
At the moment the driver is still at an early stage, but fully working.

The LcdBoost driver library is part of the .NET Micro Framework Toolbox on CodePlex.


10 thoughts on “Very fast SPI-to-parallel interface for Netduino”

    • Mario Vernari

      You are right.
      The main reason is that I’m adapting the library for several sizes of LCD module. I hope to post it soon, though.
      Thank you.

      • Mario Vernari

        Hello Pankaj. It’s a feature that could be implemented, but I didn’t plan to add such functionality.
        That’s because it’s something specific to that chip, and I want to keep the library abstract.
        If you have any suggestions on how to expose the cursor and the blinking capabilities, we might work on it.

  1. Jeff

    Your design looks really nice. I know you arent done with your code, but can you provide it as a start for someone else to work with?

    Maybe email a copy?
