MinWin: Software Engineering 101

Posted in Industry, Programming, Software Engineering on November 20th, 2009 by Doug – 5 Comments

Ars Technica has a pretty interesting article about the MinWin project at Microsoft. MinWin is a software engineering project to refactor the Windows kernel into something more manageable.

But there’s always been a dirty little secret hiding underneath that iconic field of green grass. From an engineering and security standpoint, the foundation of Windows 2000 and Windows XP is absolutely horrible.

The fact that Microsoft was put into such a situation is, at the same time, surprising and understandable.  I understand how such a large software project can get out of hand quickly. When you develop software and focus on features and release dates over design and maintainability, you end up with the situation that Windows is in. Such things have been studied by software engineers for almost as long as people have been writing software.

What surprised me is that these problems did not bite them earlier and also that no one stopped to do this sort of organization work earlier. In today’s market full of cell phones and other low powered portable devices, it would be huge win for Microsoft if they could reuse an existing, stable operating system kernel instead of maintaining several kernel families for different markets / use cases. I’m surprised that no one saw the benefit of such flexibility in the Windows product until now.

While reading this, I contrasted the situation of Windows to that of Linux. I find Linux (the kernel, not necessarily all the crap that runs on top of it) to be a very well engineered system. A testament to this is the extremely varied systems that Linux is able to run on (iPods, phones, PDAs, laptops, desktops, supercomputers, you name it). I’m sure this capability has been a big factor for the many companies that have built products on top of Linux. Windows probably wasn’t even a consideration, since it is such an unadaptable, closed system.

The lesson-learned here: design and engineering are important and if you don’t figure these things out at the start or early on, you’ll spend a lot of time playing catch-up. It’s really hard to change the design of a system while you add new features and capabilities. I wonder if Microsoft could have saved time and money by simply starting over, with a fresh design and system. Will Microsoft ever give up the Win32 API?

Sony Reader – Library Link

Posted in Hardware, Industry, Software, Web on August 25th, 2009 by Doug – 1 Comment

Today Sony released details regarding their upcoming digital reader products.

sony-reader

One of the big features hardware-wise is a wireless 3G connection, much like Amazon’s Kindle. On the software side, Sony announced that the reader will be able to be synchronized with a user’s local library (if they support such a thing). The combination of these two features has me pretty excited.

From ArsTechnica,

According to Sony’s Haber, the new version of its online book store will allow users to enter their ZIP code, and determine whether the local library offers electronic versions of its books. These books can be downloaded, at which point they’ll have a 21-day expiration date—no late fees, as Haber was happy to point out. The New York Public Library’s representative announced that his organization would be taking part in the service. That’s a rather significant announcement, given that he said that the NYPL’s website was the second-most visited online library, behind only the Library of Congress.

I think this might have a big impact for students or anyone else who wants/needs access to many books. Imagine going to college and pairing your reader with your school’s library. Suddenly, buying textbooks becomes a thing of the past. Need to look something up, but don’t have the book? No problem: search the library’s catalog and pull in the book over the 3G connection.

This scenario might be wishful thinking for a little longer, but I think Sony’s announcement today goes a long way toward making it reality.

Wolfram|Alpha

Posted in Software, Web on June 8th, 2009 by Doug – 2 Comments

I’ve spent a fair amount of time using Wolfram|Alpha recently. Here are my impressions.

The weeks leading up to the launch of Wolfram|Alpha have created incredible amounts of hype surrounding the project. The creator, Stephen Wolfram, has not helped to lessen this hype. Even the project’s description sets a rather lofty goal

Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone.

The comparions to Google were immediate and plentiful. After all, Google describes their goal as making all the world’s information searchable and catalogued. However, Alpha is not Google. Here the term computable can nearly be thought of as making all of the data formatted in such a way that the computer can read, evaluate, and manipulate the data. A great example of this is to do something like produce a graph for a data set over time (such as population). Google, on the other hand, is best used as a catalogue that ultimately points you to data. It does very little of it’s own analysis other than what is required to build their index.

Alpha is a calculator, only a calculator that can understand natural language, access the relevant data and present it in a form that is easily understood by humans. As such, it’s an interesting tool. Alpha’s ability to display data on a webpage is particularly impressive. Take a look at what you get when you query Alpha about the International Space Station.

Alpha displaying the current location of the ISS.

Alpha displaying the current location of the ISS.

The results are impressive when Alpha knows how to handle your input. In the cases where it doesn’t (and there are many) leave much to be desired. For instance, searching for general terms does not yield much beyond basic information, and sometimes that information is only related to the core concept. Consider the next image where I searched for Computer Science. The result is no information other than some queries that Alpha can display results for.

cs

Alpha's results for the term "Computer Science."

If you are interested in results that are analytic or mathematical in nature, Alpha might be a really good resource. For encyclopedia like results, Wikipedia still rules the day. Of course, Alpha is not positioning itself to compete with Google or Wikipedia, but I think the comparisons are fair along the lines that all the mentioned services intend to provide information to people seeking information.

This brings me to the question, “exactly how useful is Wolfram|Alpha?” I though this might be the type of question that Alpha itself could help me answer. I started by searching for information about the United States’ current population.

US Population Results.

US Population Results.

These results are good and quite useful. Now, I wanted to get some information about education in the US.

education_usaNot as much as I’d like to see, and the numbers are a bit out of date. Here I’m a little dissappointed. I’d assume that certainly raw numeric data such as this could be computed upon to reach more conclusions. In my next search, I try something a little more advanced: let’s try to calculate the current portion of the population that has obtained a college degree. This should be a simple calculation, given the data. The term for this value is “education attainment,” something that I learned through a Google search that lead to the US Census Bureau’s page.

edu_attainmentNothing. Ok, maybe I need to be more verbose and use terms that I know already lead to data. So, I searched for “education usa, population usa.”

edu_comparisonWhat? The calculation isn’t even complete. Education enrollment has no result, even though I’ve already found those numbers before. Even if all the numbers had appeared, no useful calculation or comparison has been done.

The number that I’m actually searching for is 28%. This represents the portion of the population over the age of 25 that has obtained at least Bachelor’s degree. This information comes from a Census Bureau page that turned up in a Google search. It would have saved me a lot of time if Alpha had produced this number. Given the data that it has access too (US Census Bureau is listed as one of Alpha’s resources), the calculation is fairly straight forward as well.

The reason I’m searching for this number, 28%, is that I believe that the majority of people interested in the types of results Alpha can provide likely have a background in mathematics or science. Most of the impressive results are related to these fields. This is not hard to understand when you realize that “computable” human knowledge is much more likely to come from these fields than say, literature. Compare the results when searching for Poisson distribution to Gulliver’s Travels.

Search: "poisson distribution"

Search: "poisson distribution"

Search: "Gulliver's Travels"

Search: "Gulliver's Travels"

I think you can see the vast difference in the quality of the result returned. Now, back to that 28% number. I’d assume that people with a background in math or science are likely to have a college degree. Of the people with a college degree, only a portion will have studied math and/or science, but let’s just keep using the 28% number. What is 28% of the US population? Alpha can answer that.

college_popThe number: 85.63 million people. I’d conclude that this is a high estimate of the number of people who might ever be interested in the Alpha project. I think the majority of the population might type a query, get a graph, determine this is not what they want and move on.

People who are interested in these types of results will likely want to dig in deeper, as I tried to do above. I think at that point they will become disappointed in Alpha’s inability to understand what information they are trying to elicit. Further more, if they are able to obtain just the data they need, I wonder of what use it is other than as a cure for their own curiosity? It is difficult to impossible to determine where, exactly, Alpha got the source information it needed to reach its conclusions. Even if this source information was available, the user still has no idea exactly what equations, algorithms, or process was used on the data. Without this information, I find it difficult to believe that anyone would be able to use Alpha’s results in any sort of document or report.

I think a fundamental improvement to Alpha would be to return precise source information that ultimate points to the raw data along with a bit of Mathematica (or other) code that performs the calculation. Then, the user could take this information, verify its validity, modify it as necessary, and incorporate it into their own work. Until this is possible, I’m afraid Alpha will only be used for curiosity’s sake.

Despite these criticisms, I find Alpha to be a fascinating project, if only for it’s data presentation and cataloguing engine. The project is in an early stage, so perhaps the flaws I see are simply due to this. Perhaps the desired features are already in the pipeline. I hope that they are, because I’m a fan of what Wolfram is trying to do here. If they can improve the project to the point where these points are no longer an issue, I can see Alpha becoming the world changing product what Wolfram would have you believe it is. Until then however, you might be better off searching somewhere else.

This Stack goes to Infinity

Posted in Programming on May 10th, 2009 by Doug – Be the first to comment

In the official CPython implementation, the Python interpreter mixes the state of Python’s stack with the state of the underlying C code’s stack. Under normal operation, this design choice has little impact, but if one looks more closely there are some major drawbacks.

One such drawback is that Python’s stack depth is now limited by the maximum stack depth supported by the underlying CPython process. This might not seem so bad until you consider that one function call in Python may cause many C functions to be pushed onto the stack.

A solution to this problem is to separate Python’s stack from C’s stack. This is exactly what Stackless Python does. Now, Python’s stack size is only limited by the amount of memory in the machine.

I’ve implemented the Sieve of Eratosthenes example from last time using the tasklets and channels that Stackless provides. I haven’t really written any Stackless code until now, but I feel this is a decent translation.

import stackless

def ints(n, ch):
    while 1:
        ch.send(n)
        n = n + 1

def filter_mul(n, i, o):
    while 1:
        num = i.receive()
        if (num % n):
            o.send(num)

def sieve(i, o):
    while 1:
        head = i.receive()
        o.send(head)
        n = stackless.channel()
        stackless.tasklet(filter_mul)(head, i, n)
        i = n

def printer(i):
    while 1:
        print i.receive()

def gather(i, o, n):
    l = []
    for x in range(int(n)):
        l.append(i.receive())
    o.send(l)

def main():
    intch = stackless.channel()
    intt = stackless.tasklet(ints)(2, intch)
    sieveo = stackless.channel()
    sievet = stackless.tasklet(sieve)(intch, sieveo)
    #printert = stackless.tasklet(printer)(sieveo)
    gathero = stackless.channel()
    gathert = stackless.tasklet(gather)(sieveo, gathero, 1e4)
    print len(gathero.receive())

    stackless.run()

if __name__ == "__main__":
    main()

Running this results in decent performance:

--( 9:22 : $)-- ~/local/stackless/bin/python --version
Python 2.6.2
--( 9:24 : $)-- time ~/local/stackless/bin/python sless.py
10000

real    0m47.369s
user    0m46.123s
sys     0m0.076s

I tried running the same example to find the millionth prime, but after hours of execution time I killed the process. While the memory consumption was reasonable, the execution time suffered from, what I believe to be, much overhead from object creation and switching between tasklets. Still, it’s encouraging that this version didn’t fail due to stack depth problems.

Stackless Python is an interesting experiment, and I wish the features implemented here were part of the official Python release. If you haven’t heard of Stackless before, I encourage you to read about what it offers to Python and why this can be useful beyond the toy example I have here.

To Infinity

Posted in Programming on May 5th, 2009 by Doug – 6 Comments

In the programming language Scheme the concept of infinite lists can be implemented using streams. A stream is really nothing more than a cons cell that contains a value in the cdr field and a pointer to some code to generate the cdr field. While simple in design, the concept is powerful, allowing the programmer to create infinite sequences by simply defining a function to generate the next value in the sequence.

Python supports a similar concept in the form of generator functions. The implementation here is a little more complex, but it can be thought of as a function that returns a result, but maintains its current state for later execution.

Both can be used to write rather elegant code. Take for instance the Sieve of Eratosthenes. This can be implemented in Scheme like so (borrowed from SICP):

(define (integers-starting-from n)
(cons-stream n (integers-starting-from (+ n 1))))

(define (divisible? x y) (= (remainder x y) 0))

(define (sieve stream)
(cons-stream
(stream-car stream)
(sieve (stream-filter
(lambda (x)
(not (divisible? x (stream-car stream))))
(stream-cdr stream)))))

(define primes (sieve (integers-starting-from 2)))

In this code, a new stream is created to filter out the unwanted multiples from the previous stream.

Similarly, it can be expressed in Python using generators:

def integers(n):
while 1:
yield n
n = n + 1

def filter_mul(n, s):
for i in s:
if (i % n):
yield i

def sieve(stream):
while 1:
head = stream.next()
yield head
stream = filter_mul(head, stream)

The expression is elegant and theoretically allows us to represent an infinite list that is evaluated lazily. However, the reality strays from the theory. While this code works just fine to find small primes, it breaks for larger values:

;Loading "sieve.scm"... done

1 ]=> (stream-ref primes 1000)

;Aborting!: out of memory
>>> from sieve import *
>>> primes = sieve(integers(2))
>>> [primes.next() for x in range(1000)]
RuntimeError: maximum recursion depth exceeded

In both cases, we’ve exhausted the language implementation’s ability to hold the data structures necessary to represent the lazy lists. The result is elegant code that fail rather quickly. Now, we could always buy ourselves a little more execution time by increasing the heap for Scheme and the recursion depth for Python, but that will only take us so far.

So, that leaves me wondering: is there a similarly elegant way to write code of this sort that won’t expire so quickly? To clarify, I’d like to keep using streams and generators, but find some way to limit the overhead incurred from the creation of so many filters (1 for each prime!) without resorting to a different algorithm.

Update:
I want to address the alternative primes calculation listed in SICP:

(define primes
(cons-stream
2
(stream-filter prime? (integers-starting-from 3))))

(define (prime? n)
(define (iter ps)
(cond ((> (square (stream-car ps)) n) true)
((divisible? n (stream-car ps)) false)
(else (iter (stream-cdr ps)))))
(iter primes))

This procedure is still recursive (primes uses prime? which uses primes), but has much less memory overhead — I managed to generate primes much larger than the 1000th without hitting memory limits. However, this solution strikes me as not quite the same in spirit as the first. Here, a given integer is tested by comparing it to the current list of generated primes while the first definition is building up a function to generate the next prime without having to explicitly scan the list of the already generated primes.

#sxstarwars

Posted in Web on March 15th, 2009 by Doug – 4 Comments

Some days, it’s the really quirky things that make me smile. Today it’s the explosion of the #sxstarwars tag on Twitter.

I suppose it started as a play on the SXSW event taking place this week, but the really amazing thing is how fast something like this can catch on and spread throughout the users logged in to Twitter. The tag #sxstarwars became the number 2 trend almost as soon as someone @jaybushman started it.

I think its stuff like this that really shows why Twitter is something special.

Edit: credit where credit is due.

Palm Prē

Posted in Hardware, Web on January 10th, 2009 by Doug – Be the first to comment

This week at the CES, Palm announced its newest software platform and the smartphone to go with it. Personally, I’m impressed for a variety of reasons. pre

First off, Palm has put together a stunning hardware and software package. Check out some of the videos (another) of it in action. It’s slick and intuitive, which is definitely what Palm needed to pull off for people to pay any attention to them. As one journalist stated, “my iPhone suddenly felt old and played out.” What better praise could one have for a new device?

From what I saw, there are three things that really caught my attention. The first is what Palm calls “synergy,” which is really the phones ability to display related information in a single interface. As an example, the messaging application shows messages from all sources (sms, aim, whatever) in the same thread. You don’t need 3+ applications and interfaces to communicate with only one person. On a mobile device, this seems like it will play out to be a huge advantage.

The second thing is the phone’s ability to multi-task well. Applications can run in the background (something that the iPhone does not currently allow) and bring up notifications when events occurr. This in and of itself is not that innovative, but I think Palm’s card metaphor for managing applications and interfaces is. It seems to work well given the capabilities of the device.

The third thing that caught my eye is how well the interface works with gestures. I really like the idea of having a small gesture area below the screen so users can give  input without having to worry about accidentally clicking buttons or activating other functions. On top of that, the whole system seems to work well as a touch driven device. Even though I like gestures and a multi-touch screen, I’m really glad they included a physical keyboard. All of this combined seems like it will really work well on a mobile device.

So the actual device aside, there is another aspect of the event that is interesting to me: the press reaction. Before the  announcement, everyone seemed to be commenting on how much of an uphill battle Palm had before them. It had been a long time since Palm was viewed as a player in the cutting edge smartphone market (at least in the US). It seemed that a lot of critics felt it would  be an almost insurmountable challenge for Palm.

At the moment, a few days after the announcement when (to my knowledge) no one outside of Palm has actually held the device, the response of the press seems overwhelmingly positive. I think Ars Technica came up with the best word to describe the situation when they called it a resurrection. If anything, this device might be famous for how quickly it changed the outlook for Palm as a company struggling to survive in a market that is becoming more and more crowded with big players like Apple and Google. Sites are already posting comparisons between the iPhone and G1 and already (again without any real hands on time) they are showing the Pre in a positive light.

I had hoped that Palm would have something strong to show, but I didn’t expect so many journalists to become so entralled with the phone so quickly. Outside of  Apple, this sort of thing just doesn’t happen that often. My hope is that Palm takes advantage of it while positive forces are aligned on their side.

From what I can see, there are two outstanding issues that, potentially, could become game changers. The first is price. All of this will be no good for Palm if everyone ends up loving the device but doesn’t buy it because it’s too expensive. They really need  to think this one over before the announce the price, especally given the way the economy has turned out and the strong competition (which seems to have taken advantage of carrier subsidies). My thoughts are that Palm should launch the Pre with a low price. If they don’t gain marketshare fast, I think the Pre might slip from the limelight.

The second issue and perhaps the most important aspect for the phone’s long term survival is the development platform. The Pre runs what Palm is calling Web OS. The reason for this is that the development tools/sdk use prevalent web technologies (HTML, CSS, Javascript). What  makes it unique is that hooks have been provided to tap into the capabilities and services available on the phone (think: contacts, messaging, storage).  The advantage is that there are many developers who are very fluent with these technologies. The (potential) disadvantage is that the platform may be limited in some way. The Pre could face a situation where users will feel a strong desire to jailbreak their phones or use some other method to tap into the native capabilities of the device.

Closely tied to the development tools is the applications distribution service (assuming they create one). I think Palm should build one, but that it definitely should not be exclusive. In other words, developers should be able to use a Palm app store to easily sell their software, but they should not be prevented from selling through other outlets or simply giving it away for free on their homepage. I would hope that Palm has been in the PDA/smartphone buisness long enough to realize this, but I suppose there is always the temptation to follow Apple’s (flawed, imho) app store example.

In the end, I’m really impressed by what Palm showed off and I wish them the best of luck going forward. Despite the positive reaction, it seems that Palm still has an uphill battle ahead of them. I hope they make it to the other side, because as it stands now, I don’t think I’d mind having a Pre in my pocket.

Update: Here is a link to a video of the entire presentation of the Pre at CES. The middle of the video has a really good demonstration of the UI features and interaction with the device.

Living in the Clouds

Posted in Open Source, Software, Web on January 5th, 2009 by Doug – Be the first to comment

More than a few months ago, Richard Stallman commented on cloud computing and made some striking remarks.

read more »

White House Eggnog

Posted in Fun on December 31st, 2008 by Doug – Be the first to comment

The story is that the following is the eggnog prepared for President and Mrs. Clinton while they were in the White House.

I’ve tried it and verified that this recipe is too good not to share.

  • 5 egg yolks
  • 5 egg whites
  • 1 1/4 cup sugar
  • 3/4 cup bourbon
  • 3/4 cup cognac/brandy
  • 3/4 cup dark rum
  • 1 quart milk (up to 1 1/2 quart — to taste)
  • 1/4 teaspoon salt
  • 1 pint heavy cream
  • 2 dashes nutmeg
  • 2 teaspoons of vanilla extract

Combine yolks and sugar in bowl and beat to ribbon stage.

Add bourbon, congnac and rum to yolk mixture — Set aside

Whip egg whites and salt to stiff peak

Whip cream to medium peaks

Fold whipped egg whites and cream together

Combine egg whites/cream mixture with yolk-liquor mixture

Add milk and stir well

Add nutmeg and vanilla

Store and serve very cold – blend well bevore serving.

Happy New Year and enjoy!

Stay on Target: Real Life Tron on an Apple IIgs

Posted in Programming, Software, Software Engineering on October 9th, 2008 by Doug – Be the first to comment

This story is so great that I had to link to it. Go read it now.

Stay on Target: Real Life Tron on an Apple IIgs.

As soon as they started explaining how missiles were added to the game, I knew where this was going. I didn’t expect the result to be so entertaining and interesting. It’s certainly a spectacular bug.

It also reminded me of my early programming days in high school when I learned that you could crash Windows 95 by doing something similar to this.

char far *p = 0;
while(1)
    *p++ = 0;

I remember being disappointed when this didn’t work in Windows 98.