Liu 的个人资料唯有仰望是真实的照片日志列表更多 工具 帮助

日志


2006/6/30

view of the msn social relations with h3

As you can see in the previous post on this programm, i save the nick->friends_list in a dict. And I find a cool tool to represent the relations. http://graphics.stanford.edu/~munzner/h3/HypView.html
It's in put file format is like this:

INPUT FILE FORMAT

Each line of the input file is of the form

  depth identifier 1 group1 [group2 ... groupN]

with the following types

  int string int string [string ... string]

Each line corresponds to a node. The order of the lines is meaningful. The integer at the beginning of the line is the depth in the tree. A line with depth one greater than the line above it indicates a link between two nodes from the line above to the line below. The identifier strings are assumed to uniquely specify a node. If an identifier occurs twice that means a node has more than one incoming link, so the file describes a general graph, as opposed to a tree or a directed acyclic graph (DAG). The identifier string is used as an interface between the library and the application program in many methods which have the argument "string & id".

A file with structure like

0       A   [...]
1       B   [...]
2       C   [...]
3       D   [...]
4       A   [...]
3       E   [...]
2       F   [...]
1       G   [...]
1       H   [...]
2       I   [...]
2       J   [...]
1       K   [...]

corresponds to a graph that looks like:

  .-------- A
 /      ____|_______
 |     /   |   \    |
 |    B    G    H   K
 |   / \       / \
 |  C   F     I   J
 \ / \
  D   E

The very first node is the root, the numbers correspond to the number of layers deep in the hierarchy.

 

And this is the code i use to build the input file from my dict.

import cPickle
from sets import Set

def unescape(s):
    if '&' not in s:
        return s
    s = s.replace("&lt;", "<")
    s = s.replace("&gt;", ">")
    s = s.replace("&apos;", "'")
    s = s.replace("&nbsp;", " ")
    s = s.replace("&quot;", '"')
    s = s.replace("&amp;", "&")
    return s

def get_all(d):
    s = Set()
    for k, v in d.iteritems():
        s.add(k)
        if (len(v) > 1) and (len(v[1]) > 0):
            s.update(v[1])
    return s
   
degree_d = {}
def bft(T, node=None, degree=0, queue=[]):
    """this bft is really ugly, and it's for build the input file of h3,
    so we have to traverse the graph twice, so we can make our friends
    closer to me.
    """
    visited = Set()
    global degree_d
    if node not in degree_d:
        degree_d[node] = degree
    queue.append(node)
    visited.add(node)
    degree += 1
    num_people = len(get_all(T))
    while((len(visited) <  num_people) and queue):
        #print degree, len(visited)
        queue_tmp = Set()
        for node in queue:
            if node not in degree_d:
                degree_d[node] = degree
                visited.add(node)
            if (node in T) and (len(T[node][1]) > 0):
                queue_tmp.update(Set(T[node][1]).difference(visited))
        del queue
        queue = list(queue_tmp)
        degree += 1
               
def preorder(T, node=None, degree=0):
    print degree,

    try:
        title = T[node][0]
    except KeyError:
        title = None
       
    if title and not title == "MSN Spaces":
        """we should do the unescape when the title is stored, sorry for doing
         it here, it's because the crawling is already one it's way"""
        title = unescape(title)
        print node+" ("+title.replace("&#183;", "*")+")",
    else:
        print node,

    if degree%2:        # some quick hack to make the image colorful, I don't know what the magic number means:P
        print "0 html"
    else:
        print "0 image"

    global degree_d
   
    if (node in T) and (len(T[node][1]) > 0):
        for child in T[node][1]:
            if (child not in degree_d) or degree_d[child] >= degree:
                preorder(T, child, degree+1)

if __name__ == "__main__":
    import sys
    d = cPickle.loads(open(sys.argv[1]).read())
    #print len(d)
   
    bft(d, "ayueer")
    preorder(d, "ayueer", 0)

And to give you a screenshot of the picture.

 

www.doovee.com

CS多媒体组做的似乎是, 蛮好的, 看了那个雨一直下的搞笑版, 里面那个mm好pp的说, 哈哈

Madness

writed by Jp Calderone on March 5th, 2004
 
Twisted "chat server" in one expression
Tonight's messed up code:


(lambda r,p,b: (r.listenTCP(6665,(type('F',(p.Factory,object),{'protocol':(type('P',(b.LineReceiver,object),{'connectionMade':lambda s:s.factory.c.append(s),'lineReceived':lambda s,m:(s.factory.m(m),None)[1]})),'c':[],'m':lambda s,m:[c.sendLine(m)for c in s.c]}))()),r.run()))(*(lambda p,i:(i(p,'reactor'),i(p,'protocol'),i('twisted.protocols.','basic')))('twisted.internet.',lambda a,b:__import__(a+b,None,None,b)))
2006/6/29

twisted version of the msn space social graph build

After my post on the c.p.t, I received some help from a kind guy, but he write to me privately, so i'm not going to put the messege here. So below is the twisted version of the code, very ugly, but works. I want to get the argument passing right later.
And also, if you know some way to draw a directed graph nicely, please let me know.
 
import re, sys, socket
import urllib
from twisted.internet import reactor
from twisted.python import log
from twisted.web import client
from sets import Set

socket.setdefaulttimeout = 10
re_buddy_pre = re.compile("href=\"(?:http://)?([^\.\"]+)\.spaces.msn.com", re.IGNORECASE)  # http://nick.spaces.msn.com
re_buddy_post = re.compile("spaces.msn.com/members/([^\"\/]+)", re.IGNORECASE) # http://spaces.msn.com/members/nick/

all_buddies = Set()
relation_dict = {}

def get_buddy(page):#, buddy=None):
    #print page
    #print buddy
    global all_buddies
    buddies_of_this_page = Set()
    for re_buddy in [re_buddy_pre, re_buddy_post]:
        for m in re_buddy.finditer(page):
            buddies_of_this_page.add(m.group(1))
    #relation_dict[buddy] = buddies_of_this_page
    return buddies_of_this_page

def multi_buddy_factory(buddies):
    global all_buddies
    next_to_visit = buddies.difference(all_buddies)
    all_buddies = all_buddies.union(buddies)


    for buddy in next_to_visit:
        url = "
http://"+buddy+".spaces.msn.com/"
        try:
            got(url)
        except:
            continue

def got(url):#, buddy):
    down = client.getPage(url)#, buddy)
    down.addCallback(get_buddy)
    down.addCallback(multi_buddy_factory)
    down.addErrback(log.err)
    #down.addCallback(lambda ign: reactor.stop ())

log.startLogging(sys.stdout)

got(URL)#, "ayueer")
try:
    reactor.run()

except KeyboardInterrupt:
    print all_buddies
2006/6/28

today's /. article

Something Bruce Schneier reporting on his blog about the recent paper "how to defeat China's national firewall" .
 
No comments.

my post on comp.python.twisted

From: Jia Liu <ayuer.python <at> gmail.com>
Subject: Naive questions about how to do it twisted way.
Newsgroups: gmane.comp.python.twisted
Date: 2006-06-28 09:07:14 GMT (1 hour and 56 minutes ago)

Hi,
  I have been follow twisted for like half a year now, I have read the doc and the oreilly book, but i still cann't do some simple work in the twisted way. Could someone give me a hint?
  Like today, i write this little code to follow my friends list on my msn space, and get their friends list, so on and so on, and maybe at last I can build a social relationship graph with it. The code is actually really navie, here it goes. Sorry for the ugliness of my code, please help me make it prettier:)
  Thanks. Appologize for my poor English.
Jia Liu

import re, sys, socket
import urllib
from sets import Set

socket.setdefaulttimeout = 10
re_buddy_pre = re.compile("href=\"(?:http://)?([^\.\"]+)\.spaces.msn.com", re.IGNORECASE)  # http://nick.spaces.msn.com
re_buddy_post = re.compile("spaces.msn.com/members/([^\"\/]+)", re.IGNORECASE) # http://spaces.msn.com/members/nick/

all_buddies = Set()
relation_dict = {}

def get_buddy(page, buddy=None):
    #print page
    print buddy
    global all_buddies
    buddies_of_this_page = Set()
    for re_buddy in [re_buddy_pre, re_buddy_post]:
        for m in re_buddy.finditer(page):
            buddies_of_this_page.add(m.group(1))
    relation_dict[buddy] = buddies_of_this_page
    return buddies_of_this_page

def got(url, buddy=None):
    f = urllib.urlopen(url)
    return f.read(), buddy
   
URL = "http://ayueer.spaces.msn.com/"
(page, buddy) = got(URL, "ayueer")

buddies_of_this_page = get_buddy(page, buddy)
all_buddies.add("ayueer")
next_to_visit = buddies_of_this_page.difference(all_buddies)

try:
    while 1:
        #reactor.run()
        #multi_buddy_factory(buddies_of_this_page)
        for buddy in next_to_visit:
            url = " http://"+buddy+".spaces.msn.com/"
            try:
                (page, buddy) = got(url, buddy)
                buddies_of_this_page.update(get_buddy(page, buddy))
            except:
                pass
        all_buddies.update(next_to_visit)
        next_to_visit = buddies_of_this_page.difference(all_buddies)
except KeyboardInterrupt:
    print all_buddies



--
银筝夜久殷勤弄,心怯空房不忍归
2006/6/27

python unicode browser(zz)

interesting:)
 
Python unicode browser
In the process of playing around with punycode today, I ended up writing this:

import unicodedata, pydoc
pydoc.pager('\n'.join(['%s: %s - %x' % (unichr(i).encode('utf-8'), 
                                        unicodedata.name(unichr(i), 
'<NO NAME>'), i) for i in range(0xA1, 0x10000)]))
Make sure your terminal supports UTF-8 or you will not experience the full enjoyment.

socket.recv -- three ways to turn it into recvinto(zz)

Ways to reduce the overheat of malloc() in socket.recv(), looks good, maybe some day when I have time to get up my lazy ass and take some measurements on performence.
 
socket.recv -- three ways to turn it into recvinto
Inspired by http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408859

While some people are busy worrying about how to make Python's builtin sockets less efficient, one might be wondering if the reverse is possible - how do you make them more efficient? After all, you usally want your program to run more quickly, or tax your CPU less heavily, or consume fewer resources, not the reverse. Fortunately, I have just the solution for you1. The approach explored below will be to avoid allocating new memory when reading from the socket. Since malloc() is a relatively expensive operation, this will save us a bunch of CPU time, as well as saving us memory by reducing chances for heap fragmentation and so forth.

  1. Solution the first: readinto
    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket, array, os
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = array.array('c', '\0' * 50)
    >>> os.fdopen(c.fileno()).readinto(buf)2
    50
    >>> buf.tostring()
    'apiodjwoaidjaowidjalskdjlaksdjlaksjdawd\r\naiopjwdoa'
    >>> c.recv(10)
    Traceback (most recent call last):
      File "", line 1, in ?
    socket.error: (9, 'Bad file descriptor')
    >>> 
    

    As you can see, the handy readinto method of file objects can be used to provide a pre-allocated memory space for a read to use. Unfortunately, it is a file method, not a socket method (also, its documentation recommends strongly against its use, though I can't imagine why!). We can get around this, though, since a file descriptor is just a file descriptor. os.fdopen will happily give us a file object wrapped around the socket we're really interested in. Then it's a simple matter of calling readinto on the resulting file object with an array we have previously allocated.


    "Great!" you say. "Why even bother with the other two examples?" you wonder. Well, there are a few problems. Even if we accept the os.fdopen hack, and even if we do not let the strong words in the file.readinto docstring dissuade us, there's still a tiny problem. file.readinto closes the file descriptor before returning! Damn, there goes our socket. Maybe the next solution will fare better.

  2. Solution the second: recv(2)
    Okay, that stuff with file.readinto was just silly. Let's get serious here. libc already provides the functionality we need here, and has for decades. This is basic BSD sockets 101. Stevens would cry (if he were still with us) if he saw us doing anything else. So let's cut the funny business and just do what a C programmer would do: call recv.
    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import dl
    >>> libc = dl.open('libc.so.6')
    >>> import socket, array
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = array.array('c', '\0' * 50)
    >>> libc.call('recv', c.fileno(), buf.buffer_info()[0], 50, 0)
    29
    >>> buf.tostring()
    'aldjiawoidjaskdjlacnwmoqawd\r\n\x00\x00\x00\x00\x00
    \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
    \x00\x00\x00\x00' >>> >>> libc.call('recv', c.fileno(), buf.buffer_info()[0], 50, 0) 30 >>> buf.tostring() 'ncbnczmnxbcmznxcbzmnxbcu7wyw\r\n\x00\x00\x00\x00\x00
    \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
    \x00\x00\x00'

  3. Sweet. We open libc so we can call recv in it, create a socket as usual, and another array object to act as our pre-allocated memory location. Note we use the buffer_info method this time, because recv() does not expect a "read-write buffer object" (like file.readinto did), but a pointer to a location in memory, which is exactly what buffer_info()[0] gives us. Then we just call recv. Easy as eatin' pancakes. We can even do it twice, demonstrating that recv isn't doing anything ridiculous, like closing the socket for us (I did it with the same array object, overwriting the previous contents, demonstrating that our no-allocation trick is working just fine).

    I know what you're thinking, though. array objects? What the hell can you do with an array object? Well, here's what. All kinds of stuff! Why, you can build one from a string. Or build a string from one. Or, uh, swap the byte order... umm, oh yea you can reverse them too. Cool deal, eh? Err, no, maybe not actually... None of those cool string methods are around, unfortunately. You can create a string from the array but that kind of defeats the purpose... in doing so you've just allocated a pile of memory. Nuts. Well, wait, don't give up yet, we may be able to improve upon this situation...

  4. Solution the ultimate: recv(2) (uh yea, again).

    The only problem we really have with recv isn't actually with recv: it's with array! Let's not throw the baby out with the bathwater, then. Solution: drop array, keep recv. We want a string. Well, let's use a string.

    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket, dl
    >>> libc = dl.open('libc.so.6')
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = '\0' * 50
    >>> libc.call('recv', c.fileno(), id(buf)   20, 50, 0)
    36
    >>> buf
    'aodijaacnwuihaiuwdhkasjnbkawuhdawd\r\n\x00\x00\x00\x00
    \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' >>>

    It's the perfect solution. No wasted memory allocation, but the same level of convenience as a normal call to socket.recv. Rarely are we lucky enough to find such elegant and flawless solutions in computer science. The astute reader might object to the magical 20 in the recv call as being inelegant or flawed, however the value can easily be computed at runtime. The code to do so is extremely simple and only omitted because it slightly too large to fit in the margin.


So there you have it. Happy networking.


1 Sorry, it's way too late to post something useful. Especially when I could post something fun instead.
2 Note: in each example where socket IO occurs, I have launched telnet in another terminal and type in some random bytes.

keep on stealling from Jp Calderone

It seems there's a problem with tee when the file is too big, so do it by yourself.
 
Doubling bytestreams

For a while now, every day, we have been moving a somewhat hefty chunk of bytes, about 15GB worth from one machine onto another. The bytes are a tar file, generated in realtime and piped to ssh (connected to a host untaring the bytes). Pretty standard stuff, really. A while back we decided we wanted to send this tar to two hosts, instead of one. No big deal, we just ran tar twice, piping the output to an ssh process connected to a different host each time. Worked like a charm. Recently, we decided the load incurred by the second copy was heavy enough to be worth avoiding. Obvious solution: pipe tar to tee, send one of tee's outputs to one ssh, the other to the other.

That doesn't work. Woops.

For whatever reason, tee chokes after a bit less than 8 GB of data. write error it says, and poof one of the streams is dead (always the same one, interestingly, and the other one always carries on just fine). Rather than waste too much time trying to figure out who is at fault here (or perhaps as a way of doing so ;), I wrote this quickie:

#!/usr/bin/python

"""Write bytes read from one file to one or more files.
"""

import os, sys

from twisted.python import usage, log

class Options(usage.Options):
    def opt_out(self, arg):
        vars(self).setdefault('out', []).append(arg)
    opt_o = opt_out

    def postOptions(self):
        self.outfds = []
        for fname in vars(self).get('out', []):
            self.outfds.append(os.open(fname, os.O_WRONLY | os.O_CREAT))

def main(infd, *outfds):
    while 1:
        bytes = os.read(infd, 2 ** 16)
        if not bytes:
            break
        for i in xrange(len(outfds) - 1, -1, -1):
            try:
                os.write(outfds[i], bytes)
            except:
                log.msg("Error writing to %d" % (outfds[i],))
                log.err()
                del outfds[i]

if __name__ == '__main__':
    o = Options()
    try:
        o.parseOptions()
    except:
        raise # sys.exit(str(sys.exc_info()[1]))
    else:
        log.startLogging(sys.stdout)
        main(0, *o.outfds)

I called it yjoint, uncreative clod that I am (Hey, at least I didn't call it pytee). It's not exactly a drop-in tee replacement. We use it more or less like this:

exarkun@boson:~/$ tar c yummy_data | yjoint \
> --out >(ssh host1 tar x) \
> --out >(ssh host2 tar x)

Swapped tee out and yjoint in, and suddenly we are in business again.

I wonder what the deal with tee is?

what really happened on Mars(zz)

呵呵, 讲到有趣的bug, 我就想到这个了
下面是在http://research.microsoft.com/~mbj/这位同学主页上的两篇文章.
 

From: Mike Jones <mbj@microsoft.com>
Sent: Sunday, December 07, 1997 6:47 PM
Subject: What really happened on Mars?

The Mars Pathfinder mission was widely proclaimed as "flawless" in the early days after its July 4th, 1997 landing on the Martian surface. Successes included its unconventional "landing" -- bouncing onto the Martian surface surrounded by airbags, deploying the Sojourner rover, and gathering and transmitting voluminous data back to Earth, including the panoramic pictures that were such a hit on the Web. But a few days into the mission, not long after Pathfinder started gathering meteorological data, the spacecraft began experiencing total system resets, each resulting in losses of data. The press reported these failures in terms such as "software glitches" and "the computer was trying to do too many things at once".

This week at the IEEE Real-Time Systems Symposium I heard a fascinating keynote address by David Wilner, Chief Technical Officer of Wind River Systems. Wind River makes VxWorks, the real-time embedded systems kernel that was used in the Mars Pathfinder mission. In his talk, he explained in detail the actual software problems that caused the total system resets of the Pathfinder spacecraft, how they were diagnosed, and how they were solved. I wanted to share his story with each of you.

VxWorks provides preemptive priority scheduling of threads. Tasks on the Pathfinder spacecraft were executed as threads with priorities that were assigned in the usual manner reflecting the relative urgency of these tasks.

Pathfinder contained an "information bus", which you can think of as a shared memory area used for passing information between different components of the spacecraft. A bus management task ran frequently with high priority to move certain kinds of data in and out of the information bus. Access to the bus was synchronized with mutual exclusion locks (mutexes).

The meteorological data gathering task ran as an infrequent, low priority thread, and used the information bus to publish its data. When publishing its data, it would acquire a mutex, do writes to the bus, and release the mutex. If an interrupt caused the information bus thread to be scheduled while this mutex was held, and if the information bus thread then attempted to acquire this same mutex in order to retrieve published data, this would cause it to block on the mutex, waiting until the meteorological thread released the mutex before it could continue. The spacecraft also contained a communications task that ran with medium priority.

Most of the time this combination worked fine. However, very infrequently it was possible for an interrupt to occur that caused the (medium priority) communications task to be scheduled during the short interval while the (high priority) information bus thread was blocked waiting for the (low priority) meteorological data thread. In this case, the long-running communications task, having higher priority than the meteorological task, would prevent it from running, consequently preventing the blocked information bus task from running. After some time had passed, a watchdog timer would go off, notice that the data bus task had not been executed for some time, conclude that something had gone drastically wrong, and initiate a total system reset.

This scenario is a classic case of priority inversion.

HOW WAS THIS DEBUGGED?

VxWorks can be run in a mode where it records a total trace of all interesting system events, including context switches, uses of synchronization objects, and interrupts. After the failure, JPL engineers spent hours and hours running the system on the exact spacecraft replica in their lab with tracing turned on, attempting to replicate the precise conditions under which they believed that the reset occurred. Early in the morning, after all but one engineer had gone home, the engineer finally reproduced a system reset on the replica. Analysis of the trace revealed the priority inversion.

HOW WAS THE PROBLEM CORRECTED?

When created, a VxWorks mutex object accepts a boolean parameter that indicates whether priority inheritance should be performed by the mutex. The mutex in question had been initialized with the parameter off; had it been on, the low-priority meteorological thread would have inherited the priority of the high-priority data bus thread blocked on it while it held the mutex, causing it be scheduled with higher priority than the medium-priority communications task, thus preventing the priority inversion. Once diagnosed, it was clear to the JPL engineers that using priority inheritance would prevent the resets they were seeing.

VxWorks contains a C language interpreter intended to allow developers to type in C expressions and functions to be executed on the fly during system debugging. The JPL engineers fortuitously decided to launch the spacecraft with this feature still enabled. By coding convention, the initialization parameter for the mutex in question (and those for two others which could have caused the same problem) were stored in global variables, whose addresses were in symbol tables also included in the launch software, and available to the C interpreter. A short C program was uploaded to the spacecraft, which when interpreted, changed the values of these variables from FALSE to TRUE. No more system resets occurred.

ANALYSIS AND LESSONS

First and foremost, diagnosing this problem as a black box would have been impossible. Only detailed traces of actual system behavior enabled the faulty execution sequence to be captured and identified.

Secondly, leaving the "debugging" facilities in the system saved the day. Without the ability to modify the system in the field, the problem could not have been corrected.

Finally, the engineer's initial analysis that "the data bus task executes very frequently and is time-critical -- we shouldn't spend the extra time in it to perform priority inheritance" was exactly wrong. It is precisely in such time critical and important situations where correctness is essential, even at some additional performance cost.

HUMAN NATURE, DEADLINE PRESSURES

David told us that the JPL engineers later confessed that one or two system resets had occurred in their months of pre-flight testing. They had never been reproducible or explainable, and so the engineers, in a very human-nature response of denial, decided that they probably weren't important, using the rationale "it was probably caused by a hardware glitch".

Part of it too was the engineers' focus. They were extremely focused on ensuring the quality and flawless operation of the landing software. Should it have failed, the mission would have been lost. It is entirely understandable for the engineers to discount occasional glitches in the less-critical land-mission software, particularly given that a spacecraft reset was a viable recovery strategy at that phase of the mission.

THE IMPORTANCE OF GOOD THEORY/ALGORITHMS

David also said that some of the real heroes of the situation were some people from CMU who had published a paper he'd heard presented many years ago who first identified the priority inversion problem and proposed the solution. He apologized for not remembering the precise details of the paper or who wrote it. Bringing things full circle, it turns out that the three authors of this result were all in the room, and at the end of the talk were encouraged by the program chair to stand and be acknowledged. They were Lui Sha, John Lehoczky, and Raj Rajkumar. When was the last time you saw a room of people cheer a group of computer science theorists for their significant practical contribution to advancing human knowledge? :-) It was quite a moment.

POSTLUDE

For the record, the paper was:

L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. In IEEE Transactions on Computers, vol. 39, pp. 1175-1185, Sep. 1990.


This note was widely circulated after I sent it to a few friends in the systems community. Among other places, it appeared in Peter G. Neumann's moderated Risks Forum (comp.risks) on Tuesday, 9 December 1997 in issue RISKS-19.49.

Be sure to also read the follow-up message from Glenn Reeves of JPL, who led the software team for the Mars Pathfinder spacecraft.

Related Links:

From: Glenn E Reeves[SMTP:Glenn.E.Reeves@jpl.nasa.gov]
Sent: Monday, December 15, 1997 10:28 AM
To: jack@telerobotics.jpl.nasa.gov; mishkin@telerobotics.jpl.nasa.gov; rmanning@mail1.jpl.nasa.gov; miked@wrs.com; Mike Jones; gregg@wrs.com; dlre@chevron.com; David.Cummings@andrew.com; pky@appsig.com; karl_schneider@radixtek.com; lisa@wrs.com; dave@wrs.com; John Krumm; bilzabub@pclink.com; mike_conrad@atk.com; elf@wavemark.com; mlisa@erols.com; Thomas Blumer; shep@sunne.East.Sun.com; kmarks@xionics.com; frederick_hallock@lotus.com; athomas@xenergy.com; andy@harlequin.com; don@wavemark.com; bostic@bostic.com; samiller@red.primextech.com; higgins@daffy.fnal.gov; intersci@smof.demon.co.uk; kay@rsvl.unisys.com; glen@qnx.com; nev@bostic.com; jharris@mail1.jpl.nasa.gov; Richard A Volpe; Richard V Welch; Henry W Stone; Allen R Sirota; Kim P Gostelow; Robert D Rasmussen; David J Eisenman; Daniel E Erickson; Udo Wehmeier; Peter R Gluck; David E Smyth; Robert C Barry; Steven A Stolper; Alfred D Schoepke; David C Gruel; Guy S Beutelschies; Robert M Manning; Friedrich A Krug; Samuel H Zingales; Steven M Collins; J Morgan Parker
Subject: Re: [Fwd: FW: What really happened on Mars?]

What really happened on Mars ?

By now most of you have read Mike’s (mbj@microsoft.com) summary of Dave Wilner’s comments given at the IEEE Real-Time Systems Symposium. I don’t know Mike and I didn’t attend the symposium (though I really wish I had now) and I have not talked to Dave Wilner since before the talk. However, I did lead the software team for the Mars Pathfinder spacecraft. So, instead of trying to find out what was said I will just tell you what happened. You can make your own judgments.

I sent this message out to everyone who was a recipient of Mike’s original that I had an email address for. Please pass it on to anyone you sent the first one to. Mike, I hope you will post this wherever you posted the original.

Since I want to make sure the problem is clearly understood I need to step through each of the areas which contributed to the problem.

THE HARDWARE

The simplified view of the Mars Pathfinder hardware architecture looks like this. A single CPU controls the spacecraft. It resides on a VME bus which also contains interface cards for the radio, the camera, and an interface to a 1553 bus. The 1553 bus connects to two places : The "cruise stage" part of the spacecraft and the "lander" part of the spacecraft. The hardware on the cruise part of the spacecraft controls thrusters, valves, a sun sensor, and a star scanner. The hardware on the lander part provides an interface to accelerometers, a radar altimeter, and an instrument for meteorological science known as the ASI/MET. The hardware which we used to interface to the 1553 bus (at both ends) was inherited from the Cassini spacecraft. This hardware came with a specific paradigm for its usage : the software will schedule activity at an 8 Hz rate. This **feature** dictated the architecture of the software which controls both the 1553 bus and the devices attached to it.

THE SOFTWARE ARCHITECTURE

The software to control the 1553 bus and the attached instruments was implemented as two tasks. The first task controlled the setup of transactions on the 1553 bus (called the bus scheduler or bc_sched task) and the second task handled the collection of the transaction results i.e. the data. The second task is referred to as the bc_dist (for distribution) task. A typical timeline for the bus activity for a single cycle is shown below. It is not to scale. This cycle was constantly repeated.

    |< ------------- .125 seconds ------------------------>|
    |<***************|                    |********|   |**>|
                     |<- bc_dist active ->|    bc_sched active
    |< -bus active ->|                             |<->|
----|----------------|--------------------|--------|---|---|-------
    t1               t2                   t3       t4  t5  t1

The *** are periods when tasks other than the ones listed are executing. Yes, there is some idle time.

t1 - bus hardware starts via hardware control on the 8 Hz boundary. The transactions for the this cycle had been set up by the previous execution of the bc_sched task.
t2 - 1553 traffic is complete and the bc_dist task is awakened.
t3 - bc_dist task has completed all of the data distribution
t4 - bc_sched task is awakened to setup transactions for the next cycle
t5 - bc_sched activity is complete

The bc_sched and bc_dist tasks check each cycle to be sure that the other had completed its execution. The bc_sched task is the highest priority task in the system (except for the vxWorks "tExec" task). The bc_dist is third highest (a task controlling the entry and landing is second). All of the tasks which perform other spacecraft functions are lower. Science functions, such as imaging, image compression, and the ASI/MET task are still lower.

Data is collected from devices connected to the 1553 bus only when they are powered. Most of the tasks in the system that access the information collected over the 1553 do so via a double buffered shared memory mechanism into which the bc_dist task places the latest data. The exception to this is the ASI/MET task which is delivered its information via an interprocess communication mechanism (IPC). The IPC mechanism uses the vxWorks pipe() facility. Tasks wait on one or more IPC "queues" for messages to arrive. Tasks use the select() mechanism to wait for message arrival. Multiple queues are used when both high and lower priority messages are required. Most of the IPC traffic in the system is not for the delivery of real-time data. However, again, the exception to this is the use of the IPC mechanism with the ASI/MET task. The cause of the reset on Mars was in the use and configuration of the IPC mechanism.

THE FAILURE

The failure was identified by the spacecraft as a failure of the bc_dist task to complete its execution before the bc_sched task started. The reaction to this by the spacecraft was to reset the computer. This reset reinitializes all of the hardware and software. It also terminates the execution of the current ground commanded activities. No science or engineering data is lost that has already been collected (the data in RAM is recovered so long as power is not lost). However, the remainder of the activities for that day were not accomplished until the next day.

The failure turned out to be a case of priority inversion (how we discovered this and how we fixed it are covered later). The higher priority bc_dist task was blocked by the much lower priority ASI/MET task that was holding a shared resource. The ASI/MET task had acquired this resource and then been preempted by several of the medium priority tasks. When the bc_sched task was activated, to setup the transactions for the next 1553 bus cycle, it detected that the bc_dist task had not completed its execution. The resource that caused this problem was a mutual exclusion semaphore used within the select() mechanism to control access to the list of file descriptors that the select() mechanism was to wait on.

The select mechanism creates a mutual exclusion semaphore to protect the "wait list" of file descriptors for those devices which support select. The vxWorks pipe() mechanism is such a device and the IPC mechanism we used is based on using pipes. The ASI/MET task had called select, which had called pipeIoctl(), which had called selNodeAdd(), which was in the process of giving the mutex semaphore. The ASI/ MET task was preempted and semGive() was not completed. Several medium priority tasks ran until the bc_dist task was activated. The bc_dist task attempted to send the newest ASI/MET data via the IPC mechanism which called pipeWrite(). pipeWrite() blocked, taking the mutex semaphore. More of the medium priority tasks ran, still not allowing the ASI/MET task to run, until the bc_sched task was awakened. At that point, the bc_sched task determined that the bc_dist task had not completed its cycle (a hard deadline in the system) and declared the error that initiated the reset.

HOW WE FOUND IT

The software that flies on Mars Pathfinder has several debug features within it that are used in the lab but are not used on the flight spacecraft (not used because some of them produce more information than we can send back to Earth). These features were not "fortuitously" left enabled but remain in the software by design. We strongly believe in the "test what you fly and fly what you test" philosophy.

One of these tools is a trace/log facility which was originally developed to find a bug in an early version of the vxWorks port (Wind River ported vxWorks to the RS6000 processor for us for this mission). This trace/log facility was built by David Cummings who was one of the software engineers on the task. Lisa Stanley, of Wind River, took this facility and instrumented the pipe services, msgQ services, interrupt handling, select services, and the tExec task. The facility initializes at startup and continues to collect data (in ring buffers) until told to stop. The facility produces a voluminous dump of information when asked.

After the problem occurred on Mars we did run the same set of activities over and over again in the lab. The bc_sched was already coded so as to stop the trace/log collection and dump the data (even though we knew we could not get the dump in flight) for this error. So, when we went into the lab to test it we did not have to change the software.

In less that 18 hours we were able to cause the problem to occur. Once we were able to reproduce the failure the priority inversion problem was obvious.

HOW WAS THE PROBLEM CORRECTED

Once we understood the problem the fix appeared obvious : change the creation flags for the semaphore so as to enable the priority inheritance. The Wind River folks, for many of their services, supply global configuration variables for parameters such as the "options" parameter for the semMCreate used by the select service (although this is not documented and those who do not have vxWorks source code or have not studied the source code might be unaware of this feature). However, the fix is not so obvious for several reasons :

1) The code for this is in the selectLib() and is common for all device creations. When you change this global variable all of the select semaphores created after that point will be created with the new options. There was no easy way in our initialization logic to only modify the semaphore associated with the pipe used for bc_dist task to ASI/MET task communications.

2) If we make this change, and it is applied on a global basis, how will this change the behavior of the rest of the system ?

3) The priority inversion option was deliberately left out by Wind River in the default selectLib() service for optimum performance. How will performance degrade if we turn the priority inversion on ?

4) Was there some intrinsic behavior of the select mechanism itself that would change if the priority inversion was enabled ?

We did end up modifying the global variable to include the priority inversion. This corrected the problem. We asked Wind River to analyze the potential impacts for (3) and (4). They concluded that the performance impact would be minimal and that the behavior of select() would not change so long as there was always only one task waiting for any particular file descriptor. This is true in our system. I believe that the debate at Wind River still continues on whether the priority inversion option should be on as the default. For (1) and (2) the change did alter the characteristics of all of the select semaphores. We concluded, both by analysis and test, that there was no adverse behavior. We tested the system extensively before we changed the software on the spacecraft.

HOW WE CHANGED THE SOFTWARE ON THE SPACECRAFT

No, we did not use the vxWorks shell to change the software (although the shell is usable on the spacecraft). The process of "patching" the software on the spacecraft is a specialized process. It involves sending the differences between what you have onboard and what you want (and have on Earth) to the spacecraft. Custom software on the spacecraft (with a whole bunch of validation) modifies the onboard copy. If you want more info you can send me email.

WHY DIDN’T WE CATCH IT BEFORE LAUNCH ?

The problem would only manifest itself when ASI/MET data was being collected and intermediate tasks were heavily loaded. Our before launch testing was limited to the "best case" high data rates and science activities. The fact that data rates from the surface were higher than anticipated and the amount of science activities proportionally greater served to aggravate the problem. We did not expect nor test the "better than we could have ever imagined" case.

HUMAN NATURE, DEADLINE PRESSURES

We did see the problem before landing but could not get it to repeat when we tried to track it down. It was not forgotten nor was it deemed unimportant. Yes, we were concentrating heavily on the entry and landing software. Yes, we considered this problem lower priority. Yes, we would have liked to have everything perfect before landing. However, I don’t see any problem here other than we ran out of time to get the lower priority issues completed.

We did have one other thing on our side; we knew how robust our system was because that is the way we designed it.

We knew that if this problem occurred we would reset. We built in mechanisms to recover the current activity so that there would be no interruptions in the science data (although this wasn’t used until later in the landed mission). We built in the ability (and tested it) to go through multiple resets while we were going through the Martian atmosphere. We designed the software to recover from radiation induced errors in the memory or the processor. The spacecraft would have even done a 60 day mission on its own, including deploying the rover, if the radio receiver had broken when we landed. There are a large number of safeguards in the system to ensure robust, continued operation in the event of a failure of this type. These safeguards allowed us to designate problems of this nature as lower priority.

We had our priorities right.

ANALYSIS AND LESSONS

Did we (the JPL team) make an error in assuming how the select/pipe mechanism would work ? Yes, probably. But there was no conscious decision to not have the priority inversion enabled. We just missed it. There are several other places in the flight software where similar protection is required for critical data structures and the semaphores do have priority inversion protection. A good lesson when you fly COTS stuff - make sure you know how it works.

Mike is quite correct in saying that we could not have figured this out **ever** if we did not have the tools to give us the insight. We built many of the tools into the software for exactly this type of problem. We always planned to leave them in. In fact, the shell (and the stdout stream) were very useful the entire mission. If you want more detail send me a note.

SETTING THE RECORD STRAIGHT

First, I want to make sure that everyone understands how I feel in regard to Wind River. These folks did a fantastic job for us. They were enthusiastic and supported us when we came to them and asked them to do an affordable port of vxWorks. They delivered the alpha version in 3 months. When we had a problem they put some of the brightest engineers I have ever worked with on the problem. Our communication with them was fantastic. If they had not done such a professional job the Mars Pathfinder mission would not have been the success that it is.

Second, Dave Wilner did talk to me about this problem before he gave his talk. I could not find my notes where I had detailed the description of the problem. So, I winged it and I sure did get it wrong. Sorry Dave.

ACKNOWLEDGMENTS

First, thanks to Mike for writing a very nice description of the talk. I think I have had probably 400 people send me copies. You gave me the push to write the part of the Mars Pathfinder End-of-Mission report that I had been procrastinating doing.

A special thanks to Steve Stolper for helping me do this.

The biggest thanks should go to the software team that I had the privilege of leading and whose expertise allowed us to succeed :

Pam Yoshioka
Dave Cummings
Don Meyer
Karl Schneider
Greg Welz
Rick Achatz
Kim Gostelow
Dave Smyth
Steve Stolper

Also ,

Miguel San Martin
Sam Sirlin
Brian Lazara (WRS)
Mike Deliman (WRS)
Lisa Stanley (WRS)

---------------------

Glenn Reeves
Mars Pathfinder Flight Software Cognizant Engineer
glenn.e.reeves@jpl.nasa.gov


Related Links:

转一个比较ft的bug

还是从http://jcalderone.livejournal.com/看到的, 想到了那天在调那个pie程序的时候, 怎么也找不到为什么不能用C-g然后7选特殊模式, 好只选有附件的帖子, 结果最后才发现原来pie跑得是匿名, 匿名在水母那个C-g然后7什么也没有, 晕死...
 
How Not To Make A Page-Atomic Copy Of A File
Can you find the hidden bug?
import os

def copy(fIn, fOut):
    blockSize = os.statvfs(fIn.name).f_bsize
    while 1:
        bytes = fIn.read(blockSize)
        if not bytes:
            break
        fOut.write(bytes)

well...
[info]micahel
2005-06-17 12:22 pm UTC
fOut could be on a different filesystem. fstavfs would generally seem to make more sense. I don't see how "page-atomic" and "file-system block size" relate, necessariy.

Actually, I don't know what "Page-Atomic Copy Of A File" means, so I'll stop now.

Re: well...
[info]jcalderone
2005-06-17 01:54 pm UTC
Consider it safe to assume that fIn and fOut are on the same local filesystem, are both real, regular files and that statvfs().f_bsize gives the actual filesystem blocksize for the filesystem they reside on.

A "page-atomic" (perhaps there is a better term for this?) file copy is one where no page (a filesystem block worth of bytes) ends up in the resulting file which was not present in the source file.

This could happen if a separate process is writing pages to the input file while the copy function is running, if the copy function does not perform only atomic reads. For example, if the first 4096 bytes of the file are 'X' and the copy function reads 1 byte at a time, if a process writes 'Y' over those first 4096 bytes, the output file may end up containing a series of 'X' followed by a series of 'Y', even though at no point were 'X' and 'Y' mixed in the source file.

Pretty much all modern operating systems guarantee that reads and writes of the filesystem page size will be atomic.

So with all that in mind, there is still a bug in the function in the original post.

Re: well...
[info]micahel
2005-06-17 02:10 pm UTC
The file grows?

Re: well...
[info]jcalderone
2005-06-18 04:51 pm UTC
Yes, exactly :)

If the file is appended to between the read() at which EOF is reached and the next read() which would otherwise have returned '', and the file's size was not a multiple of the filesystem block size, the rest of the file will essentially be corrupt.

Unfortunately for me, it took quite a long time to discover this shortcoming of this otherwise reasonable file copy routine.

Re: well...
[info]micahel
2005-06-19 05:52 pm UTC
Oh, that's actually nastier than I thought.

I wonder if Python should somehow 'latch' the file object so that it stays at EOF once EOF has been reached (unless a seek or something happens, obviously). Might be hard to get right, though.

在听有里知花的歌, 呵呵

从bigjoe大哥那里知道这个名字, 听了那个I cry满好听的.

All about hanoi

昨天帮一个小朋友写数据结构作业, 一个用栈来做非递归hanoi, 算法是小朋友从书上找到的, 只用实现就好了, 然后在我数据结构课本上看到了递归的实现, 下面两个是抄来的.
void move(int count, int start, int finish, int temp) //recursive
{
 if(count > 0){
  move(count - 1, start, temp, finish);
  cout <<"Move disk "<<count<<" from " << start
   <<" to "<<finish<<"."<<endl;
  move(count - 1, temp, finish, start);
 }
}
void move(int count, int start, int finish, int temp) //without tail recursion
{
 int swap;
 while(count > 0){
  move(count - 1, start, temp, finish);
  cout << "Move disk "<<count<<" from "<<start
   <<" to "<<finish<<"."<<endl;
  count--;
  swap = start;
  start = temp;
  temp = swap;
 }
}
然后是我写的栈来做的, 开始的时候做的有些笨, 完全照着算法翻译, 用count到一来判断是否cout, 即中间真正移动的那一步, 然后就没有了每次是移动的哪个盘子的信息. 今天同事帮着想到是怎么回事的, 555, 好zt.
typedef struct
{
 int count;
 bool exchange;
 int start;
 int finish;
 int swap;
} StepType;
void move(int count, int start, int finish, int swap)
{
 stack<StepType> S;
 StepType temp1, temp2;
 temp1.count = count;
 temp1.exchange = false;
 temp1.start = start;
 temp1.finish = finish;
 temp1.swap = swap;
 S.push(temp1);
 while(!S.empty())
 {
  temp2 = S.top();
  S.pop();
  if(temp2.exchange){
   cout << "Move disk "<<temp2.count<<" from "<<temp2.start
   <<" to "<<temp2.finish<<"."<<endl;
  } else if(temp2.count > 0)
  {
   temp1.count = temp2.count - 1;
   temp1.start = temp2.swap;
   temp1.swap = temp2.start;
   temp1.finish = temp2.finish;
   temp1.exchange = false; 
   S.push(temp1);
   temp1.exchange = true;
   temp1.count = temp2.count;
   temp1.start = temp2.start;
   temp1.swap = temp2.swap;
   temp1.finish = temp2.finish;
   S.push(temp1);
   temp1.count = temp2.count - 1;
   temp1.start = temp2.start;
   temp1.swap = temp2.finish;
   temp1.finish = temp2.swap;
   temp1.exchange = false;
   S.push(temp1);
  }
 }
}
然后下午就看到水母programming版讨论非递归(不用栈)来做的方法, 因为昨天就在topcoder上看到大家都是从"Algorithms in C++" by Sedgewick上看到的, 因为我没有这本书, 所以只能从他们转述的来理解,不过还是很有趣的算法.
topcoder上的小朋友如是说
In short - if you have n slices to move in 2^n - 1 moves, you enumerate steps by numbers from 1 to (2^n - 1) inclusive, and enumerate all slices with number from 0 to (n-1), inclusive.
On the step i, you calculate the maximal k, that (i % (2^k)) == 0, and move the slice number k (you can prove that moving that slice will be legal at that moment). I don't remember how do you determine where to move it, but this is obvious for all slices except the smallest one, and for the smallest one there is a simple algo.
One other way of presenting the non-recursive algorithm:
- on odd moves, move the smallest disc one peg to the right (or from the rightmost to the leftmost peg)
- on even moves, make the only move not involving the smallest peg.
水母上是这样说的
n个盘子,都放在A,要经过C放到B
按照从小到大的顺序编号为1、2、……、n
1个盘子
1       把1挪到B
2个盘子
01      把1挪到C
10      把2挪到B
11      把1挪到B                                                          
 
3个盘子
001     把1挪到B
010     把2挪到C
011     把1挪到C
100     把3挪到B
101     把1挪到A
110     把2挪到B
111     把1挪到B
很显然,循环变量从1到2^n - 1循环,每次要挪动的盘子编号对应于循环变量从低到高
第一个为1的比特
或者考虑一颗满二叉树的中序遍历过程,一共有2^n - 1个节点,每步遍历到哪层也是由
这个循环变量的最低的非0比特位确定的

挺有趣的, 什么时候找到书了看一下
贴一下他们给的代码, mathli的c代码, 没看懂, 放着慢慢看
#include <stdio.h>
void hanoi(int n)
{
    int i, j, f, a=1, b = (n%2)?3:2;
    if (n<1 || n>30) return;
    printf("%d from %c to %c\n", 1, 'A'-1+a, 'A'-1+b);
    for (n=(1<<n)-1, i=1; i<n; i+=2)
    {
        for (f=1,j=i; j%2; j/=2,f++);
        if (f%2) a^=b;
        b^=a;
        printf("%d from %c to %c\n", f, 'A'-1+a, 'A'-1+b);
        a^=b;
        if (f%2) b^=a;
        printf("%d from %c to %c\n", 1, 'A'-1+a, 'A'-1+b);
    }
}
int main(int argc, char *argv[])
{
    int n;
    scanf("%d", &n);
    hanoi(n);
    return 0;
}
asterix的python代码, 这个写的还比较好懂
#!/usr/bin/python
""" inorder traversal the complete binary-tree. index is the index of the
coresponed heap of the tree. i, j, k stands for the peg 1, 2, 3.
"""
def left((index, i, j)): #index references the binary heap. 1-based.
    return (2*index, i, 6-i-j)
def right((index, i, j)):
    return (2*index+1, 6-i-j, j)
def parent((index, i, j)):
    k = 6-i-j
    if index%2:       # right child of parent
        return ((index-1)/2, k, j)
    else:             # left
        return (index/2, i, k)
def first(plates, (index, i, j)): #get leftmost node of subtree
    while 2*index <2 **plates:
        (index, i, j) = left((index, i, j))
    return (index, i, j)
def next(plates, (index, i, j)): #next node in in-order traversal
    if 2*index < 2**plates:        # not a leaf
        return first(plates, right((index, i, j)))
    elif index % 2:                # right leaf
        while (index != 1) and (index % 2):  #go up till reaches a left child
            (index, i, j) = parent((index, i, j))
        if index == 1:          # reached root
            return None         # is last node
        else:
            return parent((index, i, j))
    else:                       # left leaf
        return parent((index, i, j))
def iterativeHanoi(plates, i, j):
    node = first(plates, (1, i, j))
    while node:
        print node[1], "->", node[2]
        node = next(plates, node)
 
iterativeHanoi(8, 1, 2)  #test case, I added this
 
------------------------------------------
xiaxia's summary on smth
发信人: xiaxia (心态好才是真的好), 信区: Programming
标  题: Re: the towers of hanoi
发信站: 水木社区 (Wed Jun 28 15:05:52 2006), 站内

狗尾续貂下 整理了下

汉诺塔问题介绍:
在印度,有这么一个古老的传说:在世界中心贝拿勒斯(在印度北部)的圣庙里,一块黄铜板上插着三根宝石针。印度教的主神梵天在创造世界的时候,在其中一根针上从下到上地穿好了由大到小的64片金片,这就是所谓的汉诺塔。不论白天黑夜,总有一个僧侣在按照下面的法则移动这些金片,一次只移动一片,不管在哪根针上,小片必在大片上面。当所有的金片都从梵天穿好的那根针上移到另外一概针上时,世界就将在一声霹雳中消灭,梵塔、庙宇和众生都将同归于尽。

递归算法:

定义 void Hanoi(char src, char des, char via, int n)
表示把n个盘子从src上借助via移动到des上。
显然有
     void Hanoi(char src, char des, char via, int n)
      {
          Hanoi(src, via, des, n - 1);
          Move(src, des, n); //把第n个盘子直接从src移动到des
          Hanoi(via,des, src, n - 1);
      }

根据递归算法,设f(n)为n个盘子要移动的次数。
那么显然 f(n + 1) = 2*f(n) + 1  ->  [f(n + 1) + 1] = 2*[f(n) + 1]
f(1) = 1,-> f(n) + 1 = (1 + 1)^n -> f(n) = 2^n - 1。
f(64)= 2^64-1=18446744073709551615  

非递归算法:
定义从小到大的盘子序号分别为1,2,……n。
可以用一个1到2^n - 1的2进制序列可以模拟出n个盘子的汉诺塔过程中被移动的盘子的序号序列。即给定一个n,我们通过0到2^n - 1序列可以判断出任意一步应该移动那个盘子。

判断方法:第m步移动的盘子序号是m用二进制表示的最低位bit为1的位置。

证明: n = 1,显然成立。
假设n = k 成立。
n = k + 1时,对应序列1到2^(k+1) - 1,显然这个序列关于2^k左右对称。
假设我们要把k + 1个盘子从A移动C。
那么2^k可以对应着Move(k + 1, A, C)。 1 到 2^k - 1 根据假设可以
对应Hanoi(A, B, C, k)。至于2^k + 1 到 2^(k + 1) - 1把最高位的1去掉对应序列
变成1到2^k - 1,显然2^k + 1 到 2^(k + 1) - 1和1到2^k - 1这两个序列中的对应元素的最低位bit为1的位置相同。
因此2^k + 1 到 2^(k + 1) - 1可以对应Hanoi(B, C,A,k)。
所以对n = k + 1也成立。

下面讨论第m步应该移动对应的盘子从那到那?
定义顺序为 A->B->C->A, 逆序为C->B->A->C。

性质对n个盘子的汉诺塔,任意一个盘子k(k <= n)k在整个汉诺塔的移动过程中要么一直顺序的,要么一直逆序的。而且如果k在n个盘子移动过程的顺序和k - 1(如果k > 1)以及k + 1(如果k < n)的顺序是反序。
比如:n = 3
1 A->C
2 A->B
1 C->B
3 A->C
1 B->A
2 B->C
1 A->C
其中1的轨迹A->C->B->A>C逆序,2的轨迹A->B->C顺序,3的轨迹A->C逆序
      
证明:假设n <= k成立
对于n = k + 1 根据递归算法
Hanoi(A,C,B,k + 1) = Hanoi(A, B, C, k) + Move(A, C, k + 1) + Hanoi(B, C,A,k);

整个过程中盘子k + 1只移动一次A->C为逆序对应着2^k。
对于任意盘子m < k + 1,
m盘子的移动由两部分组成一部分是前半部分Hanoi(A, B, C, k)以及后半部分的Hanoi(B, C,A,k)组成。显然有如果m在Hanoi(A, C, B, k)轨迹顺序的话,则m在Hanoi(A, B, C, k)以及Hanoi(B, C,A,k)都是逆序。
反之亦然。这两部分衔接起来就会证明m在Hanoi(A,C,B,k)和Hanoi(A,C,B,k + 1)中是反序的。同时有Hanoi塔中最大的盘子永远是逆序且只移动1步,A->C。
这样的话:m = k + 1,在Hanoi(A,C,B,k + 1)中是逆序。
m = k,由于在Hanoi(A,C,B,k)中是逆序的,所以Hanoi(A,C,B,k + 1)中是顺序的。
m = k - 1,由于在Hanoi(A,C,B,k - 1)是逆序的,所以Hanoi(A,C,B,k)是顺序的,所以Hanoi(A,C,B,k + 1)是逆序的。
依次下去……
结论得证。

总结:在n个汉诺中n, n - 2, n - 4……是逆序移动,n - 1, n - 3,n - 5……是顺序移动。

有了以上结论,非递归的程序就很好写了。写了个递归和非递归比较:

#include <iostream>
using namespace std;

void Hanoi(char src, char des, char via, int n)
{
        if(n == 1)
        {
                cout << n <<" : "<< src <<" --> " <<des << endl;
                return;
        }
        Hanoi(src, via, des, n - 1);
        cout << n <<" : "<< src <<" --> " <<des << endl;
        Hanoi(via, des, src, n - 1);
}


int main()
{
        int n;
        cin >> n;
    cout<<"recusive:"<< endl;
        Hanoi('A','C','B', n);
        cout << endl;
        cout<<"normal:"<<endl;
    char order[2][256];
        char pos[64];
        order[0]['A'] = 'B';
        order[0]['B'] = 'C';
        order[0]['C'] = 'A';
        order[1]['A'] = 'C';
        order[1]['B'] = 'A';
        order[1]['C'] = 'B';
        //0是顺序 1是逆序
        int index[64];
        //确定轨迹的顺序还是逆序
        int i, j, m;

    for(i = n; i > 0; i -= 2)
                 index[i] = 1;
        for(i = n - 1; i > 0; i -= 2)
                 index[i] = 0;
    memset(pos, 'A', sizeof(pos));

        for(i = 1; i < 1 << n; i ++)
           {
                  for(m = 1, j = i; j%2 == 0; j/=2, m ++);                                
                  cout << m <<" : "<< pos[m]  <<" --> " << order[index[m]][pos[m]] << endl;        
                  pos[m] = order[index[m]][pos[m]];
           }
        return 0;
}

 
最后, 有人贴了一个算法导论上的习题, 想了很久想不出, 先放着, 慢慢想
Give a nonrecursive algorithm that performs an inorder tree walk.
(Hint: There is an easy solution that uses a stack as an auxiliary
data structure and a more complicated but elegant solution that uses n
o stack but assumes that two pointers can be tested for equality.)

struct Node
{
    Node* left;
    Node* right;
    T data;
};

void WalkInorder( Node* root )
{
    Node* stack[]; //allocate this however you want.

    Node* iter = stack;
    *iter++ = 0;

    Node* curr = root;

    while( curr )
    {
        while( curr )
        {
            *iter++ = curr;
            curr = curr->left;
        }

        if( *--iter )
        {
            ProcessNode( *iter );

            curr = *iter->right;
        }
    }
};

 

Simple Twisted Application for Fun <strikethrough>and Profit</strikethrough>(zz)

Twisted is really fun, and the people behind it, what can i say, they are crazy( in the nice way, haha:)
Maybe later i will post my little program used to grab the rtsp links from any course of webcast@berkeley, and dump the stream with mplayer. I thought it was pretty handy. 
 
Simple Twisted Application for Fun <strikethrough>and Profit</strikethrough>

I really enjoyed the series Firefly that aired a few years back. I've enjoyed it a few more times since. I'm really looking forward to the upcoming movie. So much so, in fact, that I even wanted to try to get into one of the pre-screenings. There's a website that announces dates and locations for pre-screenings. Being pretty lazy and also forgetful, I can't be bothered to actually check the website frequently enough to notice when new tickets are available before they are all sold out. Fortunately, being a programmer, I can simply bend a minion to my will. I whipped up this quickie to do the checking for me:

import rfc822, md5, sys

from twisted.internet import task, reactor
from twisted.mail import smtp
from twisted.python import log
from twisted.web import client

url = "http://www.cantstopthesignal.com/"

MSG = """\
Subject: The upcoming film, Serenity, is awesome
Date: %(date)s
From: "A Respectable Internet Citizen" <serenitymovie@intarweb.us>
Message-ID: %(msgid)s
Content-Type: text/html

<html>
<body>
Serenity hash changed!  Go look!
<a href="%(url)s">%(url)s</a>
</body>
</html>
"""

def mailinfo():
    return {
        'date': rfc822.formatdate(),
        'msgid': smtp.messageid(),
        'url': url}

lastDig = None

def main():
    def got(page):
        global lastDig
        dig = md5.md5(page).hexdigest()
        if dig != lastDig:
            if lastDig is not None:
                smtp.sendmail(
                    sys.argv[1],  # SMTP host
                    sys.argv[2],  # sender email address
                    sys.argv[3:], # recipient email addresses
                    MSG % mailinfo())
            lastDig = dig
    client.getPage(url).addCallback(got)

log.startLogging(sys.stdout)
task.LoopingCall(main).start(60 * 30)
reactor.run()

The took about 10 minutes to write and even worked on the 2nd try (the first try I forget recipient addresses had to be a list and gave it a string instead). Then it ran for 21 days before sending me an email a few days ago! Alas, I don't live in Great Britain, or even Europe. It has also, of course, sent out an email once each time one of the listed theaters sells out. And now they're all sold out. :( Given the short period of time remaining before the probable release date, it seems unlikely there will be any more pre-screenings

Handy Bookmarklet(zz)

This is not my first time seeing this kind of ff bookmark hacks, first it's some perldoc tricks, I copy the dict one here in case I forget (my memery sucks). Maybe latter if i had time, i will copy those perldoc tricks from Perl Hacks.
 
Handy Bookmarklet

I cannot present this material in the manner which I desire. LiveJournal scrubs links, removing important structure. So, instead of one instruction, here are eight:

  1. If you are a Firefox user, right-click on the area below the location bar but above the row of tab labels.
  2. Select "New Bookmark...".
  3. Name the bookmark "Look It Up".
  4. Set the location of the bookmark to be
    javascript:var s = String(window.getSelection()); 
    if (s.length) { window.open('http://www.dict.
    org/bin/Dict?Form=Dict2&Database=*&Query=' +
    encodeURI(s)); } else { alert('Select a word
    to look up.'); } void(0);
  5. Visit some web page which contains a word the definition of which you would like to know.
  6. Select the word in the document (for example, double click on it).
  7. Click the "Look It Up" bookmarklet you just added.
  8. Savor the novelty (it wears off fast, so savor it likewise).

Maybe this is convenient for something. However, it does demonstrate document range selections, which have some other interesting uses. For example, I would like someone to write a Mantissa application which tracks quotes this way. People should be able to vote for quotes they find on the web. The quote can easily be selected and submitted to the server for tracking. The server can keep running tallies for various time periods and report them. It can also do things like indicate the first reporter of a quote, long-persisting quotes, etc. Communities like comp.lang.python should find this useful, as people are always responding to posts with messages that consist of nothing more than "+1 QOTW".

This would be an extremely trivial Mantissa application. Who would like to write it?

Props to radix for reminding me that dict.org is way better than dictionary.com.

2006/6/26

Limiting Parallelism(zz)

This is something i saw on the planet twisted, and Jp Calderone is one of the twisted guys. I want to "steal" some of his interesting stuffs and put it here.
These materials are all from his blog here http://jcalderone.livejournal.com/.
 
This one particular is about one cool way to limit how much parallelism you run on, much good looking than the one I used to hardcoded.
 
Limiting Parallelism
Concurrency can be a great way to speed things up, but what happens when you have too much concurrency? Overloading a system or a network can be detrimental to performance. Often there is a peak in performance at a particular level of concurrency. Executing a particular number of tasks in parallel will be easier than ever with Twisted 2.4 and Python 2.5:
from twisted.internet import defer, task

def parallel(iterable, count, callable, *args, **named):
    coop = task.Cooperator()
    work = ((yield callable(elem, *args, **named)) for elem in iterable)
    return defer.DeferredList([coop.coiterate(work) for i in xrange(count)])
Here's an example of using this to save the contents of a bunch of URLs which are listed one per line in a text file, downloading at most fifty at a time:
from twisted.python import log
from twisted.internet import reactor
from twisted.web import client

def download((url, fileName)):
    return client.downloadPage(url, file(fileName, 'wb'))

urls = [(url, str(n)) for (n, url) in enumerate(file('urls.txt'))]
finished = parallel(urls, 50, download)
finished.addErrback(log.err)
finished.addCallback(lambda ign: reactor.stop())
reactor.run()

RailsConf2006 Chicago

I think today is day 3, there's some guy taking notes and dairy like about how the RailsConf goes, it's on http://www.oreillynet.com/ruby/blog/. Sounds interesting, but seems they will charge the audio and presentation for $50, wtf.
2006/6/25

Profiling and Optimizing Python

In this article, the author showed us how to use the profiling tools in the standard library with his own project as an example. It really clearly show the basic steps you have to go through when you find yourself saying, this code runs too long, I got to get it run faster. Why, We like fast, that's why we learn algorithm, that's why we like to play nfs:)
 

Profiling and Optimizing Python

by Jeremy Jones
12/15/2005

Most of the time, code that we write doesn't have to perform as fast as if we wrote it in C. Most of the time, the first pass at writing it is "fast enough" and we don't have to optimize--but there are times when a piece of code just has to meet a certain standard of performance. For those "it's gotta run like a scalded dog" moments, the Python profiler may lend some assistance.

The Python standard library provides three libraries for performing code profiling: profile, hotshot, and timeit. The type of profiling performed in these modules involves tracing method and function calls and recording how long they took to run. profile provides detailed profiling information on all functions and methods in a code base, which are called at a specific runtime. hotshot is a replacement for the profile module. timeit provides less detailed profiling information for smaller pieces of code, where "less detailed" comprises overall execution time.

Profiling Code

One of the frequently repeated mantras I hear from Python veterans is, "Premature optimization is the root of all evil" (a quote from Donald Knuth, which apparently originated from Tony Hoare). The Pythonic approach to writing code that performs acceptably is:

  • Write the code.
  • Test it to see whether performance is OK right off the bat.
  • If so, you're done.
  • If not, optionally try Psyco.
  • Profile it and identify the bottlenecks.
  • Modify the code until the performance is as fast as you need, rewriting bottlenecks in C (or Pyrex) if necessary.

The example code I use in these examples comes from ediplex. ediplex is an open source project centered around parsing and handling EDI data. For people who don't know, EDI stands for Electronic Data Interchange and is a generic term used as a general umbrella for the means used to trade electronic business documents between two business partners. I profiled my EDI parser with hotshot by putting this code in the if __name__ == '__main__': block of the main entry point module, aggregate_parser.py:

import hotshot
prof = hotshot.Profile("hotshot_edi_stats")
prof.runcall(main)
prof.close()

calling aggregate_parser.py from the command line and passing the appropriate parameters. Interestingly, running this with hotshot takes about 25 seconds, while running it without a profiler takes a little over 3 seconds. Here is a run on the command line without the profiler, tracking the time with the Unix time utility:

jmjones@qiwi 10:10PM edi % time python \
    ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py \
    edi_1000_interchanges.txt
python ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py
    3.09s user 0.08s system 93% cpu 3.380 total

Here is a run on the command line with the profiler:

jmjones@qiwi 10:54PM edi % time python \
    ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py \
    ~/svn/home/source/edi/edi_1000_interchanges.txt
python ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py
    4.11s user 20.99s system 98% cpu 25.381 total

This run with the hotshot profiler created a log file (hotshot_edi_stats) with profiling information of each called method. Here is an IPython session showing how to get the profiling information:

In [1]: from hotshot import stats In [2]: s = stats.load("hotshot_edi_stats") In [3]: s.sort_stats("time").print_stats() 286016 function calls in 3.359 CPU seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 134000 1.455 0.000 2.104 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:97(body_seg) 1 0.942 0.942 3.358 3.358 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:22(run) 136000 0.658 0.000 0.658 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:30(segment) 1000 0.203 0.000 0.250 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:29(header_seg) 1000 0.034 0.000 0.057 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:135(end_seg) 7000 0.032 0.000 0.032 0.000 /usr/lib/python2.4/string.py:248(strip) 2000 0.009 0.000 0.009 0.000 /usr/lib/python2.4/string.py:281(split) 1001 0.006 0.000 0.006 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:28(searching_header) 1000 0.005 0.000 0.005 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:24(__init__) 1000 0.005 0.000 0.005 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:26(start_interchange) 1000 0.005 0.000 0.005 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:28(end_interchange) 1000 0.005 0.000 0.005 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:62(append_interchange) 1 0.000 0.000 3.359 3.359 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:51(run) 1 0.000 0.000 3.359 3.359 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:74(main) 6 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:14(add_state) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:25(add_transitions) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:10(__init__) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:17(__init__) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:9(__init__) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:19(set_start) 1 0.000 0.000 0.000 0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:40(eof) 0 0.000 0.000 profile:0(profiler) Out[3]: <pstats.Stats instance at 0xb798f52c>

The stats.load() call returns a pstats.Stats object. The only methods that I could find documented in the standard library for this object were print_stats(), sort_stats(), and strip_dirs(). Here, I have used only sort_stats(), which sorts the statistics based on the column specified in the argument list, and print_stats(), which prints out the statistics table displayed above. The strip_dirs() method would have displayed only the module rather than the absolute path to the module. Call me nutty, but I like seeing the absolute path to my modules. With a 17-inch display on my laptop, I don't worry about the extra output width from absolute filenames.

Interpreting Output

The statistical output shows the number of times a method was called (ncalls), the total time spent in the specified method but not in any of its children (tottime), the amount of time spent for each call to the specified method (the first percall), the amount of time spent in a method and calls to its children (cumtime), the amount of time spent for each call to the specified method including child methods (the second percall), and the filename. Given the profiling information, I can see that my three biggest bottlenecks are in x12_parser.py (in the body_seg method), state_machine.py (in the run method), and edi_handler.py (in the segment method).

I sorted the statistics on the tottime column (by passing "time" to the sort_stats() method) to show how well each method performed by itself. The total time spent in the top four methods is about 97 percent of the entire processing time. The top one takes about 43 percent of the entire processing time.

I should now ask the question, Is it fast enough? My answer is, Probably, but faster would be nice. It can parse about 300 2K interchanges per second on my laptop, (about 440 per second if I don't take into consideration Python interpreter startup, library imports, and such), which is actually pretty respectable, but I would like to trim whatever time I can. The profiler has pointed out a couple of glaring architectural deficiencies, so some changes are in order.

I just mentioned the parser's performance without Python interpreter startup and library imports. The timeit module is an excellent tool for helping determine the performance of only the interesting piece of code. I put this code:

import timeit
t = timeit.Timer("main()", "from __main__ import main")
res_set = t.repeat(10, 1)
print res_set, "::", min(res_set)

in my if __name__ == "__main__": block and ran it.

jmjones@qiwi  5:39AM parser % python aggregate_parser.py
    ~/svn/ediplex/example_data/edi_1000_interchanges.txt
[3.1825220584869385, 2.4778821468353271, 2.4290919303894043, 2.4358069896697998,
2.4379229545593262, 2.4492578506469727, 2.439661979675293, 2.4325060844421387,
2.4347169399261475, 2.435114860534668] :: 2.42909193039

repeat(10, 1) runs 10 iterations of the specified code one time. I specified for the code to execute only once per iteration because I handled the looping by parsing a file of 1,000 EDI interchanges. The output of repeat() is a list of runtimes for each repeating call to the specified code. The recommended method of determining performance for a piece of code (by using timeit) is to take the minimum value from the returned list.

Finding and Fixing Bottlenecks

A primer on X12 EDI would probably be helpful at this point. EDI documents are character-delimited text files, which are more machine-readable than human-readable. Each document is made up of one interchange; each interchange is made up of a series of segments that roughly equate to database records; each segment is made up of elements that roughly equate to database columns. The first few characters of each segment (sometimes called the tag), up to the element-separating delimiter, declares what type of segment it is. The first segment in an X12 interchange has an ISA tag. The last segment in an X12 interchange has an IEA tag.

First problem: when ediplex parses an EDI document, it first determines whether it has a valid header segment, then proceeds to look for each individual segment and determines whether that segment is the end segment. This is really inefficient and results in an inordinate number of calls to the body_seg() method as it chunks through the document. A better method would be to read in the body of the EDI document in bigger chunks to search specifically for the end segment rather than just the segment delimiter.

Second problem: aggregate_parser calls a segment() method in the specified EDI handler for each segment it finds, passing it the segment data. aggregate_parser was using the NoOpEDIHandler for this example. The segment() method in this particular handler does nothing. The entirety of this segment() method is pass. It's interesting, but not surprising, that a no-op now takes about 20 percent of the overall processing time; it's not surprising because the parser called segment() 136,000 times. This is an obvious inefficiency, but it is also an architectural problem. If I write a piece of code that uses this parser, which ignores the finer level of detail of an EDI document at the segment level, then handling segment data, even with a no-op, costs a 20 percent performance hit. A better method would be to return a segment iterator such that code will retrieve segments only when the interested code calls the iterator's next() method.

I have rearchitected ediplex to work more iteratively rather than relying on callbacks. I have also reduced the granularity of what the parser is interested in on its first pass. Now when ediplex parses a file of EDI interchanges, it identifies the header segment, then immediately starts looking for the footer segment rather than each body segment. When it finds a full interchange, it yields a parser object, which can, in turn, return an iterator for the interchange's segments.

Reprofiling

The results were surprising. I was hoping for a 50 to 100 percent performance improvement. Here is a time of running the parser with no profiler:

jmjones@qiwi  9:25PM parser % time python aggregate_parser.py \
    ~/svn/ediplex/example_data/big_edi.txt
python aggregate_parser.py ~/svn/ediplex/example_data/big_edi.txt
    0.47s user 0.01s system 99% cpu 0.481 total

Here is the parser run with the hotshot profiler:

jmjones@qiwi  5:43PM parser % time python aggregate_parser.py \
    ~/svn/ediplex/example_data/big_edi.txt
python aggregate_parser.py ~/svn/ediplex/example_data/big_edi.txt
    0.42s user 0.49s system 87% cpu 1.042 total

Without the profiler, this is a performance boost of about 7X. Removing the Python interpreter startup time, library imports, and such from both sides of the equation, this was actually a performance boost of 10X. The timeit module shows a 2.3 second best time for parsing a file before optimization and .22 seconds after optimizations. Here is the statistical data from running the parser:

In [1]: from hotshot import stats

In [2]: s = stats.load("hotshot_edi_stats")

In [3]: s.sort_stats("time").print_stats()
         11005 function calls in 0.436 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.338    0.000    0.374    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:32(is_valid_header)
     7000    0.036    0.000    0.036    0.000 /usr/lib/python2.4/string.py:248(strip)
     1001    0.035    0.000    0.428    0.000 aggregate_parser.py:57(parse_file)
     1000    0.013    0.000    0.013    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:15(__init__)
        1    0.008    0.008    0.436    0.436 aggregate_parser.py:128(main)
     1000    0.006    0.000    0.006    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:64(is_valid_footer)
        1    0.000    0.000    0.000    0.000 aggregate_parser.py:122(get_canned_aggregate_parser)
        1    0.000    0.000    0.000    0.000 aggregate_parser.py:15(__init__)
        1    0.000    0.000    0.000    0.000 aggregate_parser.py:107(add_parser)
        0    0.000             0.000          profile:0(profiler)


Out[3]: <pstats.Stats instance at 0xb798f4cc>

The profiler shows far fewer methods than before optimizing things. The majority of time is now in methods that actually do things rather than no-op or state transitions. All of the features that had no-op placeholders are now gone and will occur only when a user decides he needs to call them.

The profiler also shows far fewer method calls. Before, there were two methods, body_seg() and segment(), with calls over 130,000 times. Function calls were fairly expensive in previous versions of Python--and I believe they still are, but to a lesser degree. This explains why 130,000 function calls to even a no-op created a performance bottle neck.

I heavily optimized the method that searched for the end of an EDI interchange (now called is_valid_footer()) but hardly touched the one that identified the beginning of an interchange (now called is_valid_header()). is_valid_header() is now the biggest bottleneck, taking 10 times more time than the next method in the list, which happens to be the Python standard string.strip() method--and about 77 percent of the total processing time. The only section of is_valid_header() that does anything is a for loop that iterates over range(106) and checks the header string for correctness. If I really wanted to optimize is_valid_header(), I would remove the for loop and either do the string check with a (shudder) regex or write a function in C and swig it. I may implement one of these optimizations just because of that nagging bottleneck (and for fun!)--but the 2,080 2K interchanges per second, which includes Python interpreter startup and library imports, or 4,500 2K interchanges per second, excluding Python startup and imports, is faster than I had hoped to take this parser.

Just for the sake of comparison, here is the timeit run of the optimized code:

jmjones@qiwi  4:33PM parser % python aggregate_parser.py
    ~/svn/ediplex/example_data/edi_1000_interchanges.txt
[0.4770960807800293, 0.47198605537414551, 0.43927288055419922,
0.29295921325683594, 0.21883893013000488, 0.21800088882446289,
0.2185051441192627, 0.21764492988586426, 0.23419904708862305,
0.21747803688049316] :: 0.21747803688

Conclusion

I have attempted to live by the "premature optimization" motto with the activity surrounding this article. I spent very little time up front trying to predict where my code would run slow, and made very few concessions to preemptively make it run faster. I simply used the hotshot profiler, a tool that is part of the Python standard library, found my bottlenecks, rearchitected my parser (which is probably a bit more extreme than what most folks will have to do), made some code changes, and saw a significant performance improvement. For any performance analysis of Python source code, the profiler is an indispensable tool.

Jeremy Jones is a software engineer who works for The Weather Channel. His weapon of choice is Python.

2006/6/24

离别的七月

晚上和几个好友吃散伙饭,他们很快就要离开北京了,我很窝囊的什么离别的话也没说出口,在这里祝福他们一路走好
永远都无法忘记那年的一系列组图,每次看都很伤感
因为离别,因为自己的毫无脸面的离开PKU,因为生活的所有碎片没有丝毫像对的方向发展的趋势,因为...