Reliable file updates with Python

Programs need to update files. Although most programmers know that unexpected things can happen while performing I/O, I often see code that has been written in a surprisingly naïve way. In this article, I would like to share some insights on how to improve I/O reliability in Python code.

Consider the following Python snippet. Some operation is performed on data coming from and going back into a file:

with open(filename) as f:
   input = f.read()
output = do_something(input)
with open(filename, 'w') as f:
   f.write(output)

Pretty simple? Probably not as simple as it looks at first glance. I often debug applications that show strange behaviour on production servers. Here are examples of failure modes I have seen:

  • A runaway server process spills out huge amounts of logs and the disk fills up. write() raises an exception right after truncating the file, leaving the file empty.
  • Several instances of our application happen to run in parallel. After they have finished, the file contents are garbage because they intermingle output from multiple instances.
  • The application triggers some follow-up action after completing the write. Seconds later, the power goes off. After we have restarted the server, we see the old file contents again. The data already passed to other applications does not correspond to what we see in the file anymore.

Nothing of what follows is really new. My goal is to present common approaches and techniques to Python developers who are less experienced in system programming. I will provide code examples to make it easy for developers to incorporate these approaches into their own code.

What does “reliability” mean anyway?

In the broadest sense, reliability means that an operation is performing its required function under all stated conditions. With regard to file updates, the function in question is to create, replace or extend the contents of a file. It might be rewarding to seek inspiration from database theory here. The ACID properties of the classic transaction model will serve as guidelines to improve reliability.

To get started, let’s see how the initial example can be rated against the four ACID properties:

  • Atomicity requires that a transaction either succeeds or fails completely. In the example shown above, a full disk will likely result in a partially written file. Additionally, if other programs read the file while it is being written, they get a half-finished version even in the absence of write errors.
  • Consistency denotes that updates must bring the system from one valid state to another. Consistency can be subdivided into internal and external consistency: Internal consistency means that the file’s data structures are consistent. External consistency means that the file’s contents are aligned with other data related to it. In this example, it is hard to reason about consistency since we don’t know enough about the application. But since consistency requires atomicity, we can say at least that internal consistency is not guaranteed.
  • Isolation is violated if running transactions concurrently yields different results from running the same transactions sequentially. It is clear that the code above has no protection against lost updates or other isolation failures.
  • Durability means that changes need to be permanent. Before we signal success to the user, we must be sure that our data hits non-volatile storage and not just a write cache. Perhaps the code above has been written with the assumption in mind that disk I/O takes place immediately when we call write(). This assumption is not warranted by POSIX semantics.

Use a database system if you can

If we could attain all four ACID properties, we would have come a long way towards increased reliability. But this requires significant coding effort. Why reinvent the wheel? Most database systems already have ACID transactions.

Reliable data storage is a solved problem. If you need reliable storage, use a database. Chances are high that you will not do it as well on your own as those who have been working on it for years, if not decades. If you do not want to set up a “big” database server, you can use SQLite, for example. It has ACID transactions, it’s small, it’s free, and Python’s standard library includes it as the sqlite3 module.
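
As a rough illustration of how little code this takes, here is a minimal sketch that keeps a counter in an SQLite file. The table layout and the function name increment_counter are assumptions made up for this example; only the sqlite3 calls themselves come from the standard library.

```python
import sqlite3


def increment_counter(db_path, name):
    """Increment a named counter inside one transaction; return the new value."""
    conn = sqlite3.connect(db_path)
    try:
        # "with conn" commits on success and rolls back if an exception
        # escapes the block, so the update is all-or-nothing.
        with conn:
            conn.execute(
                'CREATE TABLE IF NOT EXISTS counters '
                '(name TEXT PRIMARY KEY, value INTEGER NOT NULL)')
            conn.execute(
                'INSERT OR IGNORE INTO counters (name, value) VALUES (?, 0)',
                (name,))
            conn.execute(
                'UPDATE counters SET value = value + 1 WHERE name = ?',
                (name,))
            row = conn.execute(
                'SELECT value FROM counters WHERE name = ?',
                (name,)).fetchone()
        return row[0]
    finally:
        conn.close()
```

Calling increment_counter('state.db', 'hits') repeatedly yields 1, 2, 3 and so on, and a crash in the middle leaves the previous value intact.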

The article could finish here. But there are valid reasons not to use a database. They are often tied to file format or file location constraints, neither of which is easily controlled with database systems. Reasons include:

  • we must process files generated by other applications, which are in a fixed format or at a fixed location
  • we must write files for consumption by other applications (and the same restrictions apply)
  • our files must be human-readable or human-editable

…and so on. You get the point.

If we set out to implement reliable file updates on our own, there are some programming techniques to consider. In the following, I will present four common patterns of performing file updates. After that, I will discuss what steps can be taken to establish ACID properties with each file update pattern.

File update patterns

Files can be updated in a multitude of ways, but I see at least four common patterns. These will serve as a basis for the rest of this article.

Truncate-Write

This is probably the most basic pattern. In the following example, hypothetical domain model code reads data, performs some computation, and re-opens the existing file in write mode:

with open(filename, 'r') as f:
   model.read(f)
model.process()
with open(filename, 'w') as f:
   model.write(f)

A variant of this pattern opens the file in read-write mode (the “plus” modes in Python), seeks to the start, issues an explicit truncate() call and rewrites the contents:

with open(filename, 'a+') as f:
   f.seek(0)
   model.input(f.read())
   model.compute()
   f.seek(0)
   f.truncate()
   f.write(model.output())

An advantage of this variant is that we open the file only once and keep it open all the time. This simplifies locking, for example.

Write-Replace

Another widely used pattern is to write new contents into a temporary file and replace the original file after that:

with tempfile.NamedTemporaryFile(
      'w', dir=os.path.dirname(filename), delete=False) as tf:
   tf.write(model.output())
   tempname = tf.name
os.rename(tempname, filename)

This method is more robust against errors than the truncate-write method. See below for a discussion of atomicity and consistency properties. It is used by many applications.

These first two patterns are so common that the ext4 filesystem in the Linux kernel even detects them and fixes some reliability shortcomings automatically. But don’t depend on it: you are not always using ext4, and the administrator might have disabled this feature.

Append

The third pattern is to append new data to an existing file:

with open(filename, 'a') as f:
   f.write(model.output())

This pattern is used for writing log files and other cumulative data processing tasks. Technically, its outstanding feature is its extreme simplicity. An interesting extension is to perform append-only updates during regular operation and to reorganize the file into a more compact form periodically.

Spooldir

Here we treat a directory as a logical data store and create a new uniquely named file for each record:

with open(unique_filename(), 'w') as f:
   f.write(model.output())

This pattern shares its cumulative nature with the append pattern. A big advantage is that we can put a small amount of metadata into the file name. This can be used, for example, to convey information about the processing status. A particularly clever implementation of the spooldir pattern is the maildir format. Maildirs use a naming scheme with additional subdirectories to perform update operations in a reliable and lock-free way. The md and gocept.filestore libraries provide convenient wrappers for maildir operations.

If your file name generation is not guaranteed to give unique results, you can even require that the file be newly created. Use the low-level os.open() call with the proper flags:

fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
with os.fdopen(fd, 'w') as f:
   f.write(...)

After opening the file with O_EXCL, we use os.fdopen to convert the raw file descriptor into a regular Python file object.
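
Building on this, a spooldir writer can simply retry with a fresh name whenever the name is already taken. The following is a minimal sketch; the helper name create_unique_file, the use of uuid4 for name generation, and the '.rec' suffix are assumptions chosen for illustration.

```python
import os
import uuid


def create_unique_file(dirname, suffix='.rec'):
    """Create a brand-new file in dirname; return (path, file object)."""
    while True:
        path = os.path.join(dirname, uuid.uuid4().hex + suffix)
        try:
            # O_EXCL makes the call fail if the name already exists.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
        except FileExistsError:
            continue  # name collision: try another name
        return path, os.fdopen(fd, 'w')
```

The caller is responsible for closing the returned file object.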

Applying ACID properties to file updates

In the following, I will try to enhance the file update patterns. Let’s see what we can do to meet each ACID property in turn. I will keep this as simple as possible, since we are not planning to write a complete database system. Please note that the material presented in this section is not exhaustive, but it may give you a good starting point for your own experimentation.

Atomicity

The write-replace pattern gives you atomicity for free since the underlying os.rename() function is atomic on POSIX systems, provided that the temporary file and the target live on the same filesystem. This means that at any given point in time, any process sees either the old or the new file. This pattern has a natural robustness against write errors: if the write operation triggers an exception, the rename operation is never performed and thus we are in no danger of overwriting a good old file with a damaged new one.
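
To make the error path concrete, here is a minimal sketch of a write-replace helper that also removes the leftover temporary file when writing fails. The name replace_file and the '.tmp-' prefix are assumptions for this example. It uses os.replace(), which behaves like os.rename() on POSIX but also overwrites an existing target on Windows.

```python
import os
import tempfile


def replace_file(filename, text):
    """Write text to a temporary file, then move it over filename."""
    dirname = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the target so that the final
    # rename stays within one filesystem.
    fd, tempname = tempfile.mkstemp(dir=dirname, prefix='.tmp-')
    try:
        with os.fdopen(fd, 'w') as tf:
            tf.write(text)
        os.replace(tempname, filename)
    except BaseException:
        # The old file is untouched; only clean up our temporary file.
        try:
            os.unlink(tempname)
        except OSError:
            pass
        raise
```

Note that mkstemp() creates the file with restrictive permissions, so the replaced file does not inherit the mode of the old one unless you copy it explicitly.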

The append pattern is not atomic by itself, because we risk appending incomplete records. But there is a trick to make updates appear atomic: Annotate each written record with a checksum. When reading the log later on, discard all records that do not have a valid checksum. This way, only complete records will be processed. In the following example, an application makes periodic measurements and appends a one-line JSON record each time to a log. We compute a CRC32 checksum of the record’s byte representation and append it to the same line:

import json
import random
import time
import zlib

with open(logfile, 'ab') as f:
    for i in range(3):
        measure = {'timestamp': time.time(), 'value': random.random()}
        record = json.dumps(measure).encode()
        # zero-pad to 8 hex digits so the checksum never contains spaces
        checksum = '{:08x}'.format(zlib.crc32(record)).encode()
        f.write(record + b' ' + checksum + b'\n')

This example code simulates the measurements by creating three random values in quick succession.

$ cat log
{"timestamp": 1373396987.258189, "value": 0.9360123151217828} 9495b87a
{"timestamp": 1373396987.25825, "value": 0.40429005476999424} 149afc22
{"timestamp": 1373396987.258291, "value": 0.232021160265939} d229d937

To process the log file, we read one record per line, split off the checksum, and compare it to the read record:

with open(logfile, 'rb') as f:
    for line in f:
        # split the trailing checksum off the record
        record, checksum = line.strip().rsplit(b' ', 1)
        # must use the same zero-padded format as the writer
        if checksum.decode() == '{:08x}'.format(zlib.crc32(record)):
            print('read measure: {}'.format(json.loads(record.decode())))
        else:
            print('checksum error for record {}'.format(record))

Now we simulate a truncated write by chopping the last line:

$ cat log
{"timestamp": 1373396987.258189, "value": 0.9360123151217828} 9495b87a
{"timestamp": 1373396987.25825, "value": 0.40429005476999424} 149afc22
{"timestamp": 1373396987.258291, "value": 0.23202

When the log is read, the last incomplete line is rejected:

$ read_checksummed_log.py log
read measure: {'timestamp': 1373396987.258189, 'value': 0.9360123151217828}
read measure: {'timestamp': 1373396987.25825, 'value': 0.40429005476999424}
checksum error for record b'{"timestamp": 1373396987.258291, "value":'

The checksummed log record approach is used by a large number of applications including many database systems.
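
For reuse, the writer and reader logic can be wrapped into two small functions. The following is a slightly more defensive sketch, not a definitive implementation: the names encode_record and decode_record are made up for this example, the checksum is zero-padded to eight hex digits, and a line without any separator is treated as damaged instead of raising an exception.

```python
import json
import zlib


def encode_record(obj):
    """Serialize obj as one log line: JSON, space, CRC32 in hex, newline."""
    record = json.dumps(obj).encode()
    checksum = '{:08x}'.format(zlib.crc32(record)).encode()
    return record + b' ' + checksum + b'\n'


def decode_record(line):
    """Return the decoded object, or None if the line is damaged."""
    record, sep, checksum = line.rstrip(b'\n').rpartition(b' ')
    if not sep:
        return None  # no separator at all: certainly incomplete
    if checksum != '{:08x}'.format(zlib.crc32(record)).encode():
        return None  # truncated or corrupted record
    return json.loads(record.decode())
```

A reader would then call decode_record() on every line and skip the ones that come back as None.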

Individual files in the spooldir can likewise carry a checksum. Another, probably easier, approach is to borrow from the write-replace pattern: first write the file aside and move it to its final location afterwards. Devise a naming scheme that protects work-in-progress files from being processed by consumers. In the following example, all file names ending with .tmp are ignored by readers and are thus safe to use during write operations:

newfile = generate_id()
with open(newfile + '.tmp', 'w') as f:
   f.write(model.output())
os.rename(newfile + '.tmp', newfile)
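
The consumer side of this naming scheme could look roughly like the following sketch. The function name pending_records is an assumption for illustration; the only convention taken from the text is that names ending in .tmp are skipped.

```python
import os


def pending_records(spooldir):
    """Return paths of completed files in spooldir, sorted by name."""
    names = sorted(
        name for name in os.listdir(spooldir)
        if not name.endswith('.tmp'))  # skip work-in-progress files
    return [os.path.join(spooldir, name) for name in names]
```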

Finally, truncate-write is non-atomic. I am sorry that I cannot offer you an atomic variant. Right after the truncate operation, the file is empty and no new content has been written yet. If a concurrent program reads the file now or, worse yet, an exception occurs and our program gets aborted, we see neither the old nor the new version.

Consistency

Most things I have said about atomicity can be applied to consistency as well. In fact, atomic updates are a prerequisite for internal consistency. External consistency means keeping several files in sync. As this cannot easily be done, lock files can be used to ensure that read and write access do not interfere. Consider a directory where files need to be consistent with each other. A common pattern is to designate a lock file, which controls access for the whole directory.

Example writer code:

with open(os.path.join(dirname, '.lock'), 'a+') as lockfile:
   fcntl.flock(lockfile, fcntl.LOCK_EX)
   model.update(dirname)

Example reader code:

with open(os.path.join(dirname, '.lock'), 'a+') as lockfile:
   fcntl.flock(lockfile, fcntl.LOCK_SH)
   model.readall(dirname)

This method only works if we have control over all readers. Since there may be only one writer active at a time (the exclusive lock is blocking all shared locks), the scalability of this method is limited.

To take it one step further, we can apply the write-replace pattern to whole directories. This involves creating a new directory for each update generation and changing a symlink once the update is complete. For example, a mirroring application maintains a directory of tarballs together with an index file, which lists file name, file size, and a checksum. When the upstream mirror gets updated, it is not enough to implement an atomic file update for every tarball and the index file in isolation. Instead, we need to flip both the tarballs and the index file at the same time to avoid checksum mismatches. To solve this problem, we maintain a subdirectory for each generation and symlink the active generation:

mirror
|-- 483
|   |-- a.tgz
|   |-- b.tgz
|   `-- index.json
|-- 484
|   |-- a.tgz
|   |-- b.tgz
|   |-- c.tgz
|   `-- index.json
`-- current -> 483

Here, the new generation 484 is in the process of being updated. When all tarballs are present and the index file is up to date, we can switch the current symlink atomically. Note that os.symlink() refuses to overwrite an existing link, so the usual approach is to create the new symlink under a temporary name and os.rename() it over current. Other applications always see either the complete old or the complete new generation. Readers must os.chdir() into the current directory and refer to files without their full path names. Otherwise, there is a race condition when a reader first opens current/index.json and then opens current/a.tgz, but in the meantime the symlink target has been changed.
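
Switching the symlink can be sketched as follows. Since os.symlink() cannot overwrite an existing link, this sketch creates the new link under a temporary name and renames it over the old one, which is atomic on POSIX systems. The function name switch_generation and the '.new' suffix are assumptions for this example.

```python
import os


def switch_generation(mirror_dir, generation, link_name='current'):
    """Point mirror_dir/link_name at the given generation directory."""
    link = os.path.join(mirror_dir, link_name)
    temp_link = link + '.new'
    # Remove a leftover temporary link from an interrupted earlier run.
    try:
        os.unlink(temp_link)
    except FileNotFoundError:
        pass
    # The target is relative, so the mirror directory can be moved around.
    os.symlink(generation, temp_link)
    # rename() replaces the old symlink itself, not the directory it
    # points to, so readers see either the old or the new target.
    os.rename(temp_link, link)
```

With the directory layout shown above, switch_generation('mirror', '484') would activate the new generation.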

Isolation

Isolation means that concurrent updates to the same file are serializable: there exists a serial schedule that gives the same results as the parallel schedule actually performed. “Real” database systems use advanced techniques like MVCC to maintain serializability while allowing for a great degree of parallelism. On our own, we had better use locks to serialize file updates.

Locking truncate-write updates is easy. Just acquire an exclusive lock prior to all file operations. The following example code reads an integer from a file, increments it, and updates the file:

def update():
   with open(filename, 'r+') as f:
      fcntl.flock(f, fcntl.LOCK_EX)
      n = int(f.read())
      n += 1
      f.seek(0)
      f.truncate()
      f.write('{}\n'.format(n))

Locking updates using the write-replace pattern can be tricky. Using a lock the same way as in truncate-write can lead to update conflicts. A naïve implementation could look like this:

def update():
   with open(filename) as f:
      fcntl.flock(f, fcntl.LOCK_EX)
      n = int(f.read())
      n += 1
      with tempfile.NamedTemporaryFile(
            'w', dir=os.path.dirname(filename), delete=False) as tf:
         tf.write('{}\n'.format(n))
         tempname = tf.name
      os.rename(tempname, filename)

What is wrong with this code? Imagine two processes compete to update a file. The first process just goes ahead, but the second process is blocked in the fcntl.flock() call. When the first process replaces the file and releases the lock, the already open file descriptor in the second process now points to a “ghost” file (not reachable by any path name) with old contents. To avoid this conflict, we must check that our open file is still the same after returning from fcntl.flock(). So I have written a new LockedOpen context manager to replace the built-in open context. It ensures that we actually open the right file:

import fcntl
import os
import tempfile


class LockedOpen(object):
    """Open a file and lock it exclusively, re-opening it if it was replaced."""

    def __init__(self, filename, *args, **kwargs):
        self.filename = filename
        self.open_args = args
        self.open_kwargs = kwargs
        self.fileobj = None

    def __enter__(self):
        f = open(self.filename, *self.open_args, **self.open_kwargs)
        while True:
            fcntl.flock(f, fcntl.LOCK_EX)
            # Re-open by name: if the file was replaced while we waited
            # for the lock, we are holding a lock on a stale file.
            fnew = open(self.filename, *self.open_args, **self.open_kwargs)
            if os.path.sameopenfile(f.fileno(), fnew.fileno()):
                fnew.close()
                break
            else:
                f.close()
                f = fnew
        self.fileobj = f
        return f

    def __exit__(self, _exc_type, _exc_value, _traceback):
        # closing the file also releases the lock
        self.fileobj.close()


def update():
    with LockedOpen(filename, 'r+') as f:
        n = int(f.read())
        n += 1
        with tempfile.NamedTemporaryFile(
                'w', dir=os.path.dirname(filename), delete=False) as tf:
            tf.write('{}\n'.format(n))
            tempname = tf.name
        os.rename(tempname, filename)

Locking append updates is as easy as locking truncate-write updates: acquire an exclusive lock, append, done. Long-running processes, which leave a file permanently open, may need to release locks between updates to let others in.
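
A locked append might be sketched like this. The function name append_record is an assumption for illustration. The one detail worth noting is that the buffer is flushed before the lock is released explicitly, so that no data is written after another process has already taken the lock.

```python
import fcntl


def append_record(filename, data):
    """Append bytes to filename while holding an exclusive lock."""
    with open(filename, 'ab') as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(data)
            # Hand the data to the kernel while we still hold the lock.
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

A long-running process that keeps the file open would use the same lock, write, flush, unlock sequence around each update.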

The spooldir pattern has the elegant property that it does not require any locking. Again, it depends on using a clever naming scheme and a robust unique file name generation. The maildir specification is a good example of a spooldir design. It can be easily adapted to other cases, which have nothing to do with mail.

Durability

Durability is a bit special because it depends not only on the application, but also on OS and hardware configuration. In theory, we can assume that os.fsync() or os.fdatasync() calls do not return until data has reached permanent storage. In practice, we may run into several problems: we may be facing incomplete fsync implementations or awkward disk controller configurations, which never give any persistence guarantee. A talk from a MySQL dev goes into great detail about what can go wrong. Some database systems like PostgreSQL even offer a choice of persistence mechanisms so that the administrator can select the best suited one at runtime. The poor man’s option, though, is to just use os.fsync() and hope that it has been implemented correctly.

With the truncate-write pattern, we have to issue an fsync after finishing write operations but before closing the file. Note that there is usually another level of write caching involved. Python’s file objects keep their own buffer, which holds back writes inside the process even before they are passed to the kernel. To empty this buffer as well, we have to flush() it before fsync’ing:

with open(filename, 'w') as f:
   model.write(f)
   f.flush()
   os.fdatasync(f)

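Alternatively, you can open the file in binary mode with buffering=0 to bypass the buffer for that file. Note that Python’s -u flag only affects the standard streams, not files opened with open().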

I prefer os.fdatasync() over os.fsync() most of the time to avoid synchronous updates of metadata that is not needed to read the data back (ownership, mtime, …). Metadata updates can result in seeky disk I/O, which slows things down quite a bit.

Applying the same trick to write-replace style updates is only half of the story. We make sure that the newly written file has been pushed to non-volatile storage before replacing the old file, but what about the replace operation itself? We have no guarantee that the directory update is performed right away. There are lengthy discussions on the net about how to sync a directory update, but in our case (old and new file are in the same directory) we can get away with this rather simple solution:

os.rename(tempname, filename)
dirfd = os.open(os.path.dirname(filename), os.O_DIRECTORY)
os.fsync(dirfd)
os.close(dirfd)

We open the directory with the low-level os.open() call (Python’s built-in open() does not support opening directories) and perform an os.fsync() on the directory’s file descriptor.
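
Putting the pieces together, a durable write-replace update might look like the following sketch. The function name durable_replace is an assumption for this example, and the code relies on os.O_DIRECTORY, which is available on Linux but not on every platform.

```python
import os
import tempfile


def durable_replace(filename, text):
    """Replace filename with text and sync both file and directory."""
    dirname = os.path.dirname(os.path.abspath(filename))
    with tempfile.NamedTemporaryFile(
            'w', dir=dirname, delete=False) as tf:
        tf.write(text)
        tf.flush()              # empty Python's own buffer
        os.fsync(tf.fileno())   # push the file data to stable storage
        tempname = tf.name
    os.rename(tempname, filename)
    # Sync the directory so that the rename itself is persisted.
    dirfd = os.open(dirname, os.O_DIRECTORY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

Whether the data is really safe after this call still depends on the OS and hardware honouring fsync, as discussed above.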

Persisting append updates is again quite similar to what I have said about truncate-write.

The spooldir pattern has the same directory sync problems as the write-replace pattern. Fortunately, the same solution applies here as well: first sync the file, then sync the directory.

Conclusion

It is possible to update files reliably. I have shown that all four ACID properties can be met. The code examples presented above may serve as a toolbox. Pick the programming techniques that match your needs best. At times, you don’t need all four ACID properties but only one or two. I hope that this article helps you to make an informed decision about what to implement and what to leave out.


#monitoringlove sprint takeaway

A few weeks ago I co-organised and participated in a #monitoringlove sprint in Berlin.

My personal plan was to play with more modern utilities that can potentially replace our existing Nagios monitoring chain. The result of what I think would be a good setup would probably look like this:

[Diagram: monitoringlove2]

Most of those parts already exist. The new thing in there is what I called “riemann-actual” – something that generates new events based on existing events from the index. I call this “higher order” monitoring – in Nagios these would be known as “business processes”.

The term “business processes” is a bit misleading as nothing is really about processes there: it means taking previously gathered monitoring data and condensing it into a denser expression. Ideally you can recombine any of your metrics to make an overall statement of “everything is good if more than 80% of the appservers are up and we have less than 5% of error response rate and the frontpage is reachable from at least 3 outside systems”.
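
Expressed as code, such a higher-order check is just a predicate over already collected metrics. The following plain-Python sketch uses the thresholds from the sentence above; the function and parameter names are made up for illustration and are not part of riemann-actual.

```python
def overall_ok(appservers_up, appservers_total, error_rate, probes_ok):
    """Condense three metrics into one overall good/bad statement.

    error_rate is a fraction between 0 and 1; probes_ok is the number
    of outside systems that can reach the frontpage.
    """
    return (
        appservers_up > 0.8 * appservers_total  # more than 80% up
        and error_rate < 0.05                   # less than 5% errors
        and probes_ok >= 3)                     # at least 3 probes succeed
```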

Data gathering

First, I tried to set up something for data gathering. I had already been advised to look at scales for in-app metrics and found it easy to get started. I like the notion that metrics in your app behave a little bit like logging: you don’t care where they go and you expect the user of your system to configure an actual target. The built-in webserver is nice to get started and graphite as a protocol seems fair enough nowadays to forward data.

To gather system-level metrics I guess both collectd and statsd are fine points to start from. I used collectd to begin with as it actually had a riemann output plugin.

Central processing

We want to be able to take all of the data we acquire into account when making decisions quickly. Riemann seems to be the most suitable tool for this task. After playing around for a while trying to implement “business process” monitoring in Clojure, I found it easier to provide a Python environment that can talk to Riemann and make those decisions. I made this available as “riemann-actual” on Bitbucket.

I noticed that this setup would require only a very generic riemann configuration and could perform on a per-customer or per-project basis by just adding more of those loops on top of riemann.

Performance-wise I was extremely happy. I could have a 10Hz monitoring loop resulting in about 1k events per second on my computer. With that resolution all business processes would notice an outage with no visible delay.

A nice feature is that Riemann can generate events when old events reach their TTL. This way you can make sure that you notice when a system you are monitoring “goes dark”.
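
The idea behind TTL expiry can be illustrated in a few lines of plain Python. This is not Riemann’s API, only a sketch of the concept; the function name expired_services and the shape of the last_seen mapping are assumptions for this example.

```python
import time


def expired_services(last_seen, now=None):
    """Return the names of services whose latest event outlived its TTL.

    last_seen maps a service name to a (timestamp, ttl_seconds) tuple.
    """
    if now is None:
        now = time.time()
    return sorted(
        name for name, (timestamp, ttl) in last_seen.items()
        if now - timestamp > ttl)
```

A monitoring loop would raise an alert for every name returned, which covers the case of a host that silently stops reporting.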

Also, it seems that Riemann configuration can be unit-tested easily: feed events in, watch the index, or see events coming out. It doesn’t get much simpler than that.

The configurable dashboard in Riemann 0.2 is already very helpful: responsive, flexible, and fast, until you try to display 10k metrics at once. ;) It needs a little more polish but it’s well on its way.

Distributed consumers

My understanding of Riemann is that it wants to be a nexus for “central, volatile, shared state”. This means you get a lot of updates going through and that it needs to be good with I/O. OTOH it means that it shouldn’t do much and just make it easy for you to route your data somewhere else.

Actually looking at further consumers didn’t happen as 3 days aren’t that long. :) I see graphite on the horizon (with the setup becoming easier over time) as well as more custom tooling to turn events into notifications, etc.

A look at OpenTSDB seemed promising at first but it turns out to have an even more complex setup requirement than graphite. I got it running but it seemed extremely hard to control, so I dropped it after a few hours.

Overall it seems that since the outcry of #monitoringsucks a lot has happened and I’m confident that there’s a way out of Nagiosland in the near future.

More notes from our sprint are available at pysprints.de. (Although in German.)

 


Summer party at gocept

13: what a number! Misunderstood so often, held responsible for luck both good and bad, and surrounded by prejudice and conspiracy theories.

Anyhow, gocept is turning 13. For us, that is the best reason to celebrate together with our families, friends and business partners. Let’s have a summer party in our wonderful garden. Save the date in your calendar:

gocept summer party
Saturday, August 17th 2013, from 4pm
Forsterstr. 29, DE 06112 Halle (Saale)

Is the party alone not exciting enough to come over? You can also join the Pyramid Sprint taking place at our office from Thursday, August 15th.

We want to be sure that more than 13 guests will join our party, so please send us a short message if you can save the date: mail@gocept.com

The invitation is available as a PDF here.


August, 15th–17th: Sprinting on Pyramid

With Zope “-the-Framework” having reached the end of its lifecycle over the last few years, we have done a number of new projects with Pyramid, a nice web framework primarily authored by long-term Zope developer Chris McDonough.

We think it’s about time to give something back to the community and become more involved in Pyramid development. We are therefore happy to announce that we will host a large Pyramid sprint organised in cooperation with the pysprints.de team. You can find more information and sprint topics at GitHub.

The sprint starts on Thu, August 15th 10:00h CEST, and ends on Sat, August 17th with a garden party in the evening! Expect BBQ, beer and (most likely) live music!

If you would like to attend, please sign up on lanyrd.


developer & admin BBQ IV

Our fourth BBQ (invitation post) had the most participants so far: almost 20 people were here to talk shop, exchange ideas and brave the unfortunately slightly rainy weather (the grilled goods were delicious regardless). We’re especially glad that the ratio of gocept people to guests was only about 50% this time, and we’re hoping it will go down further. :-)

The sessions in the Open Space were about diverse subjects, ranging from “Deploying lots of Raspberry Pis” through “Gamification in a business context” to “Why is there no slim and simple CMS yet?”. In several sessions we didn’t find a satisfactory solution to the problem, but sometimes sharing your frustrations with others who have similar experiences is helpful in itself.

Since the session about code katas was very well liked, we’re thinking about maybe doing a Code Retreat instead of a classic Open Space for the next BBQ, so stay tuned.


Running tests using gocept.selenium on Travis-CI

Travis-CI is a free hosted continuous integration platform for the open source community. It integrates well with GitHub, so each push to a project runs the tests of the project.

gocept.selenium is a Python package our company has developed as a test-friendly Python API for Selenium, which allows running tests in a browser.

Travis-CI uses YAML files to configure the test run. I found little documentation on how to run Selenium tests on Travis-CI, but it is straightforward. I took the following YAML file from a personal project of mine (simplified a bit for this blog post):

language: python
python:
  - 2.6
before_install:
  - "export DISPLAY=:99.0"
  - "sh -e /etc/init.d/xvfb start"
  - "wget http://selenium.googlecode.com/files/selenium-server-standalone-2.31.0.jar"
  - "java -jar selenium-server-standalone-2.31.0.jar &"
  - "export GOCEPT_SELENIUM_BROWSER='*firefox'"
install:
  - python bootstrap.py
  - bin/buildout
script:
  - bin/test

Explanation:

  • Lines 1 – 4: My project is a Python project which currently only runs on Python 2.6. But other Python versions will work as well.
  • Lines 5, 6: Firefox needs a running X server, so we start it first as it takes some seconds to launch. See Travis-CI documentation, too.
  • Lines 7, 8: The Selenium server does not seem to be installed by default, so get it and launch it.
  • Line 9: Tell gocept.selenium to use Firefox to run the tests. (Note: To use the new Webdriver-API in the upcoming version 2 of gocept.selenium you have to set other environment variables.)
  • Lines 10 – 14: Install the project and run the tests as usual. (The example uses zc.buildout to do this.)

Note: Although I use the Firefox which is installed by default on the Travis-CI machine, I did not yet find out which version it is.


PyCon 2013 report

PyCon 2013 was an excellent conference bringing together Python’s vast, diverse, and technically excellent community. I had the opportunity to visit the whole conference including the sprint days.

Magnitude

The size of the community seems well reflected by the number of attendees that PyCon US attracts: the limit of 2,500 attendees was reached on 2013-02-02, about 1 month prior to the conference. This should be about 500 attendees more than in 2012 when they exceeded their planned capacity of 1,500 ending up with 2,000 IIRC.

It was very nice to see that the organization is growing along with the task: everything ran very smoothly, with a lot of detailed changes over last year, some for better, some for worse. (Remember: if you want to improve, you need to change, and that means you need to accept setbacks to learn.)

Diversity

Yes, there was this FUBAR situation regarding a “code of conduct” violation. I think too many people who have not been at PyCon have contributed to the turmoil already so I’ll refrain from commenting.

I was happy to hear that PyLadies (and everybody else working on the diversity of the community) could see their efforts showing excellent results: around 20% of all participants were women (or girls). On the first day of the conference I had the impression that more women were around than is usual at tech conferences.

But not only that, we also had:

  • a wide range of ages: from kids, to students, to far more senior people
  • very business-like people and very relaxed, alternative people (Plone RV, anyone?)
  • visitors from all over the world

I had to ponder a bit why this actually makes me happy: the diversity shows me that what we do is important to everyone and does not need to be either obscure and geeky or shirt-and-tie business.

We can have a community where you can be geeky and nerdy, do business, and feel like a human being. How great is that? Conferences always tend to be very intense environments, somewhat “from outer space”. Combined with travelling overseas for almost two weeks, having a human environment just makes it so worthwhile and a bit more sustainable.

To everybody who did not have an absolutely great experience personally: I empathize, and I hope the next PyCon will be better for you. A lot has been said about the code of conduct, and the organizers definitely pay a lot of attention to it. Nevertheless: 2,500 people stuffed into a few rooms for almost a week will cause friction here and there. If I should encounter a similar situation myself, I will hopefully be able to apply some of this experience as a bystander: stay calm, be friendly, and help defuse the situation.

Technical excellence

It is hard to get more technically sophisticated people talking about programming into one spot than at PyCon. Maybe DEFCON, or USENIX, or other, more orthogonally oriented venues come close. But for practicality, this is it.

I recommend you visit pyvideo.org and go through the recorded videos of all sessions. It’s always a good idea to listen to what Raymond Hettinger has to say. And Guido, of course.

Sprinting

I felt very productive during the sprints: I started out sitting in a room with Nate Aune, Jeff Forcier, and some others, talking about deployment things. I worked a bit on our deployment utility batou trying to soften some rough edges and gather feedback from others.

However, I also had the PyPI mirror client software on my radar. As we are operating one of the official mirrors (the F mirror), I was fed up with the constant breakage that the existing pep381client experienced everywhere. I sat down, refactored, and lo and behold! a bandersnatch appeared. This is a full rewrite that can be used with the existing mirror data and is much more reliable and, in the case of an error, easier to debug and recover from.

Sponsoring

gocept also was a silver sponsor for PyCon. We already sponsored PyCon in 2012, but this year we:

  • did not insert more stuff into the attendee bag (it’s way too heavy already anyways)
  • did set up a booth to become approachable and get to talk to people

Our product (the Flying Circus) is in use for consulting clients but still on its way to becoming a product that you can just use by registering and providing payment details. Operations as a service is a very dynamic space today, and we had some good opportunities to try to explain what we envision and where we think existing IaaS and PaaS models are aiming at the wrong thing. If you’re interested in this kind of thing, then visit our homepage and sign up for our newsletter and we’ll keep you updated.

PyCon has been a very sponsor-friendly place, especially for small businesses. It’s always a hassle to bring a lot of stuff halfway around the globe, but the environment was perfect to just bring a banner and some flyers and talk to people strolling around.

2014

So next year, PyCon US will actually be PyCon North America, as the conference moves to Montreal. Besides this making for a much shorter trip, I’m also looking forward to some new cultural impressions.
