The NASA Space Apps Challenge

Adam and I were virtual participants in NASA’s Space Apps Challenge, a global hackathon held the weekend of April 21st that encouraged citizen scientists to “solve current challenges relevant to both space exploration and social need”. The event had teams from all 7 continents, including researchers from McMurdo Station, Antarctica, as well as the International Space Station.

Space Apps Challenge map

Space Apps Challenge Hackathon

NASA seeded candidate projects prior to the event in 4 categories: software, open hardware, citizen science, and data visualization. Teams could also submit their own proposals. You can view the full list of candidate projects.

I was pleased to see the NASA Planetary Data System (PDS) catalog listed as a seed project. I had looked into programmatically downloading the images available through PDS 6 months ago and discovered that they are stored server-side in the VICAR format, which was unsupported by any open source image processing utilities I could find (ImageMagick nominally has support, but produces garbage images on conversion to other formats). This hackathon was a perfect opportunity to write a utility to convert VICAR into a more accessible format.

vicar2png

Our submission for this hackathon was vicar2png, a utility written in Python with the pypng library to convert VICAR files to the popular PNG image format.

VICAR (the Video Image Communication And Retrieval format) has been used since 1966 for NASA mission images. Check out N1349602309_2.IMG from the Cassini-Huygens mission for an example of a VICAR file. NASA maintains a format specification that we used to implement our solution.
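To give a feel for what a converter has to do first, here is a minimal sketch of reading a VICAR label. It is not taken from vicar2png. It assumes the layout I understand from the format specification: an ASCII label that begins with LBLSIZE=<bytes> and continues with space-separated KEYWORD=VALUE items, with strings in single quotes. The function name parse_vicar_label and the sample bytes are made up for illustration, and array or real-number values are left as plain strings.

```python
import re


def parse_vicar_label(data: bytes) -> dict:
    """Parse the leading ASCII label of a VICAR image into a dict (simplified)."""
    # Assumption: the file starts with LBLSIZE=<n>, the label length in bytes.
    match = re.match(rb"LBLSIZE=(\d+)", data)
    if match is None:
        raise ValueError("not a VICAR file: missing LBLSIZE")
    size = int(match.group(1))
    text = data[:size].rstrip(b"\x00 ").decode("ascii")

    label = {}
    # Each item is KEY=VALUE; string values are wrapped in single quotes.
    for key, quoted, bare in re.findall(r"(\w+)=(?:'([^']*)'|(\S+))", text):
        if quoted or not bare:
            label[key] = quoted
        else:
            label[key] = int(bare) if bare.lstrip("-").isdigit() else bare
    return label


# Hypothetical 2x3 single-byte image: a 64-byte label followed by 6 pixel bytes.
label_text = b"LBLSIZE=64 FORMAT='BYTE' TYPE='IMAGE' NL=2 NS=3"
sample = label_text.ljust(64, b" ") + bytes(range(6))

label = parse_vicar_label(sample)
pixels = sample[label["LBLSIZE"]:]
print(label["NL"], "lines x", label["NS"], "samples,", len(pixels), "pixel bytes")
```

Once the label is parsed, the pixel data that follows it can be reshaped into rows and handed to a PNG writer such as pypng.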

VICAR file example

VICAR file from the Cassini-Huygens mission

You can view all of the hackathon submissions.

vicar2png makes it easy for anyone to view, enjoy, and remix NASA’s mission image data by converting VICAR files to PNG images. We also hope that making this simple, open source Python implementation available makes it easier for general-purpose open source image processing tools to add VICAR support.

Hackathon deliverables

vicar2png was one of 37 entries nominated for global judging, out of 111 hackathon submissions!

Saturn's moon Phoebe, converted from VICAR data from the Cassini-Huygens mission


Space Apps Challenge Global Judging

For global judging we had to create a short video explaining and demoing our submission.

Global competition deliverables

Global judging was streamed on uStream; you can view the archived video here. They announced 5 winners, plus a “People’s Choice” award:

  1. Most Inspiring: Planet Hopper
    Visualize exoplanets using Kepler data.
  2. Best Use of Data: vicar2png
    View, enjoy, and remix NASA’s mission image data easily by converting VICAR files to the popular PNG image format.
  3. Most Disruptive: Growing Fruits: Pineapple Project
    Determine the optimal crop for your community based on rainfall, latitude, elevation, and pH.
  4. Most Innovative: Strange Desk
    Share strange events in your community with other citizen scientists.
  5. Galactic Impact: Growers Nation
    Use location, climate, and growing data to find appropriate crops for unused land.
  6. “People’s Choice”: Bit Harvester
    Control energy systems in remote areas via SMS.

Best use of Data — awesome!

Winners got a miniature spaceship and an interview with Gov 2.0.

We were satisfied to have scoped a hackathon project that met an actual need we had and that we could finish in a weekend. Thank you, Space Apps Challenge organizers, for giving us this opportunity!

Saturn's Rings, converted from VICAR data from the Cassini-Huygens mission

Saturn's moon Phoebe, converted from VICAR data from the Cassini-Huygens mission

Jupiter, converted from VICAR data from the Cassini-Huygens mission

Press for global winners

Future work

We’d like to use vicar2png as a reference implementation for fixing VICAR support in ImageMagick and adding VICAR support to other open source image processing utilities.

Additionally, NASA uses several other obscure image formats for some of its mission data. In particular, a lot of the Mars Rover images are in the “PDS” image format. As with VICAR, we’d like to write a PDS converter as a step towards increasing the accessibility of the Mars Rover data and to facilitate adding PDS support to other image processing tools.

LD_TRACE_LOADED_OBJECTS, but not really

Zephyr was talking about the ability to dispute CVE assignments, and CVE-2009-5064: “glibc: ldd unexpected code execution issue” came up as an example. MITRE says:

** DISPUTED ** ldd in the GNU C Library (aka glibc or libc6) 2.13 and earlier allows local users to gain privileges via a Trojan horse executable file linked with a modified loader that omits certain LD_TRACE_LOADED_OBJECTS checks. NOTE: the GNU C Library vendor states “This is just nonsense. There are a gazillion other ways to introduce code if people are downloading arbitrary binaries and install them in appropriate directories or set LD_LIBRARY_PATH etc.”

This got me curious about the LD_TRACE_LOADED_OBJECTS environment variable and what the libc authors were disputing. Googling around got me to this 2009 article by Peteris Krumins. The premise is:

  1. ldd is mostly just a wrapper around invoking the dynamic linker on an executable with LD_TRACE_LOADED_OBJECTS set. The GNU dynamic linker (e.g. /lib/ld-linux.so.2 or /lib64/ld-linux-x86-64.so.2) checks for LD_TRACE_LOADED_OBJECTS and if set prints the shared libraries required by the executable instead of running it.
  2. If you can get a sysadmin to run ldd on a malicious executable which was compiled to use a dynamic linker that doesn’t check for LD_TRACE_LOADED_OBJECTS, that malicious executable will run instead of just having its shared library dependencies printed to stdout.
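The first point can be sketched in a few lines of Python. This is only an approximation of what ldd does, not its actual code: the function names are made up, and the default loader path is an assumption that holds on typical x86_64 glibc systems.

```python
import os
import subprocess


def trace_invocation(executable, rtld="/lib64/ld-linux-x86-64.so.2"):
    """Return (argv, env) asking the dynamic linker to list libraries."""
    # Setting LD_TRACE_LOADED_OBJECTS is what switches the GNU dynamic
    # linker from "run the program" to "print its shared libraries".
    env = dict(os.environ, LD_TRACE_LOADED_OBJECTS="1")
    return [rtld, executable], env


def list_shared_libs(executable, rtld="/lib64/ld-linux-x86-64.so.2"):
    """Run the dynamic linker in trace mode and return what it prints."""
    argv, env = trace_invocation(executable, rtld)
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return result.stdout


argv, env = trace_invocation("/bin/ls")
print(" ".join(argv), "with LD_TRACE_LOADED_OBJECTS =", env["LD_TRACE_LOADED_OBJECTS"])
```

Calling list_shared_libs("/bin/ls") on a glibc system should print roughly what ldd /bin/ls prints. The variable only has an effect if whatever ends up acting as the dynamic linker chooses to honor it.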

The blog post jumps through quite a few hoops to use a modified uClibc dynamic linker that doesn’t try to check for LD_TRACE_LOADED_OBJECTS. Comments on the post pointed out that there was an easier way to do this:

1. Create a test executable

$ cat > test_ld.c
#include <stdio.h>
#include <stdlib.h>

int main() {
  if (getenv("LD_TRACE_LOADED_OBJECTS")) {
    printf("LD_TRACE_LOADED_OBJECTS is set, and yet I am executing!\n");
  }
  else {
    printf("LD_TRACE_LOADED_OBJECTS is not set, so I am executing.\n");
  }
  return 0;
}

2. Statically compile the test executable into what will be our dynamic linker

$ gcc -static -o test_ld.so test_ld.c

3. Tell the linker to specify our test executable as the dynamic linker to use for our test executable

$ gcc -Wl,-dynamic-linker,test_ld.so test_ld.c -o test_ld

The -Wl,<option>,<option args> syntax passes <option> and <option args> on to the linker.

The dynamic linker is written into the PT_INTERP program header in the resulting ELF executable. Chapter 2 of the ELF format specification describes dynamic linking in detail.

4. Run our executable which specifies a custom dynamic linker

$ ./test_ld
LD_TRACE_LOADED_OBJECTS is not set, so I am executing.
$ LD_TRACE_LOADED_OBJECTS=1 ./test_ld
LD_TRACE_LOADED_OBJECTS is set, and yet I am executing!

And we see that test_ld appears to execute even when LD_TRACE_LOADED_OBJECTS is set. What is actually going on here is that test_ld.so is getting invoked as the dynamic linker. The main function of test_ld never actually executes, since test_ld.so doesn’t set up the execution environment and jump to the _start of the executable, which is what a dynamic linker is supposed to do.

I don’t like this example, though, because using the test executable as both the dynamic linker and target executable makes it less obvious what is actually executing. It also over-emphasizes the security implications of LD_TRACE_LOADED_OBJECTS: if you are running or probing untrusted executables, you don’t need LD_TRACE_LOADED_OBJECTS to get yourself into trouble.

This is a clearer example:

1. Create and compile a stub dynamic linker

$ cat > stub_ld.c 
int main() {
  return 27;
}
$ gcc -static -o stub_ld.so stub_ld.c

2. Create and compile a test executable that specifies our stub linker as the dynamic linker

$ cat test.c 
#include <stdio.h>

int main()
{
  printf("Hello world\n");
  return 0;
} 
$ gcc -Wl,-dynamic-linker,stub_ld.so test.c -o test

3. Run the test executable

$ ./test
$ echo $?
27

The fact that "Hello world" isn’t printed makes it clear that only stub_ld.so executes.
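To double-check which dynamic linker an executable like test requests, you can read the PT_INTERP program header directly. Here is a minimal sketch that handles only little-endian ELF64 files; the function name read_interp is made up for this example.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(elf: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Program header: p_type, p_flags, p_offset, then p_filesz at +0x20.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", elf, base)
        if p_type == PT_INTERP:
            (p_filesz,) = struct.unpack_from("<Q", elf, base + 0x20)
            return elf[p_offset:p_offset + p_filesz].rstrip(b"\x00").decode()
    return None


# Usage (not run here): read_interp(open("./test", "rb").read())
```

For the test executable built above, this should return stub_ld.so, the string that -Wl,-dynamic-linker wrote into the binary.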

ldd security improvements

The specific ldd trick from Krumins’ 2009 article (social engineering a sysadmin into running ldd on an executable that invokes a custom dynamic linker that ignores LD_TRACE_LOADED_OBJECTS) doesn’t work on newer systems. On older systems (e.g. RHEL 4), ldd will use the dynamic linker specified in the executable. On newer systems (e.g. Ubuntu Oneiric), ldd hardcodes the paths of the GNU dynamic linkers and will only use those.

Here are the relevant snippets from old and new ldd versions for comparison, with extraneous code elided.

Older ldd

RTLDLIST="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2"

try_trace() {
  eval $add_env '"$@"' | cat
}

for rtld in ${RTLDLIST}; do
  if test -x $rtld; then
    verify_out=`${rtld} --verify "$file"`
    ret=$?
    case $ret in
    [02]) RTLD=${rtld}; break;;
    esac
  fi
done
case $ret in
0)
  try_trace "$file"

Note that if verification returns 0, try_trace just evals the executable directly, so the dynamic linker specified in the executable is invoked. With this ldd we get:

$ ldd ./test
$ echo $?
27

Newer ldd

RTLDLIST="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2"

try_trace() {
  eval $add_env '"$@"' | cat
}

for rtld in ${RTLDLIST}; do
  if test -x $rtld; then
    verify_out=`${rtld} --verify "$file"`
    ret=$?
    case $ret in
    [02]) RTLD=${rtld}; break;;
    esac
  fi
done
case $ret in
0|2)
  try_trace "$RTLD" "$file" || result=1

In this case try_trace always uses the GNU dynamic linker for our architecture to run the executable, even if something else was specified in PT_INTERP. With this ldd we get:

$ ldd ./test
	linux-vdso.so.1 =>  (0x00007fff4cbff000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd51eec9000)
	stub_ld.so => /lib64/ld-linux-x86-64.so.2 (0x00007fd51f287000)

which is the same as if we hadn't specified a custom dynamic linker. Note that our stub_ld.so is printed as living at the location of the GNU dynamic linker /lib64/ld-linux-x86-64.so.2.

I'm not sure where the newer version of ldd is coming from, though. dpkg -S `which ldd` says libc-bin, which comes from eglibc, is the package supplying ldd. I checked out the eglibc and glibc source repositories and neither has the updated ldd in its history.

The Indianapolis Python Workshop and Indiana LinuxFest

Attendees practicing at the Indianapolis Python Workshop

I flew out to Indianapolis for the weekend of April 13th to help run the first Indianapolis Python Workshop, which was co-located with Indiana LinuxFest.

This was the second city to use our Boston Python Workshop grant from the Python Software Foundation to help bootstrap Python workshops for women with new user groups in the US. Catherine Devlin and Mel Chua were the locals and main organizers leading the show.

Running the event as part of Indiana LinuxFest presented some challenges, but the attendees were great, including some very sweet father-daughter teams. This part of the country has some strong regional events, including PyOhio and Ohio LinuxFest, and I look forward to seeing more great outreach initiatives from Catherine and Mel that capitalize on this fact.

Catherine has a write-up on the event on her blog, and I took a few photos.

Saturday afternoon instructions at the Indianapolis Python Workshop

Since I was already going to be at Indiana LinuxFest for the workshop, I got the chance to give a rehash of my “The Internet Shouldn’t Work, Networking 101” talk. It is a beginner-level tour through the Internet’s history, governance, protocols, and current events, with telnet, traceroute, and wireshark demos interspersed. Here are my slides from the talk.

Sunday Morning Linux Review, a “weekly podcast in a news format that focuses on Linux and Open source topics”, asked if I could give a short interview for the show while I was in town. Interviewer Tony Bemus and I chatted a bit about the workshop, school, Ksplice, and Linux, starting around 41:15 in episode 27 of the show, my first ever appearance on a podcast.

LD_ASSUME_KERNEL

To quote the ld.so man page,

The LD_ASSUME_KERNEL environment variable overrides the kernel version used by the dynamic linker to determine which library to load.

An ELF executable has its minimum compatible OS ABI version written into its .note.ABI-tag ELF section at compile-time. The dynamic linker (e.g. /lib/ld-linux.so.2) compares the kernel ABI to that version and errors out if the current environment isn’t sufficient.

Ulrich Drepper has a short write-up on the mechanism: http://www.akkadia.org/drepper/assumekernel.html

You can read the ABI version using objdump and interpreting the hex manually, or you can install and use eu-readelf, which will pretty-print the information for you.

Environment note

All work in this post was done on an Ubuntu Natty machine:

$ uname -a
Linux kid-charlemagne 2.6.38-13-generic #53-Ubuntu SMP Mon Nov 28 19:33:45 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Using objdump

$ objdump -s --section=.note.ABI-tag /bin/ls

/bin/ls:     file format elf64-x86-64

Contents of section .note.ABI-tag:
 400254 04000000 10000000 01000000 474e5500  ............GNU.
 400264 00000000 02000000 06000000 0f000000  ................

The Linux Standard Base Specification describes the format of the section:

The first 32-bit word of the desc field must be 0 (this signifies a Linux executable). The second, third, and fourth 32-bit words of the desc field contain the earliest compatible kernel version.

In our case, the last three little-endian words of the desc field are 2, 6, and 0xf (15), so our earliest compatible kernel version is 2.6.15.
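The manual decoding can be double-checked with a short Python sketch that unpacks the 32 bytes from the objdump output. The function name decode_abi_tag is made up for illustration; the layout it assumes (name size, desc size, type, padded name, then four 32-bit desc words) is the standard ELF note layout described above.

```python
import struct

# The 32 bytes of .note.ABI-tag, copied from the objdump output above.
raw = bytes.fromhex(
    "04000000 10000000 01000000 474e5500"
    "00000000 02000000 06000000 0f000000"
)


def decode_abi_tag(note: bytes):
    """Return (owner, os_tag, 'major.minor.patch') from an ABI-tag note."""
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name = note[12:12 + namesz].rstrip(b"\x00").decode("ascii")
    # The name is padded to a 4-byte boundary before the desc field starts.
    desc_start = 12 + (namesz + 3) // 4 * 4
    os_tag, major, minor, patch = struct.unpack_from("<IIII", note, desc_start)
    return name, os_tag, f"{major}.{minor}.{patch}"


print(decode_abi_tag(raw))
```

This prints ('GNU', 0, '2.6.15'): the owner is GNU, the OS word is 0 (Linux), and the minimum kernel version is 2.6.15.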

Using eu-readelf

eu-readelf is part of the elfutils package on Ubuntu.

$ eu-readelf -n /bin/ls

Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x254:
  Owner          Data size  Type
  GNU                   16  VERSION
    OS: Linux, ABI: 2.6.15

Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x274:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: 3e6f3159144281f709c3c5ffd41e376f53b47952

And we see that eu-readelf agrees with our interpretation of the objdump output.

ABI mismatch

What happens if your kernel ABI isn’t sufficient to run an executable?

Since the minimum compatible OS version for this ls we’ve been examining is 2.6.15, let’s set LD_ASSUME_KERNEL to a lower version number and find out:

$ export LD_ASSUME_KERNEL=2.5.14
$ ls
ls: error while loading shared libraries: librt.so.1: cannot open shared object file: No such file or directory

This is not that helpful an error message, since it’s not true that librt.so.1 doesn’t exist:

$ ldd /bin/ls
	linux-vdso.so.1 =>  (0x00007fff653ff000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007fa3f87e7000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa3f85df000)
	libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x00007fa3f83d6000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3f8042000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa3f7e3e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa3f8a2f000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa3f7c1f000)
	libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007fa3f7a1a000)

tells us that it is in /lib/x86_64-linux-gnu/.

We can see why this happens, though, in elf/dl-load.c in the glibc source, where errno is set to ENOENT on version mismatch:

osversion = (abi_note[5] & 0xff) * 65536
	  + (abi_note[6] & 0xff) * 256
	  + (abi_note[7] & 0xff);
if (abi_note[4] != __ABI_TAG_OS
    || (GLRO(dl_osversion) && GLRO(dl_osversion) < osversion))
{
close_and_out:
  __close (fd);
  __set_errno (ENOENT);
  fd = -1;
}

A couple of notes on the various version variables in the above code snippet:

  • __ABI_TAG_OS describes the platform (e.g. Linux, Solaris, FreeBSD) and is derived from data in abi-tags. As mentioned in the LSB spec, Linux has an __ABI_TAG_OS of 0, which is checked against the 5th word in the ABI note (which we verified was 0 in the objdump above).
  • In sysdeps/unix/sysv/linux/dl-sysdep.c, the dynamic linker attempts to set dl_osversion based on the following sources:
    1. a version given in the PT_NOTE ELF section for the “kernel-supplied DSO”. I’m not sure which DSO that glibc comment is specifying.
    2. the uname system call
    3. /proc/sys/kernel/osrelease
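To make the comparison in the glibc snippet concrete, here is a small Python sketch of the same arithmetic. The function names are made up, and it assumes that LD_ASSUME_KERNEL is packed into dl_osversion the same way the note is packed into osversion.

```python
def version_code(version: str) -> int:
    """Pack 'major.minor.patch' the way the glibc snippet packs the ABI note."""
    major, minor, patch = (int(part) & 0xFF for part in version.split("."))
    return major * 65536 + minor * 256 + patch


def abi_ok(assumed_kernel: str, required: str) -> bool:
    """Mirror the version half of the check: fail if the kernel looks too old."""
    dl_osversion = version_code(assumed_kernel)
    osversion = version_code(required)
    return not (dl_osversion and dl_osversion < osversion)


print(hex(version_code("2.6.15")))   # minimum version in the ABI note
print(abi_ok("2.5.14", "2.6.15"))    # the LD_ASSUME_KERNEL value used above
print(abi_ok("2.6.38", "2.6.15"))    # the kernel this machine actually runs
```

With LD_ASSUME_KERNEL=2.5.14 the packed value 0x2050e is smaller than 0x2060f, so the library is rejected and the loader reports ENOENT, which surfaces as the misleading "No such file or directory" message.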

Miscellaneous notes

  • The assembly that writes the ABI version into the .note.ABI-tag ELF section lives in csu/abi-note.S in the glibc source:
            .section ".note.ABI-tag", "a"
            .p2align 2
            .long 1f - 0f           /* name length */
            .long 3f - 2f           /* data length */
            .long  1                /* note type */
    0:      .asciz "GNU"            /* vendor name */
    1:      .p2align 2
    2:      .long __ABI_TAG_OS      /* note data: the ABI tag */
            .long __ABI_TAG_VERSION
    3:      .p2align 2              /* pad out section */
  • A lot of shared libraries don’t have a .note.ABI-tag section. Here’s a small shell script to print out the OS version for those that do in the main shared library directories:
    for dir in /lib /lib64 /lib/x86_64-linux-gnu /lib64/x86_64-linux-gnu /usr/lib /usr/lib64
    do
        # Glob directly instead of parsing ls output; skip directories with no matches.
        for elt in "$dir"/*.so*
        do
            [ -e "$elt" ] || continue
            res=$(eu-readelf -n "$elt" | grep "OS")
            echo "$elt"
            if [ -n "$res" ]; then
                echo "    $res"
            fi
        done
    done

    With one exception, all shared libraries with .note.ABI-tag sections are in /lib/x86_64-linux-gnu and /lib64/x86_64-linux-gnu (although not all shared libraries in those directories have an ABI tag). This is presumably on purpose, although I couldn’t find a document describing when you do or don’t have an ABI tag or why shared libraries in /lib or /usr/lib never have an ABI tag.

    The one exception was /usr/lib/libvdpau_nvidia.so / /usr/lib64/libvdpau_nvidia.so, which has the odd OS minimum version of 2.3.99.

  • I ran this script on RHEL 4 (2.6.9), Precise (3.2.0), and Fedora 16 (3.3.0) machines for comparison. The RHEL 4 shared libraries that have ABI tags all require 2.2.5, Precise required 2.6.24, and Fedora 16 required 2.6.32. I’m not sure what the relationship is between the minimum OS version written into ABI tags as built for a particular environment and the actual kernel ABI.
  • The script caught a couple of .so files that were not actually shared libraries. They all in fact ended up being linker scripts. Here’s a partial list:
    • /usr/lib/libc.so, provided by glibc-devel
    • /usr/lib/libbfd.so, provided by binutils-devel
    • /usr/lib/libcurses.so, provided by ncurses-devel
    • /usr/lib/libfl.so, provided by flex
    • /usr/lib/libpthread.so, provided by glibc-devel
    • /usr/lib/libtermcap.so, provided by ncurses-devel

    You can determine which package provides a file with dpkg -S on dpkg-based systems and rpm -qf on rpm-based systems.
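One quick way to tell a real shared library from a linker script is to look at the first four bytes of the file. Here is a minimal Python sketch; the is_elf helper and the two throwaway files are made up for illustration, and the fake linker script is only a stand-in for files like /usr/lib/libc.so.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file starts with the 4-byte ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


# Demo on two temporary files: one ELF-like blob and one text linker script.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "libreal.so"), "wb") as f:
        f.write(ELF_MAGIC + bytes(60))
    with open(os.path.join(tmp, "libscript.so"), "w") as f:
        f.write("/* GNU ld script */\nGROUP ( /lib/libc.so.6 )\n")
    results = {
        name: is_elf(os.path.join(tmp, name))
        for name in ("libreal.so", "libscript.so")
    }

print(results)
```

Filtering with a check like this before calling eu-readelf would keep linker scripts out of the scan.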

The Architecture of Open Source Applications, Volume II

I was honored to be approached last year by editors Amy Brown and Greg Wilson to write a chapter on Twisted for The Architecture of Open Source Applications Volume II: Structure, Scale, and a Few More Fearless Hacks. This was my first-ever contribution to a book, and a great introduction to the writing, review, and editing cycles for a technical book.

The book was released as a paperback on May 8th and is now available in a number of formats. Other chapters include: Firefox release engineering, GDB, Git, Matplotlib, MediaWiki, Puppet, and PyPy.

As with Volume I, the material is under a Creative Commons Attribution 3.0 Unported license, and all royalties are donated to Amnesty International.

Twisted logo

Ways to enjoy the book:

Thank you Glyph and JP for letting me pick their brains for hours while researching my chapter, and to Glyph and Adam for their reviews of the numerous drafts.

The 6th Boston Python Workshop

The 6th Boston Python Workshop ran the weekend of March 30th at MIT. It marked a full year of diversity outreach with the Boston Python user group and was the second workshop to utilize our grant from the Python Software Foundation Outreach and Education Committee.

Boston Python Workshop 6, Friday night

Additional resources:

Ways to toggle execstack for assembly files

There are a couple of utilities to toggle whether or not the stack is executable, and ways to set this flag at various stages while compiling assembly files into ELF binaries. Below is an investigation of these options, inspired by the following kernel bug:

commit 07c3ae18cac4dc96bb87ddc7bf9ad93999890146
Author: Jiri Olsa 
Date:   Mon Feb 6 18:54:06 2012 -0200

   perf tools: Fix perf stack to non executable on x86_64

   BugLink: http://bugs.launchpad.net/bugs/937915

   commit 7a0153ee15575a4d07b5da8c96b79e0b0fd41a12 upstream.

   By adding following objects:
     bench/mem-memcpy-x86-64-asm.o
   the x86_64 perf binary ended up with executable stack.

   The reason was that above object are assembler sourced and is missing the
   GNU-stack note section. In such case the linker assumes that the final binary
   should not be restricted at all and mark the stack as RWX.

   Adding section ".note.GNU-stack" definition to mentioned object, with all
   flags disabled, thus omiting this object from linker stack flags decision.

1. Toggle noexecstack with the linker

Here’s a bare-bones assembly file:

jesstess$ cat test.S
	.text
.global _start

_start:
	mov	$0x1,%eax
	int	$0x80

Assemble it into an ELF executable by hand, and check if the stack is executable:

jesstess$ as -o test.o test.S
jesstess$ ld -s -o test test.o
jesstess$ execstack test
? test

execstack isn’t sure, because it checks for a GNU_STACK program header, which our program doesn’t have:

jesstess$ readelf -Sl test
There are 3 section headers, starting at offset 0x90:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000400078  00000078
       0000000000000007  0000000000000000  AX       0     0     4
  [ 2] .shstrtab         STRTAB           0000000000000000  0000007f
       0000000000000011  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x400078
There are 1 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000007f 0x000000000000007f  R E    200000

 Section to Segment mapping:
  Segment Sections...
   00     .text

We can ask the linker to add a GNU_STACK program header:

jesstess$ ld -z execstack -s -o test_exec test.o
jesstess$ ld -z noexecstack -s -o test_noexec test.o
jesstess$ readelf -Wl test_noexec | grep GNU_STACK
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x8
jesstess$ readelf -Wl test_exec | grep GNU_STACK
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RWE 0x8
jesstess$ diff <(hexdump test_noexec) <(hexdump test_exec)
8c8
< 0000070 0000 0020 0000 0000 e551 6474 0006 0000
---
> 0000070 0000 0020 0000 0000 e551 6474 0007 0000
jesstess$ execstack test_noexec test_exec
- test_noexec
X test_exec

A single bit, the PF_X flag in the p_flags field of the GNU_STACK program header, dictates if the stack is executable.
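The check can be sketched in Python as a rough imitation of what execstack reports, not its actual implementation. The function names are made up, the fake_elf helper builds header-only ELF64 images purely for the demo, and only little-endian ELF64 is handled.

```python
import struct

PT_GNU_STACK = 0x6474E551  # stored as the bytes "51 e5 74 64" seen in the hexdump
PF_X = 0x1                 # execute permission bit in p_flags


def stack_state(elf: bytes) -> str:
    """Return 'X', '-', or '?' in the style of execstack's query output."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return "X" if p_flags & PF_X else "-"
    return "?"  # no GNU_STACK program header at all


def fake_elf(stack_flags=None) -> bytes:
    """Build a header-only ELF64 image, optionally with a GNU_STACK entry."""
    phdrs = b""
    if stack_flags is not None:
        phdrs = struct.pack("<IIQQQQQQ", PT_GNU_STACK, stack_flags, 0, 0, 0, 0, 0, 8)
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    rest = struct.pack("<HHIQQQIHHHHHH", 2, 62, 1, 0, 64, 0, 0,
                       64, 56, len(phdrs) // 56, 0, 0, 0)
    return ident + rest + phdrs


for flags in (None, 6, 7):
    print(flags, stack_state(fake_elf(flags)))
```

The three cases correspond to test (no header, "?"), test_noexec (flags 6, RW) and test_exec (flags 7, RWE) above.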

2. Toggle noexecstack in the assembly

We can toggle noexecstack directly in the assembly by adding a .note.GNU-stack section manually:

jesstess$ cat >> test.S
.section .note.GNU-stack,"",%progbits
jesstess$ as -o test_note.o test.S
jesstess$ readelf -S test_note.o
There are 8 section headers, starting at offset 0x88:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000007  0000000000000000  AX       0     0     4
  [ 2] .data             PROGBITS         0000000000000000  00000048
       0000000000000000  0000000000000000  WA       0     0     4
  [ 3] .bss              NOBITS           0000000000000000  00000048
       0000000000000000  0000000000000000  WA       0     0     4
  [ 4] .note.GNU-stack   PROGBITS         0000000000000000  00000048
       0000000000000000  0000000000000000           0     0     1
  [ 5] .shstrtab         STRTAB           0000000000000000  00000048
       000000000000003c  0000000000000000           0     0     1
  [ 6] .symtab           SYMTAB           0000000000000000  00000288
       0000000000000090  0000000000000018           7     5     8
  [ 7] .strtab           STRTAB           0000000000000000  00000318
       0000000000000008  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
jesstess$ ld -s -o test_note test_note.o
jesstess$ execstack test_note
- test_note

To specify an executable stack manually, add an "x" in the .section description: .section .note.GNU-stack, "x".

3. Toggle noexecstack from the compiler

To get around passing complicated linker options for this toy example, pass -nostdlib:

jesstess$ gcc -o test_gcc test.s -nostdlib
jesstess$ readelf -Sl test_gcc
There are 6 section headers, starting at offset 0x110:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.gnu.build-i NOTE             00000000004000b0  000000b0
       0000000000000024  0000000000000000   A       0     0     4
  [ 2] .text             PROGBITS         00000000004000d4  000000d4
       0000000000000007  0000000000000000  AX       0     0     4
  [ 3] .shstrtab         STRTAB           0000000000000000  000000db
       0000000000000034  0000000000000000           0     0     1
  [ 4] .symtab           SYMTAB           0000000000000000  00000290
       00000000000000a8  0000000000000018           5     3     8
  [ 5] .strtab           STRTAB           0000000000000000  00000338
       0000000000000020  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x4000d4
There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000000db 0x00000000000000db  R E    200000
  NOTE           0x00000000000000b0 0x00000000004000b0 0x00000000004000b0
                 0x0000000000000024 0x0000000000000024  R      4

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .text 
   01     .note.gnu.build-id

As we can see above, gcc doesn’t try to set a non-executable stack by default for assembly files, but we can pass a flag to tell gcc what to do:

jesstess$ execstack -q test_gcc
? test_gcc
jesstess$ gcc -z execstack -o test_gcc_exec test.s -nostdlib
jesstess$ gcc -z noexecstack -o test_gcc_noexec test.s -nostdlib
jesstess$ execstack -q test_gcc_exec test_gcc_noexec 
X test_gcc_exec
- test_gcc_noexec

The situation with C source code is a bit different: by default, gcc tacks a .note.GNU-stack section onto the end of the assembly it generates for C files.

We can verify this by stopping gcc’s compilation of a bare-bones C file before running the assembler:

jesstess$ cat test.c
int main()
{
  return 0;
}
jesstess$ gcc -o test.i -S test.c
jesstess$ cat test.i
	.file	"test.c"
	.text
.globl main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	movq	%rsp, %rbp
	.cfi_offset 6, -16
	.cfi_def_cfa_register 6
	movl	$0, %eax
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
	.section	.note.GNU-stack,"",@progbits

4. Toggle noexecstack on an executable

We can use execstack to toggle the behavior of an already-compiled executable:

jesstess$ gcc -o ctest test.c
jesstess$ execstack -q ctest
- ctest
jesstess$ execstack -s ctest; execstack -q ctest
X ctest
jesstess$ execstack -c ctest; execstack -q ctest
- ctest

5. When trampolines make an executable stack necessary

A gcc extension allows nested functions, which require an executable stack under certain conditions. To quote this blog post: “if you pass a local function (a) as a parameter to another function (b) from an outer calling function (c), then gcc makes that local function a trampoline that’s resolved at runtime, because AFAICS the function is on the stack.”

We can verify this with a short example:

jesstess$ cat trampoline.c 
int main() {
    int a = 1;
  
    int nested() {
        return a;
    }
    int (*fptr)() = nested;
  
    return fptr();
}
jesstess$ gcc -o trampoline trampoline.c
jesstess$ execstack -q trampoline
X trampoline
jesstess$ ./trampoline 
jesstess$ execstack -c trampoline
jesstess$ ./trampoline 
Segmentation fault

6. Finding binaries with an executable stack

scanelf, part of the pax-utils package, makes short work of identifying binaries that want an executable stack:

jesstess$ scanelf -lpeq
RWX --- ---  /usr/lib32/libSDL-1.2.so.0.11.3
RWX --- ---  /lib/klibc-EBLO2mlo7LXngcucphVUH-0CbT0.so
RWX --- ---  /usr/bin/grub-editenv
RWX --- ---  /usr/bin/grub-mkpasswd-pbkdf2
RWX --- ---  /usr/bin/grub-mklayout
RWX --- ---  /usr/bin/grub-menulst2cfg
RWX --- ---  /usr/bin/grub-mount
RWX --- ---  /usr/bin/grub-fstest
RWX --- ---  /usr/bin/grub-mkfont
RWX --- ---  /usr/bin/grub-mkrelpath
RWX --- ---  /usr/bin/grub-mkimage
RWX --- ---  /usr/bin/grub-script-check
RWX --- ---  /usr/sbin/grub-probe
RWX --- ---  /usr/sbin/grub-mkdevicemap
RWX --- ---  /usr/sbin/grub

The flags to scanelf say to:

  • -l: scan all directories listed in /etc/ld.so.conf
  • -p: scan all directories in your $PATH
  • -e: print GNU_STACK information
  • -q: only print data for binaries with ‘bad’ attributes

Let’s pick one reported binary to confirm:

jesstess$ execstack -q /usr/sbin/grub
X /usr/sbin/grub
jesstess$ readelf -l /usr/sbin/grub | grep GNU_STACK
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4
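
If scanelf isn't available, a rough approximation of this search fits in a few lines of Python. This is only a sketch under stated assumptions: it handles little-endian ELF32 and ELF64 files, reports only binaries that explicitly carry a PT_GNU_STACK header with the execute bit set (it ignores binaries with no marking at all), and `wants_exec_stack` and `find_exec_stack` are hypothetical names. It makes no attempt to match scanelf's output format.

```python
import os
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack marking
PF_X = 0x1                 # "executable" bit in p_flags


def wants_exec_stack(data):
    """True if a little-endian ELF image has PT_GNU_STACK with the X bit set."""
    if data[:4] != b"\x7fELF" or len(data) < 6 or data[5] != 1:
        return False
    try:
        if data[4] == 2:    # ELF64: e_phoff at 0x20, p_flags 4 bytes into each entry
            (phoff,) = struct.unpack_from("<Q", data, 0x20)
            entsize, num = struct.unpack_from("<HH", data, 0x36)
            flags_at = 4
        elif data[4] == 1:  # ELF32: e_phoff at 0x1C, p_flags 24 bytes into each entry
            (phoff,) = struct.unpack_from("<I", data, 0x1C)
            entsize, num = struct.unpack_from("<HH", data, 0x2A)
            flags_at = 24
        else:
            return False
        for i in range(num):
            base = phoff + i * entsize
            (p_type,) = struct.unpack_from("<I", data, base)
            (p_flags,) = struct.unpack_from("<I", data, base + flags_at)
            if p_type == PT_GNU_STACK:
                return bool(p_flags & PF_X)
    except struct.error:
        pass  # truncated or malformed headers
    return False


def find_exec_stack(root):
    """Yield paths under root whose ELF headers request an executable stack."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # unreadable file; skip it
            if wants_exec_stack(data):
                yield path
```

For example, `for path in find_exec_stack("/usr/sbin"): print(path)` walks one directory tree. Reading each file whole keeps the sketch short; a real tool would read only the headers.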

PyCon 2012 poster: getting and retaining new contributors to open source projects

My poster at the PyCon 2012 poster session was on getting and retaining contributors to open source projects.

A short video of me summarizing the poster is at http://pyvideo.org/video/692/2-twisted-matrix-high-scores.

The full-sized 4′x6′ poster pdf is here.

The poster strives to start a dialog in 3 areas of open source community management:

  1. Providing a welcoming environment with clear contribution guidelines and opportunities for new contributors.
  2. Identifying where in the ticket lifecycle a project bottlenecks and loses potential contributors, and how to incentivize community members to work on those bottlenecks.
  3. Resources for beginning open source contributors.

I use the Twisted Matrix High scores list as an example of one strategy to incentivize community members to work on ticket bottlenecks.

I was happy to get a lot of traffic in the poster hall. Many people shared their community stories and checked out OpenHatch and Twisted as a result (one Twisted sprinter even said he came to our sprint because of the poster!).

I was right next to Brian Curtin, who had posters on the PSF Sprints and Outreach and Education committees. He funneled people to me to talk about the Boston Python Workshop grant.

For more on the poster session, see the call for posters, list of posters, and the full PyCon 2012 video list.

PyCon 2012: 5K

I ran my first 5K at PyCon 2012. All proceeds from the event went to the charities Autism Speaks, the American Cancer Society, and the Epilepsy Foundation.

An impressive 148 participants completed the 7am run. Jacob Kaplan-Moss and the other 5K organizers did a great job of keeping the event upbeat and encouraging newcomers. I hope to see more programming communities supporting and encouraging fitness through events like this.

Read more at the event page and see more photos on Flickr (thanks and credit to volunteer photographer Michael McHugh).

PyCon 2012 talk: Diversity in Practice

I gave a talk with Asheesh Laroia at PyCon 2012 called Diversity in practice: How the Boston Python user group grew to 1700 people and over 15% women.

The video can be viewed online at http://pyvideo.org/video/719/diversity-in-practice-how-the-boston-python-user. Thank you to the PyCon organizers for orchestrating the lightning-fast turnaround time on subtitling and publishing the talk videos.

The slides are available here.

The talk was very well-received, with a great Q&A and many follow-up contacts from folks interested in running outreach events in their communities. Praise from Glyph, a long-time supporter who is leaving Boston soon for San Francisco, was particularly touching. We benefited tremendously from our practice run with the Boston Python user group.

PyCon 2012 talk

Abstract:

How do you bring more women into programming communities with long-term, measurable results? In this talk we’ll analyze our successful effort, the Boston Python Workshop, which brought over 200 women into Boston’s Python community this year. We’ll talk about lessons learned running the workshop, the dramatic effect it has had on the local user group, and how to run a workshop in your city.