TL;DR: Install the win32 extensions (the pywin32 package, e.g. in your virtual env). Read some code below (there are folders with stuff, messages have CapitalizedPropertyNames). Profit.

The other day I needed to run some statistics on an Outlook mailbox, in particular to use the number of messages per day to identify peaks of activity. I could get an Excel sheet going with VBA, but it's more fun in Python. So, here it goes (mind you, it's a bit long due to the included code).

Preparation

First, I've defined a class to help me with Outlook objects:

class Oli(object):
    def __init__(self, outlook_object):
        self._obj = outlook_object

    # Yield (index, item) pairs for everything in the object
    # (e.g. the messages in a folder). The index is 1-based.
    def items(self):
        array_size = self._obj.Count
        for item_index in xrange(1,array_size+1):
            yield (item_index, self._obj[item_index-1])

    # Sorted property names, read from the object's _prop_map_get_ map
    def prop(self):
        return sorted(self._obj._prop_map_get_.keys())
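
To see what items() yields without a running Outlook, here is a minimal sketch. FakeCollection is a hypothetical stand-in I made up for illustration (it is not a COM object), and range() is used in place of xrange() so the sketch also runs on Python 3. It only assumes what the helper itself assumes: a Count attribute and zero-based indexing.

```python
class FakeCollection(object):
    """Hypothetical stand-in for an Outlook collection (not a real COM object)."""

    def __init__(self, values):
        self._values = list(values)
        self.Count = len(self._values)

    def __getitem__(self, position):
        # Zero-based access, which is what the helper relies on.
        return self._values[position]


class Oli(object):
    """Same idea as the helper above, reduced to items()."""

    def __init__(self, outlook_object):
        self._obj = outlook_object

    def items(self):
        # Yield (one-based index, item) pairs.
        for item_index in range(1, self._obj.Count + 1):
            yield (item_index, self._obj[item_index - 1])


pairs = list(Oli(FakeCollection(["Inbox", "Sent Items", "Drafts"])).items())
print(pairs)
```

Because the yielded index starts at 1, it is never 0, so a falsy index can safely be used to mean "nothing found".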

Now, like in C# or VB, the Outlook namespace object (MAPI) is a singleton. In Python terms, I'm initialising it once and storing it as a global variable:

import win32com.client

outlook = win32com.client \
    .Dispatch("Outlook.Application") \
    .GetNamespace("MAPI")
EXCLUSION_LIST = [
    'SharePoint Lists - outlook',
    'Project Canendars'
]

As you can see, there's also an exclusion list (I'll use it later to skip folders).

Message Processor

The message processor is quite simple in the sense that it only spits out some message information:

  1. The subject
  2. The sent-on date
  3. The categories

Two notes:

  1. I'm using categories as tags. It's much easier to group messages by subject with categories than to move them into subfolders, particularly since some messages have multiple categories.
  2. No heavy lifting. Aggregation and statistics are left to other packages.
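
As an illustration of the kind of aggregation that is left out, here is a minimal sketch of counting messages per day and picking the busiest days. It assumes the sent-on values are already available as datetime objects; the function names daily_counts and peak_days and the sample timestamps are made up for this example.

```python
from collections import Counter
from datetime import datetime


def daily_counts(sent_dates):
    """Count messages per calendar day."""
    return Counter(moment.date() for moment in sent_dates)


def peak_days(counts, top=3):
    """Return the busiest days, most active first; ties are broken by date."""
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top]


# Made-up timestamps, for illustration only.
sample = [
    datetime(2015, 9, 21, 9, 15),
    datetime(2015, 9, 21, 14, 2),
    datetime(2015, 9, 22, 8, 40),
    datetime(2015, 9, 21, 17, 55),
]
counts = daily_counts(sample)
print(peak_days(counts, top=1))
```

A spreadsheet or a statistics package would do the same job; the point is only that the message processor needs to hand over little more than the date.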

The function takes the folder as a parameter and loops through messages:

def process_messages(folder):
    messages = folder.Items
    message = messages.GetFirst()
    while message:
        # Process a message
        print "%s;%s;%s" % (message.Categories, message.Subject, message.SentOn)
        message = messages.GetNext()

Note: messages.GetFirst() and messages.GetNext() walk through the messages in a folder. The Oli helper class could probably do this too, but I haven't tried it in this code yet.
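
The GetFirst()/GetNext() loop can also be wrapped in a generator, so the processing code becomes a plain for loop. This is a minimal sketch, not something I have run against Outlook: iter_messages is a name made up here, FakeItems is a hypothetical stand-in for folder.Items, and it assumes (as the loop above does) that the end of the collection is signalled by a falsy value, None in this case.

```python
def iter_messages(items):
    """Yield each message from an object exposing GetFirst() and GetNext()."""
    message = items.GetFirst()
    while message is not None:
        yield message
        message = items.GetNext()


class FakeItems(object):
    """Hypothetical stand-in for folder.Items (not a real COM object)."""

    def __init__(self, values):
        self._values = list(values)
        self._position = 0

    def GetFirst(self):
        # Rewind, then hand out the first element (or None if empty).
        self._position = 0
        return self.GetNext()

    def GetNext(self):
        if self._position >= len(self._values):
            return None
        value = self._values[self._position]
        self._position += 1
        return value


subjects = list(iter_messages(FakeItems(["Kick-off", "Minutes", "Follow-up"])))
print(subjects)
```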

Now, all I need is to get the folder.

Getting to the folder

To get to the actual Outlook folder object, I've implemented a simple path-based search engine:

f = search('Large Files/Projects/A Project')

The search function is pretty simple:

def search(path):
    components = path.split('/')

    folder = None
    root = outlook.Folders
    for name in components:
        index, folder = search_item(root, name)
        if not index:
            return None
        root = folder.Folders

    return folder

search_item() only does an exact string comparison on the folder names:

# Returns a tuple (index, outlook_object), or (None, None) if nothing matches
#
def search_item(folders, name):
    for index, folder in Oli(folders).items():
        # Return if we have a match
        if folder.Name == name:
            return index, folder

    # No match: return a (None, None) pair
    return None, None
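
The path walk can be tried without Outlook on a small in-memory tree. In this minimal sketch FakeFolder, find_child and walk_path are hypothetical names made up for illustration; the only things assumed about a folder are a Name attribute and a Folders collection, and plain lists stand in for the COM collections.

```python
class FakeFolder(object):
    """Hypothetical stand-in for an Outlook folder (not a real COM object)."""

    def __init__(self, name, children=()):
        self.Name = name
        self.Folders = list(children)


def find_child(folders, name):
    """Return the first folder whose Name matches exactly, else None."""
    for folder in folders:
        if folder.Name == name:
            return folder
    return None


def walk_path(top_level, path):
    """Follow a '/'-separated path, one folder name per level."""
    folder = None
    current = top_level
    for name in path.split('/'):
        folder = find_child(current, name)
        if folder is None:
            return None
        current = folder.Folders
    return folder


tree = [FakeFolder('Large Files', [
    FakeFolder('Projects', [FakeFolder('A Project')]),
])]
found = walk_path(tree, 'Large Files/Projects/A Project')
print(found.Name)
```

Two consequences of this design are worth keeping in mind: the comparison is case-sensitive, and a folder whose name contains a '/' cannot be addressed with this kind of path.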

Bonus: Pretty print

To make things a bit nicer and help with debugging, I've also created a pretty print function:

def browse(folders, depth=2, recursive=True):
    if not folders:
        return
    for index, folder in Oli(folders).items():
        print " "*depth, u"(%i) [%s] [%s]" % (index, folder.Name, folder)
        if u"%s" % folder in EXCLUSION_LIST:
            continue
        if recursive:
            browse(folder.Folders, depth + 2, recursive)

I used the exclusion list here to avoid descending into those folders (it takes too long).
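
The effect of the exclusion list is easier to see on a small in-memory tree. This is a minimal sketch under a few assumptions: FakeFolder and outline are hypothetical names, the function returns the lines instead of printing them, and it compares folder.Name against the list, whereas the function above compares the string form of the folder object.

```python
class FakeFolder(object):
    """Hypothetical stand-in for an Outlook folder (not a real COM object)."""

    def __init__(self, name, children=()):
        self.Name = name
        self.Folders = list(children)


def outline(folders, excluded, depth=2):
    """Return indented lines; excluded folders are listed but not descended into."""
    lines = []
    for index, folder in enumerate(folders, 1):
        lines.append(" " * depth + "(%i) [%s]" % (index, folder.Name))
        if folder.Name in excluded:
            continue
        lines.extend(outline(folder.Folders, excluded, depth + 2))
    return lines


tree = [
    FakeFolder('Inbox', [FakeFolder('Projects')]),
    FakeFolder('SharePoint Lists - outlook', [FakeFolder('Hidden')]),
]
lines = outline(tree, excluded=['SharePoint Lists - outlook'])
print("\n".join(lines))
```

The excluded folder still shows up in the listing; only its subfolders are skipped.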

The full code

The full code (which can be quite verbose if DEBUG mode is enabled) is:

# -*- coding: utf-8 -*-
"""
Outlook folder email reader

Created on   : 25/09/2015
"""
__author__ = 'ivanlla'

import win32com.client

DEBUG = False
EXCLUSION_LIST = [
    'SharePoint Lists - outlook',
    'Project Canendars'
]

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

class Oli(object):
    def __init__(self, outlook_object):
        self._obj = outlook_object

    def items(self):
        array_size = self._obj.Count
        for item_index in xrange(1,array_size+1):
            yield (item_index, self._obj[item_index-1])

    def prop(self):
        return sorted( self._obj._prop_map_get_.keys() )

def search_item(folders, name):
    if DEBUG: browse(folders, recursive=False)
    for index, folder in Oli(folders).items():
        if folder.Name == name:
            if DEBUG: print " Found %s @ %d" % (folder.Name, index)
            return index, folder
    return None, None

def search(path):
    components = path.split('/')
    if DEBUG: print components
    folder = None
    root = outlook.Folders
    for name in components:
        index, folder = search_item(root, name)
        if not index:
            return None
        root = folder.Folders

    return folder


def browse(folders, depth=2, recursive=True):
    if not folders:
        return
    for index, folder in Oli(folders).items():
        print " "*depth, u"(%i) [%s] [%s]" % (index, folder.Name, folder)
        if u"%s" % folder in EXCLUSION_LIST:
            continue
        if recursive:
            browse(folder.Folders, depth + 2, recursive)

def process_messages(folder):
    if not folder:
        print "Folder could not be found!"
        return
    messages = folder.Items
    message = messages.GetFirst()
    while message:
        # Process a message
        print "%s;%s;%s" % (message.Categories, message.Subject, message.SentOn)
        message = messages.GetNext()

if __name__ == "__main__":
    #list(outlook.Folders)
    f = search('Large Files/Projects/A Project')
    if DEBUG and f: print "Folder name: ", f.Name
    process_messages(f)

HTH,