TL;DR: Install win32 extensions (e.g. using this to put it in your virtual env). Read some code below (there are folders with stuff, messages have CapitalizedPropertyNames). Profit.
The other day I needed to run some statistics on an Outlook mailbox. In particular, I wanted to use the number of messages per day to identify peaks of activity. I could get an Excel sheet going with VBA, but it's more fun in Python. So, here it goes (mind you, it's a bit long due to the included code).
Preparation
First, I've defined a class to help me with Outlook objects:
class Oli(object):
    def __init__(self, outlook_object):
        self._obj = outlook_object

    # Get all items in the object (e.g. messages in a folder)
    # as (1-based index, item) pairs
    def items(self):
        array_size = self._obj.Count
        for item_index in range(1, array_size + 1):
            yield (item_index, self._obj[item_index - 1])

    # List the property names exposed by the wrapped object
    def prop(self):
        return sorted(self._obj._prop_map_get_.keys())
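To see what items() yields without Outlook running, here is a minimal sketch that feeds the same loop a stand-in collection. FakeCollection and iter_items are hypothetical names made up for this illustration; the only assumption is that the collection exposes Count and the zero-based indexing the helper relies on.

```python
class FakeCollection(object):
    """Hypothetical stand-in for an Outlook collection (not a real COM object)."""
    def __init__(self, values):
        self._values = list(values)

    @property
    def Count(self):
        return len(self._values)

    def __getitem__(self, position):
        return self._values[position]


def iter_items(collection):
    # Same logic as Oli.items(): a 1-based index paired with each item.
    for item_index in range(1, collection.Count + 1):
        yield (item_index, collection[item_index - 1])


pairs = list(iter_items(FakeCollection(["Inbox", "Sent Items", "Drafts"])))
print(pairs)
```

Starting the index at 1 matters further down: a found item never has index 0, so a falsy index can safely stand for "not found".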
Now, like in C# or VB, the Outlook object (MAPI) is a singleton. In Python terms, I'm initialising it once and storing it as a global variable:
import win32com.client

outlook = win32com.client \
    .Dispatch("Outlook.Application") \
    .GetNamespace("MAPI")

EXCLUSION_LIST = [
    'SharePoint Lists - outlook',
    'Project Canendars'
]
As you may have noticed, there's also an exclusion list (I'll use it later to skip folders).
Message Processor
The message processor is quite simple in the sense that it only spits out some message information:
- The subject
- The sent on date
- The categories
Two notes:
- I'm using categories as tags. Grouping messages by subject is much easier with categories than by moving them into subfolders, particularly since some messages have multiple categories.
- No heavy lifting. Aggregation and statistics are left to other packages.
The function takes the folder as a parameter and loops through messages:
def process_messages(folder):
    messages = folder.Items
    message = messages.GetFirst()
    while message:
        # Process a message: one semicolon-separated line per message
        print("%s;%s;%s" % (message.Categories, message.Subject, message.SentOn))
        message = messages.GetNext()
Note: The messages in a folder are retrieved with messages.GetFirst() and messages.GetNext(). I could probably use the helper class instead, but I haven't tried it yet in this code.
Now, all I need is to get the folder.
Getting to the folder
To get to the actual Outlook folder object, I've implemented a simple path-based search engine:
f = search('Large Files/Projects/A Project')
The search function is pretty simple:
def search(path):
    components = path.split('/')
    folder = None
    root = outlook.Folders
    for name in components:
        index, folder = search_item(root, name)
        # Indices start at 1, so a falsy index means "no match"
        if not index:
            return None
        root = folder.Folders
    return folder
The search_item() function only performs an exact string match on the folder names:
# It returns a tuple (index, outlook_object)
#
def search_item(folders, name):
    for index, folder in Oli(folders).items():
        # Return if we have a match
        if folder.Name == name:
            return index, folder
    # No match: return a pair of Nones
    return None, None
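The path walk can be tried out without Outlook. Below is a minimal sketch of the same idea on a stand-in folder tree; FakeFolder and find_by_path are hypothetical names for this illustration, and plain Python lists play the role of the Folders collections.

```python
class FakeFolder(object):
    """Hypothetical stand-in for an Outlook folder: a Name plus child Folders."""
    def __init__(self, name, children=()):
        self.Name = name
        self.Folders = list(children)


def find_by_path(top_level, path):
    # Walk the path one component at a time, narrowing to the matching child.
    folder = None
    candidates = top_level
    for name in path.split('/'):
        matches = [f for f in candidates if f.Name == name]
        if not matches:
            return None  # a component is missing: give up
        folder = matches[0]
        candidates = folder.Folders
    return folder


tree = [
    FakeFolder('Large Files', [
        FakeFolder('Projects', [FakeFolder('A Project')]),
    ]),
    FakeFolder('Inbox'),
]

found = find_by_path(tree, 'Large Files/Projects/A Project')
missing = find_by_path(tree, 'Large Files/Archive')
print(found.Name if found else None)
print(missing)
```

As in search(), a missing component anywhere along the path makes the whole lookup return None, so the caller only has one failure case to check.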
Bonus: Pretty print
To make things a bit nicer and help with debugging, I've also created a pretty print function:
def browse(folders, depth=2, recursive=True):
    if not folders:
        return
    for index, folder in Oli(folders).items():
        print(u"%s (%i) [%s] [%s]" % (" " * depth, index, folder.Name, folder))
        # Excluded folders are listed but not descended into
        if u"%s" % folder in EXCLUSION_LIST:
            continue
        if recursive:
            browse(folder.Folders, depth + 2, recursive)
I used the exclusion list here to avoid descending into those folders (too time-consuming).
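For what the exclusion does to the listing, here is a small sketch on stand-in folders. It is a variant, not the function above: outline() returns the lines instead of printing them, and FakeFolder, outline and EXCLUDED are hypothetical names with sample values chosen for the illustration.

```python
EXCLUDED = ('SharePoint Lists - outlook',)  # sample value for this sketch


class FakeFolder(object):
    """Hypothetical stand-in for an Outlook folder."""
    def __init__(self, name, children=()):
        self.Name = name
        self.Folders = list(children)


def outline(folders, depth=2, excluded=EXCLUDED):
    """Return the indented listing as a list of lines instead of printing it."""
    lines = []
    for index, folder in enumerate(folders, 1):
        lines.append("%s(%i) [%s]" % (" " * depth, index, folder.Name))
        if folder.Name in excluded:
            continue  # listed, but its subfolders are not visited
        lines.extend(outline(folder.Folders, depth + 2, excluded))
    return lines


tree = [
    FakeFolder('Inbox', [FakeFolder('Receipts')]),
    FakeFolder('SharePoint Lists - outlook', [FakeFolder('Hidden')]),
]
listing = outline(tree)
print("\n".join(listing))
```

The excluded folder still shows up in the listing, which keeps the indices consistent with what search_item() sees; only its children are skipped.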
The full code
The full code (which can be quite verbose if DEBUG mode is enabled) is:
# -*- coding: utf-8 -*-
"""
Outlook folder email reader
Created on : 25/09/2015
"""
__author__ = 'ivanlla'

import win32com.client

DEBUG = False
EXCLUSION_LIST = [
    'SharePoint Lists - outlook',
    'Project Canendars'
]

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")


class Oli(object):
    def __init__(self, outlook_object):
        self._obj = outlook_object

    def items(self):
        # Yield (1-based index, item) pairs
        array_size = self._obj.Count
        for item_index in range(1, array_size + 1):
            yield (item_index, self._obj[item_index - 1])

    def prop(self):
        return sorted(self._obj._prop_map_get_.keys())


def search_item(folders, name):
    if DEBUG:
        browse(folders, recursive=False)
    for index, folder in Oli(folders).items():
        if folder.Name == name:
            if DEBUG:
                print(" Found %s @ %d" % (folder.Name, index))
            return index, folder
    return None, None


def search(path):
    components = path.split('/')
    if DEBUG:
        print(components)
    folder = None
    root = outlook.Folders
    for name in components:
        index, folder = search_item(root, name)
        # Indices start at 1, so a falsy index means "no match"
        if not index:
            return None
        root = folder.Folders
    return folder


def browse(folders, depth=2, recursive=True):
    if not folders:
        return
    for index, folder in Oli(folders).items():
        print(u"%s (%i) [%s] [%s]" % (" " * depth, index, folder.Name, folder))
        # Excluded folders are listed but not descended into
        if u"%s" % folder in EXCLUSION_LIST:
            continue
        if recursive:
            browse(folder.Folders, depth + 2, recursive)


def process_messages(folder):
    if not folder:
        print("Folder could not be found!")
        return
    messages = folder.Items
    message = messages.GetFirst()
    while message:
        # Process a message: one semicolon-separated line per message
        print("%s;%s;%s" % (message.Categories, message.Subject, message.SentOn))
        message = messages.GetNext()


if __name__ == "__main__":
    #list(outlook.Folders)
    f = search('Large Files/Projects/A Project')
    if DEBUG and f:
        print("Folder name: %s" % f.Name)
    process_messages(f)
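To get from those printed lines to the per-day frequency I was after, something along these lines should do. It is only a sketch: messages_per_day is a hypothetical helper, the sample lines are made up, and it assumes the SentOn value prints with an ISO-style date (YYYY-MM-DD) at the start, which may differ depending on your Python and pywin32 versions.

```python
from collections import Counter


def messages_per_day(lines):
    """Count messages per day from 'categories;subject;sent_on' lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The date is the last field; rsplit leaves any ';' in the subject alone.
        sent_on = line.rsplit(";", 1)[-1]
        # Assumption: the timestamp starts with an ISO-style date (YYYY-MM-DD).
        day = sent_on[:10]
        counts[day] += 1
    return counts


# Made-up sample lines in the format printed by process_messages()
sample = [
    "Project A;Kick-off meeting;2015-09-21 09:15:00",
    "Project A, Urgent;Re: budget; final numbers;2015-09-21 14:02:00",
    ";Lunch?;2015-09-22 11:45:00",
]

counts = messages_per_day(sample)
for day, count in counts.most_common():
    print("%s %d" % (day, count))
```

Taking the date from the right-hand side of the line avoids having to worry about semicolons in subjects or categories.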
HTH,