
‘panic’ phrases. By contrast, CRFs are more usually 
trained on sets of hundreds or sometimes thousands of 
training phrases. That relatively broad 
categorisations of phrases were able to approximately 
reflect the timeline of the investigation into Enron 
suggests the method could be extended in several ways. 
There are several limitations to the approach 
detailed in this study. Selection of phrases 
corresponding to the three categories studied was 
entirely subjective, and there was therefore a risk of 
bias in model training. Additionally, although there 
were extensive attempts to clean the dataset, many 
artifacts of email in its raw form remain in the 
corpus (e.g. spam, and multiple quoting that biases 
counts). The precise nature of the association 
between phrase use and actual events in Enron’s 
history can only be guessed at; more information 
regarding the detailed course of events would be 
required to validate the accuracy and sensitivity of 
the association detailed here. The a priori nature of 
CRF model training in this instance virtually 
guarantees bias. 
There are also general limitations in probabilistic 
topic models which may affect inferred results. 
Topic models are prone to overfitting: the way in 
which an individual document’s topic mixture is 
established is not robust enough to handle the 
addition of new documents to the trained corpus. 
Relatedly, the number of free model parameters 
increases linearly with the number of training 
documents, making re-training a computationally 
expensive exercise.  
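The linear growth can be made concrete with a small 
sketch. It assumes a pLSI-style formulation, which 
this study does not name, in which a model with k 
topics, a vocabulary of V words and M training 
documents carries roughly kV + kM parameters 
(sum-to-one constraints ignored). The function name 
and the example sizes are hypothetical. 

```python
def plsi_parameter_count(num_topics, vocab_size, num_documents):
    """Approximate pLSI-style parameter count: k*V + k*M.

    Assumption: one topic-word table (k*V) plus one topic mixture
    per training document (k*M); normalisation constraints ignored.
    """
    topic_word = num_topics * vocab_size
    document_topic = num_topics * num_documents
    return topic_word + document_topic


# Illustrative sizes only, not taken from the Enron corpus.
for num_documents in (1_000, 2_000, 4_000):
    print(num_documents, plsi_parameter_count(10, 5_000, num_documents))
```

Under this assumption, each additional training 
document adds k parameters, so the count grows in 
step with the size of the training corpus. 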
Possible extensions to the software are many and 
varied. Results from what was a relatively lightly 
trained CRF model seemed reasonable, but it was 
trained only on binary data with a first-order model. 
The use of a higher-order model will likely increase 
the precision of phrase tagging, as more 
information about context is modelled. This may 
also allow finer-grained training on more specific 
phrases, as ‘slacker’, ‘aggressive’, etc. are 
relatively broad terms for the language being 
modelled: sub-types of slacker/aggressive/panic 
phrases could be tagged instead. The results of a topic 
model could also be used to inform the tagging of 
phrases, rather than the a priori method detailed in 
this study. Ideally, a formal evaluation of tagging 
predictive accuracy could be conducted on non-Enron 
emails, or on Enron emails using k-fold 
cross-validation with attendant 
measures of fit (e.g. positive predictive value). 
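A minimal sketch of such an evaluation is given 
below. It is not the procedure used in this study: 
the function names, the choice of ‘panic’ as the 
positive class, and the train_fn callback (a 
stand-in for CRF training that returns a function 
mapping a phrase to a predicted tag) are all 
assumptions made for illustration. 

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def positive_predictive_value(y_true, y_pred, positive):
    """PPV = TP / (TP + FP); None if nothing was predicted positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else None


def cross_validated_ppv(phrases, labels, train_fn, k=5, positive="panic"):
    """Train on k-1 folds, score PPV on the held-out fold, for each fold."""
    scores = []
    for held_out in k_fold_indices(len(phrases), k):
        held = set(held_out)
        train_x = [phrases[i] for i in range(len(phrases)) if i not in held]
        train_y = [labels[i] for i in range(len(phrases)) if i not in held]
        predict = train_fn(train_x, train_y)  # hypothetical tagger factory
        y_true = [labels[i] for i in held_out]
        y_pred = [predict(phrases[i]) for i in held_out]
        scores.append(positive_predictive_value(y_true, y_pred, positive))
    return scores
```

In practice train_fn would wrap training of the CRF 
tagger on the manually tagged phrases; the per-fold 
scores could then be averaged, skipping any fold in 
which no phrase was predicted positive. 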
5  CONCLUSIONS 
The method detailed here provides a broad approach to 
the descriptive analysis of email data by tagging 
phrases that are semantically interesting. That this 
exercise even broadly reflects the timeline of the 
investigation lends support to the use of both a 
sub-set of the full Enron corpus and the method used 
to tag information of interest. This suggests that 
even a lightly-trained model, applied to a far 
smaller set than the full Enron Email Dataset, may 
perform acceptably without being exhaustively 
trained on the full dataset. 
REFERENCES 
Blei, D. (2012). Probabilistic Topic Models. 
Communications of the ACM, 55 (4), 77-84. 
Buys, N. M. (2010). Employees’ Perceptions of the 
Management of Workplace Stress. International 
Journal of Disability Management, 5 (2), 25-31. 
Chapanond, A. K. (2005). Graph Theoretic and Spectral 
Analysis of Enron Email Data. Computational & 
Mathematical Organization Theory, 11, 265-281. 
Chekina, L. G. (2013). Exploiting label dependencies for 
improved sample complexity. Machine Learning, 91, 
1-42. 
Dahl, C. (2004). Pipe Dreams: Greed, Ego, and the Death 
of Enron/Anatomy of Greed. The Energy Journal, 25 
(4), 115-134. 
Diesner, J. C. (2008). Conditional random fields for entity 
extraction and ontological text coding. Computational & 
Mathematical Organization Theory, 14, 248-262. 
Diesner, J. F. (2005). Communication Networks from the 
Enron Email Corpus “It’s Always About the People. 
Enron is no Different”. Computational & 
Mathematical Organization Theory, 11, 201-228. 
Dreijer, J. H. (2013). Left ventricular segmentation from 
MRI datasets with edge modelling conditional random 
fields. BMC Medical Imaging, 13, 1-24. 
El Shikieri, A. M. (2012). Factors Associated with 
Occupational Stress and Their Effects on 
Organizational Performance in a Sudanese University. 
Creative Education, 3 (1), 134-144. 
Hayashida, M. K. (2013). Prediction of protein-RNA 
residue-base contacts using two-dimensional 
conditional random field with the lasso. BMC Systems 
Biology, 7 (Suppl 2), 1-11. 
Hurley-Hanson, A. G. (2011). The Effect of the Attacks of 
9/11 on Organizational Policies, Employee Attitudes 
and Workers’ Psychological States. American Journal 
of Economics and Business Administration, 3 (2), 377-
389. 
Hutton, A. L. (2006). Crowdsourcing Evaluations of 
Classifier Interpretability. AAAI Technical Report SS-
12-06 Wisdom of the Crowd, 21-26. 
Jahanian, R. T. (2012). Stress Management in the 
ICAART 2015 - International Conference on Agents and Artificial Intelligence