Characterizing and Predicting Enterprise Email Reply Behavior

Transcript

Characterizing and Predicting Enterprise Email Reply Behavior

Liu Yang (1), Susan T. Dumais (2), Paul N. Bennett (2), Ahmed Hassan Awadallah (2)
(1) Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Amherst, MA, USA
(2) Microsoft Research, Redmond, WA, USA
[email protected], {sdumais, pauben, hassanam}@microsoft.com
(Work primarily done during Liu Yang's internship at Microsoft Research.)

SIGIR'17, August 7-11, 2017, Shinjuku, Tokyo, Japan. © 2017 ACM. ISBN 978-1-4503-5022-8/17/08. DOI: http://dx.doi.org/10.1145/3077136.3080782

ABSTRACT

Email is still among the most popular online activities. People spend a significant amount of time sending, reading and responding to email in order to communicate with others, manage tasks and archive personal information. Most previous research on email is based on either relatively small data samples from user surveys and interviews, or on consumer email accounts such as those from Yahoo! Mail or Gmail. Much less has been published on how people interact with enterprise email, even though it contains less automatically generated commercial email and involves more organizational behavior than is evident in personal accounts. In this paper, we extend previous work on predicting email reply behavior by looking at enterprise settings and considering more than dyadic communications. We characterize the influence of various factors such as email content and metadata, historical interaction features and temporal features on email reply behavior. We also develop models to predict whether a recipient will reply to an email and how long it will take to do so. Experiments with the publicly-available Avocado email collection show that our methods outperform all baselines with large gains. We also analyze the importance of different features on reply behavior predictions. Our findings provide new insights about how people interact with enterprise email and have implications for the design of the next generation of email clients.

KEYWORDS

Email reply behavior; information overload; user behavior modeling

[Figure 1: A motivational email example with predicted reply probability and reply time latency. The example shows a message from Jack to Alice, Bob and Philip (CC: James), sent Friday 11:51 PM, subject "Meeting", marked High Importance, with the attachment "SIGIR17 Experimental Results.pptx (304K)": "Hi Alice, Bob and Philip: Could we have a meeting tomorrow to discuss a possible paper collaboration? In particular, I'd like to discuss a SIGIR'17 submission on email reply behavior prediction. See the attached file for some promising results. Thank you! Jack". The bottom panel shows "Predicted Reply Probability: 67% (Likely to receive a response)" and "Predicted Reply Time Latency: ≥ 245 minutes (High)".]

1 INTRODUCTION

Email remains one of the most popular online activities. Major email services such as Gmail, Outlook, and Yahoo! Mail have millions of monthly active users, many of whom perform frequent interactions like reading, replying to, or organizing emails. A recent survey shows that reading and answering emails takes up to 28% of enterprise workers' time, which is more than searching and gathering information (19%) and internal communication and collaboration (14%), and second only to role-specific tasks (39%) [6]. Understanding and characterizing email reply behaviors can improve communication and productivity by providing insights for the design of the next generation of email tools.

By modeling user reply behaviors such as reply rate and reply time, we can integrate machine intelligence into email systems to provide value for both email recipients and senders. For email recipients, reply predictions could help filter emails that need replies or fast replies, which can help reduce email overload [9]. For email senders, reply behaviors could be predicted in advance during email composition. More generally, better reply strategies could lead to improved communication efficiency. Figure 1 shows a motivating email example with predicted reply probability and reply time latency shown in the bottom panel. Specific features could also be highlighted. For example, identifying a request in the email ("Could we have a meeting...") could improve automated triage for the recipient by highlighting that a reply is needed, and alerting the sender that a reply is likely to take longer if the message is sent late at night or over the weekend could improve communication efficiency.

Previous work investigated strategies that people use to organize, reply to, or delete email messages [9, 10, 22, 29]. However, those studies are based on relatively small surveys or interviews. Some recent research proposes frameworks for studying user actions on emails with large-scale data [11, 20]. Both of these studies are based on consumer emails from Yahoo! Mail. Enterprise email has received little attention compared to consumer email, even though several studies have shown that enterprise email usage is not the same as consumer email usage. For example, [25] reports that enterprise users send and receive twice as much email as consumer users, and [15] shows that consumer email is now dominated by machine-generated messages sent from business and social networking sites.

Perhaps the closest prior research to our work is the study of email reply behavior in Yahoo! Mail by Kooti et al. [20]. However, they focus on personal email and only consider a subset of email exchanges, specifically those from dyadic (one-to-one) email conversations for pairs of users who have exchanged more than 5 prior emails in consumer email.

Focusing only on dyadic email conversations is limiting, especially in the context of enterprise email. In the enterprise email data that we study, 52.99% of emails are non-dyadic, that is, they are sent to more than one recipient other than the sender. Thus it is important and more realistic to study the more general setting of modeling email reply behavior including both dyadic emails and emails sent to a group of people, without any threshold on previous interactions.

In this paper, we address this gap by characterizing and predicting reply behaviors in enterprise emails, where we consider both dyadic conversations and group discussions. We use the publicly-available Avocado research email collection¹, which consists of emails and attachments taken from a defunct information technology company referred to as "Avocado". There are 938,035 emails from 279 accounts in this email collection.

¹ https://catalog.ldc.upenn.edu/LDC2015T03

We analyze and characterize various factors affecting email replies including: temporal features (e.g., time of day and day of week), historical interaction features (e.g., previous interactions between sender and recipients), properties of the content (e.g., length of subject and email body), predictions based on textual content (e.g., sentiment, contains a request), address features (e.g., recipients), and metadata features (e.g., attachments). We find several interesting patterns connecting these factors to reply behavior. For example, emails with requests or commitments get more but slower replies, while longer emails tend to get fewer and slower replies. Based on this analysis, we used a variety of features to build models to predict whether an email will receive a reply and the corresponding reply time. We show that our proposed model outperforms all baselines with large gains. We also perform feature importance analysis to understand the role different features play in predicting user email reply behavior.

Our contributions can be summarized as follows:

(1) We introduce and formalize the task of reply behavior prediction in enterprise emails involving both one-to-one (dyadic) and one-to-many communication. Unlike previous work based either on small user surveys and interviews [9, 10, 22, 29] or only on dyadic email conversations in consumer email [20], our work is the first to model reply behavior in a more general setting that includes emails sent to groups of people as well as individuals in enterprise email.

(2) We analyze and characterize various factors affecting email replies. Compared to previous work, we study many novel factors including email textual content, requests/commitments in emails, address features like internal/external emails, and the number of email recipients.

(3) We extract 10 different classes of features and build models to predict email reply behaviors. We perform a thorough experimental analysis with the publicly-available Avocado email collection and show that the proposed methods outperform all baselines with large gains. We also analyze the importance of each class of features in predicting email reply behavior.

2 RELATED WORK

Our work is related to several research areas, including modeling actions on email, email overload, email acts and intent analysis, and email classification and mining.

Modeling Actions on Email. Our work is related to previous research on user behavior and email action modeling [10, 11, 20, 23, 24, 28]. Dabbish et al. [10] examined people's ratings of message importance and the actions they took on specific email messages with a survey of 121 people at a university. While this work provides insights on email usage and user actions on email messages, how well the results generalize to other user groups is not clear. Di Castro et al. [11] studied four common user actions on email (read, reply, delete, delete-without-read) using an opt-in sample of more than 100k users of the Yahoo! Mail service. They proposed and evaluated a machine learning framework for predicting these four actions. Kooti et al. [20] also used Yahoo! Mail data to quantitatively characterize the reply behavior for pairs of users. They investigated the effects of increasing email overload on user behaviors and performed experiments on predicting reply time, reply length, and whether the reply ends a conversation. Our work is inspired by the latter two studies but differs in several important ways. These two studies looked at behavior in Yahoo! Mail, a consumer email collection, whereas we studied interaction in an enterprise setting using the Avocado collection. What's more, the study by Kooti et al. [20] only considers dyadic emails from a subset of people who had at least five previous email interactions. Our work considers a more general setting that includes both dyadic emails and emails sent to a group of users. We allow cases where there are no previous interactions between the sender and receivers, which makes our prediction task a more challenging (and realistic) one. Our experimental data is also publicly available in the recently-released Avocado collection from LDC, whereas prior research used proprietary internal data. Last but not least, we analyzed several novel features, including properties of the content like email subject and body length, predictions of whether the email contained a request or commitment, and address features like internal vs. external emails, and showed that they are useful for building models to predict user email reply behavior.

Email Overload. Several research efforts have examined the email overload problem and proposed solutions to reduce it [2, 9, 12, 22, 29]. In their pioneering work, Whittaker and Sidner [29] explored how people manage their email and found that email was not only used for communication, but also for task management and maintaining a personal archive. A decade later, Fisher et al. [12] revisited the email overload problem by examining a sample of mailboxes from a high-tech company. They showed that some aspects of email had dramatically changed, such as the size of archives and the number of folders, but others, like the average inbox size, remained more or less the same. Several researchers have proposed solutions to mitigate the email overload problem [1, 2, 13, 18]. Aberdeen et al. [1] proposed a per-user machine learning model to predict email "importance" and rank email by how likely the user is to act on that mail; this forms the Priority Inbox feature of Gmail. Our work shares similar motivations for handling the email overload problem by modeling and predicting user reply behaviors on emails. We focused on email reply behaviors, specifically identifying emails which receive replies and the reply latency. Their work looked at consumer email (Gmail), whereas we focus on the enterprise email collection, Avocado.

Email Acts and Intent Analysis. Several efforts have studied email acts and email intent analysis [3, 7, 21, 26, 27]. Cohen et al. [7] proposed machine learning methods to classify emails according to an ontology of verbs and nouns which describe the "email speech act" intended by the email sender. Follow-up work by Carvalho and Cohen [4] described a new text classification algorithm based on a dependency-network-based collective classification method and showed significant improvements over a bag-of-words baseline classifier. In a recent study using the Avocado collection, Sappelli et al. [26] studied email intent and tasks, and proposed a taxonomy of tasks in emails. They also studied predicting the number of tasks in emails. Our work extends previous research on email acts and intent by extracting requests and commitments from emails and using them as features for predicting user reply behavior.

Email Classification and Mining. We formalize the user email reply behavior prediction task as a classification task. There are many previous papers on email classification and mining [5, 14-16, 19]. Klimt and Yang [19] introduced the Enron corpus as a data set and used it to explore automated classification of email messages into folders. Graus et al. [14] studied the task of recipient recommendation. They also used enterprise email collections, but their collection was proprietary and their focus was on recipient recommendation rather than reply behavior modeling.

To summarize, the research described in this paper extends previous research on email interaction in several ways. Using a recently released public enterprise email collection, we formalize the task of predicting whether an email will be responded to and how long it will take to do so. In contrast to earlier work on reply prediction, we study enterprise (vs. consumer) email behavior, consider both one-to-one (dyadic) and one-to-many emails, and develop several new types of features to characterize email content and intent and study their importance in reply prediction.

3 DATA SET & TASK DEFINITION

Our data set is the Avocado research email collection from the Linguistic Data Consortium. This collection contains corporate emails from a defunct information technology company referred to as "Avocado". The collection contains the full content of emails with various meta information, as well as attachments, folders, calendar entries, and contact details from Outlook mailboxes for 279 company employees. The full collection contains 938,035 emails and 325,506 attachments. Since our goal is to study email reply behavior, we need to preprocess the data to figure out the reply relationships between emails. For emails that are replies, the email meta information includes a "reply to" relationship field containing the ID of the email being replied to. We generate the email thread structure by parsing the "reply to" relationship fields in the email metadata. Specifically, we first use the "reply to" relationship fields to collect sets of messages that are in the same thread. Then emails in the same thread are ordered by sent time to generate their thread positions. We removed all the duplicated quoted "original messages" in the email text files to identify the unique body of each message. From the "reply to" relationship fields and the associated email sent times, we generate the ground truth for reply behavior, including whether the email received a reply and the reply latency. A minimal sketch of this preprocessing is shown below.
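To make the preprocessing concrete, here is a minimal sketch of the thread reconstruction and ground-truth generation described above. It assumes each parsed message is a dict with hypothetical keys "id", "reply_to" (the ID of the parent message, or None for a thread's first message), and "sent_time" (a datetime); these names are ours for illustration, not the Avocado schema.

```python
from collections import defaultdict

def build_threads(messages):
    """Group messages into threads by following "reply to" links,
    then order each thread by sent time."""
    parent = {m["id"]: m["reply_to"] for m in messages}

    def root_of(msg_id):
        # Walk up the reply chain to the thread's first message.
        seen = set()
        while parent.get(msg_id) is not None and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    threads = defaultdict(list)
    for m in messages:
        threads[root_of(m["id"])].append(m)
    for thread in threads.values():
        thread.sort(key=lambda m: m["sent_time"])
    return dict(threads)

def reply_ground_truth(thread):
    """Return (has_reply, latency_minutes) for the thread's first message;
    only the first reply's latency is kept, as in the task definition."""
    first, replies = thread[0], thread[1:]
    if not replies:
        return False, None
    delta = replies[0]["sent_time"] - first["sent_time"]
    return True, delta.total_seconds() / 60.0
```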
We first perform a temporal analysis of sent emails and active users in each month. Figures 2a and 2b show the number of sent emails and the number of active people in the collection, respectively. Emails are aggregated by sent time into months. Each person is represented by an interval based on the sent times of his/her first and last email; the number of active people in each month is then the number of intervals overlapping that month.

[Figure 2: Temporal analysis of the Avocado email collection: (a) the number of sent emails in each month; (b) the number of active users in each month.]

We see that the number of active people increases to a peak in February 2001 and then decreases to under 50 after February 2002, as the company started laying off employees. The number of emails sent follows a similar pattern over the months. To ensure that our analysis and experimental results are built upon emails sent while the company was under normal operations, we select the emails from June 1st 2000 to June 1st 2001.

We further perform a few additional data cleaning and filtering steps. We focus on the reply behaviors toward the first message in each thread, because replies to first messages and follow-up replies may have different properties and we want to focus on one phenomenon in this study. We filter out email threads where the first reply comes from the sender himself or herself, and emails where the only recipient is the sender. After these filtering steps, we have 351,532 emails left, which become the basis of our data analysis.²

² The email IDs of the filtered subset can be downloaded from https://sites.google.com/site/lyangwww/code-data.
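The date-range selection and filtering rules above can be expressed compactly. A minimal sketch, reusing the thread structure from the previous sketch and the same illustrative message keys plus "sender" and "recipients":

```python
from datetime import datetime

# The paper's window of normal company operations.
START, END = datetime(2000, 6, 1), datetime(2001, 6, 1)

def keep_thread(thread):
    """Apply the filtering rules to one time-ordered thread."""
    first = thread[0]
    # Keep only emails sent inside the selected date range.
    if not (START <= first["sent_time"] < END):
        return False
    # Drop emails whose only recipient is the sender.
    if set(first["recipients"]) <= {first["sender"]}:
        return False
    # Drop threads where the first reply comes from the sender.
    if len(thread) > 1 and thread[1]["sender"] == first["sender"]:
        return False
    return True

def filter_threads(threads):
    return {root: t for root, t in threads.items() if keep_thread(t)}
```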

The email behavior we study includes the reply action and the reply time. We formally define the task as follows. Given an email message m that user u_i sent to a user set {u_j} at time t, we study: (1) Reply Action: whether any one of the users in {u_j} will reply to message m; (2) Reply Time: how quickly users in {u_j} reply, which is the difference between the first response time and the start time t. If there are multiple replies from users in {u_j}, we consider only the time latency corresponding to the first reply. Thus our setting includes both "one-to-one" and "one-to-many" emails, which is more general than [20], which only included dyadic emails between pairs of users with at least five previous emails.

4 CHARACTERIZING REPLY BEHAVIOR

We characterize multiple factors affecting email reply action and reply time. For reply action, we compute the reply rate, which is the percentage of emails that received replies.³ In Section 5 we show that these factors enable us to learn models to predict user reply behavior.

³ For partitions of the data we compute the reply rate for each partition separately. For example, the reply rate for emails sent at night is the percentage of emails sent at night that receive a reply, regardless of whether the reply is sent at night or during the day.

4.1 Temporal Factors

We first study the impact of temporal factors on user replies. Figures 3a and 3b show the reply rate and median reply time for emails sent at different times of the day. We partition the time of day into Night (0 to 6), Morning (6 to 12), Afternoon (12 to 18) and Evening (18 to 24). We then aggregate emails in each time range and compute the reply rate and median reply time. The unit of reply time is minutes. We see that emails sent in the morning and afternoon receive more and faster replies. For reply rate, emails sent in the afternoon have the highest reply rate (7.77%) and emails sent at night have the lowest reply rate (3.63%). For reply time, the median reply time for morning and afternoon is less than 1 hour, whereas the reply time for emails sent in the evening and at night is more than 7 hours. This makes sense since users are more active during the day than at night.

[Figure 3: The effects of temporal features of the original sent emails on user email reply behavior: (a) reply rate vs. time of day; (b) reply time vs. time of day; (c) reply rate vs. day of week; (d) reply time vs. day of week. The median reply time, denoted Median RT, is in minutes.]

Figures 3c and 3d show the reply rate and reply time for emails sent on different days of the week. We see that emails sent on weekdays receive more and faster replies. The reply rate for emails sent on weekdays is around 7%, but that rate drops to 4% on weekends. For the median reply time, we see that emails sent on weekdays receive replies in less than 1 hour. However, emails sent on the weekend have 13 times (for Sunday) or 30 times (for Saturday) longer reply latency. This is consistent with the fact that most people don't check or reply to email as regularly on the weekend. Most emails sent over the weekend are replied to after Sunday.

4.2 Email Subject and Body

Next we study the effects of properties of the email content on user reply behavior. We start with the length of email subjects and email bodies. We remove punctuation and keep stop words when counting the number of words. Figures 4a and 4b show the effects of email subject length on user reply behavior. The reply rate decreases from 11.47% to 1.10% as the length of the email subject increases from 1 to 29 words, which is an interesting phenomenon. Because we have access to the full content of emails in Avocado, we were able to examine emails that have long and short subjects. We found that many long subjects are announcements placed in the subject with no text in the body, or machine-generated bug reports. Such email subjects are similar to "[Bug 10001] Changed - Loop length not assigned correctly for xml file when extracted to a file and tested, works OK when tested from database."⁴ Users don't need to reply to such emails, leading to a lower reply rate.

⁴ We paraphrase this sentence due to data sensitivity.

In examining short subjects, we computed the reply rates for all unique subjects. Checking those that occur 20 times or more, we summarize those that have the highest reply rate. A large number of these are simply company names; these often indicate current contracts, customers, or sales leads that are being pursued. Other phrases such as "meeting", "lunch", "expense report", "sales training", and "alerts" indicate common activities that require approval, coordination, or action. Finally, a set of other phrase types such as "hey", "hi", "hello", etc. indicate recipient-sender familiarity (note that spam email filters were in place before collection). We also find that reply time is influenced by subject length. The main trend is that reply time decreases as the subject length increases.

[Figure 4: The effects of email subject length and email body length on reply rate and median reply time: (a) reply rate vs. subject length; (b) reply time vs. subject length; (c) reply rate vs. body length; (d) reply time vs. body length. The median reply time, denoted Median RT, is in minutes.]

We also analyze the impact of the email body length on reply behavior. Figures 4c and 4d show the effects of email body length on reply behavior. We see that the reply rate initially increases from 4.65% to 9.64% as the email length increases from 1 to 40 words, and then decreases to 2.82% as the email body length increases to 500 words. Thus people are not likely to reply to emails with too few or too many words. The median reply time increases as the length of the email body increases. This may be because people need to spend more time reading and digesting the content of longer emails, thus increasing reply time.
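As an illustration of how the per-partition statistics in Sections 4.1 and 4.2 can be computed, here is a sketch assuming a pandas DataFrame with one row per first message and hypothetical columns "sent_time" (datetime), "has_reply" (bool), and "reply_minutes" (float, NaN when no reply arrived):

```python
import pandas as pd

def bucket_time_of_day(hour):
    # Night: 0-6, Morning: 6-12, Afternoon: 12-18, Evening: 18-24.
    return ["Night", "Morning", "Afternoon", "Evening"][hour // 6]

def temporal_stats(df):
    df = df.assign(time_of_day=df["sent_time"].dt.hour.map(bucket_time_of_day))
    # Reply rate: share of emails in a bucket that received any reply.
    # Median RT: median latency over the replied-to emails only (NaN rows
    # are ignored by the median).
    return df.groupby("time_of_day").agg(
        reply_rate=("has_reply", "mean"),
        median_rt=("reply_minutes", "median"),
    )
```

The same groupby pattern applies to day of week, subject/body length bins, and the other factors analyzed in this section.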

4.3 Requests and Commitments in Emails

Previous work [3, 4, 7, 21] has studied "speech acts" in emails. Carvalho and Cohen [4] developed algorithms to classify email messages as to whether or not they contain certain "email acts", such as a request or a commitment. We classify emails into those with requests (e.g., "Send me the report.") or commitments (e.g., "I'll complete and send you the report.") and those without requests or commitments, using an internally developed classifier inspired by previous work in this area [8].

Table 1: The effects of requests and commitments in emails on user reply behavior. The median reply time, denoted Median RT, is in minutes.

| Factor        | Requests        | Commitments     |
| Email Feature | HasReq | NoReq  | HasCom | NoCom  |
| Reply Rate    | 14.81% | 7.45%  | 8.78%  | 8.32%  |
| Median RT     | 81.13  | 60.60  | 86.71  | 54.51  |

Table 1 shows the effects of whether an email contains a request or commitment on reply rate and reply time. Emails that contain requests are almost twice as likely to receive replies as those that do not, 14.81% vs. 7.45% respectively. This is reasonable since intuitively people are more likely to reply to emails if the sender makes a request. In contrast, there is very little difference between the reply rates for emails with and without commitments.

The median reply time for emails with requests or commitments is longer, which may seem a bit counter-intuitive. However, this may be because such emails are associated with tasks, and therefore people may need to do some work like searching for information, reading or writing before they can reply to such a mail.

4.4 Internal / External Email Addresses

Next we investigate the impact of internal or external emails on reply rate and reply time. To do this, we adopt a heuristic method to classify the email addresses. For each receiver address, we check whether there is a "@" in it. If a receiver address contains "@" and does not contain "@avocadoit.com", this address is assigned an "external" label. If there is no "@", or only "@avocadoit.com", in a receiver address, we classify it as an internal address. Email addresses with "@avocadoit.com" are definitely internal addresses. However, a limitation of this method is that if email addresses with external domains are in the sender's contacts, they will be treated as internal addresses. In this sense, the internal email addresses in our analysis are those of Avocado employees or of people who communicate frequently with Avocado employees and are stored in their contact books. Emails sent to at least one external address are labeled as external; otherwise they are treated as internal emails. A code sketch of this heuristic appears at the end of this section.

Table 2: The effects of email addresses and attachments on user reply behavior. The median reply time, denoted Median RT, is in minutes.

| Factor        | Email Addresses     | Attachments          |
| Email Feature | Internal | External | HasAttach | NoAttach |
| Reply Rate    | 7.76%    | 2.26%    | 8.08%     | 6.38%    |
| Median RT     | 65.52    | 134.67   | 80.37     | 61.23    |

Table 2 compares the reply rate and reply time for internal and external emails. People are 3.4 times more likely to reply to internal emails (7.76%) than to external emails (2.26%). In addition, the median reply time for internal emails is 2.1 times faster than for external emails, 66 vs. 135 minutes respectively. In our context, internal emails come from colleagues or people in the sender's contacts, and are thus more likely to be quickly replied to.

4.5 Email Attachments

We further analyze the effects of email attachments on user reply behaviors. Table 2 shows that reply rates are higher for emails with attachments (8.08%) than for emails without attachments (6.38%). This may be because the types of emails that contain attachments are more likely to require replies. The median reply time for emails with attachments is 23.81% longer than that for emails with no attachments. This result is consistent with the fact that people need to open the attachments and spend more time reading or replying to emails with attachments.

4.6 Number of Email Recipients

Unlike [20], which considers only dyadic emails, we also include emails sent to multiple recipients in our analysis. Figure 5 shows the comparison of reply rate and reply time for emails with different numbers of recipients. Most emails are sent to 1 to 8 addresses. Emails sent to 3 to 5 addresses have the highest reply rates (approximately 10%). As the number of email recipients continues to increase, the reply rate begins to decrease. Thus more email recipients does not always mean a higher reply rate. There are at least two reasons for this: 1) some emails sent to many addresses are general announcements or reports which do not require replies; 2) when an email is sent to many recipients, it is more likely that no one will reply to it, since people may assume the other co-recipients will reply.

[Figure 5: The effects of the number of email recipients on reply rate and median reply time: (a) reply rate vs. the number of email recipients; (b) reply time vs. the number of email recipients. The median reply time, denoted Median RT, is in minutes.]

The median reply time increases from 44 minutes to 95 minutes as the number of email recipients increases from 1 to 8. Thus emails sent to more recipients get slower replies. When an email has many recipients, people may wait a while to see if others reply before they choose to do so, or it may be that emails sent to many people require some work to be accomplished before the reply.
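The internal/external heuristic of Section 4.4 translates directly into code. A minimal sketch (the company domain string is from the paper; the function names are ours):

```python
def is_internal_address(addr):
    """An address is internal if it has no '@' (a resolved contact or alias)
    or belongs to the company domain."""
    if "@" not in addr:
        return True
    return addr.lower().endswith("@avocadoit.com")

def is_internal_email(recipient_addrs):
    """An email is internal only if every recipient address is internal;
    a single external address makes the whole email external."""
    return all(is_internal_address(a) for a in recipient_addrs)
```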

5 PREDICTING REPLY BEHAVIOR

The results presented in Section 4 show that email reply behaviors are influenced by various factors such as time, email content, email addresses, email recipients and attachments. In this section, we use these insights to guide the development of features to train supervised learning models to predict email reply behavior, including reply time and reply action.

5.1 Data Overview and Methodology

Given the Avocado email collection described in Section 3, we split the data into training/testing partitions using the sent time of emails. Specifically, we use emails from the 9 months from June 1st 2000 to February 28th 2001 for training, and emails from the 3 months from March 1st 2001 to June 1st 2001 for testing. Because email is a temporally ordered collection, we used a split by time (rather than a random split) to ensure that we do not use future information to predict past reply behavior.

We formalize the reply action prediction task as a binary classification problem and the reply time prediction task as a multi-class classification problem. We follow the notation and task definitions presented in Section 3. Given an email message m that user u_i sent to a user set {u_j} at time t, reply action prediction is to predict whether any user in {u_j} will reply to message m. Thus the classified instances are emails with binary labels. For reply time prediction, we do not necessarily need to obtain the exact reply time latency. We follow a similar setting as previous related work [20], where we consider three classes of reply times: immediate replies that happen within 25 minutes (32.80% of the training data), quick replies that happen between 25 minutes and 245 minutes (32.84% of the training data), and slow replies that take longer than 245 minutes (34.36% of the training data). The ground truth labels can be directly extracted from the training/testing data based on the actual reply times and actions on emails. The statistics of the experimental data and the distributions of labels for the reply time/action prediction tasks are shown in Table 3 and Table 4.

Table 3: Reply time prediction class distribution.

| Class        | Training Data  | Testing Data   |
| 1: ≤ 25min   | 5,143 (32.80%) | 3,299 (38.53%) |
| 2: 25-245min | 5,150 (32.84%) | 2,820 (32.93%) |
| 3: > 245min  | 5,389 (34.36%) | 2,444 (28.54%) |

Table 4: Reply action prediction class distribution.

| Class        | Training Data    | Testing Data     |
| 0: No Reply  | 224,605 (93.47%) | 102,682 (92.30%) |
| 1: Has Reply | 15,682 (6.53%)   | 8,563 (7.70%)    |

5.2 Learning Models and Evaluation Metrics

We experiment with a variety of machine learning algorithms including Logistic Regression, Neural Networks, Random Forest, AdaBoost [31] and Bagging. Since the focus of our work is on feature analysis, we only report the experimental results with a basic Logistic Regression (LR) model and the best model, AdaBoost. We used the sklearn package⁵ for the implementations of LR and AdaBoost. We did multiple runs of hyper-parameter tuning by grid search to find the best setting for each model with cross validation on the training data.⁶

⁵ http://scikit-learn.org

⁶ For reply time prediction with LR, we set C = 1, the maximum number of iterations to 500, the tolerance for the stopping criteria to 0.0001, and the penalization to the l1 norm. For reply action prediction with LR, we set C = 1, the maximum number of iterations to 100, the tolerance for the stopping criteria to 0.0001, and the penalization to the l1 norm. For reply time prediction with AdaBoost, we set the learning rate to 0.1, the maximum number of estimators to 800, and the boosting algorithm to SAMME.R. For reply action prediction with AdaBoost, we set the learning rate to 1, the maximum number of estimators to 50, and the boosting algorithm to SAMME.R.

Since reply action prediction is highly imbalanced (see Table 4) and ranking quality is important for tasks like triage, we report the Area Under the ROC Curve (AUC), a ranking metric insensitive to class imbalance, as the metric for reply action prediction. For reply time prediction, we report precision, recall, F-1 and accuracy (i.e., the percentage of correctly classified emails). The precision, recall, and F-1 are computed as weighted macro averages over all three classes.

5.3 Features

Our analysis in Section 4 identifies factors that impact email reply behaviors. Building on these observations, we develop 10 classes of features that can be used to build models for predicting both reply time and reply action. A summary and description of the extracted features is provided in Table 5. In total, we extract 65 features plus Bag-of-Words features in 10 feature groups. We normalize all features to the range [0, 1] using min-max feature normalization.

Address Features (Address): These features are derived from the email addresses, such as whether the email is internal or external and the number of recipients.
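A sketch of the experimental setup in Sections 5.1-5.2, using scikit-learn estimators with the hyper-parameter values reported in footnote 6. The solver choice for l1-penalized LR is our assumption, and recent scikit-learn versions drop the explicit SAMME.R option used in the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score, precision_recall_fscore_support

# LR settings from footnote 6 (liblinear is an l1-capable solver).
lr_time = LogisticRegression(C=1, max_iter=500, tol=1e-4,
                             penalty="l1", solver="liblinear")
lr_action = LogisticRegression(C=1, max_iter=100, tol=1e-4,
                               penalty="l1", solver="liblinear")

# AdaBoost settings from footnote 6; older scikit-learn versions also
# accept algorithm="SAMME.R", the boosting variant used in the paper.
ada_time = AdaBoostClassifier(n_estimators=800, learning_rate=0.1)
ada_action = AdaBoostClassifier(n_estimators=50, learning_rate=1.0)

def evaluate_action(model, X_test, y_test):
    # AUC over the predicted probability of the positive (has-reply) class.
    scores = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, scores)

def evaluate_time(model, X_test, y_test):
    # Weighted precision/recall/F-1 over the three latency classes,
    # plus plain accuracy.
    y_pred = model.predict(X_test)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted")
    return prec, rec, f1, float(np.mean(y_pred == y_test))
```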

Table 5: The features extracted for predicting user email reply behaviors. The 10 feature groups Address, BOW, CPred, CProp, HistIndiv, HistPair, Meta, MetaAdded, Temporal and User stand for "Address Features", "Bag-of-Words", "Content Predictions", "Content Properties", "Historical Interaction-Individual", "Historical Interaction-Pairwise", "Metadata Properties", "Metadata Properties-Sender Added", "Temporal Features" and "User Features" respectively. Note that for the computation of features in User, HistIndiv and HistPair, we respect the temporal aspects of the data and only use the historical information before the sent time of the email instance.

| Group     | Feature                     | Description |
| Address   | IsInternalExternal          | 1 binary feature indicating whether the email is internal or external |
| Address   | NumOfRecipients             | The number of recipients of the email |
| BOW       | BagOfWords                  | Bag-of-words features indicating the TF-IDF weights of terms in the email body text |
| CPred     | SentimentWords              | 2 integer features indicating the number of positive/negative sentiment words in the email body text |
| CPred     | CommitmentScore             | The commitment score of the email from an internal binary classifier |
| CPred     | RequestScore                | The request score of the email from an internal binary classifier |
| CProp     | EmailSubLen                 | The length of the subject of the email |
| CProp     | EmailBodyLen                | The length of the body of the email |
| HistIndiv | HistReplyRateGlobalUI       | The historical reply rate of the sender u_i towards all other users |
| HistIndiv | HistReplyNumGlobalUI        | The historical reply count of the sender u_i towards all other users |
| HistIndiv | HistRecEmailNumSTGlobalUI   | The historical number of emails received by the sender u_i as a sent-to address from all other users |
| HistIndiv | HistRecEmailNumCCGlobalUI   | The historical number of emails received by the sender u_i as a CC address from all other users |
| HistIndiv | HistSentEmailNumGlobalUI    | The historical number of emails sent by sender u_i to all other users |
| HistIndiv | HistReplyTimeMeanGlobalUI   | The historical mean reply time of sender u_i to all other users |
| HistIndiv | HistReplyTimeMedianGlobalUI | The historical median reply time of sender u_i to all other users |
| HistIndiv | HistGlobalUJ                | 21 features indicating similar mean/min/max historical behavior statistics of the recipients {u_j} towards other users |
| HistPair  | HistReplyNumLocal           | 3 features indicating the historical mean, min, max reply count of the recipients {u_j} to sender u_i |
| HistPair  | HistReplyTimeMeanLocal      | 3 features indicating the historical mean, min, max of the mean reply time of the recipients {u_j} towards sender u_i |
| HistPair  | HistReplyTimeMedianLocal    | 3 features indicating the historical mean, min, max of the median reply time of the recipients {u_j} towards sender u_i |
| Meta      | HasAttachment               | 1 binary feature indicating whether the email has attachments |
| Meta      | NumOfAttachment             | 1 integer feature indicating the number of attachments of the email |
| MetaAdded | IsImportant                 | 1 binary feature indicating the importance of the email, a tag specified by u_i |
| MetaAdded | IsPriority                  | 1 binary feature indicating the priority of the email, a tag specified by u_i |
| MetaAdded | IsSensitivity               | 1 binary feature indicating the sensitivity of the email, a tag specified by u_i |
| Temporal  | TimeOfDay                   | 4 binary features indicating the time of day (0-6, 6-12, 12-18, 18-24) |
| Temporal  | DayOfWeek                   | 7 binary features indicating the day of week (Sun, Mon, ..., Sat) |
| Temporal  | WeekDayEnd                  | 2 binary features indicating whether the day is a weekday or a weekend |
| User      | UserDepartment              | 1 feature indicating the department of the email sender u_i |
| User      | UserJobTitle                | 1 feature indicating the job title of the email sender u_i |

Bag-of-Words (BOW): These features are the TF-IDF weights of non-stop terms in the email body text. The vocabulary size of our experimental data set is 554,061.

Content Predictions (CPred): These features are predictions from the email's textual content, such as positive/negative sentiment word counts and commitment/request scores. We count the number of positive and negative sentiment words in the email body text using a sentiment lexicon from previous research [17]. We also include the commitment and request scores of emails from an internal classifier to infer the likelihood of whether an email contains a commitment or a request.

Content Properties (CProp): These features are content properties, namely the lengths of the email subject and email body text.

Historical Interaction-Individual (HistIndiv): These features characterize the historical email interactions related to each sender u_i and recipient in {u_j}, aggregated across all interactions. This feature group has two subgroups: global interaction features for the sender u_i and global interaction features for the recipients {u_j}. The global interaction features for the sender u_i contain a set of features to capture the historical interactions between u_i and all other users, such as reply rate, reply count, number of received emails, number of sent emails, and mean/median reply time. For the global interaction features for the recipients {u_j}, since there could be multiple recipients, we compute the mean/min/max of those statistics to capture the historical interactions between {u_j} and all other users.

Historical Interaction-Pairwise (HistPair): These features characterize the local (pairwise) interactions between the sender u_i and the recipients {u_j}: statistics such as the number of replied emails and the mean/median reply time from the historical interactions between the sender u_i and the recipients {u_j}. Note that for the computation of "HistIndiv" and "HistPair" features, we compute per-day updated user profiles and only use information from before the email instance.

Metadata Properties (Meta): This feature group contains features derived from email attachments, including whether the email has attachments and the number of attachments.

Metadata Properties-Sender Added (MetaAdded): These features are tags specified by the sender to indicate the importance, priority or sensitivity of the sent email. In our data, less than 3% of emails have such tags, but they can still provide some clues for inferring user reply behavior once they are set by the sender.

Temporal Features (Temporal): These features are generated from the sent time of emails to capture the temporal factors in user email reply behaviors.

User Features (User): These features are the department and job title of the sender.
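As an example of how the historical features respect the temporal constraint, here is a sketch of a "HistPair"-style aggregation that computes the mean/min/max of each recipient's past reply statistics toward the sender, using only events strictly before the email's sent time. The reply-log structure is illustrative:

```python
import numpy as np

def hist_pair_features(email, reply_log):
    """reply_log: hypothetical dict mapping (recipient, sender) to a
    time-sorted list of (reply_timestamp, latency_minutes) pairs."""
    reply_counts, mean_latencies = [], []
    for r in email["recipients"]:
        past = [(ts, lat)
                for ts, lat in reply_log.get((r, email["sender"]), [])
                if ts < email["sent_time"]]   # never peek at the future
        reply_counts.append(len(past))
        mean_latencies.append(
            float(np.mean([lat for _, lat in past])) if past else 0.0)

    def agg(vals):
        # mean/min/max over recipients, with a zero default for safety.
        return [float(np.mean(vals)), float(np.min(vals)),
                float(np.max(vals))] if vals else [0.0, 0.0, 0.0]

    return agg(reply_counts) + agg(mean_latencies)
```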

Table 6: Summary of the prediction results for user email reply time and reply action. The Precision, Recall and F-1 scores are weighted averages by supports over all classes. The best performance in each column is achieved by AdaBoost. Both LR and AdaBoost show significant improvements over all baseline methods with p < 0.01 measured by a micro sign test [30].

| Method         | Action AUC | Time Prec | Time Rec | Time F-1 | Time Accuracy |
| Random         | .5024      | .3244     | .3257    | .3253    | .3262         |
| Majority Vote  | .5000      | .0815     | .2854    | .1267    | .2854         |
| Previous Reply | .5858      | .3717     | .3613    | .3633    | .3742         |
| LR             | .7036      | .4098     | .4098    | .3791    | .3952         |
| AdaBoost       | .7208      | .4591     | .4591    | .4561    | .4476         |

5.4 Baselines

We compare our method against three baselines:

Random. This method randomly generates a predicted class from the label distribution in the training data.

Majority Vote. This method always predicts the largest class, which is class 0 (no reply) for reply action prediction and class 3 (> 245min) for reply time prediction.

Previous Reply. This method generates predictions according to the previous reply behaviors of the recipients {u_j} towards the sender u_i before the sent time t of email m. For reply action, it predicts 0 (no reply) if there is no previous reply behavior from {u_j} to u_i. For reply time, it predicts the majority class if there is no previous reply behavior from {u_j} to u_i. If there are previous reply behaviors, it uses the median of the previous reply times as the predicted result. Note that this baseline is similar to the "Last Reply" baseline used in [20].⁷

⁷ We cannot reproduce the proposed method of [20] since they do not disclose the details of the features in their model, and they do not release their code and data due to the proprietary nature of their work.

5.5 Experimental Results and Analysis

We now analyze the results of our proposed method compared with the baseline methods. Table 6 shows the prediction results for reply action and reply time. Figure 6 shows the ROC curves of all methods for reply action prediction. The baseline "Majority Vote", while accurate since 92.30% of the emails are negative, achieves zero true positive rate (recall) and predicts no positive instances. Likewise, "Random" falls nearly on the x = y dashed line, which indicates expected random performance (empirical variance leads to a slight bit of luck). As shown in Table 6, both LR and AdaBoost outperform all three baselines on AUC. The best model, AdaBoost, achieves large improvements of 44.16% compared with "Majority Vote" and 23.05% compared with "Previous Reply". AdaBoost achieves slightly better performance than LR. Examining the ROC curves, the most competitive baseline is "Previous Reply", which still lies under the ROC curves of LR and AdaBoost. The AUC scores show that our methods outperform all baseline methods by a large margin.

[Figure 6: The ROC curves of different methods for the reply action prediction task.]

For reply time prediction, both the LR and AdaBoost models with the proposed features outperform all baseline methods with large gains. The differences are statistically significant with p < 0.01 measured by a micro sign test [30]. The best method, AdaBoost, achieves large improvements of 23.89% and 26.36% in F-1 and accuracy compared with "Previous Reply", and 253.18% and 60.85% in F-1 and accuracy compared with "Majority Vote". Comparing the two learning models, AdaBoost performs better than LR in terms of both F-1 and accuracy. This shows the advantage of AdaBoost, which can feed the relative "hardness" of each training sample into the tree-growing algorithm such that later trees tend to focus on harder-to-classify instances.

5.6 Feature Importance Analysis

Feature Group Analysis. We further perform analyses to understand the relative importance of different feature groups in predicting reply behavior. We consider two approaches: (1) Remove one feature group: we observe the change in performance when we remove any one of the 10 feature groups. (2) Only one feature group: we observe the change in performance when we classify emails using only one feature group. Table 7 and Table 8 show the results of these analyses for reply action prediction and reply time prediction using the AdaBoost model, the best method in our previous experiments (a code sketch of both analyses is given at the end of this subsection).

Table 7 shows the performance when we use only one feature group. The rows are ordered by AUC score on action prediction, and the most important feature groups are highlighted with a triangle. For reply action prediction, "HistIndiv" and "HistPair" show the best performance compared to the other feature groups. Using only "HistIndiv" features results in an AUC score of 0.6924, which is close to the performance with all feature groups. These results suggest that historical interactions are important features for reply action prediction. "CPred" features (i.e., algorithmic predictions of requests, commitments and sentiment) are also important, although somewhat less so than the historical interaction features. However, for reply time prediction we see a different story. "Temporal" features are the most important features for predicting reply time, as highlighted in Table 7. Using only "Temporal" features results in a latency prediction accuracy of 0.4261, which is only slightly worse than the result from combining all feature groups. "HistIndiv" features, which result in an accuracy above 0.40, are also helpful in predicting latency. For reply action, historical interaction features are the most important in indicating whether people will reply to the email eventually, no matter the time latency. Given that they will reply to an email, people seem to strongly prefer to reply during office hours on workdays, which explains why "Temporal" factors are so important for reply time prediction.

Another way of looking at the importance of features is to remove one group and look at the decrease in performance. Table 8 shows the results of removing one feature group. Performance decreases the most when we remove "HistIndiv" features for reply action prediction and "Temporal" features for reply time prediction. These results are consistent with the results when we only use one feature group. We also found that some features are not very useful for reply behavior prediction. For instance, when we remove the "Meta" features, which are derived from email attachments, both F-1 and accuracy increase slightly for reply time prediction. This suggests that there is still room for feature selection to further improve the performance of reply behavior prediction. We leave the study of feature selection to future work.

Table 7: Comparison of performance on predicting reply time and reply action when only one feature group is used. The learning model is AdaBoost. A triangle (▲) indicates strong performance when only that feature group is used. The rows are sorted by AUC score.

| Feature Set | Action AUC | Time Prec | Time Rec | Time F-1 | Time Accuracy |
| HistIndiv ▲ | .6924      | .3642     | .4104    | .3891    | .4104         |
| HistPair    | .6382      | .3463     | .3890    | .3721    | .3890         |
| CPred       | .5954      | .3352     | .3784    | .3748    | .3784         |
| User        | .5944      | .3575     | .3847    | .3729    | .3847         |
| Address     | .5912      | .3790     | .3641    | .3038    | .3641         |
| Temporal ▲  | .5401      | .4436     | .4261    | .4264    | .4261         |
| CProp       | .5346      | .3060     | .3785    | .3415    | .3785         |
| MetaAdded   | .5291      | .2524     | .3877    | .2672    | .3877         |
| Meta        | .5247      | .2398     | .3670    | .2866    | .3670         |
| BOW         | .5106      | .3391     | .3976    | .3744    | .3976         |
| AllFeat     | .7208      | .4591     | .4591    | .4561    | .4476         |

Table 8: Comparison of performance on predicting reply time and reply action when one feature group is removed. The learning model is AdaBoost. A down triangle (▼) indicates a large drop in performance when that feature group is removed. The rows are sorted by AUC score.

| Feature Set  | Action AUC | Time Prec | Time Rec | Time F-1 | Time Accuracy |
| -HistIndiv ▼ | .6620      | .4409     | .4481    | .4453    | .4481         |
| -CProp       | .7112      | .4413     | .4543    | .4472    | .4543         |
| -Address     | .7187      | .4476     | .4599    | .4550    | .4599         |
| -MetaAdded   | .7191      | .4449     | .4579    | .4549    | .4579         |
| -HistPair    | .7198      | .4432     | .4572    | .4544    | .4572         |
| -Meta        | .7216      | .4515     | .4604    | .4573    | .4604         |
| -Temporal ▼  | .7218      | .3800     | .4056    | .3841    | .4056         |
| -CPred       | .7229      | .4457     | .4611    | .4540    | .4611         |
| -BOW         | .7237      | .4503     | .4573    | .4539    | .4573         |
| -User        | .7256      | .4431     | .4482    | .4473    | .4482         |
| AllFeat      | .7208      | .4591     | .4591    | .4561    | .4476         |

Importance of Individual Features. AdaBoost [31] provides a mechanism for reporting the relative importance of each feature. By analyzing the relative importances, we gain insights into the role of individual features in the different email reply prediction tasks. Table 9 shows the most important features for predicting reply time and reply action, with the relative feature importances learned by AdaBoost. The most important features for reply action prediction are historical interaction features including "HistReplyCountRecipientMax", "HistSentEmailCountSender" and "HistReceiveEmailSTRecipientMin", content properties like the lengths of email subjects and email bodies, and address features like "NumOfReceivers" and "IsInternalExternal". On the other hand, temporal features like "TimeOfDay1Morning", "IsWeekDay", "DayOfWeek1Sunday" and "TimeOfDay4Night" are among the most important features for reply time prediction. These interesting differences are consistent with the results of the feature group analysis. Some features, including historical interaction features and content properties like the length of email bodies, are important for both reply action prediction and reply time prediction.
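Reading the per-feature importances out of a fitted AdaBoost model and rescaling them so the top feature has importance 1.0, as in Table 9, is straightforward with scikit-learn; `feature_names` is assumed to align with the columns of the training matrix:

```python
import numpy as np

def relative_importances(fitted_ada, feature_names, top_k=15):
    """Return the top_k (feature, relative importance) pairs from a
    fitted AdaBoostClassifier, rescaled so the maximum is 1.0."""
    imp = fitted_ada.feature_importances_   # impurity-based importances
    imp = imp / imp.max()                   # top feature becomes 1.0
    order = np.argsort(imp)[::-1][:top_k]
    return [(feature_names[i], round(float(imp[i]), 4)) for i in order]
```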

Table 9: The most important features for predicting reply time and reply action, with relative feature importances in AdaBoost.

Reply Action Prediction:

| Rank | Feature Name                   | Group     | Importance |
| 1    | EmailSubjectLen                | CProp     | 1.0000     |
| 2    | HistReplyCountRecipientMax     | HistIndiv | 0.5714     |
| 3    | HistSentEmailCountSender       | HistIndiv | 0.4286     |
| 4    | HistReceiveEmailSTRecipientMin | HistIndiv | 0.4286     |
| 5    | NumOfReceivers                 | Address   | 0.4286     |
| 6    | EmailBodyLen                   | CProp     | 0.4286     |
| 7    | HistLocalMeanRTMin             | HistPair  | 0.2857     |
| 8    | IsInternalExternal             | Address   | 0.2857     |
| 9    | UserJobTitleSender             | User      | 0.2857     |
| 10   | HistLocalReplyCountMin         | HistPair  | 0.2857     |
| 11   | NumOfAttachment                | Meta      | 0.2857     |
| 12   | HistReceiveEmailCCSender       | HistIndiv | 0.1429     |
| 13   | HistRTMeanSender               | HistIndiv | 0.1429     |
| 14   | IsSensitivity                  | MetaAdded | 0.1429     |
| 15   | HistLocalReplyCountMax         | HistPair  | 0.1429     |

Reply Time Prediction:

| Rank | Feature Name               | Group     | Importance |
| 1    | TimeOfDay1Morning          | Temporal  | 1.0000     |
| 2    | IsWeekDay                  | Temporal  | 0.8083     |
| 3    | DayOfWeek1Sunday           | Temporal  | 0.4946     |
| 4    | EmailBodyLen               | CProp     | 0.3260     |
| 5    | TimeOfDay4Night            | Temporal  | 0.3052     |
| 6    | HistRTMedianRecipientAvg   | HistIndiv | 0.1690     |
| 7    | HistRTMedianRecipientMin   | HistIndiv | 0.1563     |
| 8    | HistLocalReplyCountMax     | HistPair  | 0.1125     |
| 9    | HistReceiveEmailSTSender   | HistIndiv | 0.1101     |
| 10   | IsPriority                 | MetaAdded | 0.0951     |
| 11   | IsWeekEnd                  | Temporal  | 0.0909     |
| 12   | HistLocalMedianRTMin       | HistPair  | 0.0907     |
| 13   | HistReceiveEmailCCSender   | HistIndiv | 0.0858     |
| 14   | HistReplyCountRecipientAvg | HistIndiv | 0.0759     |
| 15   | HistRTMeanRecipientAvg     | HistIndiv | 0.0674     |

6 CONCLUSIONS AND FUTURE WORK

In this paper, we introduce and formalize the task of reply behavior prediction in a corporate email setting, using the publicly available Avocado collection. We characterize various factors affecting email reply behavior, showing that temporal features (time of day and day of week), content properties (such as the lengths of email subjects and email bodies) and prior interactions between the sender and recipients are related to reply behavior. We use these insights to extract 10 classes of feature groups and build models to predict whether an email will be responded to and how long it will take to do so. We show that the proposed methods outperform all baselines with large gains. We further show that temporal features, textual content properties, and historical interaction features are especially important in predicting reply behavior.

Our research represents an initial effort to understand email actions in a corporate setting. We examined email reply behavior in detail in one technology company, but it is unclear how representative the company is. It is important to see how our findings generalize to different industry sectors and to employees from different demographic backgrounds. Future work will consider more available email collections and more features that could be signals for user reply behavior prediction. Of special interest is the use of richer content features such as lexical, syntactic and semantic features. We have shared the IDs of the emails that we consider in our research so that others can extend our research on the Avocado collection.

7 ACKNOWLEDGMENTS

This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS-1419693. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.

REFERENCES

[1] Douglas Aberdeen, Ondrej Pacovsky, and Andrew Slater. The Learning behind Gmail Priority Inbox. In NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds.
[2] Victoria Bellotti, Nicolas Ducheneaut, Mark Howard, and Ian Smith. Taking Email to Task: The Design and Evaluation of a Task Management Centered Email Tool. In CHI '03.
[3] Paul N. Bennett and Jaime Carbonell. Detecting Action-items in E-mail. In SIGIR '05.
[4] Vitor R. Carvalho and William W. Cohen. On the Collective Classification of Email "Speech Acts". In SIGIR '05.
[5] Marta E. Cecchinato, Abigail Sellen, Milad Shokouhi, and Gavin Smyth. Finding Email in a Multi-Account, Multi-Device World. In CHI '16.
[6] Michael Chui, James Manyika, Jacques Bughin, Richard Dobbs, Charles Roxburgh, Hugo Sarrazin, Geoffrey Sands, and Magdalena Westergren. 2012. The social economy: Unlocking value and productivity through social technologies. A report by McKinsey Global Institute.
[7] William W. Cohen, Vitor R. Carvalho, and Tom M. Mitchell. Learning to Classify Email into Speech Acts. In EMNLP '04.
[8] Simon Corston-Oliver, Eric Ringger, Michael Gamon, and Richard Campbell. Integration of Email and Task Lists. In First Conference on Email and Anti-Spam.
[9] Laura A. Dabbish and Robert E. Kraut. Email Overload at Work: An Analysis of Factors Associated with Email Strain. In CSCW '06.
[10] Laura A. Dabbish, Robert E. Kraut, Susan Fussell, and Sara Kiesler. Understanding Email Use: Predicting Action on a Message. In CHI '05.
[11] Dotan Di Castro, Zohar Karnin, Liane Lewin-Eytan, and Yoelle Maarek. You've Got Mail, and Here is What You Could Do With It!: Analyzing and Predicting Actions on Email Messages. In WSDM '16.
[12] Danyel Fisher, A. J. Brush, Eric Gleave, and Marc A. Smith. Revisiting Whittaker & Sidner's "Email Overload" Ten Years Later. In CSCW '06.
[13] Michael Freed, Jaime G. Carbonell, Geoffrey J. Gordon, Jordan Hayes, Brad A. Myers, Daniel P. Siewiorek, Stephen F. Smith, Aaron Steinfeld, and Anthony Tomasic. RADAR: A Personal Assistant that Learns to Reduce Email Overload. In AAAI '08.
[14] David Graus, David van Dijk, Manos Tsagkias, Wouter Weerkamp, and Maarten de Rijke. Recipient Recommendation in Enterprises Using Communication Graphs and Email Content. In SIGIR '14.
[15] Mihajlo Grbovic, Guy Halawi, Zohar Karnin, and Yoelle Maarek. How Many Folders Do You Really Need?: Classifying Email into a Handful of Categories. In CIKM '14.
[16] Ido Guy, Michal Jacovi, Noga Meshulam, Inbal Ronen, and Elad Shahar. Public vs. Private: Comparing Public Social Network Information with Email. In CSCW '08.
[17] Minqing Hu and Bing Liu. Mining and Summarizing Customer Reviews. In KDD '04.
[18] Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, and Vivek Ramavajjala. Smart Reply: Automated Response Suggestion for Email. In KDD '16.
[19] Bryan Klimt and Yiming Yang. The Enron Corpus: A New Dataset for Email Classification Research. In ECML '04.
[20] Farshad Kooti, Luca Maria Aiello, Mihajlo Grbovic, Kristina Lerman, and Amin Mantrach. Evolution of Conversations in the Age of Email Overload. In WWW '15.
[21] Andrew Lampert, Robert Dale, and Cecile Paris. Detecting Emails Containing Requests for Action. In HLT '10.
[22] Carman Neustaedter, A. J. Bernheim Brush, and Marc A. Smith. Beyond "From" and "Received": Exploring the Dynamics of Email Triage. In CHI EA '05.
[23] Byung-Won On, Ee-Peng Lim, Jing Jiang, Amruta Purandare, and Loo-Nin Teow. Mining Interaction Behaviors for Email Reply Order Prediction. In ASONAM '10.
[24] Ashequl Qadir, Michael Gamon, Patrick Pantel, and Ahmed Hassan Awadallah. Activity Modeling in Email. In NAACL-HLT '16.
[25] S. Radicati. 2014. Email statistics report, 2014-2018.
[26] Maya Sappelli, Gabriella Pasi, Suzan Verberne, Maaike de Boer, and Wessel Kraaij. 2016. Assessing E-mail intent and tasks in E-mail messages. Inf. Sci. 358-359 (2016), 1-17.
[27] Simon H. Corston-Oliver, Eric Ringger, Michael Gamon, and Richard Campbell. Task-focused Summarization of Email. In ACL '04.
[28] Joshua R. Tyler and John C. Tang. When Can I Expect an Email Response? A Study of Rhythms in Email Usage. In ECSCW '03.
[29] Steve Whittaker and Candace Sidner. Email Overload: Exploring Personal Information Management of Email. In CHI '96.
[30] Yiming Yang and Xin Liu. A Re-examination of Text Categorization Methods. In SIGIR '99.
[31] Ji Zhu, Hui Zou, Saharon Rosset, and Trevor Hastie. 2009. Multi-class AdaBoost. Statistics and Its Interface (2009).
