MboxExtractor API Documentation
class tidyextractors.tidymbox.MboxExtractor(source, auto_extract=True, *args, **kwargs)
The MboxExtractor class is for extracting data from local Mbox files. This class has methods for outputting data into the emails and sends tidy formats, and a raw untidy format.
Parameters:
- source (str) – The path to either a single mbox file or a directory containing multiple mbox files.
- auto_extract (bool) – Defaults to True. If True, data is extracted automatically. Otherwise, extraction must be initiated through the internal interface.
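A minimal usage sketch, assuming the tidyextractors package is installed and that the signatures behave as documented on this page. The helper name load_mbox_tables and the path in the usage note are hypothetical, not part of the library.

```python
def load_mbox_tables(source):
    """Return the three outputs documented on this page for one mbox source.

    `source` is a path to a single mbox file or to a directory of mbox files.
    """
    # Imported inside the function so the sketch can be defined even where
    # the tidyextractors package is not installed.
    from tidyextractors.tidymbox import MboxExtractor

    # auto_extract defaults to True, so data is extracted on construction.
    mx = MboxExtractor(source)

    return {
        "emails": mx.emails(),  # one row per message
        "sends": mx.sends(),    # one row per sender/recipient pair
        "raw": mx.raw(),        # data as stored internally
    }
```

A call such as load_mbox_tables("path/to/archive.mbox") (placeholder path) would then give access to all three tables through one dictionary.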
emails(drop_collections=True)
Returns a table of mbox message data, with “messages” as rows/observations.
Parameters: drop_collections (bool) – Defaults to True. Indicates whether columns with lists/dicts/sets will be dropped.
Returns: pandas.DataFrame
raw(drop_collections=False)
Produces the extractor object’s data as it is stored internally.
Parameters: drop_collections (bool) – Defaults to False. Indicates whether columns with lists/dicts/sets will be dropped.
Returns: pandas.DataFrame
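To illustrate what dropping collection columns means, here is a plain-Python analogue that works on a list of dicts rather than a pandas.DataFrame. It is a sketch of the idea only, not the library's implementation; the helper drop_collection_columns and the field names in the sample records are hypothetical.

```python
COLLECTION_TYPES = (list, dict, set)


def drop_collection_columns(rows):
    """Remove every key that holds a list, dict, or set in at least one row."""
    collection_keys = {
        key
        for row in rows
        for key, value in row.items()
        if isinstance(value, COLLECTION_TYPES)
    }
    # Build new dicts so the input rows are left unmodified.
    return [
        {key: value for key, value in row.items() if key not in collection_keys}
        for row in rows
    ]


# Hypothetical records; the field names are illustrative only.
records = [
    {"Subject": "Hello", "To": ["bob@example.com", "cy@example.com"]},
    {"Subject": "Re: Hello", "To": ["ann@example.com"]},
]
tidy = drop_collection_columns(records)
```

Under this reading, emails() with its default of True corresponds to the filtered result, while raw() with its default of False corresponds to keeping every field.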
sends()
Returns a table of mbox message data, with “sender/recipient” pairs as rows/observations.
Note
Rows may have a recipient from either “TO” or “CC”. The SendType column specifies this for each row.
Note
drop_collections is not available for this method, since there are no meaningful collections to keep.
Returns: pandas.DataFrame
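The shape of the sender/recipient table can be sketched with the standard library alone. The following is an illustration of the one-row-per-pair idea, not the library's implementation; the function sender_recipient_rows and the keys "sender" and "recipient" are assumptions, and only the SendType name and its “TO”/“CC” values come from the description above.

```python
from email.message import EmailMessage
from email.utils import getaddresses, parseaddr


def sender_recipient_rows(msg):
    """Expand one message into one dict per sender/recipient pair."""
    sender = parseaddr(str(msg.get("From", "")))[1]
    rows = []
    for header, send_type in (("To", "TO"), ("Cc", "CC")):
        # get_all returns every occurrence of the header; getaddresses
        # splits them into (display name, address) tuples.
        values = [str(v) for v in msg.get_all(header, [])]
        for _name, address in getaddresses(values):
            rows.append(
                {"sender": sender, "recipient": address, "SendType": send_type}
            )
    return rows


# Example message built in memory; the addresses are placeholders.
message = EmailMessage()
message["From"] = "Ann <ann@example.com>"
message["To"] = "bob@example.com, Cy <cy@example.com>"
message["Cc"] = "dee@example.com"
pairs = sender_recipient_rows(message)
```

A message with two “TO” recipients and one “CC” recipient therefore yields three rows, which is why no list-valued columns remain to be dropped.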
Note
This submodule’s internals were adapted from Phil Deutsch’s mbox-to-pandas script with his permission.