Backing Up Gmail Mailbox with Labels to Desktop PC (or Laptop)

Overview

I was using Outlook Express 6 with POP3 to back up my Gmail mailbox. A serious problem was that Outlook Express did not show the Gmail labels. I also ran into some unusual problems with Outlook Express as my Gmail mailbox approached 2 GB, even though there was enough free space on the disk drive of the PC where Outlook Express was storing the mail. Maybe the problem was that I was using an old Outlook Express version with the old POP3 protocol. A friend reported that he uses Outlook 2013 with IMAP to download mail from his Gmail mailbox and is able to see the Gmail labels.

Anyway, I decided to explore options other than Outlook Express 6 and POP3 to back up my Gmail mailbox.

Google Takeout looked promising, but it does not handle the Gmail service as of now. Sharing my Gmail password (even an application-specific password) with unknown or not-well-known Gmail backup software seemed somewhat risky, so that ruled out such software.

Eventually I decided to seriously try Mozilla Thunderbird with IMAP, even though I was informed that work on Thunderbird had recently been stopped. I guess it should be fine for a few years even without further development.

It worked like a charm (so far). Gmail labels are mapped to folders in Thunderbird. The mail data is stored in one of the mbox family formats, and is a text format. Each folder (label) has a separate data file.

Thunderbird does not seem to back up the Contacts. I used Google Takeout to back up Contacts.

If you would like to know the details of using Thunderbird to back up a Gmail mailbox, you may read on.

Details

BTW I am on Windows XP now (plan to move to Windows 7 in a month or two).

Mozilla Thunderbird for Windows can be downloaded (for free) from: https://www.mozilla.org/en-US/thunderbird/.

For setting up IMAP a/c in Thunderbird for my gmail a/c I primarily followed instructions from gmail help, “Get started with IMAP and POP3”: https://support.google.com/mail/troubleshooter/1668960?hl=en&ref_topic=3026306. At times I also referred to Thunderbird help, “Thunderbird and Gmail”: https://support.mozillamessaging.com/en-US/kb/thunderbird-and-gmail.

Thunderbird set the imap and smtp server (automatically at some stage in the account creation process) as googlemail something. I changed it to conform to the above referenced gmail help instructions (specific part copy-pasted below)

Incoming Mail (IMAP) Server – Requires SSL

imap.gmail.com
Port: 993
Requires SSL: Yes

Outgoing Mail (SMTP) Server – Requires TLS

smtp.gmail.com
Port: 465 or 587
Requires SSL: Yes

— end copy-paste —
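
As a rough illustration of what these IMAP settings mean in practice, here is a minimal Python sketch that connects to the same server and port over SSL and lists the folders (labels) that Gmail exposes. It is not what Thunderbird does internally. The function names parse_list_line and list_labels are my own, the sketch assumes Python 3.9 or later, and it assumes Gmail returns each folder name on a single LIST line with "/" as the hierarchy delimiter.

```python
import imaplib
import re

IMAP_HOST = "imap.gmail.com"  # from the Gmail help settings quoted above
IMAP_PORT = 993               # IMAP over SSL

# Matches one IMAP LIST response line, for example:
#   (\HasNoChildren) "/" "Personal/Travel"
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) "(?P<delim>[^"]*)" (?P<name>.*)')


def parse_list_line(line: bytes) -> str:
    """Return the folder (label) name from one LIST response line."""
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised LIST line: {line!r}")
    # Folder names are sent as ASCII (non-ASCII names use modified UTF-7,
    # which this sketch does not decode further).
    return match.group("name").decode("ascii").strip('"')


def list_labels(user: str, app_password: str) -> list[str]:
    """Log in over SSL and return the folder names Gmail exposes via IMAP."""
    with imaplib.IMAP4_SSL(IMAP_HOST, IMAP_PORT) as conn:
        conn.login(user, app_password)
        status, lines = conn.list()
        if status != "OK":
            raise RuntimeError("LIST command failed")
        return [parse_list_line(line) for line in lines]
```

Calling list_labels("you@gmail.com", "your-app-password") (both placeholder values) needs a network connection and IMAP enabled in the Gmail settings; parse_list_line can be tried offline on a sample line.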

Another important point is that I created an application-specific password in Gmail for Thunderbird to access my Gmail a/c, and gave this password to Thunderbird instead of my regular Gmail password. To create such an application-specific password, click on the drop-down icon next to your email id shown at the top right of the Gmail window. In the drop-down box click on Account. Then go to Security -> Manage Access (under ‘Connected Applications and sites’). Now you will be on the Application-specific passwords page. You may follow the instructions on it to create a password for your mail client, such as Thunderbird or Outlook (Express).

One fleeting concern I had is that the default settings in Thunderbird for the account created to access Gmail have ‘Authentication method’ as ‘Normal password’ with ‘Connection security’ as ‘SSL/TLS’ (this can be seen under Account Settings -> Server Settings). I went with the default, which should be okay: with SSL/TLS chosen, the whole connection to the server is encrypted, so the password travels inside the encrypted channel rather than in the clear over the public Internet. This is similar to how the password typed into the Gmail login page is protected by the browser's encrypted (HTTPS) connection. So I guess it is not a matter of concern. [There are other options like ‘Encrypted password’ and ‘Kerberos …’ for ‘Authentication method’, but perhaps those are mainly useful if ‘Connection security’ is not ‘SSL/TLS’ (e.g. ‘None’).]

Then I clicked on some ‘test’ button on a page in the Account creation process in Thunderbird. It seemed to show that the test worked by showing some details of the connection settings. I don’t recall the details now.

Either at this point or earlier I wondered where on the local disk the mail retrieved from Gmail using IMAP would be kept. ‘Account Settings’ -> ‘Server Settings’ for the Gmail a/c entry created in Thunderbird has a ‘Local directory’ field at the bottom of the page, which by default was set to a Thunderbird-related folder on the system drive (C:). I changed that to another folder on a non-system drive. The ‘Account Settings’ page lists another entry called ‘Local Folders’. Clicking on it shows another ‘Local directory’ field, which was also mapped to a Thunderbird-related folder on the system drive. I changed that folder to one on a non-system drive as well. [I changed the ‘Local directory’ field of ‘Local Folders’ first, and then the one for the Gmail a/c.]

Once all this configuration was done, without me initiating any message download Thunderbird automatically started downloading info. from my gmail a/c. Initially it gave a message that it is downloading the message headers and showed all the gmail folders (system ones and the user created ‘label’ folders) very quickly! Then it started downloading the message details and was giving some info. about it on the status bar.

It has two files on local disk related to each folder/label – one has the same name as the folder/label but without an extension and another has the same name with extension msf. The file whose filename is without an extension seems to be the data file as its size seems to be reflective of the message data downloaded. The download process is quite slow.
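
To see how much has been downloaded so far, the pairing described above (a data file without an extension next to a .msf file of the same name) can be checked with a small script. This is only a sketch based on what I observed in the ‘Local directory’ folder; the function name find_mail_files is my own, and the folder path you pass in is whatever you set as ‘Local directory’.

```python
from pathlib import Path


def find_mail_files(local_dir):
    """Map each mail data file under local_dir to its size in bytes.

    A file is treated as a data file when it has no ".msf" suffix and a
    sibling "<name>.msf" file exists next to it.
    """
    sizes = {}
    # rglob also descends into sub-directories of the account folder.
    for msf in Path(local_dir).rglob("*.msf"):
        data_file = msf.with_suffix("")  # "INBOX.msf" -> "INBOX"
        if data_file.is_file():
            sizes[data_file] = data_file.stat().st_size
    return sizes
```

A folder whose .msf file exists but whose data file has not been created yet is simply left out of the result.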

The format of the data file (filename not having an extension): From https://en.wikipedia.org/wiki/Mbox, “The Mozilla family of MUAs (Mozilla, Netscape, Thunderbird, et al.) use an mboxrd variation with more complex From line quoting rules.”

mboxrd is part of a family of formats known as mbox. From the same wiki, https://en.wikipedia.org/wiki/Mbox:
“mbox is a generic term for a family of related file formats used for holding collections of electronic mail messages. All messages in an mbox mailbox are concatenated and stored as plain text in a single file. The beginning of each message is indicated by a line whose first five characters consist of “From” followed by a space (the so named “From_ line” or “‘From ‘ line” or simply “From line”) and the sender’s e-mail address. A blank line is appended to the end of each message. For a while, the mbox format was popular because text processing tools can be readily used on the plain text files used to store the e-mail messages.”
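
Because the data files are in the mbox family, they can also be read outside Thunderbird. The sketch below uses the mailbox module from the Python standard library to list the sender and subject of every message in one data file. The function name list_subjects is my own. One caveat: as far as I understand, Python's mbox reader does not undo the ‘>From ’ quoting in message bodies, so body lines that were quoted stay quoted.

```python
import mailbox


def list_subjects(mbox_path):
    """Return a (sender, subject) pair for every message in an mbox file."""
    # create=False: raise an error instead of creating an empty file
    # when the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [(msg.get("From", ""), msg.get("Subject", "")) for msg in box]
    finally:
        box.close()
```

It is safer to run this on a copy of the data file, or with Thunderbird closed, so that the file is not being written to while it is read.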

Some useful links regarding the mailbox files:
http://www.z-a-recovery.com/thunderbird-email-database.htm
http://kb.mozillazine.org/Importing_and_exporting_your_mail

Regarding messages associated with two or more labels in gmail: From https://support.mozillamessaging.com/en-US/kb/thunderbird-and-gmail,
‘Note that a message can have multiple labels (for instance, “Personal”, “Travel”, “All Mail” and “Starred”). In this case, a copy of this message will be downloaded and displayed in all the corresponding Thunderbird’s folders.’
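
Since a message with several labels is stored once per folder, the same mail appears in more than one data file. A rough way to see this is to group messages by their Message-ID header across the data files, as in the Python sketch below. The function name folders_by_message_id is my own, and the sketch assumes every copy of a message carries the same Message-ID header.

```python
import mailbox
from collections import defaultdict


def folders_by_message_id(mbox_paths):
    """Map each Message-ID to the set of mbox files that contain it."""
    seen = defaultdict(set)
    for path in mbox_paths:
        box = mailbox.mbox(path, create=False)
        try:
            for msg in box:
                msg_id = msg.get("Message-ID")
                if msg_id:  # skip messages without the header
                    seen[msg_id.strip()].add(str(path))
        finally:
            box.close()
    return dict(seen)
```

Any Message-ID whose set has more than one entry is a message that was downloaded into several folders.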

Thunderbird help notes that BCC message recipients are not listed in the mail downloaded to Thunderbird via IMAP from gmail, https://support.mozillamessaging.com/en-US/kb/thunderbird-and-gmail:
‘When you are using a mail client (such as Thunderbird, Outlook, Entourage, etc) with a Gmail account and your account is configured to synchronize mail using the IMAP protocol, you cannot see BCC message recipients when you look at the message in the Sent Mail folder.’

There is a Tools -> Activity Manager command which opens up a window showing progress bars for download for each folder/label.

The downloading/sync of messages stopped in Thunderbird at some stage perhaps due to me closing Thunderbird and then re-opening it. Later it was automatically checking only for new messages.

But the File -> Offline -> Download/Sync Now command initiates the download/sync of old messages too. Download/Sync had to be re-initiated some 2 or 3 times, but eventually all mail seems to have been downloaded. It took a fair amount of time spread over 2 working days, though not continuous usage (my Gmail mailbox size was around 2 GB). Download/Sync goes through each folder, downloads messages that are new or changed, and then stops showing any status messages, indicating that the download is complete. As a double-check that all mail had been downloaded, I repeated Download/Sync after all messages had been downloaded. It went through all folders without downloading any messages (as there were no new/changed messages), the last folder reported in the status bar being Trash, and then stopped showing any status messages.

Concluding Words

I am quite satisfied with the Thunderbird backup of my Gmail mailbox. I have got my Gmail labels in the backup. The mail data file for each folder can be read in a text editor! Yes, there are a lot of headers and formatting details which make it difficult to read, but you can still read the mail content and even search for a string in the data file if Thunderbird is not available! Easy conversion/import to other mail clients may also be possible, as the mail data format is an open one. That is a significantly different situation from Outlook Express’ proprietary .dbx mail data files.
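
As a small example of searching a data file without Thunderbird, the Python sketch below scans one file line by line for a string, ignoring case. The function name grep_mbox is my own. It only finds text that is stored as plain text; message parts stored in base64 or quoted-printable encoding will not match.

```python
def grep_mbox(mbox_path, needle):
    """Yield (line_number, line) for each line that contains needle."""
    needle = needle.lower()
    # errors="replace" keeps the scan going if some bytes are not valid UTF-8.
    with open(mbox_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line.lower():
                yield number, line.rstrip("\n")
```

For example, list(grep_mbox(path_to_data_file, "invoice")) collects every matching line together with its line number, where path_to_data_file stands for one of the extensionless data files.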
