Blogs by Filip Hroch

Idle Synchronisation

A description of a homemade control utility for synchronising e-mails between a remote IMAP server and a local maildir over a link with limited network capacity.

Having e-mail as my principal communication channel on a slow network connection, I implemented an uncommon e-mail handling workflow around a utility called Idlesync. Idlesync calls mbsync to synchronise e-mails, mu to build a maildir database index, and gdbus to notify the user. It is an assistant service for comfortable use of mu4e, the e-mail client.

The functionality is designed to be highly efficient. Notification of new messages is delayed by only a few seconds, synchronisation is executed when the state of a mailbox is expected to have changed, and all operations are designed to keep network traffic as low as possible while remaining as helpful as possible.

I was forced to change my e-mail workflow by the increasing traffic during last year's epidemic lock-down, when I moved from a regular IMAP client, Evolution, to the workflow described here. The approach significantly helps me to master the flood of e-mails.

The critical point was real-time notifications. My desktop environment is Gnome, where the favourite application is mailnag, which is written in Python and consumes too many system resources. Therefore, I looked for a tool, written in C/C++, which implements notification via the IDLE mechanism of the IMAP protocol.

E-Mail synchronisation strategy

My strategy is to synchronise only a few mailboxes regularly. Inbox, Sent, and similar mailboxes are expected to be updated often. All the remaining mailboxes are synchronised once per day at low-traffic hours. The frequent synchronisation takes a few seconds; the full synchronisation can take a minute.

The key to the strategy is knowing which mailboxes are used frequently and which are modified only occasionally. The success of the approach can be measured by checking whether a synchronised mailbox actually contained a modification (a new e-mail, or a new flag). Synchronisations which transfer nothing are useless.

In principle, this can be solved by opening all mailboxes and checking for changes, but that puts a heavy load on the server. Splitting mailboxes into groups which are synchronised with different periods is good for both server and network load.

My personal preference is materialised in the .mbsyncrc file below. I have defined two groups of channels:

Channel Quick
Master :Remote:
Slave :Local:
Patterns "INBOX" "Sent" "Trash" "Spam" "mailing-lists*" ...

Channel Full
Master :Remote:
Slave :Local:
Patterns .. all mailboxes...

A synchronisation commonly takes up to two seconds for the Inbox and Quick channels and, say, half a minute for the Full channel.

The Full channel contains e-mail archives made by me. My habit is to archive the Inbox, and possibly the Sent, folder(s) once per year into a new folder like Inbox2021. These mailboxes never change, so even an update once per day is perhaps too frequent.

Periodic Notifications

I created a systemd timer (only the essential part is presented) for periodic mbsync-ing:

[Unit]
Description=Mailbox synchronisation timer

[Timer]
OnBootSec=3m
Unit=mbsync.service
AccuracySec=1s
RandomizedDelaySec=666

# working hours, 8..19, every 20 min
OnCalendar=Mon..Fri *-*-* 08..18:07,27,47:07

# working days, evening, one per hour
OnCalendar=Mon..Fri *-*-* 19..23:37:07

# weekends, only 8-23, one per hour
OnCalendar=Sat,Sun *-*-* 08..23:37:07

[Install]
WantedBy=timers.target

This timer triggered a script which calls mbsync with a parameter selecting a full or a partial update of the mailbox folders, based on the time of day.
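The selection by time of day could be sketched as below. This is only an illustration, not the original script: the helper name select_channel and the FULL_HOUR variable (with its default of 23) are assumptions, while the channel names Full and Quick come from the .mbsyncrc shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: choose the mbsync channel by the hour of day.
# FULL_HOUR is an assumption of this sketch, not a value from the article.
FULL_HOUR=${FULL_HOUR:-23}

# select_channel HOUR  ->  prints "Full" or "Quick"
select_channel() {
    hour=${1#0}    # strip one leading zero ("08" -> "8") for a portable comparison
    if [ "$hour" -eq "$FULL_HOUR" ]; then
        echo Full
    else
        echo Quick
    fi
}

# Intended use inside the timer-triggered script:
#   mbsync "$(select_channel "$(date +%H)")"
```

Keeping the decision in one small function makes it easy to move the full synchronisation to another hour without touching the timer unit.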

The script was supplemented with a notification utility implemented in shell (the essential part shown here should be extended to detect recent e-mails only).

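# number of mu processes already running for this user
RUNNING=$(pgrep -c -u "$USER" '^mu$')
# index only if no other mu process is running
if [ "$RUNNING" -eq 0 ]; then
    mu index --quiet
fi
# sender and subject of new messages in Inbox, sorted by date
MSG=$(mu find -f "f s" -s date "maildir:/Inbox and flag:new")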
N=$(echo "$MSG" | wc -l | cut -f 1 -d ' ')
SUMMARY="You have $N new e-mail(s)"
notify-send --icon=mail-unread -c e-mail.arrived \
             "$SUMMARY" "$MSG"
paplay e-mail.wav

The disadvantage of this approach is the long delay between updates (20 minutes on working days), which is inconvenient. I also need to remove older messages from some mailboxes (spam, mailing lists, etc.) periodically, which requires another tool (a script).

Idle synchronisation

The delayed notification was the principal reason why I started using the IDLE command of the IMAP4rev1 protocol.

The IMAP protocol is normally client-driven: a client sends a request and gets a response. Frequent polling is possible, but it causes both system and network load.

Modern versions of IMAP (since 1996) offer the IDLE command, which notifies clients about recent changes in mailboxes in real time. A client issues the IDLE command and waits; if a new message arrives in the Inbox, the server immediately responds to the client; the client finishes the active IDLE command and can run additional actions. The whole delivery and notification process takes a few seconds.
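As an illustration, the exchange defined by RFC 2177 is shown in the comments below, followed by a minimal sketch of how the untagged server lines could be classified. The function name idle_event and its output words are hypothetical; Idlesync itself does this in C++ on top of libcurl, and this is not its code.

```shell
#!/bin/sh
# A typical IDLE exchange (C: client, S: server), following RFC 2177:
#   C: A001 IDLE
#   S: + idling
#   S: * 4 EXISTS            <- the mailbox now holds 4 messages
#   C: DONE
#   S: A001 OK IDLE terminated

# idle_event LINE: classify one server line received while IDLE is active.
# Prints exists, expunge, fetch, or other (hypothetical labels).
idle_event() {
    line=$(printf '%s' "$1" | tr -d '\r')    # drop the CR of the CRLF line ending
    case "$line" in
        '* '[0-9]*' EXISTS')  echo exists ;;   # message count changed (new mail)
        '* '[0-9]*' EXPUNGE') echo expunge ;;  # a message was removed
        '* '[0-9]*' FETCH '*) echo fetch ;;    # flags of a message changed
        *)                    echo other ;;    # continuation, keep-alive, etc.
    esac
}
```

Any result other than "other" is a reason to send DONE and start a synchronisation of the Inbox.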

The idle synchronisation requires adding a new channel to .mbsyncrc:

Channel Inbox
Master :Remote:
Slave :Local:
Patterns "INBOX"

While resolving the principal drawback, I also improved the periodic updates. I set up three groups: the idle-synchronised Inbox; the fast-changing group, containing Spam, Sent, Trash, and mailing lists, which is updated every hour; and finally the slowly changing group, which is updated daily during low-traffic hours. The same time (daybreak) is also used for maintenance actions.

Trash maintenance

Another implemented idea is to archive deleted messages in the Trash mailbox, which is equipped with an auto-clean mechanism. Deleted messages are moved to Trash, where they stay for a month before they are expunged forever. This care can be important in cases when I change my mind after a premature deletion.

The approach requires supplementary setup on the side of the e-mail readers. Evolution keeps messages with the \Deleted flag in Inbox until an expunge is requested (the application is closed); deleted messages are shown in a virtual Trash. The “Trash as real folder” option is offered (as far as I understand) for moving the messages to the Trash folder, but it does not seem to work. Therefore, I implemented a background mechanism which moves messages with the \Deleted flag to the Trash folder and unsets the flag. This is done in imap.MoveToTrash() once per day.

On the other hand, mu4e by default moves deleted messages into the Trash mailbox with the Deleted flag set; the desired handling requires redefining the [d]elete shortcut to be a plain move to Trash (without setting the Deleted flag), see the discussion.

(fset 'my-move-to-trash "mT")
(define-key mu4e-headers-mode-map (kbd "d") 'my-move-to-trash)
(define-key mu4e-view-mode-map (kbd "d") 'my-move-to-trash)

With this setup, imap.DeleteObsolete() checks the folders for obsolete content (messages older than a month). These messages get the \Deleted flag set, and the IMAP server deletes them on the server side.
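On the protocol level, selecting obsolete messages amounts to an IMAP SEARCH with the BEFORE key, which compares the internal date of messages and takes a date in the dd-Mon-yyyy form. The sketch below only builds such a command; the helper names are hypothetical, GNU date is assumed, and it is not the code of imap.DeleteObsolete().

```shell
#!/bin/sh
# imap_before_date DAYS [REFERENCE]
# Print the date lying DAYS days before REFERENCE (YYYY-MM-DD, default: today)
# in the dd-Mon-yyyy form used by IMAP. Assumes GNU date.
imap_before_date() {
    ref=${2:-$(date -u +%Y-%m-%d)}
    # LC_ALL=C keeps the month abbreviation in English, as IMAP requires
    LC_ALL=C date -u -d "$ref $1 days ago" +%d-%b-%Y
}

# obsolete_search_command DAYS [REFERENCE]
# Print the IMAP command which selects messages older than DAYS days.
obsolete_search_command() {
    printf 'UID SEARCH BEFORE %s\n' "$(imap_before_date "$1" "$2")"
}

# Example: obsolete_search_command 30
# The UIDs returned by the server would then be marked with a command like
#   UID STORE <uids> +FLAGS (\Deleted)
```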

The local copy should be emptied as well, so the last step is the appropriate mbsync setup. The often recommended global Expunge option should definitely be switched off. Expunge should be activated only for the Trash folder, and possibly for folders which should be expunged on the auto-clean basis, like Spam or mailing lists.

Channel ExpungeObsolete
Master :IMAPServer:
Slave :Local:
Expunge Both
Patterns "Trash" "Spam" ...

The local Inbox contains the deleted messages in the meantime. To hide them, the common bookmark setup of mu4e should be replaced:

(add-to-list 'mu4e-bookmarks
          '( :name  "Inbox without deleted"
             :query "maildir:/Inbox and not flag:trashed"
             :key ?i))

The implementation

Support for the IDLE command in common libraries is limited, which led me to develop a small dedicated IMAP client class as a thin layer on top of libcurl.

Now the implementation is complete, but I have doubts whether libcurl was a good choice. All IMAP commands, except login, are re-implemented by myself. A direct implementation of the network client via sockets, supported by the gnutls library, may be a slightly better choice.

Idlesync is written in C++. In my opinion, classes and structured types make programming easy. Many lines are in pure C; I’m an amoral man mixing C and C++ without any inhibition. Even worse, I have no scruples about calling external utilities from a C++ executable. Efficiency does not matter here, and this way avoids dependency hell.

Unsorted notes:

  • the standard string implementation is really meretricious, as is the whole C++ standard library (if I compare it with Fortran),
  • the formatting capabilities of standard C++ streams make me a sad man,
  • the localisation facilities (wchar_t, wstring, wcerr, etc.) are pretty bewildering. I spent a week studying the mbstowcs() and iconv() functions, the exact implementation of UTF-8, and the whole i18n machinery. Those magnificent people who are able to implement correct localisation of large projects like Gnome or LaTeX have my admiration and respect,
  • decoding BASE64 strings is easier by calling the base64 utility, part of GNU coreutils, than by copying many lines of unfamiliar code,
  • notifications are issued by calling the gdbus utility. I encountered a bug there,
  • the current program listing can be found in idlesync.

The authentication

IMAP servers require authentication. The simplest way, writing the password into the program source, is efficient but insecure.

Therefore, authentication agents, like gpg-agent, are used. In Gnome, the Seahorse application maintains a key-ring. A regular login unlocks the stored passwords, and the command-line utility secret-tool can be used to reveal them. USERID is an identifier which can be obtained in Seahorse.

secret-tool lookup id USERID

Idlesync calls the command during start-up and sets the IDLESYNC_IMAP_PASSWORD environment variable. The password can later be passed to mbsync as the login credentials:

IMAPAccount Xyz
Host imap.xyz.uv
SSLType STARTTLS
User luser
PassCmd "echo $IDLESYNC_IMAP_PASSWORD"

It is also possible to get the password directly by mbsync with:

PassCmd "secret-tool lookup id USERID"

This way needs Gnome services to be still running during the shutdown sequence, which implies terminating Idlesync manually. See the D-bus section.

System(d) integration

Idlesync is a common executable; it can be started, or stopped, by hand. Moreover, integration into the systemd environment makes it a daemon, technically speaking.

Idlesync can be easily integrated into a graphical desktop environment with the capabilities offered by systemd. There’s an idlesync.service file which starts the service when the user logs in and sends SIGTERM when the user is about to log out. This is guaranteed by graphical-session.target, which is started by Gnome itself.

[Unit]
Description=IdleSync
Wants=graphical-session.target

[Service]
Type=simple
ExecStart=idlesync --debug \
       --host IMAP_SERVER --user USER --id USERID \
       --quick-folders 'Sent,Trash,Spam,...' \
       --expunge 'Trash,Spam,...' \
       --sound-file bell.wav \
       --hmin 10 --hmax 22

[Install]
WantedBy=default.target

The resourceful systemd architecture keeps a list of PIDs of child processes, so it’s easy to stop Idlesync properly during the logout sequence. Also, both the output and error streams are redirected to the system logs (available via journalctl [-f] --user-unit idlesync). The systemd approach significantly simplifies the development of daemon-like utilities; the hard work with forks, closing streams, or syslog functions goes away (man 7 daemon).

D-bus non-integration

The implementation issues actions sequentially in a loop. The control flow is not event-driven, which means that it cannot respond to the handy events issued by the Gnome infrastructure.

Idlesync is deaf to D-bus notifications of network (non-)availability. Any network interruptions are detected in the traditional way, as network timeouts. The IMAP connection is closed if a critical network error occurs; a new connection is attempted after approximately ten minutes.
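The reconnection policy can be illustrated by a small retry loop. This is a sketch under assumptions: the function retry_connect, its arguments, and the command idle_session are hypothetical, and the real logic lives inside the C++ program.

```shell
#!/bin/sh
# retry_connect DELAY MAX CMD [ARGS...]
# Run CMD until it succeeds, making at most MAX attempts and
# sleeping DELAY seconds between them. Returns 1 if all attempts fail.
retry_connect() {
    delay=$1
    max=$2
    shift 2
    attempt=1
    while ! "$@"; do
        # give up once the allowed number of attempts is used
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example with a ten minute pause (idle_session is a hypothetical command):
#   retry_connect 600 100 idle_session
```

A fixed pause keeps the traffic negligible while the link is down, at the price of noticing a restored connection with some delay.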

A shutdown brings the same difficulty. Idlesync is terminated by the SIGTERM or SIGINT signals. When a signal is caught, Idlesync runs a final (quick) synchronisation. Unfortunately, mbsync still needs credentials at that time, so the key-ring service would have to be running. This is not the case during Gnome logout; listening to D-bus events could help to solve it. That is the reason why the password is passed as an environment variable.

E-Mail sending

Sending e-mails is independent of the IMAP server. I configured Exim (the default Debian mailer) as a satellite site (smart-host) via SMTP over TLS with Submission (port 465); see the Exim page on the Debian wiki for details.

It is possible to use a smart SMTP client like msmtp. Some of them can block Emacs while a huge e-mail is being sent. The mail queue of Exim is the better way in such cases.

Caveats

Sending e-mails means uploading (potentially huge) messages twice: when they are sent, and then when they are synchronised. I have no idea how to eliminate the redundancy.

Conclusions

This article is a guide to living comfortably with huge mailboxes on a slow network line. Please remember that any e-mail handling is unique, as everybody is unique, so please modify and use Idlesync to improve your life.

References