Email Sync Slow

#1

For the last three days or so, syncing on Runbox 7 has been quite slow compared to how it was before.

Larry

#2

The difference in syncing comes from the change in the link below, which was released three days ago. Previously, every change (marking as read, moving to another folder, flagging, etc.) caused the message to be fully reloaded and reindexed. Now only the headers (subject, date, etc.) are reloaded, and only the changes are reindexed. So you should see fewer messages being indexed, while new messages (whose contents need to be loaded) may take a bit more time, since they need to be parsed to extract the text content. This change also does the full parsing and preloading of each new message so that it’s ready when you click on it. The parsed version is cached, so this happens only once per message. The previous version, where text content was retrieved in full for every change, used a different mechanism that only extracted text (see source snippet below). That mechanism is faster for the purpose of just extracting text, but less accurate than the full mailparser (https://github.com/nodemailer/mailparser), which can handle much more when it comes to content types and encodings.

In short: syncing is now faster for changes, since less data is loaded, and there are also fewer syncing entries than before. It’s slower for new messages that have not been parsed earlier.
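
To make the difference concrete, here is a minimal TypeScript sketch of that behaviour. It is not the Runbox 7 code: `MessageSync`, `SyncEntry`, `Parser` and the other names are made up for illustration, and the parser is a synchronous stand-in for mailparser, whose real parsing is asynchronous.

```typescript
// Hypothetical types for illustration; none of these names come from Runbox 7.
interface Headers {
  subject: string;
  folder: string;
  seen: boolean;
}

interface SyncEntry {
  id: number;
  headers: Headers;
}

interface ParsedMessage {
  text: string;
}

// Stand-in for a full parser such as mailparser (assumed synchronous here).
type Parser = (id: number) => ParsedMessage;

class MessageSync {
  private parsedCache = new Map<number, ParsedMessage>();
  private index = new Map<number, { headers: Headers; text: string }>();
  parseCount = 0; // how many times the expensive parser actually ran

  constructor(private parse: Parser) {}

  // Apply one sync entry coming from the server.
  apply(entry: SyncEntry): void {
    const existing = this.index.get(entry.id);
    if (existing) {
      // Known message: only the headers changed, so keep the indexed text.
      existing.headers = entry.headers;
      return;
    }
    // New message: parse it once, then index headers and text together.
    const parsed = this.getParsed(entry.id);
    this.index.set(entry.id, { headers: entry.headers, text: parsed.text });
  }

  // Return the cached parse result, parsing only on the first request.
  getParsed(id: number): ParsedMessage {
    let parsed = this.parsedCache.get(id);
    if (!parsed) {
      parsed = this.parse(id);
      this.parseCount++;
      this.parsedCache.set(id, parsed);
    }
    return parsed;
  }

  indexedText(id: number): string | undefined {
    return this.index.get(id)?.text;
  }

  indexedHeaders(id: number): Headers | undefined {
    return this.index.get(id)?.headers;
  }
}
```

With something like this, marking a message as read or moving it only touches the header part of the index, and opening the message afterwards is served from the cache without parsing it again.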

We can also improve this by caching the parsed results from the server-side indexer, so that when the user starts Runbox7, text contents for most of the new messages are ready to be served, and should not take any extra time to fetch.

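use MIME::QuotedPrint qw(decode_qp);
use MIME::Base64 qw(decode_base64);

# Print the first text/plain part of a message file, line by line.
sub extract_msg_plaintxt {
    my $self = shift;
    my $msgfilename = shift;

    open(my $fh, '<', $msgfilename) or die "Cannot open $msgfilename: $!";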

    my $lastContentTypeIsText = 1;
    my $headerArea = 1;
    my $boundarySeparator;

    my $contentTransferEncoding = "";    

    while (my $row = <$fh>) {
        if($lastContentTypeIsText &&
            $headerArea &&
            $row=~/^[\n\r]/
          ) {
            $headerArea = 0;
        }
        if($row=~ /boundary=\"*([^\"]+)/) { #"
            $boundarySeparator = $1;
        }

        if($row=~ /Content-Type: ([a-z\/]+)/) {
            if($1 eq 'text/plain') {
                $lastContentTypeIsText = 1;
            } else {
                $lastContentTypeIsText = 0;
            }
        }

        if($row=~ /Content-Transfer-Encoding: ([a-z0-9\-]+)/) {
            $contentTransferEncoding = $1;
        }

        if($headerArea) {
            if(
                $row=~ /^Message-Id:/i ||
                $row=~ /^In-Reply-To:/ ||
                $row=~ /^References:/ ||
                $row=~ /^From:/ ||
                $row=~ /^To:/ ||
                $row=~ /^Cc:/ ||
                $row=~ /^Subject:/ ||
                $row=~ /^Date:/) {
                #print $row;
            }
        } else {
            if(
                # Stop parsing if we see a boundary separator
                $boundarySeparator && index($row,$boundarySeparator) !=-1
              ) {
                $lastContentTypeIsText = 0;
                last;
            }

            if($lastContentTypeIsText) {
                if($contentTransferEncoding eq "quoted-printable") {
                    $row = decode_qp($row);
                } elsif($contentTransferEncoding eq "base64") {
                    $row = decode_base64($row);
                }
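                # replace leading/trailing whitespace with a single space,
                # so that words from adjacent lines stay separated
                $row =~ s/^\s+|\s+$/ /g;
                # collapse newlines and repeated whitespace into one space
                $row =~ s/\s+/ /g;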

                # check if we have more than 10 characters and no spaces
                if(length($row)>10 && $row !~ /\s/) {
                    # line contains no spaces - we don't want this in the search index
                    last;
                }
                print $row;                
            }
        }
    }

    close($fh);
        
}
#3

Thank you for the explanation. I was afraid it might be my internet connection or computer.

#4

I think some work needs to be done on the speed of email synchronization in the web client. It takes a long time (at least a full minute) for my index to finish each time I log in to the webmail client.