Fix reply-by-mail for 8-bit transfer encodings

The mail class seems to handle mails sent with Content-Transfer-Encoding: 8bit
somewhat weirdly: It decodes them (to utf-8), changes the raw source to base64,
and does not modify the Content-Type:charset= header.

This leads to Discourse trying the message encoding (in my example ISO-8859-1)
first, and if that does not contain any unparseable characters, it uses that.
Sadly, in ISO-8859-1, every byte sequence is valid.

Fix this by always trying to decode as UTF-8 first. The probability of someone
using another encoding that cleanly (but wrongly) decodes as UTF-8 should be
fairly low.
This commit is contained in:
Claus Strasburger 2017-04-30 23:30:40 +02:00
parent a4815047c0
commit e9bb9a167b
3 changed files with 25 additions and 0 deletions

View File

@ -217,6 +217,13 @@ module Email
encodings = ["UTF-8", "ISO-8859-1"]
encodings.unshift(mail_part.charset) if mail_part.charset.present?
# mail (>=2.5) decodes mails with 8bit transfer encoding to utf-8, so
# always try UTF-8 first
if mail_part.content_transfer_encoding == "8bit"
encodings.delete("UTF-8")
encodings.unshift("UTF-8")
end
encodings.uniq.each do |encoding|
fixed = try_to_encode(string, encoding)
return fixed if fixed.present?

View File

@ -158,7 +158,9 @@ describe Email::Receiver do
expect { process(:html_reply) }.to change { topic.posts.count }
expect(topic.posts.last.raw).to eq("This is a **HTML** reply ;)")
end
it "handles different encodings correctly" do
expect { process(:hebrew_reply) }.to change { topic.posts.count }
expect(topic.posts.last.raw).to eq("שלום! מה שלומך היום?")
@ -167,6 +169,10 @@ describe Email::Receiver do
expect { process(:reply_with_weird_encoding) }.to change { topic.posts.count }
expect(topic.posts.last.raw).to eq("This is a reply with a weird encoding.")
expect { process(:reply_with_8bit_encoding) }.to change { topic.posts.count }
expect(topic.posts.last.raw).to eq("hab vergessen kritische zeichen einzufügen:\näöüÄÖÜß")
end
it "prefers text over html" do

View File

@ -0,0 +1,12 @@
Return-Path: <discourse@bar.com>
From: Foo Bar <discourse@bar.com>
To: reply+4f97315cc828096c9cb34c6f1a0d6fe8@bar.com
Date: Fri, 15 Jan 2016 00:12:43 +0100
Message-ID: <43@foo.bar.mail>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
hab vergessen kritische zeichen einzufügen:
äöüÄÖÜß