Corrupted Subject and From header in Amazon.co.uk mail

Post by Chris Lawr » Sun, 28 May 2006 20:32:10


Hi, does anyone else use Pine and Amazon.co.uk and see this? Almost all
of the messages I get from Amazon have some header corruption, but they
display differently depending on whether I have full headers turned off
or on. I am using the ISO-8859-1 character set in my Pine config, and
Pine does not display any warnings or advisories about the message.

With standard headers:

From: Amazon.co.uk < XXXX@XXXXX.COM >
Subject: =?iso-8859-1?B?QW1hem9uLmNvLnVrIHJlY29tbWVuZHMgTWljaGVsIFRob21hcyBBZHZhbmNlZCBGcmVuY2ggKENEKSBhbmQgbW9yZQ==?=

With full headers:

From: "=?iso-8859-1?B?QW1hem9uLmNvLnVr?=" < XXXX@XXXXX.COM >
Subject: =?iso-8859-1?B?QW1hem9uLmNvLnVrIHJlY29tbWVuZHMgTWljaGVsIFRob21hcyBBZHZhbmNlZCBGcmVuY2ggKENEKSBhbmQgbW9yZQ==?=

The Subject also appears the same in the main index page alongside all
my other mails.

You can see the From address displays differently and the Subject header
remains screwed. My gut feeling is that the mail from Amazon is
malformed and Pine is simply trying to parse it as best as it can.
Presumably, though, this isn't a widespread problem with other clients, or
Amazon would no doubt have fixed it by now - it's been happening for as
long as I can remember.

I've been meaning to post here about it for ages, just got a mail today
and finally decided to work it out once and for all!

--
Chris
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Alan J. Fl » Sun, 28 May 2006 21:08:41


RFC1522 section 2 applies: for the sender, there is a mandatory limit
of 75 characters on the length of encoded words. For the recipient,
section 6.1 applies, I think: "Any other sequence of printable
characters should be treated as ordinary ASCII text."

PINE is conforming exactly to the published rules. Very wise.
To try to do otherwise could be a security exposure.


As you see, PINE is doing precisely what RFC1522 says that it should
do in this situation.
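
To make the rule concrete, here is a minimal sketch of a strict decoder. It is not PINE's actual code: the name `strict_decode` and the regular expression are assumptions made for illustration. An encoded-word longer than 75 characters is left exactly as received, while a well-formed one is decoded. The sketch skips some RFC details, such as dropping whitespace between adjacent encoded-words.

```python
import base64
import quopri
import re

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")
MAX_ENCODED_WORD = 75  # sender-side limit on one encoded-word


def strict_decode(value):
    """Decode encoded-words that respect the length limit; leave the rest as-is."""
    def repl(match):
        word = match.group(0)
        if len(word) > MAX_ENCODED_WORD:
            return word  # over-long: treat as ordinary ASCII text
        charset, encoding, text = match.group(1), match.group(2).upper(), match.group(3)
        try:
            if encoding == "B":
                raw = base64.b64decode(text, validate=True)
            else:
                # header=True makes "_" decode to a space, as the Q encoding requires
                raw = quopri.decodestring(text.encode("ascii"), header=True)
            return raw.decode(charset)
        except (ValueError, LookupError):
            return word  # broken payload or unknown charset: show it undecoded
    return ENCODED_WORD.sub(repl, value)


from_name = "=?iso-8859-1?B?QW1hem9uLmNvLnVr?="
subject = ("=?iso-8859-1?B?QW1hem9uLmNvLnVrIHJlY29tbWVuZHMgTWljaGVsIFRob21hcyBB"
           "ZHZhbmNlZCBGcmVuY2ggKENEKSBhbmQgbW9yZQ==?=")

print(strict_decode(from_name))           # Amazon.co.uk
print(strict_decode(subject) == subject)  # True: the over-long word is not decoded
```

With the two headers from the original post, the short From name decodes and the Subject stays in its raw form, which matches what was reported for the normal view.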


I think you're mis-stating the case! It *is* a widespread problem,
with other clients failing to conform to the interworking
specification - which is exactly why Amazon can't be arsed to do their
part of the job right.

Quite a number of spams come with stuff faked in malformed headers
like this, presumably hoping to evade content-based scanners while
taking advantage of bugs in popular mail client software (I'm sure you
know which one I have in mind).

Our departmental mailer used to reject mail with such malformed
headers - not merely as an anti-spam measure, but on security grounds.

Unfortunately, the intentions of the admins were overruled by
commercial considerations, and this defence removed, in the face of
too much non-conforming commercial software. My vote goes to PINE for
standing fast. What's the point of published interworking
specifications if every vendor is free to disregard them?

 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Franz Haeu » Sun, 28 May 2006 21:56:35

Alan J. Flavell writes:



Not exactly... look at the following subject header that is taken
from an existing post.

--8<---------------cut here---------------start------------->8---
Newsgroups: de.comm.software.newsreader
Subject: Decodiert Pine beliebig lange Headerzeilen mit =?ISO-8859-1?Q?g=FCltigen?= encoded words? (was: [PC-PINE] Muster pinerc)
Message-ID: < XXXX@XXXXX.COM >
--8<---------------cut here---------------end--------------->8---

Pine should refuse to decode the subject header as I understand
the following excerpt of RFC1522 section 2:

,----[ RFC1522 section 2 ]
| While there is no limit to the length of a multiple-line header
| field, each line of a header field that contains one or more
| encoded-words is limited to 76 characters.
`----

This is equally true for the prerequisites imposed by RFC2047
which obsoletes RFC1522. However, Pine perfectly decodes that
header and seems to be very permissive towards the software that
generated that header.
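
The per-line limit quoted above can be checked mechanically. This is only a sketch: `overlong_encoded_lines` is a made-up helper name, and it says nothing about how Pine itself decides what to decode.

```python
import re

ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")
MAX_LINE = 76  # limit for a header line that contains an encoded-word


def overlong_encoded_lines(raw_field):
    """Return the physical lines of a (possibly folded) header field that
    contain an encoded-word and are longer than MAX_LINE characters."""
    return [
        line for line in raw_field.splitlines()
        if ENCODED_WORD.search(line) and len(line) > MAX_LINE
    ]


subject = ("Subject: Decodiert Pine beliebig lange Headerzeilen mit "
           "=?ISO-8859-1?Q?g=FCltigen?= encoded words? "
           "(was: [PC-PINE] Muster pinerc)")

# The unfolded Subject line above is reported as a violation.
print(overlong_encoded_lines(subject) == [subject])  # True
```

Lines without an encoded-word are deliberately ignored here, since the 76-character rule applies only to lines that contain one.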

Franz.
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Chris Lawr » Mon, 29 May 2006 05:01:50

On Sat, 27 May 2006, Alan J. Flavell wrote:


I agree. In my olden days I used Turnpike which is also a very
standards-aware client, and I value the adherence to these standards.


No arguments with that, but what I'm trying to establish is the actual
cause of the gibberish in the headers. It appears that Amazon's mails
are not correctly following standards, and I wasn't aware that headers
like Subject and From were encoded - that makes no sense to me at all,
so I feel there's more to it than first seems.

I feel that I may have to ftp into the mailbox later and download it, in
order to inspect the mail in a text editor to see what it actually
contains.


Perhaps "problem" was the wrong word to use given the dual
interpretation. I completely agree on the standards side; it's no
different from a website designer creating something which only works
properly in Internet Explorer and having no desire to change because
that's where most of the target audience is. I'm guessing that most
people are viewing the mails in Outlook Express and that in OE the
Amazon mails display okay due to OE's sloppy application of the
standards.


I still don't understand how the Subject header contains encoded text,
nor why the From header looks okay with the standard view but wrong with
the header view in Pine. Clearly Pine is parsing the raw data
differently in each case.


I agree, although there's room for a more pragmatic approach too - Pine
could adhere to the standards itself but be more generous in accepting
the flaws of other clients. However, I'm quite happy with Pine and can
live without Amazon mails that contain a sensible subject. I'm simply
trying to identify exactly what is going wrong with those messages so I
can relay it to Amazon, for all the good it may do.

--
Chris
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Alan J. Fl » Mon, 29 May 2006 06:42:54

On Sat, 27 May 2006, Chris Lawrence wrote:


It looks as if they're encoding the subject text (or rather, trying to
encode it and doing it wrongly), even though in this case it contained
only US-ASCII characters and so there was no need to encode it.

RFC2047 (end of section 5):
Use of 'encoded-word's to represent strings of purely ASCII
characters is allowed, but discouraged


If they contained 8-bit (i.e. non-ASCII) characters, they would *have*
to be encoded in this way.
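
As a rough illustration of what a conforming sender could do, the sketch below leaves pure-ASCII text alone and otherwise splits the text into several short base64 encoded-words, one per folded line. The function name `encode_field_body` and the fixed chunk size are assumptions chosen for the example, not anything Amazon or any particular mailer uses. Splitting at arbitrary byte offsets is only safe for single-byte character sets such as ISO-8859-1, because each encoded-word must contain whole characters.

```python
import base64

# 30 bytes -> 40 base64 characters -> 57-character encoded-word with the
# "=?iso-8859-1?B?" prefix and "?=" suffix, so "Subject: " plus one word
# stays within a 76-character line.
CHUNK_BYTES = 30


def encode_field_body(text, charset="iso-8859-1"):
    """Return text unchanged if it is pure ASCII; otherwise return folded
    base64 encoded-words (assumes a single-byte charset)."""
    if text.isascii():
        return text  # encoding ASCII is allowed but discouraged
    raw = text.encode(charset)
    words = []
    for start in range(0, len(raw), CHUNK_BYTES):
        payload = base64.b64encode(raw[start:start + CHUNK_BYTES]).decode("ascii")
        words.append(f"=?{charset}?B?{payload}?=")
    # CRLF plus a space folds the field; whitespace between adjacent
    # encoded-words is ignored by a conforming reader.
    return "\r\n ".join(words)


# Printed unchanged, because the text is pure ASCII:
print(encode_field_body(
    "Amazon.co.uk recommends Michel Thomas Advanced French (CD) and more"))
```

Had the sender done something along these lines, the subject in question would have gone out as plain text and there would have been nothing to decode.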


Maybe. I really don't know. Spammers certainly do it in the hope of
slipping past content-based filters - though our content-based filters
award *extra* penalty points for useless encoding, and even more for
broken encoding, over and above the penalty points for the spam
content itself. But, as I say, we were forced to stop treating broken
encoding as grounds in itself for outright rejection.


Amazon.co.uk recommends Michel Thomas Advanced French (CD) and more


In the normal view, PINE is refusing to decode the defective subject
header, for the reason we discussed. It *is* decoding the encoded
From: header.

In the full-headers view, the decoding is turned off, by design - so
that you see the headers in their original form.


There are potential security issues with broken headers. Just because
you and I can't exactly see how to achieve a security compromise, does
not mean that it isn't possible. We should be able to draw worthwhile
lessons from a certain vendor, whose sloppy approach to specifications
leads to a continual stream of half-baked fixes, without the
underlying weaknesses ever really being resolved. It's so much safer
to start from a basically secured system. Enforcing mandatory
requirements of the specifications is certainly defensible as a
component of such an approach, IMNSHO.

When Postel said "be liberal in what you accept", I'm confident that
being tolerant of clear violations of mandatory requirements was NOT
what he had in mind.


It could; but I'd prefer it if the others would tighten up their
behaviour instead.

There's a certain school of thought that a mail client could alert the
user to the fact that there's a defect in the mail, and offer to try
some kind of a fixup if the user consents to any security compromise
which might result. But, to be realistic, many users would have no
basis for deciding one way or the other in response to such an alert;
and maybe those who have the competence to decide, would also have no
difficulty in applying their own workaround if they wanted to.


"for all the good it may do", indeed. Most commercial enterprises
would tell you to abandon the Internet-conforming software that you
use, and throw yourself open to anything that the commercial software
would force-feed you. And as long as they have an apparently endless
stream of sucker^W uncomplaining customers, why *would* they be
interested in spending time and effort with a few people who know what
they're doing and who are trying to stick up for standards?

all the best
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Alan J. Fl » Mon, 29 May 2006 07:42:29


[that, in the case under discussion...]

Well, in the case that was being discussed, it /was/ doing so.

However, I should have cited RFC2047, which replaces 1521-2. Sorry.
But to get back to your point...


I understand what you're saying.


message id < XXXX@XXXXX.COM > , to be
exact. I have since read it.


(I think you're referring to section 6.1. Section 2 (in rfc1522 as in
2047) is setting requirements on the sender. What a mail client
should do if it receives defective formats is actually codified in
6.1.)

Is there any difference between mail and news in this regard? Usenet
usage differs in a number of respects, and has never really been
formally codified since MIME formats were introduced. They've been
sort-of adopted into news from mail, but there are still
long-established differences of custom. PINE sometimes sits a bit
awkwardly across this divide...

However, if the current USEFOR draft ever makes it into an RFC, then
this will be clearly prohibited, even in news. Usual caveats apply in
reference to internet drafts, but:

http://www.yqcomputer.com/

o Compliant software MUST NOT generate (but MAY accept) header
fields of more than 998 octets. This is the only limit on the
length of a header field prescribed by this standard. However,
specific rules to the contrary may apply in particular cases (for
example, according to [RFC2047] lines of a header field containing
encoded-words are limited to 76 octets).

But yes, I guess you're right - in this situation, PINE is being
tolerant, in a way that it wasn't in the other situation.

best regards
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Franz Haeu » Mon, 29 May 2006 22:53:13

Alan J. Flavell writes:

[... Discussion about encoded words in a Subject header of a
Usenet article that may be too long, but are decoded by Pine...]


That is true.


This was not clear to me. Thank you for the clarification. I
used to think that mail and Usenet services work in a very
similar way---at least regarding the headers they have in common.
And Pine, through the structure of its user interface and its
user guidance, makes it very easy to forget that there is
actually a difference.


It is interesting that RFC1036 creates no technical restriction
on the Subject header. I guess, at that time (RFC1036 will
celebrate its 20th anniversary next year) internationalization
was not considered to be a problem. How likely is it that the
USEFOR draft will evolve into an RFC?

Franz.
 
 
 

Corrupted Subject and From header in Amazon.co.uk mail

Post by Mark Crisp » Tue, 30 May 2006 00:34:29


The short answer to this is that the majority of users (and software
developers) want to treat mail and news as being essentially the same.

A small minority of fanatics are determined to stop it, mostly by citing
the obviously broken behavior of ancient software as a "reason" why
mail and news are different. These fanatics dominate any attempt at
reforming the news specifications.


I don't know. USEFOR has been a disaster from its onset, as has NNTPEXT.

-- Mark --

http://www.yqcomputer.com/
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.