encoding question mark in URL makes URL fail

Post by TheBicycli » Mon, 02 Aug 2010 01:34:25


Hello everyone. I am still working on an essay about a WW2 book, and
added another link to it from a Wikipedia article. That link, though,
includes many unescaped, disallowed characters. I encoded them as
instructed and then the URL didn't work. In particular, when I used
%3F in place of the question mark, the URL resulted in a 404
error.

The URL is http://www.yqcomputer.com/

Can someone please demonstrate the proper way to encode this so the
URL will still go to the page it points to AND also pass the W3C
HTML validator?
 
 
 

Post by Jukka K. K » Mon, 02 Aug 2010 01:55:37


How? What is the URL of your page?


Where did you find instructions that told you to escape the question mark?


The "?" starts the query part in URL syntax, and it need not and must not be
encoded. Similarly the "=" characters are part of the URL syntax, and so are
the "&" characters.

Quite apart from this, the "&" character, appearing in _HTML_, whether in an
attribute value or elsewhere, should be represented as "&amp;". This has
nothing to do with URL encoding, or % encoding as it is currently called
officially.
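
To make the distinction concrete, here is a minimal Python sketch of the HTML-level step only. The host and path are placeholders (the real URL is not reproduced here); the query string is the one quoted later in this thread. Nothing is percent-encoded: the "?" and "=" pass through untouched and only "&" changes.

```python
from html import escape

# Hypothetical host and path; only the query string comes from this thread.
url = "http://example.org/display?CATID=4547444&CATLN=6&accessmethod=5"

# escape() rewrites "&" as "&amp;" (and also <, > and quotes) for use in HTML.
# It leaves "?" and "=" alone, because those are not special in HTML.
href = escape(url, quote=True)

link = f'<a href="{href}">catalogue entry</a>'
print(link)
```

The browser undoes this step when it parses the attribute, so the request it sends still contains plain "&" separators.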

--
Yucca, http://www.yqcomputer.com/ ~jkorpela/

 
 
 

Post by TheBicycli » Mon, 02 Aug 2010 01:59:05


The page where I include the link is at http://www.yqcomputer.com/

When I run that page through the W3C HTML validator, it no longer
passes since I added that link. If the URL I provided is a valid one
(it does work in the browser), why does the W3C validator tell me that
my HTML is no longer valid?
 
 
 

Post by Denis McMa » Mon, 02 Aug 2010 02:30:53


Because you have "&" where you should have "&amp;".

"&" starts an entity, which should be terminated by a ";".

change this part of the URL in question:

CATID=4547444&CATLN=6&accessmethod=5

to this:

CATID=4547444&amp;CATLN=6&amp;accessmethod=5

Rgds

Denis McMahon
 
 
 

Post by TheBicycli » Mon, 02 Aug 2010 02:51:28

On Jul 31, 10:30m, Denis McMahon < XXXX@XXXXX.COM >



Thanks Jukka and Denis. I rewrote the "&" to be "&amp;" in two places in
the URL as suggested. The URL still works, but still generates errors
in the W3C HTML validator. Is this a bug in the validator?
 
 
 

Post by TheBicycli » Mon, 02 Aug 2010 02:55:36

On Jul 31, 10:30m, Denis McMahon < XXXX@XXXXX.COM >



It worked! At first I thought it hadn't, but it does!

There was a web page that had told me that question marks, equals
signs, and such also had to be encoded properly in URLs. Apparently
that page was wrong. So far as I can tell, the ampersands are the most
common problem. Thanks again everyone!
 
 
 

Post by David E. R » Mon, 02 Aug 2010 04:03:45


A question mark is a valid character in a URI, even when contained in
HTML. It indicates that a query is to be made. It does not have to be
escaped, coded in hex, or recoded as a character or entity reference.

In your URI, the only characters required to be recoded are the two
ampersands. In HTML, these must be coded as the entity reference &amp;
with the semicolon required. This is because an ampersand by itself in
HTML indicates the beginning of a character or entity reference.

--

David E. Ross
< http://www.yqcomputer.com/ >.

Anyone who thinks government owns a monopoly on inefficient, obstructive
bureaucracy has obviously never worked for a large corporation.
1997 by David E. Ross
 
 
 

Post by Denis McMa » Mon, 02 Aug 2010 05:31:39


It may not be that that page was wrong.

Typical form of an HTTP GET request:

http://host/path/file?par1=val1&par2=val2

If you use that as a URL in a web page, you need to encode the "&" to
"&amp;" because "&" is a special character in HTML.

http://host/path/file?par1=val1&amp;par2=val2

However, any of "&=?" inside a value string may also need to be encoded
using % encoding, as may several other characters.

e.g. if val2 is the string "a=b&c?d:e"

http://host/path/file?par1=val1&par2=a%3Db%26c%3Fd%3Ae

So there are places where the "&" changes to "&amp;", and places where
it changes to "%26", and sometimes the browser is smart enough to figure
out the correct result when you get it wrong, and sometimes it isn't.
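
As a rough illustration of the two layers, the sketch below uses Python's standard library on the par1/par2 example from this post. The first step percent-encodes the value so that its "=", "&", "?" and ":" cannot be mistaken for URL syntax; the second step escapes the finished URL for HTML. It is only one way to do this, not a description of what any particular site does.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

# The value of par2 contains characters that are URL syntax elsewhere.
params = {"par1": "val1", "par2": "a=b&c?d:e"}

# Layer 1: percent-encode each name and value, join the pairs with "&".
query = urlencode(params)
url = "http://host/path/file?" + query

# Layer 2: escape the whole URL for an HTML attribute.
# Only the separator "&" is affected; "%26" inside the value is left alone.
href = escape(url)

print(url)   # what the browser requests
print(href)  # what goes into the HTML source

# Decoding the query on the receiving side recovers the original value.
decoded = parse_qs(urlsplit(url).query)
print(decoded["par2"][0])
```

The order matters: percent-encode the values first, then HTML-escape the complete URL once, at the point where it is written into the page.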

Rgds

Denis McMahon
 
 
 

Post by Scott Bryc » Mon, 02 Aug 2010 06:15:38


Is there a reason why he doesn't just replace them with semicolons?
 
 
 

Post by Thomas 'Po » Mon, 02 Aug 2010 06:38:59


ACK


No, it starts a (character) entity reference.


PointedEars
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$ XXXX@XXXXX.COM > (2004)
 
 
 

Post by Jukka K. K » Mon, 02 Aug 2010 14:54:40


Yes.

Did you actually test your advice before giving it?

Hint: The URL in question was
http://www.yqcomputer.com/
Now try
http://www.yqcomputer.com/ ;CATLN=6;accessmethod=5

(The author apparently has no control over anything in the
www.nationalarchives.gov.uk server.)
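
Why the semicolon form can fail is easy to sketch, on the assumption that the server splits the query string on "&" only. That is purely an assumption for illustration; how that server actually parses its queries is not known here, and split_on_ampersand is a hypothetical helper, not anyone's real code.

```python
def split_on_ampersand(query):
    """Parse a query string the way a server that only knows '&' would."""
    result = {}
    for field in query.split("&"):
        # partition() tolerates fields without "=" (value becomes "").
        name, _, value = field.partition("=")
        result[name] = value
    return result


with_amp = split_on_ampersand("CATID=4547444&CATLN=6&accessmethod=5")
with_semi = split_on_ampersand("CATID=4547444;CATLN=6;accessmethod=5")

print(with_amp)   # three separate parameters
print(with_semi)  # one parameter whose value swallowed the rest
```

Semicolons only help when the receiving server is known to accept them as separators, which a page author linking to someone else's site cannot arrange.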

--
Yucca, http://www.yqcomputer.com/ ~jkorpela/
 
 
 

Post by TheBicycli » Mon, 02 Aug 2010 23:45:42


What I could do, since the problem link is from a reference given in a
Wikipedia article, is edit that page in Wikipedia and rewrite the URL
with "&amp;" in place of "&". Would I be performing a
public service to do this? That is an improvement, right? I had
similar problems a few months back with another link I quoted from
Wikipedia. I could theoretically improve that link too so others won't
experience the same problem (failing HTML standards validation).
 
 
 

Post by TheBicycli » Mon, 02 Aug 2010 23:46:56


Oh, and I would test the corrected link to make sure it works before
posting the proposed edit in Wikipedia.
 
 
 

Post by Ben Bacari » Tue, 03 Aug 2010 02:02:26

TheBicyclingGuitarist < XXXX@XXXXX.COM > writes:




<snip>

It's not clear from your wording if the Wikipedia page has a problem. & is
correct in a URL. When that URL needs to be written in HTML, the &
should be written &amp;. If you copy a correctly written link from a
web page using, say, the "copy link address" browser action it will
(correctly) have &s in it.

Is the page: http://www.yqcomputer.com/ (53) ?
If so, the URL is correctly encoded.

--
Ben.
 
 
 

Post by Denis McMa » Tue, 03 Aug 2010 02:44:33


What you could do, first, is "view source" on the Wikipedia page
concerned and check if there is in fact an error in it.

Remember that what you see is your browser rendering what the web server
sends. If the web server sends "&amp;" you will see "&".
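
The decoding step can be imitated with Python's html.unescape, which resolves character references much as a browser's HTML parser does. This is only a sketch of the idea; the strings are the query string from this thread, not the actual Wikipedia source.

```python
from html import unescape

# What a correctly written page carries in its HTML source.
in_source = "CATID=4547444&amp;CATLN=6&amp;accessmethod=5"

# What the reader sees, and what "copy link address" hands over.
as_seen = unescape(in_source)
print(as_seen)

# If a page had escaped twice by mistake, one level would survive decoding.
double_escaped = "CATLN=6&amp;amp;accessmethod=5"
still_escaped = unescape(double_escaped)
print(still_escaped)
```

So a plain "&" in the rendered link or in a copied address says nothing against the page; only the source shows whether the ampersands were written correctly.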

Rgds

Denis McMahon