This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
It seems the validator ignores the short form of the encoding declaration: <meta charset="utf-8">. Validating the page http://htmlex.met.cz/ gives me 1 warning: "No Character Encoding Found! Falling back to UTF-8." Validating with the http://html5.validator.nu/ tool gives no warning. The problem appears to affect only "Validate by URI" and "Validate by File Upload"; "Validate by Direct Input" produces no warning.
(In reply to comment #0) > Seems validator ignores short version of encoding declaration: > <meta charset="utf-8"> Indeed, agreed: something is not right in the validator. I get the same problem. Best Regards, Patrick
The problem is in the HTML::Encoding perl module used by the validator. There's a bug report open about it at https://rt.cpan.org/Ticket/Display.html?id=42497
(In reply to comment #2) > The problem is in the HTML::Encoding perl module used by the validator. > There's a bug report open about it at > https://rt.cpan.org/Ticket/Display.html?id=42497 I can't see how that can be the problem. There may well be a problem with the HTML::Encoding module, but that shouldn't affect (X)HTML5 validation. AFAICT the W3C's part of the markup validator shouldn't even see the meta charset (<meta charset="utf-8">) part of the webpage. As soon as the validator sees the new HTML doctype (introduced in HTML5 (<!DOCTYPE html>)) it should pass the whole document over to the validator.nu part of the validator for validation, and validator.nu should then decide whether the charset is correct, not the main W3C validator.
(In reply to comment #3) > (In reply to comment #2) > > The problem is in the HTML::Encoding perl module used by the validator. > I can't see how that can be the problem. [snip] > as soon as the validator sees the > new HTML doctype (introduced in HTML5 (<!DOCTYPE html>)) it should pass the > whole document over to the validator.nu The validator 1) needs to know the encoding before it can preparse the document and detect that doctype and 2) needs to know and decode the bytes before it can pass the document to the validator.nu engine. It is not “just” a redirection.
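The ordering described in the comment above (find the encoding, decode the bytes, and only then look for the doctype) can be illustrated with a minimal Python sketch. It is not the validator's actual Perl code: the names `sniff_encoding` and `is_html5`, the 1024-byte scan window, and the UTF-8 fallback are all assumptions made for illustration. The single pattern accepts both the short `<meta charset="utf-8">` form and the longer `http-equiv` form.

```python
import codecs
import re
from typing import Optional

# Byte order marks mapped to Python codec names that consume the BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Finds charset=... anywhere inside a <meta> tag, so it covers both
# <meta charset="utf-8"> and
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8">.
META_CHARSET = re.compile(
    rb"<meta\s[^>]*?\bcharset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)",
    re.IGNORECASE,
)

HTML5_DOCTYPE = re.compile(r"^\s*<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def sniff_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding from a BOM or a <meta> declaration, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    match = META_CHARSET.search(data[:1024])  # assumed scan window
    return match.group(1).decode("ascii").lower() if match else None


def is_html5(data: bytes, fallback: str = "utf-8") -> bool:
    """Decode the raw bytes first, then test for the HTML5 doctype."""
    encoding = sniff_encoding(data) or fallback
    try:
        text = data.decode(encoding, errors="replace")
    except LookupError:  # declared encoding is unknown to Python
        text = data.decode(fallback, errors="replace")
    return HTML5_DOCTYPE.match(text) is not None
```

A sniffer that only recognised the `http-equiv` form would return `None` for the short declaration, which matches the "No Character Encoding Found!" warning reported in this bug.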
(In reply to comment #4) > (In reply to comment #3) > > (In reply to comment #2) > > > The problem is in the HTML::Encoding perl module used by the validator. > > > I can't see how that can be the problem. > [snip] > > as soon as the validator sees the > > new HTML doctype (introduced in HTML5 (<!DOCTYPE html>)) it should pass the > > whole document over to the validator.nu > > The validator 1) needs to know the encoding before it can preparse the document > and detect that doctype and 2) needs to know and decode the bytes before it can > pass the document to the validator.nu engine. It is not “just” a > redirection. I think problems like this are going to be never-ending; therefore I think the W3C should use validator.nu as the "front end" of its validation service. Has this been considered before?
(In reply to comment #5) > I think problems like this are going to be never-ending; therefore I think the > W3C should use validator.nu as the "front end" of its validation > service. Has this been considered before? This is getting a little OT and would probably be best on the validator list, but yes, this has been considered. The validator.nu engine is a wonderful piece of software, in many ways superior to the other engines which validator.w3.org uses. However, IMHO validator.nu is neither stable enough (see e.g. http://lists.w3.org/Archives/Public/www-validator/2009Mar/0037.html ) nor flexible enough (limited number of profiles, no DTD support for legacy HTML, etc.) nor usable enough (bare-bones UI and limited message explanations, no file upload, no direct input, etc.) to simply "be" the sole front-end engine on validator.w3.org. I am quite certain that at this point, having validator.w3.org be a frontend for multiple engines, including OpenSP for DTD and validator.nu for HTML5 and other applications, is the most desirable architecture.
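The multi-engine frontend described above can be pictured as a small dispatcher that routes an already decoded document to a backend based on its doctype. The Python sketch below is only an illustration of that idea: the function name `pick_engine` and the string labels are hypothetical, and the real validator's routing logic is not shown in this thread.

```python
import re

# The HTML5 doctype; anything else is treated as a DTD-based document here.
HTML5_DOCTYPE = re.compile(r"^\s*<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def pick_engine(text: str) -> str:
    """Return a (hypothetical) label for the backend that should check text."""
    if HTML5_DOCTYPE.match(text):
        return "validator.nu"  # HTML5 documents go to the validator.nu engine
    return "opensp"  # legacy, DTD-based documents go to OpenSP
```

Because the dispatcher works on decoded text, it depends on the encoding having been detected correctly beforehand, which is the step that fails in this bug.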
For what it's worth, I wrote up a description of this issue, with some linked reductions: http://oli-studio.com/bugs/validator/html5-charset/ It was mainly intended to explain the situation to content creators, and to show which combinations of character set declaration methods generated no errors.
*** Bug 7135 has been marked as a duplicate of this bug. ***
This one is biting me too. Nothing to add, except I'd like to see it fixed soon.
I encountered the same issue for http://usesthis.com/
Ville has a new Validator release queued up to deploy, and I think it may contain a fix for this issue. I'll check with him and see.
There is no fix for this issue yet. I have some local prototype-level code for this which I'll revisit soon, but it has some showstopper problems (for example, it might in some cases affect validation of non-HTML5 HTML documents). Due to how the validator works at the moment, the fix is not trivial.
A fix is now in CVS and available for testing at http://qa-dev.w3.org/wmvs/HEAD/ . Something weird happens when that (and my local instance) of validator tries to access the HTML5 validator installed locally on http://qa-dev.w3.org:8888/html5/ when validating http://htmlex.met.cz/ . The error is "Insecure dependency in connect while running with -T switch". What makes it strange is that interfacing with the very same HTML5 validator works just fine when checking some other documents (such as the ones from comment 7 and comment 10). It also works fine when the validator is configured to use http://validator.nu/ as its HTML5 validator. I have no idea how the document to be validated could cause this (it has already been fetched locally, and is about to be POSTed to the same HTML5 instance which works fine for other docs), but I'll try to find out.
(In reply to comment #13) > Something weird happens when that (and my local instance) of validator tries to > access the HTML5 validator installed locally on > http://qa-dev.w3.org:8888/html5/ when validating http://htmlex.met.cz/ . Workaround (but no reason) found and applied, more details at http://rt.cpan.org/Public/Bug/Display.html?id=52707
*** Bug 8678 has been marked as a duplicate of this bug. ***
(In reply to comment #13) > A fix is now in CVS and available for testing at > http://qa-dev.w3.org/wmvs/HEAD/ . > This fix works for me, thanks
Code fixes are included in 0.8.6 but unfortunately the required HTML::HeadParser >= 3.60 module is not installed on the production validator.w3.org boxes yet.
(In reply to comment #17) > Code fixes are included in 0.8.6 but unfortunately the required > HTML::HeadParser >= 3.60 module is not installed on the production > validator.w3.org boxes yet. Installed now, sorry for the inconvenience.
Thanks, closing.
I just ran into this bug on the production site: http://validator.w3.org/#validate_by_upload The validator didn't see my file's <!DOCTYPE html>. I verified that my code validates at http://qa-dev.w3.org/wmvs/HEAD/#validate_by_upload Is it possible that this bug is fixed for the URI case, but not for uploads? (In reply to comment #18) > (In reply to comment #17) > > Code fixes are included in 0.8.6 but unfortunately the required > > HTML::HeadParser >= 3.60 module is not installed on the production > > validator.w3.org boxes yet. > > Installed now, sorry for the inconvenience.
I changed the category on this because this is not a bug in the validator.nu HTML5-checking backend but instead relates to the Perl code.
Just use http://validator.w3.org/nu/ directly.