Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
764 views
in Technique[技术] by (71.8m points)

.htaccess - mod rewrite to remove file extension, add trailing slash, remove www and redirect to 404 if no file/directory is available

I would like to create rewrite rules in my .htaccess file to do the following:

  • When accessed via domain.com/abc.php: remove the file extension, append a trailing slash and load the abc.php file. url should look like this after rewrite: domain.com/abc/

  • When accessed via domain.com/abc/: leave the url as is and load abc.php

  • When accessed via domain.com/abc: append trailing slash and load abc.php. url should look like this after rewrite: domain.com/abc/

  • Remove www

  • Redirect to 404 page (404.php) when accessed url doesn't resolve to folder or file, e.g. when accessing either domain.com/nothingthere.php or domain.com/nothingthere/ or domain.com/nothingthere

  • Make some permanent 301 redirects from old urls to new ones (e.g. domain.com/abc.html to domain.com/abc/)

All php files sit in the document root directory, but if there is a solution that would make urls such as domain.com/abc/def/ (would load domain.com/abc/def.php) also work it would be great as well, but not necessary

So here is what I have at the moment (thrown together from various sources and samples from around the web

<IfModule mod_rewrite.c>
  RewriteCond %{HTTPS} !=on
  # redirect from www to non-www
  RewriteCond %{HTTP_HOST} ^www.(.+)$ [NC]
  RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]

  # remove php file extension
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{THE_REQUEST} ^GET /[^?s]+.php
  RewriteRule (.*).php$ /$1/ [L,R=301]

  # add trailing slash
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule ^.*[^/]$ /$0/ [L,R=301]

  # resolve urls to matching php files 
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule (.*)/$ $1.php [L]

With this the first four requirements seem to work, whether I enter domain.com/abc.php, domain.com/abc/ or domain.com/abc, the final url always ends up being domain.com/abc/ and domain.com/abc.php is loaded.

When I enter a url that resolves to a file that doesn't exists I'm getting an error 310 (redirect loop), when really a 404 page should be loaded. Additionally I haven't tried if subfolders work, but as I said, that's low priority. I'm pretty sure I can just slap the permanent 301 redirects for legacy urls on top of that without any issues as well, just wanted to mention it. So the real issue is really the non working 404 page.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I've had problems with getting ErrorDocument to work reliably with rewrite errors, so I tend to prefer to handle invalid pages correctly in my rewrite cascade. I've tried to cover a fully range of test vectors with this. Didn't find any gaps.

Some general points:

  • You need to use the DOCUMENT_ROOT environment variable in this. Unfortunately if you use a shared hosting service then this isn't set up correctly during rewrite execution, so hosting providers set up a shadow variable to do the same job. Mine uses DOCUMENT_ROOT_REAL, but I've also come across PHP_DOCUMENT_ROOT. Do a phpinfo to find out what to use for your service.
  • There's a debug info rule that you can trim as long as you replace DOCROOT appropriately
  • You can't always use %{REQUEST_FILENAME} where you'd expect to. This is because if the URI maps to DOCROOT/somePathThatExists/name/theRest then the %{REQUEST_FILENAME} is set to DOCROOT/somePathThatExists/name rather than the full pattern equivalent to the rule match string.
  • This is "Per Directory" so no leading slashes and we need to realise that the rewrite engine will loop on the .htaccess file until a no-match stop occurs.
  • This processes all valid combinations and at the very end redirects to the 404.php which I assume sets the 404 Status as well as displaying the error page.
  • It will currently decode someValidScript.php/otherRubbish in the SEO fashion, but extra logic can pick this one up as well.

So here is the .htaccess fragment:

Options -Indexes -MultiViews
AcceptPathInfo Off

RewriteEngine On
RewriteBase   /

## Looping stop.  Not needed in Apache 2.3 as this introduces the [END] flag
RewriteCond %{ENV:REDIRECT_END}  =1
RewriteRule ^                    -                       [L,NS]

## 302 redirections ##

RewriteRule ^ - [E=DOCROOT:%{ENV:DOCUMENT_ROOT_REAL},E=URI:%{REQUEST_URI},E=REQFN:%{REQUEST_FILENAME},E=FILENAME:%{SCRIPT_FILENAME}]

# redirect from HTTP://www to non-www
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST}        ^www.(.+)$ [NC]
RewriteRule ^                   http://%1%{REQUEST_URI}  [R=301,L]

# remove php file extension on GETs (no point in /[^?s]+.php as rule pattern requires this)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_METHOD}   =GET
RewriteRule (.*).php$          $1/                      [L,R=301]

# add trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$            $0/                      [L,R=301]

# terminate if file exists.  Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^                   -                        [L,E=END:1]

# terminate if directory index.php exists.  Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME}    -d
RewriteCond %{ENV:DOCROOT}/$1/index.php    -f
RewriteRule ^(.*)(/?)$             $1/index.php          [L,NS,E=END:1]

# resolve urls to matching php files 
RewriteCond %{ENV:DOCROOT}/$1.php  -f
RewriteRule ^(.*?)/?$              $1.php                [L,NS,E=END:1]

# Anything else redirect to the 404 script.  This one does have the leading /

RewriteRule ^                      /404.php              [L,NS,E=END:1]

Enjoy :-)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...