If your site is hosted on a Unix or Linux server which runs Apache, you may already be familiar with your .htaccess file.
We referred to it in an earlier tutorial on Creating Custom Error Messages, where we showed how you can configure it to instruct the browser to display custom error messages rather than the dull and unhelpful generic ones.
But that is far from the whole story! In this article we will look at some of the other things that this powerful little file can do. In part two we have 7 Magic Tricks that you can perform with .htaccess, but first let’s have a look at the file itself.
What is the .htaccess file?
The .htaccess file is a text file which resides in your main site directory and/or in any subdirectory of your main directory. There can be just one, there can be a separate one in each directory or you may find or create one just in a specific directory.
Any commands in a .htaccess file will affect both the directory it is in and any subdirectories of that directory. Thus if you have just one, in your main directory, it will affect your whole site. If you place one in a subdirectory it will affect that directory and everything beneath it.
Some Important Points
Windows servers running Microsoft's IIS do not use the .htaccess system. I believe there are ways of doing the things .htaccess does on those servers, but that is a story for another day and I am afraid I will not be telling it – it just isn’t as simple or as elegant as the way Apache manages things, in my humble opinion! So unless your site is served by Apache, which usually means a Linux/Unix server, this article is no good to you. Sorry.
A warning you will commonly see is that changing the .htaccess file on a server that has FrontPage extensions installed will at best not work and at worst make a complete mess of your extensions. I have to say that this has not been my experience and I have done a fair bit of messing with .htaccess files on FrontPage sites, including using .htaccess for authentication. However do any of these things at your own risk – I cannot be responsible for any harm they might cause.
Your host may not support alteration of the .htaccess file; either contact them first and ask before you make changes or proceed with caution and be sure you have a backup of the original file in case of problems.
Oh! And none of the ‘Magic Tricks’ described in this article are either magic or tricks. They just seem that way!
Working With Your .htaccess File
Sometimes the first problem is finding it! When you FTP to your site the .htaccess file is generally the first one displayed in a directory if it exists.
Some servers are configured to hide files whose names begin with a period. Most FTP clients allow you to choose to display these files. In WS_FTP you can do this by entering -la or -al and then clicking Refresh. Other clients may use a different method – check the help files in yours.
Editing should be done in a text editor, such as NotePad. You should not edit .htaccess files in editors such as FrontPage. The best thing to do is download a copy of your .htaccess file to your computer, edit it, and upload again, remembering to save a copy of the original in case of errors.
If you do not already have a .htaccess file you can create one in NotePad; it is just a simple text file. However, NotePad may save it as .htaccess.txt, in which case you will need to rename it on the server to just .htaccess. The two are NOT the same. In fact .htaccess is an extension – to a file with no name!
It is very important that each command in your file is entered on a new line and that the lines do not wrap. If you paste any of the commands from this article into your file and find that lines have run together or wrapped, you will need to correct this.
You must upload and download your .htaccess file in ASCII mode, not BINARY.
So, What about the Magic Tricks? Read on!
Well, first another boring bit! To prevent people from being able to see the contents of your .htaccess file, you need to place the following code in the file:
-==-
Be sure to format that just as it is above, with each line on a new line as shown. There is every likelihood that your existing .htaccess file, if you have one, includes those lines already.
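As a rough sketch of such a block, assuming the older Apache 2.2-style access directives that were usual when this kind of setup was common, it typically looks something like this. On Apache 2.4 and later the two middle lines would normally be replaced by the single line Require all denied.

```apache
# Refuse every web request for the .htaccess file itself
<Files .htaccess>
order allow,deny
deny from all
</Files>
```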
Magic Trick No. 1: Redirect to Files or Directories
You have just finished a major overhaul on your site, which unfortunately means you have renamed many pages that have already been indexed by search engines, and quite possibly linked to or bookmarked by users. You could use a redirect meta tag in the head of the old pages to bring users to the new ones, but some search engines may not follow the redirect and others frown upon it.
.htaccess leaps to the rescue!
Enter this line in your .htaccess file:
-==-
You can repeat that line for each file you need to redirect. Remember to include the directory name if the file is in a directory other than the root directory:
-==-
If you have just renamed a directory you can use just the directory name:
-==-
(Note: the above commands should each be on a single line. They may be wrapping here, but make sure they are on a single line when you copy them into your file.)
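To make the three cases concrete, here is a minimal sketch using Apache's Redirect directive. The file names, directory names and www.yoursite.com are hypothetical stand-ins for your own; the old location is given as a path from the site root and the new location as a full URL.

```apache
# A single file in the root directory: old path first, then the full new URL
Redirect /oldfile.html http://www.yoursite.com/newfile.html

# A file that lives inside a directory
Redirect /olddirectory/oldfile.html http://www.yoursite.com/newdirectory/newfile.html

# A renamed directory: everything beneath the old name is carried across
Redirect /olddirectory http://www.yoursite.com/newdirectory
```

Written this way the server sends a temporary (302) redirect. If the move is for good, writing Redirect permanent in place of Redirect sends a 301 instead, which tells search engines to update their index.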
This has the added advantage of preventing the increasing problem on the Internet, as people change their sites, of ‘link rot’. Now people who have linked to pages on your site will still have functioning links, even if the pages have changed location.
Magic Trick No. 2: Change the Default Directory Page
In most cases the default directory page is index.htm or index.html. Many servers allow a range of pages called index, with a variety of extensions, to be the default page.
Suppose though (for reasons of your own) you wish a page called honeybee.html or margarine.html to be a directory home page?
No problem. Just put the following line in your .htaccess file for that directory:
-==-
You can also use this command to specify alternatives. If the first filename listed does not exist the server will look for the next and so on. So you might have:
-==-
(Again, the above should all be on a single line)
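A minimal sketch of both forms, using the DirectoryIndex directive. The page names are simply the ones used as examples in this article; only one DirectoryIndex line should be active at a time, so the simpler form is shown commented out.

```apache
# Simplest form, a single default page:
# DirectoryIndex honeybee.html

# With alternatives, tried from left to right until one is found:
DirectoryIndex honeybee.html margarine.html index.html index.htm
```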
Magic Trick No. 3: Allow/Prevent Directory Browsing
Most servers are configured so that directory browsing is not allowed; that is, if people enter the URL of a directory that does not contain an index file, they will not see the contents of the directory but will instead get an error message. If your site is not configured this way you can prevent directory browsing by adding this simple line to your .htaccess file:
-==-
But there may be times when you want to allow browsing, perhaps to allow access to files for downloading or for whatever reason, on a server configured not to allow it. You can override the server's settings with this line:
-==-
Easy!
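For reference, the two settings can be sketched with Apache's Options directive as follows. Use one or the other in a given directory; the second is shown commented out. Whether either takes effect depends on your host allowing Options to be overridden from .htaccess.

```apache
# Prevent directory browsing: no file listing is shown when there is no index page
Options -Indexes

# Or, to allow browsing on a server that forbids it by default:
# Options +Indexes
```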
Magic Trick No. 4: Allow SSI in .html files
Most servers will only parse files ending in .shtml for Server Side Includes. You may not wish to use this extension, or you may wish to retain the .htm or .html extension used by files prior to your changing the site and using SSI for the first time.
Add the following to your .htaccess file:
-==-
You can add both extensions or just one.
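A sketch of what this usually looks like, assuming the server-parsed handler provided by Apache's mod_include. The Options line is an assumption on my part: it is only needed if your host has not already switched includes on for your directory.

```apache
# Allow Server Side Includes in this directory (may already be enabled by your host)
Options +Includes

# Make sure both extensions are served as HTML
AddType text/html .html .htm

# Have the server parse these extensions for includes; keep both lines or just one
AddHandler server-parsed .html
AddHandler server-parsed .htm
```

On Apache 2 servers the same result is more commonly achieved with AddOutputFilter INCLUDES .html in place of the AddHandler lines.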
Remember though that files which must be parsed by the server before being displayed will load more slowly than standard pages. If you change things as above, the server will parse all .html and .htm pages, even those that do not contain any includes. This can significantly, and unnecessarily, slow down the loading of pages without includes.
Magic Trick No. 5: Keep Unwanted Users Out
You can ban users by IP address or even ban an entire range of IP addresses. This is pretty drastic action, but if you don’t want them, it can be done very easily.
Add the following lines:
-==-
The second line bans the IP address 123.456.78.90; the third line bans every address beginning with 123.456.78 and so is much more drastic. The fourth line bans everyone from AOL. A somewhat excessive display of power perhaps! (The numbers are placeholders only; each part of a real IP address runs from 0 to 255.)
One thing to bear in mind here is that banned users will get a 403 error (“You do not have permission to access this site”), which is fine unless you have configured a custom 403 error page that in fact appears to let them in. So if you are banning users, for whatever reason, make sure your 403 error page is a dead end.
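A minimal sketch of such a block, assuming Apache 2.2-style access directives. The addresses here are taken from the 192.0.2.x range reserved for documentation and stand in for whatever addresses you really want to ban. Leaving aside the comment lines, the directives appear in the order just described: a single address, then a whole range, then a domain.

```apache
# Work through the allow rules first, then the deny rules; a matching deny wins
order allow,deny
# Ban one specific address
deny from 192.0.2.90
# Ban every address that begins with these three numbers
deny from 192.0.2
# Ban every visitor whose host name ends in aol.com
deny from .aol.com
# Everyone else is welcome
allow from all
```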
Magic Trick No. 6: Prevent Linking to Your Images
The greatest and most irritating bandwidth leech is having someone link directly to images on your site from their own pages. You can foil such thieves very easily with .htaccess. Copy the following into your .htaccess file:
-==-
You don’t need to understand any of that! Just change ‘domain.com’ to the name of your domain.
(Again each command should be on a single line. There are 4 lines above, each starting with ‘Rewrite’)
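A sketch of the usual four-line mod_rewrite recipe for this, with comments added. It assumes your host has mod_rewrite available, and domain.com is the placeholder to change to your own domain.

```apache
RewriteEngine on
# Let through requests that send no referrer at all (typed URLs, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Let through requests that come from pages on your own site
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/ [NC]
# Any other request for a .gif or .jpg file is refused with a 403 error
RewriteRule \.(gif|jpg)$ - [NC,F]
```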
If you want to really let them know they have been rumbled, why not make an image that tells them so, call it stealing.gif, save it to your images directory and add the following line after the code above:
-==-
(The above command should be on a single line)
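One way this might be written is sketched here as a complete block so that it stands on its own. Two things in it are my assumptions. First, the redirect rule takes the place of a rule that simply forbids the request, since a request that has already been refused never reaches a later rule. Second, an extra condition leaves stealing.gif itself alone so that the redirect cannot loop. The /images/ path and domain.com are placeholders.

```apache
RewriteEngine on
# Ignore requests with no referrer and requests from your own pages
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/ [NC]
# Never redirect the replacement image itself, or the request would go round in circles
RewriteCond %{REQUEST_URI} !/images/stealing\.gif$
# Send every other hotlinked image request to the replacement image
RewriteRule \.(gif|jpg)$ http://www.domain.com/images/stealing.gif [NC,R,L]
```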
Magic Trick No. 7: Stop the Email Collectors
While you positively want to encourage robot visitors from the search engines, there are other less benevolent robots you would prefer stayed away. Chief among these are those nasty ‘bots that crawl around the web sucking email addresses from web pages and adding them to spam mail lists.
-==-
Note that at the end of each line for a named robot there appears an ‘[OR]’ – don’t forget to include that if you add any others to this list.
This is by no means foolproof. Many of these sniffers do not identify themselves and it is almost impossible to create an exhaustive list of those that do. It’s worth a try though, even if it keeps only some of them away. The above is as many as I could find.
...and Finally
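To show the shape of such a list, here is a short sketch using mod_rewrite. The three robot names are only examples of harvesters that have announced themselves under those names; they are not a complete list. Every condition except the last ends in [OR], and the final rule refuses any request that matched.

```apache
RewriteEngine on
# Match on the name the robot gives in its User-Agent header
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
# The last condition in the list has no [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro
# Refuse any request that matched one of the names above
RewriteRule .* - [F]
```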
There is one very important area of the .htaccess file’s use that we have not really mentioned and that is its use for user authentication. It is perfectly possible to configure your .htaccess files by hand to control access to directories on your site, but this is rarely necessary.
In most cases your host will provide a much easier way to configure the file from your hosting control panel, and there are myriad Perl scripts that will let you set up full user management systems by harnessing the power of .htaccess.
If you do want to go it alone there is a tutorial here that will get you there: http://www.apacheweek.com/features/userauth
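To give a flavour of what going it alone involves, here is a minimal sketch of password protection for a single directory using Apache's basic authentication. The realm name and the path to the password file are hypothetical. The password file itself is normally created with Apache's htpasswd utility and is best kept outside your web directories.

```apache
# Ask the browser for a user name and password
AuthType Basic
# Text shown in the browser's login prompt
AuthName "Members Only"
# Full server path to the password file (hypothetical path)
AuthUserFile /home/yourname/.htpasswd
# Let in any user listed in that file who gives the right password
Require valid-user
```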
If you are looking for scripts there are many here: http://www.hotscripts.com/Perl/Scripts_and_Programs/Password_Protection/
Two scripts that I have used and can recommend are:
1. Locked Area
The free version will be adequate for many situations, though both versions will give you control over access to one directory and its contents only.
http://www.locked-area.com/html/
2. Password Manager
Allows you very sophisticated control over access to multiple directories. Not cheap but very good value for money.
http://www.cgi-world.com/password_manager.html
Have fun!
Article reprinted with permission from Thomas Brunt’s OutFront, a Microsoft FrontPage learning community
http://www.outfront.net
Katherine Nolan
OutFront Moderator