robots.txt is a file that you can use to tell web crawlers which parts of your site they may visit and which they should not.
How does it work?
A well-behaved web crawler first accesses the root of a domain and looks for a robots.txt file.
For example, if a robot wants to visit www.example.com/welcome.html, it will first check whether www.example.com/robots.txt exists.
Suppose it finds:
# No robots, Please
User-agent: *
Disallow: /
In the file above:
User-agent: * means this section applies to all robots, and
Disallow: / instructs the robot not to visit any pages on the site.
Note: It is important to know that robots can ignore your /robots.txt, and that robots.txt is a publicly available file.
The first point is especially important, since the robots that ignore the instructions are usually malicious.
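To see how a compliant crawler would read these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name ExampleBot and the helper is_allowed are made up for illustration, and the rules are fed in as text so that nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# The "no robots" rules described in the text above.
ROBOTS_TXT = """\
# No robots, Please
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no HTTP request
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler name.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/welcome.html"))
```

With Disallow: / in place the check comes back False for every page, but only a crawler that bothers to ask will honor it.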
What to put inside?
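robots.txt is a plain text file. Here are a few examples:
To allow all robots to visit all files:
User-agent: *
Disallow:
And the opposite, to keep all robots out:
User-agent: *
Disallow: /
If you need to keep a specific agent out of a specific folder, name the agent in the User-agent line and give the folder's path in a Disallow line: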
User-agent: SpecificBot # replace the 'SpecificBot' with the actual user-agent of the bot
The example above also shows how you can put comments in the file.
In addition, you can tell robots where your sitemap is located by adding a Sitemap line that gives the sitemap's full URL.
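The per-agent rule and the sitemap entry can be checked the same way. This is only a sketch: the folder /private/, the sitemap URL, the bot name OtherBot and the helper load_rules are placeholders chosen for illustration, not values from the original examples. The site_maps() method needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "/private/" and the sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: SpecificBot  # replace with the actual user-agent of the bot
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    rules = load_rules(ROBOTS_TXT)
    # SpecificBot is kept out of /private/ but may visit other pages.
    print(rules.can_fetch("SpecificBot", "http://www.example.com/private/page.html"))
    print(rules.can_fetch("SpecificBot", "http://www.example.com/index.html"))
    # Agents that are not named in the file are not restricted by it.
    print(rules.can_fetch("OtherBot", "http://www.example.com/private/page.html"))
    print(rules.site_maps())
```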
Where to put it?
The short answer: in the top-level directory of your web server.
A bit longer: it should come directly after your domain name, for example www.example.com/robots.txt, not www.example.com/robot_file/robots.txt.
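Put differently, the location of robots.txt depends only on the scheme and host of a page, never on its path. A small sketch of that rule follows; the helper name robots_url is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://www.example.com/robot_file/welcome.html"))
```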
Suppose you just got your new CentOS server.
After a while you notice that the timestamps in your log files are shifted by a few hours. So what could be wrong?
Actually, it is really simple: most probably your timezone is not correct. To check, run “date” from the command line; this shows the current time on your server.
So you find that the timezone is not correct.
How do you set the correct one?
Unfortunately, this is not an easy thing to figure out. Official documentation states that you can use system-config-date, but it has a bunch of dependencies (when I ran yum install system-config-date on one of my servers it asked to install 84 packages).
So is there an alternative way to do it?
All timezone files are located in /usr/share/zoneinfo. Select the appropriate named timezone for your location. For my location, Montreal, Canada, I have to select America/Montreal. For you it will probably be different, so make a note of the appropriate folder and file for your timezone.
The active timezone used on your system is in the /etc/localtime file.
The default varies from one server host to another and depends on the value that was provided during installation.
We simply need to replace this file with the file we selected in the previous step.
Even though I say replace, it is recommended to create a link to the pertinent file rather than actually replacing the file.
Here are the steps to follow:
First, back up the existing localtime file (it is always good practice to make backups of original config files).
mv /etc/localtime /etc/localtime.bak
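Next, create the link, appending the folder and file of your timezone (for example America/Montreal) to the zoneinfo path: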
ln -s /usr/share/zoneinfo/ /etc/localtime
Test your change.
Run “date” from the command line, and ensure that the appropriate time, date, and timezone are reported.
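Besides running date, you can confirm where the link points. The following is a rough sketch, assuming the layout described above (/etc/localtime linking into /usr/share/zoneinfo); the function name linked_zone is made up for illustration.

```python
import os


def linked_zone(localtime_path="/etc/localtime", zoneinfo_dir="/usr/share/zoneinfo"):
    """Return the zone name that localtime_path links to, or None.

    None means the path is not a symlink, or the link does not point
    at a file inside zoneinfo_dir (for example at the directory itself).
    """
    if not os.path.islink(localtime_path):
        return None
    target = os.path.realpath(localtime_path)
    base = os.path.realpath(zoneinfo_dir)
    if not target.startswith(base + os.sep):
        return None
    return os.path.relpath(target, base)


if __name__ == "__main__":
    # On a server set up as in the steps above, this prints the zone name.
    print(linked_zone())
```

A result of None after following the steps usually means the link target is missing the folder and file part of the timezone.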
The hosts file is a text file that maps IP addresses to hostnames; it was used for name resolution before DNS was in place. So you may ask: why edit it?
Simple: sometimes you need to map a specific host to a different IP address (for example, for testing).
In general, editing the hosts file is not recommended. Some viruses even use it to map popular antivirus sites to localhost and so deny access to them.
The file is located in %systemroot%\system32\drivers\etc\ (which on most computers translates to C:\windows\system32\drivers\etc\).
In Windows XP and earlier, if you are logged in as an administrator you can edit the file directly. Since most ordinary users log in as administrators, Microsoft added an extra layer of security in Windows Vista (and newer), so you can no longer edit the file directly.
Here is how to do it:
- In the Start menu, type Notepad
- Right-click Notepad and select Run as administrator
- Continue as usual: open the file, edit it, and save it
Note: Some antivirus products “protect” hosts file so you might need to disable that protection before editing.
Note: For newcomers, the format of each line in the file is as follows:
x.x.x.x FQDN
where x.x.x.x is the IP address in numeric form
and FQDN is the Fully Qualified Domain Name.
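To make the format concrete, here is a small sketch that reads hosts-style text into a dictionary. The sample entries and the function name parse_hosts are made up for illustration, and keeping the first address seen for a name is an assumption of this sketch rather than documented resolver behavior.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()  # one address, then one or more names
        for name in names:
            # Assumption: keep the first address seen for a given name.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical sample content; 192.0.2.10 is a documentation-only address.
SAMPLE = """\
# test mapping
127.0.0.1    localhost
192.0.2.10   www.example.com example.com   # staging server
"""

if __name__ == "__main__":
    print(parse_hosts(SAMPLE))
```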
If you use telnet to test and troubleshoot services, you may be “surprised” to find that telnet is not installed by default:
'telnet' is not recognized as an internal or external command,
operable program or batch file.
To enable it again, follow this procedure:
- Go to Start -> Control Panel -> Programs -> Turn Windows Features On and Off
- Check Telnet Client and click OK
- After a short while the Telnet client is ready to use
When testing an e-mail server, one tool that has proven useful is plain old telnet.
This is no surprise, since SMTP is itself a plain-text protocol in which commands are sent one line at a time.
Here is how to use it:
1. Start from the command prompt:
telnet mailhost 25
Note: Replace mailhost with your e-mail server.
2. Greet the server:
HELO server.com
Note 1: Depending on the server, HELO may have to be replaced with EHLO.
Note 2: Replace server.com with your domain.
3. Specify the sender:
MAIL FROM: email@example.com
Note: Again, replace email@example.com with your e-mail address.
4. Specify the recipient:
RCPT TO: firstname.lastname@example.org
Note: Replace firstname.lastname@example.org with the recipient's e-mail address.
5. Start the message:
DATA
After that, optionally:
SUBJECT: Your subject
Then leave a blank line and type your message.
To finish, place a single dot on a new line.
6. To exit, type:
QUIT
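The whole telnet session can also be written out as an ordered list of lines, which helps when you want to script or document a test. This is a sketch only: the function name smtp_dialogue is made up, the addresses are the placeholders from the steps above, and the angle brackets around the addresses follow the stricter syntax of the SMTP specification rather than the looser form typed by hand.

```python
def smtp_dialogue(domain, sender, recipient, subject, body):
    """Return the lines to send, in order, once connected to port 25."""
    lines = [
        f"HELO {domain}",           # some servers expect EHLO instead
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        f"Subject: {subject}",
        "",                         # blank line separates headers from the body
    ]
    for line in body.splitlines():
        # A body line starting with a dot gets an extra dot, so it is
        # not mistaken for the end-of-message marker.
        lines.append("." + line if line.startswith(".") else line)
    lines += [".", "QUIT"]          # lone dot ends the message, QUIT closes
    return lines


if __name__ == "__main__":
    for line in smtp_dialogue(
        "server.com",
        "email@example.com",
        "firstname.lastname@example.org",
        "Your subject",
        "This is a test message.",
    ):
        print(line)
```

For an actual scripted test, Python's standard smtplib module can send the same commands and handle the server replies for you.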