Block crawlers from seeing files with certain suffixes, e.g. .php

0 votes
34 views

Is there a way to block .php files from being indexed by crawlers, but allow other file types to be indexed? When the crawlers access the .php files, they are executed, creating lots of error messages (and taking up CPU cycles).

posted Jul 12, 2013 by anonymous

1 Answer

–1 vote

Google for "robots.txt" — a robots.txt file at the site root lets you tell well-behaved crawlers which paths they should not fetch.

answer Jul 12, 2013 by anonymous
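
For reference, robots.txt is a plain text file served from the site root that tells compliant crawlers which paths not to request. A minimal sketch for this case (the * wildcard and the $ end-of-URL anchor are extensions honored by the major search engine crawlers, not part of the original standard; adjust the pattern to your site's layout):

 User-agent: *
 Disallow: /*.php$

Note that this only stops well-behaved crawlers from requesting the .php URLs, which is what saves the CPU cycles; it does not force already-indexed URLs out of an index, and bots that ignore robots.txt will still hit the files.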
Similar Questions
+1 vote

I'm writing my first PHP extension and I need to list the files included by the PHP script from the RINIT function, but I cannot figure out how.

I dug into the PHP source code and I think it's related to EG(included_files), but I can't work out how to access the list.

PHP_RINIT_FUNCTION(extname)
{
 // SAPI NAME AND PHP SCRIPT FILE HANDLE PATH
 char *pt_var_sapi_name = sapi_module.name;
 char *pt_var_file_handle_path = SG(request_info).path_translated;

 // HOW CAN I USE EG(included_files) to get the list of included files?

 return SUCCESS;
}
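
For context, EG(included_files) is a HashTable in the executor globals whose string keys are the full paths of every file pulled in via include/require. A PHP 5-era sketch of walking it, assuming the usual extension skeleton; note the table is populated while the script runs, so it is still empty at RINIT time, and a later hook such as RSHUTDOWN sees the complete list:

PHP_RSHUTDOWN_FUNCTION(extname)
{
 // EG(included_files): HashTable whose string keys are included file paths
 HashPosition pos;
 char *path;
 uint path_len;
 ulong num_key;

 zend_hash_internal_pointer_reset_ex(&EG(included_files), &pos);
 while (zend_hash_get_current_key_ex(&EG(included_files), &path, &path_len,
                                     &num_key, 0, &pos) == HASH_KEY_IS_STRING) {
  // print each included file path to the output stream
  php_printf("included: %s\n", path);
  zend_hash_move_forward_ex(&EG(included_files), &pos);
 }

 return SUCCESS;
}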
0 votes

I am writing a crawler in Python which crawls Quora. I can't read the content of Quora without logging in, yet Google/Bing crawl Quora. One thing I can do is use browser automation to log in to my account and then go link by link and crawl the content, but this method is slow. Can anyone tell me how I should start writing this crawler?

