Web Crawler Bot Receives 403 Error When Accessing Robots.txt
Symptoms
Requests from a crawler bot user-agent return a 403 error, even though robots.txt appears to contain a valid rule allowing requests from that user agent.
Cause
The URL requested by the crawler bot must exactly match the website's base URL configured in Site Service. A request URL that triggers a redirect will result in a 403.
Solution
Verify that the base URL of the crawler's requests exactly matches the URL configured in Arc XP > Sites > [Your Website] > base path.
For example, if your base path is https://www.yoursite.com, the request must be for https://www.yoursite.com/robots.txt. Omitting https://www. and requesting yoursite.com/robots.txt will result in a 403 error.
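The check above can be sketched as a small script that compares a crawler's request URL against the configured base path. This is a minimal illustration, not part of Arc XP itself: the matches_base helper and the example base path are hypothetical, and a real deployment would read the base path from the Site Service configuration.

```python
from urllib.parse import urlsplit

def matches_base(request_url: str, base_url: str) -> bool:
    """Return True when the request URL's scheme and host exactly match
    the configured base path, i.e. no redirect would be triggered."""
    req, base = urlsplit(request_url), urlsplit(base_url)
    return (req.scheme, req.netloc) == (base.scheme, base.netloc)

# Hypothetical base path as configured under
# Arc XP > Sites > [Your Website] > base path.
BASE = "https://www.yoursite.com"

# Exact match: the crawler would receive the robots.txt contents.
print(matches_base("https://www.yoursite.com/robots.txt", BASE))

# Host differs (missing www.), so a redirect would be needed -> 403.
print(matches_base("https://yoursite.com/robots.txt", BASE))
```

Running a comparison like this against the URLs in your crawler's logs is a quick way to spot requests that would redirect and therefore receive a 403.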