Ahh ok, thanks.
I actually had the names of a bunch of bots in there, so wouldn't I need the parentheses?
i.e.:
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider|"AhrefsBot/6.1"|"Ahrefs"|"Baiduspider"|"BLEXBot"|"SemrushBot"|"claudebot"|"YandexBot/3.0"|Bytespider) [NC]

Regards,
-Montana


On Tue, Apr 22, 2025 at 2:56 PM Gilbert Detillieux <Gilbert.Detillieux@umanitoba.ca> wrote:
I think Adam is suggesting to use a regex in the RewriteCond, to avoid
the problematic characters in the pattern...

https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritecond

... states that "CondPattern is usually a perl compatible regular
expression, but there is additional syntax available to perform other
useful tests against the Teststring:".

So, something like this might work...

RewriteCond %{HTTP_USER_AGENT} "Unknown robot identified by bot.." [NC]
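
A quick way to sanity-check the "." idea outside Apache is a small Python sketch. This is only an approximation: Python's re module stands in for the PCRE engine that mod_rewrite uses, re.IGNORECASE stands in for [NC], and the user_agent value is an assumption copied from the AWStats label quoted in this thread, not a header captured from real traffic.

```python
import re

# Hypothetical User-Agent value, copied from the AWStats label in this thread.
user_agent = r"Unknown robot identified by bot\*"

# Each "." matches any single character, so the backslash and the asterisk
# never have to appear (or be escaped) in the pattern itself.
pattern = re.compile(r"Unknown robot identified by bot..", re.IGNORECASE)

# search() looks for the pattern anywhere in the string (the pattern is unanchored).
matched = pattern.search(user_agent) is not None
other = pattern.search("Mozilla/5.0 (compatible; examplebot/1.0)") is not None

print(matched, other)
```

If the real header differs from the AWStats label, the pattern would need to be adjusted to whatever shows up in the access log.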

BTW, I don't think you want parentheses around the string, as that's
probably not supported syntax.  (Parentheses within the string will have
the usual PCRE syntax and semantics.)
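
To illustrate the point about parentheses inside the pattern, here is a rough Python sketch, again using the re module as a stand-in for PCRE. The bot names and the sample User-Agent strings are illustrative assumptions only. At the top level of a regex, "|" alternation behaves the same with or without an enclosing group; the group only matters when something else precedes or follows it.

```python
import re

# Shortened, hypothetical multi-bot patterns; re.IGNORECASE stands in for [NC].
bare = re.compile(r"googlebot|bingbot|Bytespider", re.IGNORECASE)
grouped = re.compile(r"(googlebot|bingbot|Bytespider)", re.IGNORECASE)

# Sample User-Agent strings, made up for this comparison.
agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0",
]

# Both patterns should flag the same agents.
bare_hits = [bool(bare.search(a)) for a in agents]
grouped_hits = [bool(grouped.search(a)) for a in agents]

print(bare_hits, grouped_hits)
```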

Hope this helps.

Gilbert

On 2025-04-22 2:05 p.m., Montana Quiring wrote:
> Sorry man, excuse my ignorance, but not sure what you are asking.
> I got the bot name from AWstats, which I assume is just ASCII.
>
> Regards,
> -Montana
>
>
> On Tue, Apr 22, 2025 at 1:58 PM Adam Thompson <athompso@athompso.net> wrote:
>
>     Urlencode or octal?  Or if it's a regex just use ".".
>     -Adam
>
>     ------------------------------------------------------------------------
>     *From:* Montana Quiring <montanaq@gmail.com>
>     *Sent:* Tuesday, April 22, 2025 1:47:31 PM
>     *To:* Continuation of Round Table discussion <roundtable@muug.ca>
>     *Subject:* [RndTbl] .htaccess file: stopping robot with escape
>     character in name
>     Hello Folks,
>
>     I'm trying to stop a bot from crawling a site using the .htaccess
>     file. The problem is that it's using the backslash character as its
>     name. Grrr...
>     It's called: Unknown robot identified by bot\*
>     This generates an internal server error:
>     RewriteCond %{HTTP_USER_AGENT} ("Unknown robot identified by bot\*") [NC]
>     I tried this, but it didn't help:
>     RewriteCond %{HTTP_USER_AGENT} ("Unknown robot identified by bot\\*") [NC]
>
>     Any thoughts?
>
>     Regards,
>     -Montana

--
Gilbert E. Detillieux    E-mail: <gedetil@muug.ca>
Manitoba UNIX User Group   Web:    http://muug.ca/