The problem with 's/.*\([[:digit:]]*\).*/\1/g' is that the first .* will swallow up as many characters as it can while still letting the rest of the expression match something. Because * means zero or more of the preceding item, the [[:digit:]]* and the trailing .* will happily match nothing at all, so the initial .* still swallows everything and \1 ends up empty. The fix is to make the first part more restrictive than .*, e.g. [^0-9]* or [^[:digit:]]*, so it won't chew up your digits, but then Sean's RE is even simpler, so long as you want all the digits and it doesn't matter where they are. If you needed to extract only the first contiguous run out of possibly several runs of digits, though, you'd need to get more elaborate.
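To see the difference side by side, here is a small sketch. The variable names and the second input string (REL12-AM005) are made up for illustration; only BUILD-AM005-a comes from Steve's message.

```shell
#!/bin/sh
input='BUILD-AM005-a'

# Greedy prefix: the leading .* takes the whole line, so the group matches
# the empty string and the line is replaced by nothing.
greedy=$(echo "$input" | sed 's/.*\([[:digit:]]*\).*/\1/')

# Restrictive prefix: [^0-9]* has to stop at the first digit, so the group
# picks up the digits that follow.
fixed=$(echo "$input" | sed 's/[^0-9]*\([0-9]*\).*/\1/')

# Hypothetical input with two runs of digits: the anchored-prefix form keeps
# only the first run, while deleting every non-digit keeps them all.
first_run=$(echo 'REL12-AM005' | sed 's/[^0-9]*\([0-9]*\).*/\1/')
all_digits=$(echo 'REL12-AM005' | sed 's/[^0-9]//g')

echo "greedy:     '$greedy'"
echo "fixed:      '$fixed'"
echo "first_run:  '$first_run'"
echo "all_digits: '$all_digits'"
```

On a string with a single run of digits, the restrictive-prefix form and Sean's delete-the-non-digits form give the same answer; they only differ when digits show up in more than one place.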
An equivalent to Sean's command would be:
echo BUILD-AM005-a | tr -dc '0-9'
This would chew up the newline character as well, but that doesn't matter if you're going to capture the result in a variable using var=`...` or var=$(...), since command substitution strips trailing newlines anyway.
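A quick way to check the newline point, as a sketch (the variable names are just for illustration):

```shell
#!/bin/sh
# Capture the digits in a variable; the missing newline makes no difference here.
num=$(echo BUILD-AM005-a | tr -dc '0-9')
echo "$num"

# Count the bytes each command emits. The $(( )) wrapper only normalises
# the padding that some wc implementations put in front of the number.
tr_bytes=$(( $(echo BUILD-AM005-a | tr -dc '0-9' | wc -c) ))
sed_bytes=$(( $(echo BUILD-AM005-a | sed 's/[^0-9]//g' | wc -c) ))
echo "tr: $tr_bytes bytes, sed: $sed_bytes bytes"
```

The tr version emits just the three digits, while the sed version emits the digits plus the newline; once captured with $(...), both leave the same value in the variable.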
Gilles
On 05/09/2007 04:14 PM, Steve Moffat wrote:
Well, ya... I guess I did the equivalent (though not so concise) method after sending the first email to roundtable...
echo APP-AM005-a | sed 's/[[:alpha:]]//g;s/[[:punct:]]//g'
I like the search inversion though Sean. Much cleaner!
So the problem I have is solved, thanks Sean. But why won't my original method work? The [[:digit:]]* should have matched all the consecutive digits shouldn't it? And then the ( ) brackets should place the match into buffer 1.
Steve
IBM Global Services sjm@ca.ibm.com (204)792-3245
----- Forwarded by Steve Moffat/CanWest/IBM on 05/09/2007 04:08 PM -----
"Sean Walberg" <sean@ertw.com>
Sent by: swalberg@gmail.com
05/09/2007 04:05 PM
To: Steve Moffat/CanWest/IBM@IBMCA
cc: roundtable@muug.mb.ca
Subject: Re: [RndTbl] Oh great RE master
# echo BUILD-AM005-a | sed 's/[^0-9]//g'
005
Sean
On 5/9/07, Steve Moffat <Steve.Moffat@ca.ibm.com> wrote:
Hi All;

I've been trying to write a sed function to return only a numeric portion of a string, but can't seem to get it working. The input is a single string of letters and numbers, with the numbers always consecutive. For example:

BUILD-AM005-a

I want to get the 005 out of this string.

echo BUILD-AM005-a | sed 's/.*\([[:digit:]]\).*/\1/g'

will return the digit 5. This is good! So I add an asterisk to try to match multiple digits like:

echo BUILD-AM005-a | sed 's/.*\([[:digit:]]*\).*/\1/g'

and instead of returning 005, it doesn't match anything, so returns nothing. Can any of you RE masters help me out?

Steve Moffat
IBM Global Services
sjm@ca.ibm.com
(204)792-3245