What does the following number-processing sed code mean?A challenge for sed convert code from Mathematica to Matlabhow to modify matches to a regular expression with sed or another tool?sed insert in the beginning of multiple files is not workingExplanation for 'sed'How to use appropriate regex to find a pattern in awk?Bash numeric sort gives different results when columns are selected simultaneously vs. togethersed code understanding for text processingRemove a combination of numbers and symbols from a string using the $VARNAME//pattern/ waychange only part of the substring using sedWhen sed is used in an expect command, what is the proper way to escape the backslashes?
Did Dumbledore lie to Harry about how long he had James Potter's invisibility cloak when he was examining it? If so, why?
How easy is it to start Magic from scratch?
Hostile work environment after whistle-blowing on coworker and our boss. What do I do?
How do scammers retract money, while you can’t?
How do we know the LHC results are robust?
Increase performance creating Mandelbrot set in python
Different result between scanning in Epson's "color negative film" mode and scanning in positive -> invert curve in post?
Class Action - which options I have?
How do I extract a value from a time formatted value in excel?
A particular customize with green line and letters for subfloat
Integer addition + constant, is it a group?
Sort a list by elements of another list
Lay out the Carpet
Is this apparent Class Action settlement a spam message?
Energy of the particles in the particle accelerator
Is oxalic acid dihydrate considered a primary acid standard in analytical chemistry?
Is a stroke of luck acceptable after a series of unfavorable events?
Term for the "extreme-extension" version of a straw man fallacy?
Do the temporary hit points from the Battlerager barbarian's Reckless Abandon stack if I make multiple attacks on my turn?
How does it work when somebody invests in my business?
Trouble understanding the speech of overseas colleagues
Purchasing a ticket for someone else in another country?
Opposite of a diet
Avoiding estate tax by giving multiple gifts
What does the following number-processing sed code mean?
A challenge for sed convert code from Mathematica to Matlabhow to modify matches to a regular expression with sed or another tool?sed insert in the beginning of multiple files is not workingExplanation for 'sed'How to use appropriate regex to find a pattern in awk?Bash numeric sort gives different results when columns are selected simultaneously vs. togethersed code understanding for text processingRemove a combination of numbers and symbols from a string using the $VARNAME//pattern/ waychange only part of the substring using sedWhen sed is used in an expect command, what is the proper way to escape the backslashes?
I came across a code in script which add ',' after every 3 digits from backwards. the code only considers numeric data.
Following is the code.
sed 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' number.txt
number.txt
1234
12345
123456
output
1,234
12,345
123,456
1234,567
Can anyone explain the code flow..
sed regular-expression numeric-data
New contributor
add a comment |
I came across a code in script which add ',' after every 3 digits from backwards. the code only considers numeric data.
Following is the code.
sed 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' number.txt
number.txt
1234
12345
123456
output
1,234
12,345
123,456
1234,567
Can anyone explain the code flow..
sed regular-expression numeric-data
New contributor
1
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday
add a comment |
I came across a code in script which add ',' after every 3 digits from backwards. the code only considers numeric data.
Following is the code.
sed 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' number.txt
number.txt
1234
12345
123456
output
1,234
12,345
123,456
1234,567
Can anyone explain the code flow..
sed regular-expression numeric-data
New contributor
I came across a code in script which add ',' after every 3 digits from backwards. the code only considers numeric data.
Following is the code.
sed 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' number.txt
number.txt
1234
12345
123456
output
1,234
12,345
123,456
1234,567
Can anyone explain the code flow..
sed regular-expression numeric-data
sed regular-expression numeric-data
New contributor
New contributor
edited yesterday
Jeff Schaller♦
44k1161142
44k1161142
New contributor
asked yesterday
Dhaval kaleDhaval kale
1
1
New contributor
New contributor
1
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday
add a comment |
1
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday
1
1
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday
add a comment |
3 Answers
3
active
oldest
votes
sed (the stream editor) can operate in search and replace mode using regular expressions. There is a bit of sed-specific escaping going on, but for the regex itself, you can confer the explanation tool from regexr:
(
Capturing group #1. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
^
Beginning. Matches the beginning of the
string.
|
Alternation. Acts like a boolean OR. Matches
the expression before or after the |
.
[^
Negated set. Match any character that is not
in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
.
Character. Matches a "." character.
]
)
(
Capturing group #2. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
+
Quantifier. Match 1 or more of the preceding
token.
)
(
Capturing group #3. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
3
Quantifier. Match 3 of the preceding token.
)
1
... "0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; ByCase sensitive
, you mean English digits? or?
– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
add a comment |
The pattern captures (1) the start of line or something that is not a digit nor a dot, followed by (2) any number of digits, at least one, followed by (3) exactly three digits.
It then puts them back with a comma between (2) and (3), in effect adding one thousands separator. The first group is only required to avoid touching fractional parts after a decimal point, since we don't want 1.2345
to turn into 1.2,345
.
Note that the pattern is written in basic regular expressions (BRE), requiring backslashes in front of each ()
and to make them special. Moreover, it requires GNU sed, where
+
and |
also have special meanings in BRE as an extension. The command would be better written as an extended regular expression (sed -E
is supported by many sed implementations):
sed -E 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g'
Also, the pattern only does one replacement, it does not add multiple thousand separators to the same number. The /g
at the end will match multiple times on the same line, but it doesn't process already replaced data. 1234567
will become 1234,567
, not 1,234,567
. To fix that, we'll need to add a loop:
sed -E -e :a -e 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' -e ta
Here, :a
is just a label, and the final ta
tests for a successful replacement, and jumps back to a
if a replacement was done, in effect repeating the process as many times as it does something. Hence 1234567
will turn into 1,234,567
.
add a comment |
sed 's/foo/bar/g' number.txt
: This reads the filenumber.txt
, and replaces the regex patternfoo
withbar
. This will occur for all matches on each line (/g
).(^|[^0-9.])([0-9]+)([0-9]3)
: This is the pattern to replace. Each part in the escaped parentheses(…)
is a "capturing group". The pattern inside is "captured" for later use.(^|[^0-9.])
: Find the beginning of the line^
or|
a character that is not a numeral or period[^0-9.]
. This essentially finds the character preceding a number.([0-9]+)
: Find one or more numerals[0-9]+
.([0-9]3)
: Find 3 numerals[0-9]3
.
12,3
: replace the above matches with the first two capturing groups followed by,
then the last capturing group. In other words, insert,
between the second and third patterns.
Because sed
is "greedy", it will try to maximise the length of the match. Hence the final capturing group will be the last three numerals in the number.
N.B. many of the "special" characters are escaped with , e.g.
(…)
and 3
. If your sed
supported "extended regular expressions" with -E
or -r
, then you would not have to escape these. This would improve readability.
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "106"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Dhaval kale is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f508697%2fwhat-does-the-following-number-processing-sed-code-mean%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
sed (the stream editor) can operate in search and replace mode using regular expressions. There is a bit of sed-specific escaping going on, but for the regex itself, you can confer the explanation tool from regexr:
(
Capturing group #1. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
^
Beginning. Matches the beginning of the
string.
|
Alternation. Acts like a boolean OR. Matches
the expression before or after the |
.
[^
Negated set. Match any character that is not
in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
.
Character. Matches a "." character.
]
)
(
Capturing group #2. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
+
Quantifier. Match 1 or more of the preceding
token.
)
(
Capturing group #3. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
3
Quantifier. Match 3 of the preceding token.
)
1
... "0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; ByCase sensitive
, you mean English digits? or?
– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
add a comment |
sed (the stream editor) can operate in search and replace mode using regular expressions. There is a bit of sed-specific escaping going on, but for the regex itself, you can confer the explanation tool from regexr:
(
Capturing group #1. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
^
Beginning. Matches the beginning of the
string.
|
Alternation. Acts like a boolean OR. Matches
the expression before or after the |
.
[^
Negated set. Match any character that is not
in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
.
Character. Matches a "." character.
]
)
(
Capturing group #2. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
+
Quantifier. Match 1 or more of the preceding
token.
)
(
Capturing group #3. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
3
Quantifier. Match 3 of the preceding token.
)
1
... "0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; ByCase sensitive
, you mean English digits? or?
– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
add a comment |
sed (the stream editor) can operate in search and replace mode using regular expressions. There is a bit of sed-specific escaping going on, but for the regex itself, you can confer the explanation tool from regexr:
(
Capturing group #1. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
^
Beginning. Matches the beginning of the
string.
|
Alternation. Acts like a boolean OR. Matches
the expression before or after the |
.
[^
Negated set. Match any character that is not
in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
.
Character. Matches a "." character.
]
)
(
Capturing group #2. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
+
Quantifier. Match 1 or more of the preceding
token.
)
(
Capturing group #3. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
3
Quantifier. Match 3 of the preceding token.
)
sed (the stream editor) can operate in search and replace mode using regular expressions. There is a bit of sed-specific escaping going on, but for the regex itself, you can confer the explanation tool from regexr:
(
Capturing group #1. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
^
Beginning. Matches the beginning of the
string.
|
Alternation. Acts like a boolean OR. Matches
the expression before or after the |
.
[^
Negated set. Match any character that is not
in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
.
Character. Matches a "." character.
]
)
(
Capturing group #2. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
+
Quantifier. Match 1 or more of the preceding
token.
)
(
Capturing group #3. Groups multiple tokens
together and creates a capture group for extracting a substring or using
a backreference.
[
Character set. Match any character in the set.
0-9
Range. Matches a character in the range "0"
to "9". Case sensitive.
]
3
Quantifier. Match 3 of the preceding token.
)
edited yesterday
Jeff Schaller♦
44k1161142
44k1161142
answered yesterday
HermannHermann
957515
957515
1
... "0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; ByCase sensitive
, you mean English digits? or?
– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
add a comment |
1
... "0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; ByCase sensitive
, you mean English digits? or?
– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
1
1
... "
0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; By Case sensitive
, you mean English digits? or?– αғsнιη
yesterday
... "
0-9 Range. Matches a character in the range "0" to "9". Case sensitive
"; By Case sensitive
, you mean English digits? or?– αғsнιη
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
No, case sensitive means big and small letters make a difference. Regular expressions are usually case sensitive (unless configured otherwise e.g. an i switch). In this particular case, the character set only contains digits. They have big and small variants so the case sensitivity does not actually matter. This is the output as given by regexr – feel free to try it. The explanation tool is quite interactive.
– Hermann
yesterday
add a comment |
The pattern captures (1) the start of line or something that is not a digit nor a dot, followed by (2) any number of digits, at least one, followed by (3) exactly three digits.
It then puts them back with a comma between (2) and (3), in effect adding one thousands separator. The first group is only required to avoid touching fractional parts after a decimal point, since we don't want 1.2345
to turn into 1.2,345
.
Note that the pattern is written in basic regular expressions (BRE), requiring backslashes in front of each ()
and to make them special. Moreover, it requires GNU sed, where
+
and |
also have special meanings in BRE as an extension. The command would be better written as an extended regular expression (sed -E
is supported by many sed implementations):
sed -E 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g'
Also, the pattern only does one replacement, it does not add multiple thousand separators to the same number. The /g
at the end will match multiple times on the same line, but it doesn't process already replaced data. 1234567
will become 1234,567
, not 1,234,567
. To fix that, we'll need to add a loop:
sed -E -e :a -e 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' -e ta
Here, :a
is just a label, and the final ta
tests for a successful replacement, and jumps back to a
if a replacement was done, in effect repeating the process as many times as it does something. Hence 1234567
will turn into 1,234,567
.
add a comment |
The pattern captures (1) the start of line or something that is not a digit nor a dot, followed by (2) any number of digits, at least one, followed by (3) exactly three digits.
It then puts them back with a comma between (2) and (3), in effect adding one thousands separator. The first group is only required to avoid touching fractional parts after a decimal point, since we don't want 1.2345
to turn into 1.2,345
.
Note that the pattern is written in basic regular expressions (BRE), requiring backslashes in front of each ()
and to make them special. Moreover, it requires GNU sed, where
+
and |
also have special meanings in BRE as an extension. The command would be better written as an extended regular expression (sed -E
is supported by many sed implementations):
sed -E 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g'
Also, the pattern only does one replacement, it does not add multiple thousand separators to the same number. The /g
at the end will match multiple times on the same line, but it doesn't process already replaced data. 1234567
will become 1234,567
, not 1,234,567
. To fix that, we'll need to add a loop:
sed -E -e :a -e 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' -e ta
Here, :a
is just a label, and the final ta
tests for a successful replacement, and jumps back to a
if a replacement was done, in effect repeating the process as many times as it does something. Hence 1234567
will turn into 1,234,567
.
add a comment |
The pattern captures (1) the start of line or something that is not a digit nor a dot, followed by (2) any number of digits, at least one, followed by (3) exactly three digits.
It then puts them back with a comma between (2) and (3), in effect adding one thousands separator. The first group is only required to avoid touching fractional parts after a decimal point, since we don't want 1.2345
to turn into 1.2,345
.
Note that the pattern is written in basic regular expressions (BRE), requiring backslashes in front of each ()
and to make them special. Moreover, it requires GNU sed, where
+
and |
also have special meanings in BRE as an extension. The command would be better written as an extended regular expression (sed -E
is supported by many sed implementations):
sed -E 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g'
Also, the pattern only does one replacement, it does not add multiple thousand separators to the same number. The /g
at the end will match multiple times on the same line, but it doesn't process already replaced data. 1234567
will become 1234,567
, not 1,234,567
. To fix that, we'll need to add a loop:
sed -E -e :a -e 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' -e ta
Here, :a
is just a label, and the final ta
tests for a successful replacement, and jumps back to a
if a replacement was done, in effect repeating the process as many times as it does something. Hence 1234567
will turn into 1,234,567
.
The pattern captures (1) the start of line or something that is not a digit nor a dot, followed by (2) any number of digits, at least one, followed by (3) exactly three digits.
It then puts them back with a comma between (2) and (3), in effect adding one thousands separator. The first group is only required to avoid touching fractional parts after a decimal point, since we don't want 1.2345
to turn into 1.2,345
.
Note that the pattern is written in basic regular expressions (BRE), requiring backslashes in front of each ()
and to make them special. Moreover, it requires GNU sed, where
+
and |
also have special meanings in BRE as an extension. The command would be better written as an extended regular expression (sed -E
is supported by many sed implementations):
sed -E 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g'
Also, the pattern only does one replacement, it does not add multiple thousand separators to the same number. The /g
at the end will match multiple times on the same line, but it doesn't process already replaced data. 1234567
will become 1234,567
, not 1,234,567
. To fix that, we'll need to add a loop:
sed -E -e :a -e 's/(^|[^0-9.])([0-9]+)([0-9]3)/12,3/g' -e ta
Here, :a
is just a label, and the final ta
tests for a successful replacement, and jumps back to a
if a replacement was done, in effect repeating the process as many times as it does something. Hence 1234567
will turn into 1,234,567
.
answered yesterday
ilkkachuilkkachu
62.8k10103180
62.8k10103180
add a comment |
add a comment |
sed 's/foo/bar/g' number.txt
: This reads the filenumber.txt
, and replaces the regex patternfoo
withbar
. This will occur for all matches on each line (/g
).(^|[^0-9.])([0-9]+)([0-9]3)
: This is the pattern to replace. Each part in the escaped parentheses(…)
is a "capturing group". The pattern inside is "captured" for later use.(^|[^0-9.])
: Find the beginning of the line^
or|
a character that is not a numeral or period[^0-9.]
. This essentially finds the character preceding a number.([0-9]+)
: Find one or more numerals[0-9]+
.([0-9]3)
: Find 3 numerals[0-9]3
.
12,3
: replace the above matches with the first two capturing groups followed by,
then the last capturing group. In other words, insert,
between the second and third patterns.
Because sed
is "greedy", it will try to maximise the length of the match. Hence the final capturing group will be the last three numerals in the number.
N.B. many of the "special" characters are escaped with , e.g.
(…)
and 3
. If your sed
supported "extended regular expressions" with -E
or -r
, then you would not have to escape these. This would improve readability.
add a comment |
sed 's/foo/bar/g' number.txt
: This reads the filenumber.txt
, and replaces the regex patternfoo
withbar
. This will occur for all matches on each line (/g
).(^|[^0-9.])([0-9]+)([0-9]3)
: This is the pattern to replace. Each part in the escaped parentheses(…)
is a "capturing group". The pattern inside is "captured" for later use.(^|[^0-9.])
: Find the beginning of the line^
or|
a character that is not a numeral or period[^0-9.]
. This essentially finds the character preceding a number.([0-9]+)
: Find one or more numerals[0-9]+
.([0-9]3)
: Find 3 numerals[0-9]3
.
12,3
: replace the above matches with the first two capturing groups followed by,
then the last capturing group. In other words, insert,
between the second and third patterns.
Because sed
is "greedy", it will try to maximise the length of the match. Hence the final capturing group will be the last three numerals in the number.
N.B. many of the "special" characters are escaped with , e.g.
(…)
and 3
. If your sed
supported "extended regular expressions" with -E
or -r
, then you would not have to escape these. This would improve readability.
add a comment |
sed 's/foo/bar/g' number.txt
: This reads the filenumber.txt
, and replaces the regex patternfoo
withbar
. This will occur for all matches on each line (/g
).(^|[^0-9.])([0-9]+)([0-9]3)
: This is the pattern to replace. Each part in the escaped parentheses(…)
is a "capturing group". The pattern inside is "captured" for later use.(^|[^0-9.])
: Find the beginning of the line^
or|
a character that is not a numeral or period[^0-9.]
. This essentially finds the character preceding a number.([0-9]+)
: Find one or more numerals[0-9]+
.([0-9]3)
: Find 3 numerals[0-9]3
.
12,3
: replace the above matches with the first two capturing groups followed by,
then the last capturing group. In other words, insert,
between the second and third patterns.
Because sed
is "greedy", it will try to maximise the length of the match. Hence the final capturing group will be the last three numerals in the number.
N.B. many of the "special" characters are escaped with , e.g.
(…)
and 3
. If your sed
supported "extended regular expressions" with -E
or -r
, then you would not have to escape these. This would improve readability.
sed 's/foo/bar/g' number.txt
: This reads the filenumber.txt
, and replaces the regex patternfoo
withbar
. This will occur for all matches on each line (/g
).(^|[^0-9.])([0-9]+)([0-9]3)
: This is the pattern to replace. Each part in the escaped parentheses(…)
is a "capturing group". The pattern inside is "captured" for later use.(^|[^0-9.])
: Find the beginning of the line^
or|
a character that is not a numeral or period[^0-9.]
. This essentially finds the character preceding a number.([0-9]+)
: Find one or more numerals[0-9]+
.([0-9]3)
: Find 3 numerals[0-9]3
.
12,3
: replace the above matches with the first two capturing groups followed by,
then the last capturing group. In other words, insert,
between the second and third patterns.
Because sed
is "greedy", it will try to maximise the length of the match. Hence the final capturing group will be the last three numerals in the number.
N.B. many of the "special" characters are escaped with , e.g.
(…)
and 3
. If your sed
supported "extended regular expressions" with -E
or -r
, then you would not have to escape these. This would improve readability.
edited yesterday
answered yesterday
SparhawkSparhawk
10.3k744101
10.3k744101
add a comment |
add a comment |
Dhaval kale is a new contributor. Be nice, and check out our Code of Conduct.
Dhaval kale is a new contributor. Be nice, and check out our Code of Conduct.
Dhaval kale is a new contributor. Be nice, and check out our Code of Conduct.
Dhaval kale is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Unix & Linux Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f508697%2fwhat-does-the-following-number-processing-sed-code-mean%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
You appear to know what the code does; there's no branching, and only one command. Where does your misunderstanding come in?
– Jeff Schaller♦
yesterday