Using conditional awk statement to control regex pattern
I have code that parses a CSV file and checks each field against a regex. However, a few fields are mandatory only if another field has data in it, so I essentially need a conditional block to control which regex is applied. For example, given this sample line:

    "S","HEY","J","B","0",""

what I need is a way to say:

    if $1 == "S"
        USE THIS regex ($3~/^("[A-Z0-9]{1}")$/) {print "3RDfield invalid-HEADER-FILE"};
    else
        USE THIS REGEX ($3~/^("")$/) {print "3RDfield invalid-HEADER-FILE"};

I have tried using the inline version:

    $1== "S" && ($3~/^"[A-Z0-9]{1}"$/) {print "3RD field invalid-HEADER- FILE"};
    $1 != "S" && ($3~/^("")$/) {print "3RDfield invalid-HEADER-FILE"};
awk
Why not

    $1 != "S" && ($5~/^("")$/)

for that second block? That's what you initially described.
– Kusalananda♦
Mar 11 at 10:24

That should be $1=="\"S\"" && .., $44!="\"B\"" && .., right? (Otherwise it will match S,... (an unquoted S as $1).)
– mosvy
Mar 11 at 10:30

@Kusalananda Apologies, I copied the wrong segment of code; see the post for the correct version.
– jordanb111
Mar 11 at 10:35
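The point about quoted fields in the comment above can be checked with a small sketch. It assumes the file is split on plain commas with -F, (so the double quotes remain part of each field); the function name classify_first_field is hypothetical and not from the original post.

```shell
# Hypothetical helper: show which comparison the first field satisfies.
# Assumption: fields are split on plain commas, so quotes stay in the field.
classify_first_field() {
  awk -F, '
    $1 == "\"S\"" { print "quoted S" }   # field text is "S" including the quotes
    $1 == "S"     { print "bare S" }     # field text is S without quotes
  '
}

printf '%s\n' '"S","HEY","J","B","0",""' | classify_first_field   # prints: quoted S
printf '%s\n' 'S,HEY,J,B,0,' | classify_first_field               # prints: bare S
```

With the sample line from the question, only the comparison against "\"S\"" succeeds, which is why the plain $1 == "S" test would never fire on quoted data.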
edited 2 days ago by Rui F Ribeiro
asked Mar 11 at 10:21 by jordanb111
1 Answer
    {
      if ($1 == "\"S\"")
        regex = "^\"[[:upper:][:digit:]]\"$"
      else
        regex = "^\"\"$"
    }
    $5 ~ regex {print "error"}

Or using the ternary operator:

    $5 ~ ($1 == "\"S\"" ? "^\"[[:upper:][:digit:]]\"$" : "^\"\"$") {
      print "error"
    }

Note that [A-Z], [0-9] could (and in practice sometimes do) match just about anything in locales other than C, while [[:digit:]] matches [0123456789] and [[:upper:]] matches uppercase letters (all the ones in the locale, not necessarily limited to Latin ones without diacritics).

{1} is superfluous.
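As a rough end-to-end sketch of the approach above, the following applies it to the third field, as in the corrected question. Two things here are assumptions rather than part of the answer: the helper name check_third_field, and the choice to report a record when the field does NOT match the selected pattern (the snippets above use ~).

```shell
# Hypothetical helper: validate field 3 with a regex chosen from field 1.
check_third_field() {
  awk -F, '
    {
      # Pick the expected pattern for this record (quotes are part of the field).
      if ($1 == "\"S\"")
        regex = "^\"[[:upper:][:digit:]]\"$"
      else
        regex = "^\"\"$"
    }
    # Assumption: a record is reported when field 3 does NOT match.
    $3 !~ regex { print "line " NR ": 3rd field invalid" }
  '
}

printf '%s\n' \
  '"S","HEY","J","B","0",""' \
  '"S","HEY","","B","0",""' \
  '"T","HEY","","B","0",""' \
  '"T","HEY","J","B","0",""' | check_third_field
```

On this input the sketch should flag lines 2 and 4: an "S" record with an empty third field, and a non-"S" record with a filled one.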
Unfortunately, the system this will be implemented on doesn't have POSIX support, so [[:upper:]] and the like won't work there, hence the older A-Z method.
– jordanb111
Mar 11 at 10:37

And yes, you are correct about {1}; I have now removed it.
– jordanb111
Mar 11 at 10:39

@jordanb111, OK. If it doesn't support POSIX character classes, chances are it won't support {1} either (superfluous anyway). If [A-Z] matches other characters than ABCDEFGHIJKLMNOPQRSTUVWXYZ, you can always replace it with [ABCDEFGHIJKLMNOPQRSTUVWXYZ] or fix the locale to C.
– Stéphane Chazelas
Mar 11 at 10:39

In my testing it hasn't matched any other characters, but I will continue and see if it matches anything else.
– jordanb111
Mar 11 at 10:46

@jordanb111, in a locale using the UTF-8 charset, you can do

    perl -XCO -le 'print chr($_) for 0x1..0xd7ff,0xe000..0x10ffff' | awk '/^[A-Z]$/'

to find the single-character collating elements that [A-Z] matches (it may also match multi-character collating elements like Hungarian Dzs). For /usr/xpg4/bin/awk on Solaris, I get 1205 characters, including b-z.
– Stéphane Chazelas
Mar 11 at 10:59
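For an awk without POSIX character classes, the two workarounds suggested in the comments (spelling out the characters and fixing the locale to C) could be combined roughly as follows. This is only a sketch: the helper name check_third_field_portable and the variable upper_digit are hypothetical, and, as an assumption, a record is reported when the third field does NOT match the selected pattern.

```shell
# Hypothetical helper: same check without [[:upper:]] or [[:digit:]].
check_third_field_portable() {
  # LC_ALL=C keeps bracket expressions byte-based for this one awk call.
  LC_ALL=C awk -F, '
    BEGIN {
      # Explicit list instead of ranges or POSIX classes.
      upper_digit = "[ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]"
    }
    # Ternary picks the regex per record; concatenation builds the pattern.
    $3 !~ ($1 == "\"S\"" ? "^\"" upper_digit "\"$" : "^\"\"$") {
      print "line " NR ": 3rd field invalid"
    }
  '
}

printf '%s\n' \
  '"S","HEY","J","B","0",""' \
  '"S","HEY","j","B","0",""' | check_third_field_portable
```

Because the character list is explicit, a lowercase j in an "S" record is rejected regardless of how the system's locale collates ranges.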
edited Mar 11 at 11:01
answered Mar 11 at 10:32 by Stéphane Chazelas