Regular Expressions: Reuse Patterns Using Capture Groups

number1code · July 31, 2018, 2:46am

Why are there no hints when I click "Get a Hint"?
Anyway, the challenge says to use a regex capture group to match a number that is repeated exactly three times in a string, with each repetition separated by a space.

let repeatNum = "42 42 42";
// the best answer I could come up with is below
let reRegex = /(\d+)\s\1\s\1/;
let result = reRegex.test(repeatNum);

I keep failing the test case where repeatNum is “42 42 42 42”, because my regex still matches even though the number repeats more than three times.

Any pointers in the right direction would help.

camperextraordinaire · July 31, 2018, 2:52am

With your current regular expression, it will match the first three 42’s. Below I have surrounded in parentheses what the regular expression matches. It finds three of the same number separated by spaces. Think about two special symbols you have learned about which help you define a start and end of a string.

(42 42 42) 42
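To see the matched portion for yourself, here is a minimal sketch using String.prototype.match with the regex from the first post. The variable names (unanchored, found) are just made up for illustration.

```javascript
// The unanchored pattern from the original post.
const unanchored = /(\d+)\s\1\s\1/;

// match() without the g flag returns the first match plus its capture groups.
const found = "42 42 42 42".match(unanchored);

console.log(found[0]);    // "42 42 42" (the substring that matched)
console.log(found[1]);    // "42" (what capture group 1 captured)
console.log(found.index); // 0 (the match starts at the beginning of the string)
```

The fourth 42 is simply left over; nothing in the pattern says the string has to end after the third one.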

number1code · July 31, 2018, 3:01am

Thanks, mister. The answer ended up being /^(\d+)\s\1\s\1$/. I had gone fishing for some sort of fancy answer that somehow used the {3} quantifier with two spaces thrown in.
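For anyone comparing the two versions, here is a rough sketch that runs both regexes against a few sample strings. The sample strings and variable names are assumptions for illustration, and quantifierRegex is only one possible way to write a quantifier-based variant (it uses {2}, not {3}, because the first number is matched by the capture group itself); it is not the challenge's official answer.

```javascript
// Hypothetical names for illustration; only the first two patterns come from the thread.
const unanchoredRegex = /(\d+)\s\1\s\1/;
const anchoredRegex = /^(\d+)\s\1\s\1$/;
// Assumed alternative: after the first number, repeat "space + same number" exactly twice.
const quantifierRegex = /^(\d+)(\s\1){2}$/;

const samples = ["42 42 42", "42 42 42 42", "42 42", "42 42 42 randomword"];

const results = samples.map((s) => ({
  input: s,
  unanchored: unanchoredRegex.test(s),
  anchored: anchoredRegex.test(s),
  quantifier: quantifierRegex.test(s),
}));

console.table(results);
// unanchored: true, true, false, true
// anchored:   true, false, false, false
// quantifier: true, false, false, false
```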

airathalis · July 31, 2018, 3:15am

I ran into the same problem number1code did earlier today while I was doing this challenge. I managed to solve it after reading some things on the forum that suggested you needed to do exactly as you indicated. However, I am not quite sure I understand precisely why the pattern will match 42 42 42 42. The best answer I could come up with by myself was that the regexp engine is greedy and so will try to match the longest possible set that matches. Thus, if it was 42 42 42 42 42, it would match (42 42 42) 42 42. In other words, because it is greedy it will match anything beyond the (42 42 42) that matches the pattern as well. If that is the case, is it possible to make it lazy, so that you can avoid the construction where you use special characters to indicate the start and end of the pattern to search for?

lionel-rowe · July 31, 2018, 3:26am

regex.test(str) checks whether str contains at least one substring that matches regex. It’s nothing to do with being greedy.

For example: + is greedy by default, but adding ? after it makes it lazy. However, /a+?/.test('aaa') still evaluates to true, because “a” (the matched portion) is a substring of “aaa”.

If you want to test the whole string rather than any substring, it’s necessary to use ^ and $.
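A small sketch of the point above: laziness changes how much text a match consumes, not whether test finds a match. The lazyRepeat pattern is an assumed lazy variant of the challenge regex, written only to show that it still matches the four-number string.

```javascript
// Greedy vs lazy only affects how much text the match consumes.
const greedyMatch = /a+/.exec("aaa")[0];
const lazyMatch = /a+?/.exec("aaa")[0];
console.log(greedyMatch); // "aaa"
console.log(lazyMatch);   // "a"

// test() is true either way, because some substring matches.
console.log(/a+?/.test("aaa")); // true

// With anchors, even the lazy quantifier has to stretch over the whole string.
const anchoredLazyMatch = /^a+?$/.exec("aaa")[0];
console.log(anchoredLazyMatch); // "aaa"

// Assumed lazy variant of the challenge regex: still true for four repeats.
const lazyRepeat = /(\d+?)\s\1\s\1/;
console.log(lazyRepeat.test("42 42 42 42")); // true
```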

airathalis · July 31, 2018, 3:41am

I think I understand. So, essentially, without the ^ and $ to signal (for lack of better phrasing) that this is the exact sort of string I’m looking for, problems can arise. That is, without the ^ and $, once the regex matches a substring, test returns true no matter what follows that substring. So, for example, regex.test(str) will return true for the string “42 42 42 randomword”, because the regex will match the (42 42 42). Thus, you need the ^ and $ to prevent the regex from returning true for strings that contain more than what the regexp was intended to match.

lionel-rowe · July 31, 2018, 6:18am

Effectively, yes. A slightly more accurate way of thinking of it is like this:

/^abc$/.test(str) checks whether str contains a substring that starts with [start of string], followed by the characters “abc”, followed by [end of string].

Note that [start of string] and [end of string] aren’t actual characters, but they’re still tokens that regexes treat as part of the string when looking for matches. Another example of this type of token is \b, which represents a word boundary. /Home\b/ matches the first 4 characters of “Home Alone”, but it doesn’t match the first 4 characters of “Homer Simpson”.
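Here is a quick sketch of those zero-width tokens in action. The test strings other than “Home Alone” and “Homer Simpson” are made up for illustration.

```javascript
// ^ and $ match positions, not characters.
console.log(/^abc$/.test("abc"));  // true
console.log(/^abc$/.test("xabc")); // false: "abc" does not begin at the start of the string
console.log(/^abc$/.test("abcd")); // false: "abc" is not followed by the end of the string

// \b is also a position: the boundary between a word character and a non-word character.
const homeBoundary = /Home\b/;
console.log(homeBoundary.test("Home Alone"));    // true
console.log(homeBoundary.test("Homer Simpson")); // false: "e" is followed by "r", so no boundary

// The boundary itself adds nothing to the matched text.
const homeMatch = "Home Alone".match(homeBoundary)[0];
console.log(homeMatch.length); // 4
```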

BrBearOFS · August 9, 2018, 10:55pm

OK I am just missing parts here…

^ is the beginning of the string
\d+ is one or more digits (no limit on how many)
the \s is the space.
I get that \1 refers to the capture group… but would that mean that the first \1 is the first capture group, followed by the \s space, and then the second \1 (followed by $) should be \2? OR

is the second \1 just a shorthand way of saying repeat the first capture group again at the $ end?

Thanks in advance…

number1code · August 10, 2018, 2:42am

You would only need to use \2 if you had a second capture group you wished to reference.
For example: /(first)(second)\1\2\2\1/ will match the string “firstsecondfirstsecondsecondfirst”

Notice that capture groups are numbered by the order in which they appear in the regex, from left to right, and that each (capture group) itself matches the first occurrence of its text in the string. Each backreference (\1, \2) then has to match that same captured text again.
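If it helps, here is a short sketch of how the numbering plays out. Apart from the /(first)(second)\1\2\2\1/ example above and the challenge regex, the patterns and test strings are assumptions for illustration.

```javascript
const twoGroups = /(first)(second)\1\2\2\1/;
const groupMatch = "firstsecondfirstsecondsecondfirst".match(twoGroups);

console.log(groupMatch[1]); // "first"  (group 1, what \1 refers to)
console.log(groupMatch[2]); // "second" (group 2, what \2 refers to)

// A backreference must repeat the exact text that was captured,
// not just "any text that fits the group's pattern".
const challengeRegex = /^(\d+)\s\1\s\1$/;
console.log(challengeRegex.test("42 42 42")); // true
console.log(challengeRegex.test("42 42 43")); // false: "43" is digits, but it is not "42"

// \2 only exists once there is a second pair of parentheses.
const secondGroupRepeated = /^(\d+)\s(\d+)\s\2$/;
console.log(secondGroupRepeated.test("1 2 2")); // true
console.log(secondGroupRepeated.test("1 2 1")); // false: the last number must repeat group 2
```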

BrBearOFS · August 10, 2018, 9:41am

Got it… thanks for breaking it down… !

gxcad · September 25, 2018, 12:39am

Thank you to everyone in this thread, all my questions have been answered (along with some testing to make sure I understand correctly…)!

The thing that got me the most was why ^ and $ had to be used, but it makes sense: even though the regex does not match the entire ‘48 48 48 48’, it still matches somewhere within the string, and you only want it to match at all if the entire string is ‘48 48 48’. By using BOTH ^ and $, you ensure that the string contains exactly the number of repetitions you specify in the rest of your regex, since a single match of ‘48 48 48’ cannot both begin and end the string unless the string is exactly that length.

For those reading this in the future and still confused, I’d recommend using a real-time regex tester that highlights what the regex finds in the string, then toggling ^ and/or $ on and off to see what the difference is. For me, that illustrated really clearly why both have to be used.