Does string contain A, and if so, does a section of string contain B


Post by Jason Carl » Tue, 08 Dec 2009 08:16:35


Tricky subject, sorry.

I'm wanting to check a textarea to see if it contains "<img", and if
so, does the section between "<img" and the following ">" contain
"mydomain.com".

This is particularly tricky since there can be more than one
"<img...>" in the field.

I can do this in Perl easily enough:

while ($comment =~ /(<img[^>]+?>)/sgxi) {
    if ($1 =~ /mydomain\.com/gi) {
        # do whatever
    }
}


But how do I create something similar in Javascript?

TIA,

Jason
 
 
 


Post by Evertjan » Tue, 08 Dec 2009 08:27:33

Jason Carlton wrote on 07 dec 2009 in comp.lang.javascript:


No it is not.



Do you think that is easy, look at javascript!



var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)


--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
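A short sketch of how the one-liner above behaves. The sample strings are made up for illustration; only the pattern comes from the post. Because [^>] cannot step over a ">", the domain has to appear inside a single <img ...> tag for the test to succeed.

```javascript
// Hypothetical sample strings; only the pattern is taken from the post.
const pattern = /<img[^>]+mydomain\.com[^>]*>/i;

const withMatch = 'Hi <IMG src="http://mydomain.com/a.gif"> there';
const withoutMatch = 'Hi <img src="http://example.org/a.gif"> there';
const outsideTag = 'See mydomain.com <img src="http://example.org/a.gif">';

console.log(pattern.test(withMatch));    // true
console.log(pattern.test(withoutMatch)); // false
console.log(pattern.test(outsideTag));   // false: the domain is not inside the tag
```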

 
 
 


Post by Jason Carl » Tue, 08 Dec 2009 08:57:04


> > if ($1 =~ /mydomain\.com/gi) {
> > # do whatever
> > }
> > }
>
> Do you think that is easy, look at javascript!
>
> > But how do I create something similar in Javascript?
>
> var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)


Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
the i.test() function you used, though. Is there a different name for
that function?

Similarly, how do I do the opposite and test if any of the "<img...>"
tags do NOT contain mydomain.com?
 
 
 


Post by Thomas 'Po » Tue, 08 Dec 2009 10:55:39


I presume this can be done better in Perl, too.


That is not equivalent to what you are doing in Perl above, though.
Incidentally, you should not assume people know other languages than those
discussed in the target newsgroup, although it is often the case. When in
doubt, explain what the code in the other language does.
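For comparison, a closer JavaScript counterpart of the Perl loop might look like the sketch below: it visits every <img ...> tag with exec() and then checks each tag on its own. The function name and the sample string are invented for illustration.

```javascript
// Visit every <img ...> tag, then check each tag separately,
// mirroring the Perl while-loop. The function name is hypothetical.
function imgTagsMentioning(comment, needle) {
  // A fresh global regex per call, so lastIndex always starts at 0.
  const tagPattern = /<img[^>]+?>/gi;
  const hits = [];
  let m;
  while ((m = tagPattern.exec(comment)) !== null) {
    // needle should not carry the g flag, or test() would keep state between calls.
    if (needle.test(m[0])) {
      hits.push(m[0]); // the "do whatever" branch of the Perl version
    }
  }
  return hits;
}

const sample = '<img src="http://mydomain.com/a.gif"><br><img src="http://example.org/b.gif">';
console.log(imgTagsMentioning(sample, /mydomain\.com/i));
```

The g flag on the tag pattern is what lets exec() continue from the end of the previous match, much like the Perl /g modifier in the while condition.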


It is _not_ the i.test() function. The `i' (case-*i*nsensitive) belongs to
the RegExp literal, like in Perl. I am getting the idea here that you do
not know Perl (and Perl-compatible Regular Expressions) either.


Any name you want to give it. The property name stands for a reference to a
Function object; that object can have any number of references to it.
(However, it is required here that the base object of the reference is a
RegExp instance).
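To make the point about the flag and the method concrete, here is a small sketch. The variable names and the sample string are assumptions; test() and the RegExp constructor are standard.

```javascript
const str = '<IMG SRC="http://mydomain.com/x.png">';

// Literal form: the trailing i is a flag on the regex, not part of a method name.
const literal = /<img[^>]+mydomain\.com[^>]*>/i;

// Constructor form of the same pattern; the backslash is doubled inside the string.
const constructed = new RegExp("<img[^>]+mydomain\\.com[^>]*>", "i");

console.log(literal.test(str));     // true
console.log(constructed.test(str)); // true
console.log(literal.ignoreCase);    // true

// The function can be referenced under another name,
// but it still needs a RegExp instance as its base object.
const check = RegExp.prototype.test;
console.log(check.call(literal, str)); // true
```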


Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
RTFM.


PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann
 
 
 


Post by Jason Carl » Tue, 08 Dec 2009 17:10:31

> I presume this can be done better in Perl, too.

TIMTOWTDI.



Don't be a douche. I'd never seen the switch followed by .test, and
really have never used a switch in Javascript, so I didn't catch that
this is what that was. Sue me.



I looked into that before posting, but I'm not sure that (a) I'm doing
it right, and (b) it's going to do what I'm needing.

This just returns true on everything:

booleanResult = /(?!<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);


This returns false if there's only one <img...> tag that doesn't
contain mydomain.com, but if I have multiple tags then it returns true
if any of them do not contain mydomain.com:

booleanResult = /(?=<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);

Which means that it would return this as false:

var comment = "Test <img src=' http://www.yqcomputer.com/ '>";

But this as true:

var comment = "Test <img src=' http://www.yqcomputer.com/logo.gif'><br>Test <img src=' http://www.yqcomputer.com/ '>";


I need it to return false if ANY of the instances existed that didn't
contain mydomain.com.
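A bare negative lookahead returns true on everything because it only has to succeed at one position, and in any string there is some position where the tag pattern does not start. One possible way to get the "every tag must contain the domain" behaviour is to collect the tags first and check each one. This is only a sketch: the function name and sample strings are made up, and treating a comment with no images as acceptable is an assumption.

```javascript
// Hypothetical helper: true only if every <img ...> tag mentions the domain.
function allImagesFromDomain(comment) {
  // match() with the g flag returns all tags, or null when there are none.
  const tags = comment.match(/<img[^>]*>/gi) || []; // assumption: no tags counts as fine
  return tags.every(function (tag) {
    return /mydomain\.com/i.test(tag);
  });
}

const good = "Test <img src='http://mydomain.com/logo.gif'><br><img src='http://www.mydomain.com/2.gif'>";
const mixed = "Test <img src='http://mydomain.com/logo.gif'><br>Test <img src='http://example.org/x.gif'>";

console.log(allImagesFromDomain(good));  // true
console.log(allImagesFromDomain(mixed)); // false

// The bare lookahead succeeds at the very first character, so it says nothing useful.
console.log(/(?!<img[^>]+mydomain\.com[^>]*>)/i.test(good)); // true
```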
 
 
 


Post by abozhilo » Tue, 08 Dec 2009 18:59:44


[^>]+

The + is greedy, so the engine first runs ahead to the `>` and then has
to backtrack. You can see this in RegexBuddy with the string:

<img src="mydomain.com" alt="" /> => the regex engine takes 66 steps
before it matches.

If you make the plus lazy:

<img[^>]+?mydomain\.com[^>]*> => 30 steps

Regards.
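Step counts like these are not visible from JavaScript itself, but it is easy to confirm that the greedy and lazy variants find the same match, so the change only affects how much work the engine does. A minimal sketch using the string from the post:

```javascript
const tag = '<img src="mydomain.com" alt="" />';

const greedy = /<img[^>]+mydomain\.com[^>]*>/i;
const lazy = /<img[^>]+?mydomain\.com[^>]*>/i;

// Both variants match the whole tag; only the internal search order differs.
console.log(greedy.exec(tag)[0] === lazy.exec(tag)[0]); // true
```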
 
 
 


Post by Csaba Gabo » Tue, 08 Dec 2009 22:26:10


....

I would try something like:
if (!(1+comment.replace(
/<img[^>]+?mydomain\.com[^>]*?>/gi,"<img>").
search(/<img[^>]+?>/i)))
alert ("all have mydomain.com");
else alert ("non mydomain.com detected");

That first replace is for degenerate cases of <img> in the string.
The second replace replaces all properly formed <img ...> elements
with a dummy element. The search then checks for any rogue
elements still left.

However, what about the case of something like:
<img src='otherdomain.com' title='<img src="mydomain.com">'>
Everything discussed so far will fail on that - a
broader approach is necessary if you want to protect
against more complicated strings.

Csaba Gabor from Vienna
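The replace-then-search idea from the post can be wrapped in a function to try it out, including the nested-quote case that defeats it. This is a sketch: the function name and the sample strings are invented, while the two patterns are the ones from the post.

```javascript
// Hypothetical wrapper around the replace-then-search approach.
function allHaveDomain(comment) {
  // Turn every on-domain tag into a bare dummy <img>.
  const neutralized = comment.replace(/<img[^>]+?mydomain\.com[^>]*?>/gi, "<img>");
  // search() returns -1 when no <img ...> tag with attributes is left over.
  return neutralized.search(/<img[^>]+?>/i) === -1;
}

console.log(allHaveDomain('<img src="http://mydomain.com/a.gif">')); // true
console.log(allHaveDomain('<img src="http://example.org/a.gif">'));  // false

// Off-domain image whose title attribute embeds a tag-like string.
const tricky = "<img src='otherdomain.com' title='<img src=\"mydomain.com\">'>";
console.log(allHaveDomain(tricky)); // true, which is the wrong answer for this input
```

The last call reports success because the first pattern runs from the outer "<img" to the ">" inside the title attribute, so the off-domain tag is swallowed by the replacement.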
 
 
 


Post by Evertjan » Wed, 09 Dec 2009 00:51:44

Thomas 'PointedEars' Lahn wrote on 07 dec 2009 in comp.lang.javascript:


Indeed, I don't know a perl from a swine.


No lookahead is needed
if "none" of the tags is meant.

var invertedBooleanResult = !/<img[^>]+mydomain\.com[^>]*>/i.test(str)
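Note that negating the test answers "does no tag contain the domain?", which differs from "does at least one tag lack it?" as soon as the tags are mixed. A small sketch with a made-up sample string:

```javascript
// One on-domain tag followed by one off-domain tag (hypothetical sample).
const str = "<img src='http://mydomain.com/a.gif'><img src='http://example.org/b.gif'>";

// True only when no tag at all mentions the domain.
const noneMatch = !/<img[^>]+mydomain\.com[^>]*>/i.test(str);
console.log(noneMatch); // false, although the second tag is off-domain
```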


--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
 
 
 


Post by Thomas 'Po » Wed, 09 Dec 2009 02:01:37


That's too bad.


Score adjusted

PointedEars
 
 
 


Post by Asen Bozhi » Wed, 09 Dec 2009 04:25:51


> if (!(1+comment.replace(
> /<img[^>]+?mydomain\.com[^>]*?>/gi,"<img>").
> search(/<img[^>]+?>/i)))
> alert ("all have mydomain.com");
> else alert ("non mydomain.com detected");
>
> That first replace is for degenerate cases of <img> in the string.
> The second replace replaces all properly formed <img ...> elements
> with a dummy element. The search then checks for any rogue
> elements still left.

Interesting. But your approach makes two passes over the input string
before the analysis is complete.
What about this one:

/])+?>/i;

It will match the first image which doesn't contain "mydomain.com".
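The pattern in this post did not survive archiving intact. The sketch below is an assumed reconstruction, not the poster's confirmed code: a negative lookahead is checked before each character inside the tag, which fits both the surviving fragment and the description. The variable names and sample strings are made up.

```javascript
// Assumed reconstruction: a character inside the tag is consumed only if
// "mydomain.com" does not start at that position.
const offDomainImg = /<img(?:(?!mydomain\.com)[^>])+?>/i;

const mixed = "<img src='http://mydomain.com/a.gif'><br><img src='http://example.org/b.gif'>";
const allGood = "<img src='http://mydomain.com/a.gif'>";

const m = offDomainImg.exec(mixed);
console.log(m && m[0]);                  // the example.org tag
console.log(offDomainImg.test(allGood)); // false: no off-domain image found
```

With a pattern of this shape, a single pass is enough: test() returning true means at least one image lacks the domain.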

Regards ;~)
 
 
 


Post by Jason Carl » Wed, 09 Dec 2009 09:03:28


Thanks to all of you! This really helped a lot.

- Jason